Topic 2: Models and Modeling Languages - Source 1: Engineering Modeling Languages, pages 40 to 52 - Source 2: DSLs in Action, pages 28 to 49 - Source 3: DSL Engineering, pages 464 to 465

Source 1: "Engineering Modeling Languages"

CHAPTER 1

What's a Model?

CONTENTS
1.1 Introduction
1.2 Modeling in Science
1.3 Modeling in Engineering
1.4 Illustrative Example: Cellular Automata
    1.4.1 Cellular Automaton Topology
    1.4.2 Cellular Automaton Evolution Rules
    1.4.3 Modeling Cellular Automata
1.5 Semantic Foundations of MDE: the Meaning of Models
    1.5.1 Basics of Denotational Semantics
    1.5.2 Underspecification and Interpretation in the Real World
    1.5.3 Operations on Models
1.6 Exercises

This chapter introduces the notion of the model as an abstraction of an aspect of reality built for a given purpose. It also introduces the cellular automata that are used as running examples throughout the first part of this book. It concludes with a discussion of the semantic foundations of MDE. After reading this chapter you will:

– have an understanding of the different roles models play in science and engineering;
– understand how modeling of software-intensive systems handles problem and solution complexity through separation and abstraction of concerns;
– be familiar with cellular automata and how they can be modeled; and
– have a basic understanding of the semantic foundations of MDE.

1.1 INTRODUCTION

Building a model of some real-world phenomenon seems to be an intrinsic human technique for understanding and predicting occurrences in the world. Many disciplines use models to help navigate complex concepts in order to gain a better understanding of their target of study. For example, physicists build models to postulate and examine consequences of the big bang; chemists use models to study atoms and molecules; mathematicians use models to study the abstract nature of numbers, functions, and other concepts; mechanical and electrical engineers build models of engines to predict how they will behave when built; architects build models to analyze the structural integrity of buildings; sociologists and psychologists use models to understand human behavior; biologists use models to understand the impact that a phenomenon has on a particular species; medical researchers build models to better understand human bodies and the brain; economists use models to describe and predict market behaviors; geologists and climatologists use models to understand the behavior of the Earth and climate; and finally, astronomers use models to better understand the universe.

While it may seem to some in the MDE community that modeling was invented by computer scientists or software engineers, modeling has clearly long been an essential mechanism for coping with the complexity of reality. In fact, one can venture to say that models have been used since the dawn of humanity, when early humans used cave drawings to communicate, for example, hunting instructions to novices or to establish communication with the spirit world (e.g., see http://www.bradshawfoundation.com/clottes/index.php).

However, the use of models in software engineering has unique characteristics. First, models are used to give explicit form to the immaterial nature of software (as will be discussed later, source code can be considered to be a model). Second, the medium for expressing models and the medium for defining software implementations are one and the same, and thus software developers do not face the additional complexity of changing medium when moving from models to implementations. This has led to considerable work on automated model transformations (in particular, code generators) in the MDE domain.

However, the common medium does not necessarily imply that the task of generating reliable implementations from models is straightforward. In fact, such generation may sometimes be more complex than transforming paper-based artifacts into physical artifacts such as bridges. Software is considered by many to be very malleable (easy to change), and there are no physical laws that restrict what computable behaviors can be implemented; thus software often takes on many responsibilities that are anticipated to change over the software's lifetime, even when those responsibilities could be handled by hardware. In addition, large software systems that play critical roles in society and that serve a diverse user base are required to have many qualities, some of which may be at odds with each other. These and other factors related to the growing complexity of software make generating complete implementations from models challenging. On the other hand, producing implementations without the help of models is a considerably more complex task; that is, it leads to significant accidental complexity.

The above indicates that modeling plays a much more important role in software development than in other disciplines, and is (or at least can be) more intricately connected to the final software implementation. Therefore, it is worth answering the following questions in the context of software development:

– What is a model?
– When and why should models be used?
– What is the benefit of modeling?
– What are the relationships between models and the real products?
– Which tools allow us to successfully use models?
– How do we build such tools?

To help answer these questions, we compare modeling in software development with modeling in other disciplines because, on the one hand, some phenomena are similar to those found in software development, and on the other hand, we will build more and more software-intensive systems that are engineered in an interdisciplinary manner, that is, systems that involve developers from science, engineering, and other disciplines. Examples of software systems that span disciplines are the currently emerging cyber-physical systems [87].

While in science models are used to describe and predict existing phenomena of the real world, in engineering models are used to describe a system that is to be developed in the future. Thus engineering models are typically constructive, while science models are descriptive. In the following, we examine more closely the way models are used in science and engineering.

1.2 MODELING IN SCIENCE

Scientists handle the complexity of the phenomena they are studying through modeling. According to Joshua M. Epstein [36]:

    The modeling enterprise extends as far back as Archimedes; and so does its misunderstanding. Anyone who ventures a projection, or imagines how a social dynamic—an epidemic, war, or migration—would unfold is running some model. But typically, it is an implicit model in which the assumptions are hidden, their internal consistency is untested, their logical consequences are unknown, and their relation to data is unknown. But, when you close your eyes and imagine an epidemic spreading, or any other social dynamic, you are running some model or other. It is just an implicit model that you haven't written down.

Of course, to be useful as a means of communication, models have to be made explicit; that is, assumptions about the world must be made explicit and communicated in a language that can be understood by other stakeholders. According to Jeff Rothenberg [127]:

    Modeling, in the broadest sense, is the cost-effective use of something in place of something else for some cognitive purpose. It allows us to use something that is simpler, safer or cheaper than reality instead of reality for some purpose. A model represents reality for the given purpose; the model is an abstraction of reality in the sense that it cannot represent all aspects of reality. This allows us to deal with the world in a simplified manner, avoiding the complexity, danger and irreversibility of reality.

Stachowiak provides a more detailed definition of modeling [135], in which the three main characteristics of a model are described as follows:

– There is an original.
– The model is an abstraction of the original.
– The model fulfills a purpose with respect to the original.

Scientific models are typically used to understand the real world and predict some aspects of it: using Newton's laws of gravitation we can predict the time it will take for an apple to fall from a tree. This led the philosopher Karl Popper to characterize scientific theories as falsifiable models [121], i.e., models that can be compared to some observable reality to assess whether and where they fit. Thus the original is part of this world. Scientists abstract away from complex details, and typically the models they construct only hold within certain boundaries that need to be explicitly understood as well. Newton's laws of gravitation, for example, only hold at certain scales of size and speed. Some models are known to be wrong in the general case but still explain certain phenomena quite well, e.g., Ptolemy's geocentric model of the solar system. Abstraction always means that certain properties are lost while (hopefully) the relevant ones are captured in enough detail to fulfill the model's purpose. Whether a model is helpful can therefore only be judged with knowledge of its purpose.

1.3 MODELING IN ENGINEERING

Like other scientists, engineers (including software engineers) use models to address complexity. A key difference, however, is that the phenomenon (engine, process, software, building) they model generally does not exist at the time the model is built, because the engineers' goal is to build the phenomenon according to the model. That is, the model acts as a blueprint. In this book we distill the various notions of models to obtain the following short definition of model that is applicable to software development:

Definition 1.1 (Model) A model is an abstraction of an aspect of reality (as-is or to-be) that is built for a given purpose.

As an aside, the word model is known to have been used for the first time for 1:10 versions of cathedrals in medieval Italy. Such a 1:10 miniature could still be analyzed; customers could, for example, walk in and see how the light flowed, and thus understand what they would buy. The building's stability could only be roughly estimated, but that was still better than nothing.

In engineering, one wants to break down a complex system into as many models as needed in order to address all the relevant concerns in such a way that they become understandable, analyzable, and finally constructible. This engineering principle of separation of concerns has stood the test of time. In the remainder of this section we will focus on how modeling supports the abstraction and separation of concerns principles for complex software systems, that is, complex systems in which software plays a primary role. Complex software systems are often required to interact with elements in their environments, and thus two major types of models can be produced in an engineering project:

– Models that each describe a part/abstraction of the software to be built.
– Models that each describe a part/abstraction of the environment to be controlled, monitored, responded to, or augmented.

This distinction is important. Models that describe the internal structure of a system (for example, software architectures and data structures) are often used to fulfill some constructive purpose. By contrast, there are, for example, models that describe the business process that should be supported by the software, or the plant to be controlled. From a software development perspective, these models describe existing entities. These kinds of models are used to understand how software interacts with its environment. They can also be used for defining tests to check that the software actually accomplishes its goals with respect to interactions with the environment.

As already indicated, models can be used for a variety of purposes that are related to the various activities defined in software development processes. Each activity or phase in a project may be associated with its own kinds of models. For example, requirements models are used to express, organize, and analyze gathered requirements in the requirements phase, while logical architecture models are associated with high-level/logical design activities. In early phases it is particularly interesting to model the interplay between a software product and its environment, to understand precisely what the final product shall do, as the interfaces between system and environment are typically complex and need to be precisely and correctly defined for the system to be of use.

A model in software engineering can have one or more of the following purposes:

– The model serves as an exploration of the solution possibilities.
– The model is used for construction of the system.
– The model serves as a source for tests.
– The model helps the customer to understand the final product.
– The model is used to generate code.
– The model is used to customize the system to be designed.
– The model serves as documentation of the system.
– The model is used for simulation of the not-yet-existing system.
– The model is used to simulate a phenomenon in the real world.

Defining models to fulfill these purposes will be discussed in detail later in this book. However, it is worth describing some aspects of the code generation purpose, as it provides further insight into the nature of software models. Due to the immaterial nature of a software system, a model describing a piece of software and the software itself can be more tightly coupled than in any other engineering discipline. If a model is represented digitally, with a predictable form, then it can be used to generate other digital forms, including executable ones, meaning that the model as a draft of the system can be tightly connected to the system's implementation. One can even argue that the model is the system, which is a convenient simplification that programmers usually make when they do not distinguish between the source code, the compiled files, and the running system. In fact, executable code in a high-level programming language can quite reasonably be considered a model. So if the model is sufficiently detailed or (even better) the generator is intelligent enough, we can conveniently regard a model to be the system.

Note 1.1 Defining a Modeling Language: Metamodeling
The core idea of Model Driven Engineering is to also use models for explicitly defining a modeling language: this is called metamodeling and will be discussed in detail in the following chapters.
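To make the idea concrete, here is a small illustrative sketch of our own (not from the book): the concepts of a tiny cell-rule modeling language captured as Scala types. The types play the role of a metamodel; every value conforming to them is a model. All names (RuleMetamodel, Rule, Transition, NeighborCount) are assumptions made for this sketch.

    // Illustrative metamodel sketch: the concepts of a tiny rule language as Scala types.
    object RuleMetamodel {
      sealed trait Expr
      case class NeighborCount(state: String) extends Expr // e.g., number of "black" neighbors
      case class Const(value: Int) extends Expr

      case class Equals(left: Expr, right: Expr)

      // becomes = None means "the cell keeps its current state".
      case class Transition(when: Equals, becomes: Option[String])

      case class Rule(name: String, transitions: List[Transition], otherwise: String)

      // A model conforming to the metamodel: Conway's Game of Life rule.
      val gameOfLife = Rule(
        name = "GameOfLife",
        transitions = List(
          Transition(Equals(NeighborCount("black"), Const(2)), becomes = None),
          Transition(Equals(NeighborCount("black"), Const(3)), becomes = Some("black"))
        ),
        otherwise = "white"
      )
    }

Because the language's concepts are explicit data rather than informal conventions, tools can inspect, check, and transform any model written in the language, which is exactly the point the note makes.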

Code generation, generation of testing infrastructure, checking of context conditions, and high-level analysis all require that the modeling language developers are using be precisely defined. Only with an explicit definition of a model's syntactic form is it possible for machines to interpret and manipulate the models. The benefits that come from an explicitly defined modeling language are particularly significant in the software engineering discipline, more than in any other, due to the advantages that come from tightly coupling the models and the artifacts being built. This is one of the reasons that computer science was one of the first disciplines to come up with explicit, operational language definitions. (Of course, Chomsky's work on grammars in the field of linguistics and Russell's attempts at explicitly defining the language of mathematics had a strong influence on computer science.)

As computers become engineering tools in other domains, there is a growing trend in which domain experts define and use their own languages, referred to as domain-specific (modeling) languages (DSLs), for modeling phenomena unrelated to software development. This trend makes it even more important to have support for explicitly defining modeling languages. Having well-defined languages makes it possible to develop reusable tools that manipulate the models expressed in these languages.

The UML [113] is an explicitly defined language, although it can still be argued that it is not yet precisely defined [39]. However, the UML consists of a substantial set of sublanguages whose union is general-purpose enough to describe large parts and aspects of a system in a relatively accurate manner, and it can thus act as a lingua franca in the software development domain. The UML can therefore be regarded as a general-purpose modeling language. By contrast, domain-specific modeling languages (DSMLs) are typically more problem-oriented and allow smaller and more concise models.

Foundation 1.2 Models Created Using the Same Medium as the Thing They Model
Usually in science and engineering, a model is created using a different medium than the thing it models (think of a bridge drawing vs. a concrete bridge). Only in software, and perhaps in linguistics and nowadays also 3D printing, is a model created using the same medium as the phenomenon it models. In software, at least, this opens up the possibility of automatically deriving software from its model, and thus leads to a more tightly coupled relationship between model and original.

However, extra effort is needed to develop a DSML and its accompanying tooling. The overhead of such an undertaking must be weighed against the advantages of using a DSML. Models defined in the UML, and those defined in a DSML, can all be seen as abstractions of an aspect of reality for describing a certain concern. The provision of effective means for handling such concerns makes it possible to establish critical trade-offs early in the system life cycle.

Identifying the abstractions that fit a model's purpose is not always straightforward. Currently, developers rely on their experience in the application domain to identify suitable abstractions, that is, abstractions that can be effectively manipulated to satisfy a model's purpose. If this experience is not available, the use of unsuitable abstractions can be a significant source of accidental complexity in a model-based project. The quality of the abstractions captured in models is critical to the successful application of modeling in software development.

1.4 ILLUSTRATIVE EXAMPLE: CELLULAR AUTOMATA

To illustrate the various ideas presented in this book, we are going to use the running example of cellular automata. It is simple enough to be explained in a few lines, yet rich enough to allow us to illustrate all the concepts presented in this book; it has been shown [47] that some cellular automaton rule sets are Turing-complete, which means in essence that they can simulate any computer program. Cellular automata serve as a simple example used for pedagogical purposes in this book. However, in the second part of the book, we will see that the various principles we demonstrate through this simple example can be extended to handle the full complexity of real-world case studies.

Cellular automata were originally proposed in the 1940s by Stanislaw Ulam and John von Neumann, and popularized in the 1970s by Conway's Game of Life, which is based on a simple two-dimensional cellular automaton.

1.4.1 Cellular Automaton Topology

A cellular automaton consists of a universe of cells organized in some topology (typically a regular grid, in any finite number of dimensions). Each cell is in one of a finite number of states, such as empty or occupied. Defined by the universe topology, each cell has a set of cells called its neighborhood. For regular two-dimensional Cartesian grids, the two most common types of neighborhoods are the von Neumann neighborhood (consisting of the four orthogonally adjacent cells) and the Moore neighborhood (adding the four diagonally adjacent cells to the von Neumann neighborhood). Another point of variation is the depth of the neighborhood, i.e., whether the neighbors of the direct neighbors (and so on) are considered.

The universe can be either finite or infinite. For finite universes (e.g., in two dimensions, the universe would be a rectangle instead of an infinite plane), a problem arises with the cells on the edges, because their neighborhood is different (fewer neighbors). One possible method is to let the values in those cells remain constant. Another is to use a toroidal topology, i.e., in 2-D to use a torus instead of a rectangle (the cells at the top are considered adjacent to the cells at the corresponding position on the bottom, and those on the left adjacent to the corresponding cells on the right). A sketch of these neighborhood notions and of toroidal index wrapping is given below.
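The following Scala sketch (ours, not the book's; the object and function names are assumptions) captures the two standard neighborhoods as coordinate offsets and the toroidal wrap-around for finite universes:

    // Illustrative sketch: 2-D neighborhoods as coordinate offsets, plus toroidal wrapping.
    object Neighborhoods {
      // von Neumann neighborhood: the four orthogonally adjacent cells.
      val vonNeumann: Seq[(Int, Int)] = Seq((-1, 0), (1, 0), (0, -1), (0, 1))

      // Moore neighborhood of a given depth: all cells within Chebyshev distance `depth`
      // (depth 1 adds the four diagonals to the von Neumann neighborhood).
      def moore(depth: Int): Seq[(Int, Int)] =
        for {
          dr <- -depth to depth
          dc <- -depth to depth
          if !(dr == 0 && dc == 0)
        } yield (dr, dc)

      // Toroidal topology: wrap indices so cells on the edges keep a full neighborhood.
      def wrap(row: Int, col: Int, rows: Int, cols: Int): (Int, Int) =
        (((row % rows) + rows) % rows, ((col % cols) + cols) % cols)
    }

The double modulo in wrap handles negative indices, so the cell above the top row wraps to the bottom row, as the torus description requires.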

1.4.2 Cellular Automaton Evolution Rules

An initial state for the automaton is selected by assigning a state to each cell. Each new generation is created according to some rule that determines the new state of each cell as a function of the current state of the cell and the states of the cells in its neighborhood. Typically, the rule for updating the state of cells is the same for each cell, does not change over time, and is applied to the whole grid simultaneously, though exceptions are known, such as probabilistic cellular automata and asynchronous cellular automata.

As an example, the rules for Conway's two-state, two-dimensional cellular automaton called the Game of Life are as follows: if a cell has 2 black neighbors, it stays the same; if it has 3 black neighbors, it becomes black; in all other situations it becomes white. According to Wikipedia (http://en.wikipedia.org/wiki/Cellular_automaton):

    Despite its simplicity, the system achieves an impressive diversity of behavior, fluctuating between apparent randomness and order. One of the most apparent features of the Game of Life is the frequent occurrence of gliders, arrangements of cells that essentially move themselves across the grid. It is possible to arrange the automaton so that the gliders interact to perform computations, and after much effort it has been shown that the Game of Life can emulate a universal Turing machine.

Of course, one may design many variants of rules for evolving cellular automata. The Game of Life rule above is small enough to state directly in code, as the sketch below shows.
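This is our own minimal sketch (not from the book) of one synchronous generation of the Game of Life on a finite toroidal grid, using a depth-1 Moore neighborhood; the names Grid, liveNeighbors, and step are assumptions:

    // Sketch: one generation of Conway's Game of Life on a finite toroidal grid.
    object GameOfLife {
      type Grid = Vector[Vector[Boolean]] // true = black (alive), false = white (dead)

      // Count live cells in the depth-1 Moore neighborhood, wrapping at the edges.
      def liveNeighbors(g: Grid, row: Int, col: Int): Int = {
        val rows = g.length
        val cols = g.head.length
        val offsets = for {
          dr <- -1 to 1
          dc <- -1 to 1
          if !(dr == 0 && dc == 0)
        } yield (dr, dc)
        offsets.count { case (dr, dc) =>
          g(((row + dr) % rows + rows) % rows)(((col + dc) % cols + cols) % cols)
        }
      }

      // The rule is applied to the whole grid simultaneously: the new grid is
      // computed entirely from the old one.
      def step(g: Grid): Grid =
        Vector.tabulate(g.length, g.head.length) { (r, c) =>
          liveNeighbors(g, r, c) match {
            case 2 => g(r)(c) // 2 black neighbors: stays the same
            case 3 => true    // 3 black neighbors: becomes black
            case _ => false   // all other situations: becomes white
          }
        }
    }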

1.4.3 Modeling Cellular Automata

Cellular automata are used as the running example in the following chapters of this book. The example illustrates how one can build a modeling environment that allows a modeler to define cellular automaton topologies, to design rules, to simulate cellular automaton evolution, and more generally to perform a wide spectrum of computations on cellular automata. To provide concrete illustrations of the MDE principles, we will use the Eclipse environment (http://eclipse.org) as a particular technological environment, and more specifically the Eclipse Modeling Framework (EMF) [140].

The first thing to note is that modeling cellular automata involves taking into account the many variations that exist among such automata, including:

– variation in topology (typically a regular grid, in any finite number of dimensions);
– different types of neighborhoods (von Neumann, Moore);
– different depths of the neighborhood;
– whether the universe is finite or infinite, with special rules for border cells in finite universes; and
– variation in rules for updating the state of cells.

Different choices made on each of these variation points will produce a large family of cellular automata (a sketch of such a configuration model follows below). This is typical of modern systems, which increasingly can no longer be shipped under the assumption that one size fits all, and which thus frequently take the form of product lines. While the basic vision underlying Software Product Lines (SPL) can probably be traced back to David Parnas's seminal article [116] on the design and development of program families, it is only quite recently that SPLs have emerged as a distinct approach for modeling and developing software system families rather than individual systems [106]. SPL engineering embraces the ideas of mass customization and software reuse. This approach focuses on the means of efficiently producing and maintaining multiple related software products, exploiting what they have in common and managing what varies between them. This notion of product lines will be further explored in Chapter 10.
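As a hedged illustration of how these variation points span a product family, here is a small Scala configuration model of our own design; every type and field name is an assumption made for the sketch:

    // Sketch: the variation points above captured as a small configuration model.
    object CellularAutomatonFamily {
      sealed trait NeighborhoodKind
      case object VonNeumann extends NeighborhoodKind
      case object Moore extends NeighborhoodKind

      sealed trait EdgePolicy
      case object ConstantEdges extends EdgePolicy // border cells keep a fixed state
      case object Toroidal extends EdgePolicy      // edges wrap around

      case class CellularAutomatonConfig(
        dimensions: Int,                // 1, 2, 3, ...
        neighborhood: NeighborhoodKind,
        neighborhoodDepth: Int,
        finiteUniverse: Boolean,
        edgePolicy: EdgePolicy,         // only meaningful for finite universes
        ruleName: String                // e.g., "GameOfLife"
      )

      // Each combination of choices selects one member of the product family:
      val gameOfLife = CellularAutomatonConfig(2, Moore, 1, true, Toroidal, "GameOfLife")
    }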

Figure 1.1 Modeling cellular automata: separating concerns. (The figure shows separate models for the topology, the universe, the neighborhood, the rules, and the initial state of a cellular automaton.)

Since modeling is the activity of separating and abstracting concerns of the problem domain, an activity often called analysis, it is not surprising that it typically produces at least one model per concern. In the cellular automata example, we would thus find a model describing the topology, another one for the universe, and yet others describing the notion of neighborhood, the initial state of the cellular automaton, and the rules (see Figure 1.1). For instance, in a topology model we would find its number of dimensions (1, 2, 3 or more) and whether it is a grid or a torus. These models may be expressed with a general-purpose modeling language such as the UML, or with a DSML when available.

If modeling is separating concerns, the design process can be characterized as a weaving (some also say: combining) of these model aspects into a detailed design model (also called the solution space). This is not a new idea: it is what designers have been doing forever. Most often, however, the various aspects are not explicit, or when they are, they are expressed as informal descriptions. So the task of the designer is to do the weaving in her head more or less at once, and then produce the resulting detailed design as a big tangled program (even if one decomposition paradigm, such as functional or object-oriented, is used). While this works pretty well for small problems, it can become a major headache for larger ones.

The real challenge here is not how to design the system to take a particular aspect into account: there is a wealth of design know-how for that, often captured in the form of design patterns. Taking into account more than one aspect at the same time is a little trickier, but many successful large-scale projects in industry show us that engineers do ultimately manage to sort it out (most of the time). The real challenge in a product-line context is that the engineer wants to be able to change her mind as to which version or which variant of any particular aspect she wants in the system. And she wants to do it cheaply, quickly, and safely. For that, repeating the tedious weaving of every aspect by hand is not an option. We do not propose to make this problem go away, but to mechanize and make repeatable the process that experienced designers already follow by hand. The idea is that when a new product has to be derived from the product line, we can automatically replay most of this design process, just changing a few things here and there. This is really what model-driven design (MDD) is all about.


Source 2: "DSLs in Action"

CHAPTER 1

Learning to speak the language of the domain

This chapter covers:
– What a DSL is
– The benefits a DSL offers, both to business users and to solution implementers
– The structure of a DSL
– Using well-designed abstractions

Every morning on your way to the office, you pull your car up to your favorite coffee shop for a Grande Skinny Cinnamon Dolce Latte with whip. The barista always serves you exactly what you order. She can do this because you placed your order using precise language that she understands. You don't have to explain the meaning of every term that you utter, though to others what you say might be incomprehensible. In this chapter, you'll look at how to express a problem in the vocabulary of a particular domain and subsequently model it in the solution domain. The implementation model of this concept is the essence of what is called a domain-specific language (DSL). If you had a software implementation of the coffee shop example where a user could place an order in the language that they use every day, you would have a DSL right there.

Every application that you design maps a problem domain to the implementation model of a solution domain. A DSL is an artifact that forms an important part of this mapping process. You’ll look more closely at the definition of what a DSL is a bit later. First, you need to understand the process that makes this mapping possible. For this mapping to work, you need a common vocabulary that the two domains share. This vocabulary forms one of the core inputs that lead to the evolution of a DSL. A good abstraction is essential to a well-designed DSL implementation. If you want to dig deep into the subject of well-designed abstractions, appendix A has a detailed discussion about the qualities to look for. A good plan of attack is to skim the appendix now, then continue reading this chapter. Section 1.7 contains basic information about abstractions, but appendix A is much more detailed.

1.1 The problem domain and the solution domain

Domain modeling is an exercise that helps you analyze, understand, and identify the participants involved in a specific area of activity. You start with the problem domain and identify how the entities collaborate with each other meaningfully within the domain. In the earlier example of the coffee shop, you placed your order in the most natural language of the domain, using terminology that mapped closely to what the barista understands. Terminology forms the core entity of the problem domain. The barista could easily figure out exactly what she needed to serve you to fulfill your request because you're both familiar with the required terminology.

1.1.1 The problem domain

In a domain modeling activity, the problem domain comprises the processes, entities, and constraints that are part of the business you're analyzing. Domain modeling, also known as domain analysis (see [1] in section 1.9), involves identifying all the major components of the domain and how they collaborate. In the example you began with, the barista knew all the entities, like coffee, whipped cream, cinnamon, and nonfat milk, that formed her problem domain model. When you analyze a more complex domain like a trading and settlement system for financial brokers, securities, stocks, bonds, trades, and settlements are some of the components that belong to the problem domain. Along with these components, you'll also study how securities are issued, how they're traded in stock exchanges, settled between various parties, and updated in books and accounts. You identify these collaborations and analyze and document them as artifacts of your analysis model.

1.1.2 The solution domain

You implement a problem domain analysis model in terms of the tools and techniques offered by the solution domain. The barista could map your order to the procedure she needed to follow to serve your Grande Skinny Cinnamon Dolce Latte. The process she followed and the tools she used formed parts of her solution domain. When you're dealing with a larger domain, you might need more support from your solution domain in terms of the tools, methodologies, and techniques that it needs to offer. You need to map the problem domain components into appropriate solution domain techniques. If you use an object-oriented methodology as the underlying solution platform, then classes, objects, and methods form the primary artifacts of your solution domain. You can compose these artifacts to form larger ones, which might serve as better representations of higher-level components in your problem domain. Figure 1.1 illustrates this first step in domain modeling. As you move along, you'll flesh out the process of how to get to the solution domain by using techniques that domain experts can understand throughout the lifecycle of transformation.

The primary exercise involved in domain modeling is mapping the problem domain to artifacts of the solution domain, so that all components, interactions, and collaborations are represented correctly and meaningfully. To do this, you first need to classify domain objects at the proper level of granularity. When you correctly classify domain objects, each object of the problem domain is visible in the solution domain, with its proper structure and semantics. But your map can be only as good as the language of interaction between the domains. A solid interaction requires that the problem domain and the solution domain share a common vocabulary.

Figure 1.1 Entities and collaborations from the problem domain must map to appropriate artifacts in a solution domain. The entities shown on the left (security, trade, and so on) need corresponding representations on the right.

1.2 Domain modeling: establishing a common vocabulary

When you start an exercise in domain modeling, you start with the problem domain that you're going to model. You need to understand how the various entities of the domain interact among themselves and fulfill their responsibilities. While you're figuring all this out, you collaborate with domain experts and with other modelers. Domain experts know the domain. They communicate using the domain vocabulary, and use the same terminology when they explain domain concepts to the outside world. The modelers know how to represent an understanding of the model in a form that can be documented, shared, and implemented by software. The modelers must also understand the same terminology and reflect the same understanding in the domain model that they're designing.

Sometime back I started working on a project that involved modeling the back-office operations of a large financial brokerage organization. I wasn't a domain expert, and I didn't know much about the details and complexities involved in the practices of the securities industry. Now, after working in that domain for quite a while, I think it's similar enough to other domains that you might deal with that I can base most of my examples and annotations in this book on it. The sidebar in this section gives a brief introduction to the domain of securities trading and financial brokerage, which you'll use as running examples for implementing DSLs. As you progress, I'll define new concepts wherever applicable and focus on the relevant details only when necessary. If you're not familiar with what goes on in a stock exchange, don't panic. I'll give you enough background in the sidebars to help you understand the basic concepts of what you model.

On the first day of our requirements analysis meeting, the domain specialists of the financial industry started talking about coupon bonds, discount bonds, mortgages, and corporate actions. These terms were part of the usual terminology that a brokerage specialist uses to communicate, but I didn't know what they meant. Also, lots of terms were being used synonymously. The terms discount bond and zero coupon bond are synonymous, and they were being used interchangeably by different domain experts in different contexts. But because these terms were unknown to me, confusion reigned. Not all of us were specialists in the financial industry, and we soon realized that we needed to share a common vocabulary to make the knowledge-sharing sessions more meaningful. Not only did we collaborate in terms of the common domain vocabulary, we also made sure that the model we designed and developed spoke the same language—the natural language of the domain.

Sidebar: Financial brokerage systems: a background
The business of financial brokerage starts with a trading process. This process involves the exchange of securities and cash between two or more parties, referred to as the counterparties of the trade. On a certain date, the counterparties promise to make the trade (this date is referred to as the trade date) at a place known as the stock exchange, based on an agreed-upon price, known as the unit price. The securities, which form one leg of the exchange process (the other being cash), can be of several types, such as stocks, bonds, mutual funds, and a host of other types that can have a hierarchy of their own. There are, for example, several types of bonds, like coupon bonds and discount bonds. Within a certain number of days of the promise to trade, the exchange is made by transferring the ownership of funds and securities between the counterparties; this exchange is known as the settlement process. Each security type has its own lifecycle of trade, execution, and finalization, and passes through a series of state changes in the course of the trading and settlement process.

1.2.1 Benefits of a common vocabulary

A common vocabulary, shared between the stakeholders of the model, serves as the binding force that unifies all artifacts that are part of the implementation. More importantly, with the common vocabulary in place, you can easily follow the path of features, functions, and objects across all phases of the project delivery cycle. The same terms that the modeler uses for documenting use cases appear as module names in programs, entity names in data models, and object names in test cases. In this way, a common vocabulary bridges the gap between the problem domain and the solution domain. Creating a common vocabulary might take more time up front than you're initially willing to spend, but I can almost guarantee that you'll save yourself a lot of rework in the long run. Let's look at some of the tangible benefits that a common vocabulary offers.

SHARED VOCABULARY AS THE GLUE

During the requirements analysis phase, a shared vocabulary serves as the common bridge of understanding between the modelers and the domain experts. All your discussions are more succinct and effective. When Bob (who's a trader) talks about interest accrual for bonds, Joe (who's a modeler) knows that Bob is referring specifically to coupon bonds.

COMMON TERMINOLOGY IN TEST CASES

The common vocabulary can also serve as the basis for developing test cases. Then, the domain expert group can verify these test cases. A sample test case from my earlier project on a brokerage system implementation reads: For a zero coupon bond issued by Trampoline Securities with a face value of USD 10,000 and a primary value date of 15th May 2001 at a price of 40%, the investor will have to pay USD 4,000 at issue launch. The test case makes perfect sense to the modeler, the tester, and the domain specialist who's reviewing it, because it uses terminology that forms the most natural representation of the domain language.

COMMON VOCABULARY DURING DEVELOPMENT

If the development team is using the same vocabulary to represent program modules, the resulting code also speaks the same domain language. For example, if you talk about modules like bond trading and settlement of securities, when you write code, you'll use the same vocabulary to name domain entities.


Figure 1.2 The problem domain and the solution domain need to share a common vocabulary for ease of communication. With this vocabulary, you can trace an artifact of the problem domain to its appropriate representation in the solution domain.

Developing and sharing a common vocabulary between the problem and solution domains is the first step in our march toward the solution domain. Let's update figure 1.1 with this common glue that binds the domains together to come up with figure 1.2. You know that the developers and the domain experts need to share a common vocabulary, but how will the language be mapped? How does the domain expert understand the model that the developers are generating? This communication problem is a common one in any software development ecosystem. Looking at figure 1.2, you'll realize that the domain experts are in no way equipped to understand the technical artifacts that currently populate the solution-domain model. As systems increase in complexity, the models get bloated and the communication gap keeps widening. The domain experts don't need to understand the complexities that surround an implementation model; they need to verify whether the business rules being implemented are correct. Ideally, the experts themselves would write test scripts to verify the correctness and comprehensiveness of the domain rules' implementation, but that's not a practical solution. What if you could offer the experts a communication model that builds on the common vocabulary and rolls off everyone's tongue with the same fluidity that a domain person uses in his everyday business practice? You can. This is the moment when the DSL enters the picture!

1.3 Introducing DSLs

Joe, the IT head for the hypothetical company Trampoline Securities, had no idea what Bob, the trader, was up to as he leaned over Bob's shoulders and took a sneak peek at his console. To his amazement, Joe discovered that Bob was busy typing commands and statements in a programming environment that he thought belonged exclusively to the members of his development team. Here's the fly-on-the-wall record of their conversation:

Ultimately Joe had to admit that Bob, who’s an expert in the domain of financial brokerage systems, was better equipped to verify whether their new offering of the trading platform covered the functional specs adequately and correctly. What Joe couldn’t understand is how Bob, who isn’t a programmer, could write tests using their testing framework. As a reader, you must also be wondering. Look at the following listing, which shows what Bob had up on his console. Listing 1.1 Order-processing DSL place orders ( new Order to buy(100 sharesOf limitPrice 300 allOrNone using premiumPricing, new Order to buy(200 sharesOf limitOnClosePrice 300 using premiumPricing, new Order to buy(200 sharesOf limitOnOpenPrice 300 using defaultPricing, new Order to sell(200 bondsOf limitPrice 300 allOrNone

Reader Seite 21

"IBM")

"CISCO")

"GOOGLE")

"SUN")

"DSLs in Action" Seite 34

10

CHAPTER 1

Learning to speak the language of the domain

using { (qty, unit) => qty * unit - 500 } )

Looks like a code snippet, right? It is, but it also contains the language that Bob usually speaks when he's at his trading desk. Bob's preparing a list of sample order-creation scripts that place orders on securities using various pricing strategies. He can even define a custom pricing strategy on his own when he places the order. What's the language that Bob's programming in? It doesn't matter to him, as long as he gets his work done. To him, it's the same language that he speaks at his trading desk. But let's determine how what Bob is doing differs from the run-of-the-mill coding that we do every day in our programming jobs:

– The vocabulary of the language that Bob is using seems to correspond closely with the domain that he belongs to. In his day job at his trading desk, he places orders for his clients using the same terminology that he's writing directly into his test scripts.
– The language that he's using, or the subset of the language that you see on his console, doesn't seem to apply outside the domain of the financial brokerage business.
– The language is expressive, in the sense that Bob can clearly articulate what he wants to do as he steps through the process of creating a new order for his client.
– The language syntax looks succinct. The syntactic complexities of the high-level languages you usually program in have magically disappeared.

Bob is using a domain-specific language, tailor-made for financial brokerage systems. It’s immaterial at this point what the underlying language of implementation is. The fact that the underlying language isn’t obvious from the code in listing 1.1 indicates that the designer successfully created an expressive language for a specific domain.
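For readers curious how such syntax can even compile, here is a deliberately simplified sketch of how an order DSL in this style could be embedded in Scala 2. It is our own assumption-laden illustration, not the book's actual implementation: the pricing formulas, the placeOrders name, and the Order internals are all invented for the sketch, and only a subset of the listing's vocabulary is covered.

    // Simplified, illustrative embedding of an order DSL in Scala 2 (not the book's code).
    object OrderDsl {
      type Pricing = (Int, Double) => Double // (quantity, unit price) => total

      val defaultPricing: Pricing = (qty, unit) => qty * unit
      val premiumPricing: Pricing = (qty, unit) => qty * unit * 1.05 // assumed markup

      case class Security(qty: Int, symbol: String)
      case class LineItem(side: String, security: Security)

      // Lets us write `100 sharesOf "IBM"` or `200 bondsOf "SUN"`.
      implicit class QuantityOps(qty: Int) {
        def sharesOf(symbol: String): Security = Security(qty, symbol)
        def bondsOf(symbol: String): Security = Security(qty, symbol)
      }

      def buy(s: Security): LineItem = LineItem("buy", s)
      def sell(s: Security): LineItem = LineItem("sell", s)

      class Order {
        private var item: LineItem = _
        private var limit: Double = 0.0
        private var pricing: Pricing = defaultPricing

        // Each method returns `this`, so calls chain in infix style.
        def to(li: LineItem): Order = { item = li; this }
        def limitPrice(p: Double): Order = { limit = p; this }
        def using(p: Pricing): Order = { pricing = p; this }
        def total: Double = pricing(item.security.qty, limit)
      }

      def placeOrders(orders: Order*): Unit =
        orders.foreach(o => println(f"order total: ${o.total}%.2f"))
    }

    // Usage, approximating Bob's scripts:
    import OrderDsl._
    object Demo extends App {
      placeOrders(
        new Order to buy(100 sharesOf "IBM") limitPrice 300 using premiumPricing,
        new Order to sell(200 bondsOf "SUN") limitPrice 300 using { (qty, unit) => qty * unit - 500 }
      )
    }

The trick is ordinary host-language machinery: infix method calls and an implicit conversion on Int make the surface syntax read like the trading desk's vocabulary while remaining plain Scala underneath.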

1.3.1 What's a DSL?

DEFINITION: A DSL is a programming language that's targeted at a specific problem; other programming languages that you use are more general purpose. It contains the syntax and semantics that model concepts at the same level of abstraction that the problem domain offers.

For example, when you order your Cinnamon Latte, you use the domain language that the barista readily understands. Abstraction is a cognitive process of the human brain that enables us to focus on the core aspects of a subject, ignoring the unnecessary details. You'll read more about abstractions and DSL design in section 1.7. Appendix A is all about abstractions.

Programs that you write using a DSL must have all the qualities that you expect to find in a program written in any other computer language. A DSL needs to give you the ability to design abstractions that form part of the domain. In the same way that you can build a larger entity out of many smaller ones in the problem domain, a well-designed DSL gives you that flexibility of composition in the solution domain. You should be able to compose DSL abstractions just as you compose functionality in the problem domain. Now you know what a DSL is. Let's talk about how it differs from the other programming languages you've been using.

HOW'S A DSL DIFFERENT FROM A GENERAL-PURPOSE PROGRAMMING LANGUAGE?

The answer to the difference is in the definition itself. The two most important qualities of a DSL that you need to remember are:

– A DSL is targeted at a specific problem area.
– A DSL contains syntax and semantics that model concepts at the same level of abstraction as the problem domain does.

When you program using a DSL, you deal only with the complexity of the problem domain. You don't have to worry about the implementation details or other nonessential elements of the solution domain. (For more discussion about nonessential complexity, see appendix A.) More often than not, people who aren't expert programmers can use DSLs—if the DSL has the appropriate level of abstraction. Mathematicians can easily learn and work with Mathematica, UI designers feel comfortable writing HTML, and hardware designers use VHDL (very-high-speed integrated circuit hardware description language, a DSL used in electronic design automation), to name a few such use cases. Because nonprogrammers need to be able to use them, DSLs must be more intuitive to users than general-purpose programming languages need to be.

You write a program only once, but you manage its evolution for many years. For a program to evolve, it needs to be nurtured by people, many of whom may not have been involved in designing the initial version. The key issue is communication, the ability of your program to communicate with its intended audience. In the case of a DSL, the direct audience is neither the compiler nor the CPU, but the human minds that need to understand its behavior. The language needs to be communicative to its audience and allow code snippets that are expressive enough to map to the thought process of the domain modeler. For this to happen, the DSL that you design has to offer the correct level of syntactic as well as semantic abstractions to the user.

WHAT'S IN A DSL FOR BUSINESS USERS?

As you've learned from the discussion so far, DSLs stand out from normal high-level programming languages in two ways:

– DSLs offer a higher level of abstraction to the user. This implies that you don't have to be concerned about the nuances of identifying specific data structures or other low-level details. You can focus on solving the problem at hand.
– DSLs offer a limited vocabulary that's specific to the domain they address. The fact that a DSL contains nothing extra helps you focus on the problem that you're modeling. A DSL doesn't have the horizontal, spread-out focus of a general-purpose programming language.

Both these qualities make DSLs a friendlier tool for the nonprogramming domain expert. Your business analysts understand the domain, which is what a DSL abstracts. With more and more programming languages offering higher levels of abstraction design, DSLs are poised to be a major component in today's application development ecosystem. Nonprogramming domain analysts will surely have a major role to play here. With a DSL implementation in place, they'll be able to write test scripts correctly from day one. The idea isn't to run the scripts immediately, but to ensure that you've adequately covered the possible business scenarios in your implementation. When the DSL is designed at an effective level of abstraction, it's not unusual for domain experts to browse through source code that defines the business logic. They'll be able to verify the business rules and provide immediate feedback to developers based on their observations. Now that you've seen some of the value that a DSL offers to you as a developer and as a domain user, let's take a look at some of the commonly used DSLs in the industry today.

1.3.2 Popular DSLs in use

DSLs are everywhere. Whether or not you brand them as DSLs, I'm sure you're using a lot of them in every application that you develop. Table 1.1 lists a few of the most commonly used DSLs.

Table 1.1 Commonly used DSLs

    DSL                 | Used for
    --------------------|---------------------------------------------------------------
    SQL                 | Relational database language used to query and manipulate data
    Ant, Rake, Make     | Languages for building software systems
    CSS                 | Stylesheet description language
    YACC, Bison, ANTLR  | Parser-generator languages
    RSpec, Cucumber     | Behavior-driven testing languages in Ruby
    HTML                | Markup language for the web

There are a lot more DSLs that you use on a regular basis. Can you identify some of the common characteristics that these languages have? Here are a few:

– All DSLs are specific to the domain. Each language is of limited expressivity; you can use a DSL to solve the problems of that particular domain only. You can't build cargo management systems using only HTML.

DEFINITION: Martin Fowler used the term limited expressivity to describe the most important characteristic of a DSL. In his 2009 DSL Developer's Conference keynote talk ([3] in section 1.9), Martin mentioned that it's this limited expressivity that differentiates a DSL from a general-purpose programming language. You can model anything and everything with a general-purpose programming language. With a DSL, you can model only one specific domain, but in a more expressive way.

– For each of the languages listed in table 1.1 (and the other popular ones being used), you usually need to use only the abstractions that they publish. Barring specific exceptions, you don't even need to know the underlying implementations of these languages. Every DSL offers a set of contracts that you can use to build your solution domain model. You can compose multiple contracts to build more complex models. But you don't need to step out of the offered contracts and get down to the implementation level of the DSL.
– Every DSL is expressive enough to make its intentions clear to the nonprogramming user. The DSL isn't merely a collection of APIs that you use; every API is concise and speaks the vocabulary of the domain.
– For every DSL, you can go back to your source files months after you wrote them and immediately figure out what you meant.

It's a fact that DSL-based development encourages better communication between developers and domain experts. This is its greatest virtue. By using a DSL, a nonprogramming domain expert won't transform himself into a regular programmer. But with the expressiveness and explicitly communicative APIs that DSLs offer, the domain expert will be able to understand which business rules the abstraction implements and whether it adequately covers all possible domain scenarios.

Let's look at one motivating example of a DSL snippet selected from the list in table 1.1. Consider the following snippet from a Rakefile, which is mainly used to build Ruby-based systems:

    desc "Default Task"
    task :default => [ :test ]
    Rake::TestTask.new { |t|
      t.libs ...

(The reader excerpt breaks off here and resumes at Reader Seite 70.)

Source 3: "DSL Engineering"

The excerpt resumes in the middle of a code example: a protocol state machine defined on a port of a component.

    ... registerAircraft => registered }
    state registered {
      unregisterAircraft => new
      reportPosition
      reportProblem
    } } }

Initially, the protocol state machine is in the new state, where the only valid message is registerAircraft. If this is received, we transition into the registered state. In registered, you can either unregisterAircraft and go back to new, or receive a reportProblem or reportPosition message, in which case you will remain in the registered state.
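One way to read such a protocol is as a runtime check over message traces. The following Scala sketch is our own illustration (not the book's generated code); the message names come from the excerpt, while TrackerProtocol, next, and valid are assumed names:

    // Sketch: the protocol state machine above as an executable trace check.
    object TrackerProtocol {
      sealed trait State
      case object New extends State
      case object Registered extends State

      // Allowed message => successor state, per the protocol above.
      def next(s: State, message: String): Option[State] = (s, message) match {
        case (New, "registerAircraft")          => Some(Registered)
        case (Registered, "unregisterAircraft") => Some(New)
        case (Registered, "reportPosition")     => Some(Registered)
        case (Registered, "reportProblem")      => Some(Registered)
        case _                                  => None // protocol violation
      }

      // A message trace is valid if every step is allowed by the protocol.
      def valid(trace: Seq[String]): Boolean =
        trace.foldLeft(Option[State](New)) { (s, m) => s.flatMap(next(_, m)) }.isDefined
    }

For example, valid(Seq("registerAircraft", "reportPosition", "unregisterAircraft")) holds, while valid(Seq("reportPosition")) does not, because reportPosition is not allowed in the new state.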

⌅ Versioning We mentioned above that the system is distributed geographically. This means it is not feasible to update all the software for all parts of the systems (e.g., all InfoScreens or all AircraftModules) at the same time. As a consequence, there might be several versions of the same component running in the system. To make this feasible, many non-trivial things need to be put in place in the runtime system. But the basic requirement is this: you have to be able to mark versions of components, and you have to be able to check them for compatibility with older versions. The following piece of code expresses the fact that the DelayCalculatorV2 is a new implementation of DelayCalculator. newImplOf means that no externally visible aspects change, which is why no ports and other externally exposed details of the component are declared21 . component DelayCalculator { publishes flights { publication = onchange } } newImplOf component DelayCalculator: DelayCalculatorV2

21

For all intents and purposes, it’s the same thing – just maybe a couple of bugs are fixed.

To evolve the externally visible signature of a component, one can write this:

component DelayCalculator {
  publishes flights { publication = onchange }
}

newVersionOf component DelayCalculator: DelayCalculatorV3 {
  publishes flights { publication = onchange }
  provides somethingElse: ISomething
}

The keyword newVersionOf allows us to add additional provided ports (such as the somethingElse port) and to remove required ports. You cannot add additional required ports or remove any of the provided ports, since that would destroy the "plug-in compatibility"; constraints make sure that these rules are enforced on the model level. This wraps up our case study. (While writing this chapter, I contacted the customer, and we talked about the approach in hindsight. As it turns out, they are still happily using the approach as the foundation of their system. They generate several hundred thousand lines of code from the models, and they have to migrate to a new version of Xtext real soon now :-)) In the next subsection we recap the approach and provide additional guidance.
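As an illustration of how such a compatibility rule might be checked on the model level, consider this Java sketch; the Component record with plain port-name sets is a simplification invented here, not the actual constraint implementation:

import java.util.Set;

// Plug-in compatibility: a new version may add provided ports and drop
// required ports, but must not drop provided ports or add required ports.
public class VersionCompatibility {
    record Component(Set<String> provided, Set<String> required) {}

    static boolean isValidNewVersion(Component old, Component neu) {
        return neu.provided().containsAll(old.provided())   // no provided port removed
            && old.required().containsAll(neu.required());  // no required port added
    }

    public static void main(String[] args) {
        Component v2 = new Component(Set.of("flights"), Set.of());
        Component v3 = new Component(Set.of("flights", "somethingElse"), Set.of());
        System.out.println(isValidNewVersion(v2, v3)); // true: only a provided port added
    }
}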

18.2.2 Architecture DSL Concepts

⌅ What we did in a Nutshell Using the approach shown here, we were able to quickly get a grip on the overall architecture of the system. All of the above was actually done in the first day of the workshop. We defined the grammar, some important constraints, and a basic editor (without many bells and whistles). We were able to separate what we wanted the system to do from how it would achieve it: all the technology discussions were now merely an implementation detail of the conceptual


"DSL Engineering" Seite 465


descriptions given here. (I don't want to give the impression that technology decisions are unimportant. However, they should be an explicit second step, and not mixed with the concepts. This is particularly important in light of the fact that the system should live for more than 20 years, during which technologies will change more than once.) We also achieved a clear and unambiguous definition of what the different architectural concepts mean. The generally nebulous concept of a component has a formal, well-defined meaning in the context of this system. The approach discussed in this chapter therefore recommends the definition of a formal language for your project's or system's conceptual architecture. You develop the language as the understanding of your architecture grows. The language therefore always reflects the complete current understanding of your architecture in a clear and unambiguous way.

⌅ Component Implementation By default, architecture DSLs are incomplete; component implementation code is written manually against the generated API, using well-known composition techniques such as inheritance, delegation or partial classes. However, there are other alternatives for component implementation that do not use a GPL, but instead use formalisms that are specific to certain classes of behavior: state machines, business rules or workflows. You can also define and use a domain-specific language for certain classes of functionality in a specific business domain. If such an approach is used, then the code generator for the application domain DSL has to act as the "implementor" of the components (or whatever other architectural abstractions are defined in the architecture DSL). The code generated from the business logic DSL must fit into the code skeletons generated from the architecture DSL. The code composition techniques discussed for incomplete DSLs in Section 4.5.1 can be used here. (Be aware that this discussion is only really relevant for application-specific behavior, not for all implementation code. Huge amounts of implementation code are related to the technical infrastructure of an application – e.g., remoting, persistence, workflow. This can be derived from the architectural models and generated automatically.)

⌅ Standards, ADLs and UML Describing architecture with formal languages is not a new idea. Various communities recommend using Architecture Description Languages (ADLs) or the Unified Modeling Language (UML). However, all of those approaches advocate using existing, generic languages for specifying architecture, although some of them, including UML, can be customized to some degree. (I am not going to elaborate on the reasons why I think profiles are usually not a good alternative to a DSL; it should become clear from the example above and the book in general why I think this is true.) Unfortunately, efforts like that completely miss the point. We have not experienced much benefit in shoehorning an architecture description into the (typically very limited, as well as too generic) constructs provided by predefined languages – one of the core activities of the approach explained in this chapter is the process of actually building your own language to capture your system's specific conceptual architecture.
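To illustrate the split between generated API and manual implementation mentioned above, here is a minimal Java sketch using inheritance as the composition technique; all type names are invented and do not come from the case study:

// Invented names; only the generated-vs-handwritten split is the point.
interface IDelayCalculator {
    int computeDelayInMinutes(String flightId);
}

// -- generated code (regenerated from the architecture model, never edited) --
abstract class DelayCalculatorBase implements IDelayCalculator {
    // Generated infrastructure hooks (messaging, lifecycle, ...) would live here.
    protected void publishFlightsChanged() { /* generated glue */ }
}

// -- handwritten code (never touched by the generator) --
class DelayCalculatorImpl extends DelayCalculatorBase {
    @Override
    public int computeDelayInMinutes(String flightId) {
        // Application logic written manually by the developer.
        return flightId.isEmpty() ? 0 : 5;
    }
}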


"DSL Engineering" Seite 466


⌅ Visualization In this project, as well as in many others, we have used textual DSLs. We have argued in this book why textual DSLs are superior in many cases, and these arguments apply here as well. However, we did use visualization to show the relationships between the building blocks, and to communicate the architecture to stakeholders who were not willing to dive into the textual models. Figure 18.2 shows an example, created with Graphviz.

Figure 18.2: This shows the relationships between the ingredients of the example system described above. It contains namespaces, components, interfaces and data structures, as well as the provides and requires relationships between components and interfaces.

⌅ Code Generation Now that we have a formal model of the conceptual architecture (the language) and also a formal description of the system(s) we are building – i.e., the models defined using the language – we can exploit this by generating code. (It should have become clear from the chapter that the primary benefit of developing and using the architecture DSL is just that: understanding concepts by removing any ambiguity and defining them formally. It helps you understand your system and get rid of unwanted technology interference. But of course, exploiting these assets further by generating code is useful.)

• We generate an API against which the implementation is coded. That API can be non-trivial, taking into account the various messaging paradigms, replicated state, etc. The generated API allows developers to code the implementation in a way that does not depend on any technological decisions: the generated API hides those from the component implementation code. We call this generated API, and the set of idioms to use it, the programming model.


• Remember that we expect some kind of component container or middleware platform to run the components, so we also generate the code that is necessary to run the components (including their technology-neutral implementation)

"DSL Engineering" Seite 467


on the implementation technology of choice. We call this layer of code the technology mapping code (or glue code). It typically also contains the configuration files for the various platforms involved. (As a side effect, the generators capture best practices in working with the technologies you've decided to use.)

It is of course completely feasible to generate APIs for several target languages (supporting component implementation in various languages) and/or to generate glue code for several target platforms (supporting the execution of the same component on different middleware platforms). This nicely supports potential multi-platform requirements. (Even if you don't have explicit requirements to support several platforms, it may still be useful to be able to do so. For one, it makes future migration to a new platform simpler. Or a second platform may simply be a unit test environment running on a local PC, without the expensive distribution infrastructure of the real system.)

Another important point is that you typically generate in several phases: a first phase uses type definitions (components, data structures, interfaces) to generate the API code so you can write the component implementation code. A second phase generates the glue code and the system configuration code. Consequently, it is often sensible to separate type definitions from system definitions into several different viewpoints (viewpoints were discussed in Section 4.4): these are used at different times in the overall process, and are also often created, modified and processed by different people. In summary, the generated code supports an efficient and technology-independent implementation and hides much of the underlying technological complexity, making development more efficient and less error-prone.
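As a rough illustration of the second generation phase, the following Java sketch shows the kind of bootstrap glue that could be generated to register component implementations with a container; the Container API and all names are stand-ins invented here, not a real middleware API:

import java.util.HashMap;
import java.util.Map;

// Invented stand-in for whatever container/middleware the system targets.
public class GeneratedBootstrap {
    static class Container {
        private final Map<String, Object> components = new HashMap<>();
        void register(String name, Object component) { components.put(name, component); }
        Object lookup(String name) { return components.get(name); }
    }

    static class DelayCalculatorImpl { /* handwritten application logic */ }

    // A second generator phase could emit one registration per component
    // instance found in the system model.
    public static Container boot() {
        Container c = new Container();
        c.register("delayCalculator", new DelayCalculatorImpl());
        return c;
    }

    public static void main(String[] args) {
        System.out.println(boot().lookup("delayCalculator"));
    }
}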

⌅ What Needs to be Documented? I advertise the above approach as a way to formally describe a system's conceptual and application architecture. So, this means it serves as some kind of documentation, right? Right – but it does not mean that you don't have to document anything else. The following things still need to be documented. (There are probably more aspects of an architecture that might be worth documenting, but the following two are the most important ones when working with Architecture DSLs.)

Rationales/Architectural Decisions The DSLs describe what your architecture looks like, but they do not explain why it looks the way it does. You still need to document the rationales for architectural and technological decisions. Note that the grammar of your architecture DSL is a really good baseline for such documentation: each of its constructs is the result of architectural decisions. So, if you explain for each grammar element why it is there (and why other alternatives have not been chosen), you are well on your way to documenting the important architectural decisions. A similar approach can be used for the application architecture, i.e. the programs written with the DSL.


"DSL Engineering" Seite 468


User Guides A language grammar can serve as a well-defined and formal way of capturing an architecture, but it is not a good teaching tool. So you need to create tutorials for your users (i.e., the application programmers) that explain how to use the architecture and the DSL. This includes what and how to model (using your DSL) and also how to generate code and how to use the programming model (how to fill in the implementation code into the generated skeletons).

18.3 Component Models

As I have hinted at already, I think that component-based software architectures are extremely useful. There are many (more or less formal) definitions of what a component is. They range from a building block of software systems, through something with explicitly defined context dependencies, to something that contains business logic and is run inside a container. Our understanding (notice we are not saying we have a real definition) is that a component is the smallest architectural building block. When defining a system's architecture, you typically don't look inside components. Components have to specify all their architecturally relevant properties declaratively (using meta data, or models). As a consequence, components become analyzable and composable by tools. Typically they run inside a container that serves as a framework to act on the runtime-relevant parts of the meta data. The component boundary is the level at which the container can provide technical services such as logging, monitoring or fail-over. The component also provides well-defined APIs to its implementation to address cross-cutting architectural concerns such as locking. (A long time ago I coauthored the book Server Component Patterns. While the EJB-based technology examples are no longer relevant, many of the patterns identified and discussed in the book still are. The book was not written with an MDD/DSL background, but the patterns fit very well with such an approach.) I don't have any specific requirements for what meta data a component actually contains (and hence, which properties are described). The concrete notion of components has to be defined for each (system/platform/product line) architecture separately: this is exactly what we do with the language approach introduced above. Based on my experience, it is almost always useful to start by modeling the component structure of the system to be built. To do that, we start by defining what a component actually is – that is, by defining a meta model for component-based development. Independent of the project's domain, these meta


"DSL Engineering" Seite 469


models are quite similar, with a set of specific variation points. We show parts of these meta models here to give you a head start when defining your own component architecture.

18.3.1 Three Typical Viewpoints

It is useful to look at a component-based system from several viewpoints (Section 4.4). Three viewpoints form the backbone of the description.

⌅ The Type Viewpoint The Type viewpoint describes component types, interfaces and data structures. (Referring back to the terminology introduced in Section 3.3, this viewpoint is independent. It is also sufficient for generating all the code that developers need to implement the application logic.) A component provides a number of interfaces and references a number of required interfaces (often through ports, as in the example above). An interface owns a number of operations, each with a return type, parameters and exceptions. Fig. 18.3 shows this.


Figure 18.3: The types viewpoint describes interfaces, data types and components. These are referred to as types because they can be instantiated (in the composition viewpoint) to make up the actual application.

To describe the data structures with which the components work (the meta model is shown in Fig. 18.4), we start out with the abstract concept Type. We use primitive types as well as complex types. A ComplexType has a number of named and typed attributes. There are two kinds of complex types. Data transfer objects are simple (C-style) structs that are used to exchange data among components. Entities have a unique ID and can be persisted (this is not visible from the meta model). Entities can reference each other and can thus build more complex data graphs. Each reference specifies whether it is navigable in only one or in both directions. A reference also specifies the cardinalities of the entities at the respective ends, and whether the reference has containment semantics.
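As an illustration, generated code for these data structure concepts might look roughly like the following Java sketch; the Flight and Aircraft names are invented, and only the concepts (struct-like data transfer objects, entities with unique IDs and navigable references) come from the meta model:

import java.util.ArrayList;
import java.util.List;

public class DataTypes {
    // Data transfer object: a simple (C-style) struct used to exchange
    // data among components.
    static class FlightDTO {
        String flightNumber;
        int delayMinutes;
    }

    // Entity: has a unique ID, can be persisted, and can reference other
    // entities to build data graphs.
    static class Aircraft {
        final long id;                                   // unique ID
        final List<Flight> flights = new ArrayList<>();  // 1..* containment reference
        Aircraft(long id) { this.id = id; }
    }

    static class Flight {
        final long id;
        Aircraft aircraft;                               // navigable back-reference
        Flight(long id) { this.id = id; }
    }
}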


"DSL Engineering" Seite 470


Figure 18.4: The structure of data types of course depends a lot on the application. A good starting point for data definitions is the well-known entity relationship model, which is essentially represented by this meta model.

⌅ The Composition Viewpoint The Composition viewpoint, illustrated in Fig. 18.5, describes component instances and how they are connected. (This viewpoint is dependent: it depends on the Type viewpoint. It is sufficient for generating stub implementations of the system for unit testing, and for doing dependency analyses and other checks for completeness and consistency.) Using the Type and Composition viewpoints, we can define logical models of applications. A Configuration consists of a number of ComponentInstances, each referencing their type (from the Type viewpoint). An instance has a number of wires (or connectors): a Wire can be seen as an instance of a ComponentInterfaceRequirement. Note the constraints defined in the meta model:


• For each ComponentInterfaceRequirement defined in the instance’s type, we need to supply a wire. • The type of the component instance at the target end of a wire needs to provide the interface to which the wire’s component interface requirement points.
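These two constraints lend themselves to a mechanical check. The following Java sketch shows one possible form of such a check; the model classes are heavily simplified inventions, not the book's meta model:

import java.util.List;
import java.util.Map;

public class WiringCheck {
    // A component type maps requirement names to the interface they expect
    // and lists the interfaces it provides.
    record ComponentType(Map<String, String> requiredInterfaces,
                         List<String> providedInterfaces) {}
    // An instance maps each requirement name to the wired target instance.
    record Instance(ComponentType type, Map<String, Instance> wires) {}

    static void check(Instance instance) {
        instance.type().requiredInterfaces().forEach((req, iface) -> {
            Instance target = instance.wires().get(req);
            if (target == null)                        // constraint 1: a wire per requirement
                throw new IllegalStateException("no wire for requirement " + req);
            if (!target.type().providedInterfaces().contains(iface))
                throw new IllegalStateException(       // constraint 2: target provides interface
                    "wire target does not provide " + iface);
        });
    }
}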

Figure 18.5: The Composition viewpoint describes the composition of a logical system from the component types defined in the Types viewpoint.


"DSL Engineering" Seite 471


⌅ The System Viewpoint The System viewpoint describes the system infrastructure onto which the logical system defined with the two previous viewpoints is deployed (Fig. 18.6), as well as the mapping of the Composition viewpoint onto this execution infrastructure. (In some cases it may make sense to separate the infrastructure definition from the mapping to the Composition viewpoint – for example, if the infrastructure is configured by a separate operations department.)


Figure 18.6: The System viewpoint maps the logical system defined in the Composition viewpoint to a specific hardware and systems software configuration.

A system consists of a number of nodes, each one hosting containers. A container hosts a number of component instances. Note that a container also defines its kind – representing technologies such as OSGi, JEE, Eclipse or Spring. Based on this data, together with the data in the Composition viewpoint, you can generate the necessary "glue" code to run the components in that kind of container, including container and remote communication configuration code, as well as scripts to package and deploy the artifacts for each container.

You may have observed that the dependencies among the viewpoints are well-structured. Since you want to be able to define several compositions using the same components and interfaces, and since you want to be able to run the same compositions on several infrastructures, dependencies are only legal in the directions shown in figure 18.7.

Figure 18.7: Viewpoint dependencies are set up so that the same components (Type viewpoint) can be assembled into several logical applications (Composition viewpoint), and each of those can be mapped to several execution infrastructures (System viewpoint).

18.3.2 Aspect Models

The three viewpoints described above are a good starting point for modeling and building component-based systems. However, in many cases these three models are not enough. Additional aspects of the system have to be described using specific

"DSL Engineering" Seite 472


aspect models that are arranged "around" the three core viewpoint models. The following aspects are typically handled in separate aspect models:

• Persistence
• Authorization and Authentication (for enterprise systems)
• Forms, layout, page flow (for Web applications)
• Timing, scheduling and other quality of service aspects (especially in embedded systems)
• Packaging and deployment
• Diagnostics and monitoring

The idea of aspect models is that the information is not added to the three core viewpoints, but rather is described using a separate model with a suitable concrete syntax. Again, the meta model dependencies are important: the aspects may depend on the core viewpoint models and maybe even on one another, but the core viewpoints must not depend on any of the aspect models. Figure 18.8 illustrates a simplified persistence aspect meta model.

Figure 18.8: An example meta model for a persistence aspect. It maps data defined in the Type viewpoint to tables and columns.
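The dependency direction can be made concrete with a small sketch: the persistence aspect model references core Type viewpoint elements, while the core element remains persistence-agnostic. All names in this Java fragment are invented:

import java.util.Map;

public class PersistenceAspect {
    // Core Type viewpoint element: knows nothing about persistence.
    record Entity(String name) {}

    // Aspect model element: references the core element, never the other way round.
    record EntityMapping(Entity entity, String tableName,
                         Map<String, String> attributeToColumn) {}

    public static void main(String[] args) {
        Entity aircraft = new Entity("Aircraft");
        EntityMapping m = new EntityMapping(aircraft, "T_AIRCRAFT",
                Map.of("id", "ID", "type", "AC_TYPE"));
        System.out.println(m.entity().name() + " -> " + m.tableName());
    }
}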

18.3.3 Variations

The meta models described above cannot be used in exactly this way in every project. Also, in many cases the notion of what constitutes a Component needs to be adapted or extended. As a consequence, there are many variations of these meta models. In this section we discuss a few of them – consider them as an inspiration for your own case; in the end, the actual system architecture drives these meta models.


"DSL Engineering" Seite 473


⌅ Messaging Instead of operations and their typical call/block semantics, you may want to use messages, together with the well-known message interaction patterns. The example system in this chapter used messaging.

⌅ No Interfaces Operations could be added directly to the components. As a consequence, of course, you cannot reuse the interfaces' "contracts" separately, independently of the supplier or consumer components, and you cannot implement interface polymorphism.

⌅ Component Kinds Often you'll need different kinds of components, such as domain components, data access components, process components or business rules components. Depending on this component classification, you can define the valid dependency structures between components (e.g., a domain component may access a data access component, but not the other way round). You will typically also use different ways of implementing component functionality, depending on the component kind.

⌅ Layers Another way of managing dependencies is to mark each component with a layer tag, such as domain, service, GUI or facade, and define constraints on how components in these layers may depend on each other.


⌅ Configuration Parameters A component might have a number of configuration parameters – comparable to command line arguments in console programs – that help configure the behavior of components. The parameters and their types are defined in the type model, and values for the parameters can be specified later, for example in the models for the Composition or the System viewpoints.

⌅ Component Characteristics You may want to express whether a component is stateless or stateful, whether it is thread-safe or not, and what its lifecycle should look like (for example, whether it is passive or active, or whether it wants to be notified of lifecycle events such as activation, and so on).

⌅ Asynchronicity Even if you use operations (and not messaging), it is not always enough to use simple synchronous communication. Instead, one of the various asynchronous communication paradigms, such as those described in the Remoting Patterns book (M. Voelter, M. Kircher, and U. Zdun. Remoting Patterns. Wiley, 2004), might be applicable. Because using these


"DSL Engineering" Seite 474


paradigms affects the APIs of the components, the pattern to be used has to be marked up in the model for the Type viewpoint, as shown in Fig. 18.9; it is not enough to define it in the Composition viewpoint. (If your application uses messaging interfaces instead of the RPC-style interfaces discussed in this section, an API can be designed that remains independent of the communication paradigm.)


Figure 18.9: A ComponentInterfaceRequirement may describe which communication paradigm should be applied.

⌅ Events In addition to the communication through interfaces, you might need (asynchronous) events using a static or dynamic publisher/subscriber infrastructure. (The "direction of flow" of these events is typically the opposite of the dependencies discussed above: this allows them to be used for notifications or callbacks.)

⌅ Dynamic Connection The Composition viewpoint connects component instances statically. This is not always feasible. If dynamic wiring is necessary, the best way is to embed the information that determines which instance to connect to at runtime into the static wiring model. So, instead of specifying in the model that instance A must be wired to instance B, the model only specifies that A needs to connect to a component with the following properties: it needs to provide a specific interface, and, for example, offer a certain reliability. At runtime, the wire is "dereferenced" to a suitable instance using an instance repository. (We saw this in the example system described above.)
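A minimal Java sketch of such an instance repository follows; the classes and the property-matching scheme are inventions that illustrate just one of many possible designs:

import java.util.List;
import java.util.Map;
import java.util.Optional;

public class InstanceRepository {
    record Registration(String providedInterface, Map<String, String> properties,
                        Object instance) {}

    private final List<Registration> registrations;
    InstanceRepository(List<Registration> registrations) { this.registrations = registrations; }

    // "Dereference" a dynamic wire: find an instance that provides the
    // requested interface and satisfies all required properties.
    Optional<Object> resolve(String iface, Map<String, String> requiredProps) {
        return registrations.stream()
            .filter(r -> r.providedInterface().equals(iface))
            .filter(r -> r.properties().entrySet().containsAll(requiredProps.entrySet()))
            .map(Registration::instance)
            .findFirst();
    }

    public static void main(String[] args) {
        InstanceRepository repo = new InstanceRepository(List.of(
            new Registration("IDelayCalculator", Map.of("reliability", "high"), new Object())));
        System.out.println(repo.resolve("IDelayCalculator",
            Map.of("reliability", "high")).isPresent()); // true
    }
}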

40

The "direction of flow" of these events is typically the opposite of the dependencies discussed above: this allows them to be used for notifications or callbacks.

41

We have seen this in the example system described above.

⌅ Hierarchical Components Hierarchical components, as illustrated in figure 18.10, are a very powerful tool. Here a component is structured internally as a composition of other component instances. This allows a recursive and hierarchical decomposition of a system to be supported. Ports define how components may be connected: a port has an optional protocol definition that allows for port compatibility checks that go beyond simple interface equality. While this approach is powerful, it is also non-trivial, since it blurs the formerly clear distinction between the Type and Composition viewpoints.

⌅ Structuring Finally, it is often necessary to provide additional means of structuring complex systems. The terms busi-


"DSL Engineering" Seite 475

The BRICS Component Model: A Model-Based Development Paradigm For Complex Robotics Software Systems

Herman Bruyninckx (Katholieke Universiteit Leuven, Leuven, Belgium; Herman.Bruyninckx@mech.kuleuven.be)
Markus Klotzbücher (Katholieke Universiteit Leuven, Leuven, Belgium; markus.klotzbuecher@mech.kuleuven.be)
Nico Hochgeschwender (Bonn-Rhein-Sieg University of Applied Sciences, Germany; nico.hochgeschwender@h-brs.de)
Gerhard Kraetzschmar (Bonn-Rhein-Sieg University of Applied Sciences, Germany; gerhard.kraetzschmar@h-brs.de)
Luca Gherardi (University of Bergamo, Bergamo, Italy)
Davide Brugali (University of Bergamo, Bergamo, Italy)

ABSTRACT

Because robotic systems get more complex all the time, developers around the world have, during the last decade, created component-based software frameworks (Orocos, OpenRTM, ROS, OPRoS, SmartSoft) to support the development and reuse of "large grained" pieces of robotics software. This paper introduces the BRICS Component Model (BCM) to provide robotics developers with a set of guidelines, meta-models and tools for structuring as much as possible the development of both individual components and component-based architectures, using one or more of the aforementioned software frameworks at the same time, without introducing any framework- or application-specific details. The BCM is built upon two complementary paradigms: the "5Cs" (separation of concerns between the development aspects of Computation, Communication, Coordination, Configuration and Composition) and the meta-modeling approach from Model-Driven Engineering.

Categories and Subject Descriptors: H.4 [Information Systems Applications]: Miscellaneous; D.2.11 [Software Architectures]: Domain-specific architectures

General Terms: Design

Keywords: Software Architectures, Reusable Software, Robotics, Component Models

1. INTRODUCTION

Software development and integration in robotics is a challenging exercise, which is prone to errors. Today's robotic systems are getting more complex, requiring the integration of motion controllers on many motion degrees of freedom, multiple sensors distributed over the robot's body and embedded in its environment, and planners and reasoners for ever more complex tasks. Moreover, the need to integrate functional modules into a complete robot control architecture with quality-of-service demands (e.g. real-time constraints) is a further factor that increases the overall complexity. To tackle the aforementioned challenges, robotics system integrators and developers created component-based software frameworks [4, 31, 6] (such as Orocos [8, 7], OpenRTM [1], ROS [26], OPRoS [19], GenoM [14, 20], SmartSoft [28], and Proteus [16]) to support software development and to promote reuse of "large grained" pieces of robotics software. Generally speaking, component-based development complements the more traditional object-oriented programming in roughly the following ways: it focuses on runtime composition of software (and not compile-time linking), it allows multiple programming languages, operating systems and communication middlewares to be used in the same system, and it adds events and data streams as first-class interaction primitives in addition to method calls. Further, the core idea is to use components as building blocks (possibly provided by different "vendors") and to design application architectures by composing components. Even though the component-based approach fosters the structured development of components, the composition of robot control architectures out of components provided by independent suppliers remains cumbersome. Currently, the ad-hoc development typically leads to components that are hard to compose and to reuse, as they violate basic software engineering principles such as separation of concerns (e.g. in ROS, a ROS component, also called a node, is responsible for retrieving its own configuration from the parameter server, hence the functionality becomes dependent on the


"The BRICS Component Model: A Model-Based Development Paradi…" Seite 1

latter mechanism). Further, even though the mentioned software frameworks are based on the same or similar concepts (see [29] for a survey), most of them (except for OpenRTM) do not provide an explicit and formalized component model, as required to make the step from code-driven to model-driven software development in robotics. Such a component model would allow one to describe the primitives available within a software framework, and the composition rules with which the primitives can be combined into larger components or even complete applications. The model-free approach prevents interoperability between software frameworks. For instance, it is very difficult to reuse a component developed in Orocos in an OpenRTM application, because the terminology in the two software frameworks obfuscates the similarities. Even the simplest possible data-flow bridge between ROS and Orocos required substantial manual work [30]. In general, the model-driven development paradigm is not very popular in robotics. One possible explanation of the slow adoption of this approach in robotics is that mainstream developers find "modeling" too time-consuming, too abstract, and too long-term to be of immediate use. Furthermore, a major hurdle in every model-based initiative is to provide enough added value in an MDE (model-driven engineering) toolchain, such that developers can adopt MDE in their daily workflow. In this paper we introduce the BRICS Component Model (BCM) – named after the European research project BRICS, Best Practice in Robotics (Grant number FP7-231940), in which it is developed – to provide developers with as much structure in their development process as is possible without going into any application-specific details. This BCM structure is, on the one hand, rich enough to make a real difference and, on the other hand, generic enough to be applicable in all of the above-mentioned robotics software frameworks. The BCM itself is a design paradigm, in that it introduces a methodology. More precisely, we adopt the OMG's meta-modeling paradigm and introduce platform-independent meta-models with a particular emphasis on separation of concerns. Furthermore, we put much emphasis on the complementary roles of (i) advocating the importance of introducing formal models into the robotics development process (instead of the mostly code-only software frameworks that are now popular), and (ii) well-documented "best practices" that solve particular use cases. The medium-term ambition of the BCM developments is that both together will suffice to allow developers to first model their components and systems (hopefully starting from a rich model repository with "best practice" sub-systems and components), and then generate code from those models, using the best available implementations in available software frameworks.

Overview of the paper. Section 2 reports on related work in the context of component models and meta-model-driven engineering. Section 3 explains how meta-modeling can be introduced in robotics, with the envisaged balance between concrete formal structure and simplicity. Section 4 introduces the second important part of the BCM, the so-called "5Cs", which extend the well-known paradigm of the separation of the 4 concerns [27]. In section 5 we discuss a use case which applies the concepts introduced in this paper. Finally, in section 6 we draw the relevant conclusions.

2. RELATED WORK

We discuss related work in the field of component models in robotics and beyond, as well as model-driven engineering in general.

2.1 Component Models

When the robotics community became aware that it makes sense to spend time developing reusable software frameworks (this was around 2000, with GenoM [14], Player-Stage [15], Miro [32], Orocos [8] and SmartSoft [28] as pioneers), the CORBA Component Model (CCM) was a large source of inspiration. However, the advantages of its component model rather quickly lost the battle against the huge learning curve and massiveness of the CORBA standard: it was considered to be way too heavy and all-encompassing for the needs of the robotics community. So, the CCM was not able to establish itself in mainstream robotics, except for the (at least in the Western world) less popular OpenRTM, OPRoS, and SmartSoft, and, to a smaller extent, Orocos, which has had industrial users since its inception. Nevertheless, the most recent "Lightweight" version of the standard [25] has a focus on real-time and embedded systems, and would fit the current robotics needs and the mindset in the community a lot better. ROS, however, has conquered most of the community via its low entry threshold, yet of all the robotics software frameworks mentioned in this paper it is the least formalized. The BCM introduced in this paper shares parts of its structural model with that of the CCM (components, interacting via methods, data streams and/or events, and composed via ports). The BCM shares this context and goals with many other engineering domains that require lots of online data processing. For example, some large application domains outside of robotics have also seen the need for component-based software frameworks [11]: automotive (with AUTOSAR [9] as the primary "component model" standard, and AADL [13] as a pioneering modeling language for "computational resources"), aerospace (driving UML-based evolutions such as MARTE [23]), embedded systems [21, 17], and service component architectures in web-based systems [22]. Of the most popular robotics software frameworks in the "Western" robotics community, neither ROS nor Orocos has an explicit and formal component model, in contrast to OpenRTM ("Japan") and OPRoS ("Korea"), which both use Eclipse [12] as a programming tool (but only to a limited extent as a model-driven engineering tool). However, the component models of OpenRTM and OPRoS are also semantically poorer and less explicit than the BCM's separation of concerns, as explained later.

2.2 Model-Driven Engineering

As its major short-term ambition, the BCM wants to introduce into the robotics community the mindset that it is worthwhile to provide fully formal models for all of the system constituents, as well as an explicit model-driven engineering development process, with support for models and development workflows in large-scale toolchain ecosystems such as Eclipse. The motivation for this approach is the success that model-driven engineering [2, 3] has seen in domains where industry (and not academia) is driving the large-scale software development: the paradigmatic belief is that complex systems can only be developed in a maintain-



"The BRICS Component Model: A Model-Based Development Paradi…" Seite 2

able and deterministic way if they are first modeled, analyzed and verified abstractly, and only then the code (in a concrete programming language) is generated. The robotics domain is not that far yet, mostly because of the attitude of software developers that they can produce code faster and better in their favorite programming language than via the "detour" of formal models. This observation is, in practice and in the short term, very often a valid one, since a mature toolchain is necessary to attain the benefits of model-driven engineering, yet significantly less effort is put into building such tools compared to, for example, the average programming language compiler. (Hence, the BRICS project is working on an Eclipse-based toolchain, BRIDE [10]. Deeper discussions about this toolchain are beyond the scope of this paper.) But as soon as the critical developments of (i) model definitions, (ii) model-to-text code generation, and (iii) MDE toolchain support for all phases of the development process (including the currently poorly supported phases of deployment and runtime reconfiguration!) are realized, the overall development process is expected to speed up by an order of magnitude.

The paradigm of meta-modeling, on which our approach is based (as well as the four-level pyramid represented in figure 1), has its origin in the Model Driven Engineering (MDE) domain of software engineering, aiming primarily at improving the process of generating code from abstract models that describe a domain. The MDE terminology for going from a higher to a lower level of specificity or detail in the knowledge of a domain is: from platform-independent to platform-specific, by adding the platform knowledge. The Object Management Group [24] is the main driver of standardization in this context of Model Driven Engineering, for which it uses the trademarked name Model Driven Architecture. A huge software effort is being realized in the Eclipse project ecosystem (http://www.eclipse.org) to support all aspects of MDE. The four levels of model abstraction defined by OMG have, in this paper's context, the following meaning:

M3: the highest level of abstraction, that is, the model that represents all the constraints, or restrictions, that a model has to satisfy, without encoding any specific knowledge of a domain. The Eclipse Modeling Framework consortium (www.eclipse.org/emf) has standardized the M3 level via its ECore meta-meta-model language.

M2: the level of the so-called (in MDE speak) platform-independent representation of a domain, introducing models with domain-specific names and relationships (conforming to the M3-level constraints).

M1: the level of a so-called platform-specific model. That is, a concrete model of a concrete robotic system, but without using a specific programming language.

M0: the level of an implementation of a concrete robotic system, using software frameworks and libraries in particular programming languages.

3. THE BRICS META-MODEL

The general meta-modeling ideas of OMG are applied to the domain of robotics as illustrated in figure 1:

M3: The Meta-Meta-Model. On this level we used the ECore meta-meta-model provided by the Eclipse Modeling Framework (EMF).

M2: The Meta-Models. On this level we introduced an abstract model and a set of specializations of it. The abstract model is called Component-Port-Connector (CPC). This model is universally present (but most often only implicitly) in all robotics software designs, not just for modeling component-based systems but also in many functionality libraries (control systems, Bayesian networks, Bond Graphs, etc.). This meta-model is not unique to robotics but holds in many engineering system contexts where sub-systems interact with each other through well-defined interaction points ("ports"). The CPC is specialized by the component models of robotic software frameworks such as ROS and Orocos, which represent the software-framework-specific meta-models.

M1: The concrete Models. This level represents concrete models of concrete robotic systems. The important difference with M3 and M2 is that, at the M1 level, toolchain support is very important to help developers in the complex tasks of concrete system design or component development. A major design issue to improve "user friendliness" is the ability of such a toolchain to provide the above-mentioned models with terminology that is familiar to the robotics developer community. The BRICS project supports M1 via its BRIDE toolchain, which provides a tool for designing M1 models and a set of model-to-model transformations for transforming a model conforming to the CPC meta-model into a model conforming to the Orocos or ROS meta-models. The meta-models and the graphical tool are implemented using EMF and GMF, while the M2M and M2T transformations are implemented by means of Epsilon.

Figure 1: The figure shows how the OMG 4 levels of abstraction are applied in the BCM approach. [M3: the Ecore meta-meta-model; M2: the abstract CPC component meta-model, specialized by framework-specific meta-models such as the RTT/Orocos component meta-models; M1: robot software system models conforming to these; M0: real-world systems as instances of the M1 models.]

The following subsections give more details about the core contributions of this paper, namely the abstract level M2.

3.1 The CPC Meta-Model


"The BRICS Component Model: A Model-Based Development Paradi…" Seite 3

This is the simplest, most abstract model introduced in this paper. The Component-Port-Connector meta-model (CPC) is represented in figure 2 and is made of the following parts.

• Modeling primitives: A system is a collection of components and connectors. Components provide the containers of functionality and behaviour, while the Ports give localized and protected access to the Components' internals. Components can be configured by means of Properties.

• Composition rules: Connections represent the interactions between the functionalities and behaviours in two Components, as far as accessible through the Components' Ports.

• Constraints: not all compositions make sense, e.g. connecting three Ports to each other directly. Here is a (not yet exhaustive) list of composition constraints. In our meta-model these constraints are expressed as OCL rules.

  – A Component contains zero or more Components.
  – A Component can be contained in only one Component (different from itself).
  – A Component contains zero or more Ports.
  – A Port belongs to one and only one Component.
  – A Connection is always between two Ports within the same composite Component.
  – Connections form a graph, with Ports always in between Components and Connections.
  – A Port can be promoted to a composite port, in order to make it accessible to external components. In the same way, properties can be promoted to composite properties.

Figure 2: The UML diagram representing the CPC meta-model. [A System contains Components and Connectors; a Component has 0…* subComponents, 0…* Ports and 0…* Properties; each Connector connects two Ports (portA, portB).]

4. SEPARATION OF CONCERNS: THE 5CS

In addition to the model-driven approach presented in the previous section, BRICS also promotes the separation of concerns. In order to better apply to the robotics domain, the well-known 4 Concerns (4Cs) presented in the seminal work [27] have been extended by separating the original "Configuration" idea into the more fine-grained "Composition" and "Configuration" aspects. Figure 3 gives an overview of how the "5Cs" are defined in the context of the BRICS Component Model (BCM). A more detailed vision of the revised "5Cs" is given below.

• Computation: this is the core of a system's functionality, and it implements (in hardware and/or software) the domain knowledge that provides the real added value of the overall system. This typically requires "read" and "write" access to data from sources outside of the component, as well as some form of synchronization between the computational activities in multiple components. (What functionality is computed?)

• Communication: this brings data towards the computational components that require it, with the right quality of service (i.e. time, bandwidth, latency, accuracy, priority, etc). (How are results of computations being communicated?)

• Configuration: this functionality allows users of the Computation and Communication functionalities to influence the latters' behaviour and performance, by giving concrete values to the provided configuration parameters (e.g. tuning control or estimation gains, determining Communication channels and their inter-component interaction policies, providing hardware and software resources and taking care of their appropriate allocation, etc). (What parameters define the behaviour of all components?)

• Coordination: this functionality determines how all components in the system should work together; that is, in each state of the coordinating finite state machine, one particular behaviour is configured to be active in each of the components. In other words, Coordination provides the discrete behaviour of a component or a system. (When must components change their behaviour?)

• Composition: while the four "Cs" explained above are all motivated by the desire to decouple the design concerns as much as possible (while avoiding the introduction of too many separate concerns), the design concern of Composition models the coupling that is always required between components in a particular system. Roughly speaking, a good Composition design provides a motivated trade-off between (i) composability and (ii) compositionality. The former is the property of individual components of being optimally ready to be reused under composition; the latter is the property of a system to have predictable behaviour as soon as the behaviours of its constituent components are known. (How are computations, communications, configurations, and coordinations interacting?)

Figure 3: Overview of the 5 concerns. [The figure associates one question with each concern, as given in parentheses above.]
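Referring back to the CPC composition rules of section 3.1: the paper expresses them as OCL constraints over an Ecore meta-model, but their intent can be illustrated with a plain-Java sketch like the following (an invented rendering, not the BRICS implementation; the same-composite check also ignores promoted ports for simplicity):

import java.util.ArrayList;
import java.util.List;

public class Cpc {
    static class Port {
        final Component owner;                 // a Port belongs to exactly one Component
        Port(Component owner) { this.owner = owner; }
    }

    static class Component {
        Component parent;                      // contained in at most one Component
        final List<Component> subComponents = new ArrayList<>();
        final List<Port> ports = new ArrayList<>();

        void add(Component child) {
            // rules: at most one container, and a Component never contains itself
            if (child.parent != null || child == this)
                throw new IllegalArgumentException("violates containment rules");
            child.parent = this;
            subComponents.add(child);
        }
    }

    static class Connection {
        final Port a, b;                       // always between exactly two Ports
        Connection(Port a, Port b) {
            // rule: both Ports must live within the same composite Component
            if (a.owner.parent != b.owner.parent)
                throw new IllegalArgumentException("ports not in the same composite");
            this.a = a; this.b = b;
        }
    }
}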


"The BRICS Component Model: A Model-Based Development Paradi…" Seite 4

5. CASE STUDY

In this section we present a case study that shows how the BRICS Component Model and the separation of concerns have been applied in order to design an architecture for a robot navigation application. Robot navigation is the ability of a mobile robot to autonomously navigate from its current position towards a goal position, while avoiding dangerous situations such as collisions with obstacles. This ability involves several functionalities, such as the management of the environment representation (i.e. Map Server), the computation of the optimal path (i.e. Path Planning), and the capacity of avoiding moving obstacles and controlling the physical robot (i.e. Local Navigation). Figures 4 and 5 show how the components of this application are connected and how the architecture is organized as a composition hierarchy. The architecture is modeled using the CPC meta-model and for this reason it is software-framework independent. In more detail, the model depicted in figure 4 shows the top-level composite component (the Robot Navigation), while the model depicted in figure 5 shows the Map Based Navigation composite component, which corresponds to the homonymous component used in figure 4.

Figure 4: The Robot Navigation composite component modeled with CPC. [It contains the Map Based Navigation and Local Navigation components plus a Coordinator; goal and map are inputs, a path flows from Map Based Navigation to Local Navigation, and the Coordinator interacts with both components via events.]

Figure 5: The Map Based Navigation composite component modeled with CPC. [It contains the Map Server and Path Planner components plus a Coordinator; goal is the input and path the output, a getMap service links Path Planner to Map Server, and maxIterations and map are promoted properties.]

Components are represented by means of boxes and their ports are depicted with different shapes. Squared connectors represent data flow ports, lollipop-socket connectors represent provided and required services, and triangular connectors represent event communication. Finally, AND-style shapes represent component properties. Below we describe how the 5Cs have been decoupled in the design of the depicted components.

Computation. Computational components are concerned with the data processing algorithms required by an application [27]. The components' computation is triggered by exchanging information (e.g. algorithm input parameters) by means of data-flow ports and service calls. In the case study the top-level composite component contains two computational components: the Local Navigation and the Map Based Navigation. The first one receives a path as input and is in charge of driving the robot along that path while avoiding dynamic obstacles. The second component is itself a composite. It receives as input a geometrical goal and produces as output a geometrical, obstacle-free path. This composite component contains two leaf components: the Map Server, which provides the map of the environment by means of a service, and the Path Planner, which is in charge of computing the path.

Configuration. Configuration determines how components execute their functionality. These aspects can be configured by setting the values of a set of parameters defined in the component's implementation. Parameters can be accessed by means of a mechanism called properties, which can be modified at deployment time by a dedicated infrastructure provided with the software framework. This infrastructure is in charge of deploying the system and configuring the components. In our case study the Path Planner component provides two properties: maxIterations, which defines the maximum number of iterations allowed for the path planning algorithm, and map, which defines which map has to be used.

Coordination. Coordination components are in charge of starting and stopping the computational components of a composite. They also define how computational components work together. Each composite component has to contain one and only one coordination component, which interacts with other components by sending and receiving only events. Coordination components are typically implemented as state machines; [5] and [18] present how this is possible by using the Abstract State Machines and the rFSM, respectively. In the case study the upper-level coordinator is in charge of coordinating the activities of Map Based Navigation and Local Navigation, while the lower-level coordinator coordinates the Map Server and the Path Planner.

Communication. Communication deals with the exchange of data [27]. In the case study we have defined which mechanism we use for each connection between components (e.g. data flow, service call, events). The use of a specific software framework allows the use of different implementations for the same communication mechanism. Furthermore, the most common software frameworks allow us to define the communication policy (e.g. SCA binding, Orocos Connection Policy), which defines, for example, the locking mechanism or the buffer size.

Composition. Computational and coordination components, including their configuration and communication, can


"The BRICS Component Model: A Model-Based Development Paradi…" Seite 5

be hierarchically organized in composites, which can then be reused as components for creating more abstract composites. As well as standard components, composite components also provide ports and properties. Their ports and properties belong to internal components and are promoted in order to become part of the composite interface (in the figures, port and property promotions are depicted by means of dashed lines). External components can communicate with internal components only by means of ports defined in the composite interface. Also in the case of composite components, data flow ports and services are used for computation, events for coordination and properties for configuration. In the case study the composition is used for providing the Robot Navigation and the Local Navigation components.

5.1 From Software Framework Independent to Software Framework Specific

Thanks to the abstraction level offered by CPC, the architecture model presented above doesn't contain any software-framework-specific elements. On the one hand this is desirable because it improves flexibility; on the other hand, such a model cannot be implemented on a real system. However, thanks to a set of model-to-model transformations, it is possible to transform a model conforming to CPC into a new model which conforms to the component model of a specific software framework. Once the transformation has been applied, the developer is in charge of completing the model by filling it with a set of information that is specific to the desired software framework. For example, the Robot Navigation model can be automatically transformed into an Orocos-specific model. This new model will require some additional information about port directionality, connection policies and component implementations, which can be inserted by the developer by means of the property views provided by our tool (BRIDE). Figure 6 depicts a screenshot of BRIDE, which supports both the design of M1 models conforming to CPC, Orocos and ROS, and the execution of M2M transformations of these models among each other. The depicted model is the result of an M2M transformation (from CPC to Orocos) applied to the model of the Robot Navigation described above. Components are represented by boxes, which contain properties and operations. Data-flow ports and event ports are represented by blue and yellow boxes (blue boxes are input ports, while yellow boxes are output ports). Each port reports its name on the top and its data type on the bottom. Event ports can be distinguished from data-flow ports thanks to a flag, which is visible in the property view.

Figure 6: The Map Based Navigation composite component modeled with Orocos.
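The shape of such a model-to-model transformation can be sketched as follows. BRIDE implements its transformations with Epsilon; this plain-Java fragment with invented CPC and Orocos model classes merely illustrates the mapping idea, including a direction attribute deliberately left "UNSET" for the developer to fill in:

import java.util.List;

public class CpcToOrocos {
    record CpcPort(String name, String dataType) {}
    record CpcComponent(String name, List<CpcPort> ports) {}

    // Orocos-specific model: ports additionally carry a direction,
    // which the developer supplies after the transformation.
    record OrocosPort(String name, String dataType, String direction) {}
    record OrocosTaskContext(String name, List<OrocosPort> ports) {}

    static OrocosTaskContext transform(CpcComponent c) {
        return new OrocosTaskContext(c.name(),
            c.ports().stream()
             .map(p -> new OrocosPort(p.name(), p.dataType(), "UNSET"))
             .toList());
    }
}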

6. CONCLUSIONS

The ambition of the BRICS Component Model is to introduce models as first-class citizens into the robotics software development process. The major reasons are that (i) providing explicit and formal models in a domain helps developers to structure their design efforts and outcomes, and (ii) code in concrete programming languages can then be generated from the models via a (semi)automatic toolchain ("model-to-text" transformations, in MDE speak), instead of being implemented manually. So, while code generation from models is essential to the ambition of the BCM, its current state of practice is still rather elementary in this context. However, the concepts of the BCM have already been tried on more than one hundred developers (PhD students, and academic and industrial robot software engineers) on various occasions (research camps organized by the BRICS project, or targeted dissemination workshops with selected developers), and the outcome is always positive: these developers quickly get concrete "best practices" from the BCM paradigm, even without full toolchain support or standardization of the "5C" models. Our ambition, of course, is that developers adopt the tools developed in BRICS to design and develop future robotic systems, such that the impact becomes measurable. Currently, the BCM concepts can be applied successfully in the design and development of new components in existing "non-BCM-based" software frameworks such as ROS or Orocos, by developers who are sufficiently disciplined to map the BCM and its best practices onto the available component primitives of those software frameworks. The other way around, however, will most often fail; that is, it is typically impossible to make a 5C model out of an existing ROS or Orocos system, because those software frameworks do not yet support their users in applying the 5Cs in a systematic way.

7. ACKNOWLEDGMENTS

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. FP7-ICT-231940-BRICS (Best Practice in Robotics). The authors would like to thank all the partners of the BRICS project for their valuable comments.

8. ADDITIONAL AUTHORS

Additional authors: Azamat Shakhimardanov, Jan Paulus, Michael Reckhaus (Bonn-Rhein-Sieg University of Applied Sciences), Hugo Garcia (Katholieke Universiteit Leuven), Davide Faconti (Katholieke Universiteit Leuven) and Peter Soetens (Intermodalics).

9. REFERENCES

[1] N. Ando, T. Suehiro, and T. Kotoku. A software platform for component based rt-system development: Openrtm-aist. In Proceedings of the 1st International Conference on Simulation, Modeling, and Programming for Autonomous Robots, SIMPAR '08, pages 87–98, Berlin, Heidelberg, 2008. Springer-Verlag.

"The BRICS Component Model: A Model-Based Development Paradi…" Seite 6

[2] C. Atkinson and T. Kühne. Model-driven development: a metamodeling foundation. IEEE Software, 20(5):36–41, 2003.
[3] J. Bézivin. On the unification power of models. Software and Systems Modeling, 4(2):171–188, 2005.
[4] A. Brooks, T. Kaupp, A. Makarenko, A. Orebäck, and S. Williams. Towards component based robotics. In IEEE/RSJ Int. Conf. Intelligent Robots and Systems, pages 163–168, 2005.
[5] D. Brugali, L. Gherardi, E. Riccobene, and P. Scandurra. A formal framework for coordinated simulation of heterogeneous service-oriented applications. In 8th International Symposium on Formal Aspects of Component Software (FACS), 2011.
[6] D. Brugali and A. Shakhimardanov. Component-based robotic engineering part ii: Systems and models. IEEE Robotics and Automation Magazine, 17(1):100–112, 2010.
[7] H. Bruyninckx. Open RObot COntrol Software. http://www.orocos.org/, 2001.
[8] H. Bruyninckx, P. Soetens, and B. Koninckx. The real-time motion control core of the Orocos project. In Proceedings of the International Conference on Robotics and Automation (ICRA), 2003.
[9] AUTOSAR Consortium. AUTOSAR—AUTomotive Open System ARchitecture. http://www.autosar.org, 2003.
[10] BRICS Consortium. BRIDE—the BRICS Development Environment. http://www.best-of-robotics.org/bride/, 2012.
[11] I. Crnković, S. Sentilles, A. Vulgarakis, and M. R. V. Chaudron. A classification framework for software component models. IEEE Trans. Software Engineering, 37(5):593–615, 2011.
[12] Eclipse Foundation. The Eclipse Integrated Development Environment. http://www.eclipse.org.
[13] P. H. Feiler, D. P. Gluch, and J. J. Hudak. The architecture analysis & design language (AADL): An introduction. Technical Report CMU/SEI-2006-TN-011, Software Engineering Institute, Carnegie Mellon University, 2006.
[14] S. Fleury, M. Herrb, and R. Chatila. GenoM: A tool for the specification and the implementation of operating modules in a distributed robot architecture. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), pages 842–848, 1997.
[15] B. Gerkey, R. Vaughan, A. Howard, and N. Koenig. The Player/Stage project. http://playerstage.sourceforge.net/, 2001.
[16] Groupe de Recherche en Robotique. Proteus: Platform for RObotic modeling and Transformations for End-Users and Scientific communities. http://www.anr-proteus.fr/.
[17] P. Hošek, T. Pop, T. Bureš, P. Hnětynka, and M. Malohlava. Comparison of component frameworks for real-time embedded systems. In 13th International SIGSOFT Symposium on Component-Based Software Engineering (CBSE 2010), pages 21–36, 2010.
[18] M. Klotzbücher, P. Soetens, and H. Bruyninckx. Orocos RTT-Lua: an execution environment for building real-time robotic domain specific languages. In International Workshop on Dynamic Languages for RObotic and Sensors, pages 284–289, 2010.
[19] Korean Institute for Advanced Intelligent Systems. OPRoS. http://opros.or.kr/.
[20] A. Mallet, C. Pasteur, M. Herrb, S. Lemaignan, and F. Ingrand. GenoM3: Building middleware-independent robotic components. In IEEE Int. Conf. Robotics and Automation, pages 4627–4632, 2010.
[21] R. Mirandola and F. Plášil. CoCoTA—Common Component Task. In A. Rausch, R. Reussner, R. Mirandola, and F. Plášil, editors, The Common Component Modeling Example: Comparing Software Component Models, volume 5153 of Lecture Notes in Computer Science, pages 4–15. Springer-Verlag, 2008.
[22] OASIS. Service Component Architecture. http://www.oasis-opencsa.org/sca.
[23] Object Management Group. Modeling and Analysis of Real-time and Embedded systems (MARTE). http://www.omgmarte.org.
[24] Object Management Group. OMG. http://www.omg.org.
[25] Object Management Group. CORBA: DDS for Lightweight CCM. http://www.omg.org/spec/dds4ccm/.
[26] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Y. Ng. ROS: an open-source Robot Operating System. In Proceedings of the Workshop on Open Source Software held at the International Conference on Robotics and Automation (ICRA), 2009.
[27] M. Radestock and S. Eisenbach. Coordination in evolving systems. In Trends in Distributed Systems: CORBA and Beyond, pages 162–176. Springer-Verlag, 1996.
[28] C. Schlegel and R. Wörz. The software framework SmartSoft for implementing sensorimotor systems. In IEEE/RSJ Int. Conf. Intelligent Robots and Systems, pages 1610–1616, 1999.
[29] A. Shakhimardanov, N. Hochgeschwender, and G. K. Kraetzschmar. Component models in robotics software. In Proceedings of the Workshop on Performance Metrics for Intelligent Systems, Baltimore, USA, 2010.
[30] R. Smits and H. Bruyninckx. Composition of complex robot applications via data flow integration. In ICRA, pages 5576–5580. IEEE, 2011.
[31] C. Szyperski. Component Software: Beyond Object-Oriented Programming. Addison-Wesley, 1998.
[32] H. Utz, S. Sablatnög, S. Enderle, and G. Kraetzschmar. Miro—Middleware for mobile robot applications. IEEE Trans. Robotics and Automation, 18(4):493–497, 2002.

Reader Seite 88

"The BRICS Component Model: A Model-Based Development Paradi…" Seite 7

Topic 5: Coordination - Source 1: A-Practical-Guide-to-SysML-Third-Edition-The-Systems-Modeling-Language, pages 3 to 6


lifecycle, and finally ends up at idle again, when it may receive a Turn Off signal that causes it to complete its behavior. The notation for state machine diagrams is shown in the Appendix, Tables A.21 through A.23.

11.3 SPECIFYING STATES IN A STATE MACHINE

A state machine is a potentially reusable definition of some state-dependent behavior. State machines typically execute in the context of a block, and events experienced by the block instance may cause state transitions.

11.3.1 REGION

A state machine can contain one or more regions, which together describe the state-related behavior of the state machine. Each region is defined in terms of states and pseudostates and the transitions between them. An active region has exactly one active state within it at a given time. The difference between a state and a pseudostate is that a region can never stay in a pseudostate, which merely exists to help determine the next active state. If a state machine contains a single region, the region typically is not named, but if there are multiple regions, they are often named.

A state machine with multiple regions may describe some concurrent behavior happening within the state machine's owning block. This may represent an abstraction of the behavior of different parts of the block, as discussed in Chapter 7, Section 7.5.1. For example, one part of a factory may be storing incoming material, another turning raw material into finished products, and yet another sending out finished goods. The state machine may also include concurrent behaviors (such as a camera being panned and tilted at the same time) that are performed by multiple parts. If the parts' behaviors are specified, the relationship between the state machine for the parent block and the behaviors of its parts should also be specified. States can also contain multiple regions, as described in Section 11.6.2, but this section describes simple states only (i.e., states with no regions and therefore without nested states).

The initialization and completion of a region are described using an initial pseudostate and final state, respectively. An initial pseudostate is used to determine the initial state of a region. The outgoing transition from an initial pseudostate may include an effect (see Section 11.4.1 for a detailed discussion of transition effects). Such effects are often used to set the initial values of properties used by the state machine. When the active state of a region is the final state, the region has completed, and no more transitions take place within it. Hence, a final state can have no outgoing transitions.

The terminate pseudostate is always associated with the state of an entire state machine. If a terminate pseudostate is reached, then the behavior of the state machine terminates. A terminate pseudostate has the same effect as reaching the final states of all the state machine's regions. The termination of the state machine does not imply the destruction of its owning object, but it does mean that the object will not respond to events via its state machine.

If a state machine has a single region, it is represented by the area inside the frame of the state machine diagram. Multiple regions are shown separated by dashed lines. The notation for the concepts introduced thus far is as follows:

• An initial pseudostate is shown as a filled circle.
• A final state is shown as a bulls-eye (i.e., a filled circle surrounded by a larger hollow circle).
• A terminate pseudostate is shown as an X.


11.3.2 STATE

A state represents some significant condition in the life of a block, typically because it represents some change in how the block responds to events and what behaviors it performs. This condition can be specified in terms of the values of selected properties of the block, but typically the condition is expressed in terms of implicit state variable(s) for each region. It is helpful to use the analogy that the block is controlled by a switch. Each state corresponds to a switch position for the block, and the block can exhibit some specified behavior in each switch position. The state machine defines all valid switch positions (i.e., states) and transitions between switch positions (i.e., state transitions). If there are multiple regions, each region is controlled by its own switch, with its switch positions corresponding to its states. The switch positions can be specified by a form of truth table (similar to how logic gates can be specified) in which the current states and transitions define the next state.

Each state may contain entry and exit behaviors that are performed whenever the state is entered or exited, respectively. In addition, the state may contain a do behavior that executes once the entry behavior has completed. The do behavior continues to execute until it completes or the state is exited. Although any SysML behavior can be used, entry and exit behaviors and do behaviors are typically activities or opaque behaviors.

A state is represented by a round-cornered box containing its name. Entry and exit behaviors and do behaviors are described as text expressions preceded by the keywords entry, exit, or do and a forward slash. There is some flexibility in the content of the textual expression. The text expression typically is the name of the behavior, but when the behavior is an opaque behavior, the body of the opaque behavior can be used instead (refer to Chapter 7, Section 7.5 for a description of an opaque behavior).

Figure 11.2 shows a simple state machine for the Surveillance System, with a single operating state in its single region. A transition from the region's initial pseudostate goes to the operating state. On entry, the Surveillance System displays that it is operational on all operator consoles, and on exit, it displays a shutdown status. While the Surveillance System is in the operating state, it performs a do activity of its standard function to Monitor Site, which is monitoring the building where it is installed for any unauthorized entry. When in the operating state, a Turn Off signal triggers a transition to the final state, and because there is only a single region, the state machine terminates.

[Figure 11.2: state machine diagram "stm Surveillance System" with an initial pseudostate leading to the operating state (entry/Display "Operating" Status, do/Monitor Site, exit/Display "Shutdown" Status) and a Turn Off transition leading to the final state.]

FIGURE 11.2 A state machine containing a single state.


11.4 TRANSITIONING BETWEEN STATES

A transition specifies when a change of state occurs within a state machine. State machines always run to completion once a transition is triggered, which means that they are not able to consume another trigger event until the state machine has completed the processing of the current event.

11.4.1 TRANSITION FUNDAMENTALS

A transition may include one or more triggers, a guard, and an effect, as described next.

Trigger

A trigger identifies the possible stimuli that cause a transition to occur. SysML has four main kinds of triggering events.

• A signal event indicates that a new asynchronous message corresponding to a signal has arrived. A signal event may be accompanied by a number of arguments that can be used in the transition effect.
• A time event indicates either that a given time interval has passed since the current state was entered (relative) or that a given instant in time has been reached (absolute).
• A change event indicates that some condition has been satisfied (normally that some specific set of attribute values hold). Change events are discussed in Section 11.7.
• A call event indicates that an operation on the state machine's owning block has been requested. A call event may also be accompanied by a number of arguments. Call events are discussed in Section 11.5.

Once the entry behavior of a state has completed, transitions can be triggered by events irrespective of what is happening within the state. For example, a transition may be triggered while a do activity is executing, in which case the do activity is terminated.

By default, events must be consumed when they are presented to the state machine, even if they do not trigger transitions. However, events may be explicitly deferred while in a specific state for later handling. The deferred event is not consumed as long as the state machine remains in that state. As soon as the state machine enters a state in which the event is not deferred, the event must be consumed before any others. The event triggers a transition or it is consumed without any effect. Transitions can also be triggered by internally generated completion events. For a simple state, a completion event is generated when the entry behavior and the do behavior have completed.

Guard

The transition guard contains an expression that must evaluate to true for the transition to occur. The guard is specified using a constraint, introduced in Chapter 8, Section 8.2, which includes a textual expression to represent the guard condition. When an event satisfies a trigger, the guard on the transition is evaluated. If the guard evaluates to true, the transition is triggered; if the guard evaluates to false, then the event is consumed with no effect. Guards can test the state of the state machine using the operators in (state x) and not in (state x).


Effect

The third part of the transition is the transition effect. The effect is a behavior, normally an activity or an opaque behavior, executed during the transition from one state to another. For a signal or call event, the arguments of the corresponding signal or operation call can either be used directly within the transition effect or be assigned to attributes of the block owning the state machine. The transition effect can be an arbitrarily complex behavior that may include send signal actions or operation calls used to interact with other blocks. If the transition is triggered, first the exit behavior of the current (source) state is executed, then the transition effect is executed, and finally the entry behavior of the target state is executed.

A state machine can contain transitions, called internal transitions, that do not effect a change in state. An internal transition has the same source and destination and, if triggered, simply executes the transition effect. By contrast, an external transition with the same source and destination state (sometimes called a transition-to-self) triggers the execution of that state's entry and exit behaviors as well as the transition effect. One frequently overlooked consequence of internal transitions is that, because the state is not exited and entered, timers for relative time events are not reset.
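The exit/effect/entry ordering is easy to get wrong in hand-written implementations. The following minimal sketch (our own illustration in Python, not SysML tooling; all names are hypothetical) makes the firing rule for an external transition explicit:

    class StateMachine:
        def __init__(self, state, handlers):
            self.state = state
            self.handlers = handlers  # state -> {"entry": fn, "exit": fn}

        def fire(self, target, effect):
            # Ordering for an external transition:
            # 1. exit behavior of the source state
            self.handlers[self.state]["exit"]()
            # 2. the transition effect
            effect()
            # 3. entry behavior of the target state
            self.handlers[target]["entry"]()
            self.state = target

    sm = StateMachine("idle", {
        "idle":      {"entry": lambda: print("entry idle"),
                      "exit":  lambda: print("exit idle")},
        "operating": {"entry": lambda: print("entry operating"),
                      "exit":  lambda: print("exit operating")},
    })
    sm.fire("operating", lambda: print("effect"))
    # prints: exit idle / effect / entry operating

An internal transition, by contrast, would execute only the effect, skipping both the exit and the entry behavior, which is exactly why timers for relative time events are not reset.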

Transition notation

A transition is shown as an arrow between two states, with the head pointing to the target state. Transitions-to-self are shown with both ends of the arrow attached to the same state. Internal transitions are not shown as graphical paths but are listed on separate lines within the state symbol, as shown in Figure 11.9. The definition of the transition's behavior is shown in a formatted string on the transition, with the list of triggers first, followed by a guard in square brackets, and finally the transition effect preceded by a forward slash. Section 11.4.3 describes an alternate graphical syntax for transitions. The text for a trigger depends on the event, as follows:

• Signal and call events: the name of the signal or operation, followed optionally by a list of attribute assignments in parentheses. Call events are typically distinguished by including the parentheses even when there are no attribute assignments. Although this is a useful convention, it is not part of the standard notation.
• Time events: the term after or at followed by the time. after indicates that the time is relative to the moment when the state is entered; at indicates that the time is an absolute time.
• Change events: the term when followed by the condition that has to be met in parentheses. Like other constraint expressions, the condition is expressed in text, with the expression language optionally in braces.

The effect expression may either be the name of the invoked behavior or contain the text of an opaque behavior.

When an event is deferred in a state, the event is shown inside the state symbol using the text for the trigger followed by a "/" and the keyword defer. See Figure 11.12 for an example. Transitions can also be named, in which case the name may appear alongside the transition instead of the transition expression. A name is sometimes a useful shorthand for a very long transition expression.

Figure 11.3 shows a more sophisticated state machine for the Surveillance System than in Figure 11.2, with all the principal states and the transitions between them. In contrast to Figure 11.2, the initial pseudostate now indicates that the region starts at the idle state. The final state is now also reached from the idle state, but it is still triggered by the receipt of a Turn Off signal. Once processing is complete in the initializing state (refer to Figure 11.14 to view inside the initializing state), a completion event for initializing will be


Topic 7: Model Verification - Source 1: An Overview of Program Analysis using Formal Methods, pages 4 to 13 and 27 to 47

What can it be used for? In the context of DSLs and program verification, what can the approach be used for?

Real-world use. In real-world systems, what has been done with this approach? Does it scale?

2. Introduction

2.1 Terminology

Formal Methods The term "formal methods" is not well-defined. We have heard people use it to denote techniques that rigorously formalize mathematical models of their object of study and use mathematical proofs. Others call the combination of a symbolic program verification methodology (e.g., symbolic execution, weakest precondition calculus or Hoare logic) with an interactive theorem proving backend a "formal method". While this combination is optimal in terms of soundness and completeness, these tools are only semi-automatic: they require manual effort as well as extensive training in the formalisms used and in mathematical proof, as the user has to manually construct proofs about his programs. When interpreted loosely, the term encompasses most of theoretical computer science, to the degree it is used for verifying the correctness of programs.

Program Verification and Analysis Many people (including us) use the term "formal methods" as a synonym for "program verification", which is the part of theoretical computer science that deals with deriving formal guarantees about a program's behaviour. Static verification is the part of program verification that deals with deciding program properties for all possible executions of the program (i.e., not by running the program). This is in contrast to runtime verification, which monitors properties of a particular execution at runtime. Program analysis refers to any analysis of a program, including those that do not contribute to correctness. For example, clone detection, timing analysis or style checkers can be seen as program analyses, even though they are not program verifications in the sense that they help with program correctness. These analyses are out of scope of this paper. Thus: in this booklet, we introduce practically useful program analysis and verification techniques and illustrate their use in the context of DSLs. In the remainder of this booklet, when we use the term program analysis, we refer to analyses that help with correctness, and we use the terms program verification and program analysis interchangeably.

Use of Verification Results Once results have been obtained, they can be used in various ways: they can be presented to the user, for example, through source code annotations in an IDE, encouraging the user to change the program (presumably in order to make it "more correct"). For example, the user might get a warning that some part of the code is unreachable, or an error (in this booklet we gloss over the differences between defect, fault, failure and bug; see [53]) that two guards in a switch-case statement overlap. In the former case the user might want to delete the dead code, and in the latter case, the user must change one of the guard expressions to remove the overlap. Alternatively, analysis results can also be used by downstream tools directly. For example, one analysis might depend on the result of another one, or a compiler/generator might use the unreachable-code information to not generate any binary code for the dead source (effectively removing the source code implicitly).

2.2 Execution vs. Analysis

For the purpose of this discussion, we consider a program to be an implementation of an algorithm that takes input values, performs a computation, and then produces output values. The program abstracts over these values, representing them as (usually typed) variables. Note that programs are hierarchical; for this discussion, we consider a program to be a function. As developers, we are used to executing programs. Execution means that we supply a particular value for each input variable and then run the algorithm encoded in the program. The parts of the algorithm that are actually executed may depend on the values (a consequence of conditionals in the code). As a result of execution, we get particular values for output variables. Program analysis, in contrast, attempts to derive properties of a program for all possible input values, in order to characterize the set of all possible output values or to find problems in its internal structure. A brute-force way of analysing a program is to actually execute it for all values, one at a time, and then generalize from the collected output values or other data collected during the executions. While this is possible in principle, it only works for trivial programs, as the set of inputs may be infinite in general and, even when finite, is often prohibitively large in practice. Program analysis attempts to derive the same results through a more "clever" approach. In contrast, in runtime analysis (or runtime verification), the analysis is performed at runtime. For example, an instrumented program could collect execution traces, memory allocations, or the execution frequencies of statements, for example, to find out which parts of the program are worth optimizing. Another example would be a state machine in which a runtime error is thrown when two transitions are ready to fire at the same time (because their guards overlap). In both cases, data is collected (or errors are found) only for particular executions of the program, hopefully during (systematic) testing. This is in contrast to the static analysis discussed in this booklet: a static program analysis can prove the absence of particular errors (for all possible executions of the program), whereas a dynamic program analysis can only show that some errors actually occur. By this definition, runtime analysis or verification is a misnomer; it should instead be called runtime monitoring or runtime diagnostics.
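To make the contrast concrete, here is a minimal sketch (our own illustration, not from the booklet): it checks a property of a tiny function by brute-force execution over a small finite input domain, which is exactly the approach that stops working once the input space grows:

    def abs_diff(a: int, b: int) -> int:
        # The program under analysis: should never return a negative value.
        return a - b if a > b else b - a

    # "Analysis" by exhaustive execution: only feasible because we restrict
    # the inputs to a tiny finite domain. For 64-bit integers (or any
    # unbounded domain) this loop is hopeless, which is why static analyses
    # reason about all inputs symbolically instead.
    for a in range(-50, 51):
        for b in range(-50, 51):
            assert abs_diff(a, b) >= 0, f"property violated for {a}, {b}"
    print("property holds for all inputs in the sampled domain (only!)")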

Value Analysis A value analysis derives properties of values computed in a program. Examples include type checking (what kind of values are possible?), interval analysis (what are the minimum and maximum values?), null analysis (can this value ever be null? Do I need a check to prevent NullPointerExceptions?), or constant propagation (is a condition always true or false? Can the conditional hence be omitted?). Many aspects are principally undecidable statically, i.e., when they are based on input values that are not known before runtime. Thus, such analyses often rely on heuristics and/or program annotations provided by the user (or on information that can be derived from domain-specific abstractions in the program; this is the synergy with language engineering again).

Location Analysis Location analysis aims to derive properties of program locations and/or program parts. Examples include reachability (can this part be executed at all? The dead code analysis mentioned earlier is an example of this), coverage (for a given set of inputs (tests), are specific parts of the program executed?), timing (how long does it take to execute this part of the program?) or resource consumption (how much memory does this function need?). Note that a location analysis might require value information as well. For example, whether a part of a program is executed may depend on whether a condition can ever become true. Also, the worst case execution time of a while loop obviously depends on the (initial) values of the variables referenced in its condition. This booklet focuses on analysis (as opposed to execution) and mostly on value-based analysis.

2.3 Overview over different Approaches

An important part of integrating an analysis into a DSL is choosing the right formal method for the job. As a basis for this decision, we will in this section provide an overview of the most relevant formal methods. For the purposes of this tutorial, we identify the following four major approaches. We provide a quick overview here, and then elaborate in the rest of this booklet. In the book Principles of Program Analysis by Nielson et al. [33], all four are explained in detail.

Type Systems Type systems are well known to programmers; they associate a type with a program node. That type is used to check program correctness beyond the structures defined by the AST (the abstract syntax tree, a tree or graph representation of the structure of a program in which notational details such as keywords or whitespace are elided; we provide more detail on the AST and other derived data structures in Sec. 4 and Fig. 5). For example, while a plus operator structurally works with two arbitrarily structured expressions e1 + e2, from a type system perspective, e1 and e2 must either both be of type number or of type string in a typical language. Typical tasks of a type system are the calculation of simple types (the type of 2 is number), deriving types from other types (a variable reference's type is the type of the referenced variable), computing the supertype of a set of types (the supertype of int and real is real) and checking expressions for type compatibility (the plus example from before). In the vast majority of cases, the type of a program node can be computed by inspecting a single program node (e.g., number literals) or by looking at simple structural relations between program nodes (e.g., a function call, or the two children of plus). Type systems are discussed in Sec. 3. Note that often, name analysis, i.e., the challenge of binding names mentioned in the program to their definition, is considered part of the type system as well. However, we do not consider name analysis in this booklet, mainly because our experience is centered around languages built on MPS where name analysis is not an issue.
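As a sketch of the structural flavor of these computations, here is a minimal, hypothetical type calculator for literals and the plus operator (the AST shapes and names are our own illustration, not from the booklet or MPS):

    from dataclasses import dataclass

    # A minimal expression AST: literals and a binary plus.
    @dataclass
    class Lit:
        value: object

    @dataclass
    class Plus:
        left: object
        right: object

    def supertype(t1: str, t2: str) -> str:
        # The common supertype of int and real is real.
        if {t1, t2} <= {"int", "real"}:
            return "real" if "real" in (t1, t2) else "int"
        raise TypeError(f"no common supertype of {t1} and {t2}")

    def typeof(node) -> str:
        # Types are computed from a single node or its direct children.
        if isinstance(node, Lit):
            if isinstance(node.value, bool):
                return "boolean"
            if isinstance(node.value, int):
                return "int"
            if isinstance(node.value, float):
                return "real"
            if isinstance(node.value, str):
                return "string"
            raise TypeError("unknown literal")
        if isinstance(node, Plus):
            lt, rt = typeof(node.left), typeof(node.right)
            if lt == rt == "string":
                return "string"          # string concatenation
            return supertype(lt, rt)     # numeric addition
        raise TypeError("unknown node")

    print(typeof(Plus(Lit(2), Lit(3.33))))  # -> real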

Abstract Interpretation In contrast to type systems, which rely mostly on the structure of a program to compute and check types, analyses based on abstract interpretation rely on (variants of) the execution semantics. For example, a data flow analysis that checks for uninitialized reads will "run" the program and see whether any variable may be read before it is ever written; or a constant propagation analysis "runs" the program to see if conditions in an if statement can be statically determined to always be true or false. We have put the word "run" into quotation marks because, technically, such an analysis calculates along the control flow paths, and to make analyses based on abstract interpretation efficient, programs may be run partially (local analyses in a single function), with approximating data types (lattices) or with simplified operational semantics (ignoring exceptions). We discuss abstract interpretation and data flow analyses in Sec. 4.

Constraint Solving Constraints are similar to functions in that they relate various values and variables with each other. A function f(x) = 2 * x computes a value f when given a value x. Functions are executed "from right to left": when provided values for all arguments, the resulting value(s) is calculated. Functions can call other functions, but they are still executed in a linear, nested, and perhaps recursive way. Constraints express relationships between variables, for example, 2 * x == 4 * y. A constraint solver computes values for all involved variables (here: x and y) so that the constraint is true, such as x=2 and y=1. The solver can do this even when multiple constraint equations and many variables are involved. Importantly, in contrast to functions, constraints do not imply a "direction". This means that solvers do not just "execute" programs; they search for and find solutions that make a program become valid (often called forward evaluation, or just solving). We are primarily concerned with SMT solvers in this booklet. They can do this for complex Boolean expressions, as well as a wide range of other theories, such as numbers or lists. We discuss SMT solving in Sec. 5.
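The 2 * x == 4 * y example can be reproduced directly with an off-the-shelf SMT solver. The following sketch uses the Z3 Python bindings (the z3-solver package) and adds a second, made-up equation to show that the solver handles systems of constraints without any notion of evaluation direction:

    from z3 import Ints, Solver, sat

    x, y = Ints("x y")
    s = Solver()
    s.add(2 * x == 4 * y)   # the constraint from the text
    s.add(x + y == 3)       # a second equation; the solver solves the system
    if s.check() == sat:
        m = s.model()
        print(m[x], m[y])   # -> 2 1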

Model Checking For formalisms that have side effects, such as state machines or non-functional programming languages, constraints have to consider the evolution/change of state over time; the constraint language, as well as the tools for checking the constraints, must be able to handle a notion of (logical, discrete) time. Model checking is such a formalism. For example, in a functional language, one might express a postcondition of a function as post: res >= arg1 + arg2. In a language with side effects, one might have to express that a function increments a global variable, and to do this, one has to refer to the old value of that variable: post: global = old(global) + 1. The old(..) notation refers to the value of the global variable before the execution of the function; essentially, it looks back in time. A second example are constraints such as "After the state machine has been in state A, it will always eventually reach state B". This constrains the sequences of states that are valid for a given state machine. Such constraints can be checked using model checking; model checkers work on transition systems (aka state machines) and verify constraints expressed as temporal logic formulas. The one mentioned above can be expressed as AG((state == A) => AF(state == B)), where AG and AF are operators from temporal logic that quantify over time and execution paths. We discuss model checking in Sec. 6.
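Real model checkers implement this with sophisticated symbolic algorithms. As a toy illustration of the explicit-state idea, the following sketch (our own, hypothetical transition system) enumerates the reachable states of a small state machine and flags dead states, one of the implicit properties mentioned later in this booklet:

    from collections import deque

    # A small transition system: state -> successor states.
    transitions = {
        "idle":         ["initializing"],
        "initializing": ["operating"],
        "operating":    ["paused", "idle"],
        "paused":       ["operating"],
        "maintenance":  ["idle"],   # no incoming transition: a dead state
    }

    def reachable(initial: str) -> set:
        # Standard BFS over the transition relation.
        seen, queue = {initial}, deque([initial])
        while queue:
            for succ in transitions[queue.popleft()]:
                if succ not in seen:
                    seen.add(succ)
                    queue.append(succ)
        return seen

    dead = set(transitions) - reachable("idle")
    print("dead states:", dead)  # -> {'maintenance'}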

2.4 Testing vs. Verification

Testing refers to executing a (part of a) program for specific input values and asserting that the behaviour (i.e., often the outputs) corresponds to some expectation. (We consciously avoid the term "specification" here, because usually there are no formal specifications available if testing is the primary method of ensuring the correctness of a program.) Verification, in contrast, relies on analyses, i.e., on the program's behavior for all possible executions/inputs (as we have discussed previously). In other words, testing shows the presence of bugs, whereas verification proves their absence. As a consequence, testing suffers from the coverage problem, which means that we can only be sure that the program behaves correctly for those inputs that are covered by the tests. There are many means of measuring coverage [54], including coverage of input vectors, coverage of all paths/lines in the code, or coverage of all decisions. The software engineering community disagrees over which coverage measure is appropriate. What an appropriate minimum coverage value is for us to be able to trust the code (i.e., be sure that it is correct) depends on the risk associated with faults; always testing everything to 100% coverage is too expensive and hence not feasible.

This sounds like verification is always better than testing. However, as we will see in this booklet, verification has limitations regarding computational complexity (things get slow) and regarding the skills required of its users. In particular, writing correct specifications (that specify a program's behavior for all possible cases) is much more challenging for practitioners than writing test cases, which means that it is likely that a specification is wrong or incomplete, thus voiding the theoretical benefits of verification. Reasons for this include the fact that often a separate language must be used for specifications, that it is usually more complicated to think about classes of behaviors rather than particular instances, and that practitioners just aren't used to it. Another difference between testing and verification is the scope: verification ensures that the program is correct according to its specification under some assumptions, e.g., that the hardware is correct. Testing has no such assumptions, but the limitation that only one specific execution is considered. It is obvious that verification does not replace testing, but rather complements it. A middle ground is test case generation: using an analysis, one determines input vectors that satisfy one of the relevant coverage criteria, as well as possible preconditions for the test subject. We then actually run the tests for those vectors. We discuss this briefly in Sec. 5.6.

2.5 Correct-by-Construction

If it is impossible to write a defective program, then there is no need to verify or test that a program is correct. A language/system that allows only the implementation of correct programs is called correct-by-construction. While this is an appealing idea, it has limitations. Let us consider the case of implementing state-based behavior. In C, state-based behaviour might be implemented as a state machine simulated with a switch statement. One particular error is to forget the break statement in the cases, leading to unwanted fall-through. This problem could be found by analysing the code (ensuring that every path through the code in the cases either ends with a break or a return). Alternatively, one can use a C extension that provides language concepts for encoding state machines directly, such as those provided by mbeddr (mbeddr is a set of language extensions for C to facilitate embedded software development [49, 50]). Because the abstractions encoded in the additional language constructs are aligned with the domain (state-based behavior), the programmer does not have to insert the breaks manually. So, regarding this particular error, the use of the C state-machines extension indeed guarantees correctness-by-construction. However, the programmer can still implement a state machine that has missing transitions, or a "loop" because the machine executes ε-transitions ("automatic" transitions that are triggered without an incoming event) into the same state over and over again. The state machine language will very likely not prevent the user from making these mistakes; thus it is not correct-by-construction for these errors.

Similarly, consider the infamous C strcpy function that copies a string from one buffer to another. strcpy continues copying characters until it finds a null character in the source buffer; if none is found, it copies forever and overwrites memory beyond the (limited size) target buffer, leading to security vulnerabilities. Removing strcpy and forcing users to use strncpy, where the length of the buffer is given by an additional parameter, is a good idea; however, users can still pass in the wrong length, also leading to an error. So, while the forced use of strncpy avoids some errors, it does not prevent all of them.

In both cases, state machine and strcpy, we can combine correctness-by-construction with an analysis: we can implement an analysis on the state machine abstractions that finds ε-transitions into the same state and reports them as an error. Similarly, we might want to analyze C programs to see if the length argument to strncpy is correct. It should be obvious that the former analysis is much easier to build than the latter. This suggests that, even though adding domain-relevant concepts (such as state machines) as first-class language constructs does not lead to correctness-by-construction for all possible errors, it likely makes (relevant) analyses simpler. We discuss this synergy between analysis and language engineering in Sec. 2.10.

In practice, the distinction between correctness-by-construction and analysis/verification is blurry. Consider the example above where an IDE-integrated analysis reports ε-transitions into the same state essentially in realtime. The user can technically write buggy code (in the sense that the language does not actually prevent him from expressing the error), but because he gets the error immediately (i.e., without invoking a special tool or maybe adding additional specifications to the code), this is just as good as a formalism that syntactically prevents the user from making the error. Thus, we consider IDE-integrated verification techniques to contribute to correctness-by-construction.

There is another problem with correctness-by-construction. Very often, correctness-by-construction is achieved through better (higher) abstractions. To execute them, these typically have to be translated to a lower-level implementation. The problem is that this transformation might be buggy (this is true for any compiler, but it is especially critical if the source language suggests that programs are provably correct, because users might then not do a lot of testing), i.e., the generated code might not have the same characteristics as the more abstract model that guaranteed the absence of some particular (class of) errors. For example, the generator that transforms the C state machine extensions into low-level C (e.g., switch statements) might "forget" to generate the necessary break statements. While test case generation can help (the tests are derived from the model but executed on the generated code, thereby "propagating" the checks down), ensuring transformation correctness is a challenge that is beyond the scope of this booklet.
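The first of these two analyses is indeed trivial once the state machine exists as a first-class structure. A minimal sketch (our own illustration, with a hypothetical transition list, not mbeddr's actual representation):

    # Each transition: (source, trigger, target); trigger None models an
    # epsilon-transition that fires without an incoming event.
    transitions = [
        ("idle", "start", "running"),
        ("running", None, "running"),   # epsilon self-loop: a livelock
        ("running", "stop", "idle"),
    ]

    # The analysis is a one-liner over the domain abstraction: report every
    # epsilon-transition whose source and target state coincide.
    for src, trigger, dst in transitions:
        if trigger is None and src == dst:
            print(f"error: epsilon-transition loops on state '{src}'")

The equivalent analysis on generated low-level C (nested switch statements) would first have to recover the state machine structure from the code, which illustrates why domain-level constructs make analyses simpler.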

2.6 Derivation vs. Checking vs. Synthesis

Formal methods can be used in several ways; which of those are supported depends on the method itself. In particular, synthesis is not supported by all methods.

Derivation The most straightforward one is deriving a property. For example, for an expression 2 + 3.33, the type of + is derived to be the common supertype of 2 and 3.33, which will be float (the type of 3.33). This is the classic case of program analysis: it computes a property of an expression, declaration or statement and "attaches" it to the corresponding AST node. It can then be shown in an IDE as an error or warning ("this condition is always true"), or it can be exploited in a transformation (a condition that is always true can simply be omitted).

Checking Checking compares two properties, one of them typically specified by the user and the other one automatically derived. For example, a variable declaration could be specified as int x = 2 + 3.33;. The type of the expression is derived and then checked against the specification int. In this sense, checking is very much related to derivation, because a check typically first involves a derivation. The check itself is usually relatively simple.

Synthesis Synthesis refers to finding a program that satisfies a specification (the reliance on a specification is a commonality with checking). To continue with the example above, one could write a program that contains a variable declaration int x = ??; where the ?? mark represents a hole in the program that needs to be completed. The task of the formal method is to synthesize a replacement (or completion) for the hole that satisfies the specification. There are two challenges with this. First, the method must be able to actually find solutions; whether this is possible, and how fast, depends on the math that underlies the formalism and the sophistication of the tool. Second, one has to define a sufficiently narrow specification, because the synthesis usually finds the simplest solution (int x = 0 in the example); this solution, while correct, might not be useful. A narrower specification has to be written, and the synthesis repeated. For example, you could write int x = ?? + ??;, which might force the synthesizer to fill the two arguments of the plus operator.
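For the simplest holes, an SMT solver can act as a toy synthesizer: the hole becomes a free variable and the specification becomes a constraint. A sketch of the int x = ?? idea (our own illustration using the z3-solver package; real program synthesizers go far beyond filling scalar holes):

    from z3 import Int, Solver, sat

    hole = Int("hole")          # the ?? in: int x = ??;
    s = Solver()
    s.add(hole + hole == 10)    # a (narrow) specification for the hole
    if s.check() == sat:
        print("synthesized:", s.model()[hole])  # -> 5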

2.7 Specifications

A specification describes what something does, but not how. Another way of expressing this is that a specification defines a (black-box) contract but presupposes nothing about the implementation. A specification achieves this by precisely describing a property of an artifact (in our case, a piece of code), while not saying anything about other properties. Of course, the boundary between what/contract and how/implementation is arbitrary to some degree: for example, a specification might specify that an array should be sorted, leaving the actual algorithm to the implementation; but a specification might also prescribe the algorithm (e.g., quicksort), while not defining how it is implemented in a particular programming language.

Good Specifications In some sense, a specification is redundant to the system it specifies: it specifies behavior which the code already implements. A verification can be built so as to check whether the implementation fulfils the specification; if the two do not represent the same (or compatible) behaviors, the verification fails. The reason for the failure can be an error either in the implementation or in the specification. Often one assumes that the implementation is wrong, because a good specification is simpler than the implementation (in terms of size or accidental complexity, for example, because it ignores non-functional requirements such as performance), but nonetheless expresses the relevant property. Because it is simpler, it is easier to relate to the original (textual or perceived) requirements, and hence it is less likely to get the specification wrong than the implementation. Let us look at examples:

fun divBy2(x: int)
  post res = x * 2 {
  x * 2
}

In this case, the postcondition (i.e., the specification of what the function returns) is identical to the implementation of the function. Unless the specification is taken from some interface defined by somebody else, it does not add value. (Another reason why the specification might be useful is when the specification language has different semantics. For example, the implementation, if written in C, might overflow, whereas the specification might not.) However, consider the next example:

fun sortAsc(l: list): list
  post l.size == res.size
  post forall e in l { e in res }
  post forall i, j: int[0..l.size] { j > i => res[j] >= res[i] }
{
  .. 20 lines of intricate quicksort ...
}

Here, the specification concisely expresses that (1) the resulting list res has the same size as the input list l, (2) res contains the same elements, and (3) res must be sorted in ascending order. This is much more concise and understandable than the 20 or so lines of a Quicksort [19] implementation. It is also generic in the sense that it works for many different implementations, which is another characteristic of a good specification. You probably think that this specification is correct and complete. But in fact, it is not: it does not say anything about the number of occurrences of elements and does not exclude duplicates; e.g., 1, 1, 1, 2 might be "sorted" to 1, 2, 2, 2, which would fulfil the specification. Another constraint is necessary. This illustrates nicely that writing correct and complete specifications is not a simple task! Ideally, a specification is declarative, so that it itself can be verified for internal consistency. (This relates to theorem proving, which relies on a sequence of proofs, which in turn rely on axioms or previously proven rules. Theorem proving is another formal method that can be used to prove programs correct, but it is beyond the scope of this booklet. See [48] and [34] for an introduction.) For example, if the postcondition had been written as

post forall i, j: int[0..l.size] { j == i => res[j] > res[i] }

then this could be identified as inconsistent, because, since res[i] and res[j] refer to the same value, one cannot be greater than the other.
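The gap in the sortAsc specification can be demonstrated mechanically. The following sketch (our own translation of the three postconditions into Python checks) shows that the bogus result 1, 2, 2, 2 for input 1, 1, 1, 2 satisfies all three:

    def satisfies_spec(l: list, res: list) -> bool:
        # post l.size == res.size
        same_size = len(l) == len(res)
        # post forall e in l { e in res }
        same_elements = all(e in res for e in l)
        # post forall i, j { j > i => res[j] >= res[i] }
        ascending = all(res[j] >= res[i]
                        for i in range(len(res))
                        for j in range(i + 1, len(res)))
        return same_size and same_elements and ascending

    # The wrong "sort" result fulfils the specification: the postconditions
    # say nothing about the number of occurrences of each element.
    print(satisfies_spec([1, 1, 1, 2], [1, 2, 2, 2]))  # -> True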

Exploiting Redundancy Some specifications cannot be used as the input to a static program analyzer that proves properties for all possible executions of a program (because of the use of a formalism that is not supported by the verifier, because the specification is too complex, or because it is not complete). In this case, we can at least exploit the inherent redundancy. For example, in order to ensure the semantics of a language, we can implement the language with a (simple) interpreter and with a (more complex, fast) code generator. We consider the interpreter the "specification". We can run all programs with both, and if one fails, we know that there is a bug either in the interpreter or in the code generator. Using the simplicity argument from above, we assume the code generator is defective, and we can try to fix it. Note that, similar to testing, this approach is not general: we can only prove the absence of bugs for those programs that are actually written and executed in both environments.

Figure 2. An example of a hierarchical system.

Decomposition Specifications themselves might become big and/or complicated. This leads to two problems. First, the assumption that the specification is "easy to get right" might be called into question. Second, the verification tool might run out of steam when trying to verify it. Solving this problem relies on a trusted software engineering approach: decomposition. Consider Fig. 2. To verify this system, one starts by verifying B4 using the (assumed to be correct) specifications of its internals (B5, B6). Once B2, B3 and B4 are verified this way, we can verify B1, again assuming the internals are correct. For each level, we only have to consider the direct next level, but we can ignore the internals of that next level. This limits overall verification complexity, both for the tool and for the user.

Synthesis vs. Code Generation Now that we understand that good specifications are ideally simple and declarative, the challenge of program synthesis becomes obvious: in the example above, you would write:

fun sortAsc(l: list): list
  post l.size == res.size
  post forall e in l | e in res
  post i, j: int[0..l.size] | j > i => res[j] >= res[i]
{
  ??
}

The task of the synthesizer is to "fill the hole", automatically coming up with an implementation that performs sorting of a list, ideally in an efficient way. And this would have to work for any specification you can express with the specification language! This is different from code generation in the way we use it for DSLs. In that case we would perhaps write:

fun sortAsc(l: list): list {
  sort l, ASC;
}

Here, sort is a language construct, so the code generator understands the specific semantics of the concept. The generator contains predefined mappings from the language construct to the to-be-generated lower-level code, implementing the code for the sort construct trivially. In contrast to program synthesis, the generator contains specific, manually written generator logic for each of these predefined constructs. So synthesis has to "invent" an algorithm from the specification, whereas generation has the algorithms already at hand, at least in the generator. This means that building generators for specific language constructs is easy (and practitioners do it all the time), whereas program synthesis is hard (and is done only in very narrow circumstances). In this booklet, most of our use cases for formal methods center around derivation and checking, and only very limited forms of synthesis.

2.8 Evaluation Criteria

No single formal method excels in every respect (and due to some hard theoretical limitations, it is highly unlikely that there will ever be one that does). We consider the following criteria to be relevant when choosing from or evaluating various formal methods.

Automation The degree to which a formal method can be automated is of great relevance for its industrial application. Due to its paramount importance, we discuss it in more detail in Sec. 2.9.

Soundness Soundness means that a formal method provides a strong correctness guarantee unless it flags an error. When applying a sound formal method to a program and not finding any errors, one can safely consider the program to be correct (relative to the specification, which might or might not be correct itself). In some cases (e.g., bounded model checking), the soundness guarantee depends on configuration parameters (see Sec. 2.13): the higher the bound, the stronger the guarantee provided. If these are chosen wrongly, the method might seem sound but in fact is not. An alternative definition of soundness is the absence of false negatives. A sound formal method hence cannot "overlook" or otherwise fail to detect errors present in the program.

Precision We call the probability that a formal method is able to establish a property for a given program its precision. For example, due to the approximate nature of their analyses, type systems and data-flow analyses are often not able to prove properties even though they actually hold for the given program. The higher the precision of a method, the fewer false positives it produces. The extreme of precision is completeness, which is the complete absence of false positives and hence the counterpart of soundness.

Applicability Applicability describes the degree to which a formal method is applicable to different programming paradigms. Often, there are restrictions regarding certain language features. For example, dependent types [52] and refinement types cannot deal with side effects and are hence only applicable to functional languages, so this advanced form of type checking cannot be used for imperative languages.

Flexibility The degree to which a formal method can be used to verify different properties. Some methods come with a predefined set of properties, whereas other methods let the user define a wide range, or an arbitrary set, of properties. Applicability and flexibility are sometimes hard to distinguish: applicability refers to the formalisms/paradigms to which an analysis can be applied; flexibility, in contrast, describes what can be analyzed on programs to which the method is applicable.

Performance/Scalability How much time (or how many instructions) is required for an analysis, and how does this depend on the program size or the property's complexity? Many formal methods suffer from scalability problems. For some analyses, the problem can be addressed by manually decomposing the structure of the program and the specification in a way that allows modular verification. For other analyses this is not (easily) possible.

Unfortunately, any particular formal method involves tradeoffs between these criteria (for example, between precision and scalability, or between flexibility and automation). We will discuss the criteria and the involved tradeoffs for each formal method in their respective sections.

2.9 Degree of Automation

Analyses can be grouped with regard to the degree of automation from the perspective of the end user. We emphasize "end user" because, from the perspective of the language developer, a lot of manual integration work may be required; we discuss this in Sec. 2.11.

Specification-based analyses perform the verification of a property automatically, but require the user to express the property that should be checked. The property is specified in a language that is suitable for the particular analysis. The simplest example of this approach is type checking: a user specifies the (expected) type of a variable, and the type checker verifies that a value (potentially a complex expression) is compatible with this expected type. Another example is model checking, where a user might specify temporal properties ("after the state machine was in state A, it will always eventually reach state B"), and the model checker tries to prove that the property is true. One challenge in this approach is that it is additional effort for the user to write these properties. While a good design (and integration) of the property specification language helps, it is still often a challenge to make users write the properties. For example, the pre- and postconditions in mbeddr components were used rarely, even by experienced developers [50], even though mbeddr supports fully integrated verification of these contracts through CBMC [22]. Another problem with specification-based analysis is that the verifier will only address those properties that are actually specified by the programmer. (Some verifiers, such as KeY [3], automatically flag every non-specified behavior as an error.) It is almost always manual work to derive these formal properties from requirements (if clear requirements exist in the first place). This means that specification-based verification has a coverage problem. (This coverage problem can be seen as smaller, because a property is a "test case for all possible executions" and hence more powerful. On the other hand, full coverage of a specification cannot be proven, whereas test coverage can at least be measured.)

Fully automated analyses report errors or warnings on programs written by a user without any additional (analysis-specific) activities performed by the user. For example, for a state machine, a model checker can verify that all states are potentially reachable (dead states are always an error), or an SMT solver can check, for a set of Boolean expressions, that no two of them are identical. Fully automated analyses can also be seen as specification-based analyses where the properties are implicit in the language construct that is being verified (see the next subsection for more details on this idea). Again, consider model checking: it is an implicit property of a state machine that all states must always be eventually reachable, i.e., that the state machine has no dead states. Note, however, that fully automated analyses might still require configuration or tuning parameters for the analysis backend (see Sec. 2.13).

Interactive analyses are those where the tool and the end user conduct an interactive, often iterative, back-and-forth session. For example, a user might specify a to-be-proven theorem; the tool automatically simplifies the theorem by applying simplifications based on other theorems, but then requires additional specifications or decisions from the user before it can continue with the proof. Interactive verifications are outside the scope of this booklet. In this booklet, we focus on fully automated as well as specification-based analyses.

"An Overview of Program Analysis using Formal Methods" Seite 10

has to be lifted back to the DSL for it to make sense in the context of the original program. Note that for reasons of reuse and/or complexity, the transformation might be performed in multiple steps. Note that the transformation and lifting encompasses a semantic/algorithmic aspect and a technical one. In almost all cases, the program as expressed in the DSL has to be represented in the tool’s formalism (e.g., a procedural program has to be represented as a transition system for model checking). Similarly, the results have to be lifted back. This is the semantic aspect. The technical aspect then involves, for example, generating an actual text file and feeding it to an external process, or calling an API of an in-process library. The architecture shown above has a couple of challenges that have to be implemented by the developer of the language and its integrated verifications.

Figure 3. Typical architecture of integrating verification tools. the properties are implicit in the language constructs used to write programs are easier to get used in practice. So, if we assume that the subject language can be changed or extended (as is the case in the context of language engineering and language workbenches), this leads to an obvious conclusion: adapt or extend the subject language with constructs, that imply analysis properties and/or let the user express the properties in a user-friendly and integrated way. An example is a decision table, such as the one shown in Fig. 12. The decision table language construct implies that both the conditions in the row headers and the conditions in the column headers must each be complete (any combination of value must be matched by the table) and free of overlap (any set of inputs must match at most one condition). The lower-level representation of decision tables (a bunch of nested if statements) does not imply this. However, a verifier integrated into the language can perform the corresponding verifications for every decision table it encounters without any additional specification. We revisit this particular example in Sec. 5. 2.11
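To make the implicit decision-table properties concrete, here is a sketch (our own, with hypothetical guard conditions over a single integer input; uses the z3-solver package) that checks a set of conditions for overlap and completeness:

    from z3 import Int, Or, And, Not, Solver, sat

    x = Int("x")
    # Hypothetical row conditions of a decision table.
    conditions = [x < 0, And(x >= 0, x < 10), x >= 10]

    s = Solver()
    # Overlap: is there an input that satisfies two different conditions?
    for i in range(len(conditions)):
        for j in range(i + 1, len(conditions)):
            s.push()
            s.add(And(conditions[i], conditions[j]))
            if s.check() == sat:
                print(f"overlap between condition {i} and {j}: {s.model()}")
            s.pop()

    # Completeness: is there an input that satisfies no condition at all?
    s.add(Not(Or(*conditions)))
    print("incomplete:", s.check() == sat)  # False -> all inputs are covered

Changing, say, x >= 10 to x >= 5 would make the overlap check report a witness, which is exactly the kind of error message that then has to be lifted back to the decision-table level.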

Maintenance of Derived Structures Whenever the program changes, the derived structures have to be updated. For small, local analyses, the derived structures can be rebuilt completely, for every program change. However, for larger programs, these derived structures must be maintained incrementally. Currently, we do not have a means for such incremental transformations (”shadow models”). If an external tool is used, an additional problem is that the tool typically does not provide an incremental API and requires a completely new model for every verification run16 ; so even if we could maintain the tool’s input model incrementally, the verification would still happen from scratch each time.

Analysis Architecture

Lifting of Results Analysis tools report results on the level of the tool’s abstraction. For example, an SMT solver will report results on the level of the constraints supplied to the solver. However, to be meaningful to the user, the results have to be lifted back to and expressed in terms of the abstractions of the DSL: the gap that is crossed by the creation of the tool-specific structures has to be bridged in the reverse direction. While this can be non-trivial, it can be simplified by suitable intermediate abstractions for which, when they fail, the meaning in terms of the DSL is clear (an example is the solver intermediate language introduce in Sec. 5.2), by clever naming (so that, based on an analysis result, one can link back to the names in the program) or by embedding information in the lower level representation (similar to an asynchronous completion token [8]). If the downtranslation is done in multiple steps, lifting back the results can also be simplified.

Analysis tools are typically quite sophisticated: they embody years or decades of development e↵ort to perform particular analyses reliably and fast. This means that in most cases, one will want to integrate existing verification tools instead of developing your own verifications from scratch. In addition, each verification formalism expects the problem to expressed in a specific way (e.g., model checkers expect the to-be-verified system as a transition system, aka. state machine); and in addition, each tool has its own specific syntax (all model checkers expect a transition system as the input, but each tool has a di↵erent syntax to express that transition system). This leads to the tool integration architecture shown in Fig. 3. The program is expressed in whatever DSL has been chosen for the domain. The specification, if necessary, is also expressed in a suitable DSL, either separate (but relating to) the program, or syntactically integrated. A transformation maps (the relevant parts of) the DSL program to the input of the verifier and forwards it to the external tool. The verifier performs the analysis and reports a result in terms of the verifier’s input formalism and in some tool-specific syntax. This then

16 A

version of CBMC has been built that, before verification of C sources, first di↵s the new input with the previous one and then tries to re-verify only the changed parts. Boogie [23] also supports incremental verification.


In some cases it might also be feasible to perform the analysis directly on the code, with no translation of the program into another formalism (although, very likely, the analysis tool itself then performs such a translation). An example of this approach is model checking on the C level using tools like CBMC.

2.12 Level of Confidence

One usually applies formal methods in order to gain strong correctness guarantees for subject programs. When designing a verification system for a language, it is hence important to understand what level of confidence is expected from it, since this has direct implications on the methods, components and tools that can be used in its construction and operation (cf. the previous subsection).

Method. When developing a verification method, it is important to realize that the guarantee provided by the system as a whole can only be as strong as the weakest link in the process. For example, when using an existing SMT solver with a strong soundness guarantee, one should place special emphasis on developing the translation from the DSL to the solver's input language rigorously, to avoid introducing unsoundness. When developing a program analysis based on abstract interpretation, it is important to prove one's abstraction sound with respect to the semantics of the programming language, because otherwise the analysis will not provide a meaningful guarantee. When developing an interpreter for a DSL, it is important to ensure that it properly implements the DSL's semantics, because otherwise the analyses one wishes to perform might have a slightly different understanding of the language's semantics and thus fail to provide relevant guarantees.

Tools. When deciding on tools, apply the standard metrics from software engineering to judge their maturity: (1) How long has the tool been available? (2) How many people are working on it? (3) Is it still being maintained actively? (4) How many bugs have been found/fixed in it? (5) How large is the tool's user base?

Bootstrapping. A particular problem in this respect is bootstrapping: whenever one wants to verify programs in some programming language X, one needs a formal semantics for X. In order to ensure that the interpreter for X implements this semantics to the same level of confidence as the verification itself, one has to formally verify the interpreter. Now, the interpreter is usually written in some other programming language Y. In order to verify it, we hence need a formal semantics for Y and, to make sure that this does not just move the problem to the next language, we would have to ensure that the interpreter for Y faithfully implements the semantics of Y, and so on. This problem is sometimes described as "turtles all the way down"[17] and illustrates that there cannot be a 100% guarantee: at some point, someone must have implemented their verifiable programming language in a non-verifiable one, because otherwise there would not be any verifiable languages. Hence there will always be a loophole in any correctness proof. The only thing we can do is to make this loophole smaller through additional manual effort (verifying interpreters or compilers), which is why it is important to know the level of confidence expected from the system in order to choose the amount of effort one should invest in developing it. In practice, static analysis and verification can provide meaningful additional confidence in program correctness (just as testing or reviews can). Verification is not a panacea and should not be trusted blindly, but rather be seen as one part of a comprehensive quality assurance approach.

[17] https://en.wikipedia.org/wiki/Turtles_all_the_way_down

2.13 Challenges

There is no free lunch, not even in formal verification. In this section we describe some of the challenges involved in using formal methods, from the perspective of the end user. Whether their use makes sense despite these challenges ultimately depends on the correctness guarantees needed for a given DSL. Additional challenges might arise for the developer of the DSL that integrates verification; see Sec. 2.11.

Designing for Verifiability. Formal methods have limits to scalability, both in terms of the user's ability to write and understand specifications (remember: they have to be "obviously" correct) and in terms of the verification tool's scalability. We have mentioned before that the solution to this challenge is modular/hierarchical verification. However, this only works if one has reasonably small and well-defined modules; the modules and their boundaries have to be designed explicitly. It is very hard to take an existing spaghetti-system and try to verify it. So, similar to test-driven development, a system has to be designed to be verifiable. On the other hand, building a system that is verifiable leads to a well-defined system.

Leakiness of Abstractions. A core idea of the approach described in this tutorial is to "hide" the verification tool and the involved formalisms behind the DSL. However, sometimes the verification tool requires additional configuration parameters (in addition to the program and the specification), and these might be hard to set correctly. For example, for bounded model checking, one has to specify the bounds (i.e., the number of steps that should be considered). If the number is too low, a property violation might not be detected. If the number is too high, the analysis runs (unnecessarily) long. For the end user, finding good values for such parameters can be a challenge. The parameters cannot (always) be set automatically as part of the transformation of the DSL to the verification tool (cf. Fig. 3), because there are no obvious algorithms to derive them from the program. If the parameters affect performance, one approach to automating the process is to use relatively small values in the IDE when the verification runs interactively (low soundness; errors might not be found!), but then run the same analysis with bigger parameter values (higher soundness; more errors can be found) on the continuous integration server (one might then receive an error from the server even though the local verification runs fine).

Non-Linearity. Many verification tools are based on heuristics: based on the structure of the to-be-verified program and/or the properties, they "guess" or "estimate" the right approach to the verification. This means that, sometimes, a small change to the model or property can have a big impact on the performance (and, if used with timeouts, the success) of a verification. This situation is known from relational databases, where, depending on the specific formulation of an SQL query, the performance can vary significantly; the reason is that query optimizers are also often heuristics-based.

2.14 Further Reading

While we mention specific references throughout the booklet, this paragraph lists some general recommended material. First, if you are somebody who learns by listening, you might want to check out the omega tau episode on specification and proof with Benjamin Pierce [48]. A comprehensive (and free) book on static program analysis is the one by Møller and Schwartzbach [31]. Another one is the book by Nielson et al. [33] mentioned before.

3. Type Checking

3.1 A Recap of Basic Types

The purpose of many static analyses is to associate some value, representing the result of the analysis, with a program node. The value might be constructed through a more or less sophisticated analysis. Type checking is one of the simpler ways of analyzing a program; the value associated with the program node is called the type, and it represents the data this program node can hold (e.g., the type short int means that the node to which it is assigned can hold numeric integer data from -128 to 127). Type checking is a value analysis, so only expressions, i.e., program nodes that represent a value at runtime, can have types.[18] The rules for determining the types are typically one of the following (a minimal sketch of such rules in code follows the list):

• Fixed types for certain language concepts. For example, the number literal 1 always has type int, and the type of int is also always int.[19] Sometimes the type might depend on internal structure: for example, a number literal with a dot in its value (as in 33.33) might be typed as float instead of int.

• Types may be derived from other types. For example, the type of a variable reference is typically the type of the referenced variable. More complicated type derivations are also common. For example, the type of + in 3 + 2.5 is float, i.e., the more general of the two argument types. To determine which is more general, a type system has to be able to describe subtyping rules, such as int is-subtype-of float.

• Types may depend on the program structure, i.e., the location of the to-be-typed node inside the AST. For example, a variable reference that points to a variable of an option type in general also has that option type (as discussed above). However, if the reference occurs under a node that ensures that the option is not none (e.g., if x then x + 1 else 0), the type could be int.
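Such rules are easy to prototype. The following is a minimal sketch in Python of the first two rule kinds (the AST classes Lit and Add are hypothetical stand-ins for real language concepts):

    from dataclasses import dataclass

    @dataclass
    class Lit:            # number literal node
        text: str

    @dataclass
    class Add:            # '+' node
        left: object
        right: object

    def typeof(node):
        if isinstance(node, Lit):
            # fixed type, depending on internal structure (dot => float)
            return 'float' if '.' in node.text else 'int'
        if isinstance(node, Add):
            l, r = typeof(node.left), typeof(node.right)
            # derived type: the more general of the two, since int <: float
            return 'float' if 'float' in (l, r) else 'int'
        raise TypeError('not an expression')

    print(typeof(Add(Lit('3'), Lit('2.5'))))  # float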

Languages with static type checking can support type inference. This means that a type does not have to be specified explicitly by the user. So instead of writing var int x = 10; the user can just write var x = 10; and the type system infers the type of x from its init value 10. Explicitly given types can be seen as a specification against which the types that are computed from other program structures are checked. Type inference is useful if types become more complicated: for example, it is nice if a large nested generic type does not have to be written out explicitly.

5. SMT Solving

5.1 Introduction

Satisfiability Modulo Theories (SMT) solvers are a category of automated theorem provers that, instead of deciding the validity of (mathematical) theorems directly, focus on solving the equivalent satisfiability problem for logical formulas.[29] For more information on theorem proving in general, see [9]. Hence, given a logical formula f, an SMT solver either

• states that f is satisfiable (SAT) and provides a model (an assignment of values for all free variables in f such that f is true under this assignment), or

• states that f is unsatisfiable (UNSAT), which means there is no such model. However, in this case, modern SMT solvers are able to provide a proof or other information that helps locating and understanding the problem (unsat core[30]).

Contrary to SAT solvers, which only support propositional logic, SMT solvers support full first-order logic along with a number of so-called theories. This means that, depending on the particular tool and its supported theories, a solver can also deal, for example, with integer/real/float arithmetics, bit vectors, or arrays and lists. Note, however, that in general, satisfiability of first-order logic with integer arithmetics alone is undecidable, which is why SMT solvers follow a heuristics-based approach. It works amazingly well for most formulas, but for some formulas, SMT solvers will inevitably return the result UNKNOWN. Various SMT solvers exist (such as Alloy [29], Choco [21] or Z3 [14]; we use Z3), and they differ mainly in which theories they support, how well they scale with the problem size, as well as in their particular APIs. Contemporary solvers can solve thousands of equations with tens of thousands of variables in very short time spans (sub-seconds to seconds). However, as another consequence of their heuristic approach, a small change to the input formulas may have a significant impact on the solver's performance.

As an example, consider the following program, where a and b are assigned unknown (symbolic) input values σ and τ; can the division in line 9 ever divide by zero?

1  a = σ
2  b = τ
3  ...
4  if (a > 0) {
5    c = a + b
6  } else {
7    c = a * a
8  }
9  d = a / c

If σ > 0, we know that c = σ + τ; otherwise we know that σ ≤ 0 and c = σ * σ. In line 9 we can therefore say that

a = σ ∧ b = τ ∧ ((a > 0 ∧ c = a + b) ∨ (a ≤ 0 ∧ c = a * a)) ∧ d = a/c

To answer our original question whether a division by zero can occur, we can add another constraint c = 0 and solve this set of equations with a constraint solver. If the solver finds a solution (values for σ and τ), it will report it; this means that for this solution c can be 0. If it cannot find a solution, the solver reports UNSAT, which means here that no division by zero can ever occur. The encoding in the Z3 solver looks as follows:[28]

(declare-var a Int)
(declare-var b Int)
(declare-var c Int)
(assert (or (and (> a 0) (= c (+ a b)))
            (and (<= a 0) (= c (* a a)))))
(assert (= c 0))
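The same question can also be asked through Z3's Python API. The following is a minimal sketch, assuming the z3-solver package; variable names follow the example above:

    from z3 import Int, Solver, Or, And, sat

    a, b, c = Int('a'), Int('b'), Int('c')

    s = Solver()
    # The two branches: c = a + b if a > 0, else c = a * a.
    s.add(Or(And(a > 0, c == a + b),
             And(a <= 0, c == a * a)))
    # Ask whether the divisor can be zero.
    s.add(c == 0)

    if s.check() == sat:
        print("division by zero possible, e.g.:", s.model())
    else:
        print("no division by zero can occur")

Here the solver reports SAT (for example with a = 0, so that c = a * a = 0), confirming that the division can indeed fail.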

This machinery also covers the decision table checks motivated in Sec. 2.10. To check the row headers for completeness, we pass their negated conjunction to the solver, for the example ... ¬(speed > 30) ∧ ¬(speed > 40) ..., implicitly asking the solver to find a value for speed that makes this formula become true.[31] In general, for a decision table with n row headers r_1, ..., r_n, the decision table's rows are complete iff the following is unsatisfiable:

¬r_1 ∧ ¬r_2 ∧ ... ∧ ¬r_n

In our example above, the solver is able to satisfy these equations and finds the model speed == 30. The row headers are hence not complete, since the case where speed is exactly 30 is missing. The other validity criterion was overlap-freedom: for any particular set of input values, only one of the expressions must be true. The row headers r_1, ..., r_n of a decision table are overlap-free iff, for solver-chosen indices i, j ∈ {1, ..., n}, the formula

i ≠ j ∧ ⋀_{i'=1..n} (i = i' ⇒ r_{i'}) ∧ ⋀_{j'=1..n} (j = j' ⇒ r_{j'})

is unsatisfiable. If it is satisfiable, then the model would indicate a) which expressions (i, j) overlap, and b) a value assignment that makes both expressions true.

As can be seen from the above example, the expressions that must be passed to the solver can become voluminous, and the various negations can make things unintuitive. It thus makes sense to provide intermediate abstractions, based on the observation that many user-relevant problems can be phrased in terms of a few core idioms for a list of Boolean expressions E_1, ..., E_n. One of these idioms:

Equality: Are the E_i semantically equivalent, even though they differ structurally (think: De Morgan's laws)? Examples include refactorings that simplify expressions.

While these intermediate abstractions are useful in many contexts, this is not an exhaustive list. A tool that uses such intermediate abstractions as part of its architecture must make sure that the list of abstractions is extensible.

[31] The use of negation to "drive" the solver in a particular direction is typical.
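Both checks can be scripted directly in Z3's Python API. A minimal sketch follows; the three row conditions are a hypothetical reconstruction around the speed example, and overlap is tested pairwise here rather than via the index encoding above:

    from itertools import combinations
    from z3 import Int, Solver, And, Not, sat

    speed = Int('speed')
    rows = [speed < 30, speed > 30, speed > 40]  # hypothetical row headers

    # Completeness: the conjunction of all negated rows must be UNSAT.
    s = Solver()
    s.add(And([Not(r) for r in rows]))
    if s.check() == sat:
        print("incomplete, uncovered input:", s.model())  # e.g. speed = 30

    # Overlap-freedom: every pair of rows must be mutually exclusive.
    for (i, ri), (j, rj) in combinations(enumerate(rows), 2):
        s = Solver()
        s.add(And(ri, rj))
        if s.check() == sat:
            print(f"rows {i} and {j} overlap at:", s.model())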

5.2 Integration Architecture

Intermediate Language. A large variety of end-user relevant SMT-based checks can be encoded as one or more of these basic idioms. It is thus worth making them available as first-class citizens. We call those abstractions the SMT Intermediate Language, or Solver Language for short. Fig. 13 depicts the basic idea. Note that the idea of solver-supported languages is not new (we mention Dafny and Rosette later). In particular, Boogie [23] is an intermediate verification language; it has the same goal as our solver language, but is much more sophisticated. The solver language supports the following concepts: variables with types (Boolean, integer and enum, for now), constraints between those variables, as well as checks. A check is a particular question asked of the solver; a check is available for each of the abstractions identified above. From a user's perspective, the integration of the solver language with the actual solver is generic: it can be used for any user DSL. Only the translation from the user DSL to the solver language has to be implemented specifically for each DSL. Since the abstraction gap is smaller than for a direct low-level encoding of the SMT problem, the effort is reduced. In addition, the step from user DSL to solver language can be implemented as a model-to-model transformation with the means of the language workbench (for example, in MPS), which means that the developer does not have to address the more technical issues such as inter-process communication, timeouts, as well as the details of the solver API (we use SMTLIB for the low-level integration [1]).

Figure 13. Integration architecture of the SMT solver. We translate DSL constructs to abstractions of an intermediate solver language (using the means of the language workbench) and then use solver-native APIs to communicate the intermediate abstractions to the solver itself.

Simple Example. Consider the following program fragment expressed in a DSL; it is called an alt expression and can be seen as a one-dimensional decision table: if the condition before the arrow is true, the value after the arrow is the result of the expression.

fun decide(a: int) = alt | a < 0  => 1 |
                         | a == 0 => 2 |
                         | a > 0  => 3 |

Similar to decision tables, the three conditions should be checked for completeness and overlap-freedom. Using the intermediate abstractions discussed above, this can be expressed as follows with our solver language:

variables:
  a: int
constraints:
checks:
  completeness    { a < 0, a == 0, a > 0 }
  non-overlapping { a < 0, a == 0, a > 0 }

Tool Integration. Fig. 16 shows the integration of the solver functionality into the end-user DSL and IDE (in this case, MPS). The user starts by writing code that involves a language construct for which solver support is available, in this case the alt expression (A). He then receives an info message (the grey underline) as a reminder that a manual check is available (B). This is necessary because, for performance reasons, the solver-based checks are not integrated into the realtime type system checks and must be executed manually. Once the user does this (using Cmd-Alt-Enter or mouse actions), he receives an error annotation if the code is faulty (C). Finally, after fixing the problem and re-executing the solver check, the error disappears (D). Additional support is available for running all solver-based checks in some scope (file, project) in one go and summarizing the results in a table (Fig. 14).

Figure 14. Overview assessment of all solver checks. This table shows all solver assessments: whether they are ignored (by an annotation), successful, or faulty. All solver checks can be run from this one single place through an intention. To provide context, tooltips show the node in question as well as the detailed error message. Clicking onto the node name navigates to the corresponding node.

To provide more insight into the reported errors, the solver debugger (Fig. 15) is available. When invoked, it displays the representation of the (relevant part of the) program in the intermediate solver language, inlined into the user DSL program.[32] The representation highlights all errors as detected by the intermediate layer. Since that language can also be interacted with, the user is free to play directly with this representation of the problem to better understand it on this layer.

[32] Thus, for debugging, DSL users have to understand the intermediate language, but not the mapping to the solver itself. Since the intermediate language much more directly expresses the kinds of questions asked of the solver, it is significantly more understandable than the solver code itself.

5.3 Transformation to the solver language

The intermediate language is independent of the user DSL and of the concrete checks that must be performed there, increasing the potential for reuse. On the flip side, a transformation from the user DSL to the intermediate language has to be written for every user DSL. This translation can be separated into two steps.


The first step maps the checked expressions themselves to the solver language; the second captures their context, so that values that are excluded based on types and constraints cannot be part of the solution the solver potentially finds. Revisiting Fig. 15, one can see that r is typed as an int and the constraints section limits its values to [0,10]. This constraint comes from the context in which the alt expression is used, and in particular, from the type of r. This type is inferred for the declaration of r from its value, which is a call to the function f, which has an explicit type that limits the values to [0,10]. However, capturing this context is not always so trivial. Consider the following examples:

type number10: number[0|10]
type number5:  number[0|10] where it ...

The context of other host languages, while often also based on such type definitions, is nonetheless different, so a separate context transformation had to be implemented.

Medical DSL. In a customer project we have been developing a DSL for use in software medical devices. For obvious reasons, the programs written with this DSL should be "as correct as possible", and so an integration of the solver makes sense.[34] In terms of the supported checks, we verified decision tables and decision trees, as usual. In addition, we also checked transitions: the language was based on state machines, and for any given state/event combination, several transitions could be defined with different guards, i.e., different constraints over event parameters or state-local variables. Those had to be checked for completeness and overlap. While this is the exact same check as for an alt expression in KernelF, it illustrates the broad applicability of the abstractions identified above, and also highlights the ability of the framework to integrate into different host DSLs.

[34] We have built a prototypical integration, but as a consequence of the immaturity of the solver integration framework at the time, and of project schedule pressure, we did not finalize this integration. However, this is expected to be done for version 2 of the software.

Variability. Variability modeling for product lines relies on feature models [13]. A feature describes a characteristic of the root concept. For example, in Fig. 20, the concept Car has features Engine or Radio. The diagrammatic notation expresses constraints between the presence of those features: the filled dot means mandatory (a Car must have an Engine), a hollow circle means optional. In addition to the tree, further constraints can be expressed over the features, for example:

Diesel => !Gasoline
Gasoline => !Diesel

The first task for the SMT solver is to verify that the constraints implied by the feature model (tree) plus those expressed in addition to the tree still allow some selection of features where any one of them is used; otherwise that feature is "unreachable" and can be removed (or the constraints corrected). The primary use of feature models is to define a configuration space, i.e., a set of constrained options from which the user can choose. We call a configuration any set of assignments fi = true|false for all features fi of a feature model. An SMT solver can now check if such a selection satisfies the constraints expressed by its respective feature model; if not, this configuration is invalid. Essentially, the constraints are conjoined with the true/false assignments and checked for satisfiability.

Ultimately, feature models are used to configure other artifacts (represented by other models that in turn represent software or hardware). Once a configuration is applied to a system, we have constructed a variant. There are many ways how this can be done technically, but in almost all cases, parts or fragments of the system model are annotated with so-called presence conditions: a presence condition is a Boolean expression over features of a feature model. For example, a high-performance ECU in the head unit of a car might be annotated with NavSystem || CDPlayer && Radio, meaning that this particular model element is present in all variants where the feature NavSystem or both CDPlayer and Radio are selected. For a system model M, expressed with a language LM, where model elements are annotated with presence conditions that refer to a feature model FM, the following condition must be true: for all variants allowed by FM (i.e., for all feature selections that are valid with respect to the constraints expressed by FM), all the constraints defined by LM must hold for M.

Consider Fig. 21 as an example. Nesting of boxes represents containment, i.e., syntactic nesting. Arrows represent references, dotted arrows optional references (i.e., they can be null, or unset). You can imagine the boxes as components or component instances and the lines as connectors; you can also look at the boxes as syntax nodes in a textual language, e.g., C could be a variable declaration and A a reference to this variable. Now consider attaching a presence condition to each of the boxes; we denote as P_T the presence condition for a box T. Let us now look at what properties can be verified using an SMT solver.

Figure 21. Example program structure used to explain the consistency requirements of variants.

Valid Tree. From the nesting structure, we can immediately see that the presence condition for a nested box is the conjunction of its own presence condition and of all its parents' (a syntactically nested box can only be in the program if all its parents are in the program as well). We call this conjunction the effective presence condition E_T. The first consistency check we can perform via a solver is to verify that, for every program node T that has a presence condition, the effective presence condition E_T is not empty; in other words, it is possible to select a set of features so that T is present in at least one variant of the system. Otherwise, T would be dead code and could be removed.

Referential Integrity. Referential integrity refers to the fact that a reference cannot point to "nothing". This means that E_Ref ⊆ E_Target: whenever the reference is in the variant, the target must be in the variant as well. For optional references, i.e., those that are allowed to be null, this constraint does not hold. For bidirectional references, the two effective presence conditions have to be equivalent. Both equivalence and subset are among the idiomatic solver checks and are thus easy to check.

Other Structural Constraints. The above constraints can be derived automatically from the structure of the tree; no special definitions are necessary. However, there are typically other constraints in a language. For example, in a language that describes software deployment onto a set of hardware modules, there might be constraints as to how many software components can be deployed to a specific hardware module (we discuss this example below). If this is described via constraints,[35] then those constraints can be taken into account as well and checked against the presence conditions and the feature model.

[35] In practice this is often either implicit in the code or checked using procedural/imperative/functional code.

Types. More generally, any constraint expressed by the language, and in particular typing rules, if described as constraints, can be taken into account when verifying the correctness of the presence conditions. However, types are typically not expressed as constraints, and in addition, for real-world sized programs, this will start to lead to performance considerations. So this is still a research topic; we will discuss it below.

Note that all of these checks must take the constraints from the feature model into account, i.e., we have to conjoin all the constraints from the feature model with those derived from the program/model structure. There are various flavours of feature models; one particular extension of the basic formalism uses feature attributes. For example, the Engine feature in Fig. 20 might have an attribute horsepower: int. Constraints on the feature model might involve those attributes (e.g., Diesel => Engine.horsepower == 140 || Engine.horsepower == 170), and variants will specify values for each attribute of each selected feature. Since SMT solvers support integer arithmetics, the attributes and their values in variants can easily be part of constraints.

5.5 Checking vs. Finding Solutions

So far, we have used the solver mostly to check properties of a program (or model): we have submitted a couple of formulas and asked the solver whether they are satisfiable. If they are not, the solver answers UNSAT. So, for example, the following set of equations

2 * x == 3 * y
x + 2 == 5

is satisfiable by the model x = 3, y = 2. As we can see, finding a model entails assigning values to those variables in the equations that do not yet have a value (i.e., where the constraints still leave some assignments possible). Thus, we can use the solver to find solutions for a problem, not just to check the validity of an existing solution. For example, for the feature model example we could ask the solver to "find any variant of the car that has a Diesel engine and a NavSystem, if there is one". We will exploit this in the upcoming examples.
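In Z3's Python API, finding such a model is simply a check() followed by model(). A minimal sketch for the two equations above:

    from z3 import Int, Solver, sat

    x, y = Int('x'), Int('y')
    s = Solver()
    s.add(2 * x == 3 * y, x + 2 == 5)
    if s.check() == sat:
        print(s.model())  # x = 3, y = 2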


5.6 Iterative Solving

Iterative Solving and Optimization. Consider the equation 2 * x == 3 * y, where the variables x and y are of type number[0|100]. Let us further assume we want to find the largest possible values for x and y. Here is what we might send to the solver:

2 * x == 3 * y
x ≥ 0 ∧ x ≤ 100
y ≥ 0 ∧ y ≤ 100

A valid solution would be x = 0, y = 0. It is likely that the solver finds this one, because it represents a kind of extreme case that is likely to be found first by the heuristics-based search. However, these are certainly not the largest possible values for x and y that meet the original equation. To drive the solver to less trivial (and in this case, larger) solutions, we restrict the solution space to exclude this (valid, but unwanted) solution by adding the constraints x > 0 and y > 0, and run the solver again:

2 * x == 3 * y
x ≥ 0 ∧ x ≤ 100
y ≥ 0 ∧ y ≤ 100
x > 0
y > 0

Next, the solver might find x = 3, y = 2; add that to the constraints, and try again:

2 * x == 3 * y
x ≥ 0 ∧ x ≤ 100
y ≥ 0 ∧ y ≤ 100
x > 0
y > 0
x > 3
y > 2

By repeatedly running the solver, and by adding the n-th solution to the constraints before the next run, one can drive the solver to better solutions (in this case, better means bigger, because we want to find a maximum). The iterative process is stopped either when the solver does not find a next solution (UNSAT), or when it takes longer than some user-defined timeout. In either case, the last found solution is the best one that could be found by the solver in the allotted time. This optimization process based on declaring the previous solution invalid can be automated in another intermediate language construct (not done yet); a sketch of the loop follows the next example.

A more practical example of optimization is the following. Consider again the car example shown in Fig. 20. Let us assume we wanted to find the cheapest variant that contains a TrailerHitch. We could do this as follows:

• We assign a price attribute/constant to each feature (e.g., NavSystem.price = 5000). Conceptually it can be seen as a function int price(Feature); SMT solvers typically support such functions directly.

• In order to only count the price of selected features, we can add another function int effPrice(Feature f) = (f ? f.price : 0), which returns 0 as the price for non-selected features.

• We define a variable totalPrice as the sum of all effective prices of all features, i.e., totalPrice = effPrice(Engine) + effPrice(NavSystem) + ....

• We then run the solver iteratively for the variable totalPrice, driving it down by consecutively adding totalPrice < previouslyFoundTotal.
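The announced sketch of the iterative loop, in Z3's Python API and for the simpler x/y maximization example above:

    from z3 import Int, Solver, And, sat

    x, y = Int('x'), Int('y')
    s = Solver()
    s.add(2 * x == 3 * y, x >= 0, x <= 100, y >= 0, y <= 100)

    best = None
    while s.check() == sat:
        m = s.model()
        best = (m[x].as_long(), m[y].as_long())
        # Declare the current solution (and anything smaller) invalid,
        # driving the solver toward larger values.
        s.add(And(x > m[x], y > m[y]))

    print("best solution found:", best)  # (99, 66) for this equation

A production version would additionally guard the loop with a timeout, as described above.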

Test Case Generation. Testing functions (and similar "things with argument lists") involves invoking the function with particular values for the arguments and then asserting that the result is correct. A core challenge is to come up with a varied set of argument values that "sufficiently" tests the function. We can use a solver in a similar way as for the optimization case: the subsequent sets of value assignments serve as test vectors. In contrast to the optimization case, where we try to "push" the solver in a certain direction by adding constraints of the form arg > prevSolution, in test case generation we just use arg != prevSolution. However, there are a couple of additional considerations. If the argument list is unconstrained, i.e., it is just a list of typed arguments, it is relatively easy to find values for those arguments. However, it is also likely that some of these combinations are intentionally invalid: for example, the function implementation might check for an invalid combination and throw an exception. Examples include:

int divide( int a, int b ) {
  ...
  if ( b == 0 ) throw InvalidDivident();
  ...
}

int sendToOneOf( Receiver r1, Receiver r2 ) {
  ...
  if ( r1 != null && r2 != null ) throw OnlyOneAllowed();
  ...
}

The problem is that these checks are buried in the implementation. A better solution is to factor them into declarative preconditions:

int divide( int a, int b )
    where b != 0
{ ... }

int sendToOneOf( Receiver r1, Receiver r2 )
    where r1 != null && r2 == null ||
          r1 == null && r2 != null
{ ... }

This way, these constraints can be included in the set of equations sent to the solver, so it is guaranteed that the solver only finds solutions (test vectors) that respect these preconditions. Of course, a similar check then also needs to be performed at runtime, for example, by generating the declarative preconditions into regular asserts at the beginning of the function. Another concern is that, no matter which argument values the solver comes up with, one usually wants to explore the boundaries of the ranges allowed by the data types, as well as special values such as 0, 1 or -1. So in addition to letting the solver find arbitrary solutions, you can start with some "hardcoded" ones. Note that preconditions are once again helpful: if those special values are not allowed, you will just get an UNSAT, so you can try them without harm. A problem with test case generation is that you cannot "generate" the expected results. So by default, you can just run the test and check that the function does not crash. Alternatively, you can of course fill in the expected values manually: this still benefits from the automatic exploration of the inputs. One step better is to also specify postconditions for the functions under test. These can be seen as a kind of implicit test assertion. So if your functions have postconditions, you can consider them as an additional approximation of whether the test succeeded. Obviously, the usefulness of this approach depends on the fidelity of the postconditions. Consider the following example (which you have already seen in the introduction, where we used it as an example of a good specification):

list sort( list data )
  post res.size == data.size
  post forall e in data | res.contains(e)
  post forall i, j: int | i >= 0 && i < res.size &&
                          j >= 0 && j < res.size &&
                          i > j => res[i] >= res[j]
{ ... }

This postcondition specifies the complete semantics of the sort function (leaving only non-functional concerns open for the implementation): the size of the result must be the same as the size of the input, every element of the input must be contained in the result, and the higher-index elements in the result must be greater than or equal to the lower-index ones. If the postcondition is this detailed, the generated test cases, together with checking the postcondition, are all you need for testing.[36]

[36] Of course, getting the postcondition correct might be a challenge in itself, as we have outlined in the introduction. So you might want to use a couple of completely manually written test cases as well.

Just as in the case of iterative optimization, you need a criterion for when to stop iterating. Again, you can stop when you run into a timeout, but this does not guarantee anything about the number or quality of the generated input vectors. A better approach is to iterate until you have found a minimum number of test vectors. However, you still do not know whether those cover the complete implementation. Thus, you can combine test case generation with test case execution and measure the coverage of the function, for example, by instrumenting the implementation. You continue iterating until you achieve a specified minimum coverage.
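A minimal sketch of this generation loop in Z3's Python API, for the divide function; the stop criterion here is simply a fixed number of vectors, and boundary values would be seeded in addition, as discussed above:

    from z3 import Int, Solver, Or, sat

    a, b = Int('a'), Int('b')
    s = Solver()
    s.add(b != 0)                            # the declarative precondition

    vectors = []
    while len(vectors) < 5 and s.check() == sat:
        m = s.model()
        va = m.eval(a, model_completion=True).as_long()
        vb = m.eval(b, model_completion=True).as_long()
        vectors.append((va, vb))
        s.add(Or(a != va, b != vb))          # arg != prevSolution

    print(vectors)  # run divide(a, b) on these and check its postcondition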

5.7 Advanced Uses of Solvers

Mapping of Sets. Consider a typical problem from systems engineering: you have a set of software components as well as a set of hardware nodes. Some pairs of software components have communication relationships through connectors, as shown in Fig. 19. Similarly, some of the hardware nodes have network connections. The problem to be addressed with the solver is: what are valid deployments of software components onto hardware nodes? Such a mapping must take into account the obvious constraint that, if two software components communicate, then the nodes onto which the two are deployed must either be the same, or be (indirectly?) connected with a network. However, there are typically two additional considerations. First, the bandwidth required by the communication of deployed software components cannot be higher than what is available on the underlying networks. And second, the resource requirements (such as memory or processing power) of the software components deployed on a given hardware node must be lower than what is supplied by the node. In the example below, we only look at the second constraint, to keep things simpler. This is an example of a more general problem, where we have

• two sets S and H,

• the type of the elements in S and H are essentially records (i.e., each element has a unique identity plus a list of typed attributes),

• elements within both S and H are potentially linked to each other,

• a number of properties (expressed in a logic specialized to talk about relations, (sets of) nodes and their attributes, as well as links),

• and we are interested in a relation R ⊆ S × H such that R satisfies all properties.

As usual, the system can then be used to a) check an existing relation, b) generate a new one or complete a partial relation, or c) optimize for a parameter (e.g., minimum overall network traffic) through iteration.


The software/hardware example can be concisely expressed with the following DSL (which we have also implemented as a check in the intermediate solver language):

record S {
  requiredMemory: int
}

record H {
  availableMemory: int
}

relation communicates S -> S   // 1:1 communication
relation network      H -> H   // 1:1 network connections
relation deployed     S *-> H  // allow many S per one H

constraint for h: H {
  forall [s, h] in deployed:
    sum(s.requiredMemory) <= h.availableMemory
}

A given system with three communicating software components and two connected hardware nodes might then be represented as follows; since the relation deployed is unspecified, this is what the solver will compute.

...
communicates s1 -> s2
communicates s1 -> s3
network h1 -> h2

Note that a first check for this model can verify whether all required attribute values are set, whether the values have the correct types, and whether the cardinalities of the relations (1:1, 1:n) are respected. No solver is required; these are plain old (type) checks on the AST.

Next, we describe the mapping to the solver. We assume the capabilities of Z3, but will spare our readers the Lisp-like SMTLIB syntax. We start with type aliases of int that represent hardware nodes and software components.

type S: int
type H: int

Next, we represent node properties as functions that return the required memory for each software component. We first define the signature, and then assert over the outcome of the function call for specific values of s. The values of s correspond to the enums s1, s2 and s3. Essentially, this is a constraint-oriented way of providing a function implementation. Z3 would also support directly defining functions, but the more explicit way we use here leads to UNSAT cores that contain useful information and allows us to inform users about which properties contributed to a problem. The forall at the end constrains the function's value for all values outside [1..3]; without it, the solver would be free to choose these values as it sees fit, even outside the range of our "enum".

fun requiredMemory(s: S): int
assert requiredMemory(1) == 20;
assert requiredMemory(2) == 10;
assert requiredMemory(3) == 40;
assert forall x: int. (x < 1 || x > 3) => requiredMemory(x) == 0

fun availableMemory(h: H): int
assert availableMemory(1) == 50;
assert availableMemory(2) == 20;
assert forall x: int. (x < 1 || x > 2) => availableMemory(x) == 0

Next, we model the communication relationships. Note again that we also explicitly model all pairs of components that do not communicate, because, if those were missing, the solver would interpret this as a degree of freedom ("if I let these other two components communicate, then I could make this deployment work"). Finally, we express with a forall that, for values outside of [1..3], no communication happens. We then do the same for the network connections between hardware nodes.

fun communicates(from: S, to: S): bool;
assert !communicates(1, 1); assert communicates(1, 2);  assert communicates(1, 3);
assert !communicates(2, 1); assert !communicates(2, 2); assert !communicates(2, 3);
assert !communicates(3, 1); assert !communicates(3, 2); assert !communicates(3, 3);
assert forall x: S, y: S. (x < 1 || x > 3 || y < 1 || y > 3) => !communicates(x,y)

fun network(from: H, to: H): bool;
assert !network(1, 1); assert network(1, 2);
assert !network(2, 1); assert !network(2, 2);
assert forall x: H, y: H. (x < 1 || x > 2 || y < 1 || y > 2) => !network(x,y)

Next up are variables that capture onto which H each S is deployed. However, we constrain the value range of these variables: they can only be [1..2], because we have only two hardware nodes, h1 and h2.

var deployed_s1 H
var deployed_s2 H
var deployed_s3 H
assert 0 < deployed_s1 ...
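A minimal sketch of the complete deployment problem in Z3's Python API, hand-translated from the encoding above; the per-node memory check and the communicates-implies-connected rule follow the prose, and all names come from the example:

    from z3 import Int, If, And, Or, Solver, sat

    # One integer variable per software component: the node it is deployed to.
    d = {name: Int(f'deployed_{name}') for name in ('s1', 's2', 's3')}
    req = {'s1': 20, 's2': 10, 's3': 40}   # required memory per component
    avail = {1: 50, 2: 20}                 # available memory per node
    links = [(1, 2)]                       # network h1 -> h2

    s = Solver()
    for var in d.values():
        s.add(And(1 <= var, var <= 2))     # only nodes h1 (1) and h2 (2) exist

    # Communicating components must share a node or sit on networked nodes.
    for a, b in [('s1', 's2'), ('s1', 's3')]:
        ok = [d[a] == d[b]]
        for (x, y) in links:
            ok += [And(d[a] == x, d[b] == y), And(d[a] == y, d[b] == x)]
        s.add(Or(ok))

    # The memory demand on each node must not exceed its supply.
    for h, cap in avail.items():
        s.add(sum(If(d[n] == h, req[n], 0) for n in d) <= cap)

    print(s.model() if s.check() == sat else "no valid deployment")

For these numbers the solver finds, for example, s1 on h2 and s2, s3 on h1.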

The same solver machinery can also be used for bounded model checking of state machines: the machine's current state, the incoming event and the variable values are represented as per-step variables state[t], event[t] and v[t], and each transition becomes a constraint that relates step t to step t+1. For a machine with states A, B and C, events E and F, and a variable v, the encoding looks as follows:

// this is the initial condition for t == 0
state[0] == A && v[0] == 0
state[t] == A && event[t] == E => state[t+1] == B && v[t+1] == v[t]
state[t] == A && event[t] == F => state[t+1] == C && v[t+1] == v[t]
state[t] == B && event[t] == E => state[t+1] == C && v[t+1] == v[t] + 1

Notice how we are not constraining the event for time t+1. This is because this is what we will task the solver to find out. Remember that an SMT solver solves sets of equations, i.e., it tries to find values for all non-constrained variables so that all equations become true. In this case, the set of unconstrained variables is the sequence of events event[t_i] that are processed by the state machine. So, if the solver finds a solution, then it has computed a sequence of events, and thus, implicitly, a sequence of total states.

Encoding Properties. To encode an AG property (such as the AG(state != AnErrorState) discussed in the previous subsection, which corresponds to forall t: state[t] != AnErrorState), one can simply add its negation as another constraint:

exists t: state[t] == AnErrorState

The resulting model then contains a list of total states for all t that lead to an invalid state. The IDE can illustrate this in various ways; what we did in our experiment was to have a table with the values for the total state where the user can click through, highlighting that state in the state machine. We also played with showing the values directly inline in the state machine, thereby animating the machine itself.
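A minimal sketch of this unrolling in Z3's Python API; the bound K, the integer encoding of states and events, and the stutter step (which lets traces shorter than K fit the fixed bound) are assumptions of the sketch, and C stands in for AnErrorState:

    from z3 import Int, Solver, Or, And, sat

    K = 10                     # unrolling bound (an assumed value)
    A, B, C = 0, 1, 2          # states encoded as integers
    E, F = 0, 1                # events encoded as integers

    state = [Int(f'state_{t}') for t in range(K + 1)]
    event = [Int(f'event_{t}') for t in range(K)]
    v     = [Int(f'v_{t}')     for t in range(K + 1)]

    s = Solver()
    s.add(state[0] == A, v[0] == 0)   # initial condition
    for t in range(K):
        s.add(Or(
            And(state[t] == A, event[t] == E, state[t+1] == B, v[t+1] == v[t]),
            And(state[t] == A, event[t] == F, state[t+1] == C, v[t+1] == v[t]),
            And(state[t] == B, event[t] == E, state[t+1] == C, v[t+1] == v[t] + 1),
            And(state[t+1] == state[t], v[t+1] == v[t]),  # stutter step
        ))

    # Negation of AG(state != AnErrorState): some step reaches the error state.
    s.add(Or([state[t] == C for t in range(K + 1)]))

    if s.check() == sat:
        m = s.model()
        print([m.eval(e, model_completion=True) for e in event])  # violating trace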

7. Discrete Event Simulation

In general, dynamic simulation of a system means imitating its behavior as it progresses through time [38]. In discrete event simulation [16], only those points in time are represented at which the state of the system changes. Thus, the system is modeled as a series of events. An event is a particular instant in time when a state change occurs. Discrete event simulators (DES) are especially useful for understanding complex systems, where applying ...
