CO7205: Advanced Systems Design Course notes


CO7205: Advanced Systems Design Course notes — Last updated December 6, 2014 — Emilio Tuosto, Department of Computer Science, University of Leicester, UK

Abstract. This course will consider several models for designing distributed applications. More precisely, we will consider a few modelling formalisms to specify and reason about application-level protocols. After a brief discussion (tailored to distributed systems and applications) on software architectures, we discuss choreographies and how they can be used in the design of distributed applications.

Disclaimer This is the first revision of the notes, which were written last year; new revised versions will be made available during the module. Despite the author's effort to avoid mistakes or inaccuracies, the presence of some typos or errors is inevitable. The author begs students for their patience and warmly encourages them to suggest improvements and flag any oddities they may find in these notes.

Table of Contents

CO7205: Advanced Systems Design Course notes (Emilio Tuosto) . . . . . . . . . . 1

Part I    Overview and Motivations
1  Introduction: context and motivations . . . . . . . . . . . . . . . . . . . .  7

Part II   Software Architectures
2  Software Architectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
   2.1  A Note on Connectors . . . . . . . . . . . . . . . . . . . . . . . . . . 16
   2.2  Coupling vs Cohesion . . . . . . . . . . . . . . . . . . . . . . . . . . 17
   2.3  Architectural styles . . . . . . . . . . . . . . . . . . . . . . . . . . 19
   2.4  Client-Server style . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
   2.5  Pipe-filter style . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
   2.6  Blackboard style . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Part III  From Global Specifications...
3  What is a choreography? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4  Choreographies as interaction diagrams . . . . . . . . . . . . . . . . . . . . 28
5  An execution model (by examples) . . . . . . . . . . . . . . . . . . . . . . . 31
6  More expressiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
   6.1  Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
   6.2  A note on choice and non-determinism . . . . . . . . . . . . . . . . . . 37
   6.3  Problems with branching . . . . . . . . . . . . . . . . . . . . . . . . . 39
7  Global Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
   7.1  Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
   7.2  Execution flow of global graphs . . . . . . . . . . . . . . . . . . . . . 48
   7.3  A more complex scenario . . . . . . . . . . . . . . . . . . . . . . . . . 51

Part IV   ...To Local Specifications
8  Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
   8.1  Communicating Finite State Machines . . . . . . . . . . . . . . . . . . . 58
9  Projections of Global Graphs . . . . . . . . . . . . . . . . . . . . . . . . . 63

Part I Overview and Motivations

1 Introduction: context and motivations

It is widely accepted that distributed systems and applications are not easy to design, implement, verify, deploy, and maintain. Not only are there intrinsic issues due to the underlying computational model (concurrency, physical distribution, fragility of the communication networks and their low-level protocols, etc.), but applications also have to be engineered within a strange schism. In fact, applications are typically made of computational components that, on the one hand, have to collaborate with each other while, on the other hand, may have to compete for resources and/or pursue conflicting goals. Moreover, paradigms like service-oriented computing or its recent offspring, cloud computing, envisage distributed applications as autonomous components that dynamically discover and bind to each other. Non-functional requirements make the problem even more intricate. For instance, different administrative domains may impose different access policies to their services or resources, or they may provide different (levels of) services to their users. Of course, the picture gets worse when factoring in malicious components/users that may try to spoil or take advantage of non-robust applications. For such (and other) reasons, developers are required to carefully design their applications so that unintended behaviours do not happen at runtime. This is becoming more and more important since distributed applications are becoming ubiquitous and are starting to be used not only in commercial settings but, more importantly, in many other aspects of people's lives. For instance, the issues above affect e-health or e-government applications.

Approaches

The issues briefly discussed above call for design and implementation methodologies grounded on rigorous foundations, so that applications can be precisely specified (and verified) according to
– the assumptions they rely upon, and
– the guarantees they should provide to their partners.
In other words, distributed applications should be built out of contracts that precisely state the expectations an application has on its execution context and what it guarantees to its partners. In fact, application-level protocols can be thought of as contracts that stipulate the expected communication patterns of distributed components.

Remark 1 The notion of "contracts" is not new in computer science and dates back to the seminal work of Floyd [5], Dijkstra [4] and Hoare [7], who pioneered the idea of decorating software with statements that should hold true when the control reaches them. Probably the most successful practical fallout of these research lines is Meyer's design-by-contract (DbC); in fact, DbC is nowadays an effective software development technique in object-oriented programming [9]. The idea of DbC is that a method of an object realises a precise contract between invokers and the object, expressed in terms of pre- and post-conditions. For instance, in the two declarations below (written in a fictional syntax)

    int keepGoing(int p)
    // pre:  p >= 0
    // post: keepGoing(p) > p

    int keepGoing(int p)
    // pre:  p < 0
    // post: keepGoing(p) < p

the assertions on the first declaration establish that if the method keepGoing is invoked with a non-negative integer p, it returns an integer strictly greater than p; instead, the assertions on the second declaration stipulate that keepGoing returns a value strictly lower than p when the latter is negative.

The benefits of the DbC approach in software development are undisputed. As a matter of fact, DbC allows programmers to avoid errors in their software and greatly reduces defensive programming, that is, the need to clutter programs with code checking that the state of the computation meets the conditions expected by a construct before its execution. Moreover, DbC also enables the automatic synthesis of monitors that check the conditions stipulated in the contracts at run-time.

In recent years, software has been shifting from 'standalone' applications to mobile and dynamically composable ones. This has in turn increased the emphasis on communication and interaction. In this context, it is crucial to define and use languages, methodologies, and tools to specify, analyse, and verify distributed application-level protocols. The properties of application-level protocols that are typically of interest extend classical properties like deadlock- or livelock-freedom with progress or the absence of orphan messages, which we will define later.

Remark 2 The so-called behavioural types (which constrain the reciprocal interactions of distributed participants) are an example of such formalisms. Notably, behavioural types are at the basis of the recent project Behavioural Types for Reliable Large-Scale Software Systems (BETTY, oc-2011-2-10054-EU COST Action), which aims to develop new foundations, programming languages, and software development methods for communication-intensive distributed systems.

Unlike in the classical DbC approach (where contracts are specified by directly annotating software with assertions), application-level protocols specify high-level contracts at design level. Such a difference should probably be explored more deeply, as it introduces interesting questions (that will not be addressed in this course). For instance, contracts of high-level specifications do not guarantee the correctness of actual implementations. On the other hand, code correctness does not guarantee that, e.g., the protocol of an application enjoys good properties. It is therefore necessary to establish clear links between the two levels and study their complementary interplay.

Remark 3 A possible drawback of using sophisticated contracts in distributed applications is that design and implementation become more complex. Indeed, this is what happens in the DbC approach in object-oriented programming mentioned above.2 However, this seems to be a necessary price to pay, since the problem is complex and simplistic approaches do not provide satisfactory solutions.
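To make the run-time monitoring idea mentioned above concrete, the following is a minimal sketch (not part of the original notes) of how the keepGoing contract could be checked at run-time in Python; the contract decorator and the trivial implementation of keep_going are assumptions introduced only for illustration.

    # A minimal sketch of a run-time monitor for a DbC-style contract.
    # The contract mirrors the first keepGoing declaration above:
    #   pre:  p >= 0      post: result > p

    def contract(pre, post):
        """Wrap a one-argument function so that its pre/post-conditions are checked at run-time."""
        def wrap(f):
            def monitored(p):
                assert pre(p), "pre-condition violated"
                result = f(p)
                assert post(p, result), "post-condition violated"
                return result
            return monitored
        return wrap

    @contract(pre=lambda p: p >= 0, post=lambda p, r: r > p)
    def keep_going(p):
        return p + 1      # any implementation satisfying the contract would do

    keep_going(3)         # fine
    # keep_going(-1)      # would raise AssertionError: pre-condition violated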

2 In this respect, the reaction of students in software engineering from different universities reported in [11] is interesting.


Part II Software Architectures

2 Software Architectures

Modern applications can hardly be imagined as 'stand-alone' software. Applications are more often than not developed in a way that permits their smooth integration, so that new applications can be obtained by integrating existing ones. A paradigmatic example of such a trend is offered by new paradigms such as the so-called service-oriented or cloud computing. Those emerging paradigms cater for simple integration and cooperation of software but – at the same time – they yield new problems and challenges. In this context the complexity of software increases and we need more abstract approaches to deal with such complexity.
Software architectures (SA) have been used to tackle software complexity. In fact, SA are an abstraction level between (high-level) requirements and (low-level) implementations; such an abstraction allows developers to
– de-/compose and reuse computational elements (for instance, determining patterns of interactions or communications)
– identify and address system-level concerns (such as scalability, throughput, etc.)
– identify application-dependent aspects and reflect them in implementations (e.g., software components may reflect organisational aspects)
A (vague) definition of SA is the one in [1]:
"A software architecture for a system is the structure or structures of the system, which comprise elements, their externally-visible behaviour, and the relationships among them."
An appealing aspect of SA when considering distributed applications is the intrinsic separation between computational aspects and

interaction and communication. In fact, SA distinguish two different types of software elements:
– components and
– connectors
A component is an element of a system in charge of some specific functionalities (including data or resource storage, access, and management), while a connector is an element that manages component interactions in order to specify some types of communication, coordination, or adaptation.

Example 1. In a client-server architecture, clients and servers are the components while the connectors represent the communication middleware.

Typically, software architectures have a "box-and-wire" graphical representation.

Example 2. A graphical representation of a client-server architecture (cf. Example 1) where request and response ports are different is given by the following diagram

(diagram: boxes C and S connected by two connectors, drawn as bullets)

that represents a client C and a server S interacting through two connectors (represented as bullets). The connectors may be realised in many different ways; for instance, a connector could be a TCP/IP connection, a complex protocol, or a communication middleware (e.g., RMI or a MOM). Although the diagram above does not explicitly specify the communication pattern of the architecture, one could imagine that one connector is used by the client C to make a request to the server while the other is used by the server S for the response.

Remark 4 We do not have time to introduce a full-fledged language to express software architectures, and therefore common sense has to be exercised to 'read' the diagrams representing software architectures. For instance, the diagram does not specify, e.g., where data is or where computation happens. Common sense would suggest that, in a client-server architecture, data is generally on the server side and computation is on the client side (or on other servers if, e.g., thin clients are used).

An important concept in software architectures is the notion of architectural configuration. This concept refers to the structural layout software elements can actually assume at runtime. For instance, in a client-server architecture typically many clients can make requests to a server. This is usually not specified in the diagram describing the architecture, such as the one in Example 1.

Example 3. Using the same graphical notation as software architectures, a configuration with two clients connected to the same server of the client-server architecture of Example 1 could be depicted as:

(diagram: two clients C1 and C2 both attached to the same two connectors of the server S)

The fact that clients share the same connectors is a design choice and may depend on the nature of the connectors and/or be an application-dependent choice.

Exercise 1 Consider the client-server architecture in Example 1. Assume that at each new request from a client the server replies with the reference to a new connector where further interactions with the client take place. Draw a diagram for a configuration with two clients, one that is connected to the server but has not made any request yet, and

another one that has made a request.
Answer. A possible solution:
(diagram: clients C1 and C2 and the server S; the client that has already made a request is also linked, by a dashed line, to the fresh connector returned by the server)

the dashed line is optional. Although we will not deal with runtime configurations and their consistency with software architectures, consistency is an important property that one should guarantee. It is important to observe that at runtime such consistency can be lost as shown in the following example. Example 4. In the configuration of Example 3 the connection between C1 and the response connector may fail leading to the configuration

(diagram: the configuration of Example 3 without the link between C1 and the response connector)

which is no longer consistent with the architecture in Example 1.



Typically, if consistency between configurations and their architecture is lost, re-configurations are necessary (and can be applied manually or (semi-)automatically).

2.1 A Note on Connectors

The notion of connector and the role of connectors in a software architecture may not be as clear as the role of components. As said in § 2 (and illustrated in Example 2), connectors incorporate application-independent functionalities required – generally speaking – for the coordination of components. Roughly, connectors

are used to facilitate the interoperability of components. This means that the functionalities that connectors offer may be quite diversified and be related to
– data flow
– control flow
– data transformation
– persistence
– load balancing
– etc.

Such considerations show that there is a wide range of possible types of connectors; nevertheless, we can classify connectors as serving
1. data exchange / communication
2. control flow management
Examples of connector types are
– for (1): procedure calls, shared memory, DBMSs, data adapters, communication middlewares, ...
– for (2): task dispatchers, transactional functionalities, wrappers, ...
to mention but a few.

2.2 Coupling vs Cohesion

Modern distributed applications have two specific characteristics: they are open and dynamic. This means that applications are designed and implemented to be composed with other applications (openness) and that some of their parts may be changed (dynamism). All this may possibly happen on-the-fly (i.e., at runtime)! A paradigmatic example is offered by applications developed according to a service-oriented architecture, where functionalities are envisaged as

services to be invoked and applications are built by binding (i.e., composing) services together. Such kinds of applications require a suitable architectural design. More precisely, a good software architecture for open and dynamic applications should minimise coupling among components while maximising the cohesion of each module. But what do we mean by 'coupling' and 'cohesion'?
Coupling can be understood as the degree of dependency among components; an application is loosely coupled when each component does not rely on how other components are implemented. For instance, a service in service-oriented computing should only rely on the interfaces of the other services it invokes and not on how they are implemented.
Cohesion is the degree of "similarity" of the functionalities offered by each component. An application with low cohesion is an application where there are components offering functionalities which are not very related to each other. For instance, developing a cloud application offering a data warehousing service as a single component that gives access to databases, offers reporting functionalities, and offers data-analysis functionalities would have low cohesion. A highly cohesive design would instead have (at least) three components, each one grouping one of the different types of functionalities.

Exercise 2 Consider the diagrams below

Diagram A

Diagram B


where boxes represent components, circles represent their functionalities, dashed lines represent similarities among functionalities, and solid headed lines represent dependencies among components induced by their interactions. For each diagram above, say whether it represents a loosely or highly coupled application and a low- or high-cohesion application.
Answer. The system in the diagram on the left is loosely coupled and highly cohesive; the system in the diagram on the right is highly coupled and has low cohesion. (Why? Because the functionalities in its components interact more with functionalities of other components than among themselves.)

2.3 Architectural styles

The design of an application requires many decisions to be taken. Some of those decisions form what is called an architectural style. Those decisions are the ones that
– can be applied in a given development context
– add some constraints to the architectural structure
– distill some specific properties of the interactions
A style can be thought of as a "type for architectures", namely an abstract description of classes of architectures that obey the rules dictated by the style. There are three basic ingredients in a style: a vocabulary, a set of structural constraints, and a description of the interaction invariants of the architectural elements. We now define these concepts.
The vocabulary of a style identifies the types/roles of architectural elements. For instance, the vocabulary of the style of the client-server architecture of Example 1 would specify a type for each component, say client and server, and a type for each connector, say

request and response. Typically, such types identify specific roles of components (e.g., services, controllers, presenters, etc.) and of connectors (e.g., stream manipulators, events, mechanisms to control execution flows, etc.).
The structural constraints specify restrictions on the topological structure of the architectures obeying the style. For instance, a style for a client-server architecture may impose that no more than two components of type client can be connected to a connector of type request, or that a connector of type response is connected to only one component of type server.
Finally, the style specifies invariants that interactions must preserve. For instance, in the style of a client-server architecture one could impose the invariant that interactions are one-way, e.g., connectors of type request "go" from components of type client to components of type server and, vice versa, connectors of type response "go" from components of type server to components of type client.
There are many styles in the literature; we consider a few styles suitable for distributed applications. In particular,
– client-server3
– pipe-filter
– blackboard

3 Unfortunately, the term 'client-server' is overloaded and can mean either a software architecture or an architectural style.

We discuss such styles by considering the running example of a (simple) on-line multi-player game.

Example 5. The game consists of an unspecified number of players, each controlling a spaceship with which they explore an unknown galaxy in search of a planet where human beings can live. Players control the flight of their spaceship (throttle of the engine, fuel level, altitude and speed) and communicate with other players by sending them information about the planets they explore.

2.4 Client-Server style

This is probably the simplest style for modelling this kind of application (and possibly for any distributed application).
Vocabulary The types of architectural elements include
– game-server
– player
– game-bus
– player-bus

(the names of the types, although immaterial, are reminiscent of the functionalities of each element).
Exercise 3 For each type in the above vocabulary say if it is a type for a component or a connector.
Answer. Components are game-server and player, while connectors are game-bus and player-bus.
Structural constraint There is a single game-server while there is an arbitrary number of player elements. Connectors are TCP/IP connections and are shared by only one instance of each type of component.
Interaction invariants The communication among elements of type player happens through a connection controlled by elements of game-server.
Exercise 4 Using our informal graphical notation, draw an architecture that has the style described above.
Answer.

(diagram: a Player component and a Game Server component connected through a game-bus connector and a player-bus connector)

2.5 Pipe-filter style

In this style components, called filters, transform streams of inputs into streams of outputs. The output data streams are delivered to other filters by the connectors, called pipes.

Example 6. The pipe-filter style is a core feature of Linux and it is heavily used in shell programming. The components are Linux commands that input data from stdin and send output to stdout. For example, the execution of the Linux pipe ls -l | wc -l returns the number of files in a directory; in fact
– the command ls -l lists all the files in the current directory line-by-line
– the command wc -l counts the lines in a text
– and the pipe symbol | connects the stdout of ls -l to the stdin of wc -l.
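As an illustration of the filter idea in Example 6, here is a minimal sketch (not from the notes, and in Python rather than as a built-in Linux command) of a filter that behaves like wc -l: it consumes the stream on stdin and emits its result on stdout, so it can be composed with other filters through pipes.

    #!/usr/bin/env python3
    # A sketch of a filter in the pipe-filter style: read a stream from stdin,
    # transform it, and write the result to stdout, e.g.:
    #   ls -l | python3 count_lines.py
    import sys

    count = 0
    for line in sys.stdin:   # consume the input stream line by line
        count += 1
    print(count)             # produce the output stream (here, a single value)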



An important invariant of this style is that filters have to be completely independent (e.g., they cannot share information).
Vocabulary Ignoring the inter-player communications, the types of architectural elements for the application in Example 5 include
– game-server
– player
– game-bus
Exercise 5 For each type in the above vocabulary say if it is a filter or a pipe.
Answer. game-bus is a pipe, the other elements are filters.
Structural constraint There is a single game-server while there is an arbitrary number of player elements, to be connected according to the pipe-filter style. A more specific constraint could be that each player/server pair has a dedicated pair of pipes they are connected to.
Interaction invariants Players continuously send data to game-server, which sends back the new game scenario when updates are necessary.

2.6 Blackboard style

This style is very convenient when a high degree of loose coupling is necessary. The style features a special type of connector, called blackboard, which can be thought of as a central data-structure accessed by other components. In general, in this style connectors specify very abstract access policies to the data-structure.
Vocabulary The types of architectural elements may include
– ship-control
– flight-blackboard
– communication-blackboard
– scenario-updater
– display
– tuple-based-pattern-matching

Structural constraint There is a single game-server while there is an arbitrary number of player elements. For each player element there is a single ship-control, a single display, and a single scenario-updater element. They all connect through the flight-blackboard. The display elements also connect through the communication-blackboard.
Interaction invariants The communication among elements happens by storing and removing data from blackboards. Components may access blackboards at any time, and data has to be time-stamped so as to allow the most up-to-date information to be processed.
Exercise 6 Using our informal graphical notation, draw an architecture that has the style described above.
Answer. A possible architecture is:

(diagram: ship-control, scenario-updater, and display are attached to the flight-blackboard through tuple-based-pattern-matching connectors TBPM1, TBPM2, and TBPM3; display is also attached to the communication-blackboard through TBPM4)

(where TBPM abbreviates tuple-based-pattern-matching). The connectors TBPM1-TBPM3 could also be replaced by one connector. Note that the names flight-blackboard and communication-blackboard do not make those elements connectors in the sense of the blackboard style (they are just names for components; other suitable names could be flight-manager and communication-manager respectively).
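The following is a minimal sketch (not from the notes) of the kind of connector the blackboard style relies on: a central store where components post and withdraw time-stamped data, so that the most up-to-date information can be retrieved. The class and method names are illustrative assumptions.

    # A sketch of a blackboard connector with logically time-stamped entries.
    import itertools

    class Blackboard:
        def __init__(self):
            self._clock = itertools.count()        # logical time-stamps
            self.entries = []                      # list of (timestamp, key, value)

        def post(self, key, value):
            self.entries.append((next(self._clock), key, value))

        def take_latest(self, key):
            """Remove and return the most recent entry for key, if any."""
            matching = [e for e in self.entries if e[1] == key]
            if not matching:
                return None
            latest = max(matching)                 # highest time-stamp wins
            self.entries.remove(latest)
            return latest[2]

    flight = Blackboard()
    flight.post("altitude", 11000)                 # e.g., written by ship-control
    flight.post("altitude", 11250)
    print(flight.take_latest("altitude"))          # a reader obtains 11250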


Part III From Global Specifications...

3 What is a choreography?

Consider the following excerpt from [8] (bold text is mine):
"Using the Web Services Choreography specification, a contract containing a global definition of the common ordering conditions and constraints under which messages are exchanged, is produced that describes, from a global viewpoint [...] observable behaviour of all the parties involved. Each party can then use the global definition to build and test solutions that conform to it. The global specification is in turn realised by combination of the resulting local systems [...]"
The first part of the quotation above envisages a choreography as a global specification regulating the exchange of messages. The last part of the quotation yields two distinctive elements of choreographies. The first element is the fact that the global definition can be used by each party to build its component. The second element is that conformance checks can also be done locally using the global contract.

Remark 5 The first element is referred to as the projectability of the choreography, that is, the possibility to determine the behaviour of all the participants of the choreography from the global specification. The second element is referred to as the realisability of the choreography, namely the property of a choreography to be implemented by distributed components.

The methodology sketched above is quite appealing for industry as it allows parties developed independently (e.g., services) to be combined while hiding implementation details that typically companies do not want to reveal. Moreover, the developers of a local

component can check it against the global view and have the guarantee that, if the local component is compliant with the global view, the whole application will conform to the global choreography.
It is important to remark that a choreography should guarantee that the distributed interactions happen harmoniously. For instance, a choreography where some of the participants terminate while others remain waiting for some messages to arrive is not considered a good choreography.

Remark 6 Harmonious execution has to be considered in the light of distributed execution, based on the assumptions made on the communication model and on the fact that each participant executes distributively.

More precisely, we require that well-behaved choreographies have the following properties:
Graceful termination: all the participants involved in a choreography eventually terminate, or never get stuck
No orphan messages: all sent messages are eventually received
No unspecified reception: each participant should receive only expected messages

We will first consider a (semi-)formal language to represent global specifications. Such a language resembles UML's sequence diagrams (actually, it is similar to message sequence charts [6]).

4 Choreographies as interaction diagrams

Global choreographies can be represented using sequence diagrams [10] (see also http://www.uml-diagrams.org/sequence-diagrams.html). A sequence diagram is an interaction diagram describing message communications among some components. Graphically, components are represented as headed lifelines, as depicted in Fig. 1.

Fig. 1: Lifeline – a participant id drawn as a headed lifeline.
Fig. 2: Asynchronous call – a message msg sent from the lifeline of id1 to the lifeline of id2.

Remark 7 It is worth emphasising that we use sequence diagrams with a different flavour than their typical usage in object-oriented design; in fact, we will use a simplified version of sequence diagrams. More precisely, the lifelines in our sequence diagrams do not represent objects, but rather participants4, namely components that execute distributively. Hereafter, we use the terms 'component' and 'participant' interchangeably. For us, a participant could be either a human being interacting with the system or one of the software components of the system; we do not need to specify which of the two cases applies.

4 Although participants may actually be implemented as objects in an object-oriented language.

In fact, by Remark 7 (and as illustrated in Fig. 1), we may omit the class name of participants from the lifeline heads of our sequence diagrams. Our basic assumption is that interaction is possible through message exchange: this is dictated by the distributed nature of the systems that we consider. Also, each participant is assumed not to be multithreaded and communication is asynchronous, so our sequence diagrams will not have the so-called synchronous calls. Moreover, we will assume that messages are exchanged on channels (or ports) even when channels are not explicitly mentioned in messages.

Example 7. A choreography for a simplified ATM application can be depicted as follows:


(sequence diagram with lifelines C, A, and B and, from top to bottom, the messages: C → A: pin, C → A: amount, A → B: withdraw, B → A: ok, A → C: money)

Notice that this is a very basic choreography; we will come back to this aspect later. There are two possible ways of interpreting Example 7:
data-flow interpretation: the messages on the arrows of a sequence diagram describe what (and when) data are exchanged by participants. In this interpretation it is often immaterial how the values are exchanged (since the focus is on the data-flow). For instance, in the first interaction between C and A of the choreography in Example 7, pin has to be understood as C providing its personal identification number (represented by the variable pin) to A. To make explicit the communication channel, say c, over which such an exchange happens, one could have written pin on c.
channel-based interpretation: the messages on the arrows of a sequence diagram are identifiers of the communication channels used by participants when interacting. Often the values exchanged in the interaction are not mentioned, but optionally they could be explicitly represented. For instance, in the first interaction between C and A of the choreography in Example 7, pin has to be understood as C sending a value on a channel called pin and A waiting on that channel to receive such a value. To make explicit the personal identification number, one could have written pin⟨pin number⟩.

Remark 8 Since we are interested in the coordination of communicating distributed participants, we will adopt the channel-based interpretation. As we will see later, channels (implicit or not) behave as unbounded FIFO queues (that is, they preserve the order of messages).

5 An execution model (by examples)

We illustrate here the communication behaviour of components. A precise definition will be given later. Each lifeline has to be thought of as a component made of an autonomous (single) thread of execution. Lifelines show only the (observable) communication pattern of participants in terms of their send and receive operations. Between two consecutive interactions, a component may execute internal behaviour (i.e., local computation not represented in the diagram). We assume that send operations are non-blocking while receive operations are blocking. In other words, the sender continues its execution even if the receiver is not ready to consume the message, while a receiver cannot continue if (one of) the expected message(s) is not available.
It is easy to verify that in Example 7 a possible order of the messages (and therefore of the interactions) is
1. C sends her pin number to A
2. C sends A the amount of money to withdraw
3. A tells B that C intends to withdraw some money
4. B tells A that the operation is allowed
5. A gives C the money
Notice that, between steps 3 and 4, B does an internal computation, e.g. by accessing a local database.
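The execution model just described can be simulated directly. The sketch below (not from the notes) uses Python threads and models each channel of Example 7 as an unbounded FIFO queue, with non-blocking sends and blocking receives; B's internal check is simply omitted.

    # A sketch of the execution model: channels are unbounded FIFO queues,
    # sends are non-blocking, receives block until a message is available.
    import queue, threading

    channels = {name: queue.Queue() for name in ["pin", "amount", "withdraw", "ok", "money"]}

    def send(ch, msg):         # non-blocking: the sender never waits
        channels[ch].put(msg)

    def receive(ch):           # blocking: waits until a message is available
        return channels[ch].get()

    def C():
        send("pin", 1234); send("amount", 50)
        print("C got", receive("money"))

    def A():
        pin, amount = receive("pin"), receive("amount")
        send("withdraw", (pin, amount))
        if receive("ok"):
            send("money", amount)

    def B():
        pin, amount = receive("withdraw")
        send("ok", True)       # internal check (e.g., a database lookup) omitted

    for f in (C, A, B):
        threading.Thread(target=f).start()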

Exercise 7 Give all the consistent traces of execution in terms of send/receive actions on channels for the order given above.
Answer. Let P!m denote that participant P sends on channel m and P?m that participant P receives from channel m. There are two possible traces:
C!pin; A?pin; C!amount; A?amount; A!withdraw; B?withdraw; B!ok; A?ok; A!money; C?money
and
C!pin; C!amount; A?pin; A?amount; A!withdraw; B?withdraw; B!ok; A?ok; A!money; C?money
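The two traces above (and the incorrect one in Remark 9 below) can be checked mechanically. Here is a minimal sketch (not from the notes) that validates a trace against the lifelines of Example 7 and the FIFO discipline of channels; the LIFELINES encoding is an assumption made for illustration.

    # Check whether a trace of send/receive actions is consistent with the
    # participants' local order and with FIFO channels.
    LIFELINES = {                      # each participant's actions, in order
        "C": ["C!pin", "C!amount", "C?money"],
        "A": ["A?pin", "A?amount", "A!withdraw", "A?ok", "A!money"],
        "B": ["B?withdraw", "B!ok"],
    }

    def consistent(trace):
        pos = {p: 0 for p in LIFELINES}            # next action of each participant
        queues = {}                                # channel name -> pending messages
        for act in trace:
            p, ch = act[0], act[2:]
            if LIFELINES[p][pos[p]] != act:        # must respect the local order
                return False
            pos[p] += 1
            if "!" in act:
                queues.setdefault(ch, []).append(ch)
            else:
                if not queues.get(ch):             # receives block: message must be there
                    return False
                queues[ch].pop(0)
        return all(pos[p] == len(LIFELINES[p]) for p in LIFELINES)

    good = ["C!pin", "A?pin", "C!amount", "A?amount", "A!withdraw",
            "B?withdraw", "B!ok", "A?ok", "A!money", "C?money"]
    bad  = ["C!pin", "C!amount", "A?amount", "A?pin", "A!withdraw",
            "B?withdraw", "B!ok", "A?ok", "A!money", "C?money"]
    print(consistent(good), consistent(bad))       # True False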

Remark 9 You are invited to figure out why the trace
C!pin; C!amount; A?amount; A?pin; A!withdraw; B?withdraw; B!ok; A?ok; A!money; C?money
is not a correct answer to Exercise 7.

Sequence diagrams may look odd.

Exercise 8 Is the choreography below executable with asynchronous communications?

(diagram: participants id1 and id2; id1 sends a message on channel a to id2 and id2 sends a message on channel b to id1, each participant sending first and receiving afterwards)

Justify your answer. And what about this one?

(diagram: the same two messages, but now each participant waits to receive the other's message before sending its own)

Reflect on "how time flows".
Answer. The first diagram describes a possible choreography where id1 and id2 both start by sending a message to each other; since the communication is asynchronous, neither participant has to wait for the message it sent, so they can then receive the message sent by the other participant. The second diagram does not have an execution because both id1 and id2 start off by waiting for a message sent by the other, so an implementation of id1 and id2 would deadlock when starting from an initial configuration (where all channels are empty).

The execution model introduced above yields a semantics to sequence diagrams in terms of execution traces, once the communication actions of each participant have been identified (following its lifeline as done in § 5).

Example 8. For the participants of the choreography in Exercise 8 we have: id1 = a!; b? and id2 = b!; a?

Remark 10 Hereafter, given a participant A, we denote with A_i its program after its i-th action has been executed. For instance, for the participant id1 in Example 8, id1_0 is id1 itself, id1_1 = b?, and id1_2 represents that id1 has terminated.

Then, from the initial configuration (that is, the configuration where each participant has not executed any action yet and all the communication channels are empty), for all the enabled actions5, we add an arc labelled with the fired action to a new node representing a configuration where the participant who fired the enabled action has moved to its next communication and the channels have been updated; once all the enabled actions of the initial configuration have been considered, one continues the same process with another node, taking care of not adding a node for a configuration that is already present in the graph. This process terminates when there are no more enabled actions.

Example 9. Building the execution graph of the choreography of Exercise 8, where the initial configuration is ⟨id1_0, id2_0, a:ε, b:ε⟩ (the last two components are the contents of channels a and b), we get:

⟨id1_0, id2_0, a:ε, b:ε⟩ -a!-> ⟨id1_1, id2_0, a:a, b:ε⟩
⟨id1_0, id2_0, a:ε, b:ε⟩ -b!-> ⟨id1_0, id2_1, a:ε, b:b⟩
⟨id1_1, id2_0, a:a, b:ε⟩ -b!-> ⟨id1_1, id2_1, a:a, b:b⟩
⟨id1_0, id2_1, a:ε, b:b⟩ -a!-> ⟨id1_1, id2_1, a:a, b:b⟩

Note that this is not the whole graph!



Remark 11 In principle, it is not necessary to explicitly specify the nodes of the execution graph (only the labels of the arrows matter to find the traces), but it is good practice to spell them out because they help to figure out all the possible executions.

5 The action of a participant is enabled when it is an output (outputs are non-blocking) or when it is an input on a channel whose first element is the one mentioned in the input action.



Exercise 9 Complete the execution graph of Example 9.
Answer. The missing part is:

⟨id1_1, id2_1, a:a, b:b⟩ -b?-> ⟨id1_2, id2_1, a:a, b:ε⟩
⟨id1_1, id2_1, a:a, b:b⟩ -a?-> ⟨id1_1, id2_2, a:ε, b:b⟩
⟨id1_2, id2_1, a:a, b:ε⟩ -a?-> ⟨id1_2, id2_2, a:ε, b:ε⟩
⟨id1_1, id2_2, a:ε, b:b⟩ -b?-> ⟨id1_2, id2_2, a:ε, b:ε⟩
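The construction of the execution graph described before Example 9 can be automated. The following is a minimal sketch (not from the notes) that enumerates the reachable configurations of the choreography of Exercise 8 and prints the labels of its arcs; the encoding of participants, messages, and configurations is an assumption made for illustration.

    # Build the execution graph for id1 = a!; b? and id2 = b!; a?.
    from collections import deque

    PROGRAMS = {"id1": ("a!", "b?"), "id2": ("b!", "a?")}

    def enabled(steps, queues, p):
        """Return the action participant p can fire, if any."""
        i = steps[p]
        if i >= len(PROGRAMS[p]):
            return None
        act = PROGRAMS[p][i]
        ch, kind = act[0], act[1]
        if kind == "!":                      # outputs are always enabled
            return act
        # inputs are enabled when the expected message is at the head of the queue
        return act if queues[ch] and queues[ch][0] == ch else None

    def fire(steps, queues, p, act):
        ch, kind = act[0], act[1]
        queues = {c: list(q) for c, q in queues.items()}
        if kind == "!":
            queues[ch].append(ch)            # messages are named after their channel
        else:
            queues[ch].pop(0)
        return dict(steps, **{p: steps[p] + 1}), queues

    def execution_graph():
        init = ({"id1": 0, "id2": 0}, {"a": [], "b": []})
        key = lambda s, q: (tuple(sorted(s.items())), tuple((c, tuple(q[c])) for c in sorted(q)))
        seen, todo, arcs = {key(*init)}, deque([init]), []
        while todo:
            steps, queues = todo.popleft()
            for p in PROGRAMS:
                act = enabled(steps, queues, p)
                if act:
                    nxt = fire(steps, queues, p, act)
                    arcs.append((key(steps, queues), act, key(*nxt)))
                    if key(*nxt) not in seen:
                        seen.add(key(*nxt))
                        todo.append(nxt)
        return arcs

    for src, label, dst in execution_graph():
        print(label)   # the labels of the eight arcs in Example 9 and Exercise 9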

6 More expressiveness

As said, the choreography in Example 7 is very basic. Therefore we need to add constructs to express more interesting (and precise) choreographies. For instance, one would like to be able to express what happens when C enters the wrong pin or tries to withdraw more money than is on her account.

6.1 Branching

Standard sequence diagrams use the so-called alternate construct to express conditional behaviour. For our purposes it is better to use a more general construct, called branching, that allows us to express more than two alternative behaviours. We adopt the following notation for the branching construct:

    branch
        l1 on c
        ...
        ln on c

(a branch block offering the alternatives labelled l1, ..., ln, each selected by sending the corresponding label on channel c)

In the branching construct above, labels l1, ..., ln are sent by a participant – the one whose lifeline intersects the circle – on channel c to a partner – the one pointed by the empty-headed arrow – to communicate how the choreography should proceed.

Remark 12 The 'on c' part specifies on which channel the selection/branching is made; we can just write the labels when the channel is understood.

We will now see that branching brings in expressiveness, but also problems. In fact, branching-blocks will be subject to some restrictions that we will spell out later. Let us first refine the choreography in Example 7 and show how branching caters for more expressiveness.

Example 10. A refined choreography for the simplified ATM application can be depicted as follows:

A

B

pin branch

right on c amount

withdraw ok wrong on c

Note that the interactions between A and B happen only if C enters a right pin number. Also, in the case that C enters a wrong pin, the choreography stops after A sends the label 'wrong' to C.

Remark 13 An important observation emerging from Example 10 is that the check on the pin number performed by A is a "local"

computation not represented in the diagram; we will come back to this.

6.2 A note on choice and non-determinism

Two important concepts of choreographies (and more generally of parallel and distributed computing) are the notions of choice and non-determinism. Unlike in sequential computations, in concurrent and distributed computations the control flow is not unique; in fact, by definition there may be many parts of a system that execute concurrently. This makes non-determinism unavoidable (and often desirable) in distributed applications; also, the interplay between distributed choice and non-determinism may lead to problems in distributed choreographies.

Distributed choices. When the execution of a participant, say A, does not require A to interact with other participants, we say that A is executing an internal computation. At design level, very often such internal computations are abstracted away so as to keep the design simple. In a choreography this becomes evident when considering branches; as observed earlier (cf. Remark 13), the sequence diagram of a choreography simply describes the communication pattern of participants and, for instance, does not detail how the selector of a branch decides which label to send. In particular, it is crucial to establish when the choice of a participant is external or internal. For the moment, internal and external choices can only be described informally (later, when the automata model is introduced, we will give a precise definition): a participant of a choreography, say A, makes
an internal choice when its execution can proceed along two or more different directions (that is, A can execute different patterns

of interactions) and the decision about what direction to take is made without interacting with other participants;
an external choice when A has several possible alternatives to continue its execution, but cannot progress autonomously and has to wait for interactions with other participants (in our framework, this means that A has to wait for inputs from other participants).

Example 11. In the branch of the choreography of Example 10, A makes an internal choice when she decides to send either the label right or the label wrong; instead, C makes an external choice in that branch as he cannot progress until either of the labels is received. In other words, how C continues his execution depends on what interactions A performs (hence the choice of C is external).

Non-determinism. In general, a non-deterministic computation is a computation where at least one step of execution is not a function of the state in the previous step and/or of the input. Non-determinism is a theoretical abstraction to represent computations that are not entirely under the control of a program. The closest approximation of non-deterministic sequential computation can be obtained in ordinary programming via the generation of (pseudo-)random values. For instance, one could write an 'if' statement whose guard is a randomly generated boolean value so as to simulate the non-deterministic choice between the 'then' and 'else' branches. In distributed computations non-determinism is not only an important abstraction mechanism, but also an intrinsic part of the behaviour. In fact, it is practically unfeasible to make a distributed system completely deterministic. For example, it is not possible to predict the order in which requests from independent clients arrive at a server in an asynchronous setting (unless one wants to impose severe limitations that would highly degrade performance and efficiency).
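As a tiny illustration of the simulation of non-determinism just mentioned, the sketch below (not from the notes) uses a pseudo-randomly generated boolean as the guard of an 'if', mimicking an internal choice between two branches (here named after the labels of Example 10).

    # A random boolean guard approximating a non-deterministic internal choice.
    import random

    def internal_choice():
        if random.choice([True, False]):   # the "scheduler" picks arbitrarily
            return "then-branch: send label right"
        else:
            return "else-branch: send label wrong"

    print(internal_choice())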

An intuitive way to think of non-determinism is by admitting that at some points of the computation several alternative computations are possible and the design (or the program) does not establish which alternative has to be taken; the decision is taken by a (hypothetical) scheduler that establishes which alternative to execute next "arbitrarily", that is, regardless of the state of the computation or of the design/program.

Remark 14 It is worth remarking that non-determinism and distributed choices are orthogonal concepts. For instance, internal choices can be deterministic or non-deterministic, and similarly for external choices. As an example, the ATM in Example 10 would very likely make a deterministic choice (one could imagine that which branch to take is established by a simple if-statement that checks the validity of the pin number). Instead, the behaviour of B in the model answer of Exercise 10 could be a non-deterministic internal choice where B generates a random number representing the time to wait for a label from A; if the label does not arrive within such time, B sends a message on b2.
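The timeout-based behaviour of B described in Remark 14 can be sketched as follows (not from the notes, assuming Python queues for channels and a hypothetical channel b2): B waits a randomly chosen amount of time for a label and, if none arrives, autonomously sends on b2.

    # A sketch of a non-deterministic internal choice driven by a random timeout.
    import queue, random

    from_A = queue.Queue()     # channel carrying A's label to B
    to_b2  = queue.Queue()     # hypothetical channel b2

    def B():
        wait = random.uniform(0.1, 1.0)          # randomly chosen waiting time
        try:
            label = from_A.get(timeout=wait)     # a label arrives in time: external choice
            print("B follows branch", label)
        except queue.Empty:
            to_b2.put("msg")                     # give up and send on b2
            print("B sent on b2")

    B()   # with an empty channel, B will time out and send on b2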

6.3 Problems with branching

As anticipated, more expressiveness yields problems; to present the problems that branching-blocks bring in, we consider a few examples.

Unique selector Consider the following choreography:

After A starts by triggering C and B on channels a1 and a2 respectively, participants A, B, and C engage in a branch so that
– if A sends label l1 to C then B sends something on b to C;
– if instead B sends label l2 to C then C sends something on c to A.
In a distributed asynchronous execution, this may lead to a deadlock. In fact, in the branch A may (locally) decide6 to either communicate label l1 to C and then terminate, or wait for C to send something on channel c. Similarly, after receiving on a2, B (locally) decides whether to send a label to C and terminate, or wait on channel b. If both A and B choose to send the label to C, they will both terminate; hence, whatever branch C takes, the communications on b or c will not be completed, as the message sent by C will not be received.7

Exercise 10 Give a choreography such that there is a deadlock due to the fact that in a branch labels are sent to different participants by the same participant.
Answer. In the choreography below

7

What do we mean by “local decision”? Or more generally by “local computation”? In a distributed system/application, a computation is local to one of the participants, say P, if the computation does not require any interaction with other participants and it is performed (and modifies) only the local state of P. This is an example of message orphanage that we will define later.

40

B may non-deterministically decide to send on b2 while A and C execute the first branch; note that C is informed of what to do because if it receives on channel c1 first, then it behaves as per the first branch, otherwise if label l2 is received first then C behaves as per the second branch.

Exercise 11 Deadlock configurations can be found by building the execution graph from the sequence diagram. Consider the choreography below:

Intuitively, a deadlock occurs when both A and B decide to send their labels; in fact, in such a case A will be stuck on the input from channel b, which will never arrive, since B would have terminated once it has sent label l2. The execution graph of the above choreography is:

(execution graph: from the initial configuration ⟨A_0, B_0, ε, ε⟩ the outputs a!l1 and b!l2 may fire in either order; the runs in which both labels are sent reach a configuration where A waits for an input on channel b while B has terminated)

The last configuration described above is a deadlock state, because A is stuck on the input on b, which is empty, and the only partner which could communicate on b has terminated.

Determinacy The participants involved in a branching execution have to be able to unambiguously determine what they have to execute after being notified of a label. Take the following choreography

When A is notified of the label l1, she can decide either to send on channel a or on channel b. However, depending on which of the branches

B executes, B may be waiting on the other channel. Namely, it may happen that B chooses the first branch (and so waits on a) while A decides to send on channel b after the notification of the label. In such a case, a deadlock will occur.

Un-notified participants If a participant, say C, is not notified about which branch has been chosen, then C has to have the same behaviour in every branch. Let us reconsider the choreography in Example 10; there B was not involved in the choice and still was required to be ready to interact in case a customer C enters a right pin. This type of choreography is not considered "good" because one of the participants (B in our example) may not terminate, so causing deadlocks.

Exercise 12 Modify the choreography in Example 10 so that B is involved in the choice.
Answer. It is just necessary to extend the scope of the branch to include B and to add a message from A to B in the 2nd branch, so that B has an external choice that allows it to decide how to continue in the different branches.

(sequence diagram: C sends pin to A; then a branch block, selected by A on channel c:
 – right on c: C sends amount to A, A sends withdraw to B, and B replies ok to A;
 – wrong on c: A sends abort to B.)

Exercise 13 Which of the conditions of well-behaviour do the second choreography of Exercise 8 and the choreography below violate?

(sequence diagram with participants C, B, and A: a branch block whose labels l1 and l2 are sent on channel c, with further messages on channels a and b in its branches)

Answer. The choreography of Exercise 8 violates graceful termination as it immediately deadlocks. The choreography above violates the no-orphan-messages condition (but not graceful termination).


7 Global Graphs

Although appealing for their simplicity, sequence diagrams do not fully realise the duality between global and local views of choreographies discussed in § 3. In fact, it is not easy to define the projection of a sequence diagram because they do not come with a precise semantics. For such reasons, more formal models of choreographies have been developed. Our formal model of global specifications will be global graphs, a graphical formalism to describe and highlight
– causal dependencies,
– fork and branch points,
– and merge and join points
of distributed interactions... "from a global viewpoint" (cf. the W3C quotation on page 27).

7.1 Basic definitions

We now start to be more precise and introduce a few sets, notations, and definitions used throughout the rest of the course. We fix a finite set P (ranged over by p, q, r, s, etc.) whose elements will represent participants of choreographies. Also, we use a finite alphabet A of symbols to represent the messages exchanged by participants.

Definition 1 (Graphs). Given a set L (of labels), a (labelled) graph on L is a triple ⟨V, A, Λ⟩ where
– V is a (finite) set of vertexes (or nodes),
– A ⊆ V × V is a set of edges (or arcs),
– and Λ : V → L is a labelling function.

A global graph is just a labelled graph with special labels and conditions. More precisely, our fixed set of labels will be

    L := { ◦, ⊙, ♦, ‖ } ∪ { s → r : a | s, r ∈ P ∧ a ∈ A }

where ◦ will single out the source node, ⊙ will label final nodes, ♦ will label branch/merge nodes or entry points of loops, ‖ will label fork/join nodes, and labels of the form s → r : a will be attached to nodes to express interactions, that is, the fact that s sends message a to r.

Definition 2 (Global graph). A global graph (over P and A) is a labelled graph ⟨V, A, Λ⟩ on L such that the following conditions hold:
1. Λ⁻¹(◦) is a singleton
2. for each v ∈ V,
   (a) if Λ(v) is of the form s → r : a then v has unique incoming and unique outgoing edges, and
   (b) if Λ(v) ∈ { ♦, ‖ }, v has at least one incoming and one outgoing edge
   (c) if Λ(v) = ◦ then v has a single outgoing edge and if Λ(v) = ⊙ then v has no outgoing edges.

Nodes that are labelled with ♦ or ‖ are called gate nodes (or simply gates) while those labelled with interactions are called interaction nodes. Condition 1 in Definition 2 ensures that a global graph has a single starting node while conditions 2a, 2b, and 2c constrain the possible connections among nodes depending on their labels. More precisely, nodes labelled with interactions must be reached by a single edge and must reach a single node (condition 2a). By condition 2b, gates

have to be reached from and must reach some other nodes. Finally, nodes labelled with ◦ (resp. ⊙) cannot reach more than one (resp. any) other node. Before discussing how to interpret the behaviour represented by a global graph, we give an example to show a convenient diagrammatic representation.

Example 12. Consider the diagram below:

(global graph diagram; its nodes and edges are spelled out in the answer to Exercise 14 below. The interaction nodes are A → C : cwin, A → B : bwin, C → B : blose, B → C : close, B → A : sig, C → D : busy, C → A : msg, and A → D : free.)

This diagram corresponds to the global graph of the choreography of a simple game that will be discussed later.

Exercise 14 Give the definition of the global graph corresponding to the diagram of Example 12.
Answer. The set of nodes has to contain 16 elements (as many as the nodes in the diagram); say that the set of integers from 1 to 16

is the set of nodes, then the labelling function could be Λ_Ex.12 defined by

    Λ_Ex.12 : 1 ↦ ◦
    Λ_Ex.12 : 2, 4, 10 ↦ ♦
    Λ_Ex.12 : 3, 11, 13, 15 ↦ ‖
    Λ_Ex.12 : 5 ↦ (A → C : cwin)
    Λ_Ex.12 : 6 ↦ (C → B : blose)
    Λ_Ex.12 : 7 ↦ (A → B : bwin)
    Λ_Ex.12 : 8 ↦ (B → C : close)
    Λ_Ex.12 : 9 ↦ (C → D : busy)
    Λ_Ex.12 : 12 ↦ (B → A : sig)
    Λ_Ex.12 : 14 ↦ (C → A : msg)
    Λ_Ex.12 : 16 ↦ (A → D : free)

and the set of arcs is

    A_Ex.12 = { (1,2), (2,3), (3,4), (3,9), (4,5), (5,6), (4,7), (7,8), (6,10), (8,10), (9,13), (10,11), (11,12), (11,13), (13,14), (14,15), (12,15), (15,16), (16,2) }
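For concreteness, the global graph just defined can be encoded as data and the conditions of Definition 2 can be checked mechanically. The sketch below (not from the notes) writes the gate labels as the strings "branch" and "fork" instead of the ♦ and ‖ symbols; everything else mirrors the answer to Exercise 14.

    # Encode the global graph of Example 12 and check Definition 2.
    V = set(range(1, 17))
    LABELS = {1: "source"}
    LABELS.update({n: "branch" for n in (2, 4, 10)})
    LABELS.update({n: "fork" for n in (3, 11, 13, 15)})
    LABELS.update({5: "A->C:cwin", 6: "C->B:blose", 7: "A->B:bwin", 8: "B->C:close",
                   9: "C->D:busy", 12: "B->A:sig", 14: "C->A:msg", 16: "A->D:free"})
    ARCS = {(1,2),(2,3),(3,4),(3,9),(4,5),(5,6),(4,7),(7,8),(6,10),(8,10),(9,13),
            (10,11),(11,12),(11,13),(13,14),(14,15),(12,15),(15,16),(16,2)}

    def indeg(v):  return sum(1 for (x, y) in ARCS if y == v)
    def outdeg(v): return sum(1 for (x, y) in ARCS if x == v)

    def well_formed():
        if sum(1 for v in V if LABELS[v] == "source") != 1:                 # condition 1
            return False
        for v in V:
            l = LABELS[v]
            if ":" in l and not (indeg(v) == outdeg(v) == 1):               # condition 2a
                return False
            if l in ("branch", "fork") and (indeg(v) == 0 or outdeg(v) == 0):  # condition 2b
                return False
            if l == "source" and outdeg(v) != 1:                            # condition 2c
                return False
        return True

    print(well_formed())   # True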

7.2 Execution flow of global graphs

We now give an intuitive account of the semantics of global graphs. Roughly speaking, a global graph represents the workflow of a choreography, which is determined by the gate nodes and the interaction nodes. The concatenation of interaction nodes models sequential composition of interactions, while gates allow one to form decision branches, loops, or parallel compositions. More precisely:

Sequential composition When interaction nodes are composed one after the other, as in

    A → B : a  followed by  B → A : b

the flow of execution imposes that the second interaction B → A : b is executed after the first one (namely A → B : a).

Branch A pattern in which the flow enters a ♦ gate with two or more outgoing edges specifies a flow that splits across alternative branches.

Example 13. In the global graph

    A → B : a ; then a ♦ gate with two alternatives: A → B : b and A → C : c

either the interaction A → B : b or the interaction A → C : c is executed after the interaction A → B : a.

Fork A pattern in which the flow enters a ‖ gate with two or more outgoing edges specifies a flow that splits into different parallel threads of execution.

Example 14. In the global graph

    A → B : a ; then a ‖ gate with two parallel threads: B → A : b and C → D : c

both interactions B → A : b and C → D : c are executed after the interaction A → B : a.

Exercise 15 Consider the following global graph:

    A → B : a ; then a ‖ gate with two parallel threads: B → A : b and C → A : c

Explain under which conditions the choreography above makes sense.
Answer. Under our hypothesis such a choreography is not possible, as participants are single-threaded. If a participant could be a concurrent program then the above choreography would be possible.

Merge/loops A pattern in which two or more edges enter a ♦ gate specifies a flow where alternative branches are merged or the flow loops back.

Join Similarly to the previous case, a pattern in which two or more edges enter a ‖ gate specifies that the flows of parallel threads join before the execution can continue.

7.3 A more complex scenario

We now describe the workflow underlying a global graph by discussing the following global graph, which is the one in Example 12 decorated with blocks to identify the different stages of the flow of execution.

(the global graph of Example 12, decorated with blocks: a topmost block around the entry point of the loop, a middle block around the branch of the leftmost thread, an inner block around the sequence of interactions of the first alternative, and a lowermost block around the final join before A → D : free)

The blocks identify regions of the global graph explained below.

Initially the flow moves from the source node to the entry point of a loop (highlighted by the topmost block in the previous diagram). In the loop, the flow splits into two parallel threads of execution: the rightmost thread (let us call it T) starts with participants C and D exchanging the message busy, while the leftmost thread (the middle block in the diagram) is a branch block. In fact, in the leftmost thread participant A either starts by interacting with C or with B (in the former case, the inner block in the previous diagram highlights a sequence of interactions). Once the branch is merged, the flow splits again in parallel to represent that the interaction B → A : sig is

executed in parallel with the thread T that forked at the start of the loop. Once the threads have joined (lowermost block in the previous diagram), A and D exchange free and loop back for another iteration.

Exercise 16 According to the global graph of Example 12, is the thread T forking? Justify your answer.
Answer. The thread is not forking, as the only gate on it is the join gate in the lowermost block. Why is such a gate necessary?

Exercise 17 Give a global graph for the choreography described by the sequence diagram in Example 10 (on page 36). Answer. A possible global graph is:

◦ → C → A : pin → ♦ (branch):
   – first branch: A → C : right ; C → A : amount ; A → B : withdraw ; B → A : ok
   – second branch: A → C : wrong

However, a global graph where the scope of the branch is more clearly identified is better. Modify the above global graph so that it is clear where the branch ends.

Exercise 18 Modify the global graph of Exercise 17 so that when the entered pin is not correct, the customer re-sends the pin message.

Answer.

◦ → ♦ (entry point of a loop) → C → A : pin → ♦ (branch):
   – first branch: A → C : right ; C → A : amount ; A → B : withdraw ; B → A : ok
   – second branch: A → C : wrong, with an edge looping back to the entry point before C → A : pin

This solution has the same problem as the solution of Exercise 17. Modify the above global graph so that it is clear where the branch ends.



Part IV ...To Local Specifications

8 Projections

As discussed in § 1, an appealing aspect of choreographies is the possibility of obtaining local viewpoints from global ones. This is often achieved by projecting the global viewpoint. We introduce the idea of projection by an intuition based on sequence diagrams. Take the following diagram, borrowed from http://msdn.microsoft.com/en-us/library/dd465153.aspx

(the actual choreography described in the diagram above is immaterial for the purpose of this section). The diagram above can be "projected onto the three participants" WebSite, Warehouse, and Bank so as to obtain a specification of each participant. We consider projection as the operation of focusing on a single role of the choreography at a time. This is achieved by "blurring" every other participant. For instance, the projection of the diagram with respect to WebSite could be thought of as being represented by the following figure

while the one of the Warehouse by the following one

The above idea is rather informal and the projections do not correspond to any meaningful model of specification. Instead, a projection operation can be precisely defined on global graphs to obtain communicating finite state machines, which we now define.

8.1 Communicating Finite State Machines

The model of communicating finite state machines (CFSMs) was proposed in [2] as a convenient model to analyse communication protocols. Informally, this model uses finite state automata (aka machines) to represent the behaviours of distributed participants that interact by exchanging messages. When a machine sends a message to another machine, the message is “stored” into a queue that the receiver can access. Dually, a machine willing to input a message visits its queues and consumes the message (if any).

In order to adopt CFSMs to represent local viewpoints of choreographies we fix some sets, notations, and terminology. Let P be a (finite) set of participants (ranged over by p, q, r, s, etc.) and A be a finite alphabet (that is, a set of symbols representing the content of messages). The set of channels is

    C := { pq | p, q ∈ P and p ≠ q }        (1)

and the set of actions (ranged over by ℓ) is

    Act := {τ} ∪ (C × {!} × A) ∪ (C × {?} × A)        (2)
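As an immediate sanity check of (1) and (2), the following sketch enumerates the channels and actions generated by a small, made-up choice of P and A; the participant and message names are purely illustrative.

```python
from itertools import product

P = {"A", "B", "C"}        # made-up participants
A = {"msg", "sig"}         # made-up message alphabet

# Channels, cf. (1): ordered pairs pq with p != q.
channels = {p + q for p, q in product(P, P) if p != q}

# Actions, cf. (2): tau plus sending (pq!a) and receiving (pq?a) actions.
actions = ({"tau"}
           | {f"{c}!{a}" for c, a in product(channels, A)}
           | {f"{c}?{a}" for c, a in product(channels, A)})

print(len(channels), sorted(channels))
print(len(actions))        # 1 + 2 * |channels| * |A|
```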

Intuitively, an element pq ∈ C identifies a channel from participant p to q, and an element ℓ ∈ Act denotes either an internal computation (τ) or a communication action; more precisely, ℓ = (pq, !, a) denotes an output on channel pq of the symbol a ∈ A and ℓ = (pq, ?, a) denotes an input from channel pq of the symbol a ∈ A. Hereafter, we adopt the (more evocative) notation pq!a instead of (pq, !, a) and likewise we use pq?a instead of (pq, ?, a). Elements of C × {!} × A are called sending actions and those of C × {?} × A are called receiving actions.

We let A* (ranged over by w) and Act* (ranged over by ϕ) denote the sets of finite words on A and on Act respectively, with ε a distinguished symbol (not in A or in Act) representing the empty word. Write |ϕ| for the length of a word ϕ, and ϕ · ϕ′ or ϕϕ′ for the concatenation of words ϕ and ϕ′ (we overload these notations for words over A).

Communicating finite state machines are finite state automata whose transitions are labelled by actions.

Definition 3 (CFSM). A communicating finite state machine is a finite transition system given by a 4-tuple M = (Q, q0, A, →) where

– Q is a finite set of states,
– q0 ∈ Q is the initial state, and
– → ⊆ Q × Act × Q is a set of transitions; we write q −ℓ→ q′ for (q, ℓ, q′) ∈ → (and q → q′ when ℓ is immaterial).
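Definition 3 translates directly into a small data structure; the sketch below fixes one possible Python representation (a set of labelled transitions plus an initial state) and a made-up two-state machine, not any machine appearing in these notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFSM:
    states: frozenset                      # Q
    initial: str                           # q0, a member of Q
    transitions: frozenset = frozenset()   # subset of Q x Act x Q

    def outgoing(self, q):
        """All transitions leaving state q."""
        return {(s, l, t) for (s, l, t) in self.transitions if s == q}

# A made-up machine: send a on channel pq, then receive b from channel qp.
M = CFSM(states=frozenset({"0", "1", "2"}),
         initial="0",
         transitions=frozenset({("0", "pq!a", "1"), ("1", "qp?b", "2")}))

print(M.outgoing("0"))     # {('0', 'pq!a', '1')}
```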

Example 15. The following machine

[Figure: a CFSM with states A0, A1, A2, A3, A4 and transitions labelled AC!cwin, AB!bwin, BA?sig, CA?msg, and AD!free]

specifies the behaviour of the first participant of the choreography in Example 12 (cf. page 47).

Exercise 19 Give the machine of participant B of the choreography in Example 12.

In the following we will use the notion of language of a CFSM as defined below.

Definition 4. The language of a CFSM M is the set L(M) ⊆ Act* such that ϕ ∈ L(M) iff ϕ is accepted by the automaton corresponding to the machine M where each state of M is an accepting state.
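Because every state is accepting, L(M) is prefix-closed, so its words can be enumerated up to any bound by walking the transition relation; a minimal sketch on a made-up machine, using the same transition-triple encoding as above:

```python
from collections import deque

initial = "0"
transitions = {("0", "pq!a", "1"), ("1", "qp?b", "2"), ("1", "qp?c", "0")}

def words_up_to(initial, transitions, n):
    """All words of L(M) of length at most n (every state accepts)."""
    words, frontier = {()}, deque([(initial, ())])
    while frontier:
        q, w = frontier.popleft()
        if len(w) == n:
            continue
        for (s, label, t) in transitions:
            if s == q:
                extended = w + (label,)
                words.add(extended)
                frontier.append((t, extended))
    return words

for w in sorted(words_up_to(initial, transitions, 2)):
    print(w)
```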

Given a CFSM M = (Q, q0, A, →), it is also convenient to establish some terminology: a state q ∈ Q is

– final if it has no outgoing transition,
– a sending state if all its outgoing transitions are labelled with sending actions,
– a receiving state if all its outgoing transitions are labelled with receiving actions,
– a mixed state otherwise.

Exercise 20 List all the final, sending, receiving, and mixed states of the machine in Example 15.

Hereafter we restrict ourselves to the following class of CFSMs.

Definition 5 (Deterministic CFSM). A machine M = (Q, q0, A, →) is deterministic iff for all states q ∈ Q and all actions ℓ ∈ Act

    if q −ℓ→ q′ and q −ℓ→ q″ then q′ = q″.

A CFSM M is minimal if there is no machine M′ with fewer states than M such that L(M) = L(M′).

Remark 15 Sometimes, a CFSM is considered deterministic when q −sr!a→ q′ and q −sr!a′→ q″ imply a = a′ and q′ = q″. Here, we follow a different definition in order to represent branching constructs.

Definition 6 (Communicating systems). Given a CFSM Mp = (Qp, q0p, A, →p) for each p ∈ P, the tuple S = (Mp)p∈P is a communicating system (CS). A configuration of S is a pair s = ⟨q ; w⟩ where q = (qp)p∈P with qp ∈ Qp and where w = (wpq)pq∈C with wpq ∈ A*; component q is the control state and qp ∈ Qp is the local state of machine Mp. The initial configuration of S is s0 = ⟨q0 ; ε⟩ with q0 = (q0p)p∈P.
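A configuration ⟨q ; w⟩ of Definition 6 can be represented concretely as a pair of dictionaries, one giving the local state of every machine and one giving the word currently buffered on every channel; the names below are made up, and this representation is reused by the sketches that follow.

```python
# A configuration <q ; w> as a pair of dictionaries:
#   control: participant -> local state   (the control state q)
#   buffers: channel     -> tuple of messages in transit, oldest first (w)

control = {"A": "A0", "B": "B0"}
buffers = {"AB": (), "BA": ()}        # every buffer empty, i.e. w = epsilon

s0 = (control, buffers)               # the initial configuration
print(s0)
```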

Hereafter, we fix a machine Mp = (Qp, q0p, A, →p) for each participant p ∈ P and let S = (Mp)p∈P be the corresponding system. The notation for vectors x in the lectures is ⃗x.

Definition 7 (Reachable states and configurations). A configuration s′ = ⟨q′ ; w′⟩ is reachable from another configuration s = ⟨q ; w⟩ iff one of the following conditions holds:

1. qs −sr!a→s q′s and
   (a) q′p = qp for all p ≠ s, and
   (b) w′sr = wsr · a and w′pq = wpq for all pq ≠ sr; or
2. qr −sr?a→r q′r and
   (a) q′p = qp for all p ≠ r, and
   (b) wsr = a · w′sr and w′pq = wpq for all pq ≠ sr; or
3. qr −τ→r q′r and
   (a) q′p = qp for all p ≠ r, and
   (b) w′ = w.

Write s ⇒ℓ s′ when s reaches s′ with an action ℓ. Condition (1b) in Definition 7 puts a on channel sr, while (2b) gets a from channel sr. Finally, condition (3b) establishes that internal computations do not modify any communication buffer.
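The three clauses of Definition 7 map almost literally onto code: a send appends the message at the end of the buffer of channel sr, a receive consumes it from the head, and a τ step leaves every buffer untouched. The sketch below assumes one-letter participant names (so the first two characters of an action name the channel), actions written as strings such as 'AB!m', 'AB?m' or 'tau', and configurations represented as in the previous sketch.

```python
def step(machines, config, p, action):
    """Attempt one transition of participant p, following Definition 7.
    Returns the new configuration, or None if the transition is not enabled.
    machines: participant -> set of (state, action, state) triples."""
    control, buffers = config
    for (src, label, tgt) in machines[p]:
        if src != control[p] or label != action:
            continue
        new_control = dict(control, **{p: tgt})
        new_buffers = dict(buffers)
        if label == "tau":                              # clause 3
            return (new_control, new_buffers)
        chan, kind, msg = label[:2], label[2], label[3:]
        if kind == "!":                                 # clause 1: append to sr
            new_buffers[chan] = new_buffers.get(chan, ()) + (msg,)
            return (new_control, new_buffers)
        if new_buffers.get(chan, ()) and new_buffers[chan][0] == msg:
            new_buffers[chan] = new_buffers[chan][1:]   # clause 2: consume head
            return (new_control, new_buffers)
    return None

machines = {"A": {("A0", "AB!m", "A1")}, "B": {("B0", "AB?m", "B1")}}
s0 = ({"A": "A0", "B": "B0"}, {"AB": (), "BA": ()})
s1 = step(machines, s0, "A", "AB!m")     # puts m on channel AB
print(step(machines, s1, "B", "AB?m"))   # consumes it again
```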


Definition 8 (Reachability). Let s1 ⇒ℓ1···ℓm sm+1 hold iff, for some configurations s2, . . . , sm, we have that s1 ⇒ℓ1 s2 · · · sm ⇒ℓm sm+1. The set of reachable configurations of S is

    RS(S) = { s | there are ℓ1, . . . , ℓm s.t. s0 ⇒ℓ1···ℓm s }.

A sequence of transitions is k-bounded if no channel of any intermediate configuration on the sequence contains more than k messages. The k-reachability set of S is the largest subset RSk(S) of RS(S) within which each configuration s can be reached by a k-bounded execution from s0.
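RSk(S) can be computed by a breadth-first exploration of configurations in which any successor whose buffers exceed k messages is pruned; the sketch below reuses the representation and string encoding of the previous sketches (one-letter participants, actions 'pq!a' / 'pq?a' / 'tau').

```python
from collections import deque

def reachable_k(machines, s0, k):
    """k-bounded reachability set, explored breadth-first."""
    def freeze(config):
        control, buffers = config
        return (tuple(sorted(control.items())), tuple(sorted(buffers.items())))

    def successors(config):
        control, buffers = config
        for p, transitions in machines.items():
            for (src, label, tgt) in transitions:
                if src != control[p]:
                    continue
                nc, nb = dict(control, **{p: tgt}), dict(buffers)
                if label == "tau":
                    yield (nc, nb)
                    continue
                chan, kind, msg = label[:2], label[2], label[3:]
                if kind == "!":
                    nb[chan] = nb.get(chan, ()) + (msg,)
                    yield (nc, nb)
                elif nb.get(chan, ()) and nb[chan][0] == msg:
                    nb[chan] = nb[chan][1:]
                    yield (nc, nb)

    seen, queue, result = {freeze(s0)}, deque([s0]), [s0]
    while queue:
        for succ in successors(queue.popleft()):
            if any(len(w) > k for w in succ[1].values()):
                continue                      # would exceed the bound: prune
            if freeze(succ) not in seen:
                seen.add(freeze(succ))
                queue.append(succ)
                result.append(succ)
    return result

machines = {"A": {("A0", "AB!m", "A0")}, "B": {("B0", "AB?m", "B0")}}
print(len(reachable_k(machines, ({"A": "A0", "B": "B0"}, {"AB": ()}), 2)))  # 3
```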

Note that, for every integer k, the set RSk(S) is finite and computable. We now recall several definitions about communicating systems.

Definition 9 (Well-behaved systems). Fix a system S and let s = ⟨q ; w⟩ be one of its configurations. We say that s is a

– deadlock configuration [3, Def. 12] if w = ε, there is r ∈ P such that qr −sr?a→r q′r, and for every p ∈ P, qp is a receiving or final state; i.e., all the buffers are empty, there is at least one machine waiting for a message, and all the other machines are either in a final or in a receiving state;
– orphan message configuration if all qp in q are final but w ≠ ε; i.e., there is at least one non-empty buffer and each machine is in a final state;
– unspecified reception configuration [3, Def. 12] if there exists r ∈ P such that qr is a receiving state, and qr −sr?a→r q′r implies that |wsr| > 0 and wsr ∉ aA*; i.e., qr is prevented from receiving any message from any of its buffers.

System S is well-behaved if for each of its configurations s ∈ RS(S), s is neither a deadlock, nor an orphan message, nor an unspecified reception configuration.
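Each of the three conditions can be checked directly on a configuration once we know, for every machine, which states are final and which are receiving; the predicates below follow the informal readings given in Definition 9 and keep the string encoding of actions used in the earlier sketches, so they are a sketch rather than the definition itself.

```python
def outgoing(transitions, q):
    return [(s, l, t) for (s, l, t) in transitions if s == q]

def is_final(transitions, q):
    return not outgoing(transitions, q)

def is_receiving(transitions, q):
    out = outgoing(transitions, q)
    return bool(out) and all(l[2:3] == "?" for (_, l, _) in out)

def is_deadlock(machines, config):
    control, buffers = config
    return (all(len(w) == 0 for w in buffers.values())
            and any(is_receiving(machines[p], control[p]) for p in machines)
            and all(is_receiving(machines[p], control[p])
                    or is_final(machines[p], control[p]) for p in machines))

def is_orphan_message(machines, config):
    control, buffers = config
    return (all(is_final(machines[p], control[p]) for p in machines)
            and any(len(w) > 0 for w in buffers.values()))

def is_unspecified_reception(machines, config):
    control, buffers = config
    for p in machines:
        q = control[p]
        if not is_receiving(machines[p], q):
            continue
        # every receive offered by p looks at a non-empty buffer whose head
        # is not the expected message: p can never consume anything
        stuck = all(len(buffers.get(l[:2], ())) > 0 and buffers[l[:2]][0] != l[3:]
                    for (_, l, _) in outgoing(machines[p], q))
        if stuck:
            return True
    return False
```

A system is then well-behaved when none of these predicates holds on any configuration of RS(S).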

9 Projections of Global Graphs

The definition of the projection of a global graph onto a participant is straightforward, except in the case of fork gates. We first define a parallel composition of automata, which is required to project global graphs in which a participant appears in different threads. We define

the ⋄ function on vectors of states:

    q⋄ := q1⋄   if q = (q1, q2) and q1 = q2
          q     otherwise

and we overload it on sets of vectors of states, i.e., Q⋄ := { q⋄ | q ∈ Q }. We define:

    q ⊑ q′ ⟺ q′ = q⋄, or
             q′ = (q1, q2) ∧ q ⊑ qi for some i ∈ {1, 2}

We write q ≅ q′ iff q ⊑ q′ and q′ ⊑ q, and we overload the operator ⊑ on sets of states such that q ⊑ Q ⟺ ∃q′ ∈ Q : q ⊑ q′.

Definition 10 (Parallel composition). The composition of Mi = (Qi, q0i, δi), i ∈ {1, 2}, written M1 ∥ M2, is the automaton ((Q1 × Q2)⋄, (q01, q02)⋄, δ) s.t.

    ((q1, q2)⋄, ℓ, (q′1, q′2)⋄) ∈ δ ⟺ (qi, ℓ, q′i) ∈ δi  if qj = q′j, qi ≅ Qj, and i ≠ j ∈ {1, 2}, or
                                      (qi, ℓ, q′i) ∈ δi  if qi ⊑ Qj and i ≠ j ∈ {1, 2}

Notice that ∥ is a commutative and associative operation. Below, we give the definition of the projection function.
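As a rough illustration of what the composition computes, the sketch below builds the plain interleaving product of two automata given as (states, initial, transitions); it deliberately keeps pairs of states as tuples and omits the ⋄-normalisation and the side conditions of Definition 10, so it is an approximation rather than an implementation of the definition.

```python
from itertools import product

def interleave(m1, m2):
    """Naive interleaving product: at each compound state either component
    may move while the other stays put.  Pairs are not flattened with the
    diamond operator, unlike Definition 10."""
    (q1, i1, d1), (q2, i2, d2) = m1, m2
    states = set(product(q1, q2))
    delta = set()
    for (a, b) in states:
        delta |= {((a, b), l, (t, b)) for (s, l, t) in d1 if s == a}   # left moves
        delta |= {((a, b), l, (a, t)) for (s, l, t) in d2 if s == b}   # right moves
    return states, (i1, i2), delta

left = ({"0", "1"}, "0", {("0", "pq!a", "1")})
right = ({"x", "y"}, "x", {("x", "rs!b", "y")})
print(sorted(interleave(left, right)[2]))
```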

Definition 11 (Projection). Given G = ⟨V′, A, Λ⟩ and v ∈ V′, the projection of (G, v, V) onto p, denoted by (G, v, V)↾p, is defined as follows:

– (G, v, V)↾p = ⟨{v}, v, ∅⟩ if v is a sink node or v ∈ V;
– (G, v, V)↾p = ⟨Q ∪ {v}, v, δ ∪ {(v, ℓ, v′)}⟩ if Λ(v) = s → r : a, where v′ is the successor of v, ℓ = (s → r : a)↾p, and ⟨Q, v′, δ⟩ = (G, v′, V ∪ {v})↾p;
– (G, v, V)↾p = ⟨{v} ∪ ⋃ Qv′, v, ⋃ (δv′ ∪ {(v, ε, v′)})⟩, with unions taken over the successors v′ of v, if v is the source node or a choice gate, where ⟨Qv′, v′, δv′⟩ = (G, v′, V ∪ {v})↾p for each such v′;
– (G, v, V)↾p = ⟨Q, v, δ ∪ {(v, ε, v″)}⟩ if v is a fork gate, where ⟨Q, v″, δ⟩ is the parallel composition ∥ of (G, v′, V ∪ {v})↾p over the successors v′ of v.

Given the vertex v ∈ V′ that is the source node of G, the projection of G onto p, written G↾p, is the automaton, minimised with respect to language equivalence, (Q, q0, δ, A) with (G, v, ∅)↾p = ⟨Q, q0, δ⟩.

The projection of a global graph onto p uses the auxiliary function (G, v, V)↾p. The function takes the following parameters: p, the identifier of the participant onto which the projection is invoked; G, the global graph to be projected; v, a node of G used as the initial node for the projection; and V, a set of visited nodes. If v has already been visited or it is a sink node, then a single-state automaton is returned. If v is labelled by an interaction s → r : a, then the projection of its successor is connected to v by a transition labelled pr!a if p = s, sp?a if p = r, or ε otherwise. If v is a choice gate or the source node, then the projection of each successor of v, connected to v by an ε-transition, is returned. If v is a fork gate, then the parallel composition of the projections of its successors is returned, connected to v by an ε-transition. The parallel composition of automata “normalises” state identities so that visited nodes are comparable with nodes produced by composing automata.
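The case of Definition 11 that does the real work is the projection of a single interaction s → r : a onto p; a minimal sketch of just that case, with actions encoded as strings as in the earlier sketches:

```python
def project_label(sender, receiver, msg, p):
    """Projection of the interaction  sender -> receiver : msg  onto p:
    an output for the sender, an input for the receiver, epsilon otherwise."""
    channel = sender + receiver
    if p == sender:
        return f"{channel}!{msg}"
    if p == receiver:
        return f"{channel}?{msg}"
    return "eps"

print(project_label("A", "B", "withdraw", "A"))   # AB!withdraw
print(project_label("A", "B", "withdraw", "B"))   # AB?withdraw
print(project_label("A", "B", "withdraw", "C"))   # eps
```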


References

1. L. Bass, P. Clements, and R. Kazman. Software Architecture in Practice. Addison-Wesley, 1998.
2. D. Brand and P. Zafiropulo. On communicating finite-state machines. JACM, 30(2):323–342, 1983.
3. G. Cécé and A. Finkel. Verification of programs with half-duplex communication. I&C, 202(2):166–190, 2005.
4. E. Dijkstra. A Discipline of Programming. Prentice-Hall, 1976.
5. R. W. Floyd. Assigning meaning to programs. In Proc. Symposium in Applied Mathematics, volume 19, 1967.
6. D. Harel and P. Thiagarajan. Message sequence charts. Available at http://www.comp.nus.edu.sg/~thiagu/public_papers/surveymsc.pdf.
7. T. Hoare. An axiomatic basis of computer programming. CACM, 12, 1969.
8. N. Kavantzas, D. Burdett, G. Ritzinger, T. Fletcher, and Y. Lafon. Web Services Choreography Description Language, Version 1.0. http://www.w3.org/TR/2004/WD-ws-cdl-10-20041217. Working Draft, 17 December 2004.
9. B. Meyer. Applying “Design by Contract”. Computer, 25(10):40–51, 1992.
10. Z. Micskei and H. Waeselynck. UML 2.0 sequence diagrams' semantics. http://home.mit.bme.hu/~micskeiz/sdreport/uml-sd-semantics.pdf.
11. M. Nordio, R. Mitin, B. Meyer, C. Ghezzi, E. Di Nitto, and G. Tamburrelli. The role of contracts in distributed development. In SEAFOOD, pages 117–129, 2009.

