Plug and Trace: A Component-Based Approach to Specify and Implement Traces

By

Rafael Ferreira Oliveira

M.Sc. Dissertation

Federal University of Pernambuco
www.cin.ufpe.br/~posgraduacao

RECIFE, AUGUST/2010

Federal University of Pernambuco
Center for Informatics
Graduate Program in Computer Science

Rafael Ferreira Oliveira

Plug and Trace: A Component-Based Approach to Specify and Implement Traces

An M.Sc. Dissertation presented to the Center for Informatics of the Federal University of Pernambuco in partial fulfillment of the requirements for the degree of Master of Science in Computer Science.

Advisor: Roberto Souto Maior de Barros
Co-Advisors: Jacques Robin and Pierre Deransart

RECIFE, AUGUST/2010

To my parents and my wife. . .

Acknowledgements

“Do not be anxious about anything, but in everything, by prayer and petition, with thanksgiving, present your requests to God.” Philippians 4:6, Holy Bible

I would first like to express my gratitude to God, who always stood by my side, supporting me, leading me and showing me how good it is to do everything in His presence. To all my family, especially my parents, for the solid educational foundation I received and for their care and encouragement throughout my Master's program. To my wife, for her care and love, so important along my way, and for her concern during my short and long absences. To my friends Arlucio Viana, Rilton Souza, Marcelo Pedro and Halley Bezerra, who not only lived with me during my stay in Recife but constantly encouraged me to keep working hard. To my classmate Marcos Aurelio, for the constant and fruitful discussions on this work and for his partnership in our studies. To professors Jacques Robin and Pierre Deransart, for the cooperation and encouragement they gave me during the program. I am also particularly grateful to my advisor, Roberto Souto Maior, for graciously agreeing to guide me in completing this work and for his objective guidance and support in improving all of its content. My thanks to all my workmates and friends at CIn/UFPE and in Itapetinga: you contributed indirectly but were, without a doubt, essential. Thank you all!


Resumo

Application analysis has gained considerable commercial value with the growing heterogeneity and distribution of current systems, both logically and physically. This convergence of complexity across design, development and production environments has introduced new challenges for monitoring, analyzing and improving these systems. Moreover, traditional approaches offer less and less value for managing today's increasingly sophisticated and distributed application ecosystems. Against this background, the Plug and Trace project integrates two proposals, the Trace Meta-Theory and Component-Based Development, to provide a simple way to embed a variety of analysis services into any kind of application. Our intention is thus to change the way analysis tools are designed: from building analysis tools only for specific applications to providing a domain-independent trace framework that is highly reusable in any domain. Additionally, in order to offer current systems a cost-effective framework, we focus on automation using Model-Driven Engineering, that is, doing more with less, eliminating redundant and manual tasks and easing the process of extending our proposal to any application. These advantages clearly represent a contribution to the domain of Application Analysis, in which the Plug and Trace project simplifies the process of conceiving an analysis tool and facilitates the analysis of any application using a common framework. There are also contributions to other domains: to Component-Based Development, with the first proposal for componentizing the Trace Meta-Theory, enriched with new generic tracing components; and to Model-Driven Engineering, with a tracer framework based on four principles: quality, consistency, productivity and abstraction, reducing manual coding and promoting the reusability of the entire framework. In order to validate our proposal, we present a case study showing how to extend the Plug and Trace framework to the domain of the CHR language.
Keywords: Application Analysis, Trace, Component-Based Development, Model-Driven Engineering, CHR


Abstract

Application analysis has assumed a new business importance as the world moves increasingly towards heterogeneous and distributed systems, both logically and physically. This convergence of complexity across design, development and production environments has introduced new challenges in monitoring, analyzing and tuning these systems. Furthermore, traditional approaches offer less and less value in managing today's sophisticated and distributed application ecosystems. Given these shortcomings, the Plug and Trace project integrates two proposals, Component-Based Development and the well-founded Trace Meta-Theory, to provide an easy way to embed a variety of analysis services into any kind of application. We thus envisage a change in the way application analysis tools are designed: from building analysis tools only for specific applications to providing a domain-independent trace framework that is highly reusable in any domain. Additionally, to enable a cost-effective adoption of the tracer framework in everyday systems, we focus on automation through Model-Driven Engineering, i.e., doing more with less, eliminating redundant and manual tasks and making it easy to extend our proposal to any application. We advocate that these advantages represent a contribution to the domain of Application Analysis, in which Plug and Trace simplifies the process of conceiving analysis tools and facilitates the analysis of any application using a common tracer framework. There are also contributions to other domains: to Component-Based Development, by providing the first componentization of the Trace Meta-Theory, together with generic components for tracing; and to Model-Driven Engineering, with a tracer framework based on four principles: quality, consistency, productivity and abstraction, reducing hand coding and promoting the reusability of the entire framework. In order to validate our proposal, we present a case study showing how to extend the Plug and Trace framework to the domain of the CHR language.
Keywords: Application Analysis, Trace, Component-Based Development, Model-Driven Engineering, CHR


Contents

List of Figures
Acronyms
1 Introduction
1.1 Plug and Trace: Goals and Design Principles
1.2 Scope of the Dissertation
1.3 Envisioned Contributions
1.3.1 Contributions to Application Analysis
1.3.2 Contributions to CBD and MDE
1.3.3 Contributions to Rule-Based Automated Reasoning
1.4 Outline of the Dissertation
2 Software Engineering Background
2.1 Component-Based Software Development
2.1.1 Fundamental Changes From Traditional Software Development
2.1.2 Software Components Specification
2.1.3 Component-Based Development Process
Building systems from components
Building reusable components
2.2 Model-Driven Engineering
2.2.1 MDE Languages
MOF
UML
OCL
2.2.2 Model Transformations
2.3 Chapter Remarks
3 Trace Meta-Theory
3.1 Towards reusability and extensibility
3.2 Generic Trace
3.2.1 Generic Full Trace of a Familly of Applications
3.2.2 Generic Full Trace of an Application
3.3 Querying Trace Events
3.4 Chapter Remarks
4 The Plug and Trace Project
4.1 Goal and Design Principles
4.2 The Top-Level Plug and Trace Component
4.2.1 Trace Receiver component
4.2.2 Trace Driver component
4.2.3 Trace Analyzer component
4.3 Trace Event
4.4 Generic Trace Schema
4.5 The Plug and Trace Process
4.6 Chapter Remarks
5 Extending Plug and Trace to CHR
5.1 Case Study: A Debugging Tool for CHR
5.1.1 Understanding the context: CHR by example
Operational Semantics
5.1.2 Modeling the trace events: ωt
5.1.3 Instrumenting: A Debugging Tool for Eclipse Prolog
5.1.4 Configuring the Plug and Trace framework: Connecting all pieces
5.1.5 Evaluating the Analysis
5.2 Chapter Remarks
6 Related Work
6.1 Eclipse TPTP
6.1.1 Strengths
6.1.2 Weaknesses
6.2 dynaTrace
6.2.1 Strengths
6.2.2 Weaknesses
6.3 TAU Performance System
6.3.1 Strengths
6.3.2 Weaknesses
6.4 InfraRED
6.4.1 Strengths
6.4.2 Weaknesses
6.5 Chapter Remarks
7 Conclusion
7.1 Contributions
7.1.1 Contributions to Application Analysis
7.1.2 Others related contributions
7.2 Limitations and Future Work
References


List of Figures

1.1 Increase and heterogeneity of the application architectures
2.1 CBD process as a combination of several parallel processes
2.2 Basic component specification concepts
2.3 The 4-level architecture of MDA
2.4 EMOF and CMOF package architecture
2.5 MOF Example
2.6 Simplified UML metamodel
2.7 Association between operations and OCL expressions
2.8 Class Invariants
2.9 Derived Attributes
2.10 OCL pre and post condition example
3.1 Virtual and Actual Trace
3.2 Roles in the TMT
3.3 Generic and specific traces
3.4 A unique abstract model for several observed processes
4.1 The Top-Level Plug and Trace Component
4.2 The Plug and Trace workflow
4.3 The Trace Receiver acts as a listener to trace events
4.4 The Trace Receiver component
4.5 The Trace Driver component
4.6 The Trace Analyzer component
4.7 An example of a trace event model
4.8 The Plug and Trace Process
4.9 Artifacts involved in the Plug and Trace instrumentation
5.1 Solver strictly connected to the debugging tool
5.2 ωt model
5.3 Visualizing a CHR execution
5.4 Running the CHR debugging tool
6.1 TPTP Project Architecture
6.2 dynaTrace Architecture
6.3 TAU Architecture
6.4 InfraRed Architecture


Acronyms

CBD Component-Based Development
CBSE Component-Based Software Engineering
CHR Constraint Handling Rules
CIn Centro de Informática
TPTP Test & Performance Tools Platform Top-Level Project
FACEPE Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco
GUI Graphical User Interface
INRIA Institut National de Recherche en Informatique et Automatique
IT Information Technology
MDA Model-Driven Architecture
MDE Model-Driven Engineering
OMG Object Management Group
OOD Object-Oriented Development
PIM Platform-Independent Model
PSM Platform-Specific Model
TMT Trace Meta-Theory
UFPE Universidade Federal de Pernambuco
UML Unified Modeling Language


1 Introduction

Modern application architectures are more complex than ever. Applications themselves have become ever more heterogeneous and distributed, both logically and physically. N-tier applications, globally distributed, are more and more common. Service-oriented environments with components built by third parties, both commercial and open-source, are commonplace. Figure 1.1 illustrates this evolution: most enterprises today are integrating several types of technologies into their IT infrastructures.

While this convergence of complexity across development, architecture and production has kept increasing over time, it has also introduced new challenges with regard to the monitoring, diagnosis and tuning of these complex application ecosystems. Furthermore, traditional analysis approaches offer less and less value in managing the performance, scalability and stability of today's sophisticated, distributed applications. A new generation of application analysis approaches is required.


Figure 1.1 Increase and heterogeneity of the application architectures. Source: The Application Performance Management Imperative, Forrester Research, 2009

Below are the key limitations of these traditional analysis tools given today's application reality.

• No integrated lifecycle approach: in nearly all cases, traditional application analysis vendors amassed a collection of tools that could be used for various tasks by different stakeholders throughout the application development lifecycle. Unfortunately, these tools have rarely been well integrated, forcing architects, developers and performance specialists to spend effort and guesswork correlating findings among themselves.

• No conditional trace: in most cases, it is very difficult to produce traces of applications that are both simple and complete. Simple, in the sense of producing minimal information, to reduce data traffic; complete, in the sense of showing everything the observer wants to see. It is therefore necessary to add the concept of querying, whereby the observer can request just what he wants to see.

• Difficult to integrate into existing environments: no environment is homogeneous, with uniform hardware, development processes between teams, and application architectures. Therefore, applications must be easily integrated with pre-existing systems, self-managed with complementary automation interfaces to fit existing processes, and highly extensible to adapt to future needs.

• Static applications only: as applications have become increasingly complex and dynamic, architects can no longer predict the exact runtime behavior of their applications. They know what their applications are supposed to do, but no one really knows how they actually behave and how transactions are really processed under load. This is partly due to the increase in services being used and the widely distributed nature of today's multi-tiered applications. Dynamic code executes only under load, and the behavior of third-party code and frameworks is often impossible to determine even when the application is live.

Taken together, these limitations of traditional application analysis approaches, especially in light of the accelerating application complexity we are encountering, are driving the urgency for a new application analysis approach.


This new approach must take into consideration the limitations described above and must anticipate future requirements. Our proposal changes the focus from building application monitoring tools only for specific technologies to providing a generic tracer framework that specifies and realizes analysis services for any kind of application. This is possible because we focus on more abstract artifacts, using a domain-independent and component-based approach.

Our long-term goal is to provide the means for developing and deploying model-driven tracer components that support a cost-effective adoption of monitoring techniques in everyday systems. Our specific goal is to develop the kernel of the bottom component of the tracer framework, Plug and Trace. It will be the first domain-independent tracer framework: a model-driven, component-based and highly reusable debugging tool. Our work is to define the top-level architecture of Plug and Trace as well as three of its main sub-components: the trace receiver, to get and adapt the received trace events; the trace driver, the core of our framework; and the trace analyzer, the element to visualize and monitor traces. Finally, Plug and Trace is intended to be the most reused component for the deployment of more advanced application analysis services.

Today, Plug and Trace is already being reused i) to integrate any kind of application, ii) to design several GUIs to easily analyze and monitor applications on the fly, and iii) to implement a debugging tool for CHR [Sneyrs et al. (2003)]. In the future, we expect to achieve grid scalability by porting Plug and Trace to cloud computing platforms, such as Google App Engine1, and by incorporating new built-in application analysis services, such as Application Performance Management [Khanna et al. (2006)], Transaction Performance Management [Gao et al. (2004)], End User Experience Management [Croll and Power (2009)] and Performance Management for Cloud Hosted Applications [Vecchiola et al. (2009)].
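The trace receiver, trace driver and trace analyzer described above could be pictured, very roughly, as the following Java interfaces. This is an illustrative sketch only: the method names, signatures and the TraceEvent shape are assumptions made for this example and are not the dissertation's actual platform-independent model.

import java.util.List;
import java.util.function.Predicate;

/** A single trace event extracted from an observed process (assumed shape). */
record TraceEvent(long id, String type, String payload) {}

/** Receives raw trace events from an observed process and adapts them. */
interface TraceReceiver {
    void onRawEvent(String rawEvent);      // called by the instrumented application
    void setSink(TraceDriver driver);      // forwards adapted events to the driver
}

/** Core of the framework: drives and filters the event stream according to queries. */
interface TraceDriver {
    void submit(TraceEvent event);         // push one adapted event
    void registerQuery(Predicate<TraceEvent> condition, TraceAnalyzer analyzer);
}

/** Visualizes or otherwise consumes the (possibly filtered) trace. */
interface TraceAnalyzer {
    void analyze(List<TraceEvent> filteredTrace);
}

In this reading, a GUI or a CHR debugging tool would simply be one more TraceAnalyzer registered with the driver.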

1.1 Plug and Trace: Goals and Design Principles

The Plug and Trace project provides the first domain-independent tracer framework and a reusable debugging framework. It aims to provide services and artifacts to embed extraction and analysis services into any application. Each component of this framework can be used either as stand-alone software or assembled with others to provide a variety of analysis services that can be integrated into the most diverse domains. Furthermore, we describe the process of building a tracer framework using our proposal. In order to provide a formally founded architecture, we use the Trace Meta-Theory (TMT), an approach that focuses particularly on providing semantics to tracers and to the produced traces.

Our goal in this project is to provide the following deliverables:

• A platform-independent model that specifies generic trace components;

• A generic trace schema. This generic trace will also enable any debugging tool to be defined independently from its application and, conversely, tracers to be built independently from these tools;

• GUI components to interactively submit queries and inspect solution explanations at various levels of detail, with patterns to specify what trace information is needed.

To fulfill these requirements, the Plug and Trace architecture is based on the following principles:

1. To integrate different domains into a unique environment, called Plug and Trace;

2. To combine component-based development and model-driven architecture to produce reusable artifacts and a domain-independent framework;

3. To go one extra step towards an easy-to-use tool for application analysis;

4. Automation, i.e., to do more with less, eliminating redundant and manual tasks.

The application we use as a case study is ECLiPSe Prolog3, a CHR4 solver. Versatility is the main reason that motivates our choice of rule-based constraint programming (and in particular CHR) to validate and test all steps of our framework. Speaking specifically of CHR, it has matured over the last decade into a powerful and elegant general-purpose language with a wide spectrum of application domains [Sneyrs et al. (2003)].

The Plug and Trace project is the result of the cooperation between CIn/UFPE and INRIA and was co-financed in 2008-2009 by FACEPE and INRIA.

1 Google App Engine is a platform for developing and hosting web applications in Google-managed data centers. It is a cloud computing technology and virtualizes applications across multiple servers and data centers.
3 ECLiPSe Prolog is an open-source software system for the cost-effective development and deployment of constraint programming applications.
4 Constraint Handling Rules (CHR) is a declarative programming language.


1.2 Scope of the Dissertation

From the software engineering point of view, the core of our research is to investigate how the most recent advances in reusable Component-Based Software Engineering (CBSE) can be leveraged to build a versatile tracer framework that fulfills today's application analysis needs. Regarding the Trace Meta-Theory, the main Plug and Trace architectural design principles are based on Deransart's theory; we reuse his principles and the roles necessary to specify tracers and traces.

This project is, thus, an effort to harvest the benefits of the model-driven, component-based approach and the flexibility of an enhanced application analysis to provide reusable and extensible components. The design decisions of this dissertation involve the following topics:

• Precise metamodeling in UML 2.0 and OCL 2.0 of all computational languages used in Plug and Trace;

• Modeling of all Plug and Trace extensions using UML 2.0 and OCL 2.0;

• Specifying automated transformations between models, and between models and executable platforms, using the MOFScript language5;

• A component-based, model-driven approach to design the artifacts of the Plug and Trace project.

In the Application Analysis realm, the scope of our thesis includes:

• Mainly, developing a complete architecture to trace any application;

• Developing a generic trace schema to promote the reusability of trace events; and

• Incorporating MDE to facilitate the modeling and generation of tracer structures.

5 MOFScript is a tool for model-to-text transformation, e.g., to support generation of implementation code or documentation from models.


1.3 Envisioned Contributions

Our work combines recent developments in three areas that traditionally do not interact very often. However, we believe that these techniques may contribute a great deal to each other, bringing meaningful mutual benefits. Firstly, application analysis is driving the urgency for monitoring tools that can be easily integrated with pre-existing systems, are self-managed with complementary automation interfaces to fit pre-existing processes, and are highly extensible to adapt to future needs. Secondly, component-based development pushes towards reusability and extensibility of the whole framework. Finally, a domain-independent approach, using model-driven engineering, gives us easy adaptation to any kind of domain. The following subsections summarize the contributions of our work in sub-fields of these areas.

1.3.1 Contributions to Application Analysis

Our intention is not to propose yet another tool, but to redefine the way applications should be built, analyzed and managed in production, by supporting analysis across the entire lifecycle and providing unprecedented insight into even the most complex applications.

1.3.2 Contributions to CBD and MDE

Although not yet widely adopted by industry, the Model-Driven Engineering (MDE) vision led by the OMG has already spawned a set of standards based on semi-formal, tightly coupled artifacts that support the software development process with great flexibility. In particular, the pervasive UML is the most fundamental element that aggregates many facets of the vision. Basically, MDE proposes to raise the level of abstraction of the software process, prescribing that application development starts with a Platform-Independent Model (PIM) which is then transformed, manually or automatically, into other models, called Platform-Specific Models (PSM), until eventually reaching executable artifacts such as source code and deployed code. Given the above, our main contributions are:

For model transformations, specifying rules for mapping trace events into executable Java code using the MOFScript language.

For Component-Based Development (CBD), providing a case study for specifying and realizing a trace framework by means of assembling components.

For MDE, demonstrating its feasibility by building the first tracer framework using a model-driven approach.

1.3.3 Contributions to Rule-Based Automated Reasoning

The study of rule-based automated reasoning is not the main focus of this project but, since we have chosen this domain to validate the whole project, we intend to provide debugging tools, generic trace schemas and reasoning explanation services.

1.4 Outline of the Dissertation

The rest of this dissertation is organized in six further chapters. In Chapter 2 we provide a summary of the software engineering background used in the development of our application. Firstly, we give an overview of CBD, showing its foundations and principles, detailing the fundamental changes from traditional software development to CBD and describing how to specify a component. Then we proceed by briefly overviewing Model-Driven Engineering (MDE), its goals, principles and vision. We follow with a presentation of model transformations. We end the chapter with remarks focusing on our project.

In Chapter 3 we discuss the Trace Meta-Theory (TMT). We start by stating the current difficulties in analyzing modern applications with traditional application analysis approaches. Then we present the TMT as the basis of the entire framework that we provide. We continue by discussing generic traces, our key ideas to promote their use, and how to query traces. Finally, we explain, in some remarks, how we use this theory as the basis of our project.

In Chapter 4 we present the overall architecture of Plug and Trace, its design principles and its complete PIM. Firstly, we detail the top-level Plug and Trace component. Then we proceed by explaining each sub-component involved in the framework. We follow by showing how to specify generic trace schemas that allow any application to be analyzed, and we discuss trace events, showing their structure and how MDE can improve their utilization. We end the chapter presenting some relevant points discussed.

In Chapter 5 we present the way of extending our tracer framework to a given domain. We discuss which components should be reused and extended by creating a simple debugging tool for Constraint Handling Rules (CHR), a rule-based language.


In Chapter 6 we present related work on application analysis, highlighting the differences from our method and pointing out ideas and techniques used in this work that were studied and derived from these related works. Finally, in Chapter 7 we conclude this dissertation by summarizing our contributions, pointing out the current limitations and proposing future developments.


2 Software Engineering Background

In this chapter we describe the key technologies of software engineering we use to develop the components of our proposed application. In particular, we present the ideas of component-based development and model-driven engineering, a set of principles and technologies that provide the structural basis of this thesis.

2.1 Component-Based Software Development

A software component encapsulates a set of basic functionalities whose need recurs in diverse applications. It contains metadata that specifies how to assemble these functionalities with those encapsulated in other components to build more complex functionalities through assembly. According to [Eriksson (2004)], "a component is a self-contained unit that encapsulates the state and behavior of a set of classifiers". All the contents of a component, including its sub-components, are private. Its services are available through provided and required interfaces.

The key feature of CBD is the ability to promote the reuse of software components. This is made possible by full encapsulation and the separation of interfaces from implementation; furthermore, this separation of concerns enables a component to be a substitutable unit that can be replaced at design time or run time by another component that offers equivalent functionality. In an assembly, a given component may act both as a server to some component and as a client to another component. The assembly structural meta-data of a component includes provided interfaces, the operations that are made available by connecting to the server ports of the component. It may also include required interfaces, the operations that the component expects to be available in the deployment environment through connections to its client ports.


A component may also include assembly behavioral meta-data that describes the pre- and post-conditions of the operations provided and required at its ports in terms of its states and the states of its clients and servers in the assembly [Robin and Vitorino (2006)]. Such meta-data allows defining a contract between a client-server component pair. Such design by contract permits black-box reuse, which is ideal for leveraging third party software and more cost-effective than the white-box reuse by inheritance in object-oriented frameworks. A component can be substituted at any time by another one that is internally different but respects the same contracts at its ports, without affecting the rest of the software.
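To make the provided/required-interface and design-by-contract ideas concrete, the sketch below shows one possible, purely illustrative Java encoding; the interface and class names are invented for this example and do not come from the dissertation's models.

/** Provided interface: operations the component offers at its server port. */
interface LogStore {
    void append(String entry);   // pre: entry != null; post: the entry is stored
    int size();
}

/** Required interface: operations the component expects from its environment. */
interface Clock {
    long now();
}

/** The implementation is hidden behind the interfaces (black-box reuse). */
final class InMemoryLogStore implements LogStore {
    private final Clock clock;   // injected required dependency
    private final java.util.List<String> entries = new java.util.ArrayList<>();

    InMemoryLogStore(Clock clock) { this.clock = clock; }

    @Override public void append(String entry) {
        if (entry == null) throw new IllegalArgumentException("pre-condition: entry != null");
        entries.add(clock.now() + " " + entry);   // post-condition: stored with a timestamp
    }

    @Override public int size() { return entries.size(); }
}

Any other class implementing LogStore and respecting the same pre- and post-conditions could replace InMemoryLogStore without affecting its clients, which is exactly the substitutability discussed above.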

2.1.1 Fundamental Changes From Traditional Software Development

Mature software development, in general, follows a well-defined process model. Since CBD is one of the many possible approaches to software development, it is worth discussing whether or not a generic software development process model is well suited for CBD. Several authors have argued against using a traditional process model in CBD, as described in the next paragraphs.

Ning discusses several aspects of development which are common in CBD and require special attention when defining a suitable process model for CBD [Ning (1996)]. He contrasts CBD with object-oriented development (OOD), where typical development models such as the waterfall model [Royce (1970)] encourage opportunistic forms of reuse rather than systematic approaches to it. In such process models, reuse is not regarded as a "first-class activity", and it is up to the designers and developers to recognize opportunities for reuse. The lack of a de facto standard definition for components adds to the lack of systematic reuse by making the identification of potential reuse artifacts harder.

Aoyama identifies several potential approaches to facilitate reuse in OOD, including software architectures, design patterns, and frameworks [Aoyama (1998)]. All these approaches to reuse are set during development or maintenance. An important contrast between OO reuse and component reuse is that components may have to be composed at run time, without further compilation, using a plug-and-play mechanism. This requires components to be viewed as black boxes, accessible through their interfaces, and fosters the definition of architectures for which the components are developed, including the standards for connecting components in those architectures.

Crnkovic et al. add to the discussion the existence of several kinds of CBD, including architecture-driven CBD and product-line CBD, and argue for the adoption of a process model tailored for each of these varieties of CBD [Crnkovic et al. (2006)].


The model presented in Figure 2.1 illustrates how the CBD process model can be regarded as a combination of several processes that occur in parallel. With some adaptations, Crnkovic et al. define variations of this model to support architecture-driven and product-line CBD.

Figure 2.1 CBD process as a combination of several parallel processes

A common point of these three studies [Ning (1996), Aoyama (1998), Crnkovic et al. (2006)] is the adoption of a modified version of some existing, well-known process model, with a shift of focus in some activities and the introduction of parallel process flows for each of the participating organizations. It is also worth noticing the introduction of a third process, component assessment, which can be carried out by an organization independent from both the component developers and the component users.

2.1.2 Software Components Specification

UML, the de facto industry standard for object-oriented modeling, has great potential for component-based systems. Figure 2.2 depicts the basic concepts concerning component specification using a simplistic UML metamodel, adapted from [Lüders et al. (2002)].


Figure 2.2 Basic component specification concepts

A component exposes its functionalities by providing one or more access points. An access point is specified as an interface. A component may provide more than one interface, each interface corresponding to a different access point. An interface is specified as a collection of operations; it does not provide the implementation of any of those operations. Depending on the interface specification technique, the interface may include descriptions of the semantics of the operations it provides, with different degrees of formality.

The separation between interface and internal implementation allows the implementation to change while the interface is kept unchanged. It follows that the implementation of components may evolve without breaking the compatibility of software using those components, as long as the interfaces and their behavior, as perceived by the component user, are kept unchanged with respect to an interaction model. A common example is improving the efficiency of the implementation of a component without breaking its interfaces. As long as that improvement has no negative effect on the interaction model between the component and its clients, and the component's functionality remains unchanged, the component can be replaced by the new version.

2.1.3 Component-Based Development Process

A CBD process includes all activities of a product or a system built with components during its entire life, from the business idea for its development, through its usage, to the end of its use.

Building systems from components

The general idea of the component-based approach is building systems from pre-defined components [Crnkovic et al. (2006)]. This assumption has several consequences for the system lifecycle. First, the development processes of component-based systems are separated from the development processes of the components; the components should have already been developed, and possibly used in other products, when the system development process starts.


Second, a new, separate process appears: finding and evaluating the components. Third, the activities in the processes differ from the activities in a non-component-based approach: for system development the emphasis is on finding the proper components and verifying them, while for component development, design for reuse is the main concern.

System development with components is focused on the identification of reusable entities and the relations between them, starting from the system requirements and from the availability of already existing components [Goulão (2005)]. Much of the implementation effort in system development is no longer necessary, but the effort required to deal with components (locating them, selecting the most appropriate ones, testing them, etc.) increases.

Building reusable components

The process of building components can follow an arbitrary development process model. However, any model will require certain modifications to achieve the goals: in addition to the demands on the component's functionality, a component is built to be reused. Reusability implies generality and flexibility, and these requirements may significantly change the component's characteristics. For example, there might be a requirement for portability, and this requirement could imply a specific implementation solution (such as the choice of programming language, the implementation of an intermediate level of services, programming style, etc.). The generality requirements often imply more functionality and require more design and development effort and more qualified developers. Component development requires more effort in testing and in the specification of the components. The components should be tested in isolation, but also in different configurations. Finally, the documentation and the delivery require more effort, since extended documentation is very important for increasing the understanding of the component.

2.2 Model-Driven Engineering

The term Model-Driven Engineering (MDE) is typically used to describe software development approaches in which abstract models of software systems are created and systematically transformed into concrete implementations [France and Rumpe (2007)].


MDE combines process and analysis with architecture [Kent (2002)]. Higher-level models are transformed into lower-level models until the model can be made executable using either code generation or model interpretation. The best-known MDE initiative is the Object Management Group (OMG) initiative Model-Driven Architecture (MDA), started in 19971. Model-Driven Architecture (MDA) provides a framework for software development that uses models to describe the system to be built [Mellor et al. (2002)]. MDA provides an approach in which systems are specified independently of the platform that supports them. The three primary goals of MDA are portability, interoperability and reusability through an architectural separation of concerns [Miller et al. (2003)].

In the following we address some related principles and basic concepts needed to understand the MDE proposal:

• Reusable assets: the most valuable, durable and reusable assets produced during development are not code but models.

• Improving design and code: the more significant and cost-effective quality gains are achievable by improving designs and models rather than by improving code.

• Extensibility: the benefits of careful, detailed, explicit modeling are not limited to the application under development but extend to all the processes, artifacts, languages, tools and platforms used for this development.

• Software process automation: a high degree of automation can be achieved by building a variety of models, each one with a different role in the process; by making each of these models machine-processable, expressing them in a semi-formal notation devoid of natural language; by defining this notation itself as an object-oriented model; and by using model transformations to generate the target models from these source models.

To realize the MDA vision, a modeling language such as UML is not enough. It is also important to express the links among models (traceability) and the transformations between them. This requires accessing elements not only at the model level but also at the modeling formalization level. A metaformalism is a language to define the constructors of another language as well as their structural relationships, such as composition and generalization.

1 http://www.omg.org/mda/


It thus defines an abstract syntax of a language, called a metamodel, which ignores the ordering constraints among the constructors. A metamodel plays the role of a grammar, but at a more abstract level. MDE defines three levels of abstraction regarding model formalisms, plus the object level (Figure 2.3): the model, the formalism for modeling (the metamodel) and the metaformalism. MOF (Meta-Object Facility) [OMGa, 2006] is the OMG choice to express the modeling formalisms, or metamodels, which in turn express the models. MOF expresses itself as a meta-metamodel. MOF reuses, at another level and for another purpose, the UML class diagram. Whereas in UML these diagrams are used at level M1 to model the application, MOF uses them at levels M2 and M3 to model languages. MOF allows an object-oriented visual representation of computational language grammars. Furthermore, it extends the UML class diagram with a reflective API.

Figure 2.3 The 4-level architecture of MDA

2.2.1 MDE Languages

An MDE approach must specify the modeling languages, models, translations between models and languages, and the process used to coordinate the construction and evolution of the models [Kent (2002)]. In the next sections, we briefly describe three standards: UML 2, a modeling language; OCL 2, a language to specify constraints on UML models; and MOF, a standard to represent and manipulate metamodels.


MOF

MOF (Meta-Object Facility) [OMGa, 2006] is the OMG choice to express metamodels, which in turn express the models. MOF reuses the structural core of UML, a mature, well-known and well-tooled language. The main benefits of MOF over traditional formalisms such as grammars for defining languages are: abstract instead of concrete syntax (more synthetic); visual instead of textual notation (clarity); graph-based instead of tree-based (abstracting from any reading order); entities (classes) can have behavior (grammar symbols cannot); relations between elements include generalization and undirected associations instead of only composition and order; and specification reuse through inheritance.

Figure 2.4 shows the package architecture of MOF. EMOF stands for Essential MOF and is a subset of the Complete MOF (CMOF) that closely corresponds to the facilities provided by most OO programming languages. A primary goal of EMOF is to allow simple metamodels to be defined using simple concepts, while supporting extensions (by the usual class extension mechanism in MOF) for more sophisticated metamodeling using CMOF.

Figure 2.4 EMOF and CMOF package architecture.

In essence the Basic package contains the Core constructs except for associations, which appear only in Constructs. The Reflection Package allows the discovery and manipulation of meta-objects and metadata. The Identifiers package provides an extension


for uniquely identifying metamodel objects without relying on model data that may be subject to change, and the Extension package supports a simple means for extending model elements with name/value pairs.

Figure 2.5 shows an example metamodel for the Use Case diagram, one of many UML diagrams. The diagram contains three metaclasses: Actor, System and UseCase. The metaclass Actor has a meta-attribute name of type String, the metaclass UseCase has a meta-attribute title of type String, and the metaclass System has a meta-attribute name of type String. There is a recursive meta-association, inherits, on the metaclass Actor, and there are two more recursive meta-associations on the metaclass UseCase, namely extends and includes. Finally, there is an aggregation meta-association between the metaclasses System and UseCase.

Figure 2.5 MOF Example.
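As a rough intuition for what this metamodel expresses, the sketch below encodes its metaclasses and meta-associations as plain Java classes. This is only an analogy for illustration (MOF metamodels are not normally hand-coded this way), and the field names are assumptions rather than the figure's actual labels.

import java.util.ArrayList;
import java.util.List;

// Illustrative analogy only: the Use Case metamodel of Figure 2.5 as Java classes.
class Actor {
    String name;
    Actor inherits;                                   // recursive meta-association "inherits"
}

class UseCase {
    String title;
    List<UseCase> extendsCases = new ArrayList<>();   // recursive meta-association "extends"
    List<UseCase> includesCases = new ArrayList<>();  // recursive meta-association "includes"
}

class System {                                        // name mirrors the metaclass in the figure
    String name;
    List<UseCase> useCases = new ArrayList<>();       // aggregation between System and UseCase
}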

UML

The Unified Modeling Language (UML) [Rumbaugh et al. (2004)] is a graphical modeling language standardized by the OMG whose objective is to describe software systems, business processes and similar artifacts. It integrates most constructs from the object-oriented, imperative, distributed and concurrent programming paradigms. In Figure 2.6 we show a simplified metamodel of UML. We focus only on the constructors used to represent class diagrams; these describe the structure and the interfaces provided and required by the objects of the running system, without putting too much emphasis on the representation of complicated execution flows.


Figure 2.6 Simplified UML metamodel

In UML, we have the concept of InstanceSpecification, which allows the modeler to include an abstract vision of how the instances of the classes in the model are going to be organized at runtime. We also add the concept of Constraint, which annotates elements in the diagram and enriches their semantics. Constraints are often used to simplify the graphical rendering of the model by utilizing a more expressive language. In Figure 2.6, we also show the attribute isDerived of the meta-class Property: when it is true, it indicates that the value of the attribute can be computed from the values of other attributes (and thus does not need to be stored). The default association of the meta-class Property defines the default value for an attribute, i.e., the value to be associated to an attribute in case no value is defined by the model.

19

2.2. MODEL-DRIVEN ENGINEERING

and derived attributes and associations; • Arbitrary complex algorithms that combine behavior of class operations or message passing; for this purpose, OCL is Turing-complete and allows specifying operations preconditions, read-only operation bodies, and read-write operations post-conditions. Figure 2.7 shows the association between operations and OCL expressions. There are three kinds of constraints that might be associated to Operations: pre-conditions, pos-conditions and body (for query-only operations).

Figure 2.7 Association between operations and OCL expressions.

OCL allows the specification of invariant conditions that must hold for the system being modeled. For example, it is possible to specify in OCL that an attribute balance of a class BankAccount cannot store negative values. This can be accomplished (Figure 2.8) using a simple constraint on both the class and the specific attribute; below we show it in OCL concrete syntax.

Figure 2.8 Class Invariants.

20

2.2. MODEL-DRIVEN ENGINEERING

OCL allows developers to specify attributes or associations that permit their instances can be derived from those of others in the model. For example, the Figure 2.9 shows a simple model adorned with an OCL expression to derive attributes for the class Customer: The OCL expression derives the value of attribute golden by checking if the customer has an account which is greater than a given amount of money. This construction allows using OCL as a business rule specification language over business domains modeled as UML class diagrams.

Figure 2.9 Derived Attributes.

OCL pre-conditions may accompany an operation to detail which are the requirements to execute that operation, i.e. a pre-condition is a Boolean expression that must be true prior to the operation execution. OCL post-conditions express the state of the system after the operation is executed, including changes of objects. The diagram below gives a simple example of a pre-condition and post-codition: the withdrawing operation is only allowed if the balance is greater than or equal to the required amount and the new balance will be this previous value subtracted from the amount withdrawn.

Figure 2.10 OCL pre and post condition example.

2.2.2 Model Transformations Model transformation is the process of converting one model to another model of the same system [Judson et al. (2003)]. Because many aspects of a system might be of interest, various modeling concepts and notations can be used to highlight one or more particular perspectives, or views, of that system, depending on what is relevant at any point in time. Furthermore, in some instances, it is possible augment the models with hints, or rules, that assist in transforming them from one representation to another. It is

21

2.3. CHAPTER REMARKS

often necessary to convert to different views of the system at an equivalent level of abstraction (e.g., from a structural view to a behavioral view), and a model transformation facilitates this. In other cases, a transformation converts models offering a particular perspective from one level of abstraction to another, usually from a more abstract to less abstract view, by adding more detail supplied by the transformation rules. MDA practitioners recognize that transformations can be applied to abstract descriptions of aspects of a system to add detail [Brown (2004)], to make the description more concrete, or to convert between representations. Distinguishing among different kinds of models allows us to think of software and system development as a series of refinements between different model representations. These models and their refinements are a critical part of the development methodology in situations that include (i) refinements between models representing different aspects of the system, (ii) addition of further details to a model, or (iii) conversion between different kinds of models. Underlying these model representations, and supporting the transformations, is a set of metamodels. The ability to analyze, automate, and transform models requires a clear, unambiguous way to describe the semantics of the models. Hence, the models intrinsic to a modeling approach must themselves be described in a model, which we call a metamodel. For example, the static semantics and notation of the UML are described in metamodels that tool vendors use for implementing the UML in a standard way. The UML metamodel describes in precise detail the meaning of a class, an attribute, and the relationships between these two concepts [Brown et al. (2005)]. The OMG recognizes the importance of metamodels and formal semantics for modeling, and it has defined a set of metamodeling levels as well as a standard language for expressing metamodels: the Meta Object Facility (MOF). A metamodel uses MOF to formally define the abstract syntax of a set of modeling constructs.

2.3 Chapter Remarks This chapter presented the CBD and MDE, showing the principles and languages that establish the basis of our tracer framework. In order to promote reusable and extensible tracer artifacts we will adopt the CBD together with the MDE, that provide full encapsulation and separation of concern over the entire range of development stages, from requirements to modeling, implementation, testing quality insurance and maintenance.

22

3 Trace Meta-Theory

In this chapter we present the Trace Meta-Theory, that sets the foundation of our entire tracer framework. We address its principles and roles needed to specify tracers and traces. Furthermore, we discuss about generic traces and how to query traces. First of all, it is necessary to understand what a Meta-Theory is. According to the definition given by systemic TOGA [GADOMSKI (1997)], a Meta-Theory may refer to the specific point of view on a theory and to its subjective meta-properties, but not to its application domain. Therefore, a theory T of the domain D is a meta-theory if D is a theory or a set of theories. A general theory is not a meta-theory because its domain D is not composed of theories. By the previous definitions, the Trace Meta-Theory (TMT) [Deransart (2008)] is a meta-theory because it provides a set of definitions about how to define trace theories to specific domains. The term trace may be interpreted as a sequence of communication actions that may take place between the observer and its observed process, where its trace can be described in terms of finite-length sequences of events representing each step of running a given process. There is also the tracer that means the generator of trace. According to [Deransart (2008)], TMT focus particularly on providing semantics to tracers and the produced traces. Its semantics should be independent as possible from those of the processes or from the ways the tracers produce them. To illustrate the previous concepts, let’s suppose that we want to trace programs written in a given language called CHR 2 . The Figure 3.1 shows our scenario, where a CHR program is firstly translated into Prolog 3 , and after, it is executed in the SWIProlog 4 engine. 2 CHR

is a high-level language for concurrent logical systems. Prolog is a general purpose logic programming language. 4 SWI-Prolog is an open source implementation of the programming language Prolog. 3

23

3.1. TOWARDS REUSABILITY AND EXTENSIBILITY

Figure 3.1 Virtual and Actual Trace.

Suppose further that in our example we want to see the execution of CHR programs disregarding the states achieved on the underlying technologies, Prolog and SWI-Prolog. The remaining abstract states achieved during the execution, i.e. regarding the CHR environment forms a virtual trace. When we extract trace events from the virtual trace, for example, materializing these events by logging the CHR execution in a file system, we produce an actual trace. Finally, there is the idea of full trace, if the parameters chosen to be observed about the process represents the totality of knowledge regarding the process. In our example, the totality is represented by CHR, Prolog and SWI-Prolog.

3.1 Towards reusability and extensibility

The TMT approach is mainly based on the concepts of actual and virtual trace and constitutes the starting point for studying the modular construction of tracers and traces. Figure 3.2 shows the different roles related to the conception of a tracer. The TMT distinguishes 5 roles.

1. Observed process. The observed process, or input process (the one that produces trace events), is assumed to be more or less abstract, in such a way that its behavior can be described by a virtual trace, that is to say, a sequence of (partial) states.


Figure 3.2 Roles in the TMT

A formal description of the process, if possible, can be considered as a formal semantics, which can be used to describe the actual trace extraction.

2. Extractor. This is the extraction function of the actual trace from the virtual trace. In the case of a programming language, extraction usually requires modifying the code of the process.

3. Filter. The role of the filter, or driver [Langevine and Ducassé (2005)], is to select a useful sub-trace. This element requires a specific study. It is assumed here that it operates on the actual trace (the one produced by the extractor). The filtering depends on the specific application, implying that the produced trace already contains all the information potentially needed for various uses.

4. Rebuilder. The reconstruction performs the reverse operation of extraction, at least for a subpart of the trace, and reconstructs a sequence of partial virtual states. If the trace is faithful (i.e. no information is lost by the driver) [Deransart (2009)], this ensures that the virtual trace reconstruction is possible. Also in this case, the separation between the two elements (rebuilder and analyzer) is essentially theoretical; these two elements may in practice be very entangled.

5. Analyzer. The element that visualizes and monitors a trace; it may be a trace analyzer or any other application.

TMT defines that the whole process of providing tracers and traces can be visualized in three main views (Figure 3.2):


1. Observational Semantics (OS). The OS formally describes the observed process (or a family of processes) and the actual trace extraction. Due to the separation into several roles, the actual trace may be expressed in any language. TMT suggests using XML. This allows the use of standard querying techniques defined for the XML syntax.

2. Querying. TMT discusses how to query trace events: each event is processed by the trace filter, on the fly, with respect to the conditions of the queries.

3. Interpretative Semantics (IS). The interpretation of a trace, i.e. the capacity of reconstructing the sequence of virtual states from an actual trace, is formally described by the Interpretative Semantics. In the TMT no particular application is defined; its objective is just to make sure that the original observed semantics of the process has been fully communicated to the application, independently of what the application does.

3.2 Generic Trace

The Trace Meta-Theory also describes the motivation to build a generic trace format. This artifact is intended to facilitate the adaptation of analysis tools to different domains. Furthermore, it enables analyzers to be defined almost independently from specific domains and, conversely, tracers to be built independently from these tools. For this reason it is qualified "generic". The generic trace format contains the definitions of the trace events and of what each tracer should generate when tracing the execution of a specific domain. As illustrated by Figure 3.3, each application may generate a specific trace with many particular events not taken into account by the generic trace. In order to produce generic trace events, it is thus required that each such event be mapped to a generic event. For this mapping it is required that the subsequence of the specific trace which corresponds to the generic trace be a consistent generic trace, i.e. a trace whose syntax and semantics follow the specified trace schema and can thus be understood by the analysis tools. Notice that not every application may be able to generate all the described generic events. Thus the generic trace format describes a superset of the generic events a particular tracer is able to generate.


Figure 3.3 Generic and specific traces

On the other hand, a "portable" analyzer tool should be able to extract from a specific trace, and to understand, the sub-flow of events corresponding to the generic trace. Figure 3.3 illustrates two cases: portable tools which use the generic trace only (Tools A, B and Y), and specific tools which rely on specific trace events (Tool X). Both situations are acceptable. However, a specific tool which relies on specific trace events may be more difficult to adapt to another application. In short, TMT represents a generic trace as a sequence of trace events consisting of:

• a sequential event number;
• the port (the name of one of the semantics rules);
• the observed state of the observed process; and
• some specific attributes of the port.

An illustrative serialization of such an event is sketched below.
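As an illustration, a generic trace event carrying these four parts might be serialized as follows; the element and attribute names here are purely illustrative assumptions, not the normative schema defined in Chapter 4.

<event chrono="42" port="apply">
  <state>
    <!-- the observed (partial) state of the observed process -->
  </state>
  <attributes>
    <!-- port-specific attributes, e.g. the name of the rule that fired -->
  </attributes>
</event>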

3.2.1 Generic Full Trace of a Family of Applications

Consider Figure 3.4. It shows how different applications produce traces and the possibility to abstract them into a unique trace. This common trace is used to specify the virtual and actual traces.


Figure 3.4 A unique abstract model for several observed processes

This also illustrates how TMT proceeds to get a generic trace from any application: starting from an abstract, sufficiently refined theoretical semantics which is (almost) the same as the one implemented in all applications.

3.2.2 Generic Full Trace of an Application

Now we consider again the case of an application written in CHR (Figure 3.1). It may produce, for example, trace events regarding a specific domain, like CLP(FD)⁶. In this case there exists a generic trace called GenTra4CP [Deransart et al. (2004)]. This trace is generic for most of the existing CLP(FD) constraint solvers. Therefore a tracer of a CLP(FD) solver implemented in CHR should also produce this trace. But we may be interested in refining the trace, considering that there are two layers: the layer of the application (CLP(FD)) and the layer of the language in which it is implemented (CHR). The most refined trace will then be the trace in the GenTra4CP format extended with elements of the generic full trace of CHR alone. The generic full trace of CLP(FD) on CHR is an extension of the application trace taking into account details of lower layers.

⁶ CLP(FD) is particularly useful for modeling discrete optimization and verification problems such as scheduling, planning, packing, timetabling, etc.


3.3 Querying Trace Events

The TMT approach to trace querying is based on events and trace interrogation. This interrogation is processed by filtering the trace events, on the fly, with respect to the conditions of the queries. For this purpose, a tracer driver should contain a filtering mechanism: it receives the filtering queries from the analysis process and sends back filtered information to it. TMT suggests using XML. This allows the use of standard querying techniques defined for the XML syntax, like XPath [Clark et al. (1999)], as illustrated below.
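For instance, assuming trace events are serialized as <event> elements with a port attribute and a chrono child element (illustrative names only, not the schema of Chapter 4), a query requesting only the Apply events produced after a given point of the execution could be expressed as the XPath expression:

//event[@port='apply' and number(chrono) > 100]

The tracer driver would evaluate such an expression against each incoming event, forwarding only the matching ones to the analyzer.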

3.4 Chapter Remarks

This chapter presented the Trace Meta-Theory (TMT), an approach that focuses particularly on providing semantics to tracers and the produced traces. We showed its three views: the Observational Semantics, which produces the trace information; the Querying view, a trace query processor realized by the driver component; and the Interpretative Semantics, a front-end that takes the produced trace as input to show it in a pretty-printed form. This dissertation is focused on specifying the Observational Semantics and a Trace Driver in the context of a tracer framework and, as a case study, a debugging tool for CHR. Furthermore, we specify a generic CHR trace schema for debugging using XML Schema, and XML as the host language to specify the rules of this schema.


4 The Plug and Trace Project

In previous chapters we explained how application analysis tools have steadily evolved during the last decades; but, because the new generation of applications has increased in complexity across development, architecture and production, new challenges with regard to the monitoring, diagnosis and tuning of these complex application ecosystems are still emerging. Furthermore, as soon as applications are used for mission-critical processes, performance and availability become important non-functional requirements. The need for flexible and user-friendly trace explanation facilities has been increasing as time goes by. This is so because the way applications are built today has fundamentally changed. The possibility of analyzing dynamic and static properties of several applications using a common analyzer is an important issue to reduce the learning curve. The current scenario is that an analyzer from vendor X does not work with an analyzer from vendor Y, which does not work with another analyzer developed by vendor Z. In order to solve the aforementioned problem, the Plug and Trace project provides the first domain-independent and reusable debugging framework. Its goal is to embed extraction and analyzer services into any application. Each component of this framework can be used either as stand-alone software or assembled in order to provide a variety of analysis services that can be integrated in the most diverse domains. We proceed to explain how our tracer framework can be used as the basis for a trace analysis that realizes much of what existing tracer tools offer. We then argue for the model-driven, component-based architecture, which is the choice most aligned with our primary goal of delivering a suite of reusable artifacts that can be easily integrated in everyday software. This chapter details the architecture of Plug and Trace, our proposed realization of such a framework.


4.1 Goal and Design Principles

The main Plug and Trace architectural design principles are based on Deransart's theory [Deransart (2008)]. In a nutshell, our work represents a first object-oriented mapping of this theory. First of all, let us introduce the requirements of our framework; the proposed framework should:

• be able to integrate with any kind of input process, addressing the entire test and performance life cycle, from early testing to production application monitoring, including test editing and execution, monitoring, tracing and profiling, and log analysis capabilities. The platform should support a broad spectrum of computing systems including embedded, stand-alone, enterprise, and high-performance, permitting its support to expand to encompass the widest possible range of systems;

• be built on a component-based architecture and be simple, intuitive and easy to reuse and operate. This project should build a generic, extensible, standards-based tool platform upon which software developers can create specialized, differentiated, and interoperable offerings for world class analysis tools;

• contain a trace request, sent by the trace analyzer, which specifies the part of the trace that the trace analyzer wants to see. In other words, it consists of receiving all the execution events and analyzing them on the fly to show only the interesting information;

• permit its integration with the input process in a simple way, without compromising the performance of the input processes.

The following goals should be achieved in order to meet the aforementioned requirements:

• To integrate and manage different domains in a unique environment, called Plug and Trace;

• To combine component-based development and model-driven architecture to produce reusable artifacts and a domain-independent framework;

• To use a generic trace schema with the intention of maintaining a unique structure for the trace events produced by the input processes.


• To provide a set of services to facilitate the extension and reuse of the entire framework;

• To provide a set of views to easily analyze the trace events produced;

• To provide GUI components to interactively submit queries and inspect solution explanations at various levels of detail, with patterns to specify what trace information is needed;

• To provide a trace request processor, to analyze the requests sent by the trace analyzer;

• To support automation, i.e. to do more with less, eliminating redundant and manual tasks.

In the next sections we specify the whole framework in detail, showing its theoretical foundations, architecture and components.

4.2 The Top-Level Plug and Trace Component

The Plug and Trace is designed for usage in any kind of application and across the entire lifecycle, including development, test, staging and production [Deransart and Oliveira (2009)]. Its architecture enables any application to be traced on the fly, an ideal solution for 24x7¹ production environments. The Plug and Trace framework describes all the phases involved, from collecting the trace events to analyzing this information. The Plug and Trace basically acts as a server collecting trace events from any kind of application. This is made possible by injecting hooks into the application to produce trace events, and this is the only source code change required in the input process. Afterwards, all trace management is performed through its three main sub-components: the TraceReceiver, the TraceDriver and the TraceAnalyzer. Figure 4.1 shows the components involved in the application analysis process using the Plug and Trace.

• The Trace Receiver, a listener that takes as input trace entries sent by any input process and forwards this trace to the Trace Driver component. It also has the important function of adapting these trace entries, received in any format, to a common structure inside the Plug and Trace framework, called TraceEvent.

¹ 24/7 is an abbreviation which stands for "24 hours a day, 7 days a week", usually referring to a business or service available at all times without interruption.


Figure 4.1 The Top-Level Plug and Trace Component

This adaptation is performed by extending the TraceAdapter class, which is shown in detail in Section 4.2.1.

• The Trace Driver, in a nutshell, provides the services and data structures necessary to process and filter the trace events. This component has the intention of maximizing the application analysis possibilities with minimum instrumentation and overhead.

• The Trace Analyzer provides services and some views to analyze the trace events. Furthermore, through this component, it is possible to adjust the level of detail shown in the views by tuning it on the fly, without restarting the target application.

Other elements included in this framework are the model templates. Their goal is to reduce hand coding by generating some artifacts using the MDE approach. These artifacts are detailed in Sections 4.3 and 4.4. The sequence diagram in Figure 4.2 presents how each component operates with the others and in which order.


Figure 4.2 The Plug and Trace workflow


In the first lifeline the main loop is started; this interaction represents the execution of a given process sending trace entries to the Plug and Trace framework. This input process connects to the Trace Receiver and sends each trace entry. To maintain a common trace data structure among our components, we have defined the Trace Event (see Section 4.3); to adapt the input data into Trace Events, the Trace Adapter gets these entries, converts them into trace events and forwards this information to the Trace Driver. The internal loop iterates through the connected Trace Analyzers, filtering and sending the information each one requested.

4.2.1 Trace Receiver component

In order to integrate our tracer framework with any kind of process, we provide the Trace Receiver component as a mechanism for integrating, receiving and translating traces to a common structure, called TraceEvent. This component is basically a listener that gets the trace entries sent by a connected input process and forwards these entries to another component called TraceDriver. Figure 4.3 illustrates its operation.

Figure 4.3 The Trace Receiver acts as a listener to trace events

As a listener of trace entries, the Receiver has the function of forwarding this information to the subsequent components. Figure 4.4 shows its structure. The Trace Socket class is our default implementation of the Receiver class. This class is specified as a Java Socket², and its operation is basically to run on a specific computer using a socket that is bound to a specific port number.

² http://java.sun.com/j2se/1.4.2/docs/api/java/net/Socket.html


The Trace Socket just waits, listening on the socket, for an input process to make a connection. This main loop was presented in Figure 4.2. For each trace entry, the input process connects to the Trace Socket and sends the entry. Afterwards, it is necessary to convert (using the TraceAdapter class) the received information into a trace event (Section 4.3); this step is necessary to maintain a unique trace structure. If the trace entries are specified using the XML format, we can use the XMLTraceAdapter class, our default implementation of the Trace Adapter class.

Figure 4.4 The Trace Receiver component
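To make the listening loop concrete, the following is a minimal sketch of a socket-based receiver. It assumes the Receiver and TraceAdapter abstractions of Figure 4.4; the default port 2004, the setAdapter helper and the "EOF" end-of-entry marker (which the Eclipse Prolog instrumentation of Chapter 5 happens to send) are illustrative assumptions, not the framework's definitive implementation.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

public class TraceSocket extends Receiver {

    private static final int PORT = 2004; // assumed default port

    public TraceSocket(TraceAdapter adapter) {
        setAdapter(adapter); // assumed setter on Receiver
    }

    public void run() {
        try (ServerSocket server = new ServerSocket(PORT);
             Socket client = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), "UTF-8"))) {
            StringBuilder entry = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                if ("EOF".equals(line)) {
                    // one complete trace entry received: adapt and forward it
                    sendTraceEntry(entry.toString());
                    entry.setLength(0);
                } else {
                    entry.append(line).append('\n');
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}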

The Trace Receiver provides the following service:

context Receiver::sendTraceEntry(entry:Object)
post: traceForward.traceEvent = getAdapter().getTrace(entry)

This method just gets the entry sent by the input process, calls the registered implementation of the Trace Adapter class to convert this entry into a TraceEvent and, finally, forwards the converted trace event to the Trace Driver component. Our default implementation of the TraceAdapter class uses the XMLDecoder³ to adapt the XML trace entries into a trace event; it is described in the following contract:

context XMLTraceAdapter::getTrace(entry:Object):TraceEvent
post: result = XMLDecoder.convert(entry)

³ http://download.java.net/jdk7/docs/api/java/beans/XMLDecoder.html
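In Java, such an adapter could be sketched as follows; this is only an illustration of the contract above. The TraceAdapter and TraceEvent types come from Figure 4.4, while the use of a string entry and of java.beans.XMLDecoder's readObject call are assumptions about the default implementation.

import java.beans.XMLDecoder;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class XMLTraceAdapter extends TraceAdapter {

    @Override
    public TraceEvent getTrace(Object entry) {
        // The entry is expected to be an XML document in the java.beans
        // serialization format (the same format the CHR instrumentation
        // of Chapter 5 produces).
        byte[] bytes = entry.toString().getBytes(StandardCharsets.UTF_8);
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(bytes))) {
            // The first object read from the stream is the trace event instance.
            return (TraceEvent) decoder.readObject();
        }
    }
}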


If the input process sends the trace entries using the XML format, then it is not necessary to register a trace adapter or configure anything to convert the trace entries into trace events, because the Plug and Trace framework is configured (by default) to receive trace entries specified in this format.

4.2.2 Trace Driver component

As the core of the Plug and Trace framework, the Trace Driver has the function of filtering the trace events and iterating through each Trace Analyzer to send the filtered trace events. Figure 4.5 shows its structure in a class diagram.

Figure 4.5 The Trace Driver component

After the Trace Receiver forwards the trace event, the Trace Driver starts its execution. The main activity is to notify all connected analyzers about the trace event received. There is also an interceptor, which filters the trace event according to the Trace Request sent by the Trace Analyzer. This interceptor is called FilterManager. It gets the pair trace analyzer and trace event, delegates to the FilterChain to determine which filters will be applied and, finally, forwards the filtered trace event to the trace analyzer. The Trace Driver provides the following service:

context TraceForward::notifyAnalyzers(event:TraceEvent)


post: getAnalyzers()->forAll(analyzer: TraceAnalyzer | filterManager.process(analyzer, event))

For all analyzers connected to the Plug and Trace framework, the TraceDriver will iterate over them, filtering the trace event and then sending the requested trace to each analyzer.

context TraceAnalyzerManager::register(analyzer:TraceAnalyzer)
post: getAnalyzers()->includes(analyzer)

The basic step to connect an analyzer to the Plug and Trace framework is just to call the register service passing the analyzer and the framework will start to send trace events to it. context TraceAnalyzerManager::unregister(analyzer:TraceAnalyzer) post: getAnalyzers()->excludes(analyzer)

The inverse process is similar: it is only necessary to call the unregister service passing the desired analyzer, and the framework will stop sending trace events to it. The behavior of the interceptor class is described as follows:

context FilterManager::process(analyzer:TraceAnalyzer, event:TraceEvent)
post: filterChain.filters->forAll(filter: TraceFilter |
    getAnalyzer().notify(filter.doFilter(event, analyzer.request)))

If an analyzer wants to filter the trace events before receiving them, it is necessary to include this information in the request attribute of the TraceAnalyzer. Then, when the FilterManager starts to process the events, it will get the request sent by the analyzer together with the event and will return the desired event. Notice that the framework permits plugging in more than one filter.

context FilterChain::addFilter(filter: TraceFilter)
post:


filters->includes(filter)

The addFilter contract shows how to include a new filter in the Plug and Trace framework: the addFilter service should be called, passing the desired filter instance.
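As an illustration of what a concrete filter might look like, the following is a minimal sketch against the TraceFilter interface of Figure 4.5. The Request.matches helper and the idea of dropping non-matching events by returning null are assumptions made for this example only.

public class RequestFilter implements TraceFilter {

    private final TraceAnalyzer analyzer;

    public RequestFilter(TraceAnalyzer analyzer) {
        this.analyzer = analyzer;
    }

    @Override
    public TraceEvent doFilter(TraceEvent event, Request request) {
        // Forward the event only when it satisfies the analyzer's request;
        // otherwise drop it.
        if (request == null || request.matches(event)) {
            return event;
        }
        return null;
    }

    @Override
    public TraceAnalyzer getAnalyzer() {
        return analyzer;
    }
}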

4.2.3 Trace Analyzer component

The TraceAnalyzer component is the artifact that the end user has access to, for viewing and analyzing the trace events produced. Figure 4.6 shows its structure in a class diagram.

Figure 4.6 The Trace Analyzer component

We provide two simple trace analyzers:

• The TraceAnalyzer class, which is notified when a trace event is received;
• The StepByStepAnalyzer class, which queues each trace event received and releases it when the newStep() method is called.

As the TraceAnalyzer is an abstract class, its function is just to be notified when a trace event is received. The StepByStepAnalyzer provides the following service:


context StepByStepAnalyzer::newStep():TraceEvent
post:
    if (hasNext()) then
        result = queue.remove();
    endif
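A minimal Java sketch of this behavior is shown below; it assumes the abstract TraceAnalyzer class of Figure 4.6, and the queue field and hasNext() helper are assumptions consistent with the contract above rather than the framework's definitive code.

import java.util.ArrayDeque;
import java.util.Deque;

public abstract class StepByStepAnalyzer extends TraceAnalyzer {

    private final Deque<TraceEvent> queue = new ArrayDeque<>();

    @Override
    public void notify(TraceEvent event) {
        // Buffer the event instead of displaying it immediately;
        // it will be released when the user asks for the next step.
        queue.add(event);
    }

    public boolean hasNext() {
        return !queue.isEmpty();
    }

    public TraceEvent newStep() {
        return hasNext() ? queue.remove() : null;
    }
}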

4.3 Trace Event

The actual format of the trace events has no influence on the tracer driver mechanisms. The important issue is that events have attributes and that some attributes are specific to the type of event. However, we have adopted a common structure, called TraceEvent, to represent this information inside the Plug and Trace framework; the goal is to make its utilization and extension easy. Outside this framework any kind of trace can be used; in that case, it is necessary to implement the integration by means of the TraceAdapter class. There are just two steps to generate the trace events related to the input process:

• Modeling: we use the UML notation to represent the structure of each trace event;
• Generating: the next step is to generate all classes that will be instantiated when a trace event is produced.

To exemplify the previous steps, suppose that a given input process generates trace events that contain two actions: Action1 and Action2. Action1 has one parameter of type String, param1. Action2 has two parameters of the same type String, param1 and param2. Figure 4.7 shows, in a class diagram, the representation of the trace event for this context. The next step is to generate all concrete classes from the TraceEvent model. The Plug and Trace framework provides a mechanism for model-to-text transformation. We have chosen a Template-Based Approach [Czarnecki and Helsen (2003)] to produce some artifacts in our proposal; a template usually consists of the target text containing splices of metacode to access information from the source and to perform code selection and iterative expansion (see [Cleaveland and Cleaveland (2001)] for an introduction to template-based code generation). The majority of currently available MDE tools support template-based model-to-code generation, e.g., b+m Generator Framework, JET, FUUT-je,


Figure 4.7 An example of a trace event model

Codagen Architect, AndroMDA, ArcStyler, OptimalJ and XDE (the latter two also provide model-to-model transformations). MOFScript [Oldevik (2006)] was our choice to specify and transform trace models into these artifacts. In the following we briefly describe the motivations for choosing MOFScript as the host language for our proposal; this framework covers:

• Generation of text (code) from MOF-based models: The ability to generate text from any MOF-based model, e.g., UML models or any kind of domain model.
• Control mechanisms: The ability to specify basic control mechanisms such as loops and conditional statements.
• String manipulation: The ability to manipulate string values.
• Output of expressions referencing model elements.
• Production of output resources (files): The ability to specify the target file for text generation.
• Traceability between models and generated text: The ability to generate and utilize traceability between source models and generated text, e.g., to provide regeneration.


• Ease of use: An easy-to-use interface.

The following presents a piece of code in the MOFScript language, showing the template that transforms the UML representation of the trace events into Java code. The Plug and Trace repository⁴ contains the complete template of the trace event classes.

texttransformation UML2TraceEvent (in uml:"http://www.eclipse.org/uml2/2.1.0/UML") {
    property ext:String = ".java"

    uml.Model::main () {
        property a1:String = ""
        self.ownedMember->forEach(p:uml.Package) {
            p.mapPackage()
        }
    }

    uml.Package::mapPackage () {
        if (self.name.equals("model")) {
            var pName:String = self.getFullName()
            self.ownedMember->forEach(c:uml.Class)
                c.mapClass(pName)
            self.ownedMember->forEach(i:uml.Interface) {
                i.interfaceMapping(pName)
            }
        }
    }
    ...

    uml.Class::mapClass(packageName:String) {
        var pLocation:String = packageName.replace("\\.", "/")
        file (package_dir + pLocation + "/" + self.name + ext)
        'package ' package_name + "." + packageName ';'
        ...
    }
    ...

⁴ http://code.google.com/p/plugandtrace and http://www.cin.ufpe.br/~roberto/AlunosPG/rafoli

The following contains a piece of the code produced by the model-to-text transformation applied to the example of Figure 4.7, generating the trace classes from the example trace model. The Plug and Trace repository contains all classes generated from the UML2TraceEvent template.

...
public class CHRTrace extends Trace {

    private Integer chrono;
    private Action action;

    public Integer getChrono() {
        return chrono;
    }

    public void setChrono(Integer chrono) {
        this.chrono = chrono;
    }
    ...

4.4 Generic Trace Schema This section describes a generic trace schema. This schema is intended to facilitate the definition of analysis tools in any kind of domain. In this section we define the syntax and the semantics of the generic trace schema. Traces are encoded in an XML format, using the XML Schema described in this document. A trace must be a valid XML document according to this Schema. A trace with specific tracer events should be a valid XML document too and should provide a reference to the corresponding Schema, fully compatible with the one described here. The long term objective of designing a generic trace format is to fully define the communications between input processes and analyzers ensuring full compatibility of all possible analysis tools with all possible input processes. More work and experiments are necessary to fully formalize it. Only one part of this communication is considered in


this document, which includes some simple mechanisms of communication in order to allow further experimentation. The next step is to generate the specific trace schema from the TraceEvent model. The Plug and Trace framework also provides a mechanism for model-to-text transformation. In the following we present a piece of code in the MOFScript language, showing the template that transforms the UML representation of the trace events into a trace schema. The Plug and Trace repository contains the complete template of the trace schemas.

texttransformation UML2TraceSchema (in uml:"http://www.eclipse.org/uml2/2.1.0/UML") {
    property ext:String = ".xsd"

    uml.Model::main () {
        property a1:String = ""
        self.ownedMember->forEach(p:uml.Package) {
            p.mapPackage()
        }
    }

    uml.Package::mapPackage () {
        if (self.name.equals("model")) {
            var pName:String = self.getFullName()
            self.ownedMember->forEach(c:uml.Class)
                c.mapTraceSchema(pName)
        }
        self.ownedMember->forEach(p:uml.Package) {
            p.mapPackage()
        }
        ...
    }

In the following we have applied the model-to-text transformation to the example shown in Figure 4.7 to generate the trace schema from the example trace model.
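The generated schema itself is not reproduced in full here. A minimal sketch of what such a schema could look like for the Figure 4.7 example is given below; the element and type names are illustrative assumptions, not the exact output of the UML2TraceSchema template.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- one trace event: its sequential number and the action performed -->
  <xs:element name="traceEvent">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="chrono" type="xs:integer"/>
        <xs:element name="action" type="actionType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <!-- the two actions of the example model, each with its parameters -->
  <xs:complexType name="actionType">
    <xs:choice>
      <xs:element name="action1">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="param1" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:element name="action2">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="param1" type="xs:string"/>
            <xs:element name="param2" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:choice>
  </xs:complexType>
</xs:schema>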


4.5 The Plug and Trace Process

Here we intend to describe a possible process to extend our framework. It is just a set of guidelines about each step necessary to build an application analysis tool. In fact, the Plug and Trace acts as a small tracer with the main functionalities required by any tracer tool, providing several mechanisms to be extended. To present the process of extending the Plug and Trace framework, we use a top-down vision, i.e., as a starting point we discuss the information that we want to see and how to present it, down to the way the trace information is collected in the input process. Figure 4.8 illustrates this process.

Figure 4.8 The Plug and Trace Process

We have adopted an incremental and iterative process [Fichman and Moses (1999)]; the basic idea is to develop a tracer tool through repeated cycles (iterative) and in smaller portions at a time (incremental), allowing the developer to take advantage of what was learned during the development of earlier portions or versions of the tracer. Learning comes from both the development and the use of the tracer: key steps in the process start with a simple implementation of a subset of the requirements and iteratively enhance the evolving versions until the full tracer is implemented. At each iteration, design modifications are made and new functional capabilities are added.


In the following, we describe the steps of this process, showing its activities and the artifacts produced.

1. Understanding the context. The first step is to comprehend the input process and determine what we want to analyze from it and how the trace information should be presented. In other words, we have to establish the tracer scope and boundary conditions, including an operational vision of the trace analyzers, acceptance criteria and what is intended to be in the tracer and what is not. It is mainly necessary to model the trace events and to instrument the input process.

2. Modeling the trace events. The primary objective is to take the trace information identified so far and model it using a UML representation. The modeling phase is where the tracer starts to take shape: the problem domain is laid out and the model of the tracer gets its basic form. This phase is expected to capture a healthy majority of the requirements related to the trace information to be collected from the input process and to create its class structure. The final modeling step is a translation: at this point the concrete Trace classes (see Section 4.3) and Trace Schema (see Section 4.4) should be generated, using an MDE approach and UML as the modeling language to describe the specification.

3. Instrumenting. Instrumenting is the largest phase in the conception of the tracer tool. In this phase the remainder of the tracer is built on the foundation laid in Modeling. Views (Trace Analyzers) are implemented according to the expected analysis. Each iteration should result in an executable release of the tracer. This phase means instrumenting the input process to produce the trace events following the generated trace schema, and creating the views based on the concrete classes of the trace events produced in the previous phase.

4. Configuring the Plug and Trace framework. After all artifacts are specified and implemented, it is time to configure the Plug and Trace framework with the tool created. The following steps are necessary to execute it and start analyzing the trace events:


• launch the TraceReceiver component, which will be waiting for a connection;
• launch the input process and connect it to the TraceReceiver;
• register all views in the TraceDriver component;
• register the Adapter class; if none is defined, the Plug and Trace framework will use the default XMLTraceAdapter to receive trace events specified in the XML format;
• finally, the input process is ready to be traced, using the views (TraceAnalyzers).

5. Evaluating the analysis. The evaluation has the important goal of monitoring the whole process, to ensure a strong adherence of the tool produced to the requirements specified early in the process. The main evaluation criteria for the elaboration phase involve the answers to these questions:

(a) Are the trace events shown in the views as expected?
(b) Is the architecture well defined?
(c) Does the executable tracer show that the requirements have been addressed and credibly resolved?
(d) Is the tracer tool sufficiently detailed and accurate? Is it backed up by the trace event model?
(e) Do all stakeholders agree that the desired analysis of the input process can be achieved, in the context of the current tracer?

This activity should be performed to verify the quality of the artifacts produced in each step, restarting the process if necessary. Figure 4.9 summarizes the components and artifacts that should be specified (blue), generated (gray) and reused (black).

4.6 Chapter Remarks In this chapter we exhibited the Plug and Trace framework, showing its sub-components and how this framework can be integrated with any kind of domain. Furthermore, we


Figure 4.9 Components and Artifacts involved in the Plug and Trace instrumentation

presented the goals and principles of this project, showing how to define and integrate different kinds of domains in a unique environment, called Plug and Trace. We also defined the generic trace schema that sets the foundation of the interoperability between applications and our framework, by maintaining a unique communication structure. Finally, we presented the Trace Event, showing how to model and generate the concrete classes used by the Trace Analyzers.


5 Extending Plug and Trace to CHR

This chapter describes how to instantiate our tracer framework for a given domain. We discuss which components should be reused and extended by creating a simple debugging tool for Constraint Handling Rules (CHR), a rule-based language. In the following sections we present:

• An introduction to our case study, showing the motivations for choosing this language as the basis of our case study, and also the requirements and features that should be available in this tool.
• In a nutshell, the concepts of rule-based systems, in particular CHR, showing its structure and operational semantics.
• Finally, how to build, step by step, a debugging tool for CHR using the Plug and Trace framework.

5.1 Case Study: A Debugging Tool for CHR

In order to validate the entire Plug and Trace framework we have chosen, as a case study, to create a debugging tool for CHR. The motivation for choosing this language is that CHR provides an elegant general-purpose language with a wide spectrum of application domains. At present there exist a number of useful debugging tools for CHR, for example in ECLiPSe Prolog [Aggoun (2001)], SWI-Prolog [Wielemaker (2006)] or CHROME [Vitorino (2009)]; but these tools were designed and implemented in a specific way for each solver, and not all tools benefit from all the existing ones. Figure 5.1 shows this current scenario: for each CHR solver there is a specific implementation of the debugging tool.


Figure 5.1 Current situation: each solver is strictly connected to the debugging tool. Figure adapted from Langevine et al. (2004).

This way, each implementation results in a set of one-to-one specialized connections between a solver and its tools. If we want to interchange data between solvers, hooks have to be added into the solver code in order to be able to do so. Furthermore, the types of basic information required by a given debugging tool are often not made explicit and may have to be reverse-engineered. This is a non-negligible part of the cost of porting a debugging tool from one constraint programming platform to another. In order to solve the above-mentioned problem and improve the analysis and maintenance of rule-based constraint programs, like CHR, we decided to use this domain as the example and case study to validate the Plug and Trace framework. In the next sections we present, using the Plug and Trace Process (see Section 4.5), how to extend the Plug and Trace framework by creating a debugging tool for CHR.

5.1.1 Understanding the context: CHR by example The first and main step is to understand the context and to decide what and how should the information be presented. This section describes briefly the Constraint Handling Rules environment, showing its concepts and exemplifying this language. The set of constraint handling rules below defines a less-than-or-equal constraint (leq/2). The rules illustrate several syntactical features of CHR. reflexivity

@ leq(X,Y) X=Y | true.

antisymetry

@ leq(X,Y) , leq(Y,X) X=Y.

idempotence

@ leq(X,Y) \ leq(X,Y) true.

transitivity @ leq(X,Y) , leq(Y,Z) leq(X,Z).


This CHR program specifies how leq simplifies and propagates as a constraint. It implements reflexivity, antisymetry, idempotence and transitivity in a straightforward way. The rule reflexivity states that leq(X,Y) simplifies to true, provided it is the case that X = Y. This test forms the (optional) guard of a rule, a precondition on the applicability of the rule. Hence, whenever we see a constraint of the form leq(X,X) we can simplify it to true. The rule antisymetry means that if we find leq(X,Y) as well as leq(Y,X) in the constraint store, we can replace them by the logically equivalent X = Y. Note the different use of X = Y in the two rules: in the reflexivity rule the equality is a precondition (test) on the rule, while in the antisymetry rule it is enforced when the rule fires. (The reflexivity rule could also have been written as reflexivity @ leq(X,X) <=> true.) The rules reflexivity and antisymetry are simplification CHR: in such rules, the constraints found are removed when the rule applies and fires. The rule idempotence is a simpagation CHR; only the constraint to the right of the backslash (\) is removed. The rule says that if we find leq(X,Y) and another leq(X,Y) in the constraint store, we can remove one. Finally, the rule transitivity states that the conjunction leq(X,Y), leq(Y,Z) implies leq(X,Z). Operationally, we add leq(X,Z) as a (redundant) constraint, without removing the constraints leq(X,Y) and leq(Y,Z). This kind of CHR is called propagation CHR. In the next section, we explain how to execute a CHR program, using a theoretical operational semantics called ωt.

Operational Semantics

Operationally, CHR exhaustively applies a set of rules to an initial set of constraints until a fixed point is reached. Our intention is to capture the trace information for every action performed during the execution of a CHR program. To accomplish this goal we will instrument our input process to produce traces during its execution. These traces will follow a theoretical operational semantics, called ωt, as fully defined in [Duck et al. (2004)]. In the next paragraphs, we describe briefly how ωt executes a CHR program and what trace information we want to see in our debugging tool. We begin by defining constraints, rules and CHR programs. We define CT as the constraint theory which defines the semantics of the built-in constraints and thus models the internal solver which is in charge of handling them. We assume it supports at least the equality built-in. We use [H|T] to indicate the first (H) and the remaining (T) terms


in a list, ++ for sequence concatenation and [] for empty sequences. We use the notation a0, . . ., an for both bags and sets. Bags are sets which allow repetitions. We use ∪ for set union, ⊎ for bag union, and ∅ to represent both the empty bag and the empty set. The identified constraints have the form c#i, where c is a user-defined constraint and i a natural number. They differentiate among copies of the same constraint in a bag. We also assume the functions chr(c#i) = c and id(c#i) = i.

An execution state is a tuple ⟨Q, U, B, P⟩n, where Q is the Goal, a bag of constraints to be executed; U is the UDCS (User Defined Constraint Store), a bag of identified user-defined constraints; B is the BICS (Built-in Constraint Store), a conjunction of constraints; P is the Propagation History, a set of sequences, each recording the identities of the user-defined constraints which fired a rule; and n is the next free natural number used to number an identified constraint. The initial state is represented by the tuple ⟨Q, [], true, []⟩1. The transitions are applied non-deterministically until no transition is applicable or the current built-in constraint store is inconsistent. These transitions are defined as follows:

Solve
⟨{c} ⊎ Q, U, B, P⟩n ↦ ⟨Q, U, c ∧ B, P⟩n
where c is a built-in constraint.

Introduce
⟨{c} ⊎ Q, U, B, P⟩n ↦ ⟨Q, {c#n} ⊎ U, B, P⟩n+1
where c is a user-defined constraint.

Apply
⟨Q, H1 ⊎ H2 ⊎ U, B, P⟩n ↦ ⟨C ⊎ Q, H1 ⊎ U, e ∧ B, P′⟩n
where there exists a rule r @ H1′ \ H2′ ⇔ g | C and a matching substitution e such that chr(H1) = e(H1′), chr(H2) = e(H2′) and CT ⊨ B ⊃ ∃(e ∧ g); the sequence id(H1) ++ id(H2) ++ [r] ∉ P; and P′ = P ∪ {id(H1) ++ id(H2) ++ [r]}.

Example. The following is a (terminating) derivation under ωt for the query leq(A,B), leq(B,C), leq(C,A) executed on the leq program. For brevity, P has been removed from each tuple.


⟨{leq(A,B), leq(B,C), leq(C,A)}, ∅, ∅⟩1                                   (1)
↦introduce  ⟨{leq(B,C), leq(C,A)}, {leq(A,B)#1}, ∅⟩2                       (2)
↦introduce  ⟨{leq(C,A)}, {leq(A,B)#1, leq(B,C)#2}, ∅⟩3                     (3)
↦apply      ⟨{leq(C,A), leq(A,C)}, {leq(A,B)#1, leq(B,C)#2}, ∅⟩3           (4)   (transitivity X = A ∧ Y = B ∧ Z = C)
↦introduce  ⟨{leq(C,A)}, {leq(A,B)#1, leq(B,C)#2, leq(A,C)#3}, ∅⟩4         (5)
↦introduce  ⟨∅, {leq(A,B)#1, leq(B,C)#2, leq(A,C)#3, leq(C,A)#4}, ∅⟩5      (6)
↦apply      ⟨∅, {leq(A,B)#1, leq(B,C)#2}, {A = C}⟩5                        (7)   (antisymetry X = C ∧ Y = A)
↦apply      ⟨∅, ∅, {A = C, C = B}⟩5                                        (8)   (antisymetry X = C ∧ Y = A)

No more transition rules are applicable, so this is the final state. Given this operational semantics, we want to produce trace information about our input process following these three actions: Solve, Introduce and Apply. Furthermore, we also want to record the Initial State, given as the starting point of the execution of a CHR program.

5.1.2 Modeling the trace events: ωt

After deciding what information should be presented and how, in this case following the operational semantics ωt, we model it. This step consists in mapping the desired trace events into a UML representation. It will be used to generate the following two artifacts:

• the concrete classes, representing the model of the ωt trace events;
• the trace schema of CHR. Each event sent from the input process should follow this trace schema.

Figure 5.2 shows the operational semantics ωt in a class diagram. In other words, this model is an object-oriented mapping of the operational semantics ωt. The root element of our trace model is the CHRTrace class, which represents an event produced by the input process. Each event has an action that says which operation was performed: InitialState, Introduce, Solve or Apply. After the model has been specified, we use the templates to generate the classes of the trace events and the trace schema. This is done by executing the templates UML2TraceEvent.m2t and UML2TraceSchema.m2t, respectively. The following code shows one of the artifacts generated from the model transformation.


Figure 5.2 ωt model: the CHRTrace class (chrono: Integer) has an associated Action, which is one of InitialState (goal: String), Introduce (udc: String), Solve (bic: String, builtIns: String) or Apply (rule: String).

package chr.model;

/*
 * Generated class - CHRTrace
 *
 * @author MOFScript generator 'UML2TraceEvent'
 * @date 4/4/2010
 */
import core.Trace;

public class CHRTrace extends Trace {

    /* Attributes */
    private Integer chrono;
    private Action action;

    public Integer getChrono() {
        return chrono;
    }

    public void setChrono(Integer chrono) {
        this.chrono = chrono;
    }

    public Action getAction() {
        return action;
    }

    public void setAction(Action action) {
        this.action = action;
    }

} // End of class CHRTrace

Finally, the trace schema is also generated from the trace model.
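The full generated schema is not reproduced here. A minimal sketch of what such an ωt trace schema could look like, assuming element names that mirror the classes of Figure 5.2 (the actual generated names may differ), is the following.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="chrTrace">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="chrono" type="xs:integer"/>
        <!-- exactly one of the four omega_t actions per trace event -->
        <xs:choice>
          <xs:element name="initialState">
            <xs:complexType>
              <xs:sequence><xs:element name="goal" type="xs:string"/></xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="introduce">
            <xs:complexType>
              <xs:sequence><xs:element name="udc" type="xs:string"/></xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="solve">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="bic" type="xs:string"/>
                <xs:element name="builtIns" type="xs:string"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="apply">
            <xs:complexType>
              <xs:sequence><xs:element name="rule" type="xs:string"/></xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>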




The next section describes how to instrument the input process to produce traces (following the trace schema) and presents two simple analyzers to debug a CHR program.

5.1.3 Instrumenting: A Debugging Tool for Eclipse Prolog In this section we intend: • to instrument the input process, Eclipse Prolog, to produce trace events following a given trace schema for CHR, defined in the previous section. • to create two simple views to analyze the execution of CHR programs The first step to send events to the Plug and Trace framework is to connect on it. Supposing we are running the Plug and Trace, through the TraceSocket class, in the port 2004, the following code shows how Eclipse Prolog will connect to this framework. start_connection :socket(internet, stream, eclipse_to_java), connect(eclipse_to_java, localhost/2004).

After that, it is necessary to capture from the input process the actions we want to analyze. As described in the operational semantics ωt, the action Introduce means a constraint added to the constraint store. The following code exemplifies the extraction of this action from the input process.

treat_chr_dbg(add_one_constraint(Nb, Constraint)) :-
    printf(debug_output, "ADD (%d) %p\n", [Nb, Constraint]),
    write(Constraint),
    send_trace(introduce, Constraint).


Finally, we show the code that produces a trace event following the trace schema for CHR in Eclipse Prolog.

send_trace(Action, Parameters) :-
    strToBytes("1.0", Ver),
    strToBytes("UTF-8", UTF8),
    strToBytes("1.6.0_04", JavaVersion),
    strToBytes("java.beans.XMLDecoder", Class),
    strToBytes("chr.model.CHRTrace", CHRTrace),
    strToBytes("chrono", ChronoProperty),
    strToBytes("0", Chrono),
    strToBytes("action", ActionProperty),
    strToBytes("udc", UDCProperty),
    strToBytes("chr.model.Introduce", Introduce),
    termToBytes(Parameters, UDC),
    xml_parse(OutXML,
        xml([version=Ver, encoding=UTF8],
            [element(java, [version=JavaVersion, class=Class],
                [element(object, [class=CHRTrace],
                    [element(void, [property=ChronoProperty],
                        [element(int, [], [pcdata(Chrono)])]),
                     element(void, [property=ActionProperty],
                        [element(object, [class=Introduce],
                            [element(void, [property=UDCProperty],
                                [element(string, [], [pcdata(UDC)])])])])])])])),
    string_list(S, OutXML, bytes),
    concat_string([S, "\r\nEOF\r\n"], Out),
    write(eclipse_to_java, Out),
    flush(eclipse_to_java).

After the instrumentation of the input process, extracting the trace events, is completed, it is time to create the views to analyze the trace events produced. To exemplify the analysis we create the following views:

• A pretty-printing view, just to show the states achieved after each action execution;
• A view to show, step by step, the execution of CHR programs.

Figure 5.3 presents these two kinds of analyzers: View1 shows the evolution of the CHR parameters (Goal, Constraint Store, Built-ins, etc.) defined in the trace schema (Section 4.4); and View2 focuses on a specific rule when that rule is triggered.

Figure 5.3 Visualizing a CHR execution

The View1 extends the TraceAnalyzer class: for each trace event sent by the input process this view is notified immediately, through the method notify(TraceEvent event). The following code shows the main part of View1.


...
public class PrettyPrinting extends TraceAnalyzer {
    ...
    @Override
    public void notify(TraceEvent event) {
        txtArea.append(event.toString() + "\n");
    }
    ...
}

The View2 extends the StepByStepAnalyzer. In this case, the view maintains a queue of all trace events received; a trace can be obtained by calling the method newStep(), as described in the following code.

...
public class StepByStep extends StepByStepAnalyzer {

    public StepByStep() {
        JButton newStepBtn = new JButton("new Step");
        newStepBtn.addActionListener(new ActionListener() {
            public void actionPerformed(ActionEvent ev) {
                if (hasNext())
                    setCHRAction(newStep());
            }
        });
        ...
    }
    ...
}

5.1.4 Configuring the Plug and Trace framework: Connecting all pieces

Basically, there are four configuration steps to execute the Plug and Trace framework and to start getting the trace events:

1. To register the trace analyzer, by invoking the registerAnalyzer method from the TraceDriver class;


2. To set the FilterChain that will filter the requests sent by the trace analyzers;

3. To initialize the Listener class, in this case the TraceSocket class, saying which trace adapter will be used to integrate the input process with the Plug and Trace framework;

4. To run the TraceSocket, listening to events sent by the input process.

The following code presents a simple way to execute the Plug and Trace framework integrated with the CHR environment.

...
public class CHRTest {

    public static void main(String[] args) {
        TraceDriver.getDriver().registerAnalyzer(new StepByStep());
        TraceDriver.getDriver().registerAnalyzer(new PrettyPrinting());
        TraceDriver.getDriver().setFilterChain(new CHRFilterChain());

        TraceSocket server = new TraceSocket(new XMLTraceAdapter());
        while (true) {
            server.run();
        }
    }
}

After we run the leq example, Eclipse Prolog starts to send trace events to the Plug and Trace framework, following the trace schema defined in Section 5.1.2. The next code shows an instance of this trace schema, exemplifying the action Introduce.

...
<object class="chr.model.CHRTrace">
  <void property="chrono"><int>0</int></void>
  <void property="action">
    <object class="chr.model.Introduce">
      <void property="udc"><string>leq(A,B)</string></void>
    </object>
  </void>
</object>
...

Finally, Figure 5.4 shows the execution of a CHR program, in this case the leq example.

5.1.5 Evaluating the Analysis

In this case study, we produced a simple debugger to understand a CHR execution in terms of the parameters discussed in the ωt semantics. Starting from the generic Plug and Trace framework, a previously unfamiliar domain could be specified, in terms of traces, and analyzed. As our intent was simply to build a viewer for CHR programs and to demonstrate the Plug and Trace framework, the functionalities available in this tool boil down to a step-by-step analysis and a pretty-printing exhibition of the CHR parameters. By extending the TraceAnalyzer component, it is possible to create more advanced kinds of views to understand a CHR execution in depth, for example, not only showing the evolution of the state of a CHR program, but also creating a user-friendly debugging tool for CHR with reasoning explanation facilities. The final step is to evaluate the trace events produced and the trace analyzers created.


Figure 5.4 Running the CHR debugging tool

If some change in the trace model is necessary, alter it and regenerate the concrete trace classes and trace schema, or review the code of the trace analyzer. Otherwise, restart the Plug and Trace Process with a better understanding of the context, reviewing each step specified.

5.2 Chapter Remarks

In this chapter we presented how to instantiate our trace framework for the CHR domain, following the Plug and Trace Process. The case study showed how to instrument the input process, model the trace events, create two views and integrate all of these pieces in the Plug and Trace framework. It is important to remark that our goal was not to produce a full-featured debugging tool for CHR; our intention was only to present the Plug and Trace framework and the process of instantiating it. In order to facilitate the usage of the whole framework, it is necessary to create some pre-defined artifacts, such as trace adapters to convert from other kinds of domains, trace schemas and trace analyzers. A detailed description of the limitations and future work is presented in Section 7.2.


6 Related Work

The next sections present the related work, highlighting the differences from our approach. The first project is TPTP, a framework that extends the family of Eclipse technologies to provide an open development platform supplying frameworks and services for test and performance tools that are used throughout the lifecycle. The second one is dynaTrace, with commercial and open-source load testing solutions to reduce manual problem reproduction and to rapidly diagnose Java/.NET issues with code-level performance diagnostics. The third project is TAU, a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java and Python. Finally, we present InfraRED, a tool that is focused on monitoring the performance of J2EE applications. The main difference between the Plug and Trace project and these related works is that we specify and provide services to integrate with any kind of application.

6.1 Eclipse TPTP

The Eclipse Test & Performance Tools Platform Top-Level Project (TPTP) [Eclipse (2007)] is an open source project that allows developers to build project-specific test and performance analysis tools. These frameworks and services include testing, tracing and profiling, tuning, logging, monitoring, analysis and administration. The project also aims to deliver exemplary tools that not only verify the utility of the platform but also serve to illustrate how it can be used, as well as support the development and maintenance of the platform itself. The objective is to ensure that these tools are extensible and can easily be integrated with as wide a range of IT systems/technologies as possible. TPTP addresses the entire test and performance lifecycle, from early testing to production application monitoring, including test editing and execution, monitoring, tracing

64

6.1. ECLIPSE TPTP

tracing and profiling, and log analysis capabilities. The platform currently supports a broad spectrum of computing systems, including embedded, standalone, enterprise and high-performance systems, and will continue to expand support to encompass the widest possible range of systems.

Figure 6.1 TPTP Project Architecture

The TPTP architecture is organized into four distinct projects (see Figure 6.1):

• TPTP Platform Project: provides the core framework upon which the development of the monitoring, testing, tracing and profiling tools relies. It provides a common user interface, standard data models, data collection and communications control, as well as remote execution environments.

• TPTP Monitoring Tools Project: addresses the monitoring and logging phases of the application lifecycle, collecting and analyzing system and application resources. The innovative log analysis tools can
correlate disparate logs from multiple points across an environment. The project also includes exemplary tools for monitoring application servers and system performance, such as CPU and memory utilization.

• TPTP Testing Tools Project: addresses the testing phase of the application lifecycle. It contains test editors, deployment and execution of tests, execution environments, and associated execution history analysis and reporting. The project also includes exemplary tools: a JUnit-based component testing tool, a Web application performance testing tool, and a manual testing tool.

• TPTP Tracing and Profiling Tools Project: addresses the tracing and profiling phases of the application lifecycle. It includes exemplary profiling tools for both single-system and distributed Java applications through a JVMPI monitoring agent that collects trace and profile data (the Java Virtual Machine Profiler Interface, JVMPI, is intended for tool vendors to develop profilers that work in conjunction with Sun's Java virtual machine implementation). A generic toolkit for probe insertion is also available.

6.1.1 Strengths

Eclipse TPTP provides a flexible and extensible framework for creating and managing tests, deployments, datapools, execution histories and reports, with extensions for performance, JUnit, GUI and manual testing of Java applications.

6.1.2 Weaknesses

As a tool focused on application analysis, TPTP has drawbacks related to user-friendliness, performance, and the depth of the information it provides. The framework contains views, dialogs and action items that support collecting and analyzing application performance information. The project includes exemplary profiling tools for both single-system and distributed Java applications through monitoring agents that collect trace and profile data.

Although TPTP looks like a very impressive project, the tools have a few drawbacks: the line coverage tool, as of now, works only with JUnit test cases (JUnit is a unit testing framework for the Java programming language), so if the test cases are written using any other framework the line coverage will not work correctly; the TPTP JUnit test cases do not support testing of server-side applications; the GUI testing tool can only listen to SWT events (the Standard Widget Toolkit, a graphical widget toolkit for the Java platform), so applications outside the Eclipse workbench (such as Windows Explorer, Internet Explorer, or even Windows dialog boxes inside the Eclipse workbench) cannot be tested using this tool; and most of the performance analysis tools currently target web applications and are not ideally suited for standalone applications.

6.2 dynaTrace

dynaTrace software [dynaTrace software (2010)] was founded in 2005 to address the new Application Performance Monitoring (APM) requirements driven by the rapid acceleration of complexity in new development techniques, new application architectures and increasingly complex production environments. It is the industry's first third-generation approach to application performance management. Monitoring is only the beginning: dynaTrace combines business transaction management (BTM), traditional monitoring, deep diagnostics from business transaction down to code level for every transaction, and proactive prevention across the application development lifecycle into a single integrated system. The system is innovative, easy to use, and provides value far beyond traditional APM tools.

dynaTrace has made several breakthroughs which, taken in combination, provide the unique power of this system. These breakthroughs are required for today's advanced applications and anticipate the requirements of tomorrow's virtualized data centers and cloud computing. Below is a brief discussion of these breakthroughs; it is not intended to be a complete description of the dynaTrace solution, but rather an introduction to dynaTrace's innovative approach to APM.

Figure 6.2 dynaTrace Architecture

The following items describe the dynaTrace architecture:

• Knowledge Sensors: KnowledgeSensors mark a transaction's progress along its execution path and identify all transaction entry points (e.g., Java Servlet invocations) and method calls, as well as their sequence and nesting. For each transaction, the KnowledgeSensors record not only pure performance metrics (e.g., response times, CPU usage) of an execution path, but also contextual information (e.g., method arguments, return values, exceptions, log events, IO usage, network traffic, objects created, SQL calls, remote calls, and synchronization delays) in order to enable precise root-cause analysis. In this way, KnowledgeSensors provide all the data included in PurePaths. They can also be subscribed as dedicated performance monitors to allow time-based performance analysis. To support a smooth deployment and easy administration, several KnowledgeSensors can be packaged into a single KnowledgeSensorPack.

• dynaTrace Agents: The lightweight dynaTrace Agent injects the instrumented byte/IL-code (original code plus KnowledgeSensors) into the target application automatically; no source code changes are required. The dynaTrace point-and-click auto-sensor assistant and visual class browser with auto-discovery help to maximize visibility with minimum instrumentation and overhead. The level of detail for code-level transaction tracing can also be adjusted on the fly, without restarting the target application, using HotSensorPlacement. The dynaTrace Agent is also used to gather memory and thread dumps. It can be deployed and managed from a central location and requires only minimal system resources for sustained 24x7 operation. Finally, the dynaTrace Agents are capable of collecting dynamic monitoring data in-process from JVMs and application servers via JMX.

• dynaTrace Collector: The dynaTrace Collector instruments the target application by adding KnowledgeSensors into its byte/IL-code. It also reduces network payload and provides data security between the target application and the dynaTrace Server, since it provides data compression and strong encryption. Thus, the Collector reduces the memory and CPU usage needed on the dynaTrace Server to collect all
the diagnostics data. The dynaTrace Collector allows dynaTrace to efficiently scale from small application deployments to very large server clusters. It also enables global end-to-end transaction tracing across applications deployed over WAN environments, such as SOA applications or applications accessed through rich/fat clients deployed in, e.g., remote branch offices. Additionally, the Collector also executes OSGi-based monitoring plugins (e.g., Unix, Windows, SNMP monitors) and forwards the results to the dynaTrace Server.

• dynaTrace Server: The dynaTrace Server collects all diagnostics data, including transaction traces, monitors and memory/thread dumps. The Server centrally creates the PurePaths, which may span distributed JVMs/CLRs, and derives all aggregations from them, while preserving the atomic PurePath information. This allows the overhead on the target application to be sustained at a very low level of only 3-5%, while providing deep application visibility on a transaction-by-transaction basis. This makes dynaTrace ideally suited for usage in load testing and 24x7 production environments, even in large clustered application environments.

• dynaTrace Repository: The dynaTrace Repository stores historical performance data for forward- and backward-looking long-range analysis.

• dynaTrace Client: By providing an intuitive, platform-independent user interface, the dynaTrace Client guides IT employees through the processes of managing application performance.

• Integrations API: Open interfaces allow easy integration into existing environments, such as continuous integration, load testing, or enterprise management systems. In addition, dynaTrace can easily be extended with custom instrumentation sensors and OSGi-based open source plugins, such as monitors, actions or notifications.

6.2.1 Strengths

dynaTrace is a good solution for heterogeneous cross-platform applications (J2SE/J2EE, .NET) and for tracing distributed transactions across multiple tiers in a single view (e.g., in SOA applications). It is easy to use, with fully configurable views and dashboards that simplify data interpretation.


6.2.2 Weaknesses

The main drawback of this tool is that, currently, it is not possible to use a single tool to get all the information you need, because it is not yet possible to collect all information using one single technology. JavaScript injection is easy to roll out; however, it has limitations regarding the data it can provide. Browser plug-ins enable the deepest insight into browser behavior, but require explicit deployment (not to mention the roll-out challenge). Network sniffers, while able to capture and correlate the total traffic of a user, have no insight into the browser. Synthetic transactions basically serve a slightly different purpose.

6.3 TAU Performance System

The Tuning and Analysis Utilities (TAU) Performance System [Shende and Malony (2006)] is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, Java and Python. TAU is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements. All C++ language features are supported, including templates and namespaces. The API also provides selection of profiling groups for organizing and controlling instrumentation. The instrumentation can be inserted in the source code using an automatic instrumentor tool, at runtime in the Java virtual machine, or manually using the instrumentation API.

TAU's profile visualization tool provides graphical displays of all the performance analysis results, in aggregate and single node/context/thread forms. The user can quickly identify sources of performance bottlenecks in the application using the graphical interface. In addition, TAU can generate event traces that can be displayed with the Vampir, Paraver or JumpShot trace visualization tools.

Figure 6.3 TAU Architecture

The TAU framework architecture is organized into three layers: instrumentation, measurement, and analysis; within each layer multiple modules are available and can be configured in a flexible manner under user control. TAU supports a flexible instrumentation model that allows the user to insert performance instrumentation, calling the TAU measurement API, at multiple levels of program code representation, transformation, compilation, and execution. The key concept of the instrumentation layer is that this is where performance events are defined. The instrumentation mechanisms in TAU support several types of performance events, including events defined by code location (e.g., routines or blocks), library interface events, system events, and arbitrary user-defined events. TAU is also aware of events associated with message passing and multi-threaded parallel execution. The instrumentation layer is used to define events for performance experiments; thus, one output of instrumentation is information about the events of a performance experiment, which will be used by other tools.

The framework approach to TAU's architecture design guarantees the greatest flexibility in configuring TAU capabilities to the requirements of the parallel performance experimentation and problem solving the user demands. In addition, it allows TAU to extend these capabilities to include the rich technology being developed by other performance tool research groups.

6.3.1 Strengths

TAU is a portable and scalable parallel profiling solution, and a great tool for multiple profiling types and options, event selection and control (enabling/disabling, throttling), online profile access and sampling, and online compensation of performance profiling overhead. The TAU performance system supports performance analysis in various ways, including powerful selective and multi-level instrumentation, profile and trace measurement modalities, interactive performance analysis, and performance data management. The entire TAU software is available in the public domain and is actively being maintained and updated.

6.3.2 Weaknesses

The TAU set of tools is a high-performance computing testing framework that has recently been updated to collect full statistics from a Java Virtual Machine, introducing platform independence. This has been used to create a Java profiler that can selectively instrument and measure parallel and distributed Java applications. The tool's feature set is comprehensive, and its standard interface allows it to be used on a number of platforms. However, all measurement is confined to within the virtual machine; TAU cannot measure external parameters such as system-level statistics, databases, etc.

6.4 InfraRED

InfraRED [InfraRED (2010)] is a tool for monitoring the performance of a J2EE application and diagnosing performance problems. It collects metrics about various aspects of an application's performance and makes them available for quantitative analysis of the application. InfraRED has the ability to monitor the complex architecture of J2EE application environments, provide detailed information for analysis and reporting, alert on performance-related problems, and guide you to determine the root cause of a problem. When you are trying to identify a performance issue that is causing your production application not to meet customer expectations, or you are trying to proactively identify issues prior to deploying your application, InfraRED is essential to help you save time and, ultimately, ensure a better performing, more scalable Java application.

InfraRED uses AOP to weave the performance monitoring code into the application, and it can also be made to work with other AOP frameworks such as JBoss AOP. InfraRED essentially consists of three modules:

• Agent

• Collector

• GUI

Figure 6.4 InfraRed Architecture

The agent is embedded in the application server JVM, and instrumented applications can be run on this JVM. By instrumenting an application we mean weaving (adding) code into the application so that it makes calls to the agent before and after the execution of significant joinpoints. This weaving is achieved using one of the available AOP systems; at this point InfraRED supports AspectWerkz and AspectJ. Based on the calls from the instrumented applications, the agent collects statistics regarding the execution of the applications. This information is aggregated and sent over the network to the collector.

The collector gets statistics from agents residing on various JVMs. It aggregates this data and stores it periodically in a database.

The GUI is attached to the collector. It queries the collector and displays the collected statistics to a user. At this point the GUI is provided as a browser-based application, but other forms of GUI can also be written.
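The aspect below is a generic sketch of this weaving idea, written in the annotation style of AspectJ 5; it is not InfraRED's actual instrumentation code, and the package name com.example.shop used in the pointcut is purely illustrative.

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

// Generic timing aspect (illustration only): the advice is woven around every
// public method under the hypothetical com.example.shop package and records
// how long each call took.
@Aspect
public class MethodTimingAspect {

    @Around("execution(public * com.example.shop..*.*(..))")
    public Object time(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.nanoTime();
        try {
            return joinPoint.proceed();   // run the intercepted method
        } finally {
            long elapsedNanos = System.nanoTime() - start;
            // A real agent would aggregate this figure and send it to a collector;
            // here it is simply printed.
            System.out.println(joinPoint.getSignature() + " took " + elapsedNanos + " ns");
        }
    }
}

Because the timing code is woven in at compile or load time, the application sources themselves remain untouched, which is what makes this style of monitoring non-intrusive.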

6.4.1 Strengths

InfraRED provides statistics about various aspects of an application's performance (method timing, JDBC and SQL statistics, HTTP responses). The approach is non-intrusive to the overall development of the application, i.e., developers need not edit any of their source files to monitor an application with InfraRED. InfraRED provides an easy-to-use Web UI for analyzing the metrics. Because of its easy setup and low overhead, it can be used in all environments, from the performance lab to production environments. Finally, InfraRED is free and open-source.

6.4.2 Weaknesses

Govindraj et al. [Govindraj et al. (2001)] report that the performance overhead with InfraRED is usually between 1% and 5% of response times for a variety of enterprise Web applications, and up to 10% if call trees are traced. InfraRED is a great monitoring tool for J2EE, but it supports only certain application servers.

6.5 Chapter Remarks

To the best of our knowledge, the Plug and Trace framework is the first work that attempts to integrate component-based development with model-driven engineering to provide a domain-independent tracer framework. However, some ideas and techniques used in this work have been studied in and derived from previous research. The tools mentioned above, Eclipse TPTP, dynaTrace, TAU and InfraRED, have provided us with insights about application analysis and about how to integrate different applications into a common environment. We highlight the dynaTrace tool, because it has a well-defined architecture to analyze a wide spectrum of applications. Plug and Trace goes one step further: our work provides a mechanism to integrate any kind of domain into a unique architecture. On the other hand, the TPTP tool was developed in a great environment, the Eclipse IDE, which is gaining widespread acceptance in both commercial and academic settings. Since one of the requirements of our project is to support this IDE, Chapter 5 showed how to extend our framework to a specific domain, CHR, and all source code was designed and implemented on top of this IDE. The other tools, TAU and InfraRED, are important application analysis tools, each within its respective goals, but our work took very little advantage of their features.


7 Conclusion

In this work we presented the Plug and Trace project, a generic architecture to produce and analyze trace information. This project intends to facilitate application analysis for today's applications, which have become ever more heterogeneous and globally distributed. Service-oriented environments with components built by third parties, including commercial and open-source software, are commonplace. In a nutshell, this project provides an integrated environment to analyze any kind of domain. This was achieved by providing the first domain-independent, model-driven and reusable debugging framework.

In Chapter 4, we presented the goals and principles of this project, showing how to define and integrate different kinds of domains in a unique environment, called Plug and Trace. Furthermore, we defined the generic trace schema that sets the foundation of the interoperability between applications and our framework, by maintaining a unique communication structure. Finally, we presented the Trace Event, showing how to model and generate the concrete classes used by the Trace Analyzers.

In Chapter 5, we exemplified step by step how to extend the Plug and Trace framework to the CHR domain, providing a debugging tool for this declarative language. All the extension steps were carried out following the Plug and Trace Process defined in Section 4.5.

7.1 Contributions

We can list the following contributions:


7.1.1 Contributions to Application Analysis

Over the last years, application complexity has accelerated dramatically, not just in terms of scale, sophistication and architecture, but also in development techniques, production environments, and performance expectations. The cumulative impact of these changes has created a scenario that requires a re-evaluation of traditional approaches towards application analysis. Without a new approach, the great opportunity presented by new development and production technologies may become a nightmare for guaranteeing the proper management of these applications.

Plug and Trace overcomes the limitations of traditional application analysis tools, as a reusable solution, meeting the new application analysis requirements to solve today's increasingly complex application management challenges. No longer simply "another tool", the Plug and Trace framework redefines how tracer tools are built. It is a single framework supporting the entire lifecycle, providing the basic artifacts to build pluggable tracers into even the most complex domains. Summarizing the contributions, Plug and Trace enables you to:

• Integrate with any domain, addressing the entire application lifecycle, from early testing to production application management;

• Automate to do more with less: from the trace model, the framework generates almost all the artifacts necessary to build a tracer tool, eliminating redundant and manual tasks;

• Improve time to market for new or enhanced tracers: by using the simple and well-defined trace architecture, it is possible to accelerate the conception of tracer tools, avoiding end-of-project architecture overhauls that cause significant delays;

• Build on a simple component-based architecture: Plug and Trace is an intuitive framework that is easy to reuse and operate. This project built a generic, extensible, standards-based tool platform upon which software developers can create specialized, differentiated, and interoperable offerings for world-class analysis tools;

• Reduce mean time to repair: through an MDE approach, the source artifacts are maintained in the model and the tracer artifacts are generated from this model, accelerating resolution/maintenance time;

• Use a trace request, which means the part of the trace that the trace analyzer wants
to see. In other words, it consists of receiving all the execution events and analyzing them on the fly to show only the interesting information.
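A minimal, self-contained sketch of this last idea is given below; the names used (TraceRequest, onEvent, the "apply" port) are illustrative assumptions only, since the actual query mechanism is specified just at a high level (see Section 7.2).

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a trace request used as an on-the-fly filter:
// only the events matching the request are kept for the analyzer.
public class TraceRequestFilter {

    // A trace request is modelled here as a simple predicate over events.
    public interface TraceRequest {
        boolean matches(String port, String data);
    }

    private final TraceRequest request;
    private final List<String> forwarded = new ArrayList<String>();

    public TraceRequestFilter(TraceRequest request) {
        this.request = request;
    }

    // Called for every execution event; non-matching events are discarded.
    public void onEvent(String port, String data) {
        if (request.matches(port, data)) {
            forwarded.add(port + ": " + data);
        }
    }

    public List<String> getForwarded() {
        return forwarded;
    }

    public static void main(String[] args) {
        // Request only the events produced at the (hypothetical) "apply" port.
        TraceRequestFilter filter = new TraceRequestFilter(new TraceRequest() {
            public boolean matches(String port, String data) {
                return "apply".equals(port);
            }
        });
        filter.onEvent("add", "leq(X,Y)");
        filter.onEvent("apply", "transitivity");
        System.out.println(filter.getForwarded()); // prints [apply: transitivity]
    }
}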

7.1.2 Other Related Contributions

A proposal for applying the Trace Meta-Theory [5] to component-based development was made by means of three components for trace analysis. These components make it easy to capture all the relevant metrics and accompanying context data required to accurately analyze any application using an integrated environment, going beyond the traditional methods currently available.

This work provided a platform-independent model (PIM) that specifies generic components for tracing. Independent of any platform, the model describes the trace events of the application without referring to any specific operating system, hardware configuration, or even programming language. A PIM makes it easier to integrate applications and facilitates the extension of our framework to specific domains. A debugging tool was also developed as an Eclipse plug-in that provides reusable and extensible components to analyze CHR programs.

Regarding MDE, we have focused on a framework based on four principles: quality, consistency, productivity and abstraction. Our intention is to reduce hand coding and promote the reusability of the entire framework. To achieve this we have provided a framework to generate Java classes and XML Schemas from UML models. Below are the four benefits derived from using MDE in our proposal:

• Quality: firstly, we want the output code to be at least as good as what we would have written by hand. The template-based approach we have adopted, as in today's generators, builds code that is easy to read and debug. Because of the active nature of the generator, bugs found in the output code can be fixed in the template; the code can then be re-generated to fix those bugs across the board.

• Consistency: second, we want the code to use consistent class, method, and argument names. This is also an area where generators excel because, after all, there is a program writing the source code.

• Productivity: third, our approach should make it faster to generate the code than to write it by hand. This is the first benefit that most people think of when it comes to generation. Normally, it may not be achieved on the first generation cycle; the real productivity value comes later, as you re-generate the code base to match
changing requirements; at this point the generation approach leaves the hand-coding process far behind in terms of productivity.

• Abstraction: finally, we should be able to specify the design in an abstract form, free of implementation details. That way we can re-target the generator at a later date if we want to move to another technology platform.

Through a step-by-step analyzer and a pretty-printing view, we not only showed how to extend the Plug and Trace project, but also provided a user-friendly debugging tool for CHR with reasoning explanation facilities. A generic trace schema for CHR was also provided. This generic trace will also enable any debugging tool to be defined almost independently from finite domain solvers and, conversely, tracers to be built independently from these tools.
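To make the code-generation benefits above more concrete, the fragment below sketches the kind of Java class the tool chain is intended to generate for one trace event type. It is written by hand here purely to illustrate the stated principles (consistent naming, simple accessors, serialization matching the trace schema); it is not actual generator output, and the field names and XML shape are assumptions.

// Hand-written illustration of what a generated trace event class could look like;
// the field names (chrono, constraint) and the XML shape are assumptions.
public class SolveTraceEvent {

    private final long chrono;        // sequential event number in the trace
    private final String constraint;  // textual form of the constraint being solved

    public SolveTraceEvent(long chrono, String constraint) {
        this.chrono = chrono;
        this.constraint = constraint;
    }

    public long getChrono() {
        return chrono;
    }

    public String getConstraint() {
        return constraint;
    }

    // Serializes the event in an XML form meant to be validated by the
    // corresponding generated XML Schema.
    public String toXml() {
        return "<event port=\"solve\" chrono=\"" + chrono + "\">"
                + "<constraint>" + constraint + "</constraint></event>";
    }
}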

7.2 Limitations and Future Work

In terms of limitations and future work, we organize them into three areas:

1. Interoperability: In the Plug and Trace framework there is an adapter to promote the integration of different kinds of applications. We have provided just a simple adapter to convert XML traces into Trace Events, called XMLTraceAdapter. It would be interesting to create other kinds of adapters, as many as possible; providing several adapters up front would reduce the steps needed to extend our framework, for example: from C to Trace Events, from CSV to Trace Events, from UML to Trace Events, etc. (a minimal sketch of such an adapter is given after this list);

2. Application Analysis: In Chapter 5 we provided a simple debugging tool to analyze CHR programs, just to validate and exemplify the process of extending Plug and Trace. A deeper study can be derived from it, providing several generic analyzers for other kinds of domains, such as application response measurement, application service management, business transaction management, integrated business planning, network management, system administration, systems management, and website monitoring.


Due to time limitations, we have not implemented the trace querying mechanism; this work only provides its high-level specification. As future work, a GUI to interactively submit queries and inspect solution explanations at various levels of detail can be implemented to provide the inverse communication inside the Plug and Trace framework, which will permit the analyzers to receive only the trace information requested.

3. Generic Trace Schemas: When an input process sends its trace information, each event follows a pre-defined trace schema. This schema defines the trace event structure understandable by the next components in the Plug and Trace framework. In this work we provided a trace schema for the Constraint Handling Rules domain. To facilitate the integration and adoption of the Plug and Trace framework, other kinds of trace schemas can be specified, such as: a JEE Trace Schema, providing tracer functionality for fault-tolerant, distributed, multi-tier Java software based largely on modular components running on an application server; a Parallel Computing Trace Schema, tracing a form of computation in which many calculations are carried out simultaneously; an Application Response Measurement Trace Schema; an Application Service Management Trace Schema; a Business Transaction Management Trace Schema; an Integrated Business Planning Trace Schema; a Network Management Trace Schema; a System Administration Trace Schema; a Systems Management Trace Schema; a Website Monitoring Trace Schema; among other schemas.
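As an example of the kind of additional adapter mentioned in item 1 above, the sketch below converts CSV lines of the form chrono,port,data into simple trace events. The class and field names are illustrative only and do not correspond to existing framework classes.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a possible CSVTraceAdapter: each CSV line
// "chrono,port,data" becomes one simple trace event.
public class CsvTraceAdapter {

    // Stand-in event type used only for this illustration.
    public static class SimpleTraceEvent {
        public final long chrono;
        public final String port;
        public final String data;

        public SimpleTraceEvent(long chrono, String port, String data) {
            this.chrono = chrono;
            this.port = port;
            this.data = data;
        }
    }

    public List<SimpleTraceEvent> adapt(List<String> csvLines) {
        List<SimpleTraceEvent> events = new ArrayList<SimpleTraceEvent>();
        for (String line : csvLines) {
            String[] fields = line.split(",", 3);   // chrono, port, data
            events.add(new SimpleTraceEvent(
                    Long.parseLong(fields[0].trim()),
                    fields[1].trim(),
                    fields[2].trim()));
        }
        return events;
    }
}

A collection of such ready-made adapters (C, CSV, UML, and so on) would make extending the framework mostly a matter of choosing the right adapter rather than writing a new one.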


Bibliography

Aggoun, A., et al. (2001). ECLiPSe User Manual, Release 5.3. ECLiPSe Prolog.
Aoyama, M. (1998). New age of software development: How component-based software engineering changes the way of software development. In International Workshop on CBSE. Citeseer.
Brown, A. (2004). Model driven architecture: Principles and practice. Software and Systems Modeling, 3(4), 314–327.
Brown, A., Conallen, J., and Tropeano, D. (2005). Introduction: Models, Modeling, and Model-Driven Architecture (MDA). Model-Driven Software Development, pages 1–16.
Clark, J., DeRose, S., et al. (1999). XML Path Language (XPath) Version 1.0.
Cleaveland, C. and Cleaveland, J. (2001). Program Generators with XML and Java. Prentice Hall PTR, Upper Saddle River, NJ, USA.
Crnkovic, I., Chaudron, M., and Larsson, S. (2006). Component-based development process and component lifecycle. In International Conference on Software Engineering Advances, pages 44–44.
Croll, A. and Power, S. (2009). Complete Web Monitoring. O'Reilly Media.
Czarnecki, K. and Helsen, S. (2003). Classification of model transformation approaches. In Proceedings of the 2nd OOPSLA Workshop on Generative Techniques in the Context of the Model Driven Architecture, page 15. Citeseer.
Deransart, P., et al. (2004). Outils d'Analyse Dynamique Pour la Programmation Par Contraintes (OADymPPaC). Technical report, INRIA Rocquencourt, École des Mines de Nantes, INSA de Rennes, Université d'Orléans, Cosytec, and ILOG. Projet RNTL. http://contraintes.inria.fr/OADymPPaC, last accessed June 21, 2010.
Deransart, P. (2008). Semantical View of Tracers and their Traces, and Applications. Working draft.
Deransart, P. (2009). Conception de Trace et Applications (vers une méta-théorie des traces). Working document, http://hal.inria.fr/, last accessed July 02, 2010.


Deransart, P. and Oliveira, R. (2009). Towards a Generic Framework to Generate Explanatory Traces of Constraint Solving and Rule-Based Reasoning. Research Report RR-7165, INRIA.
Duck, G., Stuckey, P., de la Banda, M., and Holzbaur, C. (2004). The refined operational semantics of Constraint Handling Rules. Lecture Notes in Computer Science, pages 90–104.
dynaTrace software (2010). http://www.dynatrace.com/en/, last accessed June 01, 2010.
Eclipse (2007). Eclipse Test & Performance Tools Platform Project.
Eriksson, H. (2004). UML 2 Toolkit. Wiley.
Fichman, R. and Moses, S. (1999). An incremental process for software implementation. Sloan Management Review, 40, 39–52.
France, R. and Rumpe, B. (2007). Model-driven development of complex software: A research roadmap. In International Conference on Software Engineering, pages 37–54. IEEE Computer Society, Washington, DC, USA.
Gadomski, A. (1997–2007). Global TOGA Meta-Theory [online]. http://erg4146.casaccia.enea.it/wwwerg26701/Gad-toga.htm, last accessed June 04, 2010.
Gao, J., Kar, G., and Kermani, P. (2004). Approaches to building self-healing systems using dependency analysis. In Proceedings of the IEEE/IFIP Network Operations and Management Symposium (NOMS).
Goulão, M. (2005). Component-based software engineering: a quantitative approach. In Companion to the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, page 239. ACM.
Govindraj, K., Narayanan, S., Thomas, B., Nair, P., and Peeru, S. (2001). On using AOP for application performance management. In Fifth International Conference on Aspect-Oriented Software Development (Industry Track).
InfraRED (2010). http://infrared.sourceforge.net/versions/latest/, last accessed June 01, 2010.
Judson, S., France, R., and Carver, D. (2003). Specifying model transformations at the metamodel level. WiSME@UML, pages 2–4.


Kent, S. (2002). Model driven engineering. Lecture Notes in Computer Science, pages 286–298.
Khanna, G., Beaty, K., Kar, G., and Kochut, A. (2006). Application performance management in virtualized server environments. In 10th IEEE/IFIP Network Operations and Management Symposium, pages 373–381.
Langevine, L. and Ducassé, M. (2005). A Tracer Driver for Hybrid Execution Analyses. In Proceedings of the 6th Automated Debugging Symposium. ACM Press.
Lüders, F., Lau, K., and Ho, S. (2002). Specification of software components. In Building Reliable Component-Based Systems, pages 52–69. Artech House, London.
Mellor, S., Scott, K., Uhl, A., and Weise, D. (2002). Model-driven architecture. Lecture Notes in Computer Science, pages 290–297.
Miller, J., Mukerji, J., et al. (2003). MDA Guide Version 1.0.1. Object Management Group.
Ning, J. (1996). A component-based software development model. In Proceedings of the 20th Conference on Computer Software and Applications, page 389. IEEE Computer Society.
Oldevik, J. (2006). MOFScript Eclipse Plug-In: Metamodel-Based Code Generation. In Eclipse Technology Workshop (EtX) at ECOOP.
Robin, J. and Vitorino, J. (2006). ORCAS: Towards a CHR-based model-driven framework of reusable reasoning components. See Fink et al. (2006), pages 192–199.
Royce, W. (1970). Managing the development of large software systems. In Proceedings of IEEE WESCON, volume 26, page 9.
Rumbaugh, J., Jacobson, I., and Booch, G. (2004). The Unified Modeling Language Reference Manual. Pearson Higher Education.
Shende, S. and Malony, A. (2006). The TAU parallel performance system. International Journal of High Performance Computing Applications, 20(2), 287.
Sneyers, J., Van Weert, P., Schrijvers, T., and De Koninck, L. (2003). As Time Goes By: Constraint Handling Rules — A Survey of CHR Research from 1998 to 2007. Pages 1–49.


Vecchiola, C., Pandey, S., and Buyya, R. (2009). High-performance cloud computing: A view of scientific applications. In 2009 10th International Symposium on Pervasive Systems, Algorithms, and Networks, pages 4–16. IEEE.
Vitorino, J. (2009). Model-Driven Engineering a Versatile, Extensible, Scalable Rule Engine through Component Assembly and Model Transformations. Universidade Federal de Pernambuco, Centro de Informática (CIn), Ciência da Computação.
Warmer, J. and Kleppe, A. (1998). The Object Constraint Language: Precise Modeling with UML. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
Wielemaker, J. (2006). SWI-Prolog 5.6 Reference Manual. Department of Social Science Informatics, University of Amsterdam, Amsterdam, March.
