Descartes Query Language (DQL)

From SE-Public-Wiki

Introduction

Over the past few decades, many performance modeling formalisms and prediction techniques for software architectures have been developed in the performance engineering community. However, using a performance model to predict the performance of a software system normally requires extensive experience with the respective modeling formalism and involves a number of complex and time-consuming manual steps. Our approach is a generic declarative interface to performance prediction techniques that simplifies and automates the process of using architecture-level software performance models for performance analysis. The Descartes Query Language (DQL) is a language to express the demanded performance metrics for a prediction as well as the goals and constraints of the specific prediction scenario[1]. It reduces the manual effort and flattens the learning curve when working with performance models by providing a unified interface independent of the employed modeling formalism. DQL is part of our ongoing work towards self-aware computing systems; it provides means to cope with the growing complexity caused by the increasing number of deployed software systems, integrated abstraction layers and computing resources in operation in modern dynamic data centers.

DQL is the foundation for our future long-term research on a Performance Engineering Query Language (PeQL). PeQL addresses not only performance-metric-centric expressions, but also questions related to the development of self-aware computing systems and the establishment of lean management processes in dynamic data centers. The main parts of PeQL are (i) a language with expressions for performance-centric result descriptions and goal-oriented questions (e.g. queries for performance metrics, performance issues or what-if questions), (ii) a language to control performance-related tasks (e.g. model extraction, dynamic instrumentation and model calibration), and (iii) a language to modify descriptive performance model instances on-the-fly.

Usage Scenarios

The motivation for the development of DQL is based on different usage scenarios in which performance predictions are used. Following the Descartes Vision of self-aware computing environments[2], we address the following usage scenarios with DQL: (i) Design Time, when a software architect composes components to form a software system, (ii) Deployment Time, during which the assembly of components is deployed on hardware resources, and (iii) System Run-Time, when Service-Level Agreements (SLAs) need to be satisfied while serving customer workloads.


Language

DQL has a declarative textual syntax to represent queries. The language structure of DQL is borrowed from the Structured Query Language (SQL), but differs conceptually: in DQL, there is no relational model. All queries in DQL belong to query classes that group similar expressions by their semantics. In the following, we introduce the query classes of DQL in a straightforward manner. The necessary steps to conduct a performance analysis using DQL are (i) to obtain knowledge about the structure of a referenced descriptive performance model, (ii) to obtain knowledge about the available performance metrics for specific performance-relevant model entities, and (iii) to execute performance predictions and/or to extract the performance metrics of interest.

Model Structure Queries

A Model Structure Query is used to obtain information on the structure of a performance model. We focus on typical architecture-level performance models that consist of performance-relevant entities to model resources and services. In terms of DQL, a resource is an entity that is demanded by a service to process a given request of a user, i.e. to process workload. When users execute Model Structure Queries, they (i) obtain a type-mapping of model entities to the means of DQL and (ii) obtain information about the available performance metrics for given entities. The type-mapping is necessary to bridge from a specific performance model to the means of DQL and to obtain the absolute identifiers of model instances. Once users have identified their demanded model entities, they can proceed to obtain the available performance metrics for these entities. As DQL is designed independently of any specific performance modeling formalism or performance prediction approach, such interpretations of model instances are necessary at run-time, as the language cannot contain this information statically without a significant loss of flexibility.

Example: List all Entities

The following query returns the identifiers of all performance-relevant entities that are part of a performance model. In the second line of the query, the user chooses which DQL Connector to use to access a performance model at any kind of location supported by that DQL Connector, e.g. a file on a file system or a model instance persisted in a database. The DQL Connector then executes the listing operation and all resources and services are reported to the user.

LIST ENTITIES
USING connector@'modelLocation';

Example: List all Metrics for given Entities

This example is the subsequent step in starting a performance analysis with DQL. Here, the user requests a listing of all performance metrics that are available for specific performance-relevant entities. The referenced DQL Connector will interpret the performance model instance and determine which metrics can be calculated through the available performance prediction tools or extracted from performance data repositories.

LIST METRICS (RESOURCE 'id1' AS cpu, SERVICE 'id2' AS webService)
USING connector@'modelLocation';

Performance Metrics Queries

A Performance Metrics Query is used to control performance predictions and to extract the demanded performance metrics. The declarative language design of DQL simplifies tasks that users otherwise have to execute manually. Common approaches for performance prediction force users to (i) prepare and calibrate descriptive performance models, (ii) configure model-to-model transformations from a descriptive performance model into a predictive performance model, (iii) start the simulation or solving of the predictive performance model and, finally, (iv) manually extract the demanded performance metrics after the process has completed.

In the case of Performance Metrics Queries in DQL, users describe their demanded result, specify the relevant model entities in the descriptive performance model to tailor the transformation process, and finally specify which descriptive performance model to use. All manual tasks are hidden by the components of DQL and users obtain a tailored result set that contains only their demanded performance metrics. As DQL is designed independently of a specific performance prediction or modeling approach, the structure of Performance Metrics Queries is generic; once users are familiar with DQL, they can employ it across different performance prediction approaches. Developers of performance prediction approaches can provide a DQL Connector, and once a DQL Connector is made available, users can use the corresponding approach without having to learn new syntax or semantics, or to adapt custom result-processing tools to new result formats.

Performance Metrics Queries can be extended to reflect dynamics in descriptive performance models through so-called Degrees-of-Freedom (DoFs). In descriptive performance models, DoFs specify the valid configuration space of model entities, e.g. a resource can model a compute server that may consist of one to four CPUs or an amount of RAM ranging from 1 to 64 GB. Such configuration options typically arise in dynamic computing environments like Cloud Computing. Here, users are typically interested in finding a suitable sizing for such a compute server to deploy a software system onto it, while constraints such as Service-Level Agreements (SLAs) for the software system in operation are not violated. An SLA can, for example, bound the response time of a service of the software system accessed by users, e.g. ordering a product from a web shop or purchasing stocks.
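Under the assumption that the compute-server sizing described above is exposed as a DoF in the model (the identifier 'id4' and the alias cpuCount are hypothetical placeholders, not part of the MediaStore example), such a sizing exploration could be sketched as a query:

SELECT cpu.utilization, webService.responseTime
EVALUATE DOF
  VARYING 'id4' AS cpuCount <1, 2, 3, 4>
FOR RESOURCE 'id1' AS cpu, SERVICE 'id2' AS webService
USING connector@'modelLocation';

Each variation of cpuCount would yield one result set, from which the smallest configuration that still satisfies the SLAs can be chosen.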

Example: Computation of Performance Metrics

This example shows how to obtain performance metrics using DQL. As the user has already discovered the available performance-relevant entities and performance metrics, the user can request the computation of performance metrics through the DQL Connector. Here, the utilization of the entity aliased as cpu and the response time of the entity aliased as webService are requested as performance metrics. The DQL Connector hides all modeling formalism-specific tasks and returns the resulting performance metrics in a tailored result set, as requested by the user.

SELECT cpu.utilization, webService.responseTime
FOR RESOURCE 'id1' AS cpu, SERVICE 'id2' AS webService
USING connector@'modelLocation';

Example: Constrained Computation of Performance Metrics

This query extends the prior query with a constraint. In DQL, constraints specify a trade-off for the computation of performance metrics. At system run-time, users may be forced to obtain performance metrics within strict time bounds. Such a constraint might be satisfied at the price of a lower accuracy, e.g. by less detailed transformations from a descriptive performance model into a predictive performance model, or by using cached results.

SELECT cpu.utilization, webService.responseTime
CONSTRAINED AS "fastResponse"
FOR RESOURCE 'id1' AS cpu, SERVICE 'id2' AS webService
USING connector@'modelLocation';

Example: Evaluation of Degrees-of-Freedom

This query again extends the prior query for the computation of performance metrics. Here, the parameter space of a Degree-of-Freedom (DoF) is varied. For each variation, one result set of the requested performance metrics is returned to the user. In this case, the number of users accessing the system is the referenced DoF, and a vector of 1, 100 and 1000 users specifies the workloads to be used in the simulation of the performance model instance.

SELECT cpu.utilization, webService.responseTime
EVALUATE DOF
  VARYING 'id3' AS userWorkload <1, 100, 1000>
FOR RESOURCE 'id1' AS cpu, SERVICE 'id2' AS webService
USING connector@'modelLocation';

Performance Issue Queries

A Performance Issue Query automates the interpretation of performance metrics and of a descriptive performance model to identify issues, e.g. bottlenecks[3]. We are currently working on this query class as a first step towards Goal-oriented Queries. In contrast to Performance Metrics Queries, Performance Issue Queries and the more general Goal-oriented Queries do not focus on performance metrics as a result, but enable users to specify What-If Questions[4][5], with results that provide insight into optimization problems, reconfiguration scenarios or systems management challenges.

Case Study

We evaluated DQL in several case studies to show how DQL can be employed as a declarative language in a performance analysis. The case studies present query examples and describe the expected results. A comprehensive evaluation and an exemplary workflow of how to use DQL can be found in[1]. We provide another exemplary case study of a performance analysis with DQL using the Palladio Component Model (PCM) as the underlying performance modeling formalism and the Palladio Bench as the tool chain[6]. For this case study, we employ the representative MediaStore example that is part of the Palladio distribution.

Architecture

DQL is built on top of an extensible architecture to integrate existing tools and to provide a single, unified interface for users. Internally, DQL is based on Java Technology, Xtext to generate the language infrastructure and OSGi to encapsulate components. The main components in DQL are introduced briefly in the following.

File:Dql-architecture.png
DQL Architecture
File:Dql-editor.png
Eclipse-based DQL Editor
  • DQL Language and Editor (LE): The DQL LE consists of the language infrastructure based on Xtext. The infrastructure consists of a parser, a model-based representation of queries, an Eclipse-based editor and an Application Programming Interface (API). Users access DQL through this component by using its textual syntax. The DQL Editor is customized to provide content assist, especially for contents extracted from referenced model instances, and syntax highlighting. Users can submit their queries directly from the DQL Editor to the DQL QEE and visualize the results.
  • DQL Query Execution Engine (QEE): The DQL QEE is the main query processor and controls the execution of queries. It bundles all tasks that are independent of the underlying performance modeling formalism and performance prediction tools. The DQL QEE provides additional functionality such as the computation of aggregates on top of performance metrics and the interpretation of a temporal dimension in performance models.
  • DQL Connector: Supplementary to the DQL QEE, a DQL Connector provides the means to control a specific performance modeling formalism and the necessary tools. In a DQL Environment, multiple DQL Connectors can be deployed, and users select their demanded DQL Connector by referencing it in queries through a mandatory identifier. The implementation of a DQL Connector depends directly on a specific modeling formalism or performance prediction technique. Thus, DQL Connectors employ specific tool chains to provide the necessary means to control and embed other approaches in DQL. We encourage users to contact us for assistance during the development of their customized DQL Connectors.
  • DQL Connector Registry (CR): The DQL CR manages the coexistence of multiple DQL Connectors in a DQL Environment. It leverages the OSGi Bundle Lifecycle and the OSGi Service Layer to maintain the availability of DQL Connectors. This component is designed to be light-weight with a reasonable memory footprint and low computation overhead.
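As a rough illustration of the responsibilities a formalism-specific DQL Connector has to cover, the following Java sketch shows a minimal connector interface together with a toy in-memory implementation. All type and method names here are assumptions made for illustration; the actual DQL Connector API is not published yet and may differ.

```java
import java.util.List;
import java.util.Map;

// Hypothetical connector interface; the real DQL Connector API may differ.
interface DqlConnector {
    // Corresponds to LIST ENTITIES: report all performance-relevant entities.
    List<String> listEntities(String modelLocation);

    // Corresponds to LIST METRICS: report the metrics available for an entity.
    List<String> listMetrics(String modelLocation, String entityId);

    // Corresponds to SELECT: run the prediction and return the metric values.
    Map<String, Double> select(String modelLocation, Map<String, String> requestedMetrics);
}

// Toy in-memory connector standing in for a real formalism-specific one.
class InMemoryConnector implements DqlConnector {
    public List<String> listEntities(String modelLocation) {
        return List.of("id1", "id2");
    }

    public List<String> listMetrics(String modelLocation, String entityId) {
        return entityId.equals("id1") ? List.of("utilization") : List.of("responseTime");
    }

    public Map<String, Double> select(String modelLocation, Map<String, String> requestedMetrics) {
        // A real connector would transform the descriptive model into a
        // predictive model, run a solver or simulation, and extract results.
        // Fixed values are returned here purely for illustration.
        return Map.of("cpu.utilization", 0.42, "webService.responseTime", 0.118);
    }
}

public class ConnectorDemo {
    public static void main(String[] args) {
        DqlConnector connector = new InMemoryConnector();
        System.out.println(connector.listEntities("modelLocation"));
        System.out.println(connector.select("modelLocation",
                Map.of("id1", "utilization")).get("cpu.utilization"));
    }
}
```

In such a design, the DQL QEE would only ever talk to the interface, which is what allows different modeling formalisms to be plugged in behind the same query syntax.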

Downloads

DQL has not yet been released publicly. Currently, we are integrating Eclipse Kepler and preparing the release of the source code and binary builds.

References

  1. 1.0 1.1 (p. 25 ff) "Go2013" (the full reference entry is not recoverable from the wiki source)
  2. Kounev, S.; Brosig, F.; Huber, N. and Reussner, R., "Towards self-aware performance and resource management in modern service-oriented systems." In Proceedings of the 7th IEEE International Conference on Services Computing (SCC 2010), Miami, Florida, USA, July 5-10, 2010. IEEE Computer Society.
  3. Franks, G.; Petriu, D.; Woodside, M.; Xu, J. and Tregunno, P., "Layered Bottlenecks and Their Mitigation." In Proceedings of the Third International Conference on the Quantitative Evaluation of Systems (QEST 2006), pp. 103-114, Sept. 2006.
  4. Thereska, E.; Narayanan, D. and Ganger, G.R., "Towards Self-Predicting Systems: What If You Could Ask "What-If"?" In Proceedings of the Sixteenth International Workshop on Database and Expert Systems Applications (DEXA 2005), pp. 196-200, Aug. 2005.
  5. Singh, R.; Shenoy, P.; Natu, M.; Sadaphal, V. and Vin, H., "Predico: a system for what-if analysis in complex data center applications." In Proceedings of the 12th ACM/IFIP/USENIX International Conference on Middleware (Middleware'11), Fabio Kon and Anne-Marie Kermarrec (Eds.), pp. 123-142. Springer-Verlag, Berlin, Heidelberg, 2011.
  6. Gorsler, F.; Brosig, F. and Kounev, S., "Controlling the Palladio Bench using the Descartes Query Language." In Proceedings of the Joint Kieker/Palladio Days 2013. To appear.


Contact

If you have any questions, please contact Jürgen Walter.