Pinset: A DSL for extracting datasets from models for data mining-based quality analysis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data mining techniques have been successfully applied to software quality analysis and assurance, including quality of modeling artefacts. Before such techniques can be used, though, data under analysis commonly need to be formatted into two-dimensional tables. This constraint is imposed by data mining algorithms, which typically require a collection of records as input for their computations. The process of extracting data from the corresponding sources and formatting them properly can become error-prone and cumbersome. In the case of models, this process is mostly carried out through scripts written in a model management language, such as EOL or ATL. To improve this situation, we present Pinset, a domain-specific language devised for the extraction of tabular datasets from software models. Pinset offers a tailored syntax and built-in facilities for common activities in dataset extraction. For evaluation, Pinset has been used on UML class diagrams to calculate metrics that can be employed as input for several fault-prediction algorithms. The use of Pinset for these calculations led to more compact and high-level specifications when compared to equivalent scripts written in generic model management languages.

Original language: English
Title of host publication: Proceedings - 2018 International Conference on the Quality of Information and Communications Technology, QUATIC 2018
Publisher: IEEE
Pages: 83-91
Number of pages: 9
ISBN (Electronic): 9781538658413
Publication status: Published - 26 Dec 2018
Event: 11th International Conference on the Quality of Information and Communications Technology, QUATIC 2018 - Coimbra, Portugal
Duration: 4 Sept 2018 to 7 Sept 2018

Publication series

Name: Proceedings - 2018 International Conference on the Quality of Information and Communications Technology, QUATIC 2018

Conference

Conference: 11th International Conference on the Quality of Information and Communications Technology, QUATIC 2018
Country/Territory: Portugal
City: Coimbra
Period: 4/09/18 to 7/09/18

Bibliographical note

This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Keywords

  • Data Mining
  • Domain-Specific Languages
  • Model-Driven Engineering
  • Software Quality
