TY - GEN
T1 - Pinset
T2 - 11th International Conference on the Quality of Information and Communications Technology, QUATIC 2018
AU - De La Vega, Alfonso
AU - Sanchez, Pablo
AU - Kolovos, Dimitrios S.
N1 - This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.
PY - 2018/12/26
Y1 - 2018/12/26
N2 - Data mining techniques have been successfully applied to software quality analysis and assurance, including quality of modeling artefacts. Before such techniques can be used, though, data under analysis commonly need to be formatted into two-dimensional tables. This constraint is imposed by data mining algorithms, which typically require a collection of records as input for their computations. The process of extracting data from the corresponding sources and formatting them properly can become error-prone and cumbersome. In the case of models, this process is mostly carried out through scripts written in a model management language, such as EOL or ATL. To improve this situation, we present Pinset, a domain-specific language devised for the extraction of tabular datasets from software models. Pinset offers a tailored syntax and built-in facilities for common activities in dataset extraction. For evaluation, Pinset has been used on UML class diagrams to calculate metrics that can be employed as input for several fault-prediction algorithms. The use of Pinset for these calculations led to more compact and high-level specifications when compared to equivalent scripts written in generic model management languages.
AB - Data mining techniques have been successfully applied to software quality analysis and assurance, including quality of modeling artefacts. Before such techniques can be used, though, data under analysis commonly need to be formatted into two-dimensional tables. This constraint is imposed by data mining algorithms, which typically require a collection of records as input for their computations. The process of extracting data from the corresponding sources and formatting them properly can become error-prone and cumbersome. In the case of models, this process is mostly carried out through scripts written in a model management language, such as EOL or ATL. To improve this situation, we present Pinset, a domain-specific language devised for the extraction of tabular datasets from software models. Pinset offers a tailored syntax and built-in facilities for common activities in dataset extraction. For evaluation, Pinset has been used on UML class diagrams to calculate metrics that can be employed as input for several fault-prediction algorithms. The use of Pinset for these calculations led to more compact and high-level specifications when compared to equivalent scripts written in generic model management languages.
KW - Data Mining
KW - Domain-Specific Languages
KW - Model-Driven Engineering
KW - Software Quality
UR - http://www.scopus.com/inward/record.url?scp=85061315485&partnerID=8YFLogxK
U2 - 10.1109/QUATIC.2018.00021
DO - 10.1109/QUATIC.2018.00021
M3 - Conference contribution
AN - SCOPUS:85061315485
T3 - Proceedings - 2018 International Conference on the Quality of Information and Communications Technology, QUATIC 2018
SP - 83
EP - 91
BT - Proceedings - 2018 International Conference on the Quality of Information and Communications Technology, QUATIC 2018
PB - IEEE
Y2 - 4 September 2018 through 7 September 2018
ER -