Software & Algorithms

This page lists the software, algorithms, and projects developed by ISSEL members over the years. It is a directory of individual pages on personal projects, PhD work, competitions, platforms, algorithms, supplementary material for ISSEL publications, and more.

Requirements Modeling and Reuse using Ontology-driven Recommendations

This page hosts the files required for reproducing the results of the paper:
Requirements Modeling and Reuse using Ontology-driven Recommendations,
submitted to the Special Issue "Model-driven data-intensive Enterprise Information Systems" of the Journal of Enterprise Information Systems.

The ontologies for the static and dynamic view of software projects are given below:

The evaluation involves a set of software projects from various sources.
For these projects, the following elements are provided:

Finally, more detailed information about the methods used in this paper can be found in deliverable D2.4 of S-CASE.
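
As a rough illustration of how a static-view requirements ontology can be instantiated, the sketch below uses rdflib in Python. The namespace and all class and property names (Requirement, Actor, Action, Object, and so on) are illustrative assumptions, not the actual S-CASE vocabulary.

```python
# Minimal sketch of instantiating a static-view requirements ontology
# with rdflib. The namespace and all class/property names are
# illustrative assumptions, not the actual S-CASE vocabulary.
from rdflib import Graph, Literal, Namespace, RDF

SV = Namespace("http://example.org/static-view#")  # hypothetical namespace

g = Graph()
g.bind("sv", SV)

# Model the requirement "The user must be able to create an account."
req = SV["requirement1"]
g.add((req, RDF.type, SV.Requirement))
g.add((req, SV.hasText, Literal("The user must be able to create an account.")))

actor, action, obj = SV["user"], SV["create"], SV["account"]
g.add((actor, RDF.type, SV.Actor))
g.add((action, RDF.type, SV.Action))
g.add((obj, RDF.type, SV.Object))

# Link the requirement's elements so that similar requirements across
# projects can be matched and recommended.
g.add((req, SV.hasActor, actor))
g.add((req, SV.hasAction, action))
g.add((action, SV.actsOn, obj))

print(g.serialize(format="turtle"))
```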

Requirements Dataset for Specification Extraction

Mapping functional requirements to specifications is one of the more challenging tasks of the software development process. An interesting line of work involves using Semantic Role Labeling techniques to automate this task. The effectiveness of such approaches in requirements engineering scenarios has to be assessed using realistic datasets of functional requirements. In this context, we provide the dataset we have crafted, so that researchers can evaluate their systems in requirements-to-specifications scenarios and reproduce the findings of:
Themistoklis Diamantopoulos, Michael Roth, Andreas Symeonidis, and Ewan Klein,
Software Requirements as an Application Domain for Natural Language Processing,
submitted to the Language Resources and Evaluation journal.
You may find the dataset here.
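
To illustrate the idea of semantic roles in a requirements context, here is a minimal sketch of a role-annotated functional requirement. The sentence, role names, and the toy mapping step are illustrative assumptions, not the dataset's actual annotation scheme.

```python
# Minimal sketch: a functional requirement annotated with semantic roles,
# expressed as a plain Python structure. The role names (AGENT, THEME)
# and the toy mapping step are illustrative; the dataset's actual
# annotation scheme may differ.
requirement = "The user must be able to upload a photo."

annotation = {
    "predicate": "upload",
    "roles": {
        "AGENT": "the user",  # who performs the action
        "THEME": "a photo",   # what the action is applied to
    },
}

# A requirements-to-specifications step could then derive a specification
# element directly from the labeled roles:
spec = (f"operation {annotation['predicate']}"
        f"(object: {annotation['roles']['THEME']},"
        f" actor: {annotation['roles']['AGENT']})")
print(spec)  # operation upload(object: a photo, actor: the user)
```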

Reusability Dataset for Component Reuse

The problem of reusing software components has led to the creation of several specialized source code recommendation systems. These systems, however, do not usually assess the reusability of the retrieved components, i.e. the extent to which each component can be reused. In this context, we provide the code reuse quality dataset we have annotated, so that researchers can evaluate their systems and reproduce the findings of:
Themistoklis Diamantopoulos, Klearchos Thomopoulos and Andreas Symeonidis, QualBoa: Quality-aware Recommendations of Source Code Components, submitted to the Mining Challenge of the 13th International Conference on Mining Software Repositories (MSR 2016).
The annotated dataset and the component retrieval query are available here.
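
For intuition on what assessing reusability might involve, the sketch below scores a component by a weighted sum of normalized static metrics. The metric names, normalization bounds, and weights are illustrative assumptions, not QualBoa's actual model.

```python
# Minimal sketch of scoring a component's reusability from static metrics.
# Metric names, normalization bounds, and weights are illustrative
# assumptions, not the actual QualBoa model.

def reusability_score(metrics: dict) -> float:
    """Weighted sum of normalized static analysis metrics, in [0, 1]."""
    # (metric, worst acceptable raw value, weight); lower raw is better here
    bounds = [
        ("cyclomatic_complexity", 50.0, 0.4),
        ("coupling_between_objects", 20.0, 0.3),
        ("lack_of_cohesion", 10.0, 0.3),
    ]
    score = 0.0
    for name, worst, weight in bounds:
        normalized = max(0.0, 1.0 - metrics[name] / worst)
        score += weight * normalized
    return score

component = {"cyclomatic_complexity": 12,
             "coupling_between_objects": 4,
             "lack_of_cohesion": 2}
print(f"reusability: {reusability_score(component):.2f}")  # 0.78
```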

Dataset for Test-Driven Reuse Recommendation Systems

Finding reusable software components is a challenging task. In the context of Test-Driven Development, several source code Recommendation Systems assess the functionality of software components using test cases. The effectiveness of these Test-Driven Reuse Recommendation Systems has to be assessed using realistic datasets. Thus, we provide the dataset and the relevant case study we have crafted, so that researchers can evaluate their systems in code reuse scenarios and reproduce the findings of:
Nikolaos Katirtzis, Themistoklis Diamantopoulos and Andreas Symeonidis, Mantissa: A Recommendation System for Test-Driven Code Reuse, submitted to the International Journal on Software Tools for Technology Transfer.
You may find the dataset here.
You may find the case study project here.
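
To illustrate how a test case can act as the query of a Test-Driven Reuse Recommendation System, here is a minimal Python sketch. The Stack interface and the candidate class are illustrative assumptions, not part of Mantissa or the dataset.

```python
# Minimal sketch of test-driven reuse: a test case acts as the query that
# specifies the desired component's interface and behavior. The system
# retrieves candidate components, binds each to the queried name, runs
# the test, and recommends the candidates that pass. Names are illustrative.

def test_stack_query(Stack):
    """The query: any class passing this test fits the desired interface."""
    stack = Stack()
    stack.push(1)
    stack.push(2)
    assert stack.pop() == 2   # last in, first out
    assert stack.pop() == 1

class CandidateStack:         # a retrieved candidate component
    def __init__(self):
        self._items = []
    def push(self, item):
        self._items.append(item)
    def pop(self):
        return self._items.pop()

# Evaluate the candidate against the query test.
try:
    test_stack_query(CandidateStack)
    print("candidate passes: recommend it")
except (AssertionError, AttributeError, TypeError):
    print("candidate fails: discard it")
```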

Dataset for Software Component Reuse

Several research efforts in the area of software development have been directed towards locating reusable components. The effectiveness of Code Search Engines and Recommendation Systems in Software Engineering has to be assessed using realistic code reuse datasets. In this context, we provide the dataset we have crafted, so that researchers can evaluate their systems in code reuse scenarios and reproduce the findings of:
Themistoklis Diamantopoulos and Andreas Symeonidis, AGORA: A Search Engine for Source Code Reuse, submitted to the SoftwareX journal.
You may find the dataset here.

Mertacor

Team Mertacor, with its agent Mertacor, has participated in the international Trading Agent Competition (TAC) since 2003, competing in several games developed by the TAC community.
Information on Mertacor

RDOTE

RDOTE (Relational Database to Ontology Transformation Engine) is a friendly and powerful framework for transforming relational databases into Semantic Web data. Users can connect to multiple databases and create mappings to their ontology schemata. Visit RDOTE's homepage on SourceForge.
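
For intuition on what such a mapping does, the sketch below turns rows of a relational table into ontology individuals with rdflib. The table, namespace, and class/property names are illustrative assumptions; this is the general relational-to-ontology idea, not RDOTE's actual mapping language or API.

```python
# Minimal sketch of the general idea behind relational-to-ontology
# mapping: each row of a relational table becomes an individual, and
# selected columns become datatype properties. Illustration only; this
# is not RDOTE's actual mapping language or API.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/onto#")  # hypothetical target ontology

# Rows as fetched with, e.g., SELECT id, name FROM person
rows = [(1, "Alice"), (2, "Bob")]

g = Graph()
g.bind("ex", EX)
for person_id, name in rows:
    individual = EX[f"person/{person_id}"]
    g.add((individual, RDF.type, EX.Person))        # table  -> class
    g.add((individual, EX.hasName, Literal(name)))  # column -> property

print(g.serialize(format="turtle"))
```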

Dataset for Software Bug Detection

Locating software bugs is a difficult task, especially if they do not lead to crashes. Current research on automating non-crashing bug detection dictates collecting function call traces, representing them as graphs, reducing the graphs, and then applying a subgraph mining algorithm. A ranking of potentially buggy functions is derived using frequency statistics for each node (function) in the correct and incorrect sets of traces. Although most existing techniques are effective, they do not scale well. Additionally, in most cases it is difficult to find and reuse datasets containing software bugs. In this context, we provide the dataset we have crafted, so that researchers can test their approaches and reproduce the findings of:
Themistoklis Diamantopoulos and Andreas Symeonidis, "Towards Scalable Bug Localization using the Edit Distance of Call Traces", presented at the Eighth International Conference on Software Engineering Advances (ICSEA 2013), October 27 – November 1, 2013, Venice, Italy.
You may find the dataset (along with a readme file) here.

Following our new paper: Themistoklis Diamantopoulos and Andreas Symeonidis, "Localizing Software Bugs using the Edit Distance of Call Traces", submitted to the International Journal On Advances in Software, we provide a revised version of our dataset with different types of bugs. You can find the revised dataset here.
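
For intuition on the frequency-statistics ranking step described above, here is a minimal sketch that scores functions by how often they appear in failing versus passing traces. The traces and the simplified suspiciousness score are illustrative; this is not the method of the papers themselves.

```python
# Minimal sketch of ranking potentially buggy functions by comparing how
# often each function appears in failing versus passing call traces.
# The simplified suspiciousness score is for illustration only.
from collections import Counter

passing_traces = [["main", "parse", "save"], ["main", "parse", "render"]]
failing_traces = [["main", "parse", "render", "scale"],
                  ["main", "render", "scale"]]

def frequency(traces):
    """Fraction of traces in which each function appears."""
    counts = Counter(f for trace in traces for f in set(trace))
    return {f: c / len(traces) for f, c in counts.items()}

p_pass, p_fail = frequency(passing_traces), frequency(failing_traces)
functions = set(p_pass) | set(p_fail)

# Suspicious: appears often in failing traces, rarely in passing ones.
ranking = sorted(functions,
                 key=lambda f: p_fail.get(f, 0.0) - p_pass.get(f, 0.0),
                 reverse=True)
for f in ranking:
    print(f, round(p_fail.get(f, 0.0) - p_pass.get(f, 0.0), 2))
# "scale" ranks first: it appears in every failing trace and no passing one
```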

Supervised LCS for Multi-label Classification

In recent years, multi-label classification has attracted a significant body of research, motivated by real-life applications, such as text classification and medical diagnoses. Although sparsely studied in this context, Learning Classifier Systems are naturally well-suited to multi-label classification problems, whose search space typically involves multiple highly specific niches.
This is the motivation behind our work, which introduces a generalized multi-label rule format – allowing for flexible label-dependency modeling, with no need for explicit knowledge of which correlations to search for – and uses it as a guide for further adapting the general Michigan-style supervised Learning Classifier System framework.
The integration of the aforementioned rule format and framework adaptations results in a novel algorithm for multi-label classification, namely the Multi-Label Supervised Learning Classifier System (MLS-LCS). MLS-LCS has been studied on a set of properly defined artificial problems and has also been thoroughly evaluated on a set of multi-label datasets, where it was found to be competitive with other state-of-the-art multi-label classification methods.
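
For intuition on generalized multi-label rules, the sketch below encodes a rule with a ternary condition part and a ternary label part. This encoding is an illustrative simplification, not the exact MLS-LCS rule format.

```python
# Minimal sketch of a generalized multi-label classification rule.
# The condition uses the ternary alphabet {0, 1, #} common in Learning
# Classifier Systems ('#' = don't care); the label part marks each label
# as advocated (1), not advocated (0), or irrelevant (#). This encoding
# is an illustrative simplification of the MLS-LCS format.

def matches(condition: str, instance: str) -> bool:
    """A rule matches if every non-'#' condition bit equals the input bit."""
    return all(c in ("#", b) for c, b in zip(condition, instance))

# IF <condition> THEN <labels>
rule = {"condition": "1#0#", "labels": "1#0"}

instance = "1101"  # a binary feature vector
if matches(rule["condition"], instance):
    for i, label in enumerate(rule["labels"]):
        if label == "1":
            print(f"label {i}: advocated")
        elif label == "0":
            print(f"label {i}: not advocated")
        # '#' expresses no opinion; other rules in the population decide
```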

The current implementation corresponds to the version of the MLS-LCS algorithm originally presented in:

  • Allamanis, M., Tzima, F. A., & Mitkas, P. A. (2013). Effective Rule-Based Multi-label Classification with Learning Classifier Systems. In M. Tomassini, A. Antonioni, F. Daolio, and P. Buesser, editors, Adaptive and Natural Computing Algorithms, Lecture Notes in Computer Science, Volume 7824, pages 466–476. Springer Berlin Heidelberg, 2013.

and further improved in

  • Tzima, F.A., Allamanis, M., Filotheou, A., & Mitkas, P. A. (Under review). Inducing Generalized Multi-Label Rules with Learning Classifier Systems. Evolutionary Computation.

More information on MLS-LCS can be found here.

S.Co.R.E. (Source Code Rating Estimator)

The popularity of open source software repositories and the widely adopted paradigm of software reuse have led to the development of several tools that aspire to assess the quality of source code. S.Co.R.E. is a source code quality estimation system that relates quality with source code metrics. The premise behind S.Co.R.E. is that the popularity of software components, as perceived by developers, can be considered an indicator of software quality. S.Co.R.E. uses code quality evaluation models to decide whether a given source code component is of high quality (i.e. exceeds minimum quality thresholds); if so, the quality is quantified by computing a quality score. S.Co.R.E. can also be used as a tool for detecting bad coding practices.
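
For intuition on this two-step estimation (threshold check, then a quantified score), consider the sketch below. The metric names, thresholds, and weights are illustrative assumptions, not the actual S.Co.R.E. models.

```python
# Minimal sketch of S.Co.R.E.-style quality estimation: first check that
# a component exceeds minimum quality thresholds, then quantify quality
# as a score. Metric names, thresholds, and weights are illustrative
# assumptions, not the actual S.Co.R.E. models.

THRESHOLDS = {"comment_density": 0.1,   # minimum acceptable values
              "test_coverage": 0.3}
WEIGHTS = {"comment_density": 0.5, "test_coverage": 0.5}

def quality_score(metrics: dict) -> float | None:
    """Return a quality score in [0, 1], or None if below thresholds."""
    if any(metrics[m] < t for m, t in THRESHOLDS.items()):
        return None                     # flag as a possible bad practice
    return sum(WEIGHTS[m] * metrics[m] for m in WEIGHTS)

component = {"comment_density": 0.25, "test_coverage": 0.6}
score = quality_score(component)
print("rejected" if score is None else f"quality score: {score:.2f}")
```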

Our work is described in:
Michail Papamichail, Themistoklis Diamantopoulos and Andreas Symeonidis, User-Perceived Source Code Quality Estimation based on Static Analysis Metrics, submitted to the 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016).

S.Co.R.E. can be found here.