Emmanouil Krasanakis, Emmanouil Schinas, Symeon Papadopoulos, Yiannis Kompatsiaris and Andreas Symeonidis
Information Processing & Management, 102053, 2019 Jun
Local community detection is an emerging topic in network analysis that aims to detect well-connected communities encompassing sets of previously known seed nodes. In this work, we explore the similar problem of ranking network nodes based on their relevance to the communities characterized by seed nodes. However, seed nodes may not be central enough or sufficiently numerous to produce high-quality ranks. To solve this problem, we introduce a methodology we call seed oversampling, which first runs a node ranking algorithm to discover more nodes that belong to the community and then reruns the same ranking algorithm with the enlarged set of seed nodes. We formally discuss why this process improves the quality of the calculated community ranks when the original set of seed nodes is small, and we introduce a boosting scheme that iteratively repeats seed oversampling to further improve rank quality when certain ranking algorithm properties are met. Finally, we demonstrate the effectiveness of our methods in improving community relevance ranks given only a few random seed nodes of real-world network communities. In our experiments, boosted and simple seed oversampling yielded better rank quality than the previous neighborhood inflation heuristic, which adds the neighborhoods of the original seed nodes to the seed set.
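As a rough illustration of the seed oversampling procedure described above, the sketch below uses personalized PageRank (via networkx) as a stand-in node ranking algorithm and a hypothetical top-k cutoff for promoting highly ranked nodes to seeds; the paper's own expansion criterion and the boosting scheme are not reproduced here.

    import networkx as nx

    def rank(graph, seeds):
        # Personalized PageRank as a stand-in node ranking algorithm;
        # the seed nodes receive all of the restart probability.
        personalization = {node: 1.0 if node in seeds else 0.0 for node in graph}
        return nx.pagerank(graph, personalization=personalization)

    def seed_oversampling(graph, seeds, top_k=None):
        # First pass: rank all nodes with the original seeds.
        ranks = rank(graph, seeds)
        if top_k is None:
            top_k = 2 * len(seeds)  # assumed expansion size, not taken from the paper
        # Promote the top-ranked nodes to seeds and rerun the same ranking algorithm.
        expanded = set(seeds) | {
            node for node, _ in sorted(ranks.items(), key=lambda kv: -kv[1])[:top_k]
        }
        return rank(graph, expanded)

    community_ranks = seed_oversampling(nx.karate_club_graph(), seeds={0, 1})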
Michail Papamichail, Kyriakos Chatzidimitriou, Thomas Karanikiotis, Napoleon-Christos Oikonomou, Andreas Symeonidis and Sashi Saripalle
"BrainRun: A Behavioral Biometrics Dataset towards Continuous Implicit Authentication"
Data, 4(2), 2019 May
The widespread use of smartphones has dictated a new paradigm, where mobile applications are the primary channel for dealing with day-to-day tasks. This paradigm involves a wealth of sensitive information, making security of utmost importance. To that end, and given that traditional authentication techniques (passwords and/or unlock patterns) have become ineffective, several research efforts target biometrics security, while more advanced techniques consider continuous implicit authentication on the basis of behavioral biometrics. However, most studies in this direction are performed “in vitro”, resulting in small-scale experimentation. In this context, and in an effort to create a solid information basis upon which continuous authentication models can be built, we employ the real-world application “BrainRun”, a brain-training game aiming at boosting the cognitive skills of individuals. BrainRun embeds a gesture-capturing tool, so that the different types of gestures that describe the swiping behavior of users are recorded and can thus be modeled. Upon releasing the application on both the “Google Play Store” and the “Apple App Store”, we constructed a dataset containing gesture and sensor data for more than 2000 different users and devices. The dataset is distributed under the CC0 license and can be found at the EU Zenodo repository.
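A hedged sketch of how such gesture records might be loaded for modelling, assuming a flat CSV export with hypothetical column names (user_id, device_id, gesture_type, duration); the actual file layout and fields are documented in the Zenodo record of the dataset.

    import pandas as pd

    # Hypothetical flat export of the gestures data; consult the Zenodo record
    # for the real file structure and field names.
    gestures = pd.read_csv("brainrun_gestures.csv")

    # Group swipes per user and device to build simple per-user behavioural profiles.
    profiles = (
        gestures[gestures["gesture_type"] == "swipe"]
        .groupby(["user_id", "device_id"])
        .agg(swipes=("gesture_type", "count"),
             mean_duration=("duration", "mean"))
    )
    print(profiles.head())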
Michail D. Papamichail, Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Software Reusability Dataset based on Static Analysis Metrics and Reuse Rate Information"
Data in Brief, 2019 Dec
The widely adopted component-based development paradigm considers the reuse of proper software components a primary criterion for successful software development. As a result, various research efforts are directed towards evaluating the extent to which a software component is reusable. Prior efforts follow expert-based approaches; however, the continuously growing open-source software ecosystem allows the introduction of data-driven alternatives. In this context, we have generated a dataset that harnesses information residing in online code hosting facilities and introduces the actual reuse rate of software components as a measure of their reusability. To do so, we analyzed the most popular projects included in the Maven registry and computed, using the SourceMeter tool, a large number of static analysis metrics at both class and package level that quantify six major source code properties: complexity, cohesion, coupling, inheritance, documentation and size. For these projects we additionally computed their reuse rate using our self-developed code search engine, AGORA. The generated dataset contains analysis information regarding more than 24,000 classes and 2,000 packages, and can thus be used as the information basis towards the design and development of data-driven reusability evaluation methodologies. The dataset is related to the research article entitled "Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information".
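A minimal sketch of how the class-level part of such a dataset could be assembled for analysis, assuming hypothetical CSV exports and column names (the published dataset documents the actual files and fields).

    import pandas as pd

    # Hypothetical file and column names standing in for the published dataset.
    metrics = pd.read_csv("class_metrics.csv")      # SourceMeter metrics per class
    reuse = pd.read_csv("class_reuse_rates.csv")    # reuse occurrences found via AGORA

    # One row per class: static analysis metrics joined with the observed reuse rate.
    dataset = metrics.merge(reuse, on="class_name", how="inner")
    print(dataset.describe())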
Michail D. Papamichail, Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information"
Journal of Systems and Software, 110423, 2019 Sep
Nowadays, the continuously evolving open-source community and the increasing demands of end users are forming a new software development paradigm; developers rely more on reusing components from online sources to minimize the time and cost of software development. An important challenge in this context is to evaluate the degree to which a software component is suitable for reuse, i.e. its reusability. Contemporary approaches assess reusability using static analysis metrics by relying on the help of experts, who usually set metric thresholds or provide ground truth values so that estimation models can be built. However, even when expert help is available, it may still be subjective or case-specific. In this work, we refrain from expert-based solutions and employ the actual reuse rate of source code components as ground truth for building a reusability estimation model. We initially build a benchmark dataset, harnessing the power of online repositories to determine the number of reuse occurrences for each component in the dataset. Subsequently, we build a model based on static analysis metrics to assess reusability from six different properties: complexity, cohesion, coupling, inheritance, documentation and size. The evaluation of our methodology indicates that our system can effectively assess reusability as perceived by developers.
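A hedged sketch of the general modelling idea (static analysis metrics as features, the observed reuse rate as ground truth), using a generic gradient boosting regressor; the paper's actual per-property modelling pipeline is more elaborate than this.

    import pandas as pd
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    # Hypothetical per-class table: static analysis metric columns plus the
    # observed reuse rate acting as ground truth for reusability.
    dataset = pd.read_csv("class_metrics_with_reuse.csv")
    features = dataset.drop(columns=["class_name", "reuse_rate"])
    target = dataset["reuse_rate"]

    X_train, X_test, y_train, y_test = train_test_split(
        features, target, test_size=0.2, random_state=0)
    model = GradientBoostingRegressor().fit(X_train, y_train)
    print("R^2 on held-out classes:", model.score(X_test, y_test))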
Emmanouil G. Tsardoulias, M. Protopapas, Andreas L. Symeonidis and Loukas Petrou
Journal of Intelligent & Robotic Systems, 2019 Jul
The alignment of two occupancy grid maps generated by SLAM algorithms is a well-researched problem, being an obligatory step either for unsupervised map merging techniques or for the evaluation of OGMs (Occupancy Grid Maps) against a blueprint of the environment. This paper provides an overview of the existing automatic alignment techniques for two occupancy grid maps that employ pattern matching. Additionally, an alignment pipeline using local features and image descriptors is implemented, as well as a method to eliminate erroneous correspondences, aiming at producing the correct transformation between the two maps. Finally, map quality metrics are proposed and utilized in order to quantify the produced map’s correctness. A comparative analysis was performed over a number of image processing and OGM-oriented detectors and descriptors, in order to identify the best combinations for the map evaluation problem, performed either between two OGMs or between an OGM and a blueprint map.
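A minimal OpenCV sketch of the kind of feature-based alignment pipeline discussed above (detector/descriptor matching, correspondence filtering with Lowe's ratio test, robust transform estimation); the paper compares many detector/descriptor combinations rather than the single ORB configuration assumed here.

    import cv2
    import numpy as np

    # Load the two occupancy grid maps as grayscale images (hypothetical filenames).
    map_a = cv2.imread("ogm_a.png", cv2.IMREAD_GRAYSCALE)
    map_b = cv2.imread("ogm_b.png", cv2.IMREAD_GRAYSCALE)

    # Detect local features and compute descriptors (ORB as one example detector/descriptor).
    orb = cv2.ORB_create(nfeatures=2000)
    kp_a, desc_a = orb.detectAndCompute(map_a, None)
    kp_b, desc_b = orb.detectAndCompute(map_b, None)

    # Match descriptors and discard ambiguous correspondences with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = [m for m, n in matcher.knnMatch(desc_a, desc_b, k=2)
               if m.distance < 0.75 * n.distance]

    # Robustly estimate the transform between the maps (RANSAC-based by default).
    src = np.float32([kp_a[m.queryIdx].pt for m in matches])
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches])
    transform, inliers = cv2.estimateAffinePartial2D(src, dst)
    print(transform)  # 2x3 matrix aligning map_a onto map_b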
Christoforos Zolotas, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"RESTsec: a low-code platform for generating secure by design enterprise services"
Enterprise Information Systems, pp. 1-27, 2018 Mar
In the modern business world, enterprises increasingly opt to bring their business model online in an effort to reach more end users and increase their customer base. While transitioning to the new model, enterprises consider securing their data of pivotal importance. In fact, many efforts have been introduced to automate this ‘webification’ process; however, they all fall short in some aspect: a) they generate only the security infrastructure, leaving the implementation to the developers, b) they embed mainstream, less powerful authorisation schemes, or c) they disregard the merits of the dominant REST architecture and adopt less suitable approaches. In this paper we present RESTsec, a low-code platform that supports rapid security requirements modelling for enterprise services, abiding by the state-of-the-art ABAC (Attribute-Based Access Control) authorisation scheme. RESTsec enables the developer to seamlessly embed the desired access control policy and generate the service, the security infrastructure and the code. Evaluation shows that our approach is valid and can help developers deliver secure-by-design enterprise services in a rapid and automated manner.
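For readers unfamiliar with ABAC, a toy illustration of attribute-based access control: a request is permitted when subject, resource and action attributes satisfy some policy rule. This is only a conceptual sketch with an invented rule; RESTsec generates its policies and enforcement infrastructure automatically.

    # Toy ABAC policy: each rule is a predicate over subject, resource and action attributes.
    POLICY = [
        # Hypothetical rule: doctors may read records of patients assigned to them.
        lambda sub, res, action: (
            action == "read"
            and sub.get("role") == "doctor"
            and res.get("type") == "medical_record"
            and res.get("patient_id") in sub.get("assigned_patients", [])
        ),
    ]

    def is_permitted(subject, resource, action):
        # Grant access if any policy rule evaluates to True (deny by default).
        return any(rule(subject, resource, action) for rule in POLICY)

    print(is_permitted(
        {"role": "doctor", "assigned_patients": ["p42"]},
        {"type": "medical_record", "patient_id": "p42"},
        "read"))  # True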
George Mamalakis, Christos Diou, Andreas L. Symeonidis and Leonidas Georgiadis
"Of daemons and men: reducing false positive rate in intrusion detection systems with file system footprint analysis"
Neural Computing and Applications, 2018 May
In this work, we propose a methodology for reducing false alarms in file system intrusion detection systems, by taking into account the daemon’s file system footprint. More specifically, we experimentally show that sequences of outliers can serve as a distinguishing characteristic between true and false positives, and we show how analysing sequences of outliers can lead to lower false positive rates, while maintaining high detection rates. Based on this analysis, we developed an anomaly detection filter that learns outlier sequences using k-nearest neighbours with normalised longest common subsequence. Outlier sequences are then used as a filter to reduce false positives on the FI2DS file system intrusion detection system. This filter is evaluated on both overlapping and non-overlapping sequences of outliers. In both cases, experiments performed on three real-world web servers and a honeynet show that our approach achieves significant false positive reduction rates (up to 50 times), without any degradation of the corresponding true positive detection rates.
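A compact sketch of the two ingredients named above: a normalised longest-common-subsequence similarity between outlier sequences and a k-nearest-neighbour vote over labelled sequences. The normalisation choice, thresholds and the FI2DS integration are specific to the paper and not reproduced here.

    def lcs_length(a, b):
        # Classic dynamic-programming longest common subsequence length.
        table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, x in enumerate(a, 1):
            for j, y in enumerate(b, 1):
                table[i][j] = table[i - 1][j - 1] + 1 if x == y else max(table[i - 1][j], table[i][j - 1])
        return table[len(a)][len(b)]

    def normalised_lcs(a, b):
        # LCS length scaled to [0, 1] by the longer sequence (one common normalisation).
        return lcs_length(a, b) / max(len(a), len(b), 1)

    def knn_keep_alert(sequence, labelled_sequences, k=3):
        # Majority vote of the k most similar labelled outlier sequences,
        # where labelled_sequences holds (sequence, is_true_positive) pairs.
        neighbours = sorted(labelled_sequences,
                            key=lambda item: -normalised_lcs(sequence, item[0]))[:k]
        return sum(label for _, label in neighbours) > k / 2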
Themistoklis Diamantopoulos, Michael Roth, Andreas Symeonidis and Ewan Klein
"Software requirements as an application domain for natural language processing"
Language Resources and Evaluation, pp. 1-30, 2017 Feb
Mapping functional requirements first to specifications and then to code is one of the most challenging tasks in software development. Since requirements are commonly written in natural language, they can be prone to ambiguity, incompleteness and inconsistency. Structured semantic representations allow requirements to be translated to formal models, which can be used to detect problems at an early stage of the development process through validation. Storing and querying such models can also facilitate software reuse. Several approaches constrain the input format of requirements to produce specifications; however, they usually require considerable human effort in order to adopt domain-specific heuristics and/or controlled languages. We propose a mechanism that automates the mapping of requirements to formal representations using semantic role labeling. We describe the first publicly available dataset for this task, employ a hierarchical framework that allows requirements concepts to be annotated, and discuss how semantic role labeling can be adapted for parsing software requirements.
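To make the idea concrete, a toy illustration (not the paper's annotation scheme or tooling) of mapping a requirement sentence to a role-labelled structure from which a formal representation could be built.

    # Hypothetical role labels for a single functional requirement; the paper's
    # hierarchical annotation framework defines the actual concepts and relations.
    requirement = "The user must be able to upload a profile photo."
    labelled = {
        "actor": "user",
        "action": "upload",
        "object": "profile photo",
        "modality": "must",
    }
    # Such structures can then be stored, queried and validated against each other.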
Themistoklis Diamantopoulos and Andreas Symeonidis
Enterprise Information Systems, pp. 1-22, 2017 Dec
Enhancing the requirements elicitation process has always been of added value to software engineers, since it expedites the software lifecycle and reduces errors in the conceptualization phase of software products. The challenge posed to the research community is to construct formal models that are capable of storing requirements from multimodal formats (text and UML diagrams) and promoting easy requirements reuse, while at the same time being traceable, to allow full control of the system design, as well as comprehensible to software engineers and end users. In this work, we present an approach that enhances requirements reuse while capturing the static (functional requirements, use case diagrams) and dynamic (activity diagrams) views of software projects. Our ontology-based approach allows for reasoning over the stored requirements, while the mining methodologies employed detect incomplete or missing software requirements, thus reducing the effort required for requirements elicitation at an early stage of the project lifecycle.
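A hedged rdflib sketch of the ontology-based storage idea: requirements become triples in a project graph that can later be queried, for example to spot use cases with no associated activity, as a stand-in for the paper's mining of incomplete requirements. The namespace and property names are invented for illustration.

    from rdflib import Graph, Namespace, Literal

    REQ = Namespace("http://example.org/requirements#")  # hypothetical ontology namespace
    graph = Graph()

    # Store a use case together with its actor and goal.
    graph.add((REQ.UC_UploadPhoto, REQ.hasActor, REQ.User))
    graph.add((REQ.UC_UploadPhoto, REQ.hasGoal, Literal("upload a profile photo")))

    # Flag possibly incomplete requirements: use cases with no modelled activity.
    incomplete = graph.query("""
        PREFIX req: <http://example.org/requirements#>
        SELECT ?uc WHERE {
            ?uc req:hasActor ?actor .
            FILTER NOT EXISTS { ?uc req:hasActivity ?activity }
        }""")
    for row in incomplete:
        print("No activity modelled for:", row.uc)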
Miltiadis G. Siavvas, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"QATCH - An adaptive framework for software product quality assessment"
Expert Systems with Applications, 2017 May
The subjectivity that underlies the notion of quality does not allow the design and development of a universally accepted mechanism for software quality assessment. This is why contemporary research is now focused on seeking mechanisms able to produce software quality models that can be easily adjusted to custom user needs. In this context, we introduce QATCH, an integrated framework that applies static analysis to benchmark repositories in order to generate software quality models tailored to stakeholder specifications. Fuzzy multi-criteria decision-making is employed in order to model the uncertainty imposed by experts’ judgments. These judgments can be expressed as linguistic values, which makes the process more intuitive. Furthermore, the system generates a robust software quality model, the base model, which is used in the experiments for QATCH system verification. The paper provides an extensive analysis of QATCH and thoroughly discusses its validity and added value in the field of software quality through a number of individual experiments.
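As a rough illustration of the final aggregation step in such quality models (not QATCH's actual fuzzy multi-criteria method), a sketch that combines normalised property scores into a single quality index using stakeholder-provided weights.

    # Hypothetical normalised property scores in [0, 1] produced by static analysis of a project.
    property_scores = {"complexity": 0.62, "cohesion": 0.81, "coupling": 0.55,
                       "documentation": 0.40, "size": 0.73}

    # Stakeholder-specific weights; in QATCH such judgments are elicited from experts and
    # their uncertainty is modelled with fuzzy multi-criteria decision-making.
    weights = {"complexity": 0.30, "cohesion": 0.20, "coupling": 0.25,
               "documentation": 0.10, "size": 0.15}

    quality_index = sum(property_scores[p] * weights[p] for p in weights)
    print(f"Overall quality index: {quality_index:.2f}")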
Athanassios M. Kintsakis, Fotis E. Psomopoulos, Andreas L. Symeonidis and Pericles A. Mitkas
"Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments"
SoftwareX, 6, pp. 217-224, 2017 Sep
Hermes introduces a new “describe once, run anywhere” paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.
Cezary Zielinski, Maciej Stefanczyk, Tomasz Kornuta, Maksym Figat, Wojciech Dudek, Wojciech Szynkiewicz, Wlodzimierz Kasprzak, Jan Figat, Marcin Szlenk, Tomasz Winiarski, Konrad Banachowicz, Teresa Zielinska, Emmanouil G. Tsardoulias, Andreas L. Symeonidis, Fotis E. Psomopoulos, Athanassios M. Kintsakis, Pericles A. Mitkas, Aristeidis Thallas, Sofia E. Reppou, George T. Karagiannis, Konstantinos Panayiotou, Vincent Prunet, Manuel Serrano, Jean-Pierre Merlet, Stratos Arampatzis, Alexandros Giokas, Lazaros Penteridis, Ilias Trochidis, David Daney and Miren Iturburu
"Variable structure robot control systems: The RAPP approach"
Robotics and Autonomous Systems, 94, pp. 226-244, 2017 May
This paper presents a method of designing variable structure control systems for robots. As the on-board computational resources of a robot are limited, while in some cases the demands imposed on the robot by the user are virtually limitless, the solution is to produce a variable structure system in which the task-dependent part is exchanged. Since the task governs the activities of the robot, not only must some task-dependent modules be exchanged, but supervisory responsibilities also have to be switched. Such control systems are necessary in the case of robot companions, where the owner of the robot may demand that it provide many services.