D. Geromichalos, M. Azkarate, E. Tsardoulias, L. Gerdes, L. Petrou and C. Perez Del Pulgar
Journal of Field Robotics, pp. 1-18, 2020 Feb
This paper describes a novel approach to simultaneous localization and mapping (SLAM) applied to the autonomous planetary rover exploration scenario, aiming to reduce both the relative and absolute localization errors using two well-proven techniques: particle filters and scan matching. Continuous relative localization is improved by matching high-resolution sensor scans to the online-created local map. Additionally, to avoid issues with drifting localization, absolute localization is globally corrected at discrete times, according to predefined event criteria, by matching the current local map to the orbiter's global map. The resolutions of the local and global maps can be appropriately chosen for computation and accuracy purposes. Further, the online-generated local map, in the form of a structured elevation grid map, can also be used to evaluate the traversability of the surrounding environment and allow for continuous navigation. The objective of this study is to support long-range, low-supervision planetary exploration. The implemented SLAM technique has been validated with a data set acquired during a field test campaign performed at the Teide Volcano on the island of Tenerife, representative of a Mars/Moon exploration scenario.
A. Tzitzis, S. Megalou, S. Siachalou, E. Tsardoulias, A. Filotheou, T. Yioultsis, and A. G. Dimitriou
IEEE Journal of Radio Frequency Identification, 2020 Jun
In this work, we present a method for the 3D localization of RFID tags by a reader-equipped robot with a single antenna. The robot carries a set of sensors which enable it to create a map of the environment and locate itself in it (Simultaneous Localization and Mapping - SLAM). We then exploit the collected phase measurements to localize large tag populations in real time. We show that by forcing the robot to move along non-straight trajectories, thus creating non-linear synthetic apertures, the circular ambiguity of each tag's possible locations is eliminated and 3D localization is accomplished. A reliability metric is introduced, suitable for real-time assessment of the localization error. We investigate how the curvature of the robot's trajectory affects the accuracy under varying multipath conditions. It is found that increasing the trajectory's slope and number of turns improves the accuracy of the method. We introduce a phase model that accounts for the effects of multipath and derive the closed-form expression of the resultant phase's probability density function. Finally, the proposed method is extended to the case where multiple antennas are available. Experimental results in a "multipath-rich" indoor environment demonstrate a mean 3D error of 35 cm, achieved in a few seconds.
Michail D. Papamichail and Andreas L. Symeonidis
"A Generic Methodology for Early Identification of Non-Maintainable Source Code Components through Analysis of Software Releases"
Information and Software Technology, 118, pp. 106218, 2020 Feb
Contemporary development approaches consider time-to-market to be of utmost importance and assume that software projects are constantly evolving, driven by the continuously changing requirements of end users. This practically requires an iterative process where software changes by introducing new or updating existing software/user features, while at the same time continuing to support the stable ones. In order to ensure efficient software evolution, the need to produce maintainable software is evident. In this work, we argue that non-maintainable software is not the outcome of a single change, but the consequence of a series of changes throughout the development lifecycle. To that end, we define a maintainability evaluation methodology across releases and employ various information residing in software repositories, so as to decide on the maintainability of software. Using the dropping of packages as a non-maintainability indicator (accompanied by a series of quality-related criteria), the proposed methodology applies one-class-classification techniques for evaluating maintainability at package level, on four different axes, each targeting a primary source code property: complexity, cohesion, coupling, and inheritance. Given the qualitative and quantitative evaluation of our methodology, we argue that apart from providing accurate and interpretable maintainability evaluation at package level, we can also identify non-maintainable components at an early stage, in many cases at around 50% of the software package lifecycle. Based on our findings, we conclude that modeling the trending behavior of certain static analysis metrics enables the effective identification of non-maintainable software components and can thus be a valuable tool for software engineers.
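The idea of modeling the trending behavior of a static analysis metric across releases can be illustrated with a minimal sketch (not the paper's actual model; the metric series, package names and threshold below are fabricated for illustration): a package whose complexity rises steadily release after release is flagged early as a non-maintainability risk.

```python
def slope(values):
    """Least-squares slope of a metric series over release index 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Average cyclomatic complexity of two packages over five releases
# (fabricated data): one stable, one steadily deteriorating.
stable_pkg = [12.0, 12.1, 11.9, 12.2, 12.0]
decaying_pkg = [12.0, 13.5, 15.2, 17.1, 19.4]

# A clearly positive complexity trend is treated as a warning signal.
print(slope(stable_pkg))    # near zero
print(slope(decaying_pkg))  # clearly positive
```

In the paper's setting, such trend features would be computed for many metrics and fed to a one-class classifier; this sketch only shows the per-metric trend extraction step.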
Antonis G. Dimitriou, Stavroula Siachalou, Emmanouil Tsardoulias and Loukas Petrou
"Robotics Meets RFID for Simultaneous Localization (of Robots and Objects) and Mapping (SLAM) – A Joined Problem"
Chapter 7, John Wiley & Sons, Inc., 2020 Feb
Localization of wirelessly powered devices is essential for many applications related to the Internet of Things and Ubiquitous Computing. The chapter is focused on deploying a moving robotic platform, i.e. a robot, which hosts radio frequency identification (RFID) equipment and aims to locate passive RFID tags attached to objects in the surrounding area. The robot hosts additional sensors, namely lidar and depth cameras, enabling it to perform SLAM – simultaneous localization (of its own location) and mapping of any (including previously unknown) area. Furthermore, it can avoid obstacles, including people, and perform and update path planning. Thanks to its movement, the robot collects a huge amount of data related to the received signal strength (RSSI) and phase of each tag, realizing the concept of a “virtual antenna array”, i.e. a moving antenna at multiple locations. The antenna-equipped robot behaves similarly to a synthetic-aperture radar. The main application is continuous inventorying and localization, focusing on warehouse management, large retail stores, libraries, etc. The main advantage of the robotic approach versus static-reader-antenna deployments arises from the equivalent cost reduction per square meter of target area, since a single robot can circulate continuously around any area, whereas a fixed RFID network would necessitate infrastructure costs proportional to the size of the area. Another advantage is the huge amount of data from different locations (of the robot) available to be exploited for more accurate RFID localization. Compared to a fixed installation, the disadvantage is that the robot does not cover the entire area simultaneously. Depending on the size of the target area and the desired inventorying update rate, additional robots could be deployed. In this chapter, the localization problem is presented and linked to practical applications. Representative prior art is analyzed and discussed.
The SLAM problem is also discussed, and the related state of the art is presented. Moreover, experimental results from an actual robot are demonstrated. The robot collects phase and RSSI measurements from RFID tags. It is shown that positioning accuracy is affected both by the robot's SLAM accuracy and by the disruption of the tags' backscattered signal due to fading. Finally, techniques to improve the system are discussed.
Alexandros Filotheou, Emmanouil Tsardoulias, Antonis Dimitriou, Andreas Symeonidis and Loukas Petrou
"Quantitative and Qualitative Evaluation of ROS-Enabled Local and Global Planners in 2D Static Environments"
Journal of Intelligent & Robotic Systems, 2019 Oct
Apart from perception, one of the most fundamental aspects of an autonomous mobile robot is the ability to adequately and safely traverse the environment it operates in. This ability is called Navigation and is performed in a two- or three-dimensional fashion, except for cases where the robot is neither a ground vehicle nor articulated (e.g. robotic arms). The planning part of navigation comprises a global planner, suitable for generating a path from an initial to a target pose, and a local planner tasked with traversing the aforementioned path while dealing with environmental, sensorial and motion uncertainties. However, the task of selecting the optimal global and/or local planner combination is quite hard, since no research provides insight into which is best with respect to the domain and planner limitations. In this context, the current work performs a comparative analysis of qualitative and quantitative aspects of the most common ROS-enabled global and local planners for robots operating in two-dimensional static environments, on the basis of mission-centered and planner-related metrics, optimality and traversability aspects, as well as non-measurable aspects, such as documentation quality, parameterisability, ease of use, etc.
Emmanouil Krasanakis, Emmanouil Schinas, Symeon Papadopoulos, Yiannis Kompatsiaris and Andreas Symeonidis
Information Processing & Management, pp. 102053, 2019 Jun
Local community detection is an emerging topic in network analysis that aims to detect well-connected communities encompassing sets of previously known seed nodes. In this work, we explore the similar problem of ranking network nodes based on their relevance to the communities characterized by seed nodes. However, seed nodes may not be central enough or sufficiently many to produce high-quality ranks. To solve this problem, we introduce a methodology we call seed oversampling, which first runs a node ranking algorithm to discover more nodes that belong to the community and then reruns the same ranking algorithm for the new seed nodes. We formally discuss why this process improves the quality of calculated community ranks if the original set of seed nodes is small, and introduce a boosting scheme that iteratively repeats seed oversampling to further improve rank quality when certain ranking algorithm properties are met. Finally, we demonstrate the effectiveness of our methods in improving community relevance ranks given only a few random seed nodes of real-world network communities. In our experiments, boosted and simple seed oversampling yielded better rank quality than the previous neighborhood inflation heuristic, which adds the neighborhoods of original seed nodes to the seeds.
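The two-pass idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: personalized PageRank is assumed as the ranking algorithm, and the promotion rule (take the top-ranked nodes as new seeds) is an illustrative choice.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Personalized PageRank by power iteration on an undirected graph
    (the assumed ranking algorithm; the methodology is algorithm-agnostic)."""
    restart = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {v: (1 - alpha) * restart[v] for v in adj}
        for v, neighbours in adj.items():
            share = alpha * rank[v] / len(neighbours)
            for u in neighbours:
                nxt[u] += share
        rank = nxt
    return rank

def seed_oversampling(adj, seeds, promote=3):
    # First pass: rank with the original seeds.
    first = personalized_pagerank(adj, seeds)
    # Oversample: promote the top-ranked nodes to seeds (illustrative rule).
    new_seeds = set(sorted(adj, key=first.get, reverse=True)[:promote]) | set(seeds)
    # Second pass: rerun the same ranking algorithm with the enlarged seed set.
    return personalized_pagerank(adj, new_seeds)

# Toy network: two 4-cliques bridged by a single edge, one seed on the left.
adj = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4],
    4: [3, 5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
ranks = seed_oversampling(adj, {0})
print(ranks[1] > ranks[5])  # left-clique nodes outrank the right clique
```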
Michail Papamichail, Kyriakos Chatzidimitriou, Thomas Karanikiotis, Napoleon-Christos Oikonomou, Andreas Symeonidis and Sashi Saripalle
"BrainRun: A Behavioral Biometrics Dataset towards Continuous Implicit Authentication"
Data, 4, (2), 2019 May
The widespread use of smartphones has dictated a new paradigm, where mobile applications are the primary channel for dealing with day-to-day tasks. This paradigm involves an abundance of sensitive information, making security of utmost importance. To that end, and given that traditional authentication techniques (passwords and/or unlock patterns) have become ineffective, several research efforts are targeted towards biometrics security, while more advanced techniques consider continuous implicit authentication on the basis of behavioral biometrics. However, most studies in this direction are performed “in vitro”, resulting in small-scale experimentation. In this context, and in an effort to create a solid information basis upon which continuous authentication models can be built, we employ the real-world application “BrainRun”, a brain-training game aiming at boosting the cognitive skills of individuals. BrainRun embeds a gesture-capturing tool, so that the different types of gestures that describe the swiping behavior of users are recorded and can thus be modeled. Upon releasing the application at both the “Google Play Store” and the “Apple App Store”, we constructed a dataset containing gesture and sensor data for more than 2000 different users and devices. The dataset is distributed under the CC0 license and can be found at the EU Zenodo repository.
Michail D. Papamichail, Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Software Reusability Dataset based on Static Analysis Metrics and Reuse Rate Information"
Data in Brief, 2019 Dec
The widely adopted component-based development paradigm considers the reuse of proper software components as a primary criterion for successful software development. As a result, various research efforts are directed towards evaluating the extent to which a software component is reusable. Prior efforts follow expert-based approaches; however, the continuously increasing open-source software initiative allows the introduction of data-driven alternatives. In this context we have generated a dataset that harnesses information residing in online code hosting facilities and introduces the actual reuse rate of software components as a measure of their reusability. To do so, we have analyzed the most popular projects included in the Maven registry and have computed a large number of static analysis metrics at both class and package levels using the SourceMeter tool; these metrics quantify six major source code properties: complexity, cohesion, coupling, inheritance, documentation and size. For these projects we additionally computed their reuse rate using our self-developed code search engine, AGORA. The generated dataset contains analysis information regarding more than 24,000 classes and 2,000 packages, and can, thus, be used as the information basis towards the design and development of data-driven reusability evaluation methodologies. The dataset is related to the research article entitled "Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information".
Michail D. Papamichail , Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information"
Journal of Systems and Software, pp. 110423, 2019 Sep
Nowadays, the continuously evolving open-source community and the increasing demands of end users are forming a new software development paradigm; developers rely more on reusing components from online sources to minimize the time and cost of software development. An important challenge in this context is to evaluate the degree to which a software component is suitable for reuse, i.e. its reusability. Contemporary approaches assess reusability using static analysis metrics by relying on the help of experts, who usually set metric thresholds or provide ground truth values so that estimation models can be built. However, even when expert help is available, it may still be subjective or case-specific. In this work, we refrain from expert-based solutions and employ the actual reuse rate of source code components as ground truth for building a reusability estimation model. We initially build a benchmark dataset, harnessing the power of online repositories to determine the number of reuse occurrences for each component in the dataset. Subsequently, we build a model based on static analysis metrics to assess reusability from six different properties: complexity, cohesion, coupling, inheritance, documentation and size. The evaluation of our methodology indicates that our system can effectively assess reusability as perceived by developers.
Eleni Poptsi, Emmanouil Tsardoulias, Despina Moraitou, Andreas Symeonidis and Magda Tsolaki
"REMEDES for Alzheimer-R4Alz Battery: Design and Development of a New Tool of Cognitive Control Assessment for the Diagnosis of Minor and Major Neurocognitive Disorders"
Journal of Alzheimer's Disease, pp. 1-19, 2019 Oct
Background: Subjective cognitive decline (SCD) and mild cognitive impairment (MCI) are acknowledged stages of the clinical spectrum of Alzheimer's disease (AD), and cognitive control seems to be among the first neuropsychological predictors of cognitive decline. Existing tests are usually affected by educational level, linguistic abilities, cultural differences, and social status, rendering them error-prone when differentiating between the aforementioned stages. Creating robust neuropsychological tests is therefore imperative. Objective: The design of a novel psychometric battery for cognitive control and attention assessment, free of demographic effects, capable of discriminating cognitively healthy aging, SCD, MCI, and mild dementia (mD). Methods: The battery's initial hypothesis was tuned through iterations of administration to randomly sampled healthy older adults and people with SCD, MCI, and mD from the area of Thessaloniki, Greece. This resulted in the first release of the REflexes MEasurement DEviceS for Alzheimer battery (REMEDES for Alzheimer-R4Alz). Results: The first release lasts almost an hour. The battery was designed to assess working memory (WM), including WM storage, processing, and updating, enriched by episodic buffer recruitment. It was also designed to assess attention control abilities, comprising selective, sustained, and divided attention subtasks. Finally, it comprises an inhibitory control subtask, a task/rule switching or set-shifting subtask, and a cognitive flexibility subtask combining inhibition and task/rule switching abilities. Conclusion: The R4Alz battery is an easy-to-use psychometric battery with increasing difficulty levels and presumably ecological validity, being entertaining for older adults, potentially free of demographic effects, and promising as a more accurate and early diagnosis tool of neurodegeneration.
Emmanouil G. Tsardoulias, M. Protopapas, Andreas L. Symeonidis and Loukas Petrou
Journal of Intelligent & Robotic Systems, 2019 Jul
The alignment of two occupancy grid maps generated by SLAM algorithms is a well-researched problem, being an obligatory step either for unsupervised map merging techniques or for the evaluation of OGMs (Occupancy Grid Maps) against a blueprint of the environment. This paper provides an overview of the existing automatic alignment techniques for two occupancy grid maps that employ pattern matching. Additionally, an alignment pipeline using local features and image descriptors is implemented, as well as a method to eliminate erroneous correspondences, aiming at producing the correct transformation between the two maps. Finally, map quality metrics are proposed and utilized, in order to quantify the produced map's correctness. A comparative analysis was performed over a number of image processing and OGM-oriented detectors and descriptors, in order to identify the best combinations for the map evaluation problem, performed between two OGMs or between an OGM and a blueprint map.
Anastasios Tzitzis, Spyros Megalou, Stavroula Siachalou, Emmanouil Tsardoulias, Athanasios Kehagias, Traianos Yioultsis and Antonis Dimitriou
"Localization of RFID Tags by a Moving Robot, via Phase Unwrapping and Non-Linear Optimization"
IEEE Journal of Radio Frequency Identification, 3, (4), pp. 216 - 226, 2019 Aug
In this paper, we propose a new method for the localization of RFID tags, by deploying off-the-shelf RFID equipment on a robotic platform. The constructed robot is capable of performing Simultaneous Localization (of its own position) and Mapping (SLAM) of the environment and then locating the RFID tags around its path. The proposed method is based on properly treating the measured phase of the backscattered signal of each tag at the reader's antenna, located on top of the robot. More specifically, the measured phase samples are reconstructed such that the 2π discontinuities are eliminated (phase unwrapping). This allows for the formation of an optimization problem, which can be solved rapidly by standard methods. The proposed method is experimentally compared against SAR/imaging methods, which represent the accuracy benchmark in prior art deploying off-the-shelf equipment. It is shown that the proposed method solves exactly the same problem as holographic-imaging methods, overcoming the grid-density constraints of the latter. Furthermore, the problem, being independent of a calculation grid, is solved orders of magnitude faster, allowing for the applicability of the method in real-time inventorying and localization. It is also shown that the state-of-the-art SLAM method used for the estimation of the robot's trace also suffers from errors, which directly affect the accuracy of the RFID localization method. Deployment of reference RFID tags at known positions seems to significantly reduce such errors.
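The two core steps, phase unwrapping followed by a least-squares fit, can be sketched as follows. This is a simplified 2D simulation (the actual method is 3D), with an assumed wavelength and phase offset and fabricated geometry; a plain grid search stands in for the standard optimization methods of the paper. Note that once the phases are unwrapped, the cost function is smooth in the candidate position, which is why a coarse search suffices.

```python
import math

WAVELENGTH = 0.33   # assumed UHF RFID wavelength, ~915 MHz
PHASE_OFFSET = 1.0  # assumed unknown cable/hardware phase constant

def unwrap(phases):
    """Remove 2π discontinuities from a sequence of wrapped phase samples."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))  # nearest-multiple shift
        out.append(out[-1] + d)
    return out

def model(pos, path):
    """Round-trip backscatter phase, -4π·d/λ, at each antenna position."""
    return [-4 * math.pi * math.dist(pos, p) / WAVELENGTH for p in path]

def cost(pos, path, unwrapped):
    # The unknown constant offset is eliminated by removing the mean residual.
    r = [u - m for u, m in zip(unwrapped, model(pos, path))]
    mean = sum(r) / len(r)
    return sum((x - mean) ** 2 for x in r)

# S-shaped robot path (non-straight, so the mirror ambiguity collapses).
path = [(0.02 * i, 0.3 * math.sin(2 * 0.02 * i)) for i in range(201)]
tag = (2.0, 1.5)

# Simulated wrapped measurements in [-π, π).
wrapped = [((m + PHASE_OFFSET + math.pi) % (2 * math.pi)) - math.pi
           for m in model(tag, path)]
unwrapped = unwrap(wrapped)

# Coarse grid search over the half-plane y > 0, then a fine refinement.
best = min(((0.1 * x, 0.2 + 0.1 * y) for x in range(41) for y in range(29)),
           key=lambda pos: cost(pos, path, unwrapped))
best = min(((best[0] - 0.15 + 0.01 * i, best[1] - 0.15 + 0.01 * j)
            for i in range(31) for j in range(31)),
           key=lambda pos: cost(pos, path, unwrapped))
print(best)  # close to the true tag position (2.0, 1.5)
```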
Kyriakos C Chatzidimitriou, Michail D Papamichail, Napoleon-Christos I Oikonomou, Dimitrios Lampoudis and Andreas L Symeonidis
"Cenote: A Big Data Management and Analytics Infrastructure for the Web of Things"
IEEE/WIC/ACM International Conference on Web Intelligence, pp. 282-285, ACM, 2019 Oct
In the era of Big Data, Cloud Computing and Internet of Things, most of the existing, integrated solutions that attempt to solve their challenges are either proprietary, limit functionality to a predefined set of requirements, or hide the way data are stored and accessed. In this work we propose Cenote, an open source Big Data management and analytics infrastructure for the Web of Things that overcomes the above limitations. Cenote is built on component-based software engineering principles and provides an all-inclusive solution based on components that work well individually.
Michail D. Papamichail, Themistoklis Diamantopoulos, Vasileios Matsoukas, Christos Athanasiadis and Andreas L. Symeonidis
"Towards Extracting the Role and Behavior of Contributors in Open-source Projects"
Proceedings of the 14th International Conference on Software Technologies - Volume 1: ICSOFT, pp. 536-543, SciTePress, 2019 Jul
Lately, the popular open source paradigm and the adoption of agile methodologies have changed the way software is developed. Effective collaboration within software teams has become crucial for building successful products. In this context, harnessing the data available in online code hosting facilities can help towards understanding how teams work and optimizing the development process. Although there are several approaches that mine contributions’ data, they usually view contributors as a uniform body of engineers, and focus mainly on the aspect of productivity while neglecting the quality of the work performed. In this work, we design a methodology for identifying engineer roles in development teams and determine the behaviors that prevail for each role. Using a dataset of GitHub projects, we perform clustering against the DevOps axis, thus identifying three roles: developers that are mainly preoccupied with code commits, operations engineers that focus on task assignment and acceptance testing, and the lately popular role of DevOps engineers that are a mix of both. Our analysis further extracts behavioral patterns for each role, this way assisting team leaders in knowing their team and effectively directing responsibilities to achieve optimal workload balancing and task allocation.
Kyriakos C. Chatzidimitriou, Michail D. Papamichail, Themistoklis Diamantopoulos, Napoleon-Christos Oikonomou and Andreas L. Symeonidis
"npm Packages as Ingredients: A Recipe-based Approach - Volume 1: ICSOFT"
Proceedings of the 14th International Conference on Software Technologies, pp. 544-551, SciTePress, 2019 Jul
Maria Kotouza, Fotis Psomopoulos and Periklis A. Mitkas
New Trends in Databases and Information Systems, pp. 564-569, Springer International Publishing, Cham, 2019 Sep
Nowadays, a wide range of sciences are moving towards the Big Data era, producing large volumes of data that require processing for new knowledge extraction. Scientific workflows are often the key tools for solving problems characterized by computational complexity and data diversity, whereas cloud computing can effectively facilitate their efficient execution. In this paper, we present a generative big data analysis workflow that can provide analytics, clustering, prediction and visualization services to datasets coming from various scientific fields, by transforming input data into strings. The workflow consists of novel algorithms for data processing and relationship discovery, that are scalable and suitable for cloud infrastructures. Domain experts can interact with the workflow components, set their parameters, run personalized pipelines and have support for decision-making processes. As case studies in this paper, two datasets consisting of (i) Documents and (ii) Gene sequence data are used, showing promising results in terms of efficiency and performance.
Konstantinos Panayiotou, Emmanouil Tsardoulias, Christopher Zolotas, Iason Paraskevopoulos, Alexandra Chatzicharistou, Alexandros Sahinis, Stathis Dimitriadis, Dimitra Ntzioni, Christopher Mpekos, Giannis Manousaridis, Aris Georgoulas and Andreas L. Symeonidis
"Ms Pacman and the Robotic Ghost: A Modern Cyber-Physical Remake of the Famous Pacman Game"
2019 Sixth International Conference on Internet of Things: Systems, Management and Security (IOTSMS), pp. 147-154, 2019 Oct
Christos Psarras, Themistoklis Diamantopoulos and Andreas Symeonidis
"A Mechanism for Automatically Summarizing Software Functionality from Source Code"
Proceedings of the 2019 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 121-130, IEEE, Sofia, Bulgaria, 2019 Jul
When developers search online to find software components to reuse, they usually first need to understand the container projects/libraries, and subsequently identify the required functionality. Several approaches identify and summarize the offerings of projects from their source code; however, they often require that the developer has knowledge of the underlying topic modeling techniques, they do not provide a mechanism for tuning the number of topics, and they offer no control over the top terms for each topic. In this work, we use a vectorizer to extract information from variable/method names and comments, and apply Latent Dirichlet Allocation to cluster the source code files of a project into different semantic topics. The number of topics is optimized based on their purity with respect to project packages, while topic categories are constructed to provide further intuition and Stack Exchange tags are used to express the topics in more abstract terms.
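The vectorization step, extracting terms from variable/method names, can be sketched as follows; the LDA clustering itself (available in standard topic-modeling libraries) is omitted. The snippet, sample source line and regular expressions are illustrative, not the paper's implementation.

```python
import re
from collections import Counter

def tokenize_identifiers(source):
    """Split camelCase / snake_case identifiers found in source code into
    lower-case terms: the kind of input a vectorizer feeds to LDA."""
    words = []
    for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        for part in ident.split("_"):  # snake_case boundary
            # camelCase boundary: a word is an optional capital plus
            # lower-case run, or an all-caps run (e.g. HTTP).
            words += re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", part)
    return [w.lower() for w in words if len(w) > 1]

src = "public int parseHttpHeader(String rawHeader) { return head_len; }"
print(Counter(tokenize_identifiers(src)).most_common(3))
```

Applied over all files of a project, such term counts form the document-term matrix on which the topics are fitted.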
Stavroula Siachalou, Spyros Megalou, Anastasios Tzitzis, Emmanouil Tsardoulias, John Sahalos, Traianos Yioultsis and Antonis Dimitriou
"Robotic Inventorying and Localization of RFID Tags"
2019 IEEE International Conference on RFID Technology and Applications (RFID-TA), pp. 362-367, IEEE, 2019 Sep
In this paper we investigate the performance of phase-based fingerprinting for the localization of RFID-tagged items in warehouses and large retail stores, by deploying ground and aerial RFID-equipped robots. The measured phases of the target RFID tags, collected along a given robot's trajectory, are compared to the corresponding phase measurements of reference RFID tags, i.e. tags placed at known locations. The advantage of the method is that it does not need to estimate the robot's trajectory, since the estimation is carried out by comparing phase measurements collected at neighboring time intervals. This is of paramount importance for an RFID-equipped drone, destined to fly indoors, since its weight should be kept as low as possible, in order to keep its diameter correspondingly small. The phase measurements are initially unwrapped and then fingerprinting is applied. We compare phase fingerprinting with RSSI-based fingerprinting. Phase fingerprinting is significantly more accurate, because of the shape of the phase function, which is typically U-shaped, with its minimum measured at the point of the trajectory where the robot-tag distance is minimized. Experimental accuracy of 15 cm is typically achieved, depending on the density of the reference tags' grid.
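The fingerprinting step can be sketched in a simplified 2D form: the target tag's unwrapped phase profile is matched against the profiles of reference tags at known positions, and the best-matching reference gives the position estimate. Wavelength, geometry and the matching rule below are illustrative assumptions, and unwrapping is taken as already done.

```python
import math

WAVELENGTH = 0.33  # assumed UHF RFID wavelength

def phase_profile(tag, path):
    """Unwrapped (U-shaped) backscatter phase along the robot path; each
    tag also carries its own unknown constant offset, ignored here."""
    return [-4 * math.pi * math.dist(tag, p) / WAVELENGTH for p in path]

def rms_diff(a, b):
    # Per-tag phase constants are unknown, so match profile shapes, not
    # levels: subtract the mean difference before computing the RMS error.
    diffs = [x - y for x, y in zip(a, b)]
    mean = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))

path = [(0.02 * i, 0.0) for i in range(151)]    # robot trajectory samples
refs = {x / 4: (x / 4, 1.0) for x in range(9)}  # reference tags, 0.25 m grid
target = phase_profile((1.3, 1.0), path)        # tag to localize (unknown)

# Estimate: the reference tag whose phase fingerprint matches best.
est = min(refs, key=lambda x: rms_diff(target, phase_profile(refs[x], path)))
print(est)  # nearest reference position, 1.25
```

Accuracy is bounded by the reference-grid spacing, which matches the paper's observation that the achieved error depends on the density of the reference tags' grid.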
Anastasios Tzitzis, Spyros Megalou, Stavroula Siachalou, Emmanouil Tsardoulias, Traianos Yioultsis and Antonis Dimitriou
"3D Localization of RFID Tags with a Single Antenna by a Moving Robot and” Phase ReLock”"
2019 IEEE International Conference on RFID Technology and Applications (RFID-TA), pp. 273-278, IEEE, 2019 Sep
In this paper, we propose a novel method for the three-dimensional (3D) localization of RFID tags, by deploying a single RFID antenna on a robotic platform. The constructed robot is capable of performing Simultaneous Localization (of its own position) and Mapping (SLAM) of the environment and then locating the tags around its path. The proposed method exploits the unwrapped measured phase of the backscattered signal, in such a manner that the localization problem can be solved rapidly by standard optimization methods. The three-dimensional solution is accomplished with a single antenna on top of the robot, by forcing the robot to traverse non-straight paths (e.g. s-shaped) along the environment. It is proven theoretically and experimentally that any non-straight path reduces the locus of possible solutions to only two points in 3D space, instead of the circle that represents the corresponding locus for typical straight robot trajectories. As a consequence, by applying our proposed method “Phase ReLock” along the known half-plane of the search space, the unique solution is rapidly found. We experimentally compare our method against the “holographic” method, which represents the accuracy benchmark in prior art deploying commercial off-the-shelf (COTS) equipment. Both algorithms find the unique solution, as expected. Furthermore, “Phase ReLock” overcomes the calculation-grid constraints of the latter. Thus, better accuracy is achieved, while, more importantly, Phase ReLock is orders of magnitude faster, allowing for the applicability of the method in real-time inventorying and localization.
Konstantinos N. Vavliakis, George Katsikopoulos and Andreas L. Symeonidis
"E-commerce Personalization with Elasticsearch"
International Workshop on Web Search and Data Mining in conjunction with The 10th International Conference on Ambient Systems, Networks and Technologies (ANT 2019), Leuven, Belgium, 2019 Apr
Personalization techniques are constantly gaining traction among e-commerce retailers, since major advancements have been made at research level and the benefits are clear and pertinent. However, effectively applying personalization in real life is a challenging task, since the proper mixture of technology, data and content is complex and differs between organizations. In fact, personalization applications such as personalized search remain largely unfulfilled, especially by small and medium sized retailers, due to time and space limitations. In this paper we propose a novel approach for near real-time personalized e-commerce search that provides improved personalized results within the limited accepted time frames required for online browsing. We propose combining features such as product popularity, user interests, and query-product relevance with collaborative filtering, and implement our solution in Elasticsearch in order to achieve acceptable execution timings. We evaluate our approach against a publicly available dataset, as well as a running e-commerce store.
Christoforos Zolotas, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"RESTsec: a low-code platform for generating secure by design enterprise services"
Enterprise Information Systems, pp. 1-27, 2018 Mar
In the modern business world it is increasingly often that Enterprises opt to bring their business model online, in their effort to reach out to more end users and increase their customer base. While transitioning to the new model, enterprises consider securing their data of pivotal importance. In fact, many efforts have been introduced to automate this ‘webification’ process; however, they all fall short in some aspect: a) they either generate only the security infrastructure, assigning implementation to the developers, b) they embed mainstream, less powerful authorisation schemes, or c) they disregard the merits of the dominating REST architecture and adopt less suitable approaches. In this paper we present RESTsec, a Low-Code platform that supports rapid security requirements modelling for Enterprise Services, abiding by the state of the art ABAC authorisation scheme. RESTsec enables the developer to seamlessly embed the desired access control policy and generate the service, the security infrastructure and the code. Evaluation shows that our approach is valid and can help developers deliver secure by design enterprise services in a rapid and automated manner.
George Mamalakis, Christos Diou, Andreas L. Symeonidis and Leonidas Georgiadis
"Of daemons and men: reducing false positive rate in intrusion detection systems with file system footprint analysis"
Neural Computing and Applications, 2018 May
In this work, we propose a methodology for reducing false alarms in file system intrusion detection systems, by taking into account the daemon’s file system footprint. More specifically, we experimentally show that sequences of outliers can serve as a distinguishing characteristic between true and false positives, and we show how analysing sequences of outliers can lead to lower false positive rates, while maintaining high detection rates. Based on this analysis, we developed an anomaly detection filter that learns outlier sequences using k-nearest neighbours with normalised longest common subsequence. Outlier sequences are then used as a filter to reduce false positives on the FI2DS file system intrusion detection system. This filter is evaluated on both overlapping and non-overlapping sequences of outliers. In both cases, experiments performed on three real-world web servers and a honeynet show that our approach achieves significant false positive reduction rates (up to 50 times), without any degradation of the corresponding true positive detection rates.
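The filtering idea above can be sketched compactly: classify a new sequence of outlier events as benign or malicious by k-nearest neighbours under a normalised longest-common-subsequence (LCS) similarity. The event encoding and training data below are invented for illustration; the paper applies this to FI2DS alerts.

```python
# Minimal sketch of k-NN over outlier sequences with normalised LCS,
# the core of the false-positive filter described in the abstract.
from collections import Counter

def lcs_length(a, b):
    """Classic dynamic-programming longest-common-subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def normalised_lcs(a, b):
    """LCS length normalised to [0, 1] by the longer sequence's length."""
    return lcs_length(a, b) / max(len(a), len(b)) if (a or b) else 1.0

def knn_label(query, labelled_seqs, k=3):
    """Majority vote over the k most similar labelled outlier sequences."""
    ranked = sorted(labelled_seqs, key=lambda s: normalised_lcs(query, s[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data: sequences of file-system event codes with known labels.
train = [("ABAB", "benign"), ("ABBA", "benign"), ("XYZX", "attack"),
         ("XXYZ", "attack"), ("ABBB", "benign")]
print(knn_label("ABAA", train))  # → benign (nearest neighbours are benign)
```

An alert whose outlier sequence is voted benign would be suppressed, reducing false positives without touching the underlying detector.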
Sotirios-Filippos Tsarouchis, Maria Th. Kotouza, Fotis E. Psomopoulos and Pericles A. Mitkas
"A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences"
IFIP International Conference on Artificial Intelligence Applications and Innovations, pp. 189-199, Springer, Cham, 2018 May
The identification of meaningful groups of proteins has always been a major area of interest for structural and functional genomics. Successful protein clustering can lead to significant insight, assisting in both tracing the evolutionary history of the respective molecules as well as in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences, which allows the construction of subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method utilizes the metrics of sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied on a real-world dataset consisting of clonotypic immunoglobulin (IG) sequences from Chronic lymphocytic leukemia (CLL) patients, showing promising results.
Kyriakos C. Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos and Andreas L. Symeonidis
"npm-miner: An Infrastructure for Measuring the Quality of the npm Registry"
MSR ’18: 15th International Conference on Mining Software Repositories, pp. 4, ACM, Gothenburg, Sweden, 2018 May
Themistoklis Diamantopoulos, Georgios Karagiannopoulos and Andreas Symeonidis
"CodeCatch: Extracting Source Code Snippets from Online Sources"
IEEE/ACM 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE), pp. 21-27, 2018 May
Anastasios Dimanidis, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"A Natural Language Driven Approach for Automated Web API Development: Gherkin2OAS"
WWW ’18 Companion: The 2018 Web Conference Companion, pp. 6, Lyon, France, 2018 Apr
Speeding up the development process of Web Services, while adhering to high quality software standards, is a typical requirement in the software industry. This is why industry specialists usually suggest "driven by" development approaches to tackle this problem. In this paper, we propose such a methodology that employs Specification Driven Development and Behavior Driven Development in order to facilitate the phases of Web Service requirements elicitation and specification. Furthermore, we introduce gherkin2OAS, a software tool that aspires to bridge the aforementioned development approaches. Through the suggested methodology and tool, one may design and build RESTful services fast, while ensuring proper functionality.
Maria Th. Kotouza, Konstantinos N. Vavliakis, Fotis E. Psomopoulos and Pericles A. Mitkas
"A Hierarchical Multi-Metric Framework for Item Clustering"
5th International Conference on Big Data Computing Applications and Technologies, pp. 191-197, IEEE/ACM, Zurich, Switzerland, 2018 Dec
Item clustering is commonly used for dimensionality reduction, uncovering item similarities and connections, gaining insights into the market structure, and producing recommendations. Hierarchical clustering methods produce a hierarchy structure along with the clusters, which can be useful for managing item categories and sub-categories, dealing with indirect competition, and categorizing new items as well. Nevertheless, baseline hierarchical clustering algorithms have high computational cost and memory usage. In this paper we propose an innovative scalable hierarchical clustering framework which overcomes these limitations. Our work consists of a binary tree construction algorithm that creates a hierarchy of the items using three metrics, a) Identity, b) Similarity and c) Entropy, as well as a branch breaking algorithm which composes the final clusters by applying thresholds to each branch of the tree. The proposed framework is evaluated on the popular MovieLens 20M dataset, achieving significant reduction in both memory consumption and computational time over a baseline hierarchical clustering algorithm.
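The branch breaking step can be illustrated in isolation: walk a precomputed binary hierarchy and emit a flat cluster wherever a node is tight enough. The tree encoding (nested tuples of (score, left, right), leaves as item lists) and the single dissimilarity score standing in for the paper's three metrics are simplifications for the sketch.

```python
# Illustrative branch-breaking sketch: cut a binary cluster hierarchy
# into flat clusters by thresholding each branch's dissimilarity score.

def collect(node):
    """Flatten all items under a node into one cluster."""
    if isinstance(node, list):
        return node
    _, left, right = node
    return collect(left) + collect(right)

def break_branches(node, threshold):
    """Return the list of clusters obtained by cutting the tree."""
    if isinstance(node, list):          # leaf: already a cluster of items
        return [node]
    score, left, right = node
    if score <= threshold:              # branch is tight enough: keep whole
        return [collect(node)]
    return break_branches(left, threshold) + break_branches(right, threshold)

# Toy hierarchy of six items with a dissimilarity score at each inner node.
tree = (0.9,
        (0.2, ["a", "b"], ["c"]),
        (0.7, ["d"], (0.1, ["e"], ["f"])))
print(break_branches(tree, 0.3))  # → [['a', 'b', 'c'], ['d'], ['e', 'f']]
```

Raising the threshold merges more branches into fewer, coarser clusters; lowering it recovers the fine-grained leaves.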
Michail Papamichail, Themistoklis Diamantopoulos, Ilias Chrysovergis, Philippos Samlidis and Andreas Symeonidis
"User-Perceived Reusability Estimation based on Analysis of Software Repositories"
Proceedings of the 2018 Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), 2018 Mar
The popularity of open-source software repositories has led to a new reuse paradigm, where online resources can be thoroughly analyzed to identify reusable software components. Obviously, assessing the quality and specifically the reusability potential of source code residing in open software repositories poses a major challenge for the research community. Although several systems have been designed towards this direction, most of them do not focus on reusability. In this paper, we define and formulate a reusability score by employing information from GitHub stars and forks, which indicate the extent to which software components are adopted/accepted by developers. Our methodology involves applying and assessing different state-of-the-practice machine learning algorithms, in order to construct models for reusability estimation at both class and package levels. Preliminary evaluation of our methodology indicates that our approach can successfully assess reusability, as perceived by developers.
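As a hedged illustration of the kind of target variable formulated above, a reusability score can be derived from a repository's GitHub stars and forks; the exact formula is the authors', and the log-scaled, normalised blend below is only an illustrative stand-in.

```python
# Sketch only: blend log-scaled stars and forks into a [0, 1] reusability
# target, normalised against the most popular repository in a corpus.
import math

def reusability_score(stars, forks, max_stars, max_forks, w_stars=0.6):
    """Weighted blend of log-scaled, normalised stars and forks."""
    s = math.log1p(stars) / math.log1p(max_stars)  # log1p dampens very popular repos
    f = math.log1p(forks) / math.log1p(max_forks)
    return w_stars * s + (1 - w_stars) * f

# Example: score one repository relative to corpus-wide maxima.
score = reusability_score(stars=1200, forks=300, max_stars=50000, max_forks=9000)
assert 0.0 <= score <= 1.0
```

Such a score can then serve as the label when training class- and package-level reusability models on static analysis metrics.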
Valasia Dimaridou, Alexandros-Charalampos Kyprianidis, Michail Papamichail, Themistoklis Diamantopoulos and Andreas Symeonidis
Chapter 1, pp. 25, Springer, 2018 Jan
Nowadays, developers tend to adopt a component-based software engineering approach, reusing own implementations and/or resorting to third-party source code. This practice is in principle cost-effective, however it may also lead to low quality software products, if the components to be reused exhibit low quality. Thus, several approaches have been developed to measure the quality of software components. Most of them, however, rely on the aid of experts for defining target quality scores and deriving metric thresholds, leading to results that are context-dependent and subjective. In this work, we build a mechanism that employs static analysis metrics extracted from GitHub projects and defines a target quality score based on repositories’ stars and forks, which indicate their adoption/acceptance by developers. Upon removing outliers with a one-class classifier, we employ Principal Feature Analysis and examine the semantics among metrics to provide an analysis on five axes for source code components (classes or packages): complexity, coupling, size, degree of inheritance, and quality of documentation. Neural networks are thus applied to estimate the final quality score given metrics from these axes. Preliminary evaluation indicates that our approach effectively estimates software quality at both class and package levels.
Themistoklis Diamantopoulos, Michael Roth, Andreas Symeonidis and Ewan Klein
"Software requirements as an application domain for natural language processing"
Language Resources and Evaluation, pp. 1-30, 2017 Feb
Mapping functional requirements first to specifications and then to code is one of the most challenging tasks in software development. Since requirements are commonly written in natural language, they can be prone to ambiguity, incompleteness and inconsistency. Structured semantic representations allow requirements to be translated to formal models, which can be used to detect problems at an early stage of the development process through validation. Storing and querying such models can also facilitate software reuse. Several approaches constrain the input format of requirements to produce specifications, however they usually require considerable human effort in order to adopt domain-specific heuristics and/or controlled languages. We propose a mechanism that automates the mapping of requirements to formal representations using semantic role labeling. We describe the first publicly available dataset for this task, employ a hierarchical framework that allows requirements concepts to be annotated, and discuss how semantic role labeling can be adapted for parsing software requirements.
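A toy illustration (not the authors' SRL pipeline) of the target representation: mapping a simple functional requirement of the form "The <actor> must <action> <object>" to a predicate-argument frame, the kind of structure the paper derives with semantic role labeling. The pattern and role names are assumptions for the sketch.

```python
# Toy rule-based sketch of requirement-to-frame mapping; real semantic
# role labeling generalises far beyond this single template.
import re

PATTERN = re.compile(r"^The (?P<agent>[\w ]+?) must (?P<action>\w+) (?P<theme>[\w ]+)\.$")

def parse_requirement(text):
    """Map a templated requirement sentence to a predicate-argument frame."""
    m = PATTERN.match(text)
    if not m:
        return None  # sentence does not fit the toy template
    return {"predicate": m.group("action"),
            "agent": m.group("agent"),
            "theme": m.group("theme")}

frame = parse_requirement("The system must store user preferences.")
print(frame)  # → {'predicate': 'store', 'agent': 'system', 'theme': 'user preferences'}
```

Frames like this are what make requirements queryable and comparable, enabling the validation and reuse scenarios the abstract describes.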
Themistoklis Diamantopoulos and Andreas Symeonidis
Enterprise Information Systems, pp. 1-22, 2017 Dec
Enhancing the requirements elicitation process has always been of added value to software engineers, since it expedites the software lifecycle and reduces errors in the conceptualization phase of software products. The challenge posed to the research community is to construct formal models that are capable of storing requirements from multimodal formats (text and UML diagrams) and promote easy requirements reuse, while at the same time being traceable to allow full control of the system design, as well as comprehensible to software engineers and end users. In this work, we present an approach that enhances requirements reuse while capturing the static (functional requirements, use case diagrams) and dynamic (activity diagrams) view of software projects. Our ontology-based approach allows for reasoning over the stored requirements, while the mining methodologies employed detect incomplete or missing software requirements, this way reducing the effort required for requirements elicitation at an early stage of the project lifecycle.
A. Thallas, E.G. Tsardoulias and L. Petrou
"Topological Based Scan Matching – Odometry Posterior Sampling in RBPF Under Kinematic Model Failures"
Journal of Intelligent & Robotic Systems, 91, pp. 543-568, 2017 Nov
Rao-Blackwellized Particle Filters (RBPF) have been utilized to provide a solution to the SLAM problem. One of the main factors that cause RBPF failure is potential particle impoverishment. Another popular approach to the SLAM problem is scan matching, which yields good results in environments rich in information but fails in the absence thereof. To address these issues, the current work presents techniques to combine Rao-Blackwellized particle filters with a scan matching algorithm (CRSM SLAM). The particle filter maintains the correct hypothesis in environments lacking features, while CRSM is employed in feature-rich environments, simultaneously reducing the particle filter's dispersion. Since CRSM's good performance depends on its high iteration frequency, a multi-threaded combination is presented which allows CRSM to operate while the RBPF updates its particles. Additionally, a novel method utilizing topological information is proposed in order to reduce the number of particle filter resamplings. Finally, we present methods to address anomalous situations where scan matching cannot be performed and the vehicle displays behaviors not modeled by the kinematic model, which would cause the whole method to collapse. Numerous experiments are conducted to support the advantages of the aforementioned methods.
Miltiadis G. Siavvas, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"QATCH - An adaptive framework for software product quality assessment"
Expert Systems with Applications, 2017 May
The subjectivity that underlies the notion of quality does not allow the design and development of a universally accepted mechanism for software quality assessment. This is why contemporary research is now focused on seeking mechanisms able to produce software quality models that can be easily adjusted to custom user needs. In this context, we introduce QATCH, an integrated framework that applies static analysis to benchmark repositories in order to generate software quality models tailored to stakeholder specifications. Fuzzy multi-criteria decision-making is employed in order to model the uncertainty imposed by experts’ judgments. These judgments can be expressed into linguistic values, which makes the process more intuitive. Furthermore, a robust software quality model, the base model, is generated by the system, which is used in the experiments for QATCH system verification. The paper provides an extensive analysis of QATCH and thoroughly discusses its validity and added value in the field of software quality through a number of individual experiments.
Athanassios M. Kintsakis, Fotis E. Psomopoulos, Andreas L. Symeonidis and Pericles A. Mitkas
"Hermes: Seamless delivery of containerized bioinformatics workflows in hybrid cloud (HTC) environments"
SoftwareX, 6, pp. 217-224, 2017 Sep
Hermes introduces a new "describe once, run anywhere" paradigm for the execution of bioinformatics workflows in hybrid cloud environments. It combines the traditional features of parallelization-enabled workflow management systems and of distributed computing platforms in a container-based approach. It offers seamless deployment, overcoming the burden of setting up and configuring the software and network requirements. Most importantly, Hermes fosters the reproducibility of scientific workflows by supporting standardization of the software execution environment, thus leading to consistent scientific workflow results and accelerating scientific output.
Cezary Zielinski, Maciej Stefanczyk, Tomasz Kornuta, Maksym Figat, Wojciech Dudek, Wojciech Szynkiewicz, Wlodzimierz Kasprzak, Jan Figat, Marcin Szlenk, Tomasz Winiarski, Konrad Banachowicz, Teresa Zielinska, Emmanouil G. Tsardoulias, Andreas L. Symeonidis, Fotis E. Psomopoulos, Athanassios M. Kintsakis, Pericles A. Mitkas, Aristeidis Thallas, Sofia E. Reppou, George T. Karagiannis, Konstantinos Panayiotou, Vincent Prunet, Manuel Serrano, Jean-Pierre Merlet, Stratos Arampatzis, Alexandros Giokas, Lazaros Penteridis, Ilias Trochidis, David Daney and Miren Iturburu
"Variable structure robot control systems: The RAPP approach"
Robotics and Autonomous Systems, 94, pp. 226-244, 2017 May
This paper presents a method of designing variable structure control systems for robots. As on-board robot computational resources are limited, while in some cases the demands imposed on the robot by the user are virtually limitless, the solution is to produce a variable structure system. The task-dependent part has to be exchanged, yet the task governs the activities of the robot. Thus not only must some task-dependent modules be exchanged, but supervisory responsibilities also have to be switched. Such control systems are necessary in the case of robot companions, whose owners may demand many services from them.
Fotis Psomopoulos and Pericles Mitkas
"Handbook of Research on Computational Grid Technologies for Life Sciences, Biomedicine, and Healthcare"
Vol. 2, UK: IGI Global, Catanzaro, Italy, 2009 May
Andreas Symeonidis and Pericles A. Mitkas
"Agent Intelligence Through Data Mining (Multiagent Systems, Artificial Societies, and Simulated Organizations)"
Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2005 Jul