Publications



2023

Inbooks

Thomas Karanikiotis and Andreas L. Symeonidis
"Towards Extracting Reusable and Maintainable Code Snippets"
In Fill, Hans-Georg; van Sinderen, Marten; Maciaszek, Leszek A. (Eds.): Communications in Computer and Information Science, vol. 1859, pp. 187-206, Springer International Publishing, Cham, July 2023

Given the wide adoption of the agile software development paradigm, where efficient collaboration as well as effective maintenance are of utmost importance, and the (re)use of software residing in code hosting platforms, the need to produce qualitative code is evident. A condition for acceptable software reusability and maintainability is the use of idiomatic code, based on syntactic fragments that recur frequently across software projects and are characterized by high quality. In this work, we propose a methodology that can harness data from the most popular GitHub repositories in order to automatically identify reusable and maintainable code idioms, by grouping code blocks that have similar structural and semantic information. We also apply the same methodology on a single-project level, in an attempt to identify frequently recurring blocks of code across the files of a team. Preliminary evaluation of our methodology indicates that our approach can identify commonly used, reusable and maintainable code idioms and code blocks that can be effectively given as actionable recommendations to the developers.

@inbook{icsoft2022karanikiotisbook,
author={Thomas Karanikiotis and Andreas L. Symeonidis},
title={Towards Extracting Reusable and Maintainable Code Snippets},
chapter={-},
editor={Fill, Hans-Georg and van Sinderen, Marten and Maciaszek, Leszek A.},
volume={1859},
pages={187-206},
publisher={Springer International Publishing},
series={Communications in Computer and Information Science},
address={Cham},
year={2023},
month={07},
date={2023-07-19},
url={https://doi.org/10.1007/978-3-031-37231-5_9},
doi={10.1007/978-3-031-37231-5_9},
isbn={978-3-031-37231-5},
keywords={Software engineering;Code Idioms;Syntactic Fragment;Software Reusability;Software Maintainability;Software repositories},
abstract={Given the wide adoption of the agile software development paradigm, where efficient collaboration as well as effective maintenance are of utmost importance, and the (re)use of software residing in code hosting platforms, the need to produce qualitative code is evident. A condition for acceptable software reusability and maintainability is the use of idiomatic code, based on syntactic fragments that recur frequently across software projects and are characterized by high quality. In this work, we propose a methodology that can harness data from the most popular GitHub repositories in order to automatically identify reusable and maintainable code idioms, by grouping code blocks that have similar structural and semantic information. We also apply the same methodology on a single-project level, in an attempt to identify frequently recurring blocks of code across the files of a team. Preliminary evaluation of our methodology indicates that our approach can identify commonly used, reusable and maintainable code idioms and code blocks that can be effectively given as actionable recommendations to the developers.}
}
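
The abstract above describes grouping code blocks with similar structural and semantic information in order to surface reusable, maintainable idioms; in the paper the blocks come from the most popular GitHub repositories or from a single project's files. The snippet below is a minimal sketch of that grouping step only, assuming a small in-memory snippet corpus, an AST-node-type fingerprint and a greedy similarity threshold; these choices are illustrative and are not taken from the paper.

# Minimal sketch: group code blocks by structural similarity to surface
# candidate "idioms" (frequently recurring fragments). The corpus, the
# fingerprint and the 0.8 threshold are illustrative assumptions.
import ast
from difflib import SequenceMatcher


def structural_fingerprint(source: str) -> list[str]:
    """Reduce a code block to its sequence of AST node types."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def similarity(a: list[str], b: list[str]) -> float:
    return SequenceMatcher(None, a, b).ratio()


def group_idiom_candidates(snippets: list[str], threshold: float = 0.8):
    """Greedy grouping: a snippet joins the first group whose representative
    is structurally similar enough, otherwise it starts a new group."""
    groups: list[list[str]] = []
    fingerprints: list[list[str]] = []
    for snippet in snippets:
        fp = structural_fingerprint(snippet)
        for rep_fp, group in zip(fingerprints, groups):
            if similarity(fp, rep_fp) >= threshold:
                group.append(snippet)
                break
        else:
            fingerprints.append(fp)
            groups.append([snippet])
    # Groups with many members are candidate idioms worth reviewing.
    return sorted(groups, key=len, reverse=True)


if __name__ == "__main__":
    corpus = [
        "with open(path) as f:\n    data = f.read()",
        "with open(p) as fh:\n    text = fh.read()",
        "for i in range(10):\n    print(i)",
    ]
    for group in group_idiom_candidates(corpus):
        print(len(group), "structurally similar block(s)")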

2022

Inbooks

Thomas Karanikiotis, Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis
"A Personalized Code Formatter: Detection and Fixing"
In Fill, Hans-Georg; van Sinderen, Marten; Maciaszek, Leszek A. (Eds.): Communications in Computer and Information Science, vol. 1622, pp. 169-192, Springer International Publishing, Cham, July 2022

The wide adoption of component-based software development and the (re)use of software residing in code hosting platforms have led to an increased interest shown towards source code readability and comprehensibility. One factor that can undeniably improve readability is the consistent code styling and formatting used across a project. To that end, many code formatting approaches usually define a set of rules, in order to model a commonly accepted formatting. However, this approach is mostly based on the experts’ expertise, is time-consuming and ignores the specific styling and formatting a team selects to use. Thus, it becomes too intrusive and may be not adopted. In this work, we present an automated mechanism that can be trained to identify deviations from the selected formatting style of a given project, given a set of source code files, and provide recommendations towards maintaining a common styling across all files of the project. At first, source code is transformed into small meaningful pieces, called tokens, which are used to train the models of our mechanism, in order to predict the probability of a token being wrongly positioned. Then, a number of possible fixes are examined as replacements of the wrongly positioned token and, based on a scoring function, the most suitable fixes are given as recommendations to the developer. Preliminary evaluation on various axes indicates that our approach can effectively detect formatting deviations from the project’s code styling and provide actionable recommendations to the developer.

@inbook{icsoft2021karanikiotisbook,
author={Thomas Karanikiotis and Kyriakos C. Chatzidimitriou and Andreas L. Symeonidis},
title={A Personalized Code Formatter: Detection and Fixing},
chapter={-},
editor={Fill, Hans-Georg and van Sinderen, Marten and Maciaszek, Leszek A.},
volume={1622},
pages={169-192},
publisher={Springer International Publishing},
series={Communications in Computer and Information Science},
address={Cham},
year={2022},
month={07},
date={2022-07-18},
url={https://doi.org/10.1007/978-3-031-11513-4_8},
doi={10.1007/978-3-031-11513-4_8},
isbn={978-3-031-11513-4},
keywords={Source Code Formatting;Source Code Readability;LSTM;SVM One-Class;Code styling},
abstract={The wide adoption of component-based software development and the (re)use of software residing in code hosting platforms have led to an increased interest shown towards source code readability and comprehensibility. One factor that can undeniably improve readability is the consistent code styling and formatting used across a project. To that end, many code formatting approaches usually define a set of rules, in order to model a commonly accepted formatting. However, this approach is mostly based on the experts’ expertise, is time-consuming and ignores the specific styling and formatting a team selects to use. Thus, it becomes too intrusive and may be not adopted. In this work, we present an automated mechanism that can be trained to identify deviations from the selected formatting style of a given project, given a set of source code files, and provide recommendations towards maintaining a common styling across all files of the project. At first, source code is transformed into small meaningful pieces, called tokens, which are used to train the models of our mechanism, in order to predict the probability of a token being wrongly positioned. Then, a number of possible fixes are examined as replacements of the wrongly positioned token and, based on a scoring function, the most suitable fixes are given as recommendations to the developer. Preliminary evaluation on various axes indicates that our approach can effectively detect formatting deviations from the project’s code styling and provide actionable recommendations to the developer.}
}
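
The mechanism summarised above first predicts the probability of a token being wrongly positioned with respect to the project's own formatting style and then scores candidate fixes. The sketch below illustrates only the detection half, substituting a simple bigram frequency model over (token kind, preceding-space) events for the paper's LSTM and one-class SVM models; the training files, the event encoding and the threshold are illustrative assumptions.

# Minimal sketch: flag formatting that deviates from a project's own style
# using a bigram count over (token kind, spaces-before-token) events. This
# stands in for the paper's LSTM / one-class SVM models; the encoding and
# threshold are illustrative assumptions.
import io
import tokenize
from collections import Counter


def formatting_events(source: str):
    """Yield (token kind, spaces before the token on its line) pairs."""
    prev_end = (1, 0)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER):
            prev_end = tok.end
            continue
        same_line = tok.start[0] == prev_end[0]
        gap = tok.start[1] - prev_end[1] if same_line else tok.start[1]
        yield (tokenize.tok_name[tok.type], gap)
        prev_end = tok.end


def train(project_files: list[str]) -> Counter:
    """Count event bigrams across the files that define the project's style."""
    counts: Counter = Counter()
    for source in project_files:
        events = list(formatting_events(source))
        counts.update(zip(events, events[1:]))
    return counts


def flag_deviations(counts: Counter, source: str, min_count: int = 1):
    """Report event bigrams never (or rarely) seen in the trained style."""
    events = list(formatting_events(source))
    return [pair for pair in zip(events, events[1:]) if counts[pair] < min_count]


if __name__ == "__main__":
    project = ["x = 1\ny = x + 2\n", "total = x + y\n"]
    style = train(project)
    # The tight spacing in "z=x+y" deviates from the spaced style learned above.
    print(flag_deviations(style, "z=x+y\n"))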

2021

Inbooks

Thomas Karanikiotis, Michail D. Papamichail and Andreas L. Symeonidis
"Multilevel Readability Interpretation Against Software Properties: A Data-Centric Approach"
In van Sinderen, Marten; Maciaszek, Leszek A.; Fill, Hans-Georg (Eds.): Communications in Computer and Information Science, vol. 1447, pp. 203-226, Springer International Publishing, Cham, July 2021

Given the wide adoption of the agile software development paradigm, where efficient collaboration as well as effective maintenance are of utmost importance, the need to produce readable source code is evident. To that end, several research efforts aspire to assess the extent to which a software component is readable. Several metrics and evaluation criteria have been proposed; however, they are mostly empirical or rely on experts who are responsible for determining the ground truth and/or set custom thresholds, leading to results that are context-dependent and subjective. In this work, we employ a large set of static analysis metrics along with various coding violations towards interpreting readability as perceived by developers. Unlike already existing approaches, we refrain from using experts and we provide a fully automated and extendible methodology built upon data residing in online code hosting facilities. We perform static analysis at two levels (method and class) and construct a benchmark dataset that includes more than one million methods and classes covering diverse development scenarios. After performing clustering based on source code size, we employ Support Vector Regression in order to interpret the extent to which a software component is readable against the source code properties: cohesion, inheritance, complexity, coupling, and documentation. The evaluation of our methodology indicates that our models effectively interpret readability as perceived by developers against the above mentioned source code properties.

@inbook{icsoft2020BookChapter,
author={Thomas Karanikiotis and Michail D. Papamichail and Andreas L. Symeonidis},
title={Multilevel Readability Interpretation Against Software Properties: A Data-Centric Approach},
chapter={-},
editor={van Sinderen, Marten and Maciaszek, Leszek A. and Fill, Hans-Georg},
volume={1447},
pages={203-226},
publisher={Springer International Publishing},
series={Communications in Computer and Information Science},
address={Cham},
year={2021},
month={07},
date={2021-07-21},
url={https://doi.org/10.1007/978-3-030-83007-6_10},
doi={10.1007/978-3-030-83007-6_10},
isbn={978-3-030-83007-6},
abstract={Given the wide adoption of the agile software development paradigm, where efficient collaboration as well as effective maintenance are of utmost importance, the need to produce readable source code is evident. To that end, several research efforts aspire to assess the extent to which a software component is readable. Several metrics and evaluation criteria have been proposed; however, they are mostly empirical or rely on experts who are responsible for determining the ground truth and/or set custom thresholds, leading to results that are context-dependent and subjective. In this work, we employ a large set of static analysis metrics along with various coding violations towards interpreting readability as perceived by developers. Unlike already existing approaches, we refrain from using experts and we provide a fully automated and extendible methodology built upon data residing in online code hosting facilities. We perform static analysis at two levels (method and class) and construct a benchmark dataset that includes more than one million methods and classes covering diverse development scenarios. After performing clustering based on source code size, we employ Support Vector Regression in order to interpret the extent to which a software component is readable against the source code properties: cohesion, inheritance, complexity, coupling, and documentation. The evaluation of our methodology indicates that our models effectively interpret readability as perceived by developers against the above mentioned source code properties.}
}
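
The methodology above clusters components by size and then trains Support Vector Regression models that relate readability to cohesion, inheritance, complexity, coupling and documentation. The sketch below shows that modelling step with scikit-learn on synthetic data; the feature layout, the number of size clusters and the hyperparameters are illustrative assumptions, not the paper's setup.

# Minimal sketch: size-based clustering followed by per-cluster Support
# Vector Regression of a readability score on five metric axes. The
# synthetic data and all parameters are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Columns: cohesion, inheritance, complexity, coupling, documentation, size.
X = rng.random((500, 6))
X[:, 5] *= 1000                                    # pretend the last column is lines of code
y = 0.6 * X[:, 0] - 0.3 * X[:, 2] + 0.2 * X[:, 4] + rng.normal(0, 0.05, 500)

# Cluster components by size so each regressor sees comparable components.
size_model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X[:, [5]])

models = {}
for cluster in np.unique(size_model.labels_):
    mask = size_model.labels_ == cluster
    svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0))
    svr.fit(X[mask, :5], y[mask])                  # regress readability on the five axes
    models[cluster] = svr

# Score an unseen component: route it to its size cluster's regressor.
component = rng.random((1, 6))
component[0, 5] = 250.0
cluster = size_model.predict(component[:, [5]])[0]
print("interpreted readability:", float(models[cluster].predict(component[:, :5])[0]))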

2020

Inbooks

Antonis G. Dimitriou, Stavroula Siachalou, Emmanouil Tsardoulias and Loukas Petrou
"Robotics Meets RFID for Simultaneous Localization (of Robots and Objects) and Mapping (SLAM) – A Joined Problem"
Chapter 7, John Wiley & Sons, Inc., February 2020

Localization of wirelessly powered devices is essential for many applications related to the Internet of Things and Ubiquitous Computing. The chapter is focused on deploying a moving robotic platform, i.e. a robot, which hosts radio frequency identification (RFID) equipment and aims to locate passive RFID tags attached on objects in the surrounding area. The robot hosts additional sensors, namely lidar and depth cameras, enabling it to perform SLAM – simultaneous localization (of its own location) and mapping of any (including previously unknown) area. Furthermore, it can avoid obstacles, including people and perform and update path planning. Thanks to its movement, the robot collects a huge amount of data related to received signal strength information (RSSI) and phase information of each tag, realizing the concept of a “virtual antenna array”; i.e. a moving antenna at multiple locations. The antenna‐equipped robot behaves similarly to a synthetic‐aperture radar. The main application is continuous inventorying and localization; focusing on warehouse management, large retail stores, libraries, etc. The main advantage of the robotic approach versus static‐reader‐antenna deployments arises from the equivalent cost‐reduction per square meter of target area, since a single robot can circulate continuously around any area, whereas a fixed RFID‐network would necessitate for infrastructure costs analogous to the size of the area. Another advantage is the huge amount of data from different locations (of the robot) available to be exploited for more accurate RFID localization. Compared to a fixed installation, the disadvantage is that the robot does not cover the entire area simultaneously. Depending on the size of the target area and the desired inventorying update rate, additional robots could be deployed. In this chapter, the localization problem is presented and linked to practical applications. Representative prior‐art is analyzed and discussed. The SLAM problem is also discussed, while related state‐of‐the‐art is presented. Moreover, experimental results by an actual robot are demonstrated. A robot collects phase and RSSI measurements by RFID tags. It is shown that positioning accuracy is affected by both robotics' SLAM accuracy as well as the disruption of tags' backscattered signal due to fading. Finally, techniques to improve the system are discussed.

@inbook{etsardouRfid2020,
author={Antonis G. Dimitriou and Stavroula Siachalou and Emmanouil Tsardoulias and Loukas Petrou},
title={Robotics Meets RFID for Simultaneous Localization (of Robots and Objects) and Mapping (SLAM) – A Joined Problem},
chapter={7},
pages={-},
publisher={John Wiley & Sons, Inc.},
year={2020},
month={02},
date={2020-02-04},
url={https://onlinelibrary.wiley.com/doi/abs/10.1002/9781119578598.ch7},
doi={10.1002/9781119578598.ch7},
publisherurl={https://doi.org/10.1002/9781119578598.ch7},
abstract={Localization of wirelessly powered devices is essential for many applications related to the Internet of Things and Ubiquitous Computing. The chapter is focused on deploying a moving robotic platform, i.e. a robot, which hosts radio frequency identification (RFID) equipment and aims to locate passive RFID tags attached on objects in the surrounding area. The robot hosts additional sensors, namely lidar and depth cameras, enabling it to perform SLAM – simultaneous localization (of its own location) and mapping of any (including previously unknown) area. Furthermore, it can avoid obstacles, including people and perform and update path planning. Thanks to its movement, the robot collects a huge amount of data related to received signal strength information (RSSI) and phase information of each tag, realizing the concept of a “virtual antenna array”; i.e. a moving antenna at multiple locations. The antenna‐equipped robot behaves similarly to a synthetic‐aperture radar. The main application is continuous inventorying and localization; focusing on warehouse management, large retail stores, libraries, etc. The main advantage of the robotic approach versus static‐reader‐antenna deployments arises from the equivalent cost‐reduction per square meter of target area, since a single robot can circulate continuously around any area, whereas a fixed RFID‐network would necessitate for infrastructure costs analogous to the size of the area. Another advantage is the huge amount of data from different locations (of the robot) available to be exploited for more accurate RFID localization. Compared to a fixed installation, the disadvantage is that the robot does not cover the entire area simultaneously. Depending on the size of the target area and the desired inventorying update rate, additional robots could be deployed. In this chapter, the localization problem is presented and linked to practical applications. Representative prior‐art is analyzed and discussed. The SLAM problem is also discussed, while related state‐of‐the‐art is presented. Moreover, experimental results by an actual robot are demonstrated. A robot collects phase and RSSI measurements by RFID tags. It is shown that positioning accuracy is affected by both robotics' SLAM accuracy as well as the disruption of tags' backscattered signal due to fading. Finally, techniques to improve the system are discussed.}
}
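
The chapter's "virtual antenna array" idea treats the phase samples collected along the robot's (SLAM-estimated) trajectory as measurements from a moving antenna, much like a synthetic-aperture radar. The sketch below illustrates the principle with a matched filter over a grid of candidate tag positions on simulated data; the trajectory, the ~865 MHz wavelength, the noise level and the grid are illustrative assumptions, not the chapter's actual system.

# Minimal sketch: SAR-style matched filter that combines backscatter phase
# samples gathered along a robot trajectory to localize an RFID tag. The
# geometry, wavelength and noise model are illustrative assumptions.
import numpy as np

WAVELENGTH = 0.345                       # ~865 MHz UHF RFID carrier, in metres
rng = np.random.default_rng(1)

# Robot trajectory (assumed known from SLAM) and the unknown tag position.
robot_xy = np.column_stack([np.linspace(0.0, 4.0, 80), np.zeros(80)])
tag_xy = np.array([2.3, 1.7])

# Backscatter phase covers the two-way path, hence 4*pi*d/lambda, plus noise.
dist = np.linalg.norm(robot_xy - tag_xy, axis=1)
measured_phase = (-4 * np.pi * dist / WAVELENGTH
                  + rng.normal(0, 0.2, dist.size)) % (2 * np.pi)

# Evaluate a candidate grid: the true position makes the phases add coherently.
xs = np.linspace(0.0, 4.0, 200)
ys = np.linspace(0.0, 3.0, 150)
best_score, best_xy = -1.0, None
for x in xs:
    for y in ys:
        d = np.hypot(robot_xy[:, 0] - x, robot_xy[:, 1] - y)
        score = np.abs(np.sum(np.exp(1j * (measured_phase + 4 * np.pi * d / WAVELENGTH))))
        if score > best_score:
            best_score, best_xy = score, (x, y)

# Should land close to (2.3, 1.7), up to grid resolution and mirror ambiguity.
print(f"estimated tag position: ({best_xy[0]:.2f}, {best_xy[1]:.2f})")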

2018

Inbooks

Valasia Dimaridou, Alexandros-Charalampos Kyprianidis, Michail Papamichail, Themistoklis Diamantopoulos and Andreas Symeonidis
"Assessing the User-Perceived Quality of Source Code Components using Static Analysis Metrics"
Chapter 1, pp. 25, Springer, January 2018

Nowadays, developers tend to adopt a component-based software engineering approach, reusing own implementations and/or resorting to third-party source code. This practice is in principle cost-effective, however it may also lead to low quality software products, if the components to be reused exhibit low quality. Thus, several approaches have been developed to measure the quality of software components. Most of them, however, rely on the aid of experts for defining target quality scores and deriving metric thresholds, leading to results that are context-dependent and subjective. In this work, we build a mechanism that employs static analysis metrics extracted from GitHub projects and defines a target quality score based on repositories’ stars and forks, which indicate their adoption/acceptance by developers. Upon removing outliers with a one-class classifier, we employ Principal Feature Analysis and examine the semantics among metrics to provide an analysis on five axes for source code components (classes or packages): complexity, coupling, size, degree of inheritance, and quality of documentation. Neural networks are thus applied to estimate the final quality score given metrics from these axes. Preliminary evaluation indicates that our approach effectively estimates software quality at both class and package levels.

@inbook{Dimaridou2018,
author={Valasia Dimaridou and Alexandros-Charalampos Kyprianidis and Michail Papamichail and Themistoklis Diamantopoulos and Andreas Symeonidis},
title={Assessing the User-Perceived Quality of Source Code Components using Static Analysis Metrics},
chapter={1},
pages={25},
publisher={Springer},
year={2018},
month={01},
date={2018-01-01},
url={https://issel.ee.auth.gr/wp-content/uploads/2019/08/ccis_book_chapter.pdf},
publisherurl={https://www.researchgate.net/publication/325627162_Assessing_the_User-Perceived_Quality_of_Source_Code_Components_Using_Static_Analysis_Metrics},
abstract={Nowadays, developers tend to adopt a component-based software engineering approach, reusing own implementations and/or resorting to third-party source code. This practice is in principle cost-effective, however it may also lead to low quality software products, if the components to be reused exhibit low quality. Thus, several approaches have been developed to measure the quality of software components. Most of them, however, rely on the aid of experts for defining target quality scores and deriving metric thresholds, leading to results that are context-dependent and subjective. In this work, we build a mechanism that employs static analysis metrics extracted from GitHub projects and defines a target quality score based on repositories’ stars and forks, which indicate their adoption/acceptance by developers. Upon removing outliers with a one-class classifier, we employ Principal Feature Analysis and examine the semantics among metrics to provide an analysis on five axes for source code components (classes or packages): complexity, coupling, size, degree of inheritance, and quality of documentation. Neural networks are thus applied to estimate the final quality score given metrics from these axes. Preliminary evaluation indicates that our approach effectively estimates software quality at both class and package levels.}
}
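
The abstract above describes filtering outlier components with a one-class classifier and then estimating a quality score, derived from repository stars and forks, with neural networks over five metric axes. A minimal scikit-learn sketch of those two steps on synthetic data follows; the data, the target construction and the hyperparameters are illustrative assumptions, and the Principal Feature Analysis step is omitted for brevity.

# Minimal sketch: one-class outlier filtering followed by a small neural
# network that regresses a popularity-based quality score on static metrics.
# Synthetic data and hyperparameters are illustrative assumptions; the
# paper's Principal Feature Analysis step is not reproduced here.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)

# Rows: classes/packages. Columns: complexity, coupling, size, inheritance, docs.
X = rng.random((1000, 5))
# A stars/forks-derived target would go here; we fabricate a score in [0, 1].
y = np.clip(0.5 - 0.4 * X[:, 0] - 0.2 * X[:, 1] + 0.5 * X[:, 4]
            + rng.normal(0, 0.05, 1000), 0, 1)

# Drop components the one-class model considers atypical before training.
inliers = OneClassSVM(nu=0.05, gamma="scale").fit_predict(X) == 1

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
model.fit(X[inliers], y[inliers])

sample = rng.random((1, 5))
print("estimated quality score:", float(model.predict(sample)[0]))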

2012

Inbooks

Kiriakos C. Chatzidimitriou, Ioannis Partalas, Pericles A. Mitkas and Ioannis Vlahavas
"Transferring Evolved Reservoir Features in Reinforcement Learning Tasks"
Chapter 1, vol. 7188, pp. 213-224, Springer Berlin Heidelberg, January 2012

Lecture Notes in Artificial Intelligence (LNAI)

@inbook{2012ChatzidimitriouLNAI,
author={Kiriakos C. Chatzidimitriou and Ioannis Partalas and Pericles A. Mitkas and Ioannis Vlahavas},
title={Transferring Evolved Reservoir Features in Reinforcement Learning Tasks},
chapter={1},
volume={7188},
pages={213-224},
publisher={Springer Berlin Heidelberg},
year={2012},
month={01},
date={2012-01-01},
url={http://issel.ee.auth.gr/wp-content/uploads/2017/01/Transferring-Evolved-Reservoir-Features-in-Reinforcement-Learning-Tasks.pdf},
publisherurl={http://issel.ee.auth.gr/wp-content/uploads/publications/chp_LNAI.pdf},
abstract={Lecture Notes in Artificial Intelligence (LNAI)}
}

Andreas L. Symeonidis, Panagiotis Toulis and Pericles A. Mitkas
"Supporting Agent-Oriented Software Engineering for Data Mining Enhanced Agent Development"
Chapter 1, vol. 7607, pp. 7-21, Springer Berlin Heidelberg, June 2012

Lecture Notes in Computer Science

@inbook{2012SymeonidisLNCS,
author={Andreas L. Symeonidis and Panagiotis Toulis and Pericles A. Mitkas},
title={Supporting Agent-Oriented Software Engineering for Data Mining Enhanced Agent Development},
chapter={1},
volume={7607},
pages={7-21},
publisher={Springer Berlin Heidelberg},
year={2012},
month={06},
date={2012-06-04},
url={http://issel.ee.auth.gr/wp-content/uploads/2017/01/Supporting-Agent-Oriented-Software-Engineering-for-Data-Mining-Enhanced-Agent-Development-1.pdf},
abstract={Lecture Notes in Computer Science}
}