Aristotle University of Thessaloniki
Department of Electrical and Computer Engineering
54124 Thessaloniki – GREECE
Tel: +30 2310 99 6365
Fax: +30 2310 99 6398
Email: mpapamic (at) issel [dot] ee [dot] auth [dot] gr
|12/2015 – today||PhD candidate, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Greece. PhD Thesis: “Application of artificial intelligence and data mining techniques for software quality assessment”|
|09/2010 – 11/2015||Diploma of Electrical and Computer Engineering, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Greece. Diploma Thesis: “Design and development of a source code quality estimation system using static analysis metrics and machine learning techniques”|
|02/2016 – today||Research Associate, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Greece. EU-funded Project: Mobile-Age (http://issel.ee.auth.gr/mobile-age/)|
|06/2015 – 12/2015||Application/Technical Consultant at Veltio (Oracle RPAS solutions consultant)|
|06/2014 – 09/2014||Paid internship at the University of California, Irvine, in the Secure Systems and Software Laboratory (SSL) under Professor Michael Franz|
|10/2016 – today||Teaching assistant for “Pattern Recognition”, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Greece|
- Source Code Quality
- Software Engineering
- Data Mining
- Machine Learning
- English: Proficient (Cambridge and Michigan Proficiency)
- German: Conversational (Goethe-Zertifikat B1: Zertifikat Deutsch (ZD))
- Member of the Technical Chamber of Greece
Michail Papamichail, Kyriakos Chatzidimitriou, Thomas Karanikiotis, Napoleon-Christos Oikonomou, Andreas Symeonidis and Sashi Saripalle
"BrainRun: A Behavioral Biometrics Dataset towards Continuous Implicit Authentication"
Data, 4(2), 2019 May
The widespread use of smartphones has dictated a new paradigm, where mobile applications are the primary channel for dealing with day-to-day tasks. This paradigm is full of sensitive information, making security of utmost importance. To that end, and given that traditional authentication techniques (passwords and/or unlock patterns) have become ineffective, several research efforts are targeted towards biometrics security, while more advanced techniques are considering continuous implicit authentication on the basis of behavioral biometrics. However, most studies in this direction are performed “in vitro”, resulting in small-scale experimentation. In this context, and in an effort to create a solid information basis upon which continuous authentication models can be built, we employ the real-world application “BrainRun”, a brain-training game aiming at boosting cognitive skills of individuals. BrainRun embeds a gesture-capturing tool, so that the different types of gestures that describe the swiping behavior of users are recorded and thus can be modeled. Upon releasing the application on both the “Google Play Store” and “Apple App Store”, we construct a dataset containing gesture and sensor data for more than 2000 different users and devices. The dataset is distributed under the CC0 license and can be found at the EU Zenodo repository.
Michail D. Papamichail, Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Software Reusability Dataset based on Static Analysis Metrics and Reuse Rate Information"
Data in Brief, 2019 Dec
The widely adopted component-based development paradigm considers the reuse of proper software components as a primary criterion for successful software development. As a result, various research efforts are directed towards evaluating the extent to which a software component is reusable. Prior efforts follow expert-based approaches; however, the continuously increasing open-source software initiative allows the introduction of data-driven alternatives. In this context, we have generated a dataset that harnesses information residing in online code hosting facilities and introduces the actual reuse rate of software components as a measure of their reusability. To do so, we have analyzed the most popular projects included in the Maven registry and have computed a large number of static analysis metrics at both class and package levels using the SourceMeter tool; these metrics quantify six major source code properties: complexity, cohesion, coupling, inheritance, documentation and size. For these projects we additionally computed their reuse rate using our self-developed code search engine, AGORA. The generated dataset contains analysis information regarding more than 24,000 classes and 2,000 packages, and can, thus, be used as the information basis towards the design and development of data-driven reusability evaluation methodologies. The dataset is related to the research article entitled "Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information".
Michail D. Papamichail, Themistoklis Diamantopoulos and Andreas L. Symeonidis
"Measuring the Reusability of Software Components using Static Analysis Metrics and Reuse Rate Information"
Journal of Systems and Software, pp. 110423, 2019 Sep
Nowadays, the continuously evolving open-source community and the increasing demands of end users are forming a new software development paradigm; developers rely more on reusing components from online sources to minimize the time and cost of software development. An important challenge in this context is to evaluate the degree to which a software component is suitable for reuse, i.e. its reusability. Contemporary approaches assess reusability using static analysis metrics by relying on the help of experts, who usually set metric thresholds or provide ground truth values so that estimation models are built. However, even when expert help is available, it may still be subjective or case-specific. In this work, we refrain from expert-based solutions and employ the actual reuse rate of source code components as ground truth for building a reusability estimation model. We initially build a benchmark dataset, harnessing the power of online repositories to determine the number of reuse occurrences for each component in the dataset. Subsequently, we build a model based on static analysis metrics to assess reusability from six different properties: complexity, cohesion, coupling, inheritance, documentation and size. The evaluation of our methodology indicates that our system can effectively assess reusability as perceived by developers.
Kyriakos C. Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos and Andreas L. Symeonidis
"npm-miner: An Infrastructure for Measuring the Quality of the npm Registry"
MSR ’18: 15th International Conference on Mining Software Repositories, pp. 4, ACM, Gothenburg, Sweden, 2018 May
Michail Papamichail, Themistoklis Diamantopoulos, Ilias Chrysovergis, Philippos Samlidis and Andreas Symeonidis
"User-Perceived Reusability Estimation based on Analysis of Software Repositories"
Proceedings of the 2018 Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), https://www.researchgate.net/publication/324106989_User-Perceived_Reusability_Estimation_based_on_Analysis_of_Software_Repositories, 2018 Mar
The popularity of open-source software repositories has led to a new reuse paradigm, where online resources can be thoroughly analyzed to identify reusable software components. Obviously, assessing the quality and specifically the reusability potential of source code residing in open software repositories poses a major challenge for the research community. Although several systems have been designed towards this direction, most of them do not focus on reusability. In this paper, we define and formulate a reusability score by employing information from GitHub stars and forks, which indicate the extent to which software components are adopted/accepted by developers. Our methodology involves applying and assessing different state-of-the-practice machine learning algorithms, in order to construct models for reusability estimation at both class and package levels. Preliminary evaluation of our methodology indicates that our approach can successfully assess reusability, as perceived by developers.
Valasia Dimaridou, Alexandros-Charalampos Kyprianidis, Michail Papamichail, Themistoklis Diamantopoulos and Andreas Symeonidis
Chapter: 1, pp. 25, Springer, 2018 Jan
Nowadays, developers tend to adopt a component-based software engineering approach, reusing their own implementations and/or resorting to third-party source code. This practice is in principle cost-effective; however, it may also lead to low-quality software products if the components to be reused exhibit low quality. Thus, several approaches have been developed to measure the quality of software components. Most of them, however, rely on the aid of experts for defining target quality scores and deriving metric thresholds, leading to results that are context-dependent and subjective. In this work, we build a mechanism that employs static analysis metrics extracted from GitHub projects and defines a target quality score based on repositories’ stars and forks, which indicate their adoption/acceptance by developers. Upon removing outliers with a one-class classifier, we employ Principal Feature Analysis and examine the semantics among metrics to provide an analysis on five axes for source code components (classes or packages): complexity, coupling, size, degree of inheritance, and quality of documentation. Neural networks are thus applied to estimate the final quality score given metrics from these axes. Preliminary evaluation indicates that our approach effectively estimates software quality at both class and package levels.