Completed diploma theses, Andreas Symeonidis

Dompazis Christos
Analysis of sports performance using the REMEDES system – Football
The use of technology and computer science is transforming the field of sports. Combined with the explosion of data science and machine learning, models can be built to evaluate an athlete's performance and extract useful information. This diploma thesis employs a data recording system, REMEDES, in the field of football. Using REMEDES, a testing battery is developed, experimental measurements are performed on athletes and, with the use of data science, a system for evaluation and direct comparison is created, as well as a model that predicts a player's position based on performance.
Georgios Kalantzis
Source code remodularization based on component dependency graphs
With the increasing integration of software in social structures, such as education and health, maintenance (e.g. debugging) of large software projects is becoming increasingly important. To this end, it is necessary to organize the source code into communities of artifacts (e.g. methods in classes, classes in packages, packages in libraries) to facilitate understandability and ease of navigation across implemented functionalities. If artifacts are organized into graphs whose edges reflect their dependencies, there exist commonly accepted software engineering principles of what constitutes high-quality source code organization, such as high connectivity of artifacts within groups and loose connectivity with artifacts of other groups. In this dissertation, we tackle the problem of automated reorganization of source code entities to optimize measures quantifying source code community organization quality. To this end, we apply community extraction methods on dependency graphs. Our main hypothesis is that the existing hierarchical organization of source code comprises fragments of the optimal organization, and thus we use it as a first approximation that we refine with graph filters. This type of semi-supervised technique is compared to existing greedy optimization and unsupervised genetic optimization approaches across 10 popular software projects. We find that, given a permutation of the ideal organization provided by developers, our approach derives the community organization closest to the ideal one. Additionally, it runs in near-linear time with respect to the number of dependencies, which enables its application on the dependency graphs of large projects.
Michael Karatzas
Applying Data Mining Techniques to Extract Fix Patterns for Static Analysis Violations
Contemporary software products are getting larger and more complex. During software development and maintenance, developers spend a significant amount of their work time on detecting and fixing bugs. Static analysis tools automate the process of bug detection. Their application, however, is limited, as understanding and fixing the reported bugs remains the developer's responsibility. Lately, several research approaches aspire to extract useful bug fix patterns or to automate the bug fixing process. The former focus on understanding how developers face similar problems, and frequently serve as groundwork for automated bug fixing systems. Our research aims at extracting useful fix patterns for bugs that trigger the rules of the static analysis tool PMD. Initially, by querying the GitHub API, we search for commits that correspond to fixes of these categories of bugs. Both the pre-commit and post-commit versions of the affected files are downloaded. Then, by executing PMD on the two versions of each file, individual fixes are detected and a proper dataset is crafted. The dataset comprises fixes of bugs detectable by PMD rules. The fixes are analyzed and, by utilizing the srcML code representation and the GumTree tree edit distance algorithm, a representative edit sequence is extracted from each fix. Afterwards, by utilizing the longest common subsequence between the sequences of two fixes as a metric, we develop a similarity scheme for the dataset's fixes. This similarity scheme operates as the basis for clustering the fixes and extracting patterns. In order to cluster the fixes, two separate experiments were conducted, one with K-medoids and one with the DBSCAN algorithm. In both experiments, and especially with DBSCAN, almost every cluster predominantly groups bug fixes of a certain PMD rule.
In addition, by computing the number of commits and repositories from which the fixes of each cluster originate, it becomes evident that most of the clusters arise from fixes coming from a large number of commits and repositories. Thus, the extracted patterns correspond to the way in which similar problems are faced by a number of different developers. Consequently, our extracted patterns can be utilized as groundwork for an automated bug fixing system, where PMD serves for bug detection.
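The similarity scheme described above can be sketched in a few lines. The example below is a minimal illustration, not the thesis code: the edit-action labels are hypothetical, and normalizing by the longer sequence is one common choice that the abstract does not specify.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence via dynamic programming."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

def fix_similarity(seq_a, seq_b):
    """Normalized LCS similarity in [0, 1] between two edit-action sequences."""
    if not seq_a or not seq_b:
        return 0.0
    return lcs_length(seq_a, seq_b) / max(len(seq_a), len(seq_b))

# Hypothetical edit-action sequences extracted from two fixes
fix1 = ["delete-if", "insert-call", "update-literal"]
fix2 = ["delete-if", "insert-call", "insert-return"]
print(fix_similarity(fix1, fix2))  # 2/3: two of three actions are shared in order
```

A pairwise matrix of such similarities is exactly the kind of input that medoid-based clustering algorithms such as K-medoids or DBSCAN can consume.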
Langaris Sotirios
Optimizing e-commerce conversion rate with dynamic pricing techniques
E-commerce is growing rapidly and is constantly gaining momentum towards being the dominant source of commercial transactions. The pricing policies and pricing strategies of businesses are of paramount importance for surviving in this highly competitive market, achieving sell-out goals and maximizing profits. Towards this end, various dynamic pricing algorithms have been proposed and adapted to the continuously changing conditions of online markets. These algorithms are based on the abundance of data available to online stores about market conditions, as well as customers' preferences and consumption habits. Effectively analyzing these data and integrating them into dynamic pricing strategies can give a significant competitive advantage to businesses. The purpose of this thesis is the development of a system for the dynamic pricing of products in e-commerce stores. We propose an improved hybrid model for the univariate time-series prediction problem, in order to forecast future sales. The proposed model uses a deep neural network (LSTM), which has shown promising results in recent years compared to classic feedforward neural networks. Moreover, we propose an optimization algorithm for product pricing that optimizes the conversion rate and the profit margins of e-commerce stores. Finally, we evaluate our system by creating a simulated marketplace using real, anonymized data.
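The abstract does not detail the input representation of the LSTM; a common preparation step for univariate time-series forecasting is to slice the sales history into fixed-length lookback windows, each paired with the next observed value. The sketch below illustrates this step on made-up sales figures:

```python
import numpy as np

def make_windows(series, lookback):
    """Turn a univariate series into (samples, lookback) inputs and next-step targets."""
    X = np.array([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = np.array(series[lookback:])
    return X, y

sales = [12, 15, 14, 18, 20, 22, 21, 25]   # hypothetical daily unit sales
X, y = make_windows(sales, lookback=3)
print(X.shape, y.shape)  # (5, 3) (5,)
# X[0] = [12, 15, 14] is paired with the target y[0] = 18
```

Each row of `X` would then be fed (after scaling) to the recurrent model, which predicts the corresponding entry of `y`.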
Anastasios Mouratidis
Employing Machine Learning and Intelligent Information Management Techniques for efficient Software Requirements Elicitation
Proper definition of functional requirements is a prerequisite for successful software project development. Inaccurate and/or missing functional requirements are among the top reasons that lead to failure of the software development process, since an incomplete definition of functional requirements results in erroneous scheduling of necessary tasks and, subsequently, failure in the implementation of the software project. This dissertation initially builds a dataset of functional requirements of software projects from various sources, which is missing from the bibliography. Then an ontology is defined that captures the static view of a software project. The functional requirements of the dataset are mapped to the defined entities and the data is efficiently stored using the ontology format. In the next step, machine learning algorithms are employed in order to extract recommendations for better software requirements elicitation. To evaluate their performance, the models take a new software project with incomplete functionality as input and the extracted recommendations are evaluated.
Athanasios Paraskevas
Aspect-Based Sentiment Analysis for Reviews
User-generated online content has increased significantly in recent years, as comments, online conversations, wikis etc. are an integral part of everyday human life. The improved capability to store such data, combined with the evolution of machine learning techniques, has led to the creation of Natural Language Understanding and Processing (NLU/NLP) systems. These systems are capable of extracting useful information without any human intervention. The sentiment expressed in user-contributed content is valuable information for analysis, and a number of efficient analysis systems have been built. However, there are few systems that perform sentiment analysis for the Greek language, and the lack of datasets hinders research in that direction. Within the context of the present diploma thesis, a data annotation system is developed, which is used to create a dataset containing technology product reviews and the sentiment expressed towards their aspects. The dataset is utilized to train sentiment analysis models. Firstly, a comparison between existing models for Greek is conducted. Then, a new architecture for Aspect Category Detection (ACD) from reviews is proposed, which is used along with an existing architecture for Sentiment Polarity (SP) to forge an end-to-end model. Furthermore, a web interface is implemented, with the purpose of analyzing the text of any given review and presenting the respective results in a user-friendly graphical interface. Experimentation with the trained models shows promising results. The end-to-end model manages to accurately recognize the aspects included in a review and analyze the sentiment expressed towards them.
Aristeidis Pilianidis
Automated digital transformation of HR business processes to web applications
Digital transformation is the process of using digital technologies to create new - or modify existing - business processes to respond to changing market needs. The need is obvious, especially now, as the ongoing pandemic forces companies to adapt quickly to changes in the way they operate. This diploma thesis aspires to respond to the above need by automating the digital transformation of HR business processes and producing specialized web applications for each of them. As part of this effort, MDE (Model Driven Engineering) is utilized. More specifically, once an abstract process model is defined, a series of transformations takes place, resulting in a functional full-stack application for it. In this way, the software development process is accelerated and software is produced more reliably. In this diploma thesis, the EzProcess system is implemented, where the user, through a friendly graphical user interface, sets parameters for the business processes he/she wants to transform. These processes specifically concern the description of a job position, the evaluation of the candidates who applied for it and, finally, upon successful evaluation, their onboarding into the company. In addition, the above processes are interdependent in the EzProcess system, resulting in fewer human mistakes and, at the same time, even easier and faster software development. Based on the parameters the user has set, the executable code that performs the digital transformation of the processes is created, and the corresponding web applications, which include both the client and the server part, are generated. The latter additionally provide, in the background, capabilities for user identification, as well as sorting and database search of candidates, by communicating with the generated server API.
Alexandros Tsironis
Algorithmic Stable Coin Without Collateral Assets
Bitcoin and other cryptocurrencies’ price volatility is one of the largest entry barriers that cryptocurrencies face today. In contrast to fiat money, cryptocurrencies do not have a central bank which could implement a monetary policy to maintain their buying power. This is reflected as huge volatility in the price of cryptocurrencies. If users are not confident that the buying power of their money will stay relatively the same from one day to the next, they will never adopt a cryptocurrency as their main medium of exchange and will prefer a stable alternative. Furthermore, given this price volatility, it is very difficult for a credit and debt system to be built on cryptocurrencies, since every agreement which includes future payments must consider the risk that arises from the price volatility and charge a large premium against it. Although there is a large amount of research related to the technical issues of cryptocurrencies, such as transaction throughput, smart contract security and optimization concerns, there is very little research regarding the price stability of cryptocurrencies, which is the largest barrier to mass adoption. In this dissertation we present the Dollar Market Token (DMT) protocol, a cryptocurrency whose tokens can reliably maintain a stable exchange rate with any other asset, while operating in a fully decentralized manner. Specifically, we attempt to define 1 DMT as being equivalent to 1 USD and keep this exchange rate stable despite changes in demand. In theory, DMT could even be independent of the dollar, with its exchange rate based on a consumer-based index or a basket of goods, in the same way that central banks calculate inflation in order to decide on what policy to follow. DMT accomplishes its price stability by adjusting its total circulating supply based on its algorithm specification.
In this way, it implements a form of monetary policy similar to those implemented by central banks around the world. At the same time, it operates as a decentralized algorithm which is fully transparent and based only on its protocol specification, independent of any external human intervention. Because of this, DMT can also be perceived as a decentralized central bank.
Michail Doinakis
Real time news assistant
Hundreds of news articles are published daily on the internet, making it impossible to read most of them in the fast pace of everyday life. Moreover, the relevance of articles decays over time, as the news is constantly changing even when referring to the same subject; it is therefore necessary to monitor it in real time. Automated retrieval of this information is necessary and can be achieved using natural language processing and understanding techniques. The current thesis studies the development of a real-time news assistant. The assistant is responsible for searching and finding answers to the user's questions, which are identified during the conversation; the search is performed through a question-answering (QA) system. The use of an assistant makes the implementation more flexible and user-friendly. The input of the system is defined as anything the user enters in his/her conversation with the digital assistant. The supported news categories are politics, sports, technology, movies and computer games. Articles are provided by a simulated external system and are fed into a classifier to determine their category. They are then stored in a database, which is queried by the QA system to extract the answer. The final system consists of the classifier, the digital assistant and the QA system. Each of these components can be replaced and optimized separately, thus creating a modular system. To ensure the proper functioning of the digital assistant and its maintenance, an auxiliary tool for Conversation-Driven Development was used. Through this, the overall system is evaluated on real conversations, which are then used to optimize both the assistant and the other components of the system. Experiments were conducted on specific datasets in order to evaluate the efficiency of several machine learning models for natural language processing.
By evaluating the results of these experiments, the optimal hyperparameters for the operation of the overall system were selected.
Christos Emmanouil
Continuous implicit authentication of a smartphone user based on gesture and sensor data
Smartphones have become an important assistant in everyday chores, and the information stored in them is constantly increasing. This raises the issue of the security of data exchanged through these devices, which is crucial to ensure the protection of the owner from malicious users. These days, most devices offer a level of security using various authentication methods, which however have been identified as vulnerable; thus the need has arisen for the development of new, more secure methodologies. Accordingly, many recent approaches target continuous, implicit authentication techniques, i.e., systems that run continuously in the background of the device, without requiring actions on the part of the user. These systems typically use various data from a mobile phone or other devices, model the behavior of the user and then provide a unique or complementary level of security, which examines whether the current user's behavior is in line with that of the owner. Within the context of this diploma thesis, the developed system relies on sensor data available on most smartphones, such as the accelerometer, gyroscope and touch screen. The behavior of the phone owner is modeled from these data through the use of machine learning models, which can then make appropriate decisions. The proposed system differentiates itself from similar approaches through the use of a set of One-Class Support Vector Machines, with a range of parameter values for each data type. The set produces the probability that a behavior is in line with that of the owner, which is then used by a confidence system to decide whether the device will be locked. As it turns out, such a system is easy to develop, can be adapted to the type of data available at any time and can thus bring significant improvements in user authentication in a continuous but non-invasive way.
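The core idea of combining One-Class SVMs over a range of parameter values can be sketched as follows. This is an illustrative approximation, assuming scikit-learn and synthetic three-dimensional "sensor" features; the thesis's actual feature extraction, parameter grid and confidence system are not reproduced here.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
owner = rng.normal(0.0, 1.0, size=(200, 3))     # owner's sensor features (training)
genuine = rng.normal(0.0, 1.0, size=(20, 3))    # new samples from the owner
intruder = rng.normal(4.0, 1.0, size=(20, 3))   # samples from a different user

# Ensemble over a range of nu/gamma values; each member votes "owner-like" or not
ensemble = [OneClassSVM(nu=nu, gamma=g).fit(owner)
            for nu in (0.05, 0.1, 0.2) for g in (0.1, 0.5, 1.0)]

def owner_probability(samples):
    """Per-sample fraction of ensemble members that accept the samples as the owner's."""
    return np.mean([clf.predict(samples) == 1 for clf in ensemble], axis=0)

print(owner_probability(genuine).mean())   # high: most members accept the owner
print(owner_probability(intruder).mean())  # low: most members reject the intruder
```

A confidence system could then lock the device once this acceptance fraction stays below a threshold for several consecutive windows.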
Georgia Giannokosta
Leveraging sentiment extraction and analysis with digital assistants
The term Artificial Intelligence refers to the development of intelligent computer systems that have the ability to understand their environment and can make decisions to achieve their goals. Essentially, the purpose of artificial intelligence systems is to mimic the human way of thinking and human behavior to a satisfactory degree. This thesis focuses on the development of a digital assistant for the health domain. The assistant can potentially be used by medical staff that want to monitor the progress of a variety of patients. The digital assistant gives users the opportunity to fill in information about their mental health, any symptoms they experience, whether they took their medication, as well as information about daily activities related to their well-being. An additional feature of this assistant is its ability to extract and analyze the sentiment of the user it is talking with. The Rasa framework is employed to develop the digital assistant. In addition, three machine learning models are implemented and compared, in order to integrate the most efficient one into the digital assistant. The chosen model is responsible for sentiment analysis of the user input. Finally, the Dash framework is employed to generate a medical report for either the medical staff or the user. Results indicate that the roll-out of digital assistants that employ Artificial Intelligence capabilities can be achieved easily using the variety of software tools available. In addition, besides providing basic functionalities, digital assistants can easily incorporate additional capabilities to simulate human behavior even better. To conclude, this type of technology has numerous applications to the multidimensional needs of people today, and we await its further evolution.
Zafeiria Iatropoulou
Optimization of traffic lights timing using Reinforcement learning to minimize car queueing time
Artificial intelligence is one of the most important areas of recent years, as the development of reinforcement learning, heavily influenced by human nature and psychology, bridges the gap between technology and humans. It overcomes the problem of data acquisition by almost completely eliminating the need for data. Reinforcement learning involves training a model to find an optimal solution to a problem, making decisions independently and interacting with the environment. Through rewards, it learns to judge which actions to take to achieve its goal. Traffic congestion is increasing worldwide and the problem needs to be addressed. In a dynamically changing and interconnected transport environment, current traffic regulations are not adaptable. An intelligent transport system is needed to improve the efficiency of the road networks of smart cities. The present diploma thesis proposes a system for calculating the timing of traffic lights in order to minimize the waiting time of vehicles. Each traffic light at an intersection is trained to learn to change its phase according to traffic. The proposed road system has a flexible structure that is modified by adding more intersections to the original structure of the simple intersection. Q-learning is the RL algorithm used to select the next optimal signal action in a given state. It works by iteratively improving the value estimates of state-action pairs, which are stored in a Q-table as the traffic light's information. The SUMO tool was used to simulate the road networks. The models were trained and studied in road network environments with N intersections, where N = 1, 2, 4, 6, and the traffic lights of each intersection were trained to reduce traffic. The results of the training are compared with the responses of current traffic management models. In addition, Q-tables of simple structures (N = 1, 2) are applied to the more complex networks to assess whether systems can build on the experience of simple structures.
According to the results of the model training and the experiments, all models responded efficiently to a variety of traffic situations, although the training time increases with complexity. An optimal model requires more training time than a merely good one, so there is a trade-off between training time and optimal response that every researcher should consider.
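The tabular Q-learning scheme described above can be sketched in a few lines. The state encoding (discretized queue lengths per approach), the action names and the reward (negative waiting time) are illustrative assumptions, not the thesis's exact formulation:

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = ["keep_phase", "switch_phase"]   # hypothetical signal actions
Q = defaultdict(float)                     # Q-table: (state, action) -> value

def choose_action(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Q-learning update: Q <- Q + alpha * (r + gamma * max_a' Q(s', a') - Q)."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# One hypothetical step: state = discretized queue lengths, reward = -waiting time
update(state=(3, 1), action="switch_phase", reward=-4.0, next_state=(1, 2))
print(Q[((3, 1), "switch_phase")])  # -0.4 after the first update
```

In a simulator such as SUMO, each traffic light would run this loop every control step, observing queue lengths, picking an action and updating its Q-table from the resulting waiting times.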
Dimitrios Nikitas Nastos
Design and Development of Greek Open-Domain Question Answering System
One of the most important and fastest growing sectors of Computer Science is Artificial Intelligence. One very important and fundamental issue it deals with is Natural Language Processing, which refers to the analysis and understanding of human languages and the ability of humans to interact with "intelligent" systems using these languages. As the volume of information is constantly increasing and people need more and more information, a very important field of research in Natural Language Processing is Question Answering. Since the beginning of the use of computers, the ability to pose questions and receive answers has been a fundamental objective. A very important category of question answering systems is open-domain question answering systems, which are able to answer general knowledge questions based on an external source of knowledge, such as Wikipedia. The development of Transformers and BERT-based models has led to improvements in the performance of question answering systems. Although these models contributed decisively to the development of Question Answering, most question answering systems, and especially open-domain ones, work in English, while the number of systems in other languages is very limited. The present diploma thesis attempts to design and develop an open-domain question answering system for Greek. For this purpose, in the absence of the necessary datasets in Greek, machine translation is performed on some of the most suitable question answering datasets from English to Greek. Moreover, a series of models is trained for Question Answering and for Information Retrieval, which is a very important part of an open-domain question answering system. Then, the overall system, which is based on the Greek Wikipedia, is deployed. The system is accessed by users via a web application that has been designed and developed for this purpose.
Finally, the results of the performance evaluation of the system and its components are presented.
Nikolaos Saoulidis
Automated Task Assignment using Topic Modelling Techniques on Project Management Data
The modern agile software development process relies on task tracking systems, which are responsible for organizing the development process and allocating the workload to team members. Assigning tasks to the most suitable member (triaging) is a critical and demanding process and is implemented by evaluating the features of a task report (title, description, labels, importance etc.). Previous attempts to tackle the complicated and time-consuming problem of triaging are usually limited to the analysis of bug reports. This dissertation introduces a method for automating the triaging process, without constraints on the type of task. Specifically, it aims to investigate the possibility of predicting the most suitable programmer to complete a task using only the task report. Data from different repositories are used, with the main focus on text data (title, description, labels). Our method is based on applying text processing and data analysis techniques. In contrast to existing research, which is limited to simple text preprocessing methods (tokenization, lemmatization etc.), topic modelling techniques (LDA) are also applied in order to extract the topics of each report and enhance its labels. Finally, the data are split into training and test sets and are used as input for classification models (Naive Bayes and SVM). The proposed method proved effective, accurately assigning tasks, with the topic modelling techniques contributing significantly to the efficiency improvement.
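As a rough illustration of the LDA step, the sketch below fits a two-topic model on four invented task reports using scikit-learn; the thesis's actual preprocessing, vocabulary and topic count are not specified in the abstract.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical task-report texts (title + description)
reports = [
    "fix login button not responding on mobile",
    "login page crashes after password reset",
    "add database index to speed up search queries",
    "optimize slow database query in search endpoint",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(reports)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
topics = lda.transform(counts)   # per-report topic distribution, rows sum to 1
print(topics.shape)              # (4, 2): one topic distribution per report
```

The dominant topic index of each report can then be appended to its labels and fed, together with the text features, into the downstream classifiers.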
Vadikolias Dimitris
Keyword-based software library recommendation in order to bootstrap software development
Software library reuse promotes efficient and effective software development, as it improves overall quality, reduces time-to-market and lets developers write application-specific code instead of reinventing the wheel. The availability of a huge number of reusable libraries facilitates effective software development, and code repositories provide an increasingly large number of such libraries. However, manually identifying which ones are relevant to a specific implementation is a tedious and time-consuming task for developers. In this work we focus on the early stages of software development, where finding appropriate libraries is based solely on a keyword-based description of the software. Existing recommendation systems rely on the similarity of such keywords with the descriptions of reusable libraries. This method, however, does not take into consideration the popularity of each library or the semantic similarity of the searched keywords with software entities other than the library itself, such as the descriptions of projects that use these libraries. In order to encompass both the semantic similarity and the popularity of reusable libraries in one model, we propose a collaborative filtering approach. More specifically, we organize keywords and libraries in a relational graph, such that an edge between a keyword and a library corresponds to the usage of this keyword in the description of a software project that uses this library. Given this structure, we use variants of the PageRank algorithm in order to rank the nodes of this graph depending on their relevance to a set of keywords that describe the software we want to develop. Based on this ranking, we recommend the libraries with the highest rank. We compare our method to existing library search methods on two datasets that list the dependencies of Java projects, where we use the project title as a short description.
Our method performs better than the simple similarity-based approach, with an execution time of a fraction of a second, and could be modified in order to make heterogeneous predictions.
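The ranking idea can be illustrated with a small personalized-PageRank sketch over an invented keyword-library graph. The node names, damping factor and restart scheme are illustrative assumptions; note how okhttp, which appears in more project descriptions, outranks gson even for a "json" query, blending keyword relevance with library popularity.

```python
import numpy as np

# Hypothetical relational graph: an edge links a keyword to a library when a
# project whose description contains the keyword depends on that library.
nodes = ["http", "json", "testing", "okhttp", "gson", "junit"]
edges = [("http", "okhttp"), ("json", "okhttp"), ("json", "gson"), ("testing", "junit")]

idx = {n: i for i, n in enumerate(nodes)}
A = np.zeros((len(nodes), len(nodes)))
for u, v in edges:                 # undirected relational edges
    A[idx[u], idx[v]] = A[idx[v], idx[u]] = 1.0
M = A / A.sum(axis=0)              # column-stochastic (every node here has an edge)

def personalized_pagerank(query_keywords, damping=0.85, iters=100):
    """Power iteration whose random restarts land only on the query keywords."""
    p = np.zeros(len(nodes))
    p[[idx[k] for k in query_keywords]] = 1.0 / len(query_keywords)
    r = np.full(len(nodes), 1.0 / len(nodes))
    for _ in range(iters):
        r = damping * (M @ r) + (1 - damping) * p
    return {n: r[idx[n]] for n in nodes}

ranks = personalized_pagerank(["json"])
libs = sorted(["okhttp", "gson", "junit"], key=ranks.get, reverse=True)
print(libs)  # okhttp first: relevant to "json" AND used by more projects
```

The libraries with the highest scores for the given keywords would be the ones recommended to bootstrap development.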
Vafiadis Georgios
Crowd-Sourcing Techniques in Autonomous Vehicles, Aiming to Provide Web-Based GIS Layers
The number of autonomous vehicles in use has been increasing. A growing number of companies invest in the development of algorithms that allow users to rely on intelligent systems to avoid accidents, park their vehicles or even hand over the navigation of their vehicles. Driver vulnerabilities, such as fatigue, emotional driving, violation of traffic regulations and slow reflexes, often lead to accidents. In contrast, the fast response time of autonomous vehicles, alongside compliance with traffic regulations, promises, according to the existing literature, a significant drop in the number of such accidents. For this reason, the market is moving towards autonomous vehicles. Autonomous vehicles are equipped with sensors which provide a variety of information about their environment. A new opportunity thus arises with regard to vehicle interactions with information about the road infrastructure. This has given rise to a new research field exploring the Internet of Vehicles (IoV), including vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. The former concerns communication between vehicles, whereby each vehicle reports its status and intentions to nearby vehicles. The latter, which is the focus of this diploma thesis, concerns communication between each vehicle and a central server in the road infrastructure. This information, after being processed, may be sent to the users of the road network to aid the navigation of vehicles. Crowd-sourcing (CS) applications, such as Google Maps, monitor road traffic in real time and allow the development of methods for the effective interconnection of a source with the server. These methods take into account the size of the information that each source must send, the structure used to process the data, the security of user data and the robustness of the system. In this diploma thesis, CS methods are used in autonomous vehicles for the structuring of data layers on a map (GIS layers).
In particular, information on the existence of free parking spaces or parked cars is extracted from LiDAR sensors and is used to inform users in real time about the availability of parking spaces. This was implemented using the CARLA simulator, which simulates an urban environment. The Python programming language within the ROS ecosystem was used for the processing of vehicle sensor data. The processed data were sent to and accessed via a MongoDB database, and the visualization of the information on a map was implemented through the QGIS software.
Anestis Varsamidis
Evaluating code readability models in incremental changes and developing a new model
Measuring and improving code readability is important, since a lot of human effort during a project's lifetime goes into reading and understanding code. Various models have been proposed to automatically evaluate how easy it is for a human to read or understand a piece of source code. In this thesis, we investigate which models and metrics are sensitive to small changes in readability (one commit). After searching open-source code repositories, we found readability-improving commits and also selected some random non-readability-improving commits. For each changed file we calculated various metrics and readability models before and after the commit. Then, we measured the difference in each metric before and after the commits, as well as between readability and non-readability-improving commits. We also developed a new model that is sensitive to such small changes. To build our candidate models, we employed Support Vector Regression with linear, RBF, or polynomial kernels, and cross-validation for training. To determine the input features we applied sequential backward selection. We found that most metrics show no statistically significant changes after readability commits, and the rest had a very small effect size. When comparing changes after readability commits to non-readability commits, the effect size is larger: almost all metrics have a noticeable change and at least a small or very small effect size. The SVR code readability model that we trained employs 9 features and shows approximately the same or slightly larger differences after readability commits, compared to the existing readability models.
Evripidis Chondromatidis
Design and Implementation of an IoT Device for Training and Evaluation of Physical and Mental Activities
Today, an increasing part of the modern way of life is based on the use of smart devices built on IoT technologies. These devices, through data collection, offer an unprecedented ability to manage our daily lives. One of the fastest growing areas of IoT applications is sports, where smart devices provide a more interesting and personalized sports experience, while allowing us to constantly monitor our performance through commonly accepted metrics. However, existing implementations have high costs and a limited range of capabilities, aiming to cover a specific type of exercise. In this diploma thesis we attempt the complete development and construction of a low-cost modular device, which can serve as the basis for the integration of a large number of sports exercises through a common interface. The development of this device includes the design of the hardware (PCB) based on the ESP32 microcontroller, the programming of the hardware through the RIOT real-time operating system, the bridging of the communication between the device and the internet through a Raspberry Pi node using IoT protocols, and the construction of the original casing of the device with the help of 3D printing. Then follows the evaluation of the device, during which we take measurements related to its energy consumption and autonomy. Finally, we integrate a number of sports applications found in research and commercial device implementations, in order to explore the adaptability of our device.


Christiana Galegalidou
Task Importance Assessment based on Project Management Data
The contemporary software development process dictates the use of issue tracking systems. These systems allow the enrichment of issues with semantic characteristics that optimize the software development process. One of these characteristics regards the importance of each issue, which in turn affects the priority of the issue with respect to the development sprints, as well as the expected duration for the completion of the issue itself. Since there are no clear instructions on how the level of importance is defined, it is left to the personal judgement of each developer. Consequently, issue reports with erroneous importance values arise frequently, a fact which complicates the management of the software project under examination. Researchers have attempted to automate the aforementioned process through the design and development of recommender systems. However, these systems specialize in predicting the importance of software bugs only, rather than of the issues of a software project in general; they fail to filter the information used according to the time frame it belongs to; and they do not consider the fact that the importance classes are ordinal. The current diploma thesis extends previous work in order to tackle the aforementioned aspects not covered by other research. More specifically, it proposes a system capable of automating the process of assigning importance values to issue reports by utilizing the information available in issue tracking systems. For the implementation of this system, a data set exported from the Jira platform was used. For each issue report the attributes title, description, type, and assignee-id are extracted. Subsequently, a multifactorial approach is followed, which entails the design and development of three distinct machine learning models, aggregated into a final model. In this context, four different types of prediction models were investigated.
In particular, KNN and SVR models were used, along with two neural networks, which were trained for every project. The purpose of developing the different models was to find the optimal configuration, after comparing their respective results. Results show that our system can successfully serve as a basis for defining the proper importance value of issues.
Evangelos Zikopis
Bug Fix Time Classification on Open Source Repositories
Nowadays, software development teams follow modern principles regarding the software development life cycle and use many tools, such as version control systems and bug tracking systems, in order to improve their productivity. The popularity and intensive use of such tools and systems has generated a large amount of information regarding every stage of the software development process. By utilizing and analyzing these data, we can extract valuable information and build tools that contribute to the field of qualitative software development. New trends in software development processes aim at the proper distribution of tasks within the team, flexibility in dynamic situations, and the development of a timetable that corresponds to reality. Achieving the above in large open source organizations is possible through the analysis of software development methods and the design of systems that automate relevant processes. This diploma thesis proposes an end-to-end system that contributes to the research on bug fix time prediction by applying information retrieval techniques. More precisely, the designed system collects and analyzes data from GitHub repositories and classifies software issues according to their predicted fix time. Our approach is multilevel, taking into consideration the title, description, assignee and labels of a bug report. A subsystem is designed for each of these features. The subsystems analyze previous data and generate a score that represents the probability of each issue belonging to every class. Finally, classification is performed by a neural network that aggregates every subsystem's scores. Moreover, data processing techniques are used in order to cope with the particularities exhibited by the datasets of open source software repositories. The proposed system is trained and evaluated on a dataset that consists of 11099 issues from 26 large Java repositories on GitHub. Experiments show that our system achieves satisfactory performance, especially in binary classification, where high evaluation metric scores are observed.
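The multilevel flow described above can be illustrated roughly as follows; the thesis aggregates subsystem scores with a neural network, so the fixed weighted average below is only a stand-in showing the data flow, and all class names, scores and weights are invented.

```python
# Sketch of the multilevel data flow: each bug-report feature (title,
# description, assignee, labels) has a subsystem that emits per-class
# scores, and a final combiner aggregates them. The thesis uses a
# neural network as the combiner; a weighted average stands in here.

CLASSES = ["fast", "slow"]  # binary fix-time classification

def aggregate(subsystem_scores, weights):
    """Weighted average of per-class probability scores across subsystems."""
    total = {c: 0.0 for c in CLASSES}
    for name, scores in subsystem_scores.items():
        for c in CLASSES:
            total[c] += weights[name] * scores[c]
    norm = sum(total.values())
    return {c: total[c] / norm for c in CLASSES}

scores = {
    "title":       {"fast": 0.7, "slow": 0.3},
    "description": {"fast": 0.6, "slow": 0.4},
    "assignee":    {"fast": 0.2, "slow": 0.8},
    "labels":      {"fast": 0.5, "slow": 0.5},
}
weights = {"title": 0.3, "description": 0.3, "assignee": 0.25, "labels": 0.15}

prediction = aggregate(scores, weights)
label = max(prediction, key=prediction.get)
```

Replacing the fixed weights with a small trained network is exactly the step that lets the combiner learn which features are most predictive per class.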
Nikolaos Kagiafas
Applying Data Mining Techniques on Software Repositories to Extract Design and Evolution Patterns
Close collaboration between software developers is considered essential in order to build innovative software projects. For this reason, there are several online program-hosting platforms, which enable their users to watch each other's changes, recommendations and comments towards the improvement and evolution of code. These platforms also manage different versions of the software code, so that developers can revert to previous ones if desired. All the modifications performed at a given time by a member of the software development team are bundled in a commit, where the main reasons behind them are also recorded. As a consequence, these series of changes include a lot of useful information about the way a software project evolves. Applying data mining techniques on public software repositories and the data discussed above can unveil common bug fixes, systematic edits, frequent types of changes in a project's architecture, and frequently used design patterns, whether known or unknown. An extensive bibliographic survey of this domain reveals that the majority of scientific efforts have focused on bug fixes and systematic edits, ignoring more coarse-grained (high-level) code evolution or design patterns. In this context, this dissertation tries to extract the relationships between the classes of an object-oriented program, while also seeking to monitor the way they evolve over time. To achieve these goals, this diploma thesis adapts a Relationship Extractor tool based on the analysis of the Abstract Syntax Trees of some of the most popular software projects on the GitHub platform. After analyzing and processing these syntax trees, useful information is extracted concerning the operation, the abstraction level, and the inheritance of classes. This information is then modeled as graphs (with classes as nodes and the connections between them as edges). These steps are executed not only for the latest version of a project, but also for each and every commit, with a view to extracting the difference in relationships between the versions of a project before and after the specific commit. Finally, gSpan, a frequent-subgraph mining algorithm, is applied in order to detect code design and evolution patterns used by the software community worldwide.
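To illustrate the graph-modelling step, the sketch below extracts inheritance edges from an abstract syntax tree. Note that the thesis analyzes popular GitHub projects with a dedicated Relationship Extractor; this example uses Python's built-in ast module on a toy snippet purely for illustration.

```python
# Illustrative sketch of the graph-modelling step: parse source code
# into an AST and record inheritance relationships between classes as
# (subclass, superclass) edges of a class graph.
import ast

SOURCE = """
class Animal: pass
class Dog(Animal): pass
class Cat(Animal): pass
"""

def inheritance_edges(source):
    """Return (subclass, superclass) edges found in the module."""
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                if isinstance(base, ast.Name):  # ignore attribute bases
                    edges.append((node.name, base.id))
    return edges

graph = inheritance_edges(SOURCE)
```

Running the same extraction before and after a commit and diffing the resulting edge sets is the kind of version-to-version comparison the abstract describes.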
Dimitris Delemissis
Detection Of Abnormal User Behavior In Web Applications Using Sequence Classification Machine Learning Techniques
In recent years the internet and its applications have developed rapidly, and their use occupies an ever larger part of people's daily lives. Today, the internet is a basic and necessary means of communication, entertainment, information, shopping and many other functions. Unfortunately, alongside these features, illegal activities such as defrauding other users, accessing confidential information, promoting certain products, and even disrupting the operation of websites have also increased, since hackers exploit vulnerabilities in the security of web applications and systems. Cybersecurity focuses on the development of protection systems and methods that aim to detect and identify an impending cyber-attack, thus contributing drastically to protection against malicious actions. The field of Machine Learning, on the other hand, focuses on developing techniques that allow a computing system to "think" and "decide", rather than just explicitly execute commands dictated to it by the programmer. Machine Learning is widely used in various domains, including cybersecurity, with which this dissertation deals. In the context of this dissertation, a system was modeled and developed to receive necessary and useful information about user behavior in an online e-commerce application and, after storing and processing the data in a specific way, feed it into a sequence classification machine learning model to characterize the user behavior as either benign or malicious.
Antonios Eleftheriadis
Design and development of a Machine Learning based attack detection system for web applications
The increasing use of web applications and the popularity of Software-as-a-Service have created room for major vulnerability issues in systems which until recently were "running" in restricted networks: information (sensitive or not) is now available on the internet. As a consequence, using appropriate software security procedures is the only way to protect it. Security checks must be performed in multiple different layers, such as the network layer, the OS layer, and also the application layer. In light of this, the objective of this diploma thesis is the design and development of a system that detects possible security attacks using machine learning algorithms. The goal is to use machine learning algorithms to detect "good" and "bad" behaviors at the application layer. The analysis is dynamic (at runtime) and a decision mechanism is developed.
Horafas Christos
Design and implementation of an Automation Mechanism for the configuration of robotic devices for the Gazebo simulator
In the age of rapid technological development, the use of robotic systems is widespread throughout the spectrum of modern life, and the automation achieved through robotic systems yields greater and faster production at relatively lower cost. However, robots often behave inconsistently during testing, given the complexity of the systems and the large variability of the environment. Robotic simulations provide a solution to this problem, as they offer a low-cost, easily accessible virtual robot development environment. They are used to quickly evaluate the design of a robot, simulate virtual sensors, provide a reduced model for model predictive controllers and an architecture for real-world robot control, and so on. Robotic simulations take place in special software: robotic simulators. A robotics simulator is used to create an application for a physical robot without depending on the actual machine, thus saving cost and time. In some cases, these applications can be transferred to the physical robot (or rebuilt) without modification. One of the most popular features of robotic simulators is the 3D modeling and rendering of a robot and its environment. This type of robotics software includes a virtual robot, which is capable of mimicking the motion of a real robot in a real situation. Some robotic simulators even employ physics engines for a more realistic robotic motion output. There is a large number of robotics simulators, each serving different or similar purposes. However, while robotics simulators offer a wealth of benefits, the need to produce high-quality applications and software has become more pressing than ever. Increasing productivity, reducing errors (debugging), auditing, verifying and maintaining software play a crucial role in the quality of the final product. A solution to this problem derives from software engineering automation.
Modern software, on the other hand, is very complex, as it often consists of hundreds of thousands of lines of code distributed across many different files, and depends on numerous libraries. Changing a single line of code can affect the functionality of the entire system and result in errors, which is very likely, since most software requires a large number of people to develop it. The solution to the issue of quality comes from software automation. By utilizing automation in software engineering, we have the ability to produce software, tested and ready to use, in less time and at reduced cost, thus achieving increased productivity and quality. It is as if, in a way, the complexity of the system is being eliminated, as the degree of dependence on the human factor is reduced. One advantage of automation is that it also gives non-expert engineers the opportunity to determine the operation of a piece of software, omitting extra steps in its development that would otherwise be necessary. (continue in full text)
Anthi Palazi
Continuous Implicit Authentication of smartphone users based on behavioral analysis
The increasing popularity of smartphones has raised serious safety concerns. This is due to the fact that these devices hold sensitive personal and often professional information, while existing authentication schemes have proven inefficient. Password patterns and PIN codes, in particular, can easily be acquired by attackers with shoulder-surfing techniques, while all widely employed user authentication mechanisms, in general, offer one-time authentication, leaving the device unprotected after the login stage. In this thesis, a continuous and implicit authentication (CIA) approach is introduced that can act as a complementary authentication method. This approach is supplemented by developing a methodology of personalising authentication criteria by analysing how different users behave based on the context of the screen they are browsing. This last addition serves as the greatest contribution of this thesis in the field of continuous and implicit authentication, since not many ways of optimizing authentication schemes have been explored yet. As a means of pursuing the aforementioned goals, a behavioral biometrics dataset, containing several users' gestures, was utilized. Two types of gestures, swipes and taps, were examined with respect to how they can serve as a way of distinguishing users. One-Class SVM played a key role in developing this methodology, as it allows training with the use of only one user's gestures, something that can be deployed in real-life scenarios. The problem of determining the behavioral variance that each user exhibits (based on the context of the screen they are browsing) was handled as a clustering problem, addressed by the k-means algorithm. The method proved to be efficient, especially when analysing swipe gestures, and the incorporation of contextual-behavioral information can offer substantial improvements in user authentication schemes.
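The one-user training idea can be sketched as follows, assuming synthetic two-dimensional swipe features; the real dataset uses richer gesture descriptors.

```python
# Sketch of the one-user training idea: fit a One-Class SVM on a single
# user's swipe features and flag deviating gestures as outliers.
# Feature vectors (duration, trajectory length) are synthetic here.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
owner_swipes = rng.normal(loc=[0.4, 300.0], scale=[0.05, 20.0], size=(200, 2))
intruder_swipes = rng.normal(loc=[0.9, 120.0], scale=[0.05, 20.0], size=(20, 2))

# Standardize features before fitting (the RBF kernel is scale-sensitive).
mu, sigma = owner_swipes.mean(axis=0), owner_swipes.std(axis=0)
model = OneClassSVM(kernel="rbf", nu=0.05).fit((owner_swipes - mu) / sigma)

owner_pred = model.predict((owner_swipes - mu) / sigma)        # +1 = inlier
intruder_pred = model.predict((intruder_swipes - mu) / sigma)  # -1 = outlier
```

Because the model is trained only on the legitimate owner's gestures, no intruder data is needed at enrollment time, which is what makes the scheme deployable in practice.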
Paraskevopoulos Iason
Domain specific language for controlling sensors and actuators in IoT devices, using model driven engineering approaches
The Internet of Things (IoT) has been growing at an exponential rate over the last few years. Every year new devices enter human daily life, waiting to be controlled. Controlling software must be developed to interact with these devices, and new applications can be built on top of them. Many people cannot experience the true advantages of IoT, as they are unable to build applications since they lack the required technological background. Model-Driven Engineering (MDE) can help these people, as it solves software engineering problems using models of the physical and virtual world. There are not many attempts to use MDE in the world of IoT, and even fewer that try to help the technologically inexperienced build IoT applications. This diploma thesis proposes tools to model IoT devices and the connections between them. In addition, it provides a textual grammar for the definition of those models. Further, it develops a library for driving IoT devices through a common API. Finally, using automated source code generation, it proposes a way of controlling these devices through a Raspberry Pi and communication endpoints.
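As a hedged illustration of the textual-modelling idea, the sketch below parses a tiny invented device syntax into a model; the thesis defines its own grammar, which is not reproduced here.

```python
# Hypothetical mini-syntax for declaring IoT devices and connections,
# parsed into a simple in-memory model. The actual grammar developed in
# the thesis differs; this one is invented purely for the sketch.
MODEL_TEXT = """
device lamp: actuator
device motion: sensor
connect motion -> lamp
"""

def parse_model(text):
    """Return ({device: kind}, [(source, target)]) from the model text."""
    devices, connections = {}, []
    for line in text.strip().splitlines():
        parts = line.split()
        if parts[0] == "device":
            name, kind = parts[1].rstrip(":"), parts[2]
            devices[name] = kind
        elif parts[0] == "connect":
            connections.append((parts[1], parts[3]))
    return devices, connections

devices, connections = parse_model(MODEL_TEXT)
```

From such a model, a code generator can emit the glue code that drives each device through a common API, which is the automation step the abstract describes.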
Panagiotis Siatos
On analyzing the importance of Google Lighthouse performance metrics
The Internet has become an integral part of humans' everyday life: an indispensable means of information gathering, socialization, provision of services, and purchasing and selling products. The plethora of available websites providing similar or even different services has created a new reality where each user can find sites that fulfill their needs. Therefore, sites of similar content and services focus on optimizing User Experience to attract more users. In particular, User Experience refers to user interactions with a website and focuses on the overall experience a site provides. There are various factors that influence User Experience. This thesis employs Google Lighthouse, an automated tool for measuring the quality of web pages, and explores the features that influence performance metrics pertaining to User Experience. In particular, 85 features were extracted from a dataset of 200K websites, the data resulting from Google Lighthouse reports. These features quantitatively describe the composition, structure and resources of each web page. After using a regression model for predicting performance metric scores, as defined by the simulation software, an analysis and extraction of the most important features used by the model was performed. The ultimate objective of the thesis is to enable a front-end website developer to prioritize and focus on those features that improve Google Lighthouse's performance metric scores, this way improving user experience.
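One common way to perform such an importance analysis, sketched here on synthetic stand-in features rather than the 85 actual Lighthouse-report features, is to fit a regressor and rank its feature importances:

```python
# Sketch of the analysis step: fit a regressor on page features and rank
# feature importances. Feature names and data are invented stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
features = ["num_scripts", "image_bytes", "num_links"]
X = rng.uniform(size=(500, 3))
# Performance score driven mainly by script count, less by image weight.
y = 1.0 - 0.6 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.02, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
ranking = sorted(zip(model.feature_importances_, features), reverse=True)
```

The ranked list is what lets a developer prioritize: features at the top of the ranking are the ones whose improvement moves the predicted score most.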
Bellos Vassilios
Basketball data analytics via Machine Learning techniques using the REMEDES system
Data science, although pre-existing, nowadays dominates and may continue to do so in the future. The existence of huge storage space and powerful processors capable of managing correspondingly sized databases has enabled information to be collected in every workplace, from the medical and engineering sectors to the arts and sports. In this diploma thesis we focus on the field of professional sport, and in particular basketball. Initially, basic knowledge about the sport is presented, some of the information collection tools are mentioned, and we analyze the importance and role of data in training, in the preparation of athletes, and in the decisions of coaches. Subsequently, having collected data using the REMEDES system on a specific set of basketball drills, a sports performance evaluation system was developed using the Python programming language and various data preprocessing and machine learning techniques. The purpose of this system is to evaluate as representatively as possible non-athletes, athletes, and basketball athletes on a specific set of basketball drills. During the analysis, and through careful monitoring of the results, we have drawn some very interesting conclusions that are presented and interpreted in this report.
Iosif Hadjikyriakou
Development of an automatic procedure for Continuous Integration
In recent years there has been rapid growth in the field of cloud computing, which has aroused the interest of many companies; demand is constantly growing, as is the number of providers offering these services. However, despite the fact that the use of cloud computing has been established, offering many advantages, various challenges arise, such as data security. A key element of the software development process is the frequent testing of the application, in order to ensure quality and minimize bugs, which is achieved through Continuous Integration (CI) systems. Upon successful execution of the automated tests, CI deploys the latest version of the code in a pre-production (staging) or production environment automatically through Continuous Deployment (CD) and Continuous Delivery (CDE). The purpose of this thesis is to compare cloud providers and then develop a method that simplifies the usage of a CI + CD/CDE system. Our approach also integrates static code analysis and evaluation. CI and CD/CDE processes are implemented through GitLab, an open source software, with ready-to-use pipelines (templates) supporting Node.js and Django web applications, while static analysis is performed through Code Quality, which is embedded in GitLab and based on the Code Climate tool. The automatic installation of the prerequisites for the application deployment, in other words the server setup, and the first deployment are performed through the Ansible software configuration management tool. Moreover, the user is given the capability to deploy the app on the Heroku cloud platform without the need to use Ansible. The outcome of the thesis is aimed primarily at students or software developers with little experience who want to get involved and take their first steps with GitLab CI.
Giokotos Konstantinos
A graphical application development methodology for remote robots in the context of cyber-physical systems
Just as the Internet has transformed the way people interact with information, cyber-physical systems are transforming the way people interact with computational systems. Cyber-physical systems integrate sensing, computation, control and networking into physical objects, connecting them to the Internet and to each other. A typical example of such systems are robotic systems, as they combine interaction with the environment and computational abilities. Even though robotics is closely tied to the manufacturing industry, in recent years it has branched out to other fields, such as medicine and autonomous exploration, and even in aspects of our daily life, such as for domestic use. A growth of similar scale can be seen in the Internet of Things (IoT) domain, where everyday objects are equipped with sensors to collect data from the environment and are able to connect to the Internet to share this data. We envision that, due to the mobility offered by robotic systems, their integration with IoT would enable better interaction with the environment, and simultaneously allow robots to make decisions based on data from other devices. To make this possible, there are certain limitations that must be overcome. On one hand, it is especially important to have the ability to control and monitor the robot remotely. Unfortunately, the Robot Operating System (ROS), the most widespread middleware for robotics development, restricts the management of the robot to the local network. On the other hand, it is desirable for users to have the ability to create their applications without having extensive robotics and programming knowledge. This thesis focuses on developing a system to address the aforementioned limitations. To establish the communication between the robot and the remote computer, the RabbitMQ message broker is used. 
At the same time, application development and the integration of the robot with the IoT world are accomplished through Node-RED, a tool for building applications for IoT systems through a graphical interface, thus simplifying the programming procedure. Furthermore, various use cases are presented, which showcase the capabilities of the system for developing robotic applications as part of the IoT.
Andreas Goulas
Knowledge Distillation into BiLSTM Networks for the Compression of the Greek‐BERT Model
In recent years, pre-trained language models such as BERT have achieved state-of-the-art results in several natural language processing tasks. However, these models are typically characterized by a large number of parameters and high demands on memory and processing power. Therefore, their use in limited-resource environments, such as on-the-edge applications, is often difficult. Within the context of this diploma thesis, various knowledge distillation techniques into simple BiLSTM models are investigated with the aim of compressing the Greek-BERT model. The term "knowledge distillation" refers to a set of techniques for transferring knowledge from a large and complex model to a smaller one. Greek-BERT is a monolingual BERT language model, which has proven to be very efficient in various natural language processing problems in Modern Greek. For this purpose, GloVe word embeddings in Modern Greek, which were not previously available, are trained and evaluated. GloVe is trained on a huge corpus of texts in Modern Greek, totalling over 30GB. In order to make a fair comparison, the text corpus was crawled from the same web sources used for the pre-training of Greek-BERT. The models are evaluated on the XNLI dataset and on a text classification dataset from the newspaper "Makedonia". In order to maximize knowledge transfer from Greek-BERT into the BiLSTM models, a data augmentation algorithm is developed, which is based on the GloVe word embeddings. It is shown that this process significantly improves the performance of the models, especially for small datasets. Experiments indicate that knowledge distillation can improve the performance of simple BiLSTM models for natural language understanding in Modern Greek. The final single-layer model is 28.6 times faster, achieving 96.0% of the performance of Greek-BERT in text classification tasks and 86.9% in NLI tasks. The two-layer model is 10.7 times faster, achieving 88.4% of the performance of Greek-BERT in NLI tasks.
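The core distillation objective can be sketched independently of the models involved; below, plain NumPy logits stand in for the teacher (Greek-BERT) and student (BiLSTM) outputs, and the temperature value is illustrative:

```python
# Sketch of the distillation objective: the student is trained to match
# the teacher's temperature-softened output distribution. Plain numpy
# logits stand in for both models here.
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    e = np.exp(z - z.max())       # shift for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -np.sum(p_teacher * np.log(p_student))

teacher = np.array([3.0, 1.0, 0.2])
aligned = distillation_loss(np.array([2.9, 1.1, 0.1]), teacher)
opposed = distillation_loss(np.array([0.1, 1.1, 2.9]), teacher)
```

A higher temperature softens the teacher's distribution, exposing the relative probabilities of wrong classes ("dark knowledge") that a hard one-hot label would hide; minimizing this loss pulls the student's distribution toward the teacher's.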
Korkouti Dionys
Optimal route planning of autonomous vehicles in dynamic environments.
Autonomous driving has been developing and evolving rapidly in recent years. In the industrial sector, many companies want to establish themselves first in the market and create the "ideal" autonomous vehicle. Improving citizens' safety, reducing travel time and making traffic flow smoothly drive the need for efficient and effective autonomous driving solutions. In an ideal scenario, people will be able to cross roads without paying attention to passing vehicles, car accidents will diminish, and traffic lights and other road signs will no longer be necessary, as cars will be able to exchange information with each other through a communication network. The development of such technology, though, is quite complex, as it requires handling many random and unpredictable conditions. In order to develop such a solution, it is necessary for the autonomous vehicle to have an excellent perception of the surrounding space, adapt to it, and be able to respond instantly to any changes in the surrounding environment. The vehicle must navigate safely on the road and respond to static and dynamic obstacles. In addition, it should evaluate decision scenarios and then, according to the circumstances, select the appropriate response. Thus, autonomous vehicles need to be equipped with special sensors that identify and map the surrounding area of the vehicle, and armed with complex control and decision-making systems and appropriate behavioral prediction systems. This dissertation focuses on the design and development of such an autonomous driving system. The purpose of the system is to safely navigate the vehicle, from a starting point to a final destination, in a city filled with vehicles and pedestrians, while at the same time calculating the best and shortest route in compliance with traffic rules. The system has been developed in the form of a modular, ego-only system. The software is written in the Python programming language using the ROS middleware. The CARLA simulator was selected, as it offers cities, cars and the desired physics for conducting the experiments. The developed system consists of the individual subsystems of 1) construction of the global path, 2) perception, 3) behavior prediction, 4) construction of local paths, 5) behavior selection and 6) control of the kinematic behavior of the vehicle. These subsystems communicate with each other in order to achieve autonomous driving. In the global path construction subsystem, a directed graph of the map is created and the A* algorithm is used to search for the optimal route. (continue in full text)
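The global-path step can be sketched with a minimal A* search; the intersection coordinates and edge costs below are invented for illustration, not taken from the CARLA maps used in the thesis:

```python
# Minimal sketch of the global-path step: A* search on a small directed
# graph of intersections, with straight-line distance as the heuristic.
import heapq
import math

coords = {"A": (0, 0), "B": (1, 1), "C": (2, 0), "D": (2, 2), "E": (3, 1)}
edges = {"A": [("B", 1.5), ("C", 2.1)], "B": [("D", 1.5), ("E", 2.0)],
         "C": [("E", 1.5)], "D": [("E", 1.5)], "E": []}

def astar(start, goal):
    """Return (path, cost) of the cheapest route from start to goal."""
    h = lambda n: math.dist(coords[n], coords[goal])  # admissible heuristic
    frontier = [(h(start), 0.0, start, [start])]      # (f, g, node, path)
    visited = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for nxt, w in edges[node]:
            if nxt not in visited:
                heapq.heappush(frontier,
                               (cost + w + h(nxt), cost + w, nxt, path + [nxt]))
    return None, math.inf

path, cost = astar("A", "E")
```

Because the straight-line heuristic never overestimates the remaining road distance, A* is guaranteed to return the optimal route while expanding fewer nodes than uninformed search.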
Anastasios Papadopoulos
Understanding the importance of demographic background for the website aesthetics through deep learning techniques
Web pages nowadays constitute the most popular source of information, business and entertainment. Inarguably, their aesthetics comprise an integral part of the design of a website, playing a multidimensional role. Initially, web aesthetics support the content and functionality of a website, while at the same time striving to pique the interest of the targeted user categories. The objective of this diploma thesis is to investigate and highlight the importance of demographic characteristics when evaluating web design aesthetics, through the use of deep learning algorithms. To this end, two different approaches have been applied. The first approach concerns the training of three different convolutional neural network (CNN) architectures on the available dataset: AlexNet, VGG16 and Xception. AlexNet has been re-evaluated on this set and provides reliable results, while VGG16 is presented as an improved solution. On the other hand, Xception is a contemporary architecture which is tested for the first time on this dataset and has surpassed the results reported in the literature. The second approach involves splitting the dataset by demographic group and training convolutional networks for each group separately. In this way the respective models can capture the aesthetic preferences of each demographic group. These models are merged using various ensemble methods, and the best one is selected for the evaluation and comparison of the findings. In the experiments performed, comparisons are made between the models of each approach, and various relevant examples are presented for better understanding. The purpose of this thesis is to point out the determinant role and importance of demographic characteristics, while also highlighting the contribution of advanced deep learning algorithms to the achievement of reliable predictive results on subjective issues, such as website aesthetics.
Maria Psarodimou
Punctual fault identification through Machine Learning techniques
The technological uprising in the context of the 4th industrial revolution has led to the modernization of the maintenance field and the migration from preventive to predictive maintenance through machine learning methods and techniques. This diploma thesis aims, through research on classical and state-of-the-art algorithms in the time-series anomaly detection and classification domain, at the development of a user-friendly and accurate fault identification tool. To achieve this, it is essential to identify the most suitable machine learning techniques and consequently implement, adjust and evaluate them in a real industrial environment.
Christos Ververis
Design and Implementation of a Mechanism that automates the generation of Software Systems capable of Deductive Reasoning
Today, the development of technology and its utilization in all areas of human life creates the need for software that is easily customizable, presentable, solves many types of problems, and is economical and reliable. Model-Driven Engineering (MDE), i.e. software development based on models, the automatic generation of code from these models, and the ability to graphically display the software, in combination with Automated Reasoning techniques, meet the above needs. In the current diploma thesis, all the above techniques were utilized for the construction of a complete software tool on the Eclipse platform. More specifically, in the framework of Model-Driven Engineering (MDE), a metamodel was constructed which constitutes the core of the system and incorporates terms from the field of Logic. Expanding on this, a graphical interface was created in the Sirius environment, which allows users to graphically construct the model they want. The model is constructed in the form of equations, correctly formulated according to the standards of First-Order Logic (FOL). From this model, Java code is automatically generated which, utilizing functions and objects of the TweetyProject library, is properly configured to be a valid input for the built-in prover of the same library, which can perform logical checks in the standards of Automated Reasoning. Some additional functions written in Java complete the software tool of this diploma thesis. All of the above constitute a tool capable of being used by various mechanisms that automatically produce systems, in order to check the validity of the systems under design, without the need to implement additional software that draws logical conclusions.
Xanthopoulos Konstantinos
Domain specific language for asynchronous message-driven architectures
The introduction of new technologies in the domain of the Internet of Things (IoT), combined with their extended use, has raised concerns among developers. One of these refers to the interoperability of systems, hindered by the heterogeneity of the various protocols and communication interfaces. This makes it increasingly difficult to develop and maintain applications and systems composed of multiple devices and entities. A major factor in facing those challenges can be Model-Driven Engineering (MDE), mainly because it raises the level of abstraction so that the user need not address the details and restrictions of the specific domain. Moreover, it speeds up the software development process and improves its quality by allowing the design and development of reusable code. For now, though, the presence of MDE in IoT is still insignificant. In this sense, the present Diploma Thesis describes a model-based solution for interoperability in message-driven IoT systems. The result of this research is MECO (Modeling Entities and COmmunications), a Domain-Specific Language (DSL) that allows users to design such systems without any significant programming knowledge. Moreover, a Model-to-Text transformation automatically generates software that implements the communications, and a Model-to-Model transformation generates documentation diagrams and files that improve the monitoring of the described systems.
Dimitrios Karageorgiou
Python metaprogramming in linear time language for automated runtime verification with graph neural networks
The term runtime logic verification defines a field that ranges from verifying software compliance with a set of specifications to assuring the adoption of good coding practices. Under this scope, we created lovpy, a novel metaprogramming library for Python that introduces runtime logic verification capabilities to its ecosystem. The expected behavior is defined using the intuitive specification language Gherkin, while using the library requires no code modifications. For its implementation we utilized a broad set of tools, ranging from the domains of graph theory, formal language theory and temporal logic to deep learning, with specific focus on graph neural networks. We also provided the mathematical foundation for a new type of graph, designed for representing temporal specifications. Based on it, we defined a set of mathematically proven logic algorithms. (continue in full text)
Odysseas Kyparissis
Mining Source Code Change Patterns from Open-Source Repositories
Nowadays, there is a rapid growth of open-source version control systems and repositories. A large number of new software projects are implemented, developed and maintained through these systems. This way, software engineers can collaborate directly with each other, organize effectively and maintain an up-to-date history of the project's evolution. Therefore, the volume of information stored is significant, and harnessing it can lead to the development of smart and efficient systems. Within the context of this diploma thesis, a machine learning system is developed which stores, processes and groups source code changes that have taken place during the development stage, with the goal of extracting source code change patterns. These patterns can act as recommendations for new projects, in order to optimize code development and/or fix potential bugs found repeatedly in project repositories. The proposed methodology was applied on the GitHub code hosting platform. GitHub tracks changes of the source code files contained in a repository. These changes are represented as Abstract Syntax Trees (ASTs), so that a similarity metric for the algorithmic structure can be calculated. Additionally, their semantic similarity is calculated, and thus the final clustering of source code changes is possible. Clusters that meet specific criteria contain patterns of source code changes that can be used to provide recommendations for new software projects.
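As a rough illustration of structure-based similarity between code snippets, the sketch below compares the multisets of AST node types of two pieces of Python code using the built-in ast module. This Jaccard-style score is a hypothetical stand-in for the thesis's actual AST similarity metric; the example snippets are invented.

```python
import ast
from collections import Counter

def node_types(code):
    """Multiset of AST node-type names appearing in a code snippet."""
    return Counter(type(n).__name__ for n in ast.walk(ast.parse(code)))

def ast_similarity(code_a, code_b):
    """Jaccard similarity of two snippets' AST node-type multisets (0..1)."""
    a, b = node_types(code_a), node_types(code_b)
    shared = sum((a & b).values())   # multiset intersection size
    total = sum((a | b).values())    # multiset union size
    return shared / total if total else 1.0

before = "def add(x, y):\n    return x + y"
small_change = "def add(x, y):\n    return x + y + 0"
unrelated = "for i in range(10):\n    print(i)"

# A small edit keeps most of the tree; unrelated code shares little.
print(round(ast_similarity(before, small_change), 2))
print(round(ast_similarity(before, unrelated), 2))
```

A real change-pattern miner would compare tree shapes (e.g. via tree edit distance), not just node-type counts, but the same intuition applies: similar changes yield similar trees.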
Athanasios Manolis
Model-driven development for low-consumption real-time IoT devices
The Internet of Things (IoT) is a field that is evolving rapidly, especially in recent years. It enables the development of ever more applications which prove useful for many people, whether they concern simple functions in automation systems or larger-scale applications in industry. Therefore, more and more people want to work in this field. The process of developing an IoT system involves code development to control the system's devices. In fact, in most cases fast response is of the utmost importance, so low-level code development is required, as well as the use of real-time operating systems (RTOS). Also, due to the great heterogeneity of IoT devices on the market, it is necessary to understand the capabilities that each device can offer, in order to choose the one tailored to the needs of the system to be implemented. These requirements may seem complicated to some users, especially to people who are technologically untrained, i.e. do not have the necessary programming skills, but still want to build an IoT system, e.g. for their personal use. As a result, a large portion of people who want to get involved with IoT are discouraged from doing so. Model-Driven Engineering (MDE) addresses the problems faced by those who want to get involved with IoT, but also simplifies the software production process in general, as it can raise the development of IoT systems to a more abstract, and thus more user-friendly, level. Through this diploma thesis, one is given the opportunity to describe IoT devices using models, through two Domain-Specific Languages (DSLs) developed for the description of devices and the connections between them. From the models, a Model-to-Text (M2T) transformation is performed for automated code generation for a variety of IoT devices, adapted to the characteristics that the user wishes them to have.
The generated software for controlling the IoT devices implements the process of taking measurements from sensors and sending them to a broker, as well as the process of controlling actuators through the broker. It consists of low-level code, as it has been designed according to the requirements of a real-time operating system, RIOT. Finally, a Model-to-Model (M2M) transformation takes place in order to produce diagrams that visualize the wiring and intercommunication of the system, and thus help the user understand it better.
Theofilos Panayiotou
Design and development of a tool for automating scenario production of digital assistants
The development of advanced Artificial Intelligence techniques in recent years has allowed digital assistant technologies to emerge. From customer service centers to medical diagnostics, digital assistants find application in many areas and are used daily. More and more companies are trying to integrate them into their operations, and the technologies behind them are constantly evolving. In addition, open-source technologies bring digital assistant tools closer to developers, allowing them to experiment. One such tool is Rasa, an open-source technology for creating industrial-level digital assistants with Artificial Intelligence. However, the use of Rasa requires a high level of programming expertise. As digital assistants become more and more necessary in everyday applications, the barrier of know-how limits the number of people who can work with them. The present Diploma Thesis focuses on the development of an easy-to-use scenario creation tool for Rasa, with the aim of rapidly creating digital assistants. Using Python, and specifically the Django framework, it presents the implementation of a full-stack application, from views and resource paths to models and back-end processes. This application makes it easy to create and edit digital assistants by automating most Rasa features. In addition, the application is demonstrated by creating digital assistants, both simple and complex. First, the scenarios and the stories that the conversation will follow are designed, and then they are implemented in the system. Finally, the assistants are tested and the result is evaluated based on the example conversations. According to the results, the application can successfully create digital assistants that contain the basic components of Rasa. However, as digital assistants become more complex, some human intervention becomes necessary for the desired functionality to be implemented.
Thus, although the application works as intended in simple and complex scenarios, when the operator needs something quite demanding in complexity, programming skills are still necessary.
Stefanos Papadam
Development of a graphical interface of an autonomous vehicle for driving behavior parameterization and remote controlling
The transportation of people has always been an integral part of their daily lives. For this reason, the first efforts to manufacture a car began in the 18th century. Over the years, the industry constructed ever more modern cars to offer people easy and comfortable transportation. In recent years, the best-known companies and universities have been performing research aiming to create autonomous vehicles, which will change the way traditional cars work. Some of the main problems to which self-driving technology will contribute drastically are the saving of significant time in people's daily lives, the reduction of road accidents and consequently safer transportation, fuel economy, and the reduction of environmental pollution. To date, a fully autonomous vehicle that operates without any human intervention has not yet been constructed. When this technology becomes a reality, the vehicle will have a full and precise perception of the external environment's conditions, will make the right decision every time, and will be able to exchange information with other vehicles to cooperate for better operation of the overall traffic. However, a lot of effort and research is required in the sector of autonomous driving to create a vehicle that successfully responds to the numerous scenarios and conditions that occur in traffic. The implementation of such a system requires solving the problems of perceiving the external environment, choosing the right behavior, and transitioning safely and smoothly to the final destination by obeying the traffic rules and avoiding dynamic and static obstacles. To solve the aforementioned problems, suitable equipment is needed, including state-of-the-art sensors which will take the environment's measurements as input. The measurements will be analyzed by a central processing unit and finally, the right decision will be taken. (continue in full text)
Stavros Papadopoulos
Image Inpainting Detection through Artificial Intelligence Techniques
Image inpainting is the process of repairing an area in an image from which a part of the semantic information is missing and, consequently, there is a lack of semantic continuity. Image inpainting was initially designed to effectively repair damaged areas in images. However, it was quickly used for the purpose of forgery and deception. In recent years, methods of applying image inpainting through artificial intelligence techniques have emerged and achieved high-quality results, producing images where the presence of inpainting is almost impossible to detect with the human eye. Therefore, it is of critical importance to develop a method that will detect the affected areas in an inpainted image. For this reason, the present thesis focuses on the study of image inpainting detection methods and the implementation of an artificial neural network capable of detecting areas where an image has been tampered with by inpainting. A total of eight convolutional neural networks, based on two state-of-the-art architectures, were trained and tested. The training process was based on two configuration sets (10 and 50 epochs respectively), adopting binary cross entropy (BCE) as the loss function. Furthermore, it was also studied whether a training dataset consisting of images inpainted in semantic areas helps image inpainting detection more than one consisting of images inpainted in random-form areas. For this reason, two training sets were created: the first consists of images with random-form inpainting masks, while the second consists of images with semantic masks (objects). To evaluate the trained models, a test set consisting of both forms of masks was created in order to give an objective interpretation of the results. The aim is to train a model capable of producing a predicted mask Mo as output, given an image I as input.
Finally, two commonly used pixel-wise metrics, IoU and AUC, were adopted to evaluate the performance. The metrics were calculated using the ground truth mask Mg and the predicted mask Mo, making a one-to-one comparison of their corresponding pixels. The study showed that models trained with a set of images that have been tampered with in random areas (random masks) achieve better results compared to models trained with a set of images that have been tampered with in semantic areas (semantic masks).
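The pixel-wise IoU computation described above can be illustrated with a minimal sketch; the tiny 3x3 binary masks below stand in for a real ground-truth mask Mg and a predicted mask Mo.

```python
def iou(mask_gt, mask_pred):
    """Pixel-wise Intersection over Union of two same-shape binary masks.

    Pixels are compared one-to-one, as in the comparison of the
    ground truth Mg with the prediction Mo.
    """
    inter = union = 0
    for row_g, row_p in zip(mask_gt, mask_pred):
        for g, p in zip(row_g, row_p):
            inter += g & p   # pixel marked in both masks
            union += g | p   # pixel marked in at least one mask
    return inter / union if union else 1.0

Mg = [[0, 1, 1],
      [0, 1, 1],
      [0, 0, 0]]
Mo = [[0, 1, 0],
      [0, 1, 1],
      [0, 1, 0]]
print(iou(Mg, Mo))  # 0.6 -> 3 overlapping pixels out of 5 in the union
```

An IoU of 1.0 means the predicted inpainted region matches the ground truth exactly; values near 0 mean the detector marked the wrong area.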
Konstantinos Vergopoulos
Analyzing code bugs based on method call graphs
The increasing size and complexity of modern software projects often leads to the appearance of runtime errors (crashes), for instance due to coding inaccuracies or unforeseen use cases. Since errors affect software usability, quickly dealing with them has become an important maintenance task tied to the success of software projects. At the same time, processes for parsing user feedback, for example by dedicated teams, to understand errors or other bugs and initiate maintenance operations can prove time-consuming. To mitigate the associated costs, an emerging trend is to automate (parts of) error understanding with machine learning systems, for example ones that perform automatic tagging. In this thesis, we focus on understanding errors through extracted latent representations; these can be input to machine learning systems to predict error qualities, such as recommending which tags errors should obtain. To achieve this, existing approaches in the broader scope of automated bug understanding make use of natural language processing techniques, such as word embeddings, to understand feedback texts. However, in the case of errors, we propose that the available stack traces leading up to crashing code segments also capture useful coding semantics, in the form of paths within function call graphs. Thus, we investigate whether graph embeddings extracted from error stack traces can be used to obtain a better understanding of errors. To test our hypothesis, we developed a system that extracts latent error representations of software projects by combining textual and stack trace embeddings. To verify that these improve error understanding compared to using textual features only, we experimented on three popular GitHub software projects, where we extracted error representations and used them to predict error tags (e.g. high priority) with neural network predictors.
We found that, given a robust selection of predictor and enough example errors to train from, our approach improves text-based tagging by a significant margin across popular recommendation system measures.
Zisis-Milis Emmanouil
Implementation of a full stack tool in Kubernetes environment to automate the application of filters on messages using message broker technology
The transition of internet technologies to microservice architectures and the development of the Internet of Things (IoT) have significantly increased the need for new methods of efficient communication between heterogeneous and distributed systems. Brokered messaging methodologies work better than REST (Representational State Transfer) and RPC (Remote Procedure Call) approaches in producer-consumer (messaging) communication systems where both high-throughput transmission of large volumes of data and the decoupling of producer and consumer subsystems are desirable. A lightweight and reliable technology that offers the benefits of brokered messaging is RabbitMQ. By using it, complex and efficient systems can be built under conditions of asynchronous communication and unreliable networks, and within big-data application environments. This dissertation focuses on the full-stack development of a tool which uses brokered messaging technology to apply filters to the messaging of a system. The automation of these functions through the tool makes the effects of the involved technologies accessible to users, regardless of their degree of experience with the specific technologies. Messaging is carried out via a RabbitMQ server which implements the brokered messaging technology. Finally, to facilitate the management of the entire system, it was set up in the context of Kubernetes, which offers automated orchestration of the parts of the system. For the establishment of the Kubernetes environment, the minikube technology was chosen, as it offers easy and fast creation of a Kubernetes environment. System performance was tested for different values of message input load and number of applied filters. The measured parameters refer to the frequency of message entry, the frequency of message consumption, the frequency of message logging to the database, and the number of messages stored in the broker queues.
From the experiments it is concluded that it is particularly important to select the appropriate number of applied filters according to the available processing power and memory resources of the system.
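The idea of applying successive filters to brokered messages can be sketched in plain Python. The filters and message fields below are hypothetical, invented for illustration; in the actual tool the messages would be consumed from and republished to RabbitMQ queues.

```python
# Hypothetical filter chain: each filter either transforms a message
# (a dict) or drops it by returning None, mirroring how successive
# filters are applied to messages flowing through the broker.
def drop_empty(msg):
    return msg if msg.get("payload") else None

def redact_user(msg):
    out = dict(msg)
    out.pop("user", None)   # strip a sensitive field
    return out

def tag_processed(msg):
    return {**msg, "processed": True}

def apply_filters(msg, filters):
    for f in filters:
        msg = f(msg)
        if msg is None:      # message dropped by this filter
            return None
    return msg

pipeline = [drop_empty, redact_user, tag_processed]
print(apply_filters({"payload": "temp=21", "user": "alice"}, pipeline))
print(apply_filters({"payload": ""}, pipeline))  # dropped -> None
```

Each additional filter adds per-message processing cost, which is the trade-off the experiments above quantify against the available CPU and memory resources.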
Vasileios Matsoukas
Automated Task Assignment Using Knowledge Extraction Techniques on Open Source Repositories
Modern software projects are constantly increasing in size and complexity, which has made the software development process particularly demanding. Additionally, the evolution of cloud computing and the existence of numerous code repositories have led to a big rise in open-source software projects, where development team members can be dispersed around the world. Therefore, the need for efficient project management and coordination within the software development team has gained a lot of importance. Among the most popular tools for this purpose is GitHub, a collaborative development environment for open-source software, built on the principles of version control and issue tracking. To maintain high standards of quality in the software development process, attention is paid to task assignment and management. Open-source platforms with issue tracking functionality support a formal method of recording and reporting tasks. However, the large number of reports submitted every day, especially in large-scale projects, makes task assignment an intense and time-consuming process for the responsible engineers. The current diploma thesis proposes a system capable of automating the process of task assignment by employing information tracked from code repositories. In this approach, GitHub is used as the source of information. Specifically, we utilize features such as issue title, body and labels, as well as comments and commits, to determine developers' suitability for undertaking a new issue. The suggested implementation is evaluated on 100 GitHub repositories. Evaluation results show that the system performs significantly well across a wide range of repositories, as configured by the size of the software development team, and can therefore constitute a useful tool in the development of software projects.
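A much-simplified sketch of the suitability idea: here a developer's score is plain term overlap between a new issue and the text of issues they previously handled. The developer names and issue texts are invented, and the thesis's actual system uses richer features (labels, comments, commits) rather than this toy scoring.

```python
from collections import Counter

def tokens(text):
    """Bag of lowercase whitespace-separated terms."""
    return Counter(text.lower().split())

def suitability(issue_text, dev_history):
    """Hypothetical score: term overlap between a new issue and the
    concatenated text of issues a developer previously handled."""
    issue = tokens(issue_text)
    history = tokens(" ".join(dev_history))
    return sum(min(issue[t], history[t]) for t in issue)

new_issue = "crash when parsing malformed json config"
devs = {
    "alice": ["fix json parser crash", "handle malformed config input"],
    "bob": ["update UI theme colors", "improve button layout"],
}

ranking = sorted(devs, key=lambda d: suitability(new_issue, devs[d]), reverse=True)
print(ranking)  # ['alice', 'bob'] -> alice shares 'crash', 'json', ...
```

The output ranking is the recommendation: the new issue would be assigned to the highest-scoring developer.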
Pavlos Avgoustinakis
Video retrieval based on audio content from large scale collections using deep learning
The main goal of the thesis is to develop a video retrieval system based on audio content, using deep learning techniques. The method developed within the context of the thesis constitutes the adaptation to audio content of the state-of-the-art method ViSiL [1]. ViSiL establishes a video similarity learning architecture and captures the spatiotemporal relations between videos. The proposed method is called ViSiLaudio. In order to extract representative video descriptors, transfer learning from a convolutional neural network trained on a large-scale dataset of audio events is employed. By comparing the descriptors of two videos, a similarity matrix is produced that contains the similarity scores between each time frame of the one video and each time frame of the other. This matrix is further provided to a convolutional neural network, in order to capture temporal structures in the similarity matrix between the videos. The output of the above network is summarized using Chamfer Similarity into a final similarity score between the compared videos. The proposed network is trained using the triplet loss function, which increases the similarity score between two relevant videos and decreases the similarity between irrelevant videos. In order to test the efficiency of ViSiLaudio on the problem of audio-based video retrieval, annotation of the audio relations between videos of the FIVR-200K dataset was carried out. Also, for the evaluation of the proposed method, two state-of-the-art methods were re-implemented. On the resulting dataset, ViSiLaudio outperforms the two competing methods by 14% and 34% respectively. Also, the proposed method was evaluated on three visual-based video retrieval datasets. On two of the three datasets, ViSiLaudio outperforms the competition, while on the third dataset one of the compared methods marginally outperforms ViSiLaudio.
Finally, the hypothesis that audio-based methods in combination with visual ones can enhance the results is investigated. The combination improves the results, but the improvement is marginal.
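The Chamfer Similarity reduction mentioned above averages, over the time frames of one video, the best matching score found in the other video. A minimal sketch, with an invented 3x3 frame-to-frame similarity matrix standing in for the network's output:

```python
def chamfer_similarity(sim_matrix):
    """Reduce a frame-to-frame similarity matrix to a single score:
    for each time frame of the query video (a row) take its best match
    in the candidate video (max over columns), then average the maxima."""
    return sum(max(row) for row in sim_matrix) / len(sim_matrix)

# Rows: query-video frames; columns: candidate-video frames.
S = [[0.9, 0.2, 0.1],
     [0.3, 0.8, 0.4],
     [0.1, 0.2, 0.7]]
print(round(chamfer_similarity(S), 6))  # 0.8 -> mean of row maxima 0.9, 0.8, 0.7
```

Because each query frame only needs *some* good match, this reduction tolerates temporal shifts between the two videos, which is why it suits near-duplicate retrieval.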
Lampros Makrodimitris
Design and Development of an Automated Energy Broker for the PowerTac Competition
Alexandros Delitzas
Understanding website aesthetics using deep learning
Website aesthetics play an important role in attracting users and customers as well as in enhancing user experience. In this work, we propose a tool that automatically measures website aesthetics. For this purpose, we developed deep learning models which present high correlation with human perception. These models were developed using two different datasets. The first dataset was created with rating-based ranking; thus, it contains user judgements on websites in the form of an explicit numerical value on a scale. Based on this, we developed models following three different approaches and managed to outperform previous works. In addition, we created a novel dataset with comparison-based ranking, which is a more reliable and natural data collection method. In this case, users were asked to compare two websites at a time and choose which is more attractive. The data collection was performed via a web application that we designed and developed especially for this purpose. In the experiments that we conducted, we evaluated each model and compared the two data collection methods. This work aims to indicate the effectiveness of deep learning as a solution to the problem, as well as to highlight the importance of comparison-based ranking in achieving a reliable result. In the final phase, we developed a tool which measures the aesthetics of a website requiring only a URL as input. This tool can serve as a reliable guide in the hands of designers and developers during the design process.
Jason Hadzikostas
Wolie: A mobile app for handling loyalty programs
Over the last few years, the Quick Service Restaurants (QSR) industry, and more specifically takeaway coffee shops and fast-food delivery shops, has been experiencing a very large upturn, with the size of the market reaching 1.5 billion euros. Of course, the rapid growth of technology has been a catalyst for the development of new tools and services for stores in order to increase their turnover. Brilliant examples are efood and deliveras, two companies that have managed to own almost 90% of the Greek market. The main goal of the current thesis is to demonstrate the development and evolution an idea can have while transforming into a real product when the Lean Startup methodology (Build - Measure - Learn) is followed. More specifically, we focus on the QSR industry and examine the value of centralised loyalty programs. In order to implement this idea, we have developed a mobile application which allows the end user to subscribe to multiple different venues, scan the personalised QR code of each store, collect points or stamps, and earn gifts once he/she reaches the required threshold. There is also the option for users to trade points between themselves if they are both subscribed to the same store. Last but not least, the store has the option to accept or refuse user requests, and also to define which loyalty program it wants to run and the specifications it wishes to have.
Konstantinos Papadopoulos
A Framework for Improving Network Device Visibility in Industrial Control Systems
Cyber threats to Industrial Control System (ICS) networks are on the rise. ICS are used in critical infrastructure affecting the lives of millions; however, they were mostly designed with availability, not security, in mind. Device visibility and monitoring in IT (Information Technology) and OT (Operational Technology) networks are an important tool both for protecting against cyber threats and for maintaining the proper operation of any industrial network. Currently, there is a variety of platforms on the market that provide this functionality. Specifically, there are non-intrusive network monitoring and situational awareness tools that ensure in-depth visibility and cyber resilience for ICS and SCADA networks. In this process of device visibility and cybersecurity, automatically identifying the devices present in a network plays a meaningful role. The aim of this diploma thesis is to design enhanced evidence-based methods for the automatic classification of networking devices. We developed a module that classifies the network's assets by collecting evidence as input from different components on the network, such as passive or active sensors that monitor traffic, and from user-given feedback. Up until now, the classification process was imperfect, mainly because the evidence collected was not aggregated at a central entity so that the classification could be based on the entire set of evidence from all sources. This led in some cases to conflicts, with devices being classified with multiple values and no way of knowing how to evaluate the results. In the new implementation, the goal is to assign a specific value to each asset (e.g. PLC), determine the confidence level of this assignment (e.g. 70%) and provide the logic behind this decision (e.g. an nmap query), by collecting evidence from the various sources. These techniques will be researched and tested on OT networks but may be generalized to IT and IoT (Internet of Things) networks as well.
Evangelos Papathomas
Semantic Code Search in Software Repositories using Neural Machine Translation
Nowadays, software source code is mainly stored in software repositories and on programming content websites. However, searching for code is troublesome, and software engineers are usually forced to use conventional search engines, which do not evaluate the usefulness of the results. The overall process proves to be highly time-consuming and inefficient. To face these issues, software developers prefer recommendation systems that target query analysis and return relevant code examples. That undertaking, though, has turned out to be extremely difficult due to the syntactical differences between natural language and source code. Many of those systems present weaknesses regarding the form of the input queries (e.g. they do not receive input in natural language) and the quality and performance of their results. In addition, most of these systems utilize simplistic architectures and, as a result, do not capitalize on the query and code semantics. A close analysis of the aforementioned problems resulted in the design and implementation of CODEtransformer, a system that improves upon the flaws of many code recommendation systems. Our system ensures data quality by mining code examples from popular GitHub repositories. These data undergo preprocessing in order to maximize the extracted information. Afterwards, we train a state-of-the-art neural network which has the ability to accept natural language as input and perform semantic analysis on the code examples. We then construct a vector space of code examples that ensures the best possible response time for each search. Ultimately, our system is evaluated not only by its performance compared to similar systems, but also through natural language queries derived from Stack Overflow.
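The retrieval step described above, ranking stored code-example vectors against a query embedding, can be pictured as a cosine-similarity nearest-neighbour search. This is a minimal illustration, not the CODEtransformer implementation; the snippet names and vectors are hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, snippet_vecs, k=3):
    """Return the ids of the k stored code examples closest to the query."""
    ranked = sorted(snippet_vecs,
                    key=lambda sid: cosine(query_vec, snippet_vecs[sid]),
                    reverse=True)
    return ranked[:k]

# Hypothetical 3-dimensional embeddings of three code examples
snippets = {
    "read_file": [0.9, 0.1, 0.0],
    "http_get": [0.1, 0.8, 0.3],
    "sort_list": [0.0, 0.2, 0.9],
}
print(top_k([0.85, 0.2, 0.05], snippets, k=2))  # "read_file" ranks first
```

A real system would precompute the snippet vectors with the trained network and use an approximate-nearest-neighbour index for speed.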
Stylianos Poulakakis-Daktylidis
Applying Data Mining Techniques on Open Source Repositories for Finding and Fixing Software Bugs
Lately, the evolution of the Internet and the introduction of the open source initiative have changed the way software is developed. Source code hosting facilities with version control actively promote collaboration and are now extensively used by developers to build better open source software. As a result, projects nowadays are built collaboratively through the application of successive revisions, which form patterns of code evolution. These patterns can indicate generic and reusable code edits, and therefore the need for an efficient detection method is paramount. However, despite the benefits of version control, developers still face difficulties when searching for source code snippets, often consuming an immense amount of their time in the process. Consequently, code recommendation systems are a necessity for automating the aforementioned search process and augmenting collective developer knowledge. The current diploma thesis proposes a system for mining GitHub commits in order to extract source code changes that can indicate useful code edits, or even drive bug-fixing automation. Our approach operates on a large set of projects and utilizes the Abstract Syntax Tree (AST) of each block of code, coupled with the commit message. Subsequently, a similarity scheme for commits is devised, which is efficient at identifying similarities in source code fragments and can therefore reveal substantial patterns and recurrent bug fixes. Finally, we develop a commit recommender tool which can successfully provide useful recommendations for ready-to-use source code changes and messages in different scenarios.
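As a rough illustration of AST-based similarity between code fragments, one can fingerprint each snippet by the multiset of its AST node types and compare the multisets; this is a much coarser measure than the similarity scheme devised in the thesis, and the snippets below are invented:

```python
import ast
from collections import Counter

def node_bag(code):
    """Multiset of AST node types: a coarse structural fingerprint of a snippet."""
    return Counter(type(n).__name__ for n in ast.walk(ast.parse(code)))

def similarity(code_a, code_b):
    """Weighted Jaccard over node-type multisets (stand-in for a full tree metric)."""
    a, b = node_bag(code_a), node_bag(code_b)
    inter = sum((a & b).values())
    union = sum((a | b).values())
    return inter / union if union else 1.0

# A typical small bug-fix edit: its before/after versions stay structurally close
before = "if x != None:\n    y = x"
after = "if x is not None:\n    y = x"
print(similarity(before, after))  # high, but below 1.0
```

Commits whose before/after fragments score highly under such a measure can then be grouped to surface recurrent edit patterns.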
Georgios Balaouras
Data Collection and Analysis of Energy Consumption of Mobile Phones using Machine Learning Techniques
In modern society there is a consensus that smartphones have a dominant role in everyday life. By just pressing a button, someone can not only get up to speed with current events on a global scale, but also get in touch with people all over the world and find various forms of entertainment. In particular, one of the features that makes smartphones so attractive is the portability they offer, since they run on batteries. However, batteries have a limited number of charges at their disposal; consequently, the lifespan of a device is directly correlated with its utilization, as well as its charging strategy. The current thesis focuses on the analysis of mobile phone usage and the prediction of the battery's energy drain. To begin with, for data collection the application "BatteryApp" was developed, which periodically keeps a record of the device's usage and battery information. The next step is the grouping of similar device uses through Hierarchical Clustering, which does not require an a priori selection of a specific number of clusters and does not set limitations regarding the chosen distance function. Each cluster was then assessed based on its content in order to select the clusters with the highest information value. Lastly, the prediction of the energy drain was constructed by employing a simple linear model, two variants of linear regression that introduce the penalty concept (Ridge and Lasso Regression), and a non-linear model belonging to the Ensemble Learning category (eXtreme Gradient Boosted trees), with the parameter learning procedure applied to each selected cluster individually.
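The per-cluster modelling step can be illustrated with a closed-form one-feature ridge regression. This is a toy sketch under assumed data (screen-on minutes versus battery drain); the thesis's actual models (Ridge, Lasso, XGBoost over many features) are far richer:

```python
def ridge_1d(xs, ys, lam=1.0):
    """Closed-form ridge regression with a single feature and no intercept:
    w = sum(x*y) / (sum(x*x) + lambda)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Hypothetical usage clusters: screen-on minutes -> battery drain (%)
clusters = {
    "light_use": ([10, 20, 30], [2, 4, 6]),
    "heavy_use": ([10, 20, 30], [5, 10, 15]),
}
# One model per selected cluster, mirroring the per-cluster learning procedure
models = {name: ridge_1d(xs, ys, lam=0.5) for name, (xs, ys) in clusters.items()}
print(models)  # a separate drain-rate slope for each usage profile
```

Fitting separately per cluster lets each usage profile keep its own drain rate instead of averaging them away in a single global model.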
Rafael Brouzos
Automatic ROS2 systems generation via model-driven engineering (MDE) software techniques
Robotics has proven useful in our lives in many ways. Despite its huge potential, installed industrial robotic applications are not keeping pace with the fast development of academic research. The prototypes, platforms and capabilities of best practices in robotics belong to specific users, a fact that increases the cost and decreases the agility of these systems. There is therefore a need for automatic software generation for robotic purposes, to deliver robotic solutions to a wider audience. Model-Driven Engineering promises automatic software generation, as well as validation of the software, based on models a user creates. Models are mainly expressed in Domain Specific Languages. These languages are constructed to help domain experts express themselves more easily, validate their models, and make the development process easier and faster. Model-Driven Engineering techniques could reduce the cost and time of the robotic software development process, making robotic solutions suitable and agile. ROS 2 is a popular robotic middleware, used to raise the abstraction layers of robotic software. It is independent of the hardware and is used worldwide, mainly for academic purposes. Its complexity, and the experience it requires from the user, is the main reason it is not utilized in industry as much as it could be. This work studies, on the one hand, the capabilities of ROS 2 and, on the other, the capabilities of Model-Driven Engineering, as well as their integration. It is a study of how ROS 2 could be extended into a Model-Driven Software Development framework. In this work Generos, a ROS 2 system generation software, is introduced. Generos comes with GRS, a Domain Specific Language used to express models of ROS 2 systems. Generos is able to provide structured ROS 2 systems, requiring only robotics skills from its user. It makes the development process faster, easier and safer, as it simplifies the process and validates the models.
Theofilos Georgiadis
Library recommendation system for the reuse of software parts
This system helps developers when searching for Python libraries. The developer writes a query in natural language and the system returns the 10 most relevant libraries. It is based on a graph whose nodes are keywords and libraries extracted from a set of open source projects. For every keyword that appears together with a library, we connect the two nodes with an edge, and each time a keyword co-occurs with a library, the weight of that edge is increased by one. Using this graph, we extract representations of the graph's nodes. Lastly, using these representations and a similarity measure, we calculate the similarity of each library with each keyword and recommend the 10 libraries with the highest similarity values.
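The graph construction and scoring described above can be sketched directly. The projects, keywords and library names below are invented for illustration, and the node-representation step is collapsed into simple edge-weight sums:

```python
from collections import defaultdict

def build_graph(projects):
    """Edge weight = number of projects in which a keyword co-occurs with a library."""
    weights = defaultdict(int)
    for keywords, libraries in projects:
        for kw in keywords:
            for lib in libraries:
                weights[(kw, lib)] += 1
    return weights

def recommend(query_keywords, weights, k=2):
    """Score each library by its summed edge weights to the query keywords."""
    scores = defaultdict(int)
    for (kw, lib), w in weights.items():
        if kw in query_keywords:
            scores[lib] += w
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical mined projects: (keywords found, libraries imported)
projects = [
    ({"http", "request"}, {"requests"}),
    ({"http", "scraping"}, {"requests", "beautifulsoup4"}),
    ({"dataframe", "csv"}, {"pandas"}),
]
g = build_graph(projects)
print(recommend({"http", "request"}, g))  # "requests" scores highest
```

The full system would learn node embeddings from this graph and rank by embedding similarity rather than raw weight sums.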
Dimitris Gougousis
Development of an automated machine learning system for predicting the optimal values of hyperparameters using meta-learning
The benefits of machine learning are undeniable in most, if not all, aspects of human activity. From weather forecasting to classifying a tumor as benign or malignant, the use of machine learning speeds up and facilitates solving the problem at hand. However, deploying the most suitable machine learning model for a problem is a time-consuming process that requires knowledge gained through experience and continuous practice on the subject. The aforementioned obstacles can be removed by means of automated machine learning which, as the name suggests, attempts to automate the development process of machine learning models, so that their benefits become broadly available. Contributing towards that end is the field of meta-learning, which studies the performance of different machine learning models on a wide range of tasks and uses this experience to "predict" the most suitable model for a given task, avoiding the process of trial and error. This thesis deals with the task of automating machine learning by employing meta-learning techniques, specifically for regression problems. The aim of this work is the development of a system able to decide the exact optimal values for the hyperparameters of three algorithms, given the data to be processed. In that way, a great deal of time is saved during model deployment and the use of machine learning becomes available to more groups of people.
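The meta-learning idea, reusing tuned hyperparameters from the most similar previously seen dataset, can be sketched as a 1-nearest-neighbour lookup over dataset meta-features. The meta-features and parameter values below are hypothetical and far simpler than what such a system would use:

```python
import math

def nearest_config(meta_features, history):
    """Return the stored hyperparameters of the most similar past dataset
    (1-NN over meta-feature vectors, Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    best = min(history, key=lambda rec: dist(meta_features, rec["meta"]))
    return best["params"]

# Hypothetical meta-dataset: (n_samples, n_features, target_variance) -> tuned params
history = [
    {"meta": (100, 5, 0.2), "params": {"alpha": 1.0}},
    {"meta": (10000, 50, 1.5), "params": {"alpha": 0.01}},
]
print(nearest_config((9000, 40, 1.2), history))  # closest to the large dataset
```

A production system would normalize the meta-features and learn a regressor over them instead of plain 1-NN, but the shortcut over trial-and-error search is the same.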
Eirini Pantelidou
Design and development of a system for incremental static analysis of software projects
Nowadays software technology has made great progress. Open source software repositories (e.g. GitHub) contain a plethora of software projects, most of which have a high level of complexity. Open source software products are projects that change continuously in real time, both in the number and in the content of their source code files. Therefore, scheduled quality control of the entire project from scratch, once a day or week, is not enough, because it delays feedback to the developers. In general, static analysis aims to detect and report to the developer code errors, bugs, security vulnerabilities and violations of programming rules. But every developer needs to know how each change to a file affects the overall quality of the software project. This purpose is served by static analysis through the production of quality metrics. The need for timely knowledge of quality changes in a software project led to this diploma thesis, which aims to calculate static analysis metrics only for the changed code files, not for the entire software project from scratch. Thus, a system was designed which directly isolates the changed and affected files from all files in the repository and then, using a static analysis mechanism, extracts the new values of the quality metrics exclusively for the changed files. The results of using this system show that it contributes dynamically to the continuous quality control of software projects, as it provides immediate and targeted information about the changes. It also enables time and resource savings and helps optimize product quality, as it prevents the developer from wastefully building the software project on low-quality code parts.
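The core of such an incremental pipeline, re-analysing only files whose content changed since the last run, can be sketched with content hashes. This is a simplified stand-in for the thesis's change-isolation mechanism, and the file names are illustrative:

```python
import hashlib

def changed_files(old_hashes, files):
    """Return the files whose content hash differs from the previous analysis run,
    plus the new hash snapshot to store for the next run."""
    new_hashes, changed = {}, []
    for path, content in files.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        new_hashes[path] = digest
        if old_hashes.get(path) != digest:
            changed.append(path)  # new or modified -> needs re-analysis
    return changed, new_hashes

# Snapshot from the previous run: only a.py existed, unchanged since then
previous = {"a.py": hashlib.sha256(b"x = 1\n").hexdigest()}
current = {"a.py": "x = 1\n", "b.py": "y = 2\n"}
to_analyze, snapshot = changed_files(previous, current)
print(to_analyze)  # only the newly added file is re-analysed
```

The static analysis mechanism is then invoked only on `to_analyze`, and the metric values of untouched files are carried over from the previous run.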
Alexandros Delitzas
Understanding website aesthetics using deep learning
Website aesthetics play an important role in attracting users and customers as well as in enhancing the user experience. In this work, we propose a tool that automatically measures website aesthetics. For this purpose, we developed deep learning models which present a high correlation with human perception. These models were developed using two different datasets. The first dataset was created with rating-based ranking: it contains user judgements on websites in the form of an explicit numerical value on a scale. Based on this, we developed models following three different approaches and managed to outperform previous works. In addition, we created a novel dataset with comparison-based ranking, which is a more reliable and natural data collection method. In this case, users were asked to compare two websites at a time and choose which is more attractive. The data collection was performed via a web application that we designed and developed specifically for this purpose. In the experiments that we conducted, we evaluated each model and compared the two data collection methods. This work aims to indicate the effectiveness of deep learning as a solution to the problem, as well as to highlight the importance of comparison-based ranking in achieving reliable results. In the final phase, we developed a tool which measures the aesthetics of a website requiring only a URL as input. This tool can serve as a reliable guide in the hands of designers and developers during the design process.
Charis Eleftheriadis
Towards evaluating Deep Neural Networks Robustness to Adversarial Examples
Deep Learning and Artificial Neural Networks achieve remarkable performance in various tasks, which is why they are preferred in most Artificial Intelligence applications. However, it has been observed that very small perturbations of the original input can lead this category of algorithms to behave in an unpredictable manner. This raises several scientific questions regarding the security and reliability of the systems in which Deep Neural Networks (DNNs) are deployed, and the phenomenon reaches significant proportions if one considers the significance of these systems. Self-driving cars, identification systems and voice recognition are just some examples of applications where security is vital. For that reason, the study of possible methods of attacking these systems through adversarial attacks has increased, and so have the methods for creating robust models against malicious initiatives. In this Master Thesis, the state-of-the-art attacking methods are examined and the adversarial robustness of DNNs of different levels of complexity is evaluated. In this direction, a new alternative method is proposed, with which it is possible to achieve robustness against a category of attacking methods that has not been confronted yet.
Ioannis Flionis
Development of a security vulnerability scanning tool for web applications through functional testing execution
The continuous spread and rising popularity of the internet has resulted in successive breakthroughs in multiple fields of human life. E-commerce, e-banking, social networking and message exchanging are only some of the countless services that modern web applications offer, and they constitute an inextricable part of everyday human life in the 21st century. One of the most vital needs of the contemporary web is security. Web application development and continuous maintenance is a repetitive and sometimes painful process that demands constant validation that the offered functionality behaves in the desired manner and with the desired result, e.g. with no bugs, and that no exploitable vulnerabilities are left behind. The present diploma thesis aims to contribute to the combination of automated functional and security testing execution, for enhanced vulnerability detection. Towards this direction, a web application was developed that applies functional and security testing to other web applications in order to reveal as many of their vulnerabilities as possible, by employing a state-of-the-art penetration testing tool. The developed application analyses the results and visualizes them in comparative lists and graphical charts, providing application security insights over time.
Nikolaos Giannopoulos
Development of personalization techniques for showing advertisements in electronic stores via machine learning
The rapid technological development of recent years, the improvement of computer systems, and the familiarization of a large percentage of the world's population with the digital world have given an enormous boost to e-commerce, which is continually evolving and serving more needs. Simultaneously, the significant increase in users and products resulting from this progress, and the dynamic entry of machine learning and data science into the field of information technology, have allowed e-commerce sites to improve the browsing experience significantly. Nowadays, e-commerce sites provide users with personalized product suggestions that meet their preferences, which means a simultaneous increase in sales for online stores. In addition to personalized direct product recommendations to consumers, there are also advertising views (or banners). They are quite common on e-commerce websites, aiming to promote product groups to consumers according to their preferences or by categorizing them based on key elements of their electronic footprint. Personalized banner recommendations have not been studied to the same degree as product personalization and are more applicable to large e-commerce platforms. This dissertation aims to design and build a real-time personalized banner recommendation system for a medium-sized online e-shop, based on machine learning methods and algorithms. In the context of this work, we propose a novel framework that takes into account the actions of users during their navigation, known as "clickstream" data. The proposed framework effectively recognizes user interests and suggests banners that correspond to their preferences.
Akanthopoulos Ilias
Robustness of deep neural networks
While technology evolves rapidly, more and more responsibility is delegated to automated decision-making by computer systems. The application of machine learning techniques to various domains has led to breakthroughs in everyday chores and tasks. Apart from the typical domains where machine learning is applied, authentication through face recognition or fingerprints is an interesting field of application. More specifically, Computer Vision is the branch of Artificial Intelligence that deals with the perception of the computer through image input, with the ultimate goal of simulating the operation of the human eye. However, although computer systems are now in a good position to identify objects, we often encounter cases where some of those objects are evaluated incorrectly. When it comes to authentication, the importance of the decisions that computers are called to make is significant, and one can easily understand that there is no margin for error. To remedy such problems, research is performed on the vulnerabilities of the machine learning algorithms employed, so that the necessary measures are taken. Every decision-making model perceives several objects and evaluates them according to its acquired training. In the context of this diploma thesis, an ensemble of Deep Neural Network (DNN) models is studied and evaluated in terms of robustness, i.e. their ability to cope with conditions that can lead to wrong decision-making. Through this study, the most effective methods for creating robust models are also sought, as well as methods that intend to further shield already existing models.
Mpekiaris Theofanis
Optimal route planning and on-road autonomous vehicle navigation with and without dynamic obstacles
The technology of autonomous driving has been extensively studied in recent years and especially in the last decade. Both the scientific community and industry are making significant efforts to develop the necessary sophisticated technology and ultimately to achieve autonomous driving. Vehicles and transportation play a key role in the development of trade and consequently in the development of societies as we know them. Autonomous driving will dramatically increase citizens' safety in the coming years and reduce transportation time and traffic congestion. The technology of autonomous driving is quite complex and its development is a challenge for the scientific community. The operation of autonomous vehicles requires a very good understanding of their environment, immediate response to changes in it and, therefore, a reliable assessment of their position in space. The vehicle must have the appropriate technology to enable it to navigate the road network and execute complex driving scenarios. For this purpose, vehicles are equipped with appropriate state-of-the-art sensors as well as a suitable control and decision-making system. Implementing such systems is a quite complicated process, since they consist of individual subsystems specialized in solving specific problems of autonomous driving. This dissertation focuses on the development of an autonomous driving system. The purpose of the system is to navigate the vehicle optimally and safely from a starting point to a destination point, within a city where there are vehicles and pedestrians, while respecting traffic rules. The system was developed as an ego-only system and in modular form. The software was built using the Python programming language and the ROS middleware. The Carla simulator was used for system development, with which the tests, the experiments and the simulation of autonomous driving were performed.
The system was developed in a modular form and consequently consists of the following individual subsystems: a) perception, b) behavior selection, c) behavior prediction, d) construction of a basic path (or route), e) construction of local paths, f) vehicle control. Each of these subsystems is responsible for performing specific processes that are necessary for successful autonomous driving. The behavior selection subsystem uses multi-criteria decision analysis to evaluate and select the appropriate behavior for the vehicle based on the environment in which it moves. (continue in full text)
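A weighted-sum form of multi-criteria decision analysis, one common way to realize the behavior selection described above, can be sketched as follows. The criteria, weights and ratings are illustrative, not the thesis's actual ones:

```python
def select_behavior(candidates, weights):
    """Weighted-sum multi-criteria scoring: each candidate behavior is rated
    per criterion in [0, 1]; the highest weighted total wins."""
    def score(ratings):
        return sum(weights[c] * ratings[c] for c in weights)
    return max(candidates, key=lambda name: score(candidates[name]))

# Hypothetical criteria and ratings for an ego vehicle behind a slow car
weights = {"safety": 0.5, "progress": 0.3, "comfort": 0.2}
candidates = {
    "follow": {"safety": 0.9, "progress": 0.4, "comfort": 0.9},
    "overtake": {"safety": 0.6, "progress": 0.9, "comfort": 0.5},
    "stop": {"safety": 1.0, "progress": 0.0, "comfort": 0.7},
}
print(select_behavior(candidates, weights))  # "follow" wins the weighted sum
```

In a running system the ratings would come from the perception and prediction subsystems, re-evaluated at every planning cycle.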
Konstantinos Letros
Real-Time Detection Of Abnormal User Behavior In Web Applications Using Machine Learning Techniques
Given the rapid development of artificial intelligence in recent years, a question arises as to whether it could be used to improve the security of online services. The internet is increasingly becoming a large part of people's lives, making it one of the main means of media, entertainment, communication and more. However, there are many who use the internet with the intention of increasing profit, not always through legal or moral means. By detecting and exploiting system vulnerabilities, malicious internet users can gain access to sensitive data, resell it to third parties, or even attack the websites of large corporations or governments. At the same time, with the increase in computing power and the development of mathematics and statistics, significant progress has been made in the field of artificial intelligence (AI). AI has applications in many different domains, such as education, economics, science, health and more. In recent years, it has also been used in the cybersecurity domain to develop security systems that seek to predict or detect an impending attack while minimizing the possibility of intrusion and data leakage. Within the context of this diploma thesis, a system has been developed that runs in parallel to a web application, aiming to detect abnormal user behavior in real time while also taking measures to exclude malicious users.
Argyrios Papoudakis
Idioms Extraction from Code Repositories
In the context of this thesis, we worked on mining idioms from repositories. By the term idiom, we mean a small fragment of code that recurs across repositories and has a specific semantic purpose. Idioms are characterized by their readability and reusability in performing a specific task. Experienced developers aspire to write idiomatic code, which leads to better-performing and easier-to-maintain applications. The importance of idioms is evident, as integrated development environments such as Eclipse and IntelliJ provide specific tools that offer idioms to users. Our approach to the problem of automatic idiom extraction is focused on clustering code snippets from popular software projects, extracted from GitHub based on their popularity. For the representation of the snippets, Abstract Syntax Trees were used, which retain both the structural information of the code and the semantic information in variable and method names. The comparison of the source code fragments was performed with the pq-grams algorithm, a method for measuring the distance between trees. The most representative code snippets of the resulting clusters were then converted to a generalized format retaining the semantic content of the code. The results of the above procedure were evaluated on a test set and were very encouraging.
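The pq-grams idea, comparing trees through small (ancestor, children-window) patterns, can be sketched for the special case p = 1. The tiny trees below are illustrative stand-ins for real ASTs:

```python
from collections import Counter

def pq_grams(tree, q=2):
    """pq-gram profile with p=1: for each node, emit (label, q-window of child
    labels), padding the child sequence with q-1 '*' marks on each side."""
    label, children = tree
    grams = Counter()
    padded = ["*"] * (q - 1) + [c[0] for c in children] + ["*"] * (q - 1)
    for i in range(len(padded) - q + 1):
        grams[(label, tuple(padded[i:i + q]))] += 1
    for child in children:
        grams += pq_grams(child, q)
    return grams

def pq_distance(t1, t2, q=2):
    """Distance = 1 - 2|P1 and P2| / (|P1| + |P2|), as in the pq-grams metric."""
    p1, p2 = pq_grams(t1, q), pq_grams(t2, q)
    inter = sum((p1 & p2).values())
    return 1 - 2 * inter / (sum(p1.values()) + sum(p2.values()))

# Tiny trees as (label, children) pairs; two near-identical snippets
a = ("if", [("cmp", []), ("assign", [])])
b = ("if", [("cmp", []), ("call", [])])
print(pq_distance(a, a), pq_distance(a, b))  # identical trees give 0.0
```

Clustering then groups snippets whose pairwise pq-gram distances are small, so each cluster collects recurring structural shapes.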
Andreas Siailis
Sales forecasting using a hybrid ARIMA-LSTM model
Predicting the demand for a product or service can be crucial for the well-being of most companies. Inventory planning, production scheduling, cash flow planning, decisions concerning staffing levels and other kinds of decisions can all depend on the precision of forecasts. Making these predictions as precise as possible leads to better customer service and higher satisfaction levels, making customers more likely to buy again. In addition, some kinds of costs are lowered due to the prevention of unplanned emergency restocking, and there is a lower possibility of excessive stock levels and unsold products. However, despite the importance of accurate demand forecasts, surveys have shown that the most used method for demand forecasting among companies in the USA (48%) depends on spreadsheets, while only 11% of companies use specialized software. Available forecasting software uses a wide variety of methods that suit the nature and category of the product, and accuracy metrics are used to compare different methods. The purpose of this undergraduate thesis is to develop and explore the appropriateness of an improved hybrid model, initially proposed for univariate time-series prediction, in the case of forecasting sales numbers. The proposed model uses a modern neural network which has shown promising results in recent years compared to a classic feedforward neural network. In addition, an extra feature is used with the aim of further improving the predictions produced by the neural network.
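The hybrid decomposition behind ARIMA-LSTM models, a linear model capturing the autoregressive component and a nonlinear model fitted on its residuals, can be sketched in miniature. Here an AR(1) least-squares fit stands in for ARIMA and a residual-mean placeholder stands in for the LSTM; the sales figures are invented:

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient phi, so that y[t] ~ phi * y[t-1].
    This linear part plays the role of the ARIMA component."""
    num = sum(series[t - 1] * series[t] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def hybrid_forecast(series, residual_model):
    """Hybrid forecast = linear AR prediction + nonlinear correction fitted
    on the AR residuals (the role the LSTM plays in the hybrid model)."""
    phi = fit_ar1(series)
    residuals = [series[t] - phi * series[t - 1] for t in range(1, len(series))]
    return phi * series[-1] + residual_model(residuals)

sales = [100, 102, 104, 106, 108]
# Placeholder for the LSTM: predict the next residual as the mean past residual
mean_residual = lambda res: sum(res) / len(res)
print(hybrid_forecast(sales, mean_residual))
```

The point of the split is that the linear model absorbs the trend-like structure, leaving the nonlinear model only the harder residual patterns to learn.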
Konstantinos Strantzalis
Detection of a DC Motor Operating Conditions Using Neural Networks in Microprocessor Structures
The field of Edge AI refers to the development of artificial intelligence that is capable of processing data and running locally on hardware devices, without necessarily connecting them to the internet. Therefore, processes such as data creation can be performed without the need to upload data to or download data from the cloud. A main consequence of the above is the reduction of a system's response time in producing results, which has triggered the development of artificial intelligence applications at the edge. Specifically, in the field of predictive maintenance at the industrial level, artificial intelligence applications at the edge can provide operational state recognition for machines in real time. This diploma thesis presents two methodological approaches for detecting three operating states of a DC motor, named good, broken, and heavy load. Initially, for both approaches, features are extracted from the audio data of the IDMT-ISA-ELECTRIC-ENGINE dataset, after the appropriate pre-processing. A separate neural network, following a CNN approach, is then trained for each method. Subsequently, the two models are subjected to post-training quantization and an appropriate conversion and compression process in order to be deployed on an STM32 Discovery Kit IoT Node board. After the completion of the implementations, an experimental application is carried out using the board to check the performance of the models in recognizing the three sound states of the engine's operation, as well as their response to real-time state changes. In conclusion, the results of the above procedures are presented and conclusions are drawn on the performance of the models.
Angeliki-Agathi Tsintzira
Continuous implicit authentication of smartphone users using navigation data
The goal of this study is to propose a methodology for continuous implicit authentication of smartphone users using navigation data, in order to improve security and ensure the privacy of sensitive personal data. Privacy and security are two interrelated concepts. Privacy refers to the user's right to control his/her personal information and how it is used. Security refers to the way personal data is protected from unauthorised third-party access or malicious attacks. Smartphones contain a wealth of personal data such as photos, chats, medical data, bank details, personal passwords and information related to a person's close circle (contacts, work, hobbies, activities). It is of vital importance to protect this information from third parties. Protecting personal data using PIN codes or biometrics is not always enough. In case the device is stolen or lost, an attacker can bypass the security code in many ways. Breaking biometric authentication, such as face recognition or a fingerprint, is difficult but not impossible. A solution to this problem is achieved through continuous implicit authentication. The system processes, as a background process, the user's behaviour as collected from the sensors. If the behaviour does not belong to the owner of the device, the device is locked, protecting both the data and the device. Each user's behaviour is unique; consequently, the device remains locked and personal data is protected until the correct behaviour is recognised. Within the context of this study, the accelerometer and gyroscope sensors were selected to model the way a user interacts with his/her smartphone. The measurements were collected in an uncontrolled environment from an application downloaded from the Store. Two machine learning models were trained, one for each sensor, and the results were then combined to produce the final system's performance.
The performance of the final system exceeded that reported in the literature. In its best experiment, the One Class SVM algorithm achieved a FAR of 1.1% and an FRR of 5.7%, while the Local Outlier Factor algorithm in its best experiments achieved a FAR of 0.7% with an FRR of 8.1%, and a FAR of 2.9% with an FRR of 5%. The proposed system achieved the best FAR compared to other studies, while its FRR was among the best. The results show that the proposed approach provides an additional level of security and privacy and can ensure that 99% of unauthorised users will be denied access to the device and the user's personal data.
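To illustrate the kind of evaluation involved (a toy sketch, not the thesis implementation), the snippet below scores sensor samples by their distance from the centroid of the owner's enrollment data and computes FAR/FRR at a fixed threshold; all data and the threshold are invented for the example:

```python
import math

def centroid(samples):
    """Mean feature vector of the owner's enrollment samples."""
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(dim)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def far_frr(owner_train, owner_test, impostor_test, threshold):
    """FAR: fraction of impostor samples accepted; FRR: fraction of owner samples rejected."""
    c = centroid(owner_train)
    far = sum(distance(s, c) <= threshold for s in impostor_test) / len(impostor_test)
    frr = sum(distance(s, c) > threshold for s in owner_test) / len(owner_test)
    return far, frr

# Toy 2-D "sensor feature" data: owner samples near (0, 0), impostor near (5, 5).
owner_train = [(0.1, 0.0), (-0.2, 0.1), (0.0, -0.1)]
owner_test = [(0.1, 0.1), (-0.1, 0.0)]
impostor = [(5.0, 5.1), (4.8, 5.2)]
far, frr = far_frr(owner_train, owner_test, impostor, threshold=1.0)
print(far, frr)  # → 0.0 0.0
```

A real system would use richer behavioural features and a learned decision boundary (e.g. One Class SVM or Local Outlier Factor), tuning the threshold to trade FAR against FRR.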
Kamtziridis Georgios
Software Security Analysis for an Initial Coin Offering Process on the Ethereum Blockchain
Blockchain is one of the technologies developed during the last decade. Due to its many advantages and, especially, its prospects for implementing decentralized, distributed and secure transactions, Blockchain managed to penetrate the technology world in a short amount of time. Initially, it was solely focused on financial transactions through Bitcoin. Later, Ethereum took the next step, which made Blockchain applicable to a wide variety of environments, such as health and supply chain systems. However, the main advantage Blockchain can leverage lies in the creation of economically decentralized models. According to these models, every financial transaction takes place in a distributed system that is not controlled by any regulator, for example a bank. This system operates independently with the help of its users, while offering secure transaction handling in real time. An alternative funding mechanism has been proposed for this model, known as Initial Coin Offering (ICO), which tries to financially aid small and medium-sized enterprises in an automated, secure and decentralized manner. However, as with any mechanism, an ICO can easily be designed with erroneous security methods, thus leading the enterprise to economic failure. The goal of this diploma thesis is the technical analysis of an ICO process that takes place on a Blockchain platform. The analysis concentrates on the technical parts that can sabotage and diminish the security and credibility of such a process, and presents some solutions that can be applied as countermeasures. Finally, an ICO process is implemented that strictly follows the analyzed material from the previous chapters.


Elpida Falara
Issues Assignment Optimization through the Analysis of Contributions in Open Source Repositories
We are currently experiencing the spectacular results of technological progress. A series of technological innovations has led the present era to be called "the information age". With the rise of cloud computing and the Internet, open source software platforms have been boosted. They are mainly based on the philosophy of distributed version control systems. Among the most popular ones is GitHub, a service that hosts millions of software projects and users. Collaborative development environments have changed the way the software development process is conducted; additionally, they allow continuous monitoring of software projects, which is of pivotal importance. An important part of the software development process is monitoring software projects for possible errors and improvements. Collaborative open source platforms usually have a bug repository where all bug reports are recorded. However, software engineers are usually overwhelmed by the number of reports submitted daily, which have to be assigned to the appropriate software engineer to deal with each bug/issue. A more automated process for assigning bugs/issues to software engineers would definitely ease the work of the software team. This diploma thesis aims to create a system that proposes the suitable software engineer in a software development team to resolve a reported bug. The system was built by retrieving information from open source software repositories on GitHub. The assignment process was carried out by implementing models based on the similarity of the bugs solved by each software engineer in the project, as well as their contribution at the programming level. Text analysis and classification algorithms were applied to model the above parameters. In the end, the performance of the system was evaluated both for each model individually and for their combination.
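A minimal sketch of similarity-based assignment (illustrative only: the names and reports are invented, and the thesis additionally models each engineer's programming-level contribution): each resolved report becomes a tf-idf vector, and a new issue goes to the engineer with the most similar past fix.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn token lists into tf-idf dicts over the corpus vocabulary."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf})
    return vecs

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def assign(new_issue, resolved):
    """Pick the engineer whose resolved issues are most similar to the new report."""
    docs = [tokens for _, tokens in resolved] + [new_issue]
    vecs = tfidf_vectors(docs)
    scores = {}
    for (engineer, _), v in zip(resolved, vecs[:-1]):
        scores[engineer] = max(scores.get(engineer, 0.0), cosine(v, vecs[-1]))
    return max(scores, key=scores.get)

resolved = [
    ("alice", "crash on login null pointer".split()),
    ("bob", "ui button misaligned css".split()),
]
print(assign("login page crash".split(), resolved))  # → alice
```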
Aristotelis Mikropoulos
Implementation of a Source Code Quality Evaluation System for multi-language software projects
Ioannis Loias
Measuring Semantic Similarity of Software Projects Using Comments
The internet has radically changed the means and speed of information transmission. In the field of code development, it has created new prospects through software repositories, which host a large number of software projects that can be used by developers. However, developers often plagiarize source code that they have not developed, without attributing the original creator. In this work we develop a system that can discover source code similarities from a multifaceted perspective that analyzes the semantics and structure of both the source code and its related comments. The system employs a multitude of widespread source code processing techniques to compare software projects and produce similarity values for different features. For example, it supports vectorization algorithms, such as processing source code text into bag-of-words and tf-idf features, for the purpose of applying vector comparison methods at the file and function levels. The outcome of vectorization can also be refined with Latent Semantic Analysis (LSA), which reduces the noise resulting from the use of different terms with the same semantics. Our system also implements a number of existing and novel graph-based methods to compare function call trees. Finally, all methods are able to account for source code comments when calculating similarities between software projects. We tested our system on two datasets of software projects extracted from GitHub; one comprised of software projects and their forks related to the keyword ‘Pacman’ and one comprised of software projects and their forks across different domains. We found that employing source code comments helps better detect whether two projects of the ‘Pacman’ dataset are forks of the same project, producing more descriptive evaluation results than algorithms that do not use them.
Then, for the second dataset, we used the versions of our methods that account for source code comments to help discover which software projects were forks of the same one. All algorithms were found to be approximately equal in yielding higher similarities for forks of the same projects. However, algorithms that compare function call trees could more reliably discover similar projects and yielded almost zero similarities when comparing forks of dissimilar ones.
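The LSA refinement mentioned above can be sketched as follows (a toy example, not the thesis code): a truncated SVD of the term-document matrix merges documents that use different but co-occurring terms, here "car" and "auto" linked through their shared context term "engine".

```python
import numpy as np

def lsa_similarity(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD,
    then compare them there with cosine similarity."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    docs = (np.diag(s[:k]) @ vt[:k]).T          # one k-dim row per document
    norms = np.linalg.norm(docs, axis=1, keepdims=True)
    docs = docs / np.clip(norms, 1e-12, None)
    return docs @ docs.T

# Toy term-document matrix: rows = terms, columns = 3 documents.
# Docs 1 and 2 use different terms ("car" vs "auto") with the same meaning;
# doc 3 is about something unrelated.
X = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # auto
    [1, 1, 0],   # engine  (shared context linking car and auto)
    [0, 0, 1],   # banana
    [0, 0, 1],   # fruit
], dtype=float)
sim = lsa_similarity(X, k=2)
print(sim[0, 1] > sim[0, 2])  # → True: synonyms are merged, unrelated doc is not
```

Raw bag-of-words cosine would score docs 1 and 2 as only partially similar; after the rank-2 projection they collapse onto the same latent direction.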
Panagiotis Mouzenidis
Modeling and expansion of the robotic architecture R4A for automated generation of user interfaces
Model Driven Engineering (MDE) aims at solving problems by building models. These models can describe a system without taking into account platform-specific limitations. Then, by using executable model transformations, a model can generate a platform-specific model. This way, MDE aspires to solve more generic problems that can be specialized to smaller ones by using the right transformations. Since a generic model can apply to many problems, it is cost-effective and produces quality source code. In this diploma thesis, an Eclipse plugin is designed and developed that generates user interfaces to control robots. The user can easily add or remove functionality and generate the code of the user interface. The generated user interfaces communicate with the robots via the API of a framework called R4A. The user interfaces are completely platform-independent, as they are web-based and use only HTTP calls to communicate with the robot. The proposed approach has been tested on the NAO robot, a humanoid robot developed by Aldebaran Robotics in 2008 for research, educational and entertainment purposes. The target group of the plugin is engineers who want a simple user interface to control a robot in no time, but also others who just want to control a robot for entertainment.
Giorgos Tsamis
Automatic code generation for robotic applications based on structured text input
The use of robots is becoming ever more widespread, enabled by constant research in the fields of robotics and artificial intelligence. Robots, having been used in the industrial sector for decades now, are also becoming accessible to ordinary people. This creates the need to find new human-robot interaction styles that are not overly complicated or difficult, for use even by users without a technical background. For this reason, new systems are being developed in order to improve the communication between users and robots, be it with natural language recognition, gesture recognition etc. Towards that end, a system was developed in the context of this diploma thesis that facilitates the assignment of tasks to the NAO robot based on a description of its desired actions, given by the user in a structured text format, and the automatic generation of the corresponding code. The implemented system was based on another code generation system for the same robot, in which graphical symbols were used for the input of the desired actions and the way they were connected. The main topic addressed by this thesis was the temporal part of the above problem: determining the correct sequence of the actions and the transitions between them (serial or parallel execution, conditional or random transition, loop, preemption), based on the text phrases provided by the user. Determining the type of each action was not important in the scope of this work, so it was annotated by the user during text input. Based on that input, the system orders the actions, sets their connections, generates the corresponding code and shows the result to the user as a graph. Then, the generated code can be executed on the robot. The evaluation of the aforementioned system showed that it is possible to effectively describe all the different ways of linking the actions, including their combinations, with the structured text format designed for the system input.
It also showed that it is possible to easily convert natural language scripts to the proposed input format, in order to produce the desired robotic application.
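The core ordering step can be sketched in miniature as follows (the keyword and action names are invented; the actual system parses a much richer structured text format covering conditions, loops and preemption): each input line becomes one step, and a joining keyword marks actions that should execute in parallel.

```python
def parse_script(lines):
    """Turn structured text lines into an ordered list of steps;
    'and' joins actions that should run in parallel within one step."""
    steps = []
    for line in lines:
        actions = [a.strip() for a in line.lower().split(" and ")]
        steps.append(actions if len(actions) > 1 else actions[0])
    return steps

script = [
    "wave",
    "walk and talk",   # these two actions run in parallel
    "sit",
]
print(parse_script(script))  # → ['wave', ['walk', 'talk'], 'sit']
```

The resulting list plays the role of the action graph from which robot code would then be generated.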
Dimitra Ntzioni
Automatic generation of high-level interfaces to collect robot sensor data using the R4A platform
In software engineering, the term automatic programming describes a mechanism that creates a program, which in turn allows scientists to code at a higher level of abstraction. Nowadays, robot applications, both in business and home environments, are gaining traction, increasing the need for automatic software generation without errors. It is well known that robots are equipped with a multitude of sensors, which play a key role in their operation and in accomplishing certain tasks. For this reason, it is often necessary to control the produced data in order to build software systems. This diploma thesis aspires to take the first steps towards automating the development of ready-to-run interfaces to collect robot sensor data. Towards this direction, MDE (Model Driven Engineering) is employed. More specifically, once an abstract model has been defined, a series of transformations takes place, resulting in a fully functional system. This way, the software development process is accelerated and software is produced with greater reliability. Within the context of this diploma thesis, we have designed and implemented CoRSeDA (Collecting Robot Sensor Data Automatically), a system where the user interacts through a friendly graphical user interface and defines the features for the desired sensors of a robot. Based on the sensors and their parameters, the system automatically generates executable code, based on the R4A platform, to collect data from the specific robot. At the same time, a fully functional interface is generated, providing information for the whole system. The data of the system and its sensors are displayed, along with any other information generated, in a web application created for that purpose. To test and evaluate this system, experiments were performed on the NAO robot, an autonomous, programmable humanoid robot developed by Aldebaran Robotics.
Aggelou Evaggelos
Detection and Transfer of Objects by a Robot & Design and Construction of a Power Supply Circuit
The field of robotics has been growing rapidly in recent years and is becoming more and more a part of human life. People nowadays assign small, everyday tasks to autonomous robotic agents, thus facilitating their lives. A widespread application of this kind is the robotic vacuum cleaner, which helps clean the house. Autonomous robotic agents can also be used to explore unknown spaces that are inaccessible due to some disaster, natural or not. Finally, by equipping such agents with a robotic arm, we can expand their capabilities and gather various items from their surroundings. The present diploma thesis proposes a solution for the operation of an autonomous robotic agent capable of exploring and covering (with a camera) an a priori unknown space. Additionally, using a robotic arm, the robot can gather objects scattered within the environment. For the implementation, code was developed which, using data from a laser sensor, creates the map of the area where the robot is located and, through camera data, calculates the field that the camera covers. Furthermore, code was developed that, through RGB camera data, recognizes a specific type of object within the space, approaches it and then, using the robotic arm, collects and stores it, until the robot returns to the point where it began the exploration, having covered the whole area in which it operates. At the same time, a circuit was developed that powers the peripheral devices of the robot, namely the computer and the robotic arm attached to it, via a LiPo (Lithium-Polymer) battery. Using an appropriate sensor, current, voltage and power consumption data are measured and then sent to the computer on the robot. In addition, an LED indicates the voltage level of the battery. Finally, this circuit was printed on a circuit board (PCB - Printed Circuit Board) and placed on the robot.
Christopher Bekos
Design and implementation of Model-Driven mechanism to automate graphical Web-UI generation for domain specific application modeling
This thesis describes the creation of the Simplified Web Sirius Framework (SWSF), a framework designed to overcome Sirius's inability to run independently of the Eclipse environment. The SWSF provides a graphical environment where the user can process models using the widgets defined by a GDSL. The graphical environment is automatically generated, depending on the GDSL given as input, and is similar to the environment provided by Sirius. SWSF has been constructed using both MDE and web technologies. MDE technologies are necessary in order to automatically generate those parts of the code which describe the differences between the graphical environments created for each GDSL. In addition, the use of web technologies provides users with the ability to create and edit models using a graphical environment equivalent to Sirius, while enjoying the benefits of a web app. The SWSF supports some of the most basic and frequently used features of Sirius. An .odesign file and an .ecore file, containing the definitions of a GDSL and the concepts of a particular field respectively, are given as input to the framework. Then, an XML parser is used to transform the information of these two files into a model that conforms to the gdslMetamodel metamodel. Next, this model is used as input to an M2T transformation, which generates two Python modules. These modules are used in conjunction with the executable form of appMetamodel and a UI (User Interface) in order to produce a graphical modeling environment, just as described by the GDSL. The creation of a model which conforms to gdslMetamodel, as well as the M2T transformation, are key elements of SWSF which make it capable of automatically producing different graphical environments according to a given GDSL. SWSF integrates the above-mentioned systems by providing an online application based on the client-server architecture.
The server holds information about the model, its graphical representations and tools describing modifications on it. The client is a UI, which displays graphical representations to the user, using the corresponding information provided by the server. In addition, the client undertakes to handle the interaction with the user and to forward the appropriate requests to the server, in order to properly update both the model and its graphical representations.
Maria Ioanna Sifaki
Applying Data Mining Techniques to Extract Evolution Patterns in Question-answering Systems
Developers still face difficulties and obstacles when reusing code snippets, mainly related to their functionality and the occurrence of errors. This is because the validity of the answers in such communities is not checked by experts but only by the members themselves. Therefore, there is a need to improve code snippets and to correct their errors in a timely manner. Our research has focused on identifying and then clustering the most common edits (and mostly error corrections) found in the history of answers of the most popular community, Stack Overflow. The SOTorrent dataset was the basis both for the implementation of our system and for its qualitative assessment. In this way, we have managed to turn each individual solution of an answer into a generic solution, ultimately offering useful and exploitable information to developers so that they avoid similar mistakes in the future. Finally, the visualization of the data and their clustering significantly strengthened the evaluation of this work, as the groups of answer edits were highly coherent. By creating an edit recommender, we have confirmed the accuracy of the research results and, moreover, we have succeeded in improving the content of some generic edit comments by proposing other, more appropriate ones in their place.
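As a simplified illustration of the clustering step (the edit comments below are invented; the thesis works on SOTorrent edit histories with proper feature extraction), similar edit comments can be greedily grouped by token overlap:

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b)

def cluster_edits(comments, threshold=0.3):
    """Greedy clustering: each edit joins the first cluster whose
    representative comment is similar enough, else it starts a new one."""
    clusters = []   # list of (representative token set, member list)
    for c in comments:
        tokens = set(c.lower().split())
        for rep, members in clusters:
            if jaccard(tokens, rep) >= threshold:
                members.append(c)
                break
        else:
            clusters.append((tokens, [c]))
    return [members for _, members in clusters]

edits = [
    "fix typo in code",
    "fix small typo",
    "add missing import",
    "added missing import statement",
]
print([len(c) for c in cluster_edits(edits)])  # → [2, 2]
```

Each resulting group stands in for one "generic edit" from which a recommendation could be derived.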
Georgia Pantalona
Identifying software engineer profiles using data extracted from version control systems
The software development process is constantly evolving, as indicated by the ever-growing need for producing new software. The need for the fastest possible deployment of new features, together with the effort to integrate the customer into the software development process, has led to the development of new software development models (e.g., agile), where collaboration plays a leading role. These models require engineers to have developed different types of skills, whether hard skills, such as coding, or soft skills, such as communication or project management. Therefore, there is a growing need to recognize the skills of engineers and assess the extent to which they meet the requirements of a role in a software development team. Finally, with the emergence of agile software development methods, the use of version control systems has also increased, and such systems include a large amount of data related to the software development process. The purpose of this diploma thesis is to exploit the data from version control systems for skill recognition, with a view to more objective evaluation and better role assignment. In this direction, data from GitHub are extracted and used to create metrics that reflect the skills of engineers. Additionally, a benchmarking of the metrics is carried out in order to properly categorize the proficiency of each engineer for the different types of skills. Finally, those metrics are presented in the form of graphs in an application. This application presents the profile of each engineer in an easily understandable way, to enable the best possible assignment to a role based on the skills of the engineer. After evaluating the system on a sizable set of software engineers from GitHub, we can conclude that our system produces useful results.
The profile overview is satisfactory as it gives an insight into the role and skills of each engineer, thus contributing to their more efficient exploitation within the software development team.
Panagiotis Sakkis
Automated application of linting rules through machine learning
In the last few years, the vast growth of the Internet has added many new possibilities and has altered the way code is written. The reuse of parts of code is an everyday phenomenon in the field of software development. Thus, writing "clean" code, that is, code that follows the principles of readability, maintainability and extensibility, is of utmost importance. A lot of research is constantly being conducted into ensuring these values. One of the basic tools used towards that goal is the linter. Linters are crucial nowadays, especially for interpreted languages such as JavaScript and Python, which lack a compilation step that would catch many errors early. Researchers throughout the community have lately aimed their work at improving those tools and making them more efficient. The most popular linter right now is ESLint, an open source tool that is fully configurable and, as a result, offers huge possibilities to programmers. Our goal in this thesis is to examine how the programming community uses ESLint, extract useful conclusions based on that, and propose improvements. We attempt to apply modern machine learning techniques in order to solve some of the problems faced when using ESLint. For our analysis we used open source repositories from GitHub, from which we extracted useful data that we make available for future use by researchers who want to engage with the subject. Alongside, based on those data, we implemented tools using machine learning algorithms that offer direct solutions to the problem of configuring the rules of ESLint.
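The configuration problem can be sketched as follows (toy data and a deliberately naive method, not the thesis models): given rule settings mined from other repositories, a setting for a missing rule can be suggested by majority vote among the most similar known configurations.

```python
from collections import Counter

def recommend_rule(configs, partial, rule):
    """Suggest a setting for `rule` by majority vote among the 3 known
    configs most similar to `partial` (similarity = shared rule settings)."""
    def overlap(cfg):
        return sum(cfg.get(r) == v for r, v in partial.items())
    ranked = sorted((c for c in configs if rule in c), key=overlap, reverse=True)
    votes = Counter(c[rule] for c in ranked[:3])
    return votes.most_common(1)[0][0]

# Rule settings mined from three hypothetical repositories.
configs = [
    {"semi": "error", "quotes": "single", "no-unused-vars": "error"},
    {"semi": "error", "quotes": "single", "no-unused-vars": "error"},
    {"semi": "off", "quotes": "double", "no-unused-vars": "warn"},
]
print(recommend_rule(configs, {"semi": "error", "quotes": "single"},
                     "no-unused-vars"))  # → error
```

A learned model replaces the vote in practice, but the input/output shape of the task is the same.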
Orestis Georgiadis
News popularity prediction with image/text content
The prediction of the popularity of online content (text/image/video) is a topic of great interest for the related research community, as well as for the companies that publish such content. The current work focuses on news articles, and its goal is to predict the popularity of an article as soon as it is published. The prediction is achieved by a machine learning system that receives data about the articles as input, processes them accordingly and trains/tests a model in order to estimate the number of impressions of each article. In this thesis, two different models have been developed. The first is a regression model that estimates the exact number of impressions. The second is a classification model that classifies the articles into four different classes depending on the number of impressions. Both models use the same input data. First, the titles of the articles are processed and a vocabulary that contains the word embeddings is created. Those word embeddings come from the pre-trained fastText library, which uses the Continuous-Bag-of-Words (CBoW) method for training. Then, a Python dictionary is created that contains the image labels of the articles, along with the probability of each label fitting the image. Those labels are produced by a ResNet50 classifier trained on the ImageNet database. Finally, the last system input is the publisher of the article, i.e. a website. A convolutional neural network was then developed and used to train both models. The results are promising and seem to improve upon the baseline methods that do not use neural networks for the prediction. One may safely say that the regression model is the more efficient of the two, and that neural networks lead to more efficient predictions of news popularity.
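The title-embedding step can be sketched as follows (with a tiny invented vocabulary standing in for the 300-dimensional pre-trained fastText vectors): a title is represented by the average of its word vectors.

```python
def title_vector(title, embeddings, dim=3):
    """Average the word vectors of a title; unknown words are skipped."""
    vecs = [embeddings[w] for w in title.lower().split() if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

# Tiny stand-in for pre-trained word vectors (real fastText ones are 300-d).
emb = {
    "stocks": [1.0, 0.0, 0.0],
    "rise": [0.8, 0.2, 0.0],
    "market": [0.9, 0.1, 0.0],
}
print(title_vector("Stocks rise", emb))  # → [0.9, 0.1, 0.0]
```

Vectors like this, concatenated with the image-label probabilities and the publisher, would form the input to the network.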
Kosmas Tsiakas
Autonomous aerial vehicle localization, path planning and navigation towards full and optimal 3D coverage of a known environment
The present Diploma Thesis focuses on the implementation of algorithms for solving the problem of fast, reliable and low-cost inventorying in the Logistics industry. The use of drones simplifies this procedure and aims to determine every product's position with an accuracy of a few centimeters. This problem consists of two subproblems: a) position estimation in the indoor environment and b) autonomous full coverage of the area. In order to successfully tackle the problems described above, a known 3D map in OctoMap format is used. During the research, a Particle Filter based algorithm that uses an array of distance sensors around the drone was implemented, in order to track the pose of the robot against the known map. Navigation is based on a PID position controller that ensures an obstacle-free path. As for the full coverage, the targets are extracted and their optimal succession is then computed. Finally, a series of experiments was carried out to examine the robustness of the positioning system for three types of motion, as well as for different speeds in each case. At the same time, various ways of traversing the environment were examined by using different configurations of the sensor that performs the area coverage. The experiments were performed entirely in a simulated environment.
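A PID position controller of the kind mentioned above can be sketched in a few lines (the gains, time step and 1-D velocity model are illustrative, not the thesis parameters):

```python
class PID:
    """Textbook PID controller; gains here are made up for the example."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, target, current):
        error = target - current
        self.integral += error * self.dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

# Drive a 1-D position toward a waypoint, treating the PID output as a velocity command.
pid = PID(kp=1.5, ki=0.0, kd=0.4, dt=0.1)
pos = 0.0
for _ in range(100):
    velocity = pid.step(5.0, pos)
    pos += velocity * pid.dt
print(round(pos, 2))  # → 5.0
```

The derivative term damps the approach so the position settles on the waypoint instead of oscillating; the real controller runs one such loop per axis against the estimated pose.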
Floros-Malivitsis Orestis
Natural Language Understanding for Human-Robot Interaction: the NAO Robot Case
This diploma thesis aims to recognize, inside a natural language text, actions that belong to a predefined list and map them to an already existing robotic platform. It is not attempted to synthesize a fully working algorithm that reflects the logic of the text; rather, a static mapping of the given sentences to actions is performed. The output of the system could be processed by an independent application for the final production of executable code. For the above-mentioned purposes, we have developed a natural language understanding (NLU) system, r4a-nao-nlp, that recognizes the supported actions of the R4A-NAO metamodel. We have implemented a modular software pipeline that segments the text using semantic role labeling to identify multiple user intents per sentence. In addition, the system utilizes the results of coreference resolution throughout the text to enhance the performance of intent classification and slot filling in sentences that include mentions. Since the dataset for training the NLU system had to be created from scratch, our approach has been designed to cope with a low-data regime; there are no requirements in the dataset for sentences that combine multiple intents, since that would result in polynomial growth of the dataset size. The final output of our pipeline is a directed graph that encompasses all detected actions and connects them with the original conjunctions of the text. This implementation benefits from its modularity, since the models used, with the exception of those that perform intent classification and slot filling, come pre-trained on much larger datasets, concern major natural language processing tasks and are therefore bound to improve with the further development of the related technology. We believe that our approach can be utilized, without the need to increase training data, by task-oriented dialog systems or other related applications that often lack the ability to recognize multiple intents per sentence.
In conclusion, we have developed a system that can prove useful to the final user, who can obtain optimal results if they learn about its limitations and idiosyncrasies. This procedure is not considered to demand technical or esoteric knowledge of r4a-nao-nlp.
Pantelis Photiou
Design and implementation of a hybrid system to satisfy execution time constraints for software built with ROS1, ROS2 and IoT frameworks
The enormous development of robotics has created the necessity of remotely monitoring these machines and, at the same time, of remotely monitoring the parameters of a controlled environment, such as temperature, pressure or humidity, using sensors. This is where the Internet of Things (IoT) comes into the picture. In the concept of the IoT, it is very important to connect the "Things" to a network so that they can send and receive data. "Things" can be devices that are used in everyday life, have computational power and allow connection to the Web. The rapid evolution of the IoT industry over the past few years and the large number of low-cost smart devices have created the need for the development of protocols and tools allowing communication between them and their connection to the Web. In addition to the existing protocols, new ones have been developed to efficiently transfer data between all these devices, as well as to remotely monitor them through the Cloud. This diploma thesis presents the development of an IoT application which allows communication between smart devices and robots. For the creation of the system, the use of the ROS2 framework was explored. ROS2 is the latest version of the Robot Operating System (ROS), the most used robotic framework nowadays, and it uses the DDS (Data Distribution Service) communication protocol. DDS is a real-time, data-centric, publish-subscribe protocol created specifically to meet the needs of a fully distributed IoT system. In addition to the infrastructure, experiments were performed to evaluate DDS as a communication protocol for robotic applications, as well as an integration with an IoT platform, drawing useful conclusions.
Dimitrios Tatsis
Automated test case generation through dynamic analysis and symbolic execution
With the current technological advances, there is a rising need for testing the reliability and security of software. Security testing usually requires manual work from a highly specialized security researcher. However, lately a number of program analysis techniques have been employed in order to automate parts of the process. In this thesis, an automatic system is developed that uses dynamic analysis and symbolic execution in order to produce test cases for programs. Dynamic analysis is used to collect run-time information, such as the current program state, the usage of input data by the program and the paths that have been executed. This information is used to accelerate symbolic execution and produce a test case file that is able to pass various program checks and execute more program code, thus increasing code coverage. These test cases can be used in further analysis of the program. The system that was developed gives encouraging results, producing partial input files with the structure expected by the analyzed programs, automatically and without any prior knowledge. However, the limitations of symbolic execution quickly become apparent, since the analysis has exponential complexity. As a result, additional methods must be used in order to surpass these limitations in larger programs.
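The idea of steering input generation by observed program behaviour can be illustrated with a toy search (a crude stand-in for real symbolic execution with a constraint solver; the `parser` and its checks are invented for the example): any byte that makes the program report a previously unseen outcome is kept, because it uncovered a new path.

```python
def parser(data):
    """Tiny stand-in for a program under test with nested input checks."""
    if len(data) < 4:
        return "too short"
    if data[0] != 0x50:
        return "bad byte 0"
    if data[1] != 0x4B:
        return "bad byte 1"
    if data[2] + data[3] != 10:
        return "bad checksum"
    return "accepted"

def generate_testcase(program, length, goal="accepted"):
    """Search byte-by-byte for an input reaching the goal path, keeping any
    byte that changes the observed program outcome (a crude substitute for
    solving the collected path conditions)."""
    data = bytearray(length)
    seen = {program(bytes(data))}
    for i in range(length):
        for v in range(256):
            data[i] = v
            out = program(bytes(data))
            if out == goal:
                return bytes(data)
            if out not in seen:      # new path uncovered: keep this byte
                seen.add(out)
                break
        else:
            data[i] = 0
    return bytes(data)

print(parser(generate_testcase(parser, 4)))  # → accepted
```

Real symbolic execution solves the branch conditions directly instead of enumerating values, which is what keeps it tractable on wider constraints, at the cost of the path explosion discussed above.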
Kelesakis Dimitrios
Enhancing the conversion rate of e-shops with dynamic pricing techniques
In recent years, e-commerce has been established as one of the dominant ways of making commercial transactions. Efficient pricing policy strategies employed by businesses are critical for their survival in highly competitive markets, in order to achieve their goals and maximise their profits. Various dynamic pricing algorithms have been implemented and adapted to the continuously changing conditions of the online markets. These algorithms benefit from the abundance of data available to online stores, data related to market conditions as well as customers' preferences and consumption habits. Utilizing the above data and integrating them into dynamic pricing strategies can give a significant competitive advantage to businesses. However, so far these techniques have been applied to limited business domains, e.g. airline and hotel bookings. This diploma thesis focuses on the development of dynamic pricing methods for online stores that take into account demand, competition, available stock, as well as user profiles. The system created combines the aforementioned data and uses neural networks in conjunction with optimization and personalization methods and algorithms in order to dynamically set the price for each product per customer, so as to optimise the conversion rate.
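A minimal flavour of the optimization involved (with purely illustrative numbers; the thesis learns demand with neural networks rather than assuming a fixed logistic curve) is to pick, from a set of candidate prices, the one maximizing expected profit under an estimated purchase probability:

```python
import math

def conversion_probability(price, base_price, sensitivity):
    """Logistic demand curve: probability of purchase drops as price rises
    above the customer's reference price (all parameters are illustrative)."""
    return 1.0 / (1.0 + math.exp(sensitivity * (price - base_price)))

def best_price(cost, base_price, sensitivity, candidates):
    """Pick the candidate price maximizing expected profit = margin x P(buy)."""
    return max(candidates,
               key=lambda p: (p - cost) * conversion_probability(p, base_price, sensitivity))

prices = [round(8 + 0.5 * i, 2) for i in range(17)]    # candidates 8.00 .. 16.00
print(best_price(cost=6.0, base_price=10.0, sensitivity=1.2, candidates=prices))
# → 9.0
```

Pricing below the reference price wins here because the higher conversion probability outweighs the smaller margin; per-customer personalization amounts to using different demand-curve parameters per user profile.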
Nikolaos Malamas
Full Coverage of Known Area with Unmanned Ground Vehicle using Path Patterns and Semantic Map Annotation
Over the last years, a rapid growth of the robotics industry has been observed. Despite the fact that robots were primarily used for military applications, nowadays many robotic applications have emerged, aiming to help people deal with both everyday and professional tasks. An important field of robotics applications is unmanned ground vehicle navigation in known or unknown environments. There is a vast variety of such systems that have already been developed, such as autonomous vehicles, automated house cleaning robots, autonomous real-time inventorying, mapping of unknown areas etc. The present Diploma Thesis focuses on studying and solving the problem of fast and optimal autonomous inventorying of any known 2D warehouse. This problem consists of three subproblems: a) the separation of the known area into subareas, b) the computation of the visiting sequence of these subareas, and c) the computation of the full coverage path in each subarea. The area coverage is accomplished using sensors with a priori unknown characteristics. A 2D occupancy grid map representing the environment has been used in order to face these problems. First, a topological analysis of the map is performed to locate the area's different rooms, according to which the area is separated. Next, the optimal room sequence is computed. Then, a coverage path for each room is computed through many stages of optimization. The evaluation metrics of the process are the complete area coverage and the execution time of both the computations and the navigation. In addition, in the present Diploma Thesis two different navigation strategies have been developed and compared. Finally, a series of experiments were carried out at each stage of the implementation in order to thoroughly test each part. Maps with different topologies and sensors with different configurations were used to obtain robust results and test the developed process.
As for the experiments, they were solely executed in simulation environments.
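The per-room coverage step (subproblem c above) can be sketched with the classic boustrophedon ("lawnmower") sweep over an obstacle-free rectangular grid. This is only an illustration of the idea; the thesis computes optimized paths over real occupancy grid maps with obstacles.

```python
# Sketch of full coverage of a rectangular occupancy grid: visit every
# cell in alternating row directions so consecutive cells stay adjacent.

def boustrophedon(rows: int, cols: int):
    """Return (row, col) cells in a lawnmower pattern covering the grid."""
    path = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in cs:
            path.append((r, c))
    return path

path = boustrophedon(3, 4)
# full coverage: every cell appears exactly once
assert len(set(path)) == 3 * 4
# consecutive cells are adjacent (unit moves), so a robot can follow it
assert all(abs(r1 - r2) + abs(c1 - c2) == 1
           for (r1, c1), (r2, c2) in zip(path, path[1:]))
print(path[:5])
```

Real rooms are not rectangles, which is why the thesis first decomposes the map topologically and then optimizes the sweep within each region.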
Despoina Touska
Video Forgery Detection using Autoencoder and Recurrent Neural Networks
In the modern post-industrial era, it is undeniable that the image is a dominant element of modern man's life. It captures a wealth of information from the external environment, providing a tangible representation of reality capable of influencing public opinion and shaping personalities and consciousness. Making an image is not a spontaneous process. Among other things, many sociological factors have a direct and intense influence on how the creators of an image construct the reality of their product. In other words, the phenomenon of distorting an image by people with malicious intentions, who aim at presenting a truth other than the original one, occurs very often. This action poses many risks and has a direct impact on people and society. Against this backdrop, we come across branches of forensic science that specialize in detecting such violations. In this direction, the current thesis provides a way to detect tampered regions of videos and demonstrate them with the corresponding heat maps. To this end, techniques of steganalysis were used to extract feature vectors from the data, which were eventually analyzed with machine learning. An architecture based on autoencoders and recurrent neural networks is used. A training phase on a few pristine frames allows the autoencoder to learn an intrinsic model of the source. Then, forged material is singled out as anomalous, as it does not fit the learned model, and is encoded with a large reconstruction error. Recurrent networks, implemented with the long short-term memory model, are used to exploit temporal dependencies. Preliminary results on forged videos show the potential of this approach.
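The anomaly-detection principle above can be sketched in a few lines. Here a mean-feature "model" stands in for the trained autoencoder, and all frame vectors are invented; the point is only that material far from what the model reconstructs yields a high error and gets flagged.

```python
# Simplified sketch of reconstruction-error anomaly detection (a mean
# vector stands in for the autoencoder; data is illustrative only).

def train_model(pristine_frames):
    """Stand-in for autoencoder training: learn the mean feature vector."""
    n = len(pristine_frames)
    dim = len(pristine_frames[0])
    return [sum(f[i] for f in pristine_frames) / n for i in range(dim)]

def reconstruction_error(model, frame):
    """Squared distance between the frame and its 'reconstruction'."""
    return sum((a - b) ** 2 for a, b in zip(model, frame))

pristine = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]   # training frames
model = train_model(pristine)
threshold = 3 * max(reconstruction_error(model, f) for f in pristine)

forged = [5.0, -2.0]                               # hypothetical tampered frame
print(reconstruction_error(model, forged) > threshold)  # True: flagged
```

In the thesis the features come from steganalysis, the model is an autoencoder, and an LSTM additionally exploits temporal structure across frames.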


Vasilios Politiadis
Automated Production and Execution of Tests on RESTful Web Services
The REST architectural style first appeared in Roy Fielding's doctoral dissertation in 2000. The basic idea behind the REST architecture is that all objects managed by a web service can be considered system resources. Since it is based on client-server logic, a client may send a request to the service to create, retrieve, update, or delete a resource representation, through a URI associated with it, by properly using the CRUD verbs of the HTTP protocol. In recent years, thanks to its simplicity and power, the REST architectural style has been embraced by the global software industry, has conquered the field of web services and is now the dominant model for their development. Meanwhile, the growing need for easy and fast development of reliable software has led software engineers to deploy methodologies such as Model Driven Engineering (MDE), with the goal of increasing performance, productivity, and automation in the process of software development. The idea promoted by MDE is to use models at different levels of abstraction while designing systems and to automate the process by using model transformations from higher to lower abstraction levels, until the final executable code is produced. One of the most important issues throughout the software development process is to test the quality and reliability of the produced software. Software testing is in principle performed following two different approaches with regard to the tester's knowledge of the internal structure and design of the tested software. The White Box approach dictates that the tester knows the internal structure of the system and performs tests from the developer's point of view, while the Black Box approach dictates that the tester considers the software as a black box, to which he provides inputs and analyzes its responses, from the end user's point of view.
The context of this diploma thesis is the development of a programming tool that helps automate the generation and execution of tests on RESTful web services developed using the S-CASE MDE Engine. Given the S-CASE PSM meta-model and the PSM model of a given service as input, a Model-To-Text transformation is performed, which produces a dedicated Black Box testing application. Finally, the execution of the former application generates detailed test reports and test results stored in JSON format.
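A Model-To-Text transformation of this kind can be pictured in miniature. The resource names, verbs and expected status codes below are illustrative assumptions, not actual S-CASE output: for each resource in the service model, one Black Box test is derived per CRUD verb.

```python
# Hedged sketch of test generation from a service model: the "model" is
# just a list of resource names; each yields one test per CRUD verb.
# Expected status codes follow common REST conventions (an assumption).

CRUD = [("POST", 201), ("GET", 200), ("PUT", 200), ("DELETE", 204)]

def generate_tests(resources):
    """Model-To-Text in miniature: model -> list of (verb, uri, expected)."""
    return [(verb, f"/{res}", code) for res in resources for verb, code in CRUD]

tests = generate_tests(["books", "authors"])
print(len(tests))   # 8 tests: 2 resources x 4 verbs
print(tests[0])     # ('POST', '/books', 201)
```

The generated testing application would then execute each tuple as an HTTP request against the running service and record the outcome in JSON.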
Anastasios Kakouris
Continuous User Authentication in Web Applications through Behavioral Biometrics
The use of continuous authentication systems that employ behavioral biometrics is gradually gaining ground as the preferred method of authentication. This is due to the limitations imposed by standard authentication methods, which are unable to guarantee user identity beyond initial authentication and conceal serious security issues such as impersonation and exposure of personal data to third parties. On the other hand, a user's behavioral biometric trait is very difficult to copy or intercept in any way, and as a result it can be used for the implementation of a continuous authentication system, ensuring in this way a secure session for the user. Within the context of this thesis we choose as a behavioral biometric trait the way a user interacts with his/her keyboard. Practically, we use keystroke dynamics to identify and authenticate users. Specifically, we analyze the keystroke digraphs of a user that are collected from the typing of words, and extract three features: the hold time of the first key of a digraph, the hold time of the second key, and the time elapsed between the release of the first key and the pressing of the second key. In order to test and evaluate the implemented system we collected a total of 59,000 keystroke events from 37 subjects within a period of 12 weeks. Using that data, we tested various pattern recognition models, such as classification and outlier detection, while experimenting at the same time with data pre-processing techniques for the reduction of feature vector dimensions (PCA). The best results are obtained by employing a One-Class SVM on a 3-d feature vector, achieving a 0.61% False Accept Rate (FAR) and a 0.75% False Reject Rate (FRR), and by employing Gaussian Mixture Models on a 2-d feature vector, achieving a 1.35% FAR and a 1.71% FRR. Our results show that keystroke dynamics can be used effectively in a continuous authentication system.
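The verification idea can be sketched with a simple centroid-and-threshold detector standing in for the One-Class SVM; the timing values (in milliseconds) are made up. Each sample is one digraph's three features: [hold time of key 1, hold time of key 2, flight time].

```python
import math

# Simplified sketch of digraph-based verification (centroid detector as a
# stand-in for the One-Class SVM; all timings are invented): a claimant is
# accepted if their sample lies close to the enrolled user's mean profile.

def enroll(samples):
    """Build a profile: mean vector plus an acceptance radius."""
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
    radius = max(math.dist(mean, s) for s in samples) * 1.5
    return mean, radius

def verify(profile, sample):
    """Accept the sample if it is within the profile's radius."""
    mean, radius = profile
    return math.dist(mean, sample) <= radius

user = [[95, 88, 120], [100, 90, 115], [98, 85, 125]]   # enrollment samples
profile = enroll(user)
print(verify(profile, [97, 89, 118]))   # similar typing rhythm -> True
print(verify(profile, [40, 200, 300]))  # very different rhythm -> False
```

A One-Class SVM learns a far more flexible decision boundary than this sphere, which is what makes the sub-1% error rates above achievable.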
Anastasios Loutroukis
Employing semantic analysis methods for personalizing recommendation in e-commerce systems
The digitization of markets has made the e-commerce industry the dominant way of performing human transactions. A key challenge for e-commerce systems is the high volume of data they manage; appropriate techniques need to be designed and developed in order to personalize the content available to consumers, offering them information on products of interest to them. Recommendation Systems for e-commerce, which employ machine learning and data analysis techniques in order to generate appropriate personalization content models, have been developed in this direction. The main problem characterizing existing systems is the lack of semantic understanding of the provided recommendations, as most of the literature algorithms focus on proposing products solely based on the analysis of users' rating patterns. Within the context of this diploma thesis we have focused on the design and development of semantically aware personalization techniques for e-commerce. These techniques focus on the content that characterizes the products and the interests of the users. The model that is generated employs a set of natural language processing methods and achieves, first, the categorization of products based on their thematic content and, second, the assignment of users to the extracted product categories based on the users' interests.
Alexandra Ampartsoumian
An alerting system for the elderly by utilizing the NAO social robot
The social and technological inclusion of the elderly, as well as their psychological and physical support, have recently become a major issue, due to the fact that senior citizens comprise an ever-increasing percentage of the general population. Within this context, numerous scientific studies have been conducted worldwide, engaging socially assistive robots, in order to find solutions that will enhance the autonomous living and the overall quality of the seniors' life. In the present diploma thesis, the humanoid robot NAO assumes the role of a socially assistive robot, in order to assist the elderly towards quality living. In particular, NAO reminds seniors of their medication and general events of their everyday life, as well as plays songs associated with their past memories and experiences. Furthermore, each time a medication event reminder is triggered, the proper medicine image is simultaneously displayed on a computer screen. Appropriate information about medication and music tracks can be inserted by the caregiver of the elderly person via the application graphical interface, while scheduling of all types of events is possible via the Google Calendar UI. Additionally, in the context of the present application, the assistive robot can operate as a recreational companion so as to contribute to the improvement of the mental and emotional well-being of the elderly. Its entertaining role is accomplished through a set of activities that allow the elderly user to be informed about upcoming events of their everyday life and to listen to music pieces of their choice on demand. The interaction of the elderly with the robot is accompanied by a series of interactive images displayed on a computer screen so that the activity is as pleasant and user-friendly as possible. The evaluation of the implemented robotics application by a psychologist specialized in issues related to the elderly, can be found at the end of this document.
Napoleon Christos Oikonomou
Call by Meaning: Calling Software Components Based on Their Meaning
Software development today involves code reusability to a great extent. Software components to reuse are often difficult to fully understand, since they are written by third parties and are usually designed to solve more abstract and general problems. This makes component-based software development a tedious process, as it requires developers to first find the component they need, then understand exactly how it works and make it compatible with their system and, lastly, continuously upgrade it to stay compatible when the component changes. Software developers have long realized that even the creation of a simple application is now quite complex. This is because we still rely on the description of various components based on their name. The problem is that this name consensus cannot be easily expanded outside the environment in which it was created. As a result, discovering software components and making them compatible, as well as enabling the application to respond to changes in its external environment, become difficult. This Diploma Thesis deals with the creation of an infrastructure that, potentially, replaces arbitrary naming conventions, creating an environment in which component discovery is based on the generally accepted assumption that any method can be described analytically and uniquely if the description of its inputs and outputs is sufficiently detailed. Aside from discovery, having in mind installation and compatibility issues, it seemed logical that this infrastructure should be cut off from any local development environment and be a more universal part of the ecosystem.
Ioannis Maniadis
UI Personalization in E-Commerce through User Interest Analysis
In recent years, the increase of consumer activity on the internet, as well as the increase in processing power made available to smaller businesses, have created the potential and necessity for websites to study their visitors in depth, in an effort to approach them in better and more intuitive ways. Specifically with regard to e-commerce websites, machine learning techniques are being applied to make personalized recommendations of specific products and/or product categories to visitors, based on their recorded activity and/or their known traits (age, gender, etc.). This process is one of the ways through which web personalization is applied, and to achieve it a number of methods have been developed, each aiming at dealing with specific aspects of the issue. The objective of this diploma thesis is to design and develop a method which efficiently analyzes the recorded activities (history) of an e-shop's visitors, and makes predictions about their interests when they revisit the website. Analysis is performed on visitor data read from the website, as defined by the website's administrator. This is a novel approach which deviates from typical methods found in the relevant literature. Based on these data, our system applies machine learning techniques to predict which sections of the website are most likely to be of interest to each user in their next session. Within this framework, different techniques are tested, evaluated and compared, and the ones that yield the best results are presented.
Valasia Dimaridou
Analysis of Human Presence and Behavior at Points of Interest
The widespread use of image recording systems has led to the implementation of integrated systems for human behavior observation at various points of interest. Meanwhile, research towards targeted advertisement has increased during the past years. The combination of the two reported scientific tendencies, along with the continuous evolution of the hardware responsible for image recording and processing, has led to the design, development and validation of an innovative system responsible for analyzing human presence at points of interest. The above need is summarized in a methodology that conducts statistical analysis of the number of people crossing a point of interest (e.g. in front of billboards), while estimating and recording some biometrics (such as age and gender) and the direction of each person's glance, provided his/her position and pose allow it. Within the context of the present diploma thesis, an extensive survey was carried out concerning previous work on methods used to detect and analyze individuals passing by a place. Furthermore, our work proposes a method of handling videos including people, in order to solve the problems of foreground and background separation, face detection, finding distinctive landmarks in faces, calculating the rotation in three degrees of freedom and providing biometric labels. The second part of the methodology is responsible for incorporating the proposed methodology into a tracking system which uses depth images. The proposed methodology is meant for usage as a real-time embedded system, so its implementation is as computationally inexpensive as possible. The current work includes an extensive evaluation methodology which demonstrates the capability of using the proposed system for commercial purposes.
Spyridon Papatzelos
Study on Cost of Application Execution and Storing Process in Blockchain environments
In a world that is evolving faster than ever, information, data and intelligence are the most valuable of all goods. How fast information is available, and to whom, are the keys that made Blockchain transform from an idea to a useful tool, which keeps expanding year after year of its very short life. Traceability, i.e. the "what", the "when" and the "where" of the procedures of a transaction, and transparency, i.e. every user's right and privilege to access the Blockchain database, are two of the most important advantages this technology has to offer. Discovering this innovative technology makes the discoverer want to learn more, even if he/she is not in the computer and data professions. The main goals of this thesis are the analysis of Blockchain systems and the search for optimization techniques from the standpoint of data storing cost and application execution in a Blockchain environment. Two objectives have been targeted: first, the implementation of a Blockchain application on the subject of agricultural logistics and, second, the research of optimization techniques on this application. The first stage began with the study of various papers on the characteristics of the operation of a Blockchain system. The Blockchain system is a general concept, which means that it is not an actual implementation. It was introduced to the world of applications by Bitcoin, the first implementation of Blockchain, in 2009. Bitcoin combined a variety of existing technologies to make Blockchain as it is known today. In order to understand Blockchain more efficiently, the study immersed into different environments such as Bitcoin, Ethereum and Hyperledger. The open character, security and user community made Ethereum the optimal choice. The second stage includes a further examination of Ethereum's characteristics. Ethereum allows application implementation through Smart Contracts in the Ethereum Virtual Machine (EVM), its execution environment.
Smart Contracts were developed using the programming language Solidity. All the acquired knowledge was used to design an agricultural logistics application. Finally, various scenarios were tested in order to optimize the cost of Blockchain execution and the storing process. The optimization techniques were applied to the agricultural logistics application.
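The kind of storing-cost arithmetic the thesis optimizes can be sketched with a back-of-the-envelope calculation. The 20,000-gas figure is the standard EVM cost of writing a 32-byte storage slot from zero; the 100-byte "shipment record" is an invented example.

```python
# Back-of-the-envelope sketch of on-chain storage cost: each 32-byte slot
# first written from zero costs 20,000 gas (standard EVM fee schedule),
# so packing data into fewer slots is a direct optimization lever.

SSTORE_NEW = 20_000        # gas per 32-byte slot, first write from zero

def storage_gas(num_bytes: int) -> int:
    """Gas to store num_bytes, one slot per started 32-byte chunk."""
    slots = -(-num_bytes // 32)          # ceiling division
    return slots * SSTORE_NEW

# A hypothetical 100-byte shipment record: 4 slots unpacked
print(storage_gas(100))                  # 80000 gas
# The same record packed into 96 bytes fits 3 slots, saving 20,000 gas
print(storage_gas(96))                   # 60000 gas
```

This is why Solidity-level optimizations such as struct packing and storing hashes of bulk data off-chain have such a direct effect on cost.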
Dimitrios Rakantas
Implementation of a robotic application platform using a robotic web simulator
Nikos Oikonomou
Extracting Semantics from Online Source Code for Software Reuse
The widespread use of the Internet and the convenience of information sharing that comes as a consequence have resulted in essential changes regarding software distribution and development. Software examples are now in abundance in code repositories, as well as in various programming-related websites. However, the search for code examples (aiming at code reuse) proves to be a problematic task when using conventional search engines, as the Software Engineer is forced to set his project aside and waste a lot of time examining the usefulness of the results. In order to cope with the aforementioned difficulties and provide a more specialized solution to the problem, Recommendation Systems in Software Engineering (RSSEs) began to be developed. The fundamental objective of these systems is the ability to recognize, when given a query, code examples with relevant content. Nevertheless, this is difficult for most systems due to the lexical gap between search queries, usually expressed in natural language, and retrieved documents, often expressed in code. In addition, the majority of systems often require the composition of complicated search queries and provide results in a non-optimal order. Finally, most systems are based on simple Vector Space Models (VSM) and do not make use of semantic information for the retrieval of useful code snippets. The need for an efficient solution to the problem led us to the design and development of a new recommendation system called StackSearch. Our system uses the Stack Overflow website as a data source. After careful preprocessing of both textual and code data, we trained certain Vector Space Models. Using the aforementioned models, our system accepts search queries in natural language form and is able to take advantage of the semantic information of the text that accompanies code snippets, thus achieving to present the user with relevant examples.
These code examples have previously been mined from Stack Overflow posts and checked for syntax errors. Finally, we evaluate the system by comparing our ranking algorithm with existing solutions from recent research to ensure its efficiency.
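The core vector-space mechanism behind such retrieval can be sketched in a few lines. This uses plain bag-of-words term frequencies and invented posts, not StackSearch's trained models: documents are ranked by cosine similarity to the query.

```python
import math
from collections import Counter

# Minimal VSM retrieval sketch (TF-only vectors; posts are invented):
# rank documents by cosine similarity to a natural-language query.

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

posts = {
    "p1": "read a csv file into a dataframe",
    "p2": "sort a list of tuples by the second element",
    "p3": "parse a csv file line by line",
}
vecs = {pid: Counter(text.split()) for pid, text in posts.items()}

query = Counter("how to read csv file".split())
ranked = sorted(vecs, key=lambda pid: cosine(query, vecs[pid]), reverse=True)
print(ranked[0])   # "p1" ranks first for this query
```

StackSearch goes beyond raw term overlap by training semantic vector models on Stack Overflow text, which narrows the lexical gap this sketch suffers from.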
Triantafyllia Voulibasi
Test Routine Automation through Natural Language Processing Techniques
Artificial Intelligence and Big Data concern a big portion of the technology research community nowadays. The question is how to move from research to an actual "intelligent" implementation. This work utilizes numerous novel Big Data manipulation and Artificial Intelligence techniques in text mining in order to build a productivity tool prototype that enables Software Testing Engineers to produce automated tests. The tool is founded on Recommender Systems, where Deep Learning approaches calculate the semantic similarity between a search query and the results, after taking into account massive software-related documentation data. Association Analysis empowers the system with the ability to "remember" and improve itself, as older inputs are stored and processed to assign better scores to future queries. This tool addresses test engineers working with Model-Based Testing, where building blocks can be combined to implement an automated test. The user can create a test scenario that is transformed into a ready-to-run test, with automatic requirement tracing supported. Experiments conducted on the dataset of the European Space Agency's Ground Segment test scenarios demonstrate the ability of this domain-specific tool to produce results close to human thinking and to ease testing procedures.
Adamantidou Eleni
Development of an application that provides services based on speech recognition
In an era where technology is a big part of most people's everyday life, verbal communication between a human and a machine could make the use of technological products easier, even for older people. For that reason, a speech recognition application which provides users with information is implemented in this project. The user asks the application an oral question, the application translates the question into text and responds to the user, providing him with information received from the corresponding web service. The application consists of 7 individual stages and is designed to be easily extensible, making the addition of a new service possible with a minor change to the existing code. In this thesis, 3 services that inform the user about hospitals, pharmacies or the weather were implemented. Regarding speech recognition, a new, specialized Greek model was trained in order to improve performance against other models. The new model is trained on recordings of people asking questions whose content is relevant to the 3 services above. The conducted experiments show a significant improvement of the application-specific speech recognition system over a generic system, as well as the efficient response of the application to the user's questions.
Fengomytis Thomas
Source code quality analysis in multi-language software projects
The rapid development of technology and the widespread use of the internet have resulted in the evolution of software development, which is now dominated by the concept of code reuse. Numerous open source software projects, distributed freely in online repositories and easily accessible by developers, have contributed to this. Thus, reusing code to develop new software creates the need for quality evaluation of software components. Additionally, the wide adoption of component-based software engineering techniques has led to the emergence of multi-language software projects, namely software projects containing code fragments written in different programming languages, where developers aspire to optimally exploit the capabilities of each language. This further increases the complexity of software quality evaluation. Existing practices focus on analyzing and evaluating the source code of single-language software projects; in practice they employ static analysis tools that can evaluate software projects in only one programming language. Within the context of this diploma thesis, a quality evaluation system for multi-language software projects was designed to take into account the calls between source code sections written in different programming languages. The methodology of this thesis is based on adapting static metric analysis techniques for quality evaluation, by differentiating the calculation of static metrics based on the various multi-language calls. Applying our methodology to multi-language software projects (implemented in Python and Java) has shown that the system is able to provide a comprehensive and representative quality evaluation model and can therefore be a useful tool for developers.
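One possible way to fold cross-language calls into an overall score can be sketched as follows. The weighting scheme, scores and figures below are all invented for illustration; the thesis adapts the static metric calculations themselves rather than merely reweighting per-language scores.

```python
# Hedged sketch of one aggregation scheme (all numbers invented): weight
# each language's quality score both by its code share (lines of code)
# and by how many cross-language calls touch it, so heavily inter-called
# components influence the overall project score more.

def project_quality(per_lang):
    """per_lang: {lang: (score_0_to_1, loc, cross_language_calls)}"""
    weights = {L: loc * (1 + calls) for L, (_, loc, calls) in per_lang.items()}
    total = sum(weights.values())
    return sum(per_lang[L][0] * w for L, w in weights.items()) / total

q = project_quality({
    "python": (0.80, 4000, 12),   # score, lines of code, cross-lang calls
    "java":   (0.60, 6000, 3),
})
print(round(q, 3))   # 0.737: pulled toward the heavily inter-called Python side
```

The design choice illustrated is that a plain LOC-weighted average would score this project at 0.68, while accounting for inter-language coupling shifts the result toward the more interconnected component.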
Eystratios Narlis
User-perceived quality evaluation of user interfaces in web applications through the identification of dominant design patterns
Web pages have become an indispensable part of gathering and providing information in all areas of everyday life. Whether a user is working on a computer at the office, entertaining himself on a video game console, communicating with others on a smartphone, or entering an address in a GPS device while driving a car, people are constantly interacting with GUIs through which they exchange information. The plethora of available web applications has led to a new reality where each user can find applications that meet every need. In the majority of cases, there are dozens of available applications covering a certain functionality, which makes the design of the interfaces crucial to the end user's choice. Better design, in terms of both attractiveness and usability, significantly increases the eligibility of an application by end users. Based on the above, this diploma thesis aims at providing assistance towards improving the design of graphical interfaces of web applications, by proposing an automated tool that can evaluate the design of a website by modeling how good design is perceived by end users. To this end, training data was collected using data mining techniques on a dataset containing the 5000 most popular websites. Static analysis was performed on these websites in order to identify common design templates for their structural elements, which were then used to create a tree model of website design assessment. The system developed, in addition to rating websites, is able to propose specific design changes based on the dominant patterns that have been extracted, and can therefore be a useful tool for developers.
Dimosthenis Kitsios
In computer science, the term automatic programming identifies a kind of programming in which a mechanism creates a program, allowing scientists to create software applications at a higher level of abstraction. Model-driven engineering is a software development methodology that focuses on creating and exploiting models that are conceptual models of all issues associated with a specific problem. Therefore, it emphasizes and aims at abstract representations of the activities that govern a particular field of application, instead of computational (i.e. algorithmic) concepts. In this diploma thesis, an Eclipse-based method of automated code generation was designed to allow the use of model-driven approaches, and is used to create executable code from Ecore models defined by a metamodel. Model-based software technology aims to reduce development effort by creating executable code from high-level models. The aim of the diploma thesis is to create a pleasant and friendly graphical user interface through which the user interacts and selects the functions that define the commands to be executed by a robot. To test and evaluate the system that was designed, experiments were performed on the Nao robot. Nao is an autonomous, programmable humanoid robot developed by Aldebaran Robotics in 2006.


Sofia Sysourka
Design and development of an aesthetics quality evaluation system of web applications based on structural analysis
Graphical User Interfaces (GUI) form a communication channel between man and machine and they aim to offer an effective and easy way to serve the functional requirements of software. Typical examples of such software are web applications, which constantly grow in both number and popularity. As a result, web application providers strive to build user interfaces that offer attractive aesthetical design and ease of navigation and access to the information that users are looking for. The question raised, which constitutes the basic research field of this diploma thesis, is the following: How can the aesthetic design of a webpage GUI be evaluated? The above question dictates the design and development of a reliable mechanism for evaluating and modeling the design of the GUIs of web applications. This diploma thesis aims to contribute to the aforementioned question by identifying design patterns related to the aesthetic characteristics of the GUIs, as well as specialized patterns which are implemented on webpages of specified content. The process of finding the aforementioned design patterns is based on the way end users perceive aesthetics (user-perceived aesthetics), indirectly reflected on web application popularity. Towards this end, a data collection and processing system was developed, that led to the development of an evaluation model of the aesthetic design quality of the webpages. The training data comprises 75 popular webpages of three different domains (e-shopping, news, search engines). Static analysis was performed on these webpages in order to collect useful information regarding the GUI components (i.e. the number of elements in each webpage and the way they are distributed among the layers of view), as well as the calculation of a series of metrics used widely in bibliography. Classification and clustering techniques were applied on the collected data which resulted in the development of a combined aesthetics evaluation model.
Results successfully incorporate the notion of aesthetics as it is perceived by end users, and therefore the model can be a useful tool for programmers.
Bagia Rousopoulou
Automatic user-perceived usability evaluation of web applications through the identification of dominant design patterns in user interface elements
In recent years, the rapid development of the internet has become apparent, resulting in a plethora of web applications that have become a source of information for millions of users. People, regardless of their age group and their economic and social status, use the web on a daily basis for multiple purposes such as information retrieval, entertainment, communication, business etc. The continually increasing number of web applications, in conjunction with the existence of multiple tools which automate the design of user interfaces, necessitates the development of methods that assess user-perceived usability. The current diploma thesis aims to contribute towards the improvement of web application graphic interface design, by suggesting an automated evaluation model for user interfaces based on crowdsourced information regarding the way usability is perceived by end users. Towards this direction, the proposed system applies static analysis techniques to a number of popular websites in order to calculate a series of metrics closely related to UI aesthetics and visual complexity. These metrics constitute the information basis upon which design patterns are extracted using artificial intelligence and data mining techniques. The identified design patterns are used to create a rule-based system for the evaluation of user interfaces. Preliminary results regarding the usage of the proposed system indicate that it can be a useful tool for developers and interface designers.
Dimitrios Dontsios
Meta-modeling of Non-Functional Software Requirements for RESTful Services
The REST design pattern was first introduced in Roy Fielding’s dissertation in 2000. This pattern is in fact a set of well-defined rules and constraints which, when applied to a given web service, make it more appealing by improving performance, enabling scalability and modifiability, and easing the grasp of the service's functionality. The basic idea of REST is that every object of the service is a resource that can be easily created and destroyed, using Uniform Resource Identifiers (URIs), namely web links. These resources can be modified by a well-defined set of actions, the HTTP verbs, and they can be shared between clients and servers using strict representation forms and protocols. Due to its simplicity, REST became so popular that more and more services are built guided by its architectural style. Thus, an increasing need for tools that automate the process of building RESTful web services is evident. Many such tools are easy to use, but lack in some aspects. Some of them manage to meet more of the constraints that REST demands, while others achieve a higher degree of automation. However, what is common in all existing tools is that they aim to fulfill only functional requirements, whereas they disregard the importance of non-functional requirements. The aim of this thesis is the design and development of a tool that automates the production of a RESTful API, while taking into account non-functional requirements apart from the functional ones. This tool is an extension of the S-CASE MDE Engine that semi-automatically produces RESTful APIs by employing model-driven engineering. The implemented extension uses the MDA process, a model-driven engineering process that the OMG (Object Management Group) introduced for this purpose. The generated code conforms to the MVC architecture using Java EE. It also complies with all the rules that the S-CASE tool introduces, and thus conforms to Richardson’s Maturity Model.
The non-functional requirements are satisfied by modeling design patterns and integrating them into the S-CASE MDE engine, and hence by the integration of those patterns into the produced code.
Athanasios Lelis
Deep auto-encoders for source code retrieval and visualization
Anastasios Dimanidis
RESTful Web API Development using the Gherkin language and the OpenAPI Specification
The problem of effectively satisfying customer requirements in the typical software development lifecycle has been of major concern, not only to the software industry, but also to the academic world. Thus, new software development methodologies like Behavior-Driven Development and the Agile manifesto have been introduced, dictating continuous and detailed communication between the software engineer and the customer. At the same time, the World Wide Web is maturing. The concept of “The Web as an Application Platform” is widely adopted. Inevitably, web developers and industry specialists are discussing methods of effectively designing and developing web applications. The current state of the industry shows that technologies like REST might be the answer to those discussions. This thesis sets two major goals: a) to design a methodology where RESTful Web API functional requirements are described in a customer-friendly format and in natural language, and b) to develop a software tool that will transform the described requirements into technical information. For these goals to be met we employ Gherkin (a user requirements language), the OpenAPI Specification (a specification for REST Web APIs) and, finally, Natural Language Processing (NLP) mechanisms. At first, it was examined how the REST specifications could be mapped to Gherkin. For that reason, members of Agile and BDD companies were contacted, API company blogs and seminars were examined, and the available bibliography was thoroughly studied. Based on this research, the Resource Driven Development (RDD) methodology was designed. Per RDD, the functional requirements of a web application are organized into resources. Thus, the original way of writing Gherkin feature files was revised. The steps ‘When’ and ‘Then’ are now used to model the HTTP protocol. The scenarios are used to describe resource and application state changes, as implied by REST.
The RDD methodology is described in detail with specific examples of Gherkin files. The next step was to develop a software tool, named Gherkin2OAS, which is responsible for converting Gherkin requirements to the OpenAPI Specification. The software is written in Python 3.5. Its functionality, its functions and the NLP mechanisms it uses are thoroughly described. Gherkin2OAS can detect, in natural language text, HTTP verbs, parameter names, types (such as string, int, float, bool, array, file, date, password and more) and properties (such as required, min/max, descriptions and formats), resource linking through the HATEOAS concept, roles/users, HTTP status codes and more. It also has a separate functionality, where it organizes those technical properties into an OpenAPI Specification document. Furthermore, Gherkin2OAS has built-in messages that try to guide the user in writing Gherkin requirements, much the way a compiler helps a programmer.
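As a rough illustration of this kind of keyword-based mapping from step text to HTTP information, consider the sketch below. The keyword tables are purely hypothetical and do not reflect Gherkin2OAS's actual vocabulary or NLP mechanisms; it merely shows the shape of the 'When' → verb and 'Then' → status code detection described above.

```python
import re

# Hypothetical keyword tables, for illustration only.
VERB_KEYWORDS = {
    "create": "post", "add": "post", "post": "post",
    "get": "get", "retrieve": "get", "read": "get", "list": "get",
    "update": "put", "modify": "put", "edit": "put",
    "delete": "delete", "remove": "delete",
}

STATUS_KEYWORDS = {
    "created": 201, "not found": 404, "unauthorized": 401,
    "bad request": 400, "ok": 200, "success": 200,
}

def parse_when(step: str):
    """Map a 'When' step to the HTTP verb it implies, if any."""
    words = re.findall(r"[a-z]+", step.lower())
    for key, verb in VERB_KEYWORDS.items():
        # naive stemming: 'creates' matches 'create', 'lists' matches 'list'
        if any(w.startswith(key) for w in words):
            return verb
    return None

def parse_then(step: str):
    """Map a 'Then' step to the HTTP status code it implies, if any."""
    text = step.lower()
    for phrase, code in STATUS_KEYWORDS.items():
        if phrase in text:
            return code
    return None

print(parse_when("When the client creates a new user"))  # post
print(parse_then("Then the user is created"))            # 201
```

A real tool would of course need proper NLP rather than substring matching, which is exactly where Gherkin2OAS's mechanisms come in.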
Christos Psarras
Development of a Source Code Visualization System using Information Retrieval techniques
The internet has completely revolutionized the way we communicate and exchange information. It has provided the necessary infrastructure for the creation of software repositories, which offer access to large collections of open source software, including software applications and libraries. Libraries provide the building blocks for the creation of larger, more complex software, by implementing useful algorithms that effectively confront specific problems. This functionality, though, comes at a price, due to the considerable time and effort required to understand and/or extend a library. Several applications have been created to analyze the structure and the documentation of a given library and present them to the developer. Even though these tools can be quite effective in some cases, the documentation for a library is often limited or even non-existent, while the structure of the code is not sufficient for deducing its functionality. As a result, there is a growing need for tools that harness the semantic content of source code, left behind by developers in identifier names and comments, in order to provide a semantic description of the functionality of an application, as well as an analysis of the cohesion of its package structure. By utilizing state-of-the-art information processing techniques we have implemented a system that analyzes the source code of a given library, extracts useful information from variable/method names and comments, and identifies semantic topics. Our system supports a set of vectorizers (count, tf-idf) and clusterers (k-means, LDA) and automatically evaluates their performance based on the purity score of the extracted topics. Furthermore, an online search is performed in order to find tags related to the top terms of each topic, thus offering a more abstract description of the topic. Our system also provides a visualization of the distribution of packages to topics.
Finally, it identifies similar topics and clusters them into semantic categories. Based on the results of a case study on Weka, as well as the application of our methodology on 5 other libraries of different sizes, we assess that the purity metric is at least 60% and in most cases over 75-80%. Furthermore, examining the retrieved tags for the topics indicates that their semantic content is described accurately. Finally, we provide a comparison between the clustering algorithms of our system, and further assess their effectiveness with respect to the selected vectorization techniques.
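The purity score used as the evaluation criterion above is a standard clustering metric: each cluster is credited with its dominant ground-truth label, and the majority counts are summed and normalized. A generic sketch (not the thesis code, with made-up toy data):

```python
from collections import Counter

def purity(clusters, labels):
    """Purity of a clustering: for every cluster, take the count of its
    dominant ground-truth label, sum these majorities, divide by N.
    `clusters` and `labels` are parallel lists."""
    by_cluster = {}
    for c, l in zip(clusters, labels):
        by_cluster.setdefault(c, []).append(l)
    majority_sum = sum(Counter(ls).most_common(1)[0][1]
                       for ls in by_cluster.values())
    return majority_sum / len(labels)

# Toy example: 6 source files in 2 clusters, labeled by true package.
clusters = [0, 0, 0, 1, 1, 1]
labels   = ["io", "io", "gui", "gui", "gui", "gui"]
print(purity(clusters, labels))  # (2 + 3) / 6 ≈ 0.833
```

A purity of 1.0 means every cluster contains files of a single package; the 60-80% figures reported above indicate mostly homogeneous topics.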
Themistoklis Papabasileiou
Extracting API usage examples from software repositories
In the era of the Internet, information sharing is an everyday phenomenon. The sheer amount of shared data makes its effective usage mandatory. Software, as a form of information, exists in abundance online, mainly in software repositories. However, the vastness of this information usually makes searching for code usage examples hard, while the usage of software libraries is further obscured by the lack of sufficient documentation. These library usage examples consist mainly of Application Programming Interface (API) usages, for which documentation is not always available and, even when it is, no guarantees are provided regarding its quality. More precisely, conducting such a search through common search engines proves cumbersome and time-consuming. This problem is addressed by Code Search Engines (CSEs) that mine useful code information to provide relevant results. However, they also fail to solve this search problem effectively. Recommendation Systems in Software Engineering (RSSEs), especially those regarding API usage mining, offer a more specialized solution to the aforementioned problem. These systems provide relevant usage examples that match the queries given by the user. Still, most of them do not perform any checks whatsoever on the quality of the results returned, and produce redundant examples or cover only a small part of the API under examination. The need to systematically confront the problem of API usage mining led us to design and implement an RSSE system in order to effectively search for usage examples for a given API. Our system checks whether the retrieved code is compilable and employs a Frequent Closed Sequence mining algorithm in order to ensure that the produced results are of high quality and the API is covered effectively. Moreover, the rejection of redundant information at the mining stage makes our results cohesive.
As output, our system can summarize an API by providing general examples for its methods, as well as process queries regarding specific methods. We evaluate our system with respect to the percentage of API methods that are covered by the produced examples and further assess the quality of these examples by calculating their variety and cohesion. In addition, we conduct a case study on the machine learning and data mining library Weka, where our system is tested in a real-life scenario. The results of the evaluation are quite encouraging, indicating sufficient coverage of API methods while producing cohesive examples in a timely manner.
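The core idea of mining frequent API call sequences can be illustrated with a drastically simplified sketch: counting contiguous call n-grams across snippets and keeping those above a support threshold. This is not the Frequent Closed Sequence algorithm the system employs (which also handles gaps and closedness), and the call data is made up:

```python
from collections import Counter

def frequent_call_ngrams(snippets, n=2, min_support=2):
    """Count contiguous n-grams of API calls across snippets and keep
    those meeting a minimum support (number of snippets containing
    the n-gram at least once)."""
    counts = Counter()
    for calls in snippets:
        seen = set()  # count each n-gram at most once per snippet
        for i in range(len(calls) - n + 1):
            seen.add(tuple(calls[i:i + n]))
        counts.update(seen)
    return {g: c for g, c in counts.items() if c >= min_support}

# Toy corpus: call sequences extracted from three snippets.
snippets = [
    ["open", "read", "close"],
    ["open", "read", "parse", "close"],
    ["open", "write", "close"],
]
print(frequent_call_ngrams(snippets))  # {('open', 'read'): 2}
```

Frequent patterns like `open → read` would then back the general usage examples the system recommends, while infrequent ones are dropped as noise.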
Ioannis Zafeiriou
Software Engineer Profile Recognition Through Application of Data Mining Techniques on GitHub Repository Source Code and Comments
Software development methodologies, or process models, attempt to describe the steps that should be followed along the way from conception to deployment of software. There are traditional approaches that focus on a sequence of discrete and well-defined steps, like the Waterfall model, where communication channels are realized by passing documents, and others, like the Agile model, which emphasize the need for flexibility and constant, direct communication between team members. These newer models are very popular with software teams of varying sizes. Due to the importance and the means of communication described by these models, it is desirable to recruit people that possess both technical and communication skills. The problem that arises when looking for such people, though, lies in the difficulty of assessing these skills. Within the context of this diploma thesis we focus on this issue. To do so, we employ data mining techniques for identifying different team roles, and also assess the activities of team members within the software development and operations process. The implemented system draws user activity data from the GitHub web platform and uses them as input to cluster team members. This way we attempt to provide insight into the different team member roles that appear in open source projects, like the ones at GitHub, and into the performance of the users that act under these roles. After extensive experimentation with different combinations of datasets and evaluation features, the final results offer critical insight into these matters.
Eirini Chatzieleftheriou
Design and Development of a Refactoring-Based Quality Enhancement System
Marina Gerali
Automated Test Case Generation using Source Code Repositories
Recently, programmers and software engineers have started trying to take advantage of the abundance of information on the internet, so as to reuse code snippets which fit their projects, thus saving time and effort. To this end, Code Search Engines (CSEs) were developed, which acquire code snippets from various software repositories and, using data mining algorithms, attempt to present the user with results as relevant to his/her needs as possible. The process of relevant code search is facilitated by the use of so-called Recommendation Systems in Software Engineering (RSSEs), which cooperate with CSEs, respond to more complex queries than CSEs do, take into consideration the project under development and apply sophisticated data mining techniques in order to present results to the end user. Despite the contribution of CSEs and RSSE systems to the field of code reuse, they are unable to solve the problem that is the subject of the current thesis: searching for reusable test methods and automated test case generation. This thesis demonstrates an RSSE system which receives the user's source code and constructs appropriate queries in order to search for test cases in online source code repositories, such as GitHub and AGORA. Using sophisticated techniques, which are presented in detail in later chapters, our system mines data from the retrieved code snippets, evaluates them based on their relevance to the query and checks whether they compile and run successfully. For each method that a user requests, the retrieved test methods are presented to the user, ranked in descending order. The user may select the test methods he/she prefers, so as to construct his/her own test case. Furthermore, he/she can select one of the proposed test cases that arise from all the possible combinations of compilable test methods.
After submitting a set of queries to our system and evaluating its performance, we believe that it produces satisfactory results, since in most cases more than one relevant result is retrieved, whether the user searches for single test methods or for complete test cases.
Ioannis Malamas
Design and Development of a Web Analytics System Based on Monitoring and Analyzing Users' Behavior
The continuous expansion of the Internet is accompanied by its active presence in all areas of human activity. Information pages, e-commerce platforms, social media and other websites are an integral part of modern reality. Everyone interacts with them to a greater or lesser extent, depending on their age, familiarity and particular needs. This new reality feeds the ever-growing trend for new websites and web applications that aspire to attract as many users as possible. Developing webpages and web applications that meet the ever-increasing demands of users is not an easy task: rather, it is a multifaceted problem. Its difficulty lies in the fact that different user categories imply different requirements. In addition, existing tools and recommendation systems that provide suggestions on optimal design have the disadvantage that they offer general assumptions, without taking into account the scope of each website. Thus, the following question arises: "How can a personalized assessment methodology be developed for the design of a website?" The answer to the above question lies in the use of information that originates from the end users themselves. Thus, this diploma thesis aims to contribute to the above research question through the development of a system for recording and analyzing the behavior of website users, in order to come up with useful conclusions regarding the user-perceived optimal design. Recording user behavior can be achieved through the collection of data that reflects how users interact with the website. Typical examples are data related to mouse movements, clicks, the subsections of the website they are accessing, and more. The collected data can then be analyzed to draw conclusions on how users browse the website and how the user experience could be improved. The system implemented in the context of this diploma thesis is called "Synopsis" and is responsible for recording and modeling user interaction within web pages.
\"Synopsis\" was developed as an online application and tested in a real environment where it was used to track the behavior of e-shop users. The results indicate that it can provide valuable information and contribute into the optimization of web pages design.
Vasilis Bountris
Towards Source Code Generation with Recurrent Neural Networks
The evolution of the Machine Learning and Data Science disciplines has been rapid during the last decade. As computer engineers, we are looking for ways to take advantage of this evolution. In this diploma thesis we examine the potential of recurrent neural networks to generate source code, given their effectiveness at handling sequences. We propose two approaches, based on per-character analysis of software repositories. Following appropriate code preprocessing and network training, the models generate source code through a stochastic process. We perform static code analysis on the model outputs, in order to examine the performance of the approaches. We have applied our approach on the JavaScript language. The analysis shows the great representational power of recurrent neural networks, but also the inability of our approaches to satisfactorily address the problem of automatic programming. Based on these findings, we propose further research directions and ways of exploiting the models that were designed.
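The stochastic generation step in per-character models is typically temperature-scaled sampling: the network scores each possible next character, and a character is drawn from the resulting softmax distribution. A minimal sketch of that step, with made-up character scores in place of an actual trained network:

```python
import math
import random

def sample_char(logits, temperature=1.0, rng=random):
    """Draw the next character from raw model scores using a
    temperature-scaled softmax. `logits` maps characters to scores;
    lower temperatures make generation more conservative."""
    chars = list(logits)
    scaled = [logits[c] / temperature for c in chars]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    r = rng.random() * sum(exps)
    for c, e in zip(chars, exps):
        r -= e
        if r <= 0:
            return c
    return chars[-1]

# Made-up scores for three candidate next characters.
logits = {"a": 2.0, "b": 1.0, "{": 0.5}
random.seed(0)
print(sample_char(logits, temperature=0.5))
```

Repeating this draw, feeding each sampled character back into the network, yields the generated source code that the static analysis above then evaluates.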
Eleni Nisioti
Automated Data Scientist
The science of machine learning has managed, based on solid mathematical tools, to transform the current data deluge into an understanding of underlying social, economic and natural mechanisms and into the generation of related predictive models. However, the presence of computationally demanding problems and the current inability to automatically transfer knowledge on how to apply machine learning to new applications and new problems delays the evolution of knowledge itself. The necessity of discovering paths that lead to a deeper understanding of machine learning mechanisms is evident, bearing the ambition of training models that optimize the very process of learning, instead of individual applications. AutoML, which has recently emerged, attempts to automate the application of machine learning. Its most apparent manifestations include software systems that serve as productivity tools, instruments that make experts more efficient and effective, but do not eliminate them. A common feature of these systems is the embedding of meta-knowledge, namely knowledge produced by the application of machine learning in past experiments, a trait that adds experience and adaptability to the system. This diploma thesis aims at implementing a software tool to facilitate the AutoML process. Exploiting current technologies, such as the rich CRAN repository, we explored the opportunities offered by machine learning techniques and have attempted to push forward the state of the art by embedding meta-learning for optimal hyperparameter selection and forward model selection ensembles in our system. The main aspiration of our work was to design and implement an experienced, intuitive and expandable automated data analyst. The experiments seem promising, and we argue that the implemented tool could constitute an informative contribution to the area of AutoML.
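Forward model selection, one of the ensembling strategies mentioned above, can be sketched as a greedy loop: starting from an empty ensemble, repeatedly add (with replacement) the model whose inclusion most reduces validation error. This is a simplified generic sketch, not the thesis implementation; the model names and numbers are illustrative:

```python
def forward_selection_ensemble(predictions, y_true, rounds=5):
    """Greedy forward model selection: at each round, add the model
    whose inclusion in the averaged ensemble minimizes mean squared
    error on a validation set. `predictions` maps model names to
    lists of validation-set outputs."""
    def mse(pred):
        return sum((p - t) ** 2 for p, t in zip(pred, y_true)) / len(y_true)

    chosen, running = [], [0.0] * len(y_true)
    for _ in range(rounds):
        best_name, best_err = None, None
        for name, pred in predictions.items():
            k = len(chosen) + 1
            candidate = [(r * len(chosen) + p) / k
                         for r, p in zip(running, pred)]
            err = mse(candidate)
            if best_err is None or err < best_err:
                best_name, best_err = name, err
        chosen.append(best_name)
        running = [(r * (len(chosen) - 1) + p) / len(chosen)
                   for r, p in zip(running, predictions[best_name])]
    return chosen

# Toy validation data for two hypothetical models.
y_true = [1.0, 0.0, 1.0, 1.0]
predictions = {
    "tree":   [0.9, 0.2, 0.8, 0.7],
    "linear": [0.6, 0.5, 0.6, 0.6],
}
print(forward_selection_ensemble(predictions, y_true, rounds=3))
# ['tree', 'tree', 'tree']
```

Because models are selected with replacement, the multiset of chosen names acts as a weighting, which is what makes this simple procedure competitive as an ensembling strategy.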
Maria Kouiroukidou
Automatic generation of user interfaces for RESTful web services
Over the last decade, the architectural style that has prevailed for the development of web applications is the one introduced in Roy Fielding's thesis in 2000, the REST architectural style. Since then, thanks to its simplicity and power, the REST architectural style has conquered the field of web applications and is practically dominant as far as web service development is concerned. For this reason, the growing demand for and use of REST APIs is accompanied by the tendency to create automated processes that can produce an application that consumes RESTful web services, minimizing the time and cost needed for their development. Many automation tools have been created in recent years; however, many of them are unable to produce ready-to-run applications and require software developer intervention. This diploma thesis aspires to make the first steps towards automating the development of fully functional and ready-to-run web client applications. In order to automate the process of generating web client applications, this diploma thesis employs the MDA (Model Driven Architecture) approach. MDA defines a set of clearly defined templates and tools, and describes a development process where, once an initial abstract model has been defined, a series of transformations takes place, resulting in a fully functional application. This is intended to speed up the process of software development and to generate more reliable software. My diploma thesis, CREATE (Client for RESTful Api Automated Engine), implements an automated graphical user interface development tool. It automatically produces web client applications that consume RESTful web services as generated by the S-CASE tool, manage CRUD (Create, Read, Update, and Delete) requests, and receive, process, and present their responses. These web client applications provide features such as database search, user authentication and communication with external services.
In addition, graphical interface features are provided, such as pop-ups for confirming or updating user actions, navigation menus, image integration, etc. Finally, documentation is produced to better explain the code. CREATE is implemented using the AngularJS framework.
Aspa Karanasiou, Chrisa Gouniotou
Interactive detection, tracking and localization of QR tags employing the NAO humanoid robot
Nowadays, robotics is one of the most rapidly progressing technological industries. Using an autonomous robotic vehicle, plenty of processes are now feasible. One of the most desirable capabilities that a robotic vehicle should possess in order to complete a task is autonomous indoor navigation. The issue considered in this thesis is the calculation of the most efficient path between two points and its safe traversal. Specifically, the implementation is focused on indoor environments which are known in advance and contain static and dynamic obstacles. A method for designing the most efficient path was developed, with the path planned so as to avoid any possible collisions with the obstacles. Furthermore, a localization method was developed so that the robot's position is known in a dynamic environment. To achieve this, a way to distinguish between the kinds of obstacles first had to be determined. Moreover, the method used to recompute the initial path whenever a dynamic obstacle is detected is analyzed, so that the robotic vehicle can avoid such a collision. Finally, a series of experiments was conducted in order to test and evaluate these methods.
Grigorios Christainas
A restification methodology for client-server architectures. Application on the PowerTAC platform
The REST (Representational State Transfer) design pattern is a set of principles and rules for designing web services that was first introduced in Roy Fielding's dissertation "Architectural Styles and the Design of Network-based Software Architectures" in 2000. These principles are in fact a set of rules and constraints that, when applied in the process of designing a web service, make it more appealing and enable scalability. The basic concept of REST is the representation of information and objects as resources. Every object is in fact a resource and can be easily created or deleted through URIs (Uniform Resource Identifiers). Through a well-defined set of HTTP actions, a client is able to access and modify information through a strict set of representation forms and protocols. The ever-increasing demand for web services that are governed by a RESTful architecture is accompanied by a tendency to evolve their creation and operation techniques. Client-server architectures based on outdated architectural patterns tend to be replaced by RESTful architectures and approaches. This diploma thesis deals with the study of a real problem of transforming a client-server architecture to REST. Specifically, the transformation of the PowerTAC platform is considered, a platform that constitutes a competitive simulation of an energy market, where competing entities called brokers offer energy services to customers through contracts and are then asked to maximize their profit by buying and selling energy in order to satisfy their customers. The diploma thesis presents the main problems encountered in the restification process of the platform, as well as the solutions given, aiming at a general presentation of solutions to design problems encountered during restification.
Dimitrios Gouris
Autodiscovery of Web services utilizing the Semantic Web
The Web is adapting in order to handle the magnitude of ever-increasing data. The Semantic Web, as envisioned by Tim Berners-Lee, is emerging slowly, although the required methods and technologies are already there to be applied at large scale. The current thesis is focused on automating the discovery and usage of Web services. We argue that in a dialogue between participants, whether human or machine, there has to be a common context between them. This context guarantees the soundness of their communication. This fundamental context is described through technologies offered by the Semantic Web toolchain. We use the Resource Description Framework (RDF) as our data format. Additionally, a variety of vocabularies is provided in order to assign meaning to data models and services. A network of servers offering Web services is implemented. Their content is described with terms from the HYDRA vocabulary. The data model is built upon terms rather than being simply annotated. The generic client is able to understand those terms and communicate with the servers with the aid of RDF graphs, instead of direct calls to their URLs. In the middle of this communication lies the API-Resolver, a server equipped with an RDF parser and a SPARQL endpoint, aspiring to resolve and match the requests from the client to the desired server. The goal of the current thesis is twofold: the evaluation of this proof-of-concept implementation and the demonstration of the potential of the Semantic Web. However, only with its large-scale adoption will the automation of many processes and the extension of the functionality of the current Web become feasible.
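The basic operation behind matching a client's request to a server, which a SPARQL endpoint performs over RDF graphs, is triple-pattern matching. The toy sketch below illustrates the idea over a plain Python list rather than a real RDF store; the triples and vocabulary terms are illustrative (though `hydra:ApiDocumentation` and `hydra:entrypoint` echo the HYDRA vocabulary mentioned above):

```python
def match(triples, pattern):
    """Match a (subject, predicate, object) pattern against a list of
    RDF-like triples; None in the pattern acts as a wildcard, like a
    variable in a SPARQL basic graph pattern."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Toy in-memory graph describing two hypothetical services.
triples = [
    ("svc1", "rdf:type", "hydra:ApiDocumentation"),
    ("svc1", "hydra:entrypoint", "/api"),
    ("svc2", "rdf:type", "hydra:ApiDocumentation"),
]

# "Which resources describe an API?" -- subject is the wildcard.
print(match(triples, (None, "rdf:type", "hydra:ApiDocumentation")))
```

A resolver built on such matching can discover, for instance, every server that advertises itself as an API documentation resource, without hard-coded URLs.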
Andreas Hadjithomas
Design and implementation of a ChatOps Bot using the Hubot Framework
The technology, communications and information industry has been evolving rapidly in recent years. This is due to the fact that meeting most natural needs depends mainly on technological achievements. Even the major sectors of health, industry, nutrition, mass transportation and communication are based on advanced technology products, which require properly designed software in order to operate. The development and maintenance of software is a versatile and complex process, especially when it comes to large-scale software that requires the collaboration of many people and the combination of various services, tools and technologies relevant to its development. Collaboration between teams, constant updates on work progress and the automation of everyday processes are keys to successful software development. This diploma thesis deals with the implementation of a ChatOps Bot for chat-driven software development within the group chat tool Slack. Its main goal is to provide development teams with the ability to automate tasks and cooperate more easily. The Bot provides users with the ability to manage the services of GitHub, Trello and Jenkins, and to update and exchange information with the rest of the group in common Slack channels about the progress of the work undertaken, without leaving Slack.
Dimitris Niras
Development of a web recorder for automating tests in web applications
Nowadays, spending time on the internet is a daily routine related to almost every area of human activity. People of all ages and of different educational, social, and economic backgrounds visit a wide range of websites daily for information, entertainment, communication, transactions at different levels, and the development of their business activities. The formation of this new reality has resulted in the increasing creation of webpages and web applications that aim to attract as many users as possible. The creation of web applications, as well as their continuous maintenance, is a strenuous process, which requires constant monitoring of the changes that take place on them. In order to achieve this, it is necessary to develop a fairly large number of tests, which constantly validate the proper functioning of the website. However, this manual process proves to be extremely time-consuming, since for each service of the application the developer has to create his/her own tests, which should also be changed whenever any of the website's elements change. This diploma thesis aims to contribute to the automation of test creation and of the testing process. To this end, a Chrome extension has been developed, which is responsible for "filming" a user's actions on a website, recording all the available information that he/she encounters, such as HTML, CSS and JS code, API calls, content, as well as screenshots of web pages. In addition, a web application was created, which is responsible for the presentation and scheduling of the various test results. Both the extension and the application were tested on real-world websites, and the results showed this to be a very useful tool for developers, saving them valuable time.
Dimitrios Tampakis
Design and development of a Conversational Bot for a user-personalized web
The Internet has nowadays become an integral part of people's lives. On a daily basis, users consume the services provided by the Internet for professional, recreational and other reasons. Users interact with computers via appropriately designed user interfaces in order to satisfy their needs. User experience (UX) is the most fundamental metric used to assess human-computer interaction. UX is defined as "a person's perceptions and responses that result from the use or anticipated use of a product, system or service". The notion of UX is directly associated with each individual user. However, each user is characterized by a different level of knowledge and experience as far as the use of the internet is concerned, and has his/her own interests and preferences that match his/her personality. With a focus on improving UX and better satisfying the user's needs, the exploitation of information gathered from the user is deemed necessary. Information such as gender, age, demographic characteristics and the content of websites visited by the user can be used to identify the user's interests and create a corresponding internet profile. The main goal of this diploma thesis is to design a system which identifies a user's interests and provides a mechanism to dynamically re-assess them. Initially, information is gathered from the user's internet history, via a Chrome extension, and from the interaction of the user with a Messenger bot. This information is used to identify the user's interests and create an internet profile. Subsequently, a personalized news feed provided by the Messenger bot enables us to dynamically re-assess the user's internet profile. Within the context of this thesis we present relevant applications and describe the implemented system and its components in detail. In addition, we present results of the system's use by a real user.
Ioannis Agrotis
Design and development of a software quality optimization system using automated correction of coding violations
The ever-growing penetration of the internet into our lives could not leave the software development process unaffected. Broad and easy access to all kinds of information has provided an opportunity for software developers around the world to create a collaborative community for building new software projects, also known as the "open source community". It is now a fact that software development requires systematic source code reuse in order to help create better quality software faster and at a lower cost. However, most of the source code located in open source repositories and available for reuse does not necessarily meet specific quality standards, a fact that makes the development of quality monitoring mechanisms necessary. Towards this direction, in an effort to model quality, a set of standards has been proposed that decomposes software quality into a number of characteristics. Similarly, in an attempt to find common ground between software developers, a number of best coding practices have been proposed to supervise quality at the source code level. For this purpose, static source code analysis tools have been developed that detect and report coding violations; however, they do not offer automatic correction. In this context, we propose an automated code quality improvement system based on the automatic correction of code quality violations, whose primary objective is to be a reliable and useful tool for developers. The first results of the system's application to a series of open source projects lead to the conclusion that our approach can correct a large number of violations and thereby substantially contribute to improving software quality.
Ioannis Iakovidis
Applying reinforcement learning for structured prediction
During the last few years, the increased popularity of the internet, the proliferation of embedded computers and the continuously growing research community have generated an explosive increase in the number and size of the available data collections. At the same time, the increase in available computing power and storage, and the huge interest in fields that demand large amounts of data, have made the analysis of such datasets possible. In practice, though, the exploitation and integration of data from a large number of sources has proved to be a very hard and time-consuming process. Even when working only with data collections that contain similar data, these are seldom in common formats. On the contrary, the greater the variety of data we wish to use, the more effort it requires to convert the data to a common structure. Furthermore, a huge category of data that cannot be utilized easily is that of semi-structured data. This category includes data collections that exhibit a loose structure, such as HTML trees (websites). The exploitation of such data is often prohibitively complicated, or even impossible, if manual data processing is used. The above reasons make clear the need to develop flexible algorithms capable of handling data processing and manipulation with limited or even no human help. Even though a variety of artificial intelligence methods have been applied to this problem with promising results, there still exists a large margin for improvement. Algorithms that belong to the field of reinforcement learning are especially interesting, since we believe that their structure makes them ideal for the task of processing data of various structures. In this diploma thesis we study the performance of reinforcement learning algorithms on a variety of problems focused on structured prediction.
Ioannis Tsafaras
Design and development of an automatic mechanism for Continuous Integration
The progress of cloud computing in recent years has been rapid. Given the advantages that cloud computing offers, it is being used more and more by businesses and, accordingly, many providers offer cloud computing services. Together with its advantages, cloud computing brings several challenges, for example related to data security; these challenges vary depending on each provider's implementation. An important part of the software development process is Continuous Integration (CI). CI aspires to minimize errors and accelerate the progress of software project development and evolution. Testing is automatically performed through CI systems and, upon successfully running the automated tests, CI delivers the latest version of the code to a production or pre-production (staging) environment automatically through Continuous Deployment (CD) and Continuous Delivery (CDE), which are extensions of CI systems. Numerous cloud-based CI implementations are available as-a-Service, but the services provided differ depending on whether the software project is closed or open source, while data (code) security challenges arise, especially for closed source projects. Moreover, the adaptability of these systems to users' requirements is limited. The process of implementing an integrated, customizable, automated CI + CD/CDE system using cloud infrastructure is time-consuming and requires know-how. The subject of this thesis is, after comparing cloud providers, to develop a service for automating the installation, configuration and running of a CI + CD/CDE system. Our approach also integrates static code analysis and evaluation. CI is implemented through Jenkins, an open source software, while static analysis is performed through SonarQube. Automation of the CI creation workflow, as well as of the CD/CDE processes, is performed through the Ansible software configuration management tool.
The outcome of the thesis is a user-friendly web interface that enables, after inserting the appropriate variables, the creation of a CI system, which is compatible with the cloud infrastructure of multiple providers, as well as with the use of local servers. The product can be used by companies or individual application developers.
Giorgos Karagiannopoulos
Design and Development of a Recommendation System for Extracting Source Code Snippets from Online Sources
The spread of the Internet has facilitated the search for useful code in online software repositories, fundamentally changing the way software is developed and maintained. Software engineers focus their effort on combining the best examples and interfaces in order to achieve optimal solutions. Nevertheless, even with a huge variety of available choices, developers are often forced to leave their programming environment and resort to search engines in order to find useful code and examples, reducing their productivity and concentration. Lately, the research area of Recommendation Systems in Software Engineering has developed in an attempt to confront these challenges. These are systems that receive queries from the developer and, through data mining techniques, aspire to provide ready-to-use solutions, such as reusable code. In the current literature there are several systems that receive some form of query and return ready-to-use code snippets. Nevertheless, most of these systems use complex query languages, thus requiring significant effort to properly construct a query. Furthermore, the presentation of the results is often limited, as the developer is only given a list of snippets, without any grouping and without further information regarding their quality. In this work, we design and develop a new recommendation system in order to confront the aforementioned challenges. Our system receives queries in natural language and searches for useful snippets in multiple online sources. Data mining and machine learning techniques are then employed to assess and cluster the snippets. The results are assessed both for their usefulness and their quality (readability), while their presentation allows the developer to easily identify the most suitable implementation. Finally, we evaluate our system on a set of queries to confirm its proper functionality.
Vasilis Remmas
Automatic Build and Deployment of Robotic Microservices in the Cloud
Generated data volumes are constantly increasing, dictating the need for more sophisticated algorithms and mathematical models to achieve faster and more accurate processing. Executing these algorithms/models often requires increased computational resources, which entails increased energy consumption and cost. It is evident that, as data continue to grow, running such processing algorithms on robotic vehicles that lack the necessary computational power and energy autonomy will be impossible. This diploma thesis focuses on the implementation of a system that offloads some robotic vehicle operations to a computer cluster. This way, robots can execute algorithms that would otherwise be impossible to run due to their computational resource and energy requirements. The proposed system also allows developers without robotic programming skills to treat robotic systems under a Software-as-a-Service prism.
Panagiotis Doxopoulos
Providing robotic web services through a hardware node and interfacing with IoT platforms
The rapid development of technology over the last decade has greatly influenced people's daily habits. Nowadays, due to the development of robotics and the Internet, we are entering the 4th Industrial Revolution (Industry 4.0). The communication and collaboration of Cyber-Physical Systems, including machines and robots, among themselves and with humans, is expected to attract researchers' interest for at least the coming decades. A key element of the 4th Industrial Revolution is the Internet of Things (IoT). The idea of IoT first appeared at Carnegie Mellon University in 1982, where a network of smart objects was envisioned in an effort to connect a soft drink vending machine to the network. Nowadays, IoT is entering its heyday, with scientists estimating that by 2020 smart objects will number 50 billion. This diploma thesis presents the development of an IoT system through which various devices and smart objects come into contact, either with each other or with people. The most important part of the system is a router, Crossbar, which allows the connection of smart objects, robots and people to the system. Communication is accomplished using Remote Procedure Calls (RPCs) and Publish/Subscribe (PubSub). The former refers to remote calls offered by a device, while the PubSub protocol achieves asynchronous messaging between independent nodes on an IoT network. Specifically, the WAMP and REST over HTTP protocols were used. In the current thesis, connectivity to a NAO robot, a Raspberry Pi (RPi), the REMEDES system and an Arduino embedded device was achieved. An equally important part of the overall system is the implementation of a hardware node which serves robot-oriented web services. This node was implemented on a Raspberry Pi and hosts a server created with the help of the Swagger framework tools. The server provides RESTful web services for utilization by robots.
The Raspberry Pi is connected to the IoT system, allowing robots to contact the service indirectly; direct communication of robots with the RPi is also possible. In addition, a large number of experiments have been carried out, demonstrating the satisfactory operation of the system and yielding useful conclusions. Finally, some applications have been implemented that showcase the potential of the system.
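The Publish/Subscribe pattern described above can be illustrated with a minimal in-process broker. This is only a sketch of the messaging pattern, not the actual WAMP/Crossbar implementation used in the thesis; the topic name and message format are hypothetical.

```python
from collections import defaultdict


class PubSubBroker:
    """Minimal in-process publish/subscribe broker -- a sketch of the
    messaging pattern only, not the WAMP/Crossbar router itself."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(message)


# Hypothetical topic for a robot's distance sensor readings.
broker = PubSubBroker()
readings = []
broker.subscribe("robot.sensors.distance", readings.append)
broker.publish("robot.sensors.distance", {"cm": 42})
print(readings)  # → [{'cm': 42}]
```

In the actual system, the broker role is played by the Crossbar router, and publishers and subscribers are independent nodes communicating over the network via WAMP.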


Ioannis Gkiliris
Emergent programming practices through crowd intelligence
The rapid expansion of the Internet in recent years has undoubtedly affected people's daily lives, as well as the way they carry out their working duties. Access to a vast volume of information is unprecedentedly convenient and fast. Thus, the whole procedure by which software is developed has entered a new era, where collectiveness and collaboration quite profoundly determine the final outcome. Due to this evolution, substantial research has been carried out on searching for and reusing existing open source code snippets, since platforms like GitHub seem suitable for such purposes. Collective intelligence is therefore something tangible for the software engineering field as well, bearing in mind that modern-day programmers receive considerable support in their endeavors. The purpose of this diploma thesis is to develop an open source tool (statLint) capable of recognizing the user's programming practices that deviate from those of other developers in well-known open source software projects. These emergent practices have resulted from the analysis of numerous packages and applications freely available on GitHub. During the initial stage of this process, a suitable system was developed to collect and analyze the useful data and efficiently provide the summarized knowledge. The statLint package, used as a plugin for the Atom IDE, then uses that knowledge to evaluate the user's practices and, if necessary, to inform them accordingly. A variety of experiments validated the proper function of the system, not only from a practical point of view but also from an overall perspective. Moreover, a strong correlation was ascertained between the evaluations of our tool and those of another, similar one.
In conclusion, the main point of this work was to provide evidence that new tools that keep up with modern programming conventions can be established successfully. This is quite essential considering that, as with natural languages, programming languages are used by human society, and that alone renders the evolution of current practices inevitable.
Natalia Michailidou
Design and Development of Web Client for RESTful Web Services
The REST architectural style was introduced for the first time by Roy T. Fielding in 2000 in an attempt to generalize the fundamental principles of the web's architecture and to present them as a specific set of constraints. Since then, the REST architectural style has become extremely popular for web-service-oriented development and is practically dominant as far as web service development is concerned. The development of web client applications that must consume RESTful web services is, however, limited to web client libraries and involves heavy front-end developer effort to become fully functional. The current thesis aspires to take the first steps toward automating the development of the front-end of web client applications. In order to automate the process of generating web clients, MDA principles (introduced by the OMG group) are applied. The MDA approach supports the definition of models at different levels of abstraction and thus permits software development based on the design objectives related to the problem rather than the underlying computing environment. In this way, acceleration of the software development process and the production of software with higher reliability, extensibility and interoperability are pursued. Within the context of this thesis, the Automated Client Engine is designed and developed. It produces web client applications that consume RESTful web services as generated by S-CASE. These web client applications manage CRUD (create, read, update and delete) requests as defined at the RESTful service level. In addition, the generated web clients provide authentication features and embed UI/UX elements and CSS styling. They are developed in AngularJS and HTML and are ready to deploy.
Sotirios Angelis
User Experience evaluation of On-Screen Interaction Techniques and Semantic Content
In the last decade, web application development has gained increasing popularity. Web services support computer interaction and data exchange via HTTP, and play an important role in the development of these applications. The predominant choice for web services is RESTful web APIs: application programming interfaces that follow the Representational State Transfer (REST) architectural style. The REST architecture became popular because of its simplicity and the ease of processing and extending applications based on it. The increasing demand for and use of REST APIs is followed by a trend of developing new tools and techniques for their generation and consumption. These tools focus on minimizing the time and cost required for the development of a web application and/or the implementation of interfaces that effectively serve the functionality of the application. In order to be considered successful, REST APIs, like any other software product, have to be characterized by a reasonable ease of use that allows people to handle the functionality of the application effectively. Based on the above, the current thesis deals with the design and development of Interact, a graphical user interface tool that is fully adaptable to the structure of a given REST API. It practically produces a complete user interface that does not require any knowledge of front-end development. Interact is compatible with S-CASE, a software platform that includes, among other things, an automated code generation engine (employing MDA primitives) that generates RESTful Web APIs. Interact is developed in AngularJS. It can be used for testing a REST API, as a website, or as a prototype presentation of the API's operations.
Nikiforos Sakkas
Design and Implementation of an interactive system in order to evaluate User Experience (UX) of Interaction Techniques and Semantic Content
Undoubtedly, the Internet has nowadays become an integral part of everyday life. As a consequence, continuous interaction between humans and computers takes place on an almost daily basis. Human-Computer Interaction (HCI) is the scientific field of information technology that focuses on the interaction between people (users) and computers. It is regarded as the intersection of Computer Science, Cognitive and Social Psychology, Linguistics, Industrial Design and many other disciplines. The interaction between users and computers is implemented at the level of user interfaces, through appropriate software and hardware. The most important metric of human-computer interaction assessment, as far as the Internet is concerned, is User Experience (UX). User Experience is a holistic metric of the experience that a website or an application offers to the user. UX is composed of and influenced by many factors, including the various interaction techniques and the use of semantic content. Within the context of this thesis we focus on evaluating User Experience related to different interaction techniques and semantic content. More specifically, six similar webpages were developed, differentiated only by the type of interaction technique they offer to users. Subsequently, six additional webpages were developed, which identify and pinpoint the most significant semantic entities for the user, providing additional material to interact with. Various UX evaluation experiments were performed, and the results are discussed with respect to their significance for UX and user familiarity with the web.
Georgios Ouzounis
Personalized Automatic Speech Recognition
The goal of this thesis is to increase the usability of electronic devices in everyday life. A step towards this goal is the transformation of the communication interface between a human and an electronic device, in order to further approach the natural communication interface between humans. For this purpose, the Personalized Automatic Speech Recognition (PASR) desktop application has been designed and developed. Using this application, a user can compose e-mails in English by simply dictating them. The proposed methodology comprises two stages. During the first stage, the Automatic Speech Recognition (ASR) stage, the user's voice is transformed into plain text. This stage makes use of the open source speech recognition toolkit CMU Sphinx. During the second stage, the output of the first stage is syntactically corrected based on a set of existing e-mails that the user has previously provided. This stage is called the Post-Processing stage and makes use of Natural Language Processing (NLP) techniques to alter the ASR output text. Experiments performed on both the ASR and the Post-Processing system indicate that the latter introduces a significant increase in the performance of the whole application. These experiments, along with their results, are discussed in two separate chapters of this document. Finally, the last chapter discusses future work concerning this application.
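As an illustration of the kind of vocabulary-based correction a post-processing stage can perform, the following sketch matches each recognized word against a vocabulary drawn from the user's previous e-mails. The actual thesis stage uses more elaborate NLP techniques; the vocabulary and example sentence here are invented.

```python
import difflib


def post_process(asr_output, user_vocabulary):
    """Correct each recognized word toward the closest word in the
    vocabulary extracted from the user's previous e-mails.
    A simplified stand-in for the thesis' NLP post-processing stage."""
    corrected = []
    for word in asr_output.split():
        # Keep the word unchanged if nothing in the vocabulary is close enough.
        matches = difflib.get_close_matches(word, user_vocabulary, n=1, cutoff=0.8)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


vocab = ["meeting", "schedule", "tomorrow", "regards"]
print(post_process("scedule the meting for tomorow", vocab))
# → "schedule the meeting for tomorrow"
```

The cutoff threshold controls how aggressive the correction is: too low and common short words get rewritten, too high and genuine ASR errors survive.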
Paraskevas Lagakis
Venuetrack: a smart search engine of points of interest in Thessaloniki, with evaluation capabilities based on sentiment analysis of comments
The world wide web has been rapidly expanding over the last decade, and today more than 40% of the world population uses it on a daily basis. Social media have played a very important part in this increase in the internet's popularity, since for many people social media is one of the few, if not the only, reason to go online. As a result of this explosion, large quantities of raw data have been produced, and their analysis is a huge challenge for the scientific community. In this context, sentiment analysis and natural language processing are at the center of the scientific status quo, presenting great interest and vast new opportunities. This is especially true in Greece, where these fields are still evolving at a relatively slow pace. For that reason, this thesis develops a sentiment analysis system using natural language processing methods. The aim of this thesis is to apply sentiment analysis in order to evaluate points of interest (or venues) in the city of Thessaloniki by evaluating users' comments. These comments are categorized as positive or negative by a classifier that was developed and trained using a relevant dataset. By using the classifier to evaluate each venue's comments, we then decide whether each venue offers a positive or negative experience to the visitor. The results of this NLP system are presented in a web application named Venuetrack, a smart and easy-to-use search engine for venues in the city of Thessaloniki, in which users can search for points of interest on the map of Thessaloniki and check out their information, as well as the classification produced by the NLP classifier.
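The comment classification step can be sketched with a toy bag-of-words Naive Bayes classifier. This is only an illustration of the approach: the thesis trains its classifier on a real dataset of venue comments, while the training examples below are invented.

```python
import math
from collections import Counter


class NaiveBayesSentiment:
    """Toy bag-of-words Naive Bayes classifier -- a sketch of the kind of
    comment classifier described above, not the thesis' actual model."""

    def fit(self, comments, labels):
        self.classes = set(labels)
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(comments, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}

    def predict(self, text):
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / sum(self.class_counts.values()))
            for w in text.lower().split():
                # Laplace smoothing keeps unseen words from zeroing the probability.
                score += math.log((self.word_counts[c][w] + 1) / (total + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical training comments (the thesis uses real venue reviews).
clf = NaiveBayesSentiment()
clf.fit(["great food friendly staff", "awful service cold food",
         "lovely place great music", "terrible noisy awful"],
        ["pos", "neg", "pos", "neg"])
print(clf.predict("great friendly place"))  # → "pos"
```

Aggregating such per-comment labels per venue (e.g. by majority vote) then yields the positive/negative verdict Venuetrack displays.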
Odysseas Doumas
Design of a Platform Specific Model of RESTful Web Services and automatic code generation from this model
In the last decade, the REST architectural style has dramatically changed the way web services are developed. RESTful Web APIs have conquered the programmable web, due to the simplicity and flexibility they provide, with a resulting rise in the demand for Web API development. This demand has given birth to several development frameworks that allow rapid development and aspire to automate parts of the development process. However, most of these frameworks fail to generate ready-to-run applications, while the end program is usually not REST compliant. With this rise in demand in mind, this diploma thesis examines the MDA architecture, an OMG initiative. MDA falls into the category of Model Driven Engineering techniques, whose main characteristic is the systematic use of abstract models and model transformations as active parts of the development process. MDA comprises a set of strict standards and tools, and describes a development process where an initial abstract model is designed, a sequence of model transformations takes place and, finally, a fully functional application is generated. The MDA architecture promises a boost in productivity, an improvement in understandability and communication between the different members involved in the development process, and also an improvement in the reliability, quality, extensibility and interoperability of the developed software. Initially, the present thesis thoroughly explores the basic concepts and philosophy of MDA, alongside a brief explanation of the REST architectural style and its principles. Afterwards, a development tool is designed and developed, which incorporates the MDA architecture and can be used for the development of RESTful Web APIs.
This tool was designed to be compatible with S-CASE, a software project that includes, among others, an automatic code generation engine following the MDA architecture that generates RESTful Web APIs running on the Java environment. The tool developed in this thesis essentially forks the functionality of the S-CASE engine, in the sense that it accepts as input a PIM model (an abstract, platform-independent model that describes the functionality of a RESTful Web Service) generated by the S-CASE engine, and generates the RESTful Web API described by that PIM, but designed and implemented to run on Microsoft's .NET platform.
Konstantinos Sideris
Developing a web application for static analysis of software repositories
In recent years there has been a rapid expansion of JavaScript's ecosystem. Both the establishment of the internet as a development platform for web applications and the Node.js platform, which offered the opportunity to use the language outside the browser for the development of any kind of application, have contributed to this. Node's success led to the creation of the first package manager for JavaScript, which now hosts tens of thousands of JavaScript packages. The accumulation of a large number of software packages in a relatively short time has made the selection and detection of valuable packages difficult. The large volume of information and data available both in repositories and in source code could be utilized for the improvement and evaluation of the available software. This thesis deals with the development of a web application (npm-miner) which aims to enhance the software package selection process through the use of quality metrics for assessment. Users have the ability to search for and compare software, as well as explore statistics on ecosystem quality.
Klearchos Thomopoulos
QualBoa: A Source Code Recommendation System using Software Reusability Metrics
Undoubtedly, the digital era can be characterized by the widespread adoption of the Internet, which has greatly facilitated information sharing. A natural problem that arises is the efficient exploitation of this information, part of which refers to software that can be widely found in open source software repositories. All this information could be of use to software developers in order to support software reuse. To effectively exploit online source code information, alongside conventional search engines, Code Search Engines (CSEs) have been developed. However, these are not adequate for completely addressing the problem of finding reusable source code, as it is not possible to adequately describe the user's query, and moreover they cannot guarantee the functionality and reusability of the retrieved results. As a result, more sophisticated systems were developed, named Recommendation Systems in Software Engineering (RSSEs). These systems aspire to automate the extraction of the query from the code of the developer and to evaluate the functionality of the retrieved results. However, current systems do not consider the non-functional characteristics of source code components, which essentially determine their reusability. The incapability of these systems to address the problem led us to develop an RSSE that covers both the functional and the quality aspects of software component reuse. Our system employs the Boa language and infrastructure, which comprises processed information from software repositories accessed using a query language. Specifically, our system first extracts the query from the source code of the developer and translates it into a query for Boa, in order to find relevant results. Moreover, our system extracts quality metrics and uses them in a model to measure the reusability of each retrieved component.
Thus, upon retrieving components, our system provides a ranking that involves not only functional matching to the query, but also a reusability score. The evaluation indicates satisfactory results, both in terms of quality and accuracy. It is safe to conclude that our system can be effectively used for recommending reusable source code components.
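The idea of combining extracted quality metrics into a single reusability score can be sketched as a weighted aggregation. The metric names, normalization and weights below are illustrative assumptions, not the model actually derived in the thesis.

```python
def reusability_score(metrics, weights=None):
    """Combine normalized quality metrics into a single reusability score
    in [0, 1]. Metric names and weights here are illustrative -- the
    thesis derives its model from static analysis of real components."""
    weights = weights or {"documentation": 0.3, "complexity": 0.4, "coupling": 0.3}
    # Each metric is assumed pre-normalized to [0, 1], where 1 is best
    # (e.g. low cyclomatic complexity maps to a high "complexity" score).
    return sum(weights[name] * metrics[name] for name in weights)


candidate = {"documentation": 0.8, "complexity": 0.9, "coupling": 0.5}
print(round(reusability_score(candidate), 2))  # → 0.75
```

A retrieved component's final rank would then combine this score with its functional match to the Boa query.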
Ioannis Likartsis
Planning and monitoring operations in robotic missions
Mobile robots are used more and more often to accomplish critical goals where human presence is not considered safe. Such goals include search and rescue missions, surveillance and reconnaissance of dangerous areas, space missions, etc. One more advantage that derives from the use of robots in such conditions is the dramatic reduction of cost (much less strict security measures). Human-robot cooperation enables the utilization of the best qualities of each: the robot's ability to perform extremely fast calculations and make educated decisions, provided it can gather enough information, and the human's ability to grasp situations and take action with limited data. To achieve the latter when the robot is in the field of the mission, a system for human-robot communication is required. The field that studies such systems is Human-Robot Interaction (HRI). The goal of HRI is to optimize human-robot cooperation in carrying out missions across a range of possibilities, from full autonomy to teleoperation. The problem studied in this thesis is that of developing a graphical user interface (GUI) that allows operators on the ground to control and monitor the tasks of robotic assets located in the field of the mission. To this end, the GRASP application was developed. GRASP allows the accomplishment of missions using remote robotic assets. GRASP is designed so that it can control various types of robots (rovers, UAVs, arms, etc.). Furthermore, it is extensible when it comes to changes in the software of the controlled robot. The functionalities mentioned above are achieved using configuration files, which also allow GRASP to adapt to potential changes of the on-board computer. Through GRASP, the operator has the ability to create mission plans, send them to the robot, and monitor their execution.
GRASP helps the operator carry out the necessary tasks, providing situational awareness (conditions and environment) by utilizing the incoming telemetry messages. The type and quality of situational awareness provided to the operator depends entirely on the sensors (stereo/thermal cameras, 2D/3D maps, etc.) and the software of each robot (object recognition, digital elevation maps, etc.).
Miltiadis Siavvas
Design and development of a framework for the evaluation of software projects quality based on static analysis and fuzzy multi-criteria decision-making techniques
Our era is characterized by rapid technological development and the continuous digitization of information. Software products are continuously being developed in order to help people achieve their goals more easily, faster and more efficiently. This raises the issue of software quality as a major concern for both end users and the software development companies that wish to offer their customers high-quality services. A lot of research has been carried out in recent years in order to design and develop a universally accepted mechanism for software quality assessment. However, no efficient generic model exists. This is why contemporary research is now focused on seeking mechanisms able to produce quality models that are easily adjusted to custom needs. Within the context of this diploma thesis we focused on the design and development of a system that enables the quality assessment of software products according to a particular set of design aspects. In order to achieve this, a tool chain was developed that allows the production of quality models by applying static analysis to a desired benchmark repository and then using these models to assess the quality of software products written in Java. Multithreading is applied to accelerate the time-consuming process of static analysis, while fuzzy multi-criteria decision-making techniques are adopted in order to model the uncertainty introduced by human judgement. The system produces a carefully calibrated and reliable quality model, the base model, which is used to verify the system and serve as a guide for implementing similar quality models. Finally, an online service was designed and developed that offers quality assessment of open-source software products hosted on GitHub, with the ultimate goal of becoming a reliable code quality certification service. The performed experiments confirmed the proper operation of the system and the independence of the models with regard to the size of the product under assessment.
By assessing both automatically generated and user-developed software products, the contribution of quality models towards the improvement of software product quality was highlighted. Afterwards, a comparison between the serial and parallel implementations was made, leading to the conclusion that the parallel implementation greatly accelerates the static analysis process. Finally, a comparison of the fuzzy weight-generation technique with its deterministic counterpart showed a close correlation between the results of the two methods.
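The reported closeness between the fuzzy and deterministic weight-generation techniques can be illustrated with a minimal sketch, in which expert judgements are modelled as triangular fuzzy numbers, aggregated by averaging, and defuzzified with the centroid method; the judgement values and aggregation scheme below are hypothetical, not the thesis's actual model.

```python
# Sketch of fuzzy weight generation: each expert judgement is a
# triangular fuzzy number (low, mode, high); the crisp counterpart
# simply averages the modes. Numbers are illustrative.

def aggregate(fuzzy_judgements):
    """Average a list of triangular fuzzy numbers component-wise."""
    n = len(fuzzy_judgements)
    return tuple(sum(t[i] for t in fuzzy_judgements) / n for i in range(3))

def defuzzify(tri):
    """Centroid of a triangular fuzzy number: (l + m + u) / 3."""
    return sum(tri) / 3

# Three experts judge the importance of one quality characteristic.
judgements = [(0.5, 0.7, 0.9), (0.4, 0.6, 0.8), (0.6, 0.8, 1.0)]
fuzzy_weight = defuzzify(aggregate(judgements))
crisp_weight = sum(m for _, m, _ in judgements) / 3
print(round(fuzzy_weight, 3), round(crisp_weight, 3))  # 0.7 0.7
```

With symmetric triangles the two results coincide exactly, which mirrors the close correlation observed in the thesis.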
Christos Zalidis
Augmenting perception for unmanned ground vehicles for efficient exploration and navigation in rough terrains
Robotics is currently one of the most rapidly evolving scientific areas, where significant advances in the development of autonomous robotic agents have enabled the execution of complex tasks that were not possible before. A fundamental capability robotic agents need to exhibit, in order to interact with their surrounding environment, is perception. This thesis tackles the problem of modeling and representing an environment which is not known in advance and consists of uneven surfaces and rough terrain. Specifically, we are interested in unmanned ground vehicles that use the representation of the environment for efficient navigation and exploration. We developed a unified system that performs robot localization in three-dimensional space, uses elevation maps to represent the environment, extracts traversability features from that representation and finally performs autonomous and safe navigation. Full three-dimensional robot localization is achieved through the combination of various state estimation algorithms and raw sensor data, using an extended Kalman filter. Building upon an onboard range measurement sensor and an existing robot pose estimate, we formulate a novel elevation mapping method from a robot-centric perspective. This formulation can explicitly handle the drift of the robot pose estimate that occurs for many autonomous robots. Additionally, we extract terrain features useful for navigation and path planning, performing traversability analysis based on the elevation-map representation, which leads to a new representation of the environment: traversability maps. Moreover, we extend the classic elevation-map representation, adding the ability to model environments that contain multiple overlapping structures, such as bridges, underpasses and buildings. This new representation leads to the development of multi-level elevation-surface maps.
The proposed architecture is based upon the navigation module of ROS (Robot Operating System), one of the most popular navigation frameworks in the robotics community. The representation used by this module is not sufficient for navigating in environments that contain uneven surfaces or generally rough terrain. Therefore, using the above-mentioned representations, we extend the ROS navigation system, adding new capabilities and enabling navigation in such environments. Finally, extensive experiments examine and evaluate the proposed method's performance in diverse environments.
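The per-cell fusion idea behind elevation mapping can be sketched as follows: each grid cell keeps a height estimate and its variance, and new range measurements are fused with a one-dimensional Kalman update. This is an illustrative simplification (grid size, resolution and variances are made up), not the thesis implementation.

```python
# Minimal elevation-map sketch: each cell stores (height mean, height
# variance); a new measurement is fused via a scalar Kalman update.

class ElevationMap:
    def __init__(self, size, resolution):
        self.resolution = resolution
        # Each cell: (height mean, height variance); None = unobserved.
        self.cells = [[None] * size for _ in range(size)]

    def update(self, x, y, height, variance):
        """Fuse a height measurement into the cell covering (x, y)."""
        i, j = int(x / self.resolution), int(y / self.resolution)
        cell = self.cells[i][j]
        if cell is None:
            self.cells[i][j] = (height, variance)
            return
        h, v = cell
        k = v / (v + variance)  # Kalman gain
        self.cells[i][j] = (h + k * (height - h), (1 - k) * v)

    def height_at(self, x, y):
        i, j = int(x / self.resolution), int(y / self.resolution)
        cell = self.cells[i][j]
        return None if cell is None else cell[0]


m = ElevationMap(size=10, resolution=0.5)
m.update(1.0, 1.0, height=0.20, variance=0.04)
m.update(1.0, 1.0, height=0.30, variance=0.04)  # equal variances -> average
print(m.height_at(1.0, 1.0))  # 0.25
```

Traversability analysis would then operate on such a grid, e.g. thresholding the height difference between neighboring cells.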


Vasileios Lolis
Design and implementation of an Android application for the sentimental and categorical analysis of user web browsing contents
Mobile devices have become part of our lives and offer us the opportunity to access the Internet anywhere. We browse the internet more and more, whenever and wherever. Companies such as Google gather information on us and, with proper processing, can create a complete profile of who we really are and what we like to do. When a user browses the web through an Android device, Google collects the history of the URLs he/she visits. These data are usually stored locally as well as on the Google Cloud. Within the context of this diploma thesis a review of the REST architectural style is initially performed, as well as a review of current RESTful web services that offer sentiment and category analysis of text. The thesis introduces C.H.A.T. (Chrome History Analysis Tool), a mobile Android app that extracts the browsing history data from Google Chrome on a user's mobile device, employs a web service for the sentiment and category analysis and stores the results in a remote database, while also ensuring user anonymity and the safety of personal information. C.H.A.T. provides the user with diagrams, correlations, statistics and pictures in a user-friendly manner, also enabling him/her to choose specific time periods for the analysis. Conclusions and future work are discussed at the end of the thesis.
Alexandra Baltzi
Applying Test-Driven Development and Code Transformation Techniques to improve Code Reuse
Undoubtedly, the digital era has contributed to the easier transmission of information through the widespread adoption of the internet. The proper exploitation of this information is a difficult challenge. An interesting form of information is the source code provided in open source software repositories. As a result, a new objective is the exploitation of existing software by developers, or of the knowledge resulting from it, in order to create new software. So, alongside conventional search engines, Code Search Engines (CSEs) were also developed to extract code from open source software repositories. However, these are not always sufficient to address the problem, as they fail to adequately describe the developer's query and cannot guarantee that the results returned are indeed functional. Later on, more sophisticated systems were developed, namely Recommendation Systems in Software Engineering (RSSEs), and more particularly those which make use of Test-Driven Development. Although these systems support the creation of complex queries by the user, they often have no way of checking the utility and functionality of the final result, while most of them are no longer operational. The inability of these systems to address the problem has led us to develop our own RSSE, which uses a dynamically renewable repository for code search. Initially our system extracts the query from the developer's code, then searches in the CSE AGORA and applies a mining model on the retrieved results. Furthermore, our system applies transformations on these results so that they match the original query, therefore providing more useful and functional results in comparison with other test-driven RSSEs. Finally, our system provides the user with information about each result regarding its relevance to the user's initial query, its complexity and its functionality.
The comparison of our system with other known RSSEs proves that our results are satisfactory in terms of quality and accuracy, and that it remains efficient as far as response time is concerned. Additionally, the code transformations performed by our system further improve the results.
Michail Papamichail
Design and development of a source code quality estimation system using static analysis metrics and machine learning techniques
The most representative description of today's age in one phrase would be "the information age". In contrast to previous ages, when access to information was extremely difficult, time-consuming or in many cases even impossible, today, due to the evolution of technology and the spread of the internet, information is just a few clicks away. This fact is present in every aspect of our everyday life and of course the software development process could not have been left behind. The exploitation of numerous open source software projects, both from software repositories and through search engines or specialized code retrieval systems, facilitates the process of software development. However, code reuse is beneficial as long as the reused components meet the requirements of the developer's project and fulfill certain quality standards. Searching in software repositories, one may notice that hundreds if not thousands of results are retrieved for a query. This vastness of retrieved code raises the question: "How can one choose the highest-quality source code results to reuse?" The contribution of this diploma thesis lies in answering the above question in a reliable manner by proposing a source code quality estimation mechanism based on static analysis metrics. For this purpose, a system has been designed with the primary goal of estimating source code quality from static analysis metrics. To this end, the system uses two models, a one-class SVM classifier and a neural network model. The former is used to determine whether the examined files meet a fundamental quality threshold, while the latter provides a score for the quality of the files. These two models were trained using 24,930 Java source code files included in the 100 most popular GitHub repositories. Finally, upon successfully evaluating our system for ranking new files, we conclude that it can be a valuable asset for the developer.
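The two-stage idea above can be illustrated with a simplified stand-in: a metric-range filter playing the role of the one-class SVM (reject files whose metrics fall outside ranges learned from "known good" code) and a weighted combination playing the role of the neural network score. The metric names, ranges and weights below are hypothetical.

```python
# Simplified stand-in for the two-stage quality pipeline described
# above; stage 1 filters out atypical files, stage 2 scores the rest.
# Metric names, accepted ranges and weights are illustrative only.

ACCEPT_RANGES = {
    "comment_ratio":  (0.05, 0.60),
    "avg_complexity": (1.0, 10.0),
    "avg_method_loc": (3.0, 40.0),
}
WEIGHTS = {"comment_ratio": 0.3, "avg_complexity": -0.4, "avg_method_loc": -0.3}

def passes_filter(metrics):
    """Stage 1: keep only files whose metrics lie in the accepted ranges."""
    return all(lo <= metrics[m] <= hi for m, (lo, hi) in ACCEPT_RANGES.items())

def quality_score(metrics):
    """Stage 2: map metrics to a single score (higher is better)."""
    score = 0.0
    for m, (lo, hi) in ACCEPT_RANGES.items():
        norm = (metrics[m] - lo) / (hi - lo)  # normalize into [0, 1]
        w = WEIGHTS[m]
        # Positive weights reward high values, negative ones reward low.
        score += w * norm if w > 0 else -w * (1 - norm)
    return score

f = {"comment_ratio": 0.25, "avg_complexity": 4.0, "avg_method_loc": 12.0}
if passes_filter(f):
    print(round(quality_score(f), 3))  # prints 0.603
```

Ranking candidate files by such a score is the same usage pattern the thesis describes, with the learned models replaced by fixed rules here.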
Konstantinos Papangelou
Personalizing web search results by incorporating user behavior and semantic data
The most important step towards understanding and satisfying web users' needs is the analysis of their behavior and the use of the data they provide in order to implement personalized services. Various pieces of information, such as the gender, age and location of users as well as the webpages they visit, can be used by a plethora of web applications in order to identify users' interests and provide them with better services. These kinds of applications are already part of the web, with the most prominent example being the personalized search offered by some commercial search engines. The main goal of this diploma thesis is to present a complete method that identifies users' interests based on their browsing history. For this purpose we have implemented two systems. The first system creates profiles relevant to various domains, while the second assigns these profiles to users based on the content of the webpages they visit. In particular, for the first system, we collect webpages relevant to some subject (e.g. a music genre) using the search API of a commercial search engine and we perform thematic analysis of them using Latent Dirichlet Allocation (LDA). We use the results of LDA to find the most dominant topics and, for each one of them, the most probable words. We use these words to form a vocabulary relevant to the corresponding subject. We are also interested in forming profiles that describe the user's level of expertise in each subject. For the second system, we extract the user's browsing history and for each webpage-profile pair we calculate a score based on the number of matching words. To further improve our scoring system we include a measure that captures the semantic similarity between webpages and profiles. Finally, for every webpage we find the profile that has the maximum score, and the set of the resulting profiles is assigned to the user. Within the context of the thesis we present relevant applications and describe the implemented systems.
We also present results of the first system in two popular domains, music and sports, as well as an example of a user's browsing history analysis. The results are promising and allow us to draw some conclusions.
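The profile-assignment step can be sketched as follows, assuming the per-domain vocabularies have already been extracted from the LDA topics; the vocabularies and page text below are toy examples, not the thesis's data.

```python
# Sketch of vocabulary-based profile matching: each profile is a set of
# words (standing in for the top LDA topic words per domain), and a page
# is assigned the profile with the most matching words.
import re

PROFILES = {
    "music":  {"album", "guitar", "band", "concert", "lyrics", "chord"},
    "sports": {"match", "league", "goal", "coach", "season", "score"},
}

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def best_profile(page_text):
    """Score each profile by its number of matching words on the page."""
    words = tokenize(page_text)
    scores = {p: sum(w in vocab for w in words) for p, vocab in PROFILES.items()}
    return max(scores, key=scores.get), scores

page = "The band released a new album; every concert sold out this season."
profile, scores = best_profile(page)
print(profile, scores)  # music {'music': 3, 'sports': 1}
```

The semantic-similarity measure mentioned in the abstract would refine these raw counts, e.g. by also crediting words related to, but not in, the vocabulary.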
Antonis Noutsos
DPDHMMY: A Design Pattern Detection tool
Currently, designing and developing software has grown to be a demanding task. The ultimate goal of the software development process includes not only satisfying the desired functionality of a software product, but also various, sometimes critical, non-functional requirements. Extensibility, robustness, usability, maintainability, portability, testability and reusability are concepts that developers have to take into account during the development of their projects. A methodology that ensures properly structured and good quality code is the application of code design patterns, which are acknowledged for their added value. In this context, design pattern detection can improve the understanding of code architectures, while it can also help in applying patterns to existing code, while ensuring the satisfaction of non-functional criteria. This thesis proposes a new way of representing well-known, but also custom, design patterns. The methodology builds upon the connections among classes inside a project, in order to match them to predefined design pattern structures. Following the above methodology, we designed and developed DPDHMMY (pronounced dee-pee-dee-mee), a user-friendly application for design pattern detection. One of the main advantages of DPDHMMY is the capability of detecting patterns even in non-compilable code. Using DPDHMMY, design patterns can be detected in incomplete code or code with errors, in order for the programmer to fix and improve it. Thus, the tool can be used when structuring the source code of an application (e.g. during the definition of interfaces) in order to determine whether proper design patterns are taken into consideration. Furthermore, it provides users with the option to define their own design patterns, thus promoting high extensibility.
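The idea of matching class connections to predefined pattern structures can be sketched as a small structural-matching problem: class relations are recorded as typed edges, and a pattern template is matched by trying assignments of project classes to pattern roles. The relations, class names and template below are illustrative, not DPDHMMY's actual representation.

```python
# Toy sketch of structural design-pattern matching over class relations.
from itertools import permutations

# Project relations: (source class, relation type, target class).
project = {
    ("LoggerImpl", "implements", "Logger"),
    ("App", "uses", "Logger"),
    ("FileWriter", "implements", "Writer"),
}

# Template for a simple "program to an interface" structure:
# role C implements role I, and role U uses role I.
template = [("C", "implements", "I"), ("U", "uses", "I")]

def match(template, project):
    """Yield role->class bindings under which every template edge exists."""
    roles = sorted({r for edge in template for r in (edge[0], edge[2])})
    classes = sorted({c for e in project for c in (e[0], e[2])})
    for combo in permutations(classes, len(roles)):
        binding = dict(zip(roles, combo))
        if all((binding[s], rel, binding[t]) in project
               for s, rel, t in template):
            yield binding

print(list(match(template, project)))
# [{'C': 'LoggerImpl', 'I': 'Logger', 'U': 'App'}]
```

Because the matching needs only the relation edges, it works even when the code does not compile, which is the property the abstract highlights.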
Ioannis Antoniadis
Interactive Question Answering using Topic Models
Bridging the gap between humans and machines in the scope of information retrieval has always been a challenging task. Search Engine Optimization (SEO) has made a lot of progress to that end, but the gap still seems wide. Search engines are incapable of capturing the content semantics of either the information resources or the user's query. Question Answering (QA) systems were proposed a couple of decades ago in order to cope with this challenge and a lot of knowledge has come to light since then. QA systems attempt to capture the semantics of a user's question and provide a specific, suitable answer. Many different Natural Language Processing (NLP) techniques, such as linguistic and probabilistic techniques, have been incorporated into Question Answering with success. The main focus of this thesis is the proposal of a Question Answering mechanism that aims at providing improved answers to user queries. The proposed mechanism incorporates content semantic analysis and probabilistic topic modelling techniques to capture the latent thematic structure of the document collection from which the answer is derived. The evaluation process includes a comparison of the proposed topic-based ranking mechanism with a standard search engine ranking mechanism and demonstrates its validity.
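Topic-based ranking of this kind can be sketched by representing the question and each document as topic-proportion vectors and ranking by cosine similarity in topic space. The vectors below are toy values; in the thesis they would come from the fitted topic model.

```python
# Sketch of topic-space ranking: documents and the question are each a
# vector of topic proportions (toy values); rank by cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "doc_a": [0.7, 0.2, 0.1],  # mostly topic 0
    "doc_b": [0.1, 0.8, 0.1],  # mostly topic 1
    "doc_c": [0.3, 0.3, 0.4],
}
question = [0.6, 0.3, 0.1]

ranking = sorted(docs, key=lambda d: cosine(docs[d], question), reverse=True)
print(ranking)  # ['doc_a', 'doc_c', 'doc_b']
```

A standard keyword-based ranker would ignore this latent structure, which is the contrast the thesis's evaluation draws.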
Emmanouil Krasanakis
Automatic Code Generation
This diploma thesis aims to bridge the gap between logical model generation and second-order logic. In particular, it develops methods to manipulate programmatically equivalent logical models. After developing the necessary mathematical tools, we then delve deeper into that area, attempting to replace parts of a given model with ones from a set I without losing programmatic equivalence. This process is given in the form of an algorithm that tries to minimize a certain quantity. If the set I contains only programmatically implementable models, we effectively approximate the implementation of the given model. Afterwards we develop methods for comparing (and thus interpreting) loosely-defined model descriptions, as well as importing already existing Python libraries to generate the mapping between comments and their implementation. The end result is automatically generated code for a given problem, produced by replacing similar comments with their implementations. Finally, after discussing areas for future development, we present a fully developed environment that implements all the developed algorithms.
Alaoui Tzamali Zakia
Genome Data Analysis by Computational Intelligence Methods and Applications in R
Advances in gene profiling technologies, heralded by the completion of the human genome, have revolutionized the field of molecular biology by producing large amounts of genetic data that require powerful bioinformatics tools for a meaningful interpretation of the genetic abnormalities that occur in a specific disease state. In recent years, a widely used technique for gene profiling is Affymetrix microarray technology, which enables the study of the expression of thousands of genes simultaneously in a single experiment, creating a huge set of data. In this context, the laboratory of our collaborator Pr. Moulay Jamali (Faculty of Medicine of McGill University, Canada) has used this technology to investigate genes that are selectively regulated by the cooperation between overexpression of an oncogenic receptor called ErbB2 and a tumor suppressor gene called p53. This was achieved by overexpression of ErbB2 in colorectal cancer cells deficient or proficient for p53. Genomic data were generated using the Affymetrix method. My thesis work, which focuses on analyzing this novel gene expression data using clustering methods, has revealed novel biological knowledge relative to gene regulation in these cell models. In particular, the clustering algorithms presented are K-means, SOM (Self-Organizing Map) and finally SOTA (Self-Organizing Tree Algorithm), an algorithm that manages to automatically determine the optimal number of clusters. From the results of the clustering, differentially expressed genes were identified in each comparison. These candidate genes have great potential for understanding key mechanisms and functions that may contribute to disease development and progression, in relation to the cooperation between ErbB2 and p53.
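The K-means step of such an analysis can be sketched in a few lines: each gene is a vector of expression values across conditions, points are assigned to the nearest centroid, and centroids move to their cluster means. The expression values below are toy data, not the microarray measurements used in the thesis.

```python
# Minimal K-means sketch on toy gene-expression profiles (each row is a
# gene's expression across three conditions).
import math

def kmeans(points, centroids, iterations=10):
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            d = [math.dist(p, c) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to its cluster's mean.
        centroids = [
            [sum(x) / len(cl) for x in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return clusters

genes = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.2],   # low expression
         [2.0, 2.1, 1.9], [2.2, 1.9, 2.0]]   # high expression
low, high = kmeans(genes, centroids=[genes[0], genes[2]])
print(len(low), len(high))  # 2 2
```

Unlike K-means, which needs the cluster count up front, SOTA (as the abstract notes) determines the number of clusters automatically.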


Georgios Voulgarakis
Simulator software, capable of simulating various aspects of the Beam Position Monitors
The topic of this thesis is the development of simulator software capable of simulating various aspects of Beam Position Monitors (BPMs). The software is intended to be used for educational purposes in the CERN Accelerator Schools. It is capable of simulating both Beam Position Monitors and the signal processing electronics which follow, thus allowing users to define and simulate their own BPM processing circuitry. The simulator has been developed in MATLAB, due to the ease of coding, the ability to easily make changes (general advantages of interpreted languages), and the speed of performing array operations.
Nikolaos Katirtzis
Mining Software Repositories for Test-Driven Reuse
The digital age, fairly characterized as the information age, has brought about significant changes in everyday life. The widespread adoption of the internet has facilitated information sharing and now the question that arises is how we can exploit it. Open source software repositories provide an interesting kind of information waiting to be exploited. This kind of information could specifically be of use to developers in order to support software reuse. Since traditional search engines cannot solve this task, more specialized search engines, namely Code Search Engines (CSEs), have emerged. However, they also fail to address the problem, as it is not possible for them to adequately describe the user's query (due to its complex structure) and, moreover, they cannot guarantee that the results are indeed functional. A more recent approach to the problem is the so-called Recommendation Systems for Software Engineering (RSSEs), and particularly those that make use of Test-Driven Development (TDD). Such systems allow a better description of developer queries, while they sometimes also check the functionality of the results. However, most of them do not make use of dynamic software repositories, their results are of poor quality and their response time is not satisfactory. The failure of existing systems to address the problem led us to the development of our own RSSE, which allows code searching in growing repositories. The CSE AGORA, or a subsystem that uses the CSE Searchcode to enable code searching in GitHub, can be used to search for available code. Our system obtains the user's query by extracting it from the user's code and uses the Vector Space Model (VSM) to compare the query with the results. Various techniques from the areas of Information Retrieval (IR) and Natural Language Processing (NLP) have been employed in order to make this comparison as effective as possible.
The user is informed about the relevance of the results, their complexity and their functionality. The comparison between our system and some popular CSEs shows that its results are satisfactory in terms of quality and accuracy, while the integration of the AGORA and GitHub CSEs is considered successful. Also, the comparison between our system and some popular RSSE systems proves once more its effectiveness in terms of the quality of the results, as well as its stability in response time.
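The Vector Space Model comparison mentioned above can be sketched as follows: the query and each candidate result are turned into TF-IDF vectors over a shared vocabulary and compared with cosine similarity. The token strings below are toy stand-ins for identifiers mined from code.

```python
# Minimal VSM sketch: TF-IDF vectors plus cosine similarity.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build one TF-IDF vector per document over the shared vocabulary."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for t in tokenized for w in t})
    df = {w: sum(w in t for t in tokenized) for w in vocab}
    idf = {w: math.log(len(docs) / df[w]) + 1 for w in vocab}
    return [[Counter(t)[w] * idf[w] for w in vocab] for t in tokenized]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

query = "stack push pop peek"
results = ["stack push pop size", "queue enqueue dequeue peek"]
qv, *rvs = tfidf_vectors([query] + results)
scores = [cosine(qv, rv) for rv in rvs]
print(scores[0] > scores[1])  # True: first result matches better
```

The IR/NLP techniques the abstract mentions (stemming, identifier splitting, etc.) would plug into the tokenization step.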
Chariton Karamitas
With the advances in computer technology and telecommunications, computer systems, which were once machines designed to carry out simple mathematical operations, have become composite multiuser systems able to execute several multithreaded scientific applications in parallel. This advancement also resulted in a simultaneous increase in the number of threats that a computer system is exposed to, as a result of either physical or remote access to it. Such a hostile environment raises concerns about the essential problem of recognizing delinquent behaviors on a computer system and computing the set of subsystems that may have been affected by malicious actions. In an attempt to deal with this problem, government organizations and secret services from the US, France and Germany published a series of standards regarding the design, development and evaluation of the security features of computer systems designed for use in critical state infrastructure. The aforementioned standards, among others, require the implementation of advanced mechanisms for monitoring the actions taking place on such computer systems. This diploma thesis deals with such mechanisms, focuses on OpenBSM, evaluates its capabilities and proposes modifications that improve its functionality. Last but not least, this thesis presents Synapse, a tool that, using OpenBSM, is able to determine forms of communication between applications running in parallel on a computer system. For a system administrator who knows that a certain application has been compromised, Synapse is a powerful tool that can aid in detecting the set of further applications that may have been affected by the compromise in question.
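The idea of tracing a compromise through inter-process communication can be sketched as a graph problem: processes touching the same resource are linked, and the potentially affected set is whatever is reachable from the compromised process. The event records below are illustrative tuples, not actual OpenBSM audit records.

```python
# Toy sketch of compromise tracing over a communication graph built
# from audit-style events (process, operation, resource).
from collections import defaultdict

events = [
    ("httpd",  "write", "/tmp/shared.sock"),
    ("worker", "read",  "/tmp/shared.sock"),
    ("worker", "write", "/var/db/cache"),
    ("backup", "read",  "/var/db/cache"),
    ("cron",   "read",  "/etc/crontab"),
]

def communication_graph(events):
    """Link processes that touched the same resource."""
    by_resource = defaultdict(set)
    for proc, _, resource in events:
        by_resource[resource].add(proc)
    graph = defaultdict(set)
    for procs in by_resource.values():
        for p in procs:
            graph[p] |= procs - {p}
    return graph

def affected(graph, start):
    """Transitively collect processes reachable from a compromised one."""
    seen, stack = set(), [start]
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(graph[p])
    return seen

g = communication_graph(events)
print(sorted(affected(g, "httpd")))  # ['backup', 'httpd', 'worker']
```

Note that `cron` stays outside the affected set: it shares no resource with the compromised chain.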
Evagellos Karvounis
Optimizing the performance of a Web Crawling mechanism based on semantic content
It is common knowledge that surfing the World Wide Web has become a daily activity for almost everyone. Each day Internet users continuously interact with websites. Extracting valuable observations and understanding the real link between the user and the website has thus become an important research question. In order to capture this information, Search Engines depend upon Web Crawlers that traverse the Web by following URLs and their hyperlinks. They then process and store the website content in repositories that can later be indexed for more efficient execution of user queries. The evolution of the Web and the development of related frameworks and standards have helped make data machine-understandable, leading to Web 3.0 (a.k.a. the Semantic Web). Among other aspects, the transition to the Semantic Web dictates the development of Web Crawlers that can handle and process semantic information. Within the context of this diploma thesis we have focused on optimizing the performance of an existing Web Crawler, Apache Nutch, in order to optimally handle semantic data. The enhanced Web Crawler, SpiTag, traverses the Web focusing on the semantic content of the webpages and aims at finding a more efficient traversal path than Nutch in terms of semantic content. Experiments and results show that SpiTag indeed performs better than Nutch, acquiring more and better information.
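Crawling that prioritizes semantic content can be sketched as best-first traversal: pages expected to carry more semantic annotations are fetched first. The in-memory "web" and the annotation counts below are illustrative stand-ins for real HTTP fetching and RDFa/microdata extraction; this is not SpiTag's actual scoring.

```python
# Toy sketch of best-first crawling by semantic richness, using a
# priority queue over (negated score, url) pairs.
import heapq

WEB = {  # page -> (semantic-annotation count, outgoing links)
    "a.org": (1, ["b.org", "c.org"]),
    "b.org": (5, ["d.org"]),
    "c.org": (0, ["d.org"]),
    "d.org": (3, []),
}

def crawl(seed):
    frontier = [(0, seed)]  # min-heap on negated score = max-heap on score
    visited, order = set(), []
    while frontier:
        _, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        order.append(url)
        for link in WEB[url][1]:
            if link not in visited:
                heapq.heappush(frontier, (-WEB[link][0], link))
    return order

print(crawl("a.org"))  # ['a.org', 'b.org', 'd.org', 'c.org']
```

A breadth-first crawler would instead visit `c.org` before `d.org`; the score-driven frontier is what lets a semantic crawler reach richer pages sooner.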
Elli Kasparidou
A web-based tool for visualization and analysis of social network data
Feeding social network platforms with data is a daily routine for millions of people all over the world, and at the same time one of the major means of reproducing any kind of information. The wealth of information hidden in this massive volume of produced data is an asset to anyone who understands how to process it. The main goal is to produce structures able to semantically represent the users' network; then, the update of information becomes easier. Another issue of concern is the identification of appropriate methods to represent the respective networks in a user-friendly way. So far, a multitude of applications have been designed and developed in an effort to provide solutions to the visualization problem of networks (in general, but also social networks in particular), each one with specific advantages and drawbacks. This dissertation aims at developing a web-based visualization system for social networks, capable of updating its data automatically and exporting the graph structures, while allowing the end user to interact with the network. Through the developed tool the end user is able to interact with a hierarchically structured network and reform it in terms of semantic significance. Naturally, constraints related to graph size and system response have been taken into consideration during the design and development of the presented tool.
Stylianos Moschoglou
Development of a multi-agent platform of a stock market for the exploration of the causes of financial crashes
Advances in technology and increased Internet penetration have simplified access to stock markets and, especially in recent years, have increased the number of people willing to invest in them. Stock markets once consisted exclusively of stock brokers; this however has changed radically and, as a result, the investor mix has expanded demographically as well as qualitatively, including a wide spectrum of social groups varying in education, social behaviors, etc. This diversity in stock markets has led to deviations from the old classical models that economists employed in order to analyze them. As a result, a huge interest arose in finding new models to replace the old ones. One of the main targets of these new models was to represent reality in a better way by eliminating the problems generated by the complexity of the stock markets. In order for this to be achieved, there was an interdisciplinary cooperation with other scientific fields apart from the field of behavioral finance, such as the fields of software engineering and applied mathematics. The outcome of this interdisciplinary collaboration was that stock market models, mainly from the late 90's, were developed as multi-agent platforms. These multi-agent platforms, specially designed software to simulate a wide range of emergent social and scientific phenomena, contributed profoundly to the optimization of stock market modeling. Researchers, with the assistance of the multi-agent platforms and based on the theory of behavioral finance, were able to construct models that would embed and simulate different social groups with a wide range of behaviors and strategies. Within the context of the current diploma thesis we present a stock market modeling and simulation multi-agent platform, which is based on the original SimStockExchange model.
Our platform includes a plethora of different behaviors and possesses all the necessary mechanisms that validate the outputs of our modified model against those of the original one. We focus our attention on the conditions which, when fulfilled, could trigger a financial crash in the stock markets. More precisely, we study different types of behaviors among investors, so as to figure out which specific types could precipitate a financial crash. Finally, we present a mechanism with which wealthy and experienced investors in the stock market could impose a financial crash. Results generated from simulation correspond quite well with the ones from real stock markets. There are a lot of potential extensions of the model and some of them are mentioned in the penultimate chapter.
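The core agent-based mechanism of such platforms can be sketched as follows: each agent submits a buy or sell decision from its behavior rule, and the price moves with the net demand. The three behavior rules and all parameters below are illustrative, not the thesis's or SimStockExchange's actual rules.

```python
# Toy sketch of an agent-based price mechanism with three behavior
# types; price moves proportionally to net demand each tick.
import random

random.seed(42)  # deterministic run for reproducibility

def fundamentalist(price, value=100.0):
    return 1 if price < value else -1          # buy under fair value

def chartist(price, last_price):
    return 1 if price > last_price else -1     # follow the trend

def noise_trader(price):
    return random.choice([1, -1])              # random buy/sell

def simulate(steps=50, price=95.0, impact=0.2):
    last, history = price, [price]
    for _ in range(steps):
        demand = (fundamentalist(price) + chartist(price, last)
                  + noise_trader(price))
        last, price = price, price + impact * demand
        history.append(price)
    return history

history = simulate()
print(len(history), round(history[-1], 2))
```

Crash studies like the one in the thesis then vary the mix of behaviors (e.g. the share of trend-followers) and observe when the price dynamics collapse.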
Michail Karasavvas
Fault Detection on Sensor Systems Based on Adaptive Outliers Detection Techniques
Anomaly detection is a very important aspect of contemporary systems, since an increasing number of human tasks, from simple everyday activities to complex business and industry workflows, are becoming more and more automated. In this context, the detection of anomalies in the behavior of systems is an intriguing research topic, where researchers are striving to reduce system "failures" to a minimum, thus requiring as little human intervention as possible. Through proper design, a system should operate with the highest possible accuracy. Within the context of the current thesis, the data generated by real-life operating air-conditioning units are analyzed and explored. The data comprise mainly measurements from sensor-based systems. The conclusions derived during the analysis are then used for the construction of machine learning models. The aim of the machine learning models is the prediction and detection of the units' faulty behaviors. The methodology and the techniques employed here could be applied to any sensor system for fault detection analysis.
Stylianos Tsampas
Vulnerable System and Wargame
The increasing dependence on technology has pushed humanity to invest more and more resources in the field of Computer Security. This field evolves rapidly, with new technologies, defensive or offensive, being introduced at a very high frequency. This constant flow of new technology means that the field is in an ever-changing and unstable state: systems thought to be adequately secured may become insecure in an instant. One of the most interesting approaches in Computer Security is Intrusion Detection Systems (IDS). Their purpose is the analysis of incoming inputs to a computer system or network and their subsequent classification as malicious or safe. IDS are, at least theoretically, very powerful defense mechanisms, since they can detect attacks regardless of their type. An Intrusion Detection System’s detection scheme may be based either on heuristics or on training over a representative set of data (dataset). In the second case, the dataset must resemble real-life network traffic found in the wild and, in the case of re-training, a new one should be easy to produce. An environment which simulates realistic attacking conditions eases the collection of attack data and, as a result, the training of an IDS. However, related efforts have so far been hindered by the lack of virtualization technology. The current Diploma Thesis focuses on the creation of a simple, easy-to-use, easy-to-install and vulnerable system/framework for the simulation of realistic attacking conditions. At the same time, it focuses on implementing a set of realistic attacks exploiting the vulnerabilities present. The result is a complete system, capable of simulating real-life networks, which combines state-of-the-art vulnerabilities and attacks with ease of expansion and modification. Finally, it can also be used for educational purposes.
Vasiliki Gerokosta
Empirical Validation of the Efficiency of Change Metrics and Static Code Attributes in Software Projects for Defect Prediction
Defect prediction is an important issue in Software Engineering, and thus it has generated widespread interest for a considerable period of time. Defects become increasingly expensive to fix as software progresses through its life-cycle. Quality assurance via rigorous testing before releasing the product is crucial to keep such costs low. Nevertheless, time and manpower are finite resources, so it makes sense to assign personnel and/or resources to the areas of a software system with a higher probable quantity of bugs. Several defect prediction models have been developed by researchers in order to reliably identify defect-prone entities. Machine learning lends itself to this problem, since its purpose is to develop algorithms capable of improving their own performance by exploiting existing data, stored in huge databases, in order to discover knowledge and interpret phenomena. The aim in this case is to create a defect prediction model using different software metrics, which shall be able to predict the presence or absence of bugs for each part of the software. In this diploma thesis, we utilized the predictive power of change metrics, i.e. metrics that reflect the changes in the software’s source code, originating from the bug repositories and the version database of Eclipse. We implement the classification models of Logistic Regression, the Naïve Bayes Classifier and Decision Trees on the dataset and evaluate their efficiency. In an attempt to improve their performance we apply Ensemble Learning, specifically boosting, through the implementation of the AdaBoost algorithm. Our results illustrate how these metrics can be useful in predicting bugs, as long as they are utilized correctly by an appropriate algorithm.
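The thesis text does not include code, but the boosting step described above can be sketched in a few lines. The following is a minimal, self-contained AdaBoost with decision stumps as weak learners; the change-metric features and the tiny dataset are purely hypothetical illustrations, not the thesis' actual Eclipse data:

```python
import math

# Hypothetical change metrics per file:
# [revisions, authors, lines_added, lines_deleted]
X = [[3, 1, 10, 2], [25, 6, 400, 120], [1, 1, 5, 0], [18, 4, 250, 90]]
y = [-1, 1, -1, 1]  # +1 = file turned out to be defective

def stump(x, feat, thresh, polarity):
    # Weak learner: a one-level decision tree on a single metric.
    return polarity if x[feat] > thresh else -polarity

def train_adaboost(X, y, rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best, best_err = None, float("inf")
        for feat in range(len(X[0])):
            for thresh in {x[feat] for x in X}:
                for pol in (1, -1):
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if stump(xi, feat, thresh, pol) != yi)
                    if err < best_err:
                        best_err, best = err, (feat, thresh, pol)
        err = min(max(best_err, 1e-12), 1 - 1e-12)
        alpha = 0.5 * math.log((1 - err) / err)
        # Reweight: misclassified samples gain weight.
        w = [wi * math.exp(-alpha * yi * stump(xi, *best))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((alpha, best))
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump(x, *params) for a, params in ensemble)
    return 1 if score >= 0 else -1

model = train_adaboost(X, y)
print([predict(model, x) for x in X])  # reproduces the training labels
```

The same weighted-voting scheme applies regardless of whether the weak learners are stumps or the full classifiers (Logistic Regression, Naïve Bayes, Decision Trees) evaluated in the thesis.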
Nina Eleutheriadou
Empirical validation of object-oriented metrics on open source software for fault prediction
The importance of open source software systems has been felt both in the software industry and in research. Numerous software projects are developed using open source tools; companies invest in open source projects and also use such software in their own work. Much research is performed on or using open source software systems, because such software is not monopolized and is free from licensing issues. Since open source software is developed in a style different from the conventional one, there arises a need to measure its quality and reliability. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about them. Faults in software systems are a major problem. Knowing the causes of possible defects, as well as identifying general software process areas that may need attention from the start of a project, could save money, time and work; this may be achieved by allocating resources to the fault-prone parts of the design and code of the software. The possibility of estimating the potential faultiness of software early could help in planning, controlling and executing software development activities. Furthermore, there are metrics available for predicting fault-prone classes, which may help software organizations in planning and performing testing activities. Fault-proneness of a software module is the probability that the module contains faults. The importance and usefulness of such metrics is therefore understandable, but their empirical validation is always a great challenge. In this diploma thesis, we study which prediction algorithm is more effective for detecting faults in classes of open source software. Our ultimate goal is the evaluation of several prediction algorithms, which were implemented and investigated within the objectives of this diploma thesis. Specifically, these algorithms are Logistic and Linear Regression, Decision Trees and AdaBoost.
Their effectiveness served as the benchmark for fulfilling the expectations of this thesis. Furthermore, our goal is the evaluation of each category of metrics (Change Metrics, Source Code Metrics, Complexity Metrics, Bug Metrics, Churn of Source Code Metrics, and Entropy of Source Code Metrics), in order to decide which one is more effective in fault prediction.


Giorgos Kordopatis-Zilos
Design and Development of a Mechanism for the Automated Geotagging of Multimedia
The problem of geotagging emerged because of the ever-increasing amount of images and videos found on the web, and it is a matter of concern among the members of the scientific community. The creation of a system that achieves this goal is the primary purpose of this diploma thesis. The system receives a set of training and test media together with their metadata and, after proper processing, becomes capable of estimating the geographical location of each query item. To this end, two theoretically grounded approaches are implemented. Initially, a language-model approach is built in order to analyze the metadata of the training set. The outcome of this analysis is the formation of distinctive vocabularies corresponding to wider geographical areas. Each query item is then assigned to one of these areas, and its final position is estimated based on the media that belong to that particular area. Furthermore, an additional method for location estimation is developed, which relies on the semantic analysis of the training set’s metadata and on the visual analysis of the media. By means of the semantic analysis, a bag-of-excluded-words (BoEW) is formed, against which the metadata of the test-set media are filtered. The location of each query item is then estimated in a manner similar to the one described for the previous model. The implementation of the above approaches yields useful observations regarding performance and sensitivity to the input data sets. With respect to the final goal, the semantic analysis of the media’s metadata appears to be effective.
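The language-model idea described above can be illustrated with a toy sketch: count tag frequencies per geographic area, then let a query's tags vote for the area whose vocabulary they best match. The training pairs, area names and the unsmoothed scoring rule below are hypothetical simplifications; the thesis' actual formulation may differ:

```python
from collections import Counter, defaultdict

# Hypothetical training data: (tags, area) pairs, where an "area" would
# normally be a latitude/longitude grid cell.
training = [
    (["eiffel", "tower", "paris"], "paris-cell"),
    (["louvre", "paris", "museum"], "paris-cell"),
    (["colosseum", "rome"], "rome-cell"),
    (["rome", "forum", "ruins"], "rome-cell"),
]

# Build one term-frequency "language model" per area.
models = defaultdict(Counter)
for tags, area in training:
    models[area].update(tags)

def estimate_area(query_tags):
    # Score each area by summed relative term frequencies: a crude,
    # unsmoothed stand-in for a probabilistic language model.
    def score(area):
        total = sum(models[area].values())
        return sum(models[area][t] / total for t in query_tags)
    return max(models, key=score)

print(estimate_area(["paris", "tower"]))  # -> paris-cell
```

A final coordinate would then be estimated from the training media inside the winning area, as the abstract describes.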
Spyridon Skoumpakis
Workflow Extraction from UML Activity Diagrams
In the context of Software Engineering every system has two aspects, a static and a dynamic one. When investigating the dynamic part of a system, engineers seek to automate the extraction of its workflow in a form that is both human- and machine-processable, as the workflow essentially constitutes all the available information about a system from the point of view of its dynamic aspect. One problem often faced by Software Engineers throughout the development of a new software project is that they cannot easily find (online) the workflow of similar types of systems. All they can find are .jpg images of UML Activity Diagrams from corresponding software projects, which graphically represent a system's workflow. As part of this diploma thesis a potential solution to the above problem is developed: a program named UADxTractor. This software-based tool receives images of Activity Diagrams as input and subjects them to several levels of processing. The first level aims at identifying the main entities of each diagram and the relationships between them. The second level involves the detection and storage of the text included in each diagram, as well as the detection of the direction of the workflow between its entities. At the final level, the proposed system, having already obtained all the information graphically included in the input image, stores it in a semantically aware structure (an ontology) named Workflow_RDF. To verify the proper operation of UADxTractor several experiments were performed, the results of which are presented along with the necessary conclusions at the end of this thesis.
Anastasia Herodotou
Applying Machine Learning Techniques on Software Systems for Fault Diagnosis and Anomaly Detection
Anomaly detection is a very important aspect of contemporary systems, since all areas of human activity, from simple everyday activities to the most complex business and industry workflows, are becoming more and more automated. In this context, the detection of anomalies in the behavior of systems is an intriguing research topic, where researchers are striving to reduce system “failures” to a minimum, thus requiring as little human intervention as possible. Through proper design, a system should operate with the maximum possible accuracy, whereas humans are prone to inadvertent errors. Within the context of this dissertation we employ machine learning techniques, and specifically classification, in order to assess the ability to detect deviations in behavior on a system built on ROS, a popular middleware framework for robotics. The system implements the “dining philosophers” problem, well known from operating systems, which inherently involves concurrency and shared resources. Initially, an in-depth discussion of anomaly detection methods is performed, followed by a thorough examination of SVM (Support Vector Machine) and GMM (Gaussian Mixture Model) classifiers. Next, the developed “dining philosophers” system is analyzed, as well as the methodology developed for detecting abnormalities in the behavior of the philosophers. Finally, the results of the experiments that were performed are presented, followed by a commentary on the performance of the two classifiers.
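The density-based side of this approach can be sketched very simply: fit a distribution to observations of normal behavior and flag points whose likelihood falls below a threshold. The thesis uses full GMMs and SVMs; the one-component Gaussian below, with made-up sensor readings, only illustrates the thresholding idea:

```python
import math

# Hypothetical observations of one feature during normal operation.
normal_data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]

mu = sum(normal_data) / len(normal_data)
var = sum((x - mu) ** 2 for x in normal_data) / len(normal_data)

def log_likelihood(x):
    # Log-density of a univariate Gaussian fitted to the normal data.
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

# Threshold chosen (arbitrarily, for illustration) as the lowest
# log-likelihood seen during training.
threshold = min(log_likelihood(x) for x in normal_data)

def is_anomaly(x):
    return log_likelihood(x) < threshold

print(is_anomaly(10.0), is_anomaly(14.0))  # False True
```

A GMM generalizes this by summing several weighted Gaussian components before thresholding, which captures multi-modal normal behavior.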
Christoforos Zolotas
Leonardo Software Entity Semantic Search Engine
Throughout the evolution of Software Engineering great progress has taken place in the management of software projects, as well as in the methodology followed from the very first sketch of a new product up to its delivery. This progress has allowed the production of relatively complex software systems in due time and at fairly low cost. The aforementioned evolution is primarily the outcome of personal talent, imaginative breakthroughs and the prior experience of the researchers/engineers involved. However, many challenges still reside in the realm of Software Engineering, both during the design and the construction phase of a software project. As a result, many projects fail to serve their goals or are even abandoned prior to completion. The primary reason is the arbitrary nature of software, which is among the most visionary and abstract human inventions, accompanied by the ambiguities of human communication. In the late 2000s, OMG announced a new initiative named MDA, which aims at the full, or at least partial, elimination of the above-mentioned challenges. MDA's primary idea is to shift a remarkable portion of the software engineer's involvement from low-level code production to more abstract models, of varying levels of detail, which isolate the engineer from low-level implementation details. The core idea is the production of an initial model of the product, free of specific implementation or platform details, called the CIM, which in turn is formally transformed to a platform-independent model (PIM) and then to a platform-specific one (PSM) that allows the production of fully or partially executable software. Initially, in this diploma thesis, a thorough exploration of the core MDA ideas takes place, alongside a brief discussion of ontologies. Subsequently, the design and implementation of a software entity semantic search engine tool is presented.
The search resource of this tool is an ontology populated with existing, semantically annotated software projects, in such a way that complete UML model retrieval is possible. The input to the semantic search engine is a functional requirement, which triggers the quest for a UML model annotated in the ontology as one that satisfies it. This quest is carried out through a series of SPARQL queries, each of which retrieves a small part of the needed UML model. Depending on user input, the semantic search engine implemented as a product of this diploma thesis is capable of retrieving function, class or package UML models, with the principal goal of being as complete as possible while abiding by Software Engineering rules. The output of the tool is an XMI file containing the desired models, which is compatible with the popular open source tool ArgoUML. As an extra feature, it is possible to generate “structural” code which could serve as the basis for the implementation of the retrieved model.
Rafaila Grigoriou
PYTHIA: a Question Answering System for Assisting Software Engineers in Searching Software Projects
Software Engineering has progressed at an accelerating pace during the last decades. Software Engineers trying to make decisions related to their systems’ design and development, such as the optimal set of functional requirements and the proper system design and code structure, have to deal with huge amounts of information which is usually not easy to access. Were this information stored and annotated, defining the proper software architecture could become much easier and could result in much more efficient software. In this context, developers would be able to consult other Software Engineers’ solutions to similar projects and could reuse them as off-the-shelf components, or adjust them to their own needs. Within the context of this diploma thesis PYTHIA (Programmer’s dYnamic Thematic Interactive Advisor) has been designed and developed. PYTHIA is a driver-tool that can provide guidance through the requirements elicitation and class modeling of a software project. Information related to already implemented software projects is stored in two ontologies, offering engineers the ability to access previous design paradigms, reuse them, or even evolve them. RequirementsOnt contains information related to the user requirements phase; UMLOnt stores information related to the system specification phase. PYTHIA employs both ontologies and uses natural language processing techniques in order to assist software developers in defining the proper queries to be sent to the software search engine. It is a web application which provides users with the opportunity to query the ontology either in natural language or by compiling advanced queries through the corresponding view of the interface. In both cases, thanks to the use of external dictionaries, the system is able to deal with term disambiguation.
PYTHIA may be considered a platform that promotes reusability of software elements (requirements, classes, components), assisting software engineers in avoiding “reinventing the wheel” and thus making their work more efficient and effective.
Ioannis Goutas
A Multi-Agent Simulation Framework for the Societal and Behavioral Modeling of Stock Markets
Nowadays, all types of information are available online and in real time. Data regarding politics, financial regimes and legislation are constantly changing and evolving, thus dictating the need for adaptability in order for someone to advance professionally, or even personally. Given that Complex Adaptive Systems have been widely applied to simulate and monitor societal phenomena, these parameters have to be taken into account. Complex adaptive systems comprise software agents (Complex Multi-Agent Systems) that simulate the desired societal or economic activity and adapt their behavior and decision making based on environment information. Thus, in such systems, agents that are more influenced by politics, their close environment and mass media usually adjust to changes more eagerly, while others that are less prone to this type of influence show more stable behaviors. Obviously, an agent society comprising different agent types in different mixes may behave in interesting ways, which deserve further investigation. In the context of this thesis FinanceCity has been developed. It is a Complex Adaptive multi-Agent system that emphasizes the behavioral changes of different types of populations in constantly evolving environments. As in the real world, no agent in FinanceCity follows a static strategy; rather, it adapts its decision-making process based on external stimuli and the initial goal defined. As its name implies, FinanceCity focuses on the analysis of such a complex environment in the financial domain: a stock market environment is simulated, where agents buy and sell stocks based on their character, their goal and the societal snapshot. In short, an agent's behavior is defined by its Static and Dynamic characteristics, as well as its Status Portfolio. The focus of the current work is on the study of the agents' behavioral changes based on their characteristics, as well as on the study of the overall system's balance.
Sotirios Beis
Clustering Evolving Social Networks with Community Detection Techniques
Christina Mpoididou
Use of Natural Language Processing and Data Mining Techniques to Relate Software Quality Characteristics and Failure Reports
The subject of the current diploma thesis lies in the field of Data Mining, and especially Data Mining for Software Engineering data. The information provided by this kind of data creates the need to extract knowledge that can be very helpful for further study. In this thesis, we study a dataset of bug reports using techniques from the fields of Semantic Analysis and Natural Language Processing, in order to identify the sets of words that users most frequently choose when writing reports, and to combine them into groups. The aim of this word grouping is to understand, in general terms, how people describe software problems. In addition, we apply Semantic Relatedness to these groups of words in order to relate them semantically to a number of software quality attributes. The final goal of this thesis is to extract knowledge about the content of bug reports and the kinds of bugs they describe, which is very useful for the development process of a software application.
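The word-grouping step can be illustrated with a toy co-occurrence sketch: words that appear together in several bug reports are merged into the same group. The report texts and the co-occurrence threshold below are invented for illustration; the thesis' actual grouping method may well differ:

```python
from collections import Counter
from itertools import combinations

# Hypothetical bug-report texts.
reports = [
    "crash on startup null pointer",
    "null pointer crash when saving",
    "ui button misaligned on resize",
    "ui layout broken after resize",
]

# Count how often each word pair co-occurs in the same report.
cooc = Counter()
for text in reports:
    words = sorted(set(text.split()))
    cooc.update(combinations(words, 2))

# Greedy grouping: words are joined if they co-occur in >= 2 reports.
groups = []
for (a, b), count in cooc.items():
    if count < 2:
        continue
    for g in groups:
        if a in g or b in g:
            g.update((a, b))
            break
    else:
        groups.append({a, b})

print(groups)
```

On this toy input the crash-related terms and the layout-related terms end up in two separate groups, mirroring the intuition that users describe different problem kinds with distinct vocabularies.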
Alexander Adamos
Analysis and Development of an Algorithm to Detect Slow Attacks in Intrusion Detection Systems
With the continuous evolution of computer and network technology, the need for protection of information systems is ever growing. In order to detect network attacks, administrators use Intrusion Detection Systems (IDS). Nevertheless, their effectiveness is questionable and one may bypass them, given the required skills. In recent years, a new trend has been introduced in the world of hacking, known as “Slow HTTP DoS” attacks. These attacks use the HTTP protocol and manage to occupy all the connections offered by a web server through legitimate use of the underlying TCP/IP layers. The attacker does not need to send any crafted or malformed packets to the victim; he/she simply takes advantage of known vulnerabilities at the Application Layer (layer 7 of the OSI model). These attacks are known as extremely stealthy, low-bandwidth Denial-of-Service attacks. Within the context of this diploma thesis, we propose the enhancement of a popular Intrusion Detection System to detect the aforementioned attacks. At first, we examine the software tools that can perform “Slow HTTP DoS” attacks. Having understood their mode of operation, we selected the Snort IDS in order to implement our detection plugin. We developed a new Snort preprocessor, the Slow Preprocessor, which incorporates the Slow HTTP DoS Module. This module implements the Slow HTTP DoS Detection Algorithm and calculates the Attack Entropy metric, with which we estimate the likelihood of an occurring attack. Furthermore, we generate network and server statistics based on the NetFlow protocol. The latter lays the groundwork for a future implementation of a complete detection module that alerts users at the very early stages of such attacks. Our mechanism was subjected to a number of experiments based on different types of settings. The results are presented and discussed through the prism of network security.
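An entropy metric in the spirit of the Attack Entropy mentioned above can be sketched as follows (the thesis' exact formula is not given here, so this is only an illustration of the principle): during a Slowloris-style attack, the server's open connections concentrate on a few source IPs, so the Shannon entropy of the source-address distribution drops sharply.

```python
import math
from collections import Counter

def source_entropy(connections):
    # connections: list of (source_ip, state) tuples; we look only at
    # half-open connections, the ones Slow HTTP DoS tools accumulate.
    counts = Counter(src for src, state in connections if state == "half-open")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Shannon entropy in bits; max(0.0, ...) avoids a -0.0 result.
    return max(0.0, -sum((c / total) * math.log2(c / total)
                         for c in counts.values()))

normal = [(f"10.0.0.{i}", "half-open") for i in range(16)]  # many sources
attack = [("203.0.113.7", "half-open")] * 16                # one source

print(source_entropy(normal))  # 4.0 bits: uniform over 16 sources
print(source_entropy(attack))  # 0.0 bits: all from one source
```

A detector would raise an alert when the entropy falls below a tuned threshold while the count of half-open connections stays high.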


Chrisafenia Mallia-Stolidou and Asterios Vounotripidis
Software Platform for the Design and Development of 3D Role Playing Games
Evagelia Diamantidou
Creation of a Reliable Trust and Reputation Mechanism in Open Transaction Networks
Dimitra Miha and Nikolaos Chandolias
A Text Mining Mechanism Using Natural Language Processing, with 3D Visualization of the Process
Grigoris Athanasiadis
Use of Artificial Intelligence Techniques for the Analysis and Development of an Intelligent Software Agent for E-Auction Commerce
Marina-Eirini Stamatiadou
Analysis and Development of a Service-Oriented Architecture for RFID Systems
Nikolaos Tsiotskas
Virtual Project 3D: A Tool for the Graphical Presentation of Software Projects in 3D
Ioannis Papastergiou
Creation of a System to Log and Simulate Network Traffic in Order to Test Attack Detection Mechanisms with the Use of Markov Models
Anna Adamopoulou
Analysis and Development of a Multi-Criteria Trust and Reputation Algorithm in E-Shop Systems with Software Agents


Anastasia Mourka
Extraction of Software Requirements from UML Use Case Diagrams
Ioanna Kampilauka
Use of Data Mining Techniques for the Classification and Labeling of Electrical Power Consumers
Themistoklis Diamantopoulos
Analysis and Development of Auction House Algorithms for the Power TAC Competition
Konstantina Valogianni
Analysis of an Agent Architecture for Participation in Energy Stock Markets
Ioannis Stamkos
Creation of Profiles for Second Life Users Using Data Mining
Ioannis Gounaris
Creation of Real-Time Systems with Real-Time Java
Anastasia Skantza and Vasileia Tzamtzi
Creation of a Bidding Mechanism for E-Commerce Auction Systems
Themistoklis Mavridis
An LDA-based mechanism for the optimization of website ranking in search engines


Nikolaos Stasinopoulos
Algorithm for Extracting Semantic Knowledge from Software Repositories
Theano Mintsi
Algorithm for Extracting Relations between Software Requirements Using Natural Language Processing and Data Mining Techniques
Emmanouil Spanoudakis
ezHome – A Simulation and Control System for a Smart House Using Agents