Completed diploma theses, Andreas Symeonidis

2019

Elpida Falara
Issues Assignment Optimization through the Analysis of Contributions in Open Source Repositories
We are currently experiencing the spectacular results of technological progress. A series of technological innovations has led the present era to be called "the information age". With the rise of cloud computing and the Internet, open source software platforms have been boosted. They are mainly based on the philosophy of distributed version control systems; among the most popular is GitHub, a service that hosts millions of software projects and users. Collaborative development environments have changed the way the software development process is conducted; additionally, they allow continuous monitoring of software projects, which is of pivotal importance. An important part of the software development process is monitoring software projects for possible errors and improvements. Collaborative open source platforms usually have a bug repository where all bug reports are recorded. However, software engineers are usually overwhelmed by the number of reports submitted daily, each of which has to be assigned to the appropriate software engineer to deal with the bug/issue. A more automated process for assigning bugs/issues to software engineers would definitely ease the work of the software team. This diploma thesis aims to create a system that proposes the most suitable software engineer in a software development team to resolve a reported bug. The system was built by retrieving information from open source software repositories on GitHub. The assignment process was carried out by implementing models based on the similarity of the bugs solved by each software engineer in the project, as well as on each engineer's contribution at the source code level. Text analysis and classification algorithms were applied to model the above parameters. In the end, the performance of the system was evaluated both for each model individually and for their combination.
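A minimal sketch of the similarity-based assignment idea described above, assuming scikit-learn is available; the issue texts and assignee histories are hypothetical placeholders, not data from the thesis.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical history: issues already resolved by each engineer.
    resolved = {
        "alice": ["null pointer crash in parser", "parser fails on empty input"],
        "bob":   ["UI button misaligned on mobile", "dark theme colors wrong"],
    }
    new_issue = "crash when parsing an empty configuration file"

    # One TF-IDF space over all resolved issues plus the new report.
    corpus = [text for texts in resolved.values() for text in texts] + [new_issue]
    vectors = TfidfVectorizer().fit_transform(corpus)
    new_vec = vectors[-1]

    # Score each engineer by the best match among the issues they resolved.
    scores, start = {}, 0
    for engineer, texts in resolved.items():
        sims = cosine_similarity(new_vec, vectors[start:start + len(texts)])
        scores[engineer] = sims.max()
        start += len(texts)

    print(max(scores, key=scores.get))  # suggested assignee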
Aristotelis Mikropoulos
Implementation of a Source Code Quality Evaluation System for multi-language software projects
Ioannis Loias
Measuring Semantic Similarity of Software Projects Using Comments
The internet has radically changed the means and speed of information transmission. In the field of code development, it has created new prospects through software repositories, which host a large number of software projects that can be reused by developers. However, developers often plagiarize source code they have not developed, without attributing the original creator. In this work we develop a system that can discover source code similarities from a multifaceted perspective, analyzing the semantics and structure of both the source code and its related comments. The system employs a multitude of widespread source code processing techniques to compare software projects and produce similarity values for different features. For example, it supports vectorization algorithms, such as processing source code text into bag-of-words and tf-idf features, in order to apply vector comparison methods at the file and function levels. The outcome of vectorization can also be refined with Latent Semantic Analysis (LSA), which reduces the noise resulting from the use of different terms with the same semantics. Our system also implements a number of existing and novel graph-based methods to compare function call trees. Finally, all methods are able to account for source code comments when calculating similarities between software projects. We tested our system on two datasets of software projects extracted from GitHub: one comprised of software projects and their forks related to the keyword 'Pacman', and one comprised of software projects and their forks across different domains. We found that employing source code comments helps better detect whether two projects of the 'Pacman' dataset are forks of the same project, producing more descriptive evaluation results than algorithms that do not use them. Then, for the second dataset, we used the comment-aware version of our methods to help discover which software projects were forks of the same one. All algorithms were found to be approximately equal in yielding higher similarities for forks of the same projects. However, algorithms that compare function call trees could more reliably discover similar projects and yielded almost zero similarities when comparing forks of dissimilar ones.
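A rough sketch of the bag-of-words / tf-idf vectorization with LSA noise reduction mentioned above, assuming scikit-learn; the token strings are hypothetical stand-ins for tokenized source files and their comments.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical documents: tokens drawn from source code and its comments.
    files = [
        "move player check collision update score pacman game loop",
        "update score move pacman detect wall collision",
        "parse csv rows write report totals",
    ]

    tfidf = TfidfVectorizer().fit_transform(files)

    # Latent Semantic Analysis: project tf-idf vectors to a low-rank space
    # so different identifiers with similar meaning land close together.
    lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

    print(cosine_similarity(lsa))  # pairwise file-level similarity matrix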
Panagiotis Mouzenidis
Modeling and expansion of the robotic architecture R4A for automated generation of user interfaces
Model Driven Engineering (MDE) aims at solving problems by building models. These models can describe a system without taking platform-specific limitations into account. Then, by using executable model transformations, a model can generate a platform-specific model. This way, MDE aspires to solve more generic problems, which can then be applied to smaller problems by using the right transformations. Since a generic model can apply to many problems, it is cost effective and produces quality source code. In this diploma thesis an Eclipse plugin is designed and developed that generates user interfaces to control robots. The user can easily add or remove functionality and generate the code of the user interface. The generated user interfaces communicate with the robots via the API of a framework called R4A. The user interfaces are completely platform independent, as they are web based and use only HTTP calls to communicate with the robot. The proposed approach has been tested on the NAO robot, a humanoid robot developed by Aldebaran Robotics in 2008 for research, educational and entertainment purposes. The target group of the plugin is engineers who want a simple user interface to control a robot in no time, but also anyone who just wants to control a robot for entertainment.
Giorgos Tsamis
Automatic code generation for robotic applications based on structured text input
The use of robots is becoming ever more widespread, enabled by constant research in the fields of robotics and artificial intelligence. Robots, having been used in the industrial sector for decades now, are also becoming accessible to ordinary people. This creates the need for new human-robot interaction styles that are not overly complicated or difficult, so that they can be used even by users without a technical background. For this reason, new systems are being developed to improve the communication between users and robots, be it with natural language recognition, gesture recognition etc. Towards that end, a system was developed in the context of this diploma thesis that facilitates the assignment of tasks to the NAO robot based on a description of its desired actions, given by the user in a structured text format, and the automatic generation of the corresponding code. The implemented system was based on another code generation system for the same robot, in which graphical symbols were used to specify the desired actions and the way they were connected. The main topic addressed by this thesis was the temporal part of the above problem: determining the correct sequence of the actions and the transitions between them (serial or parallel execution, conditional or random transition, loop, preemption), based on the text phrases provided by the user. Determining the type of each action was not important in the scope of this work, so it was annotated by the user during text input. Based on that input, the system orders the actions, sets their connections, generates the corresponding code and shows the result to the user as a graph. Then, the generated code can be executed on the robot. The evaluation of the aforementioned system showed that it is possible to effectively describe all the different ways of linking the actions, including their combinations, with the structured text format designed for the system input. It also showed that natural language scripts can easily be converted to the proposed input format in order to produce the desired robotic application.
Dimitra Ntzioni
Automatic generation of high-level interfaces to collect robot sensor data using the R4A platform
In software engineering, the term automatic programming describes a mechanism that creates a program which, in turn, allows scientists to code at a higher level of abstraction. Nowadays, robot applications, both in business and home environments, are gaining traction, increasing the need for automatic, error-free software generation. It is well known that robots are equipped with a multitude of sensors, which play a key role in their operation and in accomplishing certain tasks. For this reason, it is often necessary to collect and control the produced data in order to build software systems. This diploma thesis aspires to take the first steps towards automating the development of ready-to-run interfaces for collecting robot sensor data. Towards this direction, MDE (Model Driven Engineering) is employed. More specifically, once an abstract model has been defined, a series of transformations takes place, resulting in a fully functional system. This way, the software development process is accelerated and software is produced with greater reliability. Within the context of this diploma thesis we have designed and implemented CoRSeDA (Collecting Robot Sensor Data Automatically), a system where the user interacts through a friendly graphical user interface and defines the features of the desired sensors of a robot. Based on the sensors and their parameters, the system automatically generates executable code, based on the R4A platform, to collect data from the specific robot. At the same time, a fully functional interface is generated providing information for the whole system. The data of the system and its sensors are displayed, along with any other information generated, in a web application created for that purpose. To test and evaluate this system, experiments were performed on the NAO robot, an autonomous, programmable humanoid robot developed by Aldebaran Robotics.
Aggelou Evaggelos
Detection and Transfer of Objects by a Robot & Design and Construction of a Power Supply Circuit
The field of robotics has been growing rapidly in recent years and is becoming more and more a part of human life. People nowadays assign small, everyday tasks to autonomous robotic agents, thus facilitating their lives. A widespread application of this kind is robotic vacuum cleaners, which help clean the house. Autonomous robotic agents can also be used to explore unknown spaces that are inaccessible due to some disaster, natural or not. Finally, by equipping such agents with a robotic arm, we can expand their capabilities and gather various items from their surroundings. The present diploma thesis proposes a solution for the operation of an autonomous robotic agent that has the capability to explore and cover (with a camera) an a priori unknown space. Also, by using a robotic arm, the robot can gather objects scattered within the environment. For the implementation, code was developed which, using data from a laser sensor, creates the map of the area where the robot is located and, through camera data, calculates the field that the camera covers. Furthermore, code was developed that, through RGB camera data, recognizes a specific type of object within the space, approaches it and then, using the robotic arm, collects and stores it until the robot returns to the point where it began the exploration, having covered the whole area in which it operates. At the same time, a circuit was developed that supplies power to the peripheral devices of the robot, namely the computer and the robotic arm attached to it, via a LiPo (Lithium-Polymer) battery. Also, using an appropriate sensor, current, voltage and power consumption data are measured and then sent to the computer on the robot. In addition, an LED light indicates the voltage level of the battery. Finally, this circuit was printed on a circuit board (PCB - Printed Circuit Board) and placed on the robot.
Christopher Bekos
Design and implementation of Model-Driven mechanism to automate graphical Web-UI generation for domain specific application modeling
This thesis describes the creation of the Simplified Web Sirius Framework (SWSF), a framework designed to overcome Sirius's inability to run independently of the Eclipse environment. The SWSF provides a graphical environment where the user can process models using the widgets defined by a GDSL. The graphical environment is automatically generated depending on the GDSL given as input and is similar to the environment provided by Sirius. SWSF has been constructed using both MDE and web technologies. MDE technologies are necessary in order to automatically generate the parts of code that describe the differences between the graphical environments created for each GDSL. In addition, the usage of web technologies provides users with the ability to create and edit models using a graphical environment equivalent to Sirius, while enjoying the benefits of a web app. The SWSF supports some of the most basic and frequently used features of Sirius. An .odesign file and an .ecore file, containing the definitions of a GDSL and the concepts of a particular field respectively, are given as input to the framework. Then, an XML parser is used to transform the information of these two files into a model that conforms to the gdslMetamodel metamodel. Next, this model is used as input to an M2T transformation, which generates two Python modules. These modules are used in conjunction with the executable form of appMetamodel and a UI (User Interface) in order to produce a graphical modeling environment exactly as described by the GDSL. The creation of a model that conforms to gdslMetamodel, as well as the M2T transformation, are the key elements of SWSF that make it capable of automatically producing different graphical environments according to a given GDSL. SWSF integrates the above-mentioned components by providing an online application based on the client-server architecture. The server holds information about the model, its graphical representations and the tools describing modifications on it. The client is a UI which displays the graphical representations to the user, using the corresponding information provided by the server. In addition, the client handles the interaction with the user and forwards the appropriate requests to the server, in order to properly update both the model and its graphical representations.
Maria Ioanna Sifaki
Applying Data Mining Techniques to Extract Evolution Patterns in Question-answering Systems
Developers still face difficulties and obstacles when reusing code snippets, mainly related to their functionality and the occurrence of errors. This is because the validity of the answers in such communities is not checked by experts but only by the members themselves. Therefore, there is an incentive to improve code snippets and to correct their errors in a timely manner. Our research has focused on identifying and then clustering the most common edits (and mostly error corrections) found in the history of answers of the most popular community, Stack Overflow. The SOTorrent dataset was the basis on which our system was implemented, as well as of its qualitative assessment. In this way, we have managed to turn each individual answer edit into a generic solution, ultimately offering useful and actionable information to developers so that similar mistakes can be avoided in the future. Finally, the visualization of the data and their clustering significantly strengthened the evaluation of this work, as the answer edit groups were highly coherent. By creating an edit recommender, we have confirmed the accuracy of the research results and, moreover, we have succeeded in improving the content of some generic edit comments by proposing other, more appropriate ones in their place.
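The clustering step could look roughly like the following, assuming scikit-learn; the edit comments are invented examples, not SOTorrent records.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Hypothetical edit comments attached to Stack Overflow answer revisions.
    edit_comments = [
        "fixed typo in variable name",
        "corrected off-by-one error in loop",
        "fixed spelling mistake",
        "added missing import statement",
        "fix loop boundary bug",
        "add import for json module",
    ]

    X = TfidfVectorizer().fit_transform(edit_comments)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

    # Each cluster groups similar edits into one generic, reusable lesson.
    for comment, label in zip(edit_comments, labels):
        print(label, comment)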
Georgia Pantalona
Identifying software engineer profiles using data extracted from version control systems
The software development process is constantly evolving, and this evolution is indicated by the ever-growing need for producing new software. The need for the fastest possible deployment of new features and the effort to integrate the customer into the software development process have led to new software development models (e.g., agile) where collaboration plays a leading role. These models require engineers to have developed different types of skills, whether hard skills, such as coding, or soft skills, such as communication or project management. Therefore, there is a growing need to recognize the skills of engineers and assess the extent to which they meet the requirements of a role in a software development team. Finally, with the emergence of agile software development methods, the use of version control systems has also increased, and such systems include a large amount of data related to the software development process. The purpose of this diploma thesis is to exploit the data from version control systems for skill recognition, with a view to more objective evaluation and better role assignment. In this direction, data from GitHub are extracted and used to create metrics that reflect the skills of engineers. Additionally, a benchmarking of the metrics is carried out in order to properly categorize the proficiency of each engineer for the different types of skills. Finally, those metrics are presented in the form of graphs in an application. This application presents the profile of each engineer in an easily understandable way to enable the best possible assignment to a role based on the engineer's skills. After evaluating the system on a considerable set of software engineers from GitHub, we can conclude that our system produces useful results. The profile overview is satisfactory as it gives insight into the role and skills of each engineer, thus contributing to their more efficient utilization within the software development team.
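A minimal sketch of deriving simple activity metrics from version control history, using plain git output rather than the GitHub API; the repository path is a hypothetical placeholder and the metric is a deliberately crude example.

    import subprocess
    from collections import Counter

    REPO = "/path/to/cloned/repo"  # hypothetical local clone of a GitHub project

    # Count commits per author as a crude proxy for coding activity.
    log = subprocess.run(
        ["git", "-C", REPO, "log", "--pretty=%ae"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    commits_per_author = Counter(log)
    for author, commits in commits_per_author.most_common():
        print(f"{author}: {commits} commits")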
Panagiotis Sakkis
Automated application of linting rules through machine learning
Over the last few years, the vast growth of the Internet has added many new possibilities and has altered the way code is written. The reuse of parts of code is an everyday phenomenon in the field of software development. Thus, writing "clean" code, that is, code that follows the principles of readability, maintainability and extensibility, is of utmost importance. A lot of research is constantly being conducted on ensuring these qualities. One of the basic tools used towards that goal is the linter. Linters are crucial nowadays, especially for languages that lack a compilation step and its associated checks, such as JavaScript and Python. Researchers throughout the community have lately aimed their work at improving those tools and making them more efficient. The most popular linter right now is ESLint, an open source tool that is fully configurable and, as a result, offers great flexibility to programmers. Our goal in this thesis is to examine how the programming community uses ESLint, extract useful conclusions based on that usage and propose improvements. We attempt to apply modern machine learning techniques in order to solve some of the problems faced when using ESLint. For our analysis we used a number of open source repositories from GitHub, from which we extracted useful data that we make available for future use by researchers who want to engage with the subject. Alongside, based on those data, we implemented some tools, using machine learning algorithms, that offer direct solutions to the problem of configuring the rules of ESLint.
Orestis Georgiadis
News popularity prediction with image/text content
The prediction of the popularity of online content (text/image/video) is a topic of great interest for the related research community, as well as for the companies that publish that kind of content. The current work focuses on news articles, and its goal is to predict the popularity of an article as soon as it gets published. The prediction is achieved by a machine learning system that receives data about the articles as input, processes them accordingly and trains/tests a model in order to estimate the number of impressions of each article. In this thesis two different models have been developed. The first one is a regression model that estimates the exact number of impressions. The second one is a classification model that classifies the articles into four different classes depending on the number of impressions. Both models use the same input data. Firstly, the titles of the articles are processed and a vocabulary that contains their word embeddings is created. Those word embeddings come from the pre-trained fastText library, which uses the Continuous-Bag-of-Words (CBoW) method for training. Then, a Python dictionary is created that contains the image labels of the articles, along with the probability of each label fitting the image. Those labels are produced by a ResNet50 classifier trained on the ImageNet database. Finally, the last system input is the publisher of the article, i.e. the website. After that, a convolutional neural network was developed and used to train both models. The results are promising and they seem to improve upon baseline methods that do not use neural networks for the prediction. One may safely say that the regression model is the more efficient of the two, and that neural networks lead to more efficient predictions of news popularity.
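A compressed sketch of the kind of convolutional regression model described above, restricted to the title input and assuming Keras/TensorFlow; the sequence length, embedding size, layer sizes and toy data are illustrative choices, and the pre-trained fastText vectors are assumed to be loaded separately.

    import numpy as np
    from tensorflow.keras import layers, models

    SEQ_LEN, EMB_DIM = 20, 300          # title length in tokens, fastText vector size

    # Input: each title as a sequence of pre-computed fastText word embeddings.
    title_input = layers.Input(shape=(SEQ_LEN, EMB_DIM))
    x = layers.Conv1D(64, kernel_size=3, activation="relu")(title_input)
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(32, activation="relu")(x)
    impressions = layers.Dense(1)(x)     # regression head: predicted impressions

    model = models.Model(title_input, impressions)
    model.compile(optimizer="adam", loss="mse")

    # Hypothetical toy data just to show the expected shapes.
    X = np.random.rand(8, SEQ_LEN, EMB_DIM)
    y = np.random.rand(8, 1) * 1000
    model.fit(X, y, epochs=1, verbose=0)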
Kosmas Tsiakas
Autonomous aerial vehicle localization, path planning and navigation towards full and optimal 3D coverage of a known environment
The present diploma thesis focuses on the implementation of algorithms for solving the problem of fast, reliable and low-cost inventorying in the logistics industry. The usage of drones simplifies this procedure and aims to determine every product's position with an accuracy of a few centimeters. The problem consists of two subproblems: a) position estimation in the indoor environment and b) autonomous full coverage of the area. In order to successfully tackle the problems described above, a known 3D map in OctoMap format is used. During the research, a Particle Filter based algorithm that uses an array of distance sensors around the drone was implemented, in order to track the pose of the robot against the known map. Navigation is based on a PID position controller that ensures an obstacle-free path. As for the full coverage, the targets are extracted and their optimal succession is then computed. Finally, a series of experiments were carried out to examine the robustness of the positioning system in three types of motion, as well as at different speeds in each of these cases. At the same time, various ways of traversing the environment were examined by using different configurations of the sensor that performs the area coverage. The experiments were performed entirely in a simulated environment.
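A toy illustration of a particle-filter weight update against a known map using a range measurement, with numpy; the one-dimensional map, sensor model and noise values are invented for the example and are far simpler than an OctoMap-based localization.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 1-D corridor: the only landmark is a wall at x = 10 m.
    WALL_X = 10.0
    true_pose = 4.0
    measured_range = WALL_X - true_pose + rng.normal(0, 0.1)   # noisy range reading

    # Particles spread over the corridor, each a pose hypothesis.
    particles = rng.uniform(0, WALL_X, size=500)
    expected = WALL_X - particles                               # predicted sensor reading

    # Weight particles by how well they explain the measurement, then resample.
    weights = np.exp(-0.5 * ((measured_range - expected) / 0.1) ** 2)
    weights /= weights.sum()
    particles = rng.choice(particles, size=particles.size, p=weights)

    print(f"estimated pose: {particles.mean():.2f} m (true: {true_pose} m)")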
Floros-Malivitsis Orestis
Natural Language Understanding for Human-Robot Interaction: the NAO Robot Case
This diploma thesis aims to recognize, inside a natural language text, actions that belong to a predefined list and map them to an already existing robotic platform. It does not attempt to synthesize a fully working algorithm that reflects the logic of the text; rather, a static mapping of the given sentences to actions is performed. The output of the system could be processed by an independent application for the final production of executable code. For the above-mentioned purposes, we have developed a natural language understanding (NLU) system, r4a-nao-nlp, that recognizes the supported actions of the R4A-NAO metamodel. We have implemented a modular software pipeline that segments the text using semantic role labeling to identify multiple user intents per sentence. In addition, the system utilizes the results of coreference resolution throughout the text to enhance the performance of intent classification and slot filling in sentences that include mentions. Since the dataset for training the NLU system had to be created from scratch, our approach has been designed to cope with a low-data regime; there are no requirements in the dataset for sentences that combine multiple intents, since that would result in polynomial growth of the dataset size. The final output of our pipeline is a directed graph that encompasses all detected actions and connects them with the original conjunctions of the text. This implementation benefits from its modularity, since the models used, with the exception of those that perform intent classification and slot filling, come pre-trained on much larger datasets, concern major natural language processing tasks and are therefore bound to improve with the further development of the related technology. We believe that our approach can be utilized, without the need to increase training data, by task-oriented dialog systems or other related applications that often lack the ability to recognize multiple intents per sentence. In conclusion, we have developed a system that can prove useful to the final user, who can obtain optimal results if they learn about its limitations and idiosyncrasies. This procedure is not considered to demand technical or esoteric knowledge of r4a-nao-nlp.
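The directed-graph output could be approximated by a sketch like the one below, which skips the actual NLU models and replaces intent classification with a naive keyword lookup; the action names, clause splitting and conjunction handling are hypothetical simplifications of the r4a-nao-nlp pipeline.

    # Naive stand-in for intent classification: keyword -> supported action.
    ACTIONS = {"wave": "WaveHand", "walk": "WalkForward", "say": "Speak"}

    def detect_action(clause):
        for keyword, action in ACTIONS.items():
            if keyword in clause.lower():
                return action
        return None

    def build_action_graph(text):
        """Return edges (action, next_action, conjunction) in textual order."""
        clauses = [c.strip() for c in text.split(" and ")]
        detected = [a for a in (detect_action(c) for c in clauses) if a]
        return [(a, b, "and") for a, b in zip(detected, detected[1:])]

    print(build_action_graph("Wave your hand and then walk to the door and say hello"))
    # [('WaveHand', 'WalkForward', 'and'), ('WalkForward', 'Speak', 'and')]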
Pantelis Photiou
Design and implementation of a hybrid system to satisfy execution time constraints for software built with ROS1, ROS2 and IoT frameworks
The enormous development of robotics has created the necessity of remotely monitoring these machines and, at the same time, remotely monitoring the parameters of a controlled environment, such as temperature, pressure or humidity, using sensors. Here is where the Internet of Things (IoT) comes into the picture. In the concept of the IoT, it is very important to connect the "Things" to a network so that they can send and receive data. "Things" can be devices that are used in everyday life, have computational power and allow connection to the Web. The rapid evolution of the IoT industry over the past few years and the large number of low-cost smart devices have created the need for the development of protocols and tools allowing the communication between them and their connection to the Web. In addition to the existing protocols, new ones have been developed to efficiently transfer data between all these devices, as well as to remotely monitor them through the Cloud. This diploma thesis presents the development of an IoT application which allows communication between smart devices and robots. For the creation of the system, the use of the ROS2 framework has been explored. ROS2 is the latest version of the Robot Operating System (ROS), the most used robotic framework nowadays, which uses the DDS (Data Distribution Service) communication protocol. DDS is a real-time, data-centric, publish-subscribe protocol created specifically to meet the needs of a fully distributed IoT system. In addition to the infrastructure, experiments were performed to evaluate DDS as a communication protocol for robotic applications, as well as an integration with an IoT platform, drawing useful conclusions.
Dimitrios Tatsis
Test case automated generation through dynamic analysis and symbolic execution
With the current technological advances there is a rising need for testing the reliability and the security of software. Security testing usually requires manual work from a highly specialized security researcher. However, lately a number of program analysis techniques have been employed in order to automate parts of the process. In this thesis an automatic system is developed that uses dynamic analysis and symbolic execution in order to produce test cases for programs. Dynamic analysis is used to collect run-time information such as the current program state, the usage of input data by the program and the paths that have been executed. This information is used to accelerate symbolic execution and produce a test case file that is able to pass various program checks and execute more program code, thus increasing code coverage. These test cases can be used in further analysis of the program. The system that was developed gives encouraging results, producing partial input files with the structure expected by the analyzed programs, automatically and without any prior knowledge. However, the limitations of symbolic execution quickly become apparent, since the analysis has exponential complexity. As a result, additional methods must be used in order to surpass these limitations in larger programs.

2018

Vasilios Politiadis
Automated Production and Execution of Tests on RESTful Web Services
The REST architectural style first appeared in Roy Fielding's doctoral dissertation in 2000. The basic idea behind the REST architecture is that all objects managed by a web service can be considered system resources. Since it is based on client-server logic, a client may send a request to the service to create, retrieve, update, or delete a resource representation, through a URI associated with it, by properly using the CRUD verbs of the HTTP protocol. In recent years, thanks to its simplicity and power, the REST architectural style has been embraced by the global software industry, has conquered the field of web services and is now the dominant model for their development. Meanwhile, the growing need for easy and fast development of reliable software has led software engineers to adopt methodologies such as Model Driven Engineering (MDE), with the goal of increasing performance, productivity, and automation in the process of software development. The idea promoted by MDE is to use models at different levels of abstraction while designing systems and to automate the process by using model transformations from higher to lower abstraction levels, until the final executable code is produced. One of the most important issues throughout the software development process is to test the quality and reliability of the produced software. Software testing is in principle performed following two different approaches with regard to the tester's knowledge of the internal structure and design of the tested software. The White Box approach dictates that the tester knows the internal structure of the system and performs tests from the developer's point of view, while the Black Box approach dictates that the tester considers the software as a black box, providing inputs and analyzing its responses from the end user's point of view. The context of this diploma thesis is the development of a programming tool that helps automate the generation and execution of tests on RESTful web services developed using the S-CASE MDE Engine. Given the S-CASE PSM meta-model and the PSM model of a given service as input, a Model-To-Text transformation is performed, which produces a dedicated Black Box testing application. Finally, the execution of this application generates detailed test reports and test results stored in JSON format.
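A minimal example of the kind of black-box test such a generated application performs against a RESTful resource, assuming the requests library; the base URL and the resource schema are hypothetical.

    import requests

    BASE = "http://localhost:8080/api"          # hypothetical service under test

    # CREATE, READ, UPDATE, DELETE against a 'books' resource via the HTTP verbs.
    created = requests.post(f"{BASE}/books", json={"title": "REST in Practice"})
    assert created.status_code == 201
    book_uri = f"{BASE}/books/{created.json()['id']}"

    assert requests.get(book_uri).status_code == 200
    assert requests.put(book_uri, json={"title": "REST in Practice, 2nd ed."}).status_code == 200
    assert requests.delete(book_uri).status_code in (200, 204)
    assert requests.get(book_uri).status_code == 404     # resource is gone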
Anastasios Kakouris
Continuous User Authentication in Web Applications through Behavioral Biometrics
The use of continuous authentication systems that employ behavioral biometrics is gradually gaining ground as the preferred method of authentication. This is due to the limitations imposed by standard authentication methods, which are unable to guarantee user identity beyond the initial authentication and conceal serious security issues such as impersonation and exposure of personal data to third parties. On the other hand, a user's behavioral biometric trait is very difficult to copy or intercept in any way, and as a result it can be used for the implementation of a continuous authentication system, ensuring a secure session for the user. Within the context of this thesis we choose as a behavioral biometric trait the way a user interacts with his/her keyboard. Practically, we use keystroke dynamics to identify and authenticate users. Specifically, we analyze the keystroke digraphs of a user that are collected from the typing of words, and extract three features: the hold time of the first key of a digraph, the hold time of the second key and the time elapsed between the release of the first key and the pressing of the second key. In order to test and evaluate the implemented system we collected a total of 59000 keystroke events from 37 subjects within a period of 12 weeks. Using that data, we tested various pattern recognition models, such as classification and outlier detection, while experimenting at the same time with data pre-processing techniques for the reduction of feature vector dimensions (PCA). The best results are obtained by employing a One-Class SVM for a 3-d feature vector, achieving a 0.61% False Accept Rate (FAR) and a 0.75% False Reject Rate (FRR), and by employing Gaussian Mixture Models for a 2-d feature vector, achieving a 1.35% FAR and a 1.71% FRR. Our results show that keystroke dynamics can be used effectively in a continuous authentication system.
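A small sketch of a One-Class SVM setup on 3-dimensional digraph feature vectors like those described above, using scikit-learn; the timing distributions are synthetic, not data from the study.

    import numpy as np
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(1)

    # Each sample: [hold time of key 1, hold time of key 2, flight time] in ms.
    legitimate = rng.normal([95, 90, 120], [10, 10, 25], size=(500, 3))   # enrollment data
    impostor   = rng.normal([60, 140, 300], [15, 15, 60], size=(50, 3))   # different typist

    model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(legitimate)

    # +1 means "same user", -1 means the digraph looks anomalous.
    print("accepted legit:   ", (model.predict(legitimate) == 1).mean())
    print("rejected impostor:", (model.predict(impostor) == -1).mean())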
Anastasios Loutroukis
Employing semantic analysis methods for personalizing recommendation in e-commerce systems
The digitization of markets has made the e-commerce industry the dominant way of performing human transactions. A key challenge for e-commerce systems is the high volume of data they manage; appropriate techniques need to be designed and developed in order to personalize the content available to consumers, offering them information on products of interest to them. Recommendation systems for e-commerce, which employ machine learning and data analysis techniques in order to generate appropriate personalization content models, have been developed in this direction. The main problem characterizing existing systems is the lack of semantic understanding of the provided recommendations, as most of the literature algorithms focus on proposing products solely based on the analysis of users' rating patterns. Within the context of this diploma thesis we have focused on the design and development of semantically aware personalization techniques for e-commerce. These techniques focus on the content that characterizes the products and the interests of the users. The model that is generated employs a set of natural language processing methods and achieves, initially, the categorization of products based on their thematic content and, secondly, the assignment of users to the extracted product categories based on the users' interests.
Alexandra Ampartsoumian
An alerting system for the elderly by utilizing the NAO social robot
The social and technological inclusion of the elderly, as well as their psychological and physical support, have recently become a major issue, due to the fact that senior citizens comprise an ever-increasing percentage of the general population. Within this context, numerous scientific studies have been conducted worldwide, engaging socially assistive robots, in order to find solutions that will enhance the autonomous living and the overall quality of the seniors’ life. In the present diploma thesis, the humanoid robot NAO assumes the role of a socially assistive robot, in order to assist the elderly towards quality living. In particular, NAO reminds seniors of their medication and general events of their everyday life, as well as plays songs associated with their past memories and experiences. Furthermore, each time a medication event reminder is triggered, the proper medicine image is simultaneously being displayed on a computer screen. Appropriate information of medication and music tracks can be inserted by the caregiver of the elderly person via the application graphical interface, while scheduling of all types of events is possible via the Google Calendar UI. Additionally, in the context of the present application, the assistive robot can operate as a recreational companion so as to contribute to the improvement of the mental and emotional well-being of the elderly. Its entertaining role is accomplished through a set of activities that allow the elderly user to be informed about upcoming events of their everyday life and to listen to music pieces of their choice on demand. The interaction of the elderly with the robot is accompanied by a series of interactive images displayed on a computer screen so that the activity is as pleasant and user-friendly as possible. The evaluation of the implemented robotics application by a psychologist specialized in issues related to the elderly, can be found at the end of this document.
Napoleon Christos Oikonomou
Call by Meaning: Calling Software Components Based on Their Meaning
Software development today involves code reusability to a great extent. Software components to be reused are often difficult to fully understand, since they are written by third parties and are usually designed to solve more abstract and general problems. This makes component-based software development a tedious process, as it requires developers to first find the component they need, then understand exactly how it works and make it compatible with their system and, lastly, continuously upgrade it to stay compatible when the component changes. Software developers have long realized that even the creation of a simple application is now quite complex. This is because we still rely on describing various components based on their name. The problem is that this name consensus cannot be easily expanded outside the environment in which it was created. As a result, the process of discovering software components and making them compatible, as well as the ability of the application to respond to changes in its external environment, becomes difficult. This Diploma Thesis deals with the creation of an infrastructure that potentially replaces arbitrary naming conventions, creating an environment in which component discovery is based on the generally accepted assumption that any method can be described analytically and uniquely if the description of its inputs and outputs is sufficiently detailed. Aside from discovery, having installation and compatibility issues in mind, it seemed logical that this infrastructure should be cut off from any local development environment and be a more universal part of the ecosystem.
Ioannis Maniadis
UI Personalization in E-Commerce through User Interest Analysis
In recent years, the increase of consumer activity on the internet, as well as the increase in processing power made available to smaller businesses, have created the potential and the necessity for websites to study their visitors in depth, in an effort to approach them in better and more intuitive ways. Specifically with regard to e-commerce websites, machine learning techniques are being applied to make personalized recommendations of products and/or product categories to visitors, based on their recorded activity and/or their known traits (age, gender, etc.). This process is one of the ways in which web personalization is applied, and a number of methods have been developed to achieve it, each aiming at specific aspects of the issue. The objective of this diploma thesis is to design and develop a method which efficiently analyzes the recorded activity (history) of an e-shop's visitors and makes predictions about their interests when they revisit the website. The analysis is performed on real data from www.pharm24.gr, with visitor activity mapped to sections of the website as defined by the website's administrator. This is a novel approach which deviates from typical methods found in the relevant literature. Based on these data, our system applies machine learning techniques to predict which sections of the website are most likely to be of interest to each user in their next session. Within this framework, different techniques are tested, evaluated and compared, and the ones that yield the best results are presented.
Valasia Dimaridou
Analysis of Human Presence and Behavior at Points of Interest
The widespread use of image recording systems has led to the implementation of integrated systems for human behavior observation at various points of interest. Meanwhile, research on targeted advertisement has increased during the past years. The combination of these two scientific tendencies, along with the continuous evolution of the hardware responsible for image recording and processing, has led to the design, development and validation of an innovative system responsible for analyzing human presence at points of interest. The above need is summarized in a methodology that conducts statistical analysis of the number of people crossing a point of interest (e.g. in front of billboards), while estimating and recording some biometrics (such as age and gender) and the direction of each person's glance, provided his/her position and pose allow it. Within the context of the present diploma thesis, an extensive survey was carried out concerning previous work on methods used to detect and analyze individuals passing by a place. Furthermore, our work proposes a method of handling videos that include people in order to solve the problems of foreground and background separation, face detection, finding distinctive landmarks in faces, calculating the rotation in three degrees of freedom and providing biometric labels. The second part of the methodology is responsible for incorporating the proposed method into a tracking system which uses depth images. The proposed methodology is meant to be used as a real-time embedded system, so its implementation is as computationally inexpensive as possible. The current work includes an extensive evaluation methodology which demonstrates the capability of using the proposed system for commercial purposes.
Spyridon Papatzelos
Study on Cost of Application Execution and Storing Process in Blockchain environments
In a world that is evolving faster than ever, information, data and intelligence are the most valuable of all goods. How fast information is available, and to whom, are the keys that transformed Blockchain from an idea into a useful tool, one which keeps expanding year after year of its very short life. Traceability, i.e. the "what", the "when" and the "where" of the transaction procedures, and transparency, i.e. every user's right and privilege to access the Blockchain database, are two of the most important advantages this technology has to offer. Discovering this innovative technology makes the discoverer want to learn more, even if he/she is not in the computer and data professions. The main goals of this thesis are the analysis of Blockchain systems and the search for optimization techniques from the standpoint of data storage cost and application execution in a Blockchain environment. Two objectives have been targeted: first, the implementation of a Blockchain application on the subject of agricultural logistics and, second, the research of optimization techniques for this application. The first stage began with the study of various papers on the characteristics of the operation of a Blockchain system. The Blockchain system is a general concept, which means that it is not an actual implementation. It was introduced to the world of applications by Bitcoin, the first implementation of Blockchain, in 2009. Bitcoin combined a variety of existing technologies to make Blockchain as it is known today. In order to understand Blockchain more thoroughly, the study immersed in different environments such as Bitcoin, Ethereum and Hyperledger. Its open character, security and user community made the choice of Ethereum optimal. The second stage includes a further examination of Ethereum's characteristics. Ethereum allows application implementation through Smart Contracts in the Ethereum Virtual Machine (EVM), its execution environment. Smart Contracts were developed using the Solidity programming language. All the acquired knowledge was used to design an agricultural logistics application. Finally, various scenarios were tested in order to optimize the cost of Blockchain execution and storage. The optimization techniques were applied to the agricultural logistics application.
Dimitrios Rakantas
Implementation of a robotic application platform using a robotic web simulator
Nikos Oikonomou
Extracting Semantics from Online Source Code for Software Reuse
The widespread use of the Internet and the convenience of information sharing that comes as a consequence have resulted in essential changes regarding software distribution and development. Software examples are now in abundance in code repositories, as well as on various programming-related websites. However, the search for code examples (aiming at code reuse) proves to be a problematic task when using conventional search engines, as the software engineer is forced to set his project aside and waste a lot of time examining the usefulness of the results. In order to cope with the aforementioned difficulties and provide a more specialized solution to the problem, Recommendation Systems in Software Engineering (RSSEs) began to be developed. The fundamental objective of these systems is the ability to recognize, when given a query, code examples with relevant content. Nevertheless, this is difficult for most systems due to the lexical gap between search queries, usually expressed in natural language, and retrieved documents, often expressed in code. In addition, the majority of systems often require the composition of complicated search queries and provide results in a non-optimal order. Finally, most of the systems are based on simple Vector Space Models (VSM) and do not make use of semantic information for the retrieval of useful code snippets. The need for an efficient solution to the problem led us to the design and development of a new recommendation system called StackSearch. Our system uses the Stack Overflow website as a data source. After careful preprocessing of both textual and code data, we trained certain Vector Space Models. Using the aforementioned models, our system accepts search queries in natural language and is able to take advantage of the semantic information of the text that accompanies code snippets, thus managing to present the user with relevant examples. Those code examples have previously been mined from Stack Overflow posts and checked for syntax errors. Finally, we evaluate the system by comparing our ranking algorithm with existing solutions from recent research to ensure its efficiency.
Triantafyllia Voulibasi
Test Routine Automation through Natural Language Processing Techniques
Artificial Intelligence and Big Data concern a big portion of the technology research community nowadays. The question is how to move from research to an actual "intelligent" implementation. This work utilizes numerous novel Big Data manipulation and Artificial Intelligence techniques in text mining in order to build a productivity tool prototype for Software Testing Engineers to produce automated tests. The tool is founded on Recommender Systems, where Deep Learning approaches calculate the semantic similarity between a search query and the results, after taking into account a massive amount of software-related documentation data. Association Analysis empowers the system with the ability to "remember" and improve itself, as older inputs are stored and processed to assign better scores to future queries. This tool addresses test engineers working with Model-Based Testing, where building blocks can be combined to implement an automated test. The user can create a test scenario that is transformed into a test ready to run, with automatic requirement tracing supported. Experiments conducted on the dataset of the European Space Agency's Ground Segment test scenarios demonstrate the ability of this domain-specific tool to produce results close to human thinking and ease testing procedures.
Adamantidou Eleni
Development of an application that provides services based on speech recognition
In an era where technology is a big part of most people's everyday life, verbal communication between a human and a machine could make the use of technological products easier, even for older people. For that reason, a speech recognition application which provides users with information is implemented in this project. The user asks the application a question orally, the application transcribes the question into text and responds to the user, providing information that it receives from the corresponding web service. The application consists of 7 individual stages and is designed to be easily extensible, making the addition of a new service possible with a minor change to the existing code. In this thesis, 3 services that inform the user about hospitals, pharmacies or the weather were implemented. As far as speech recognition is concerned, a new, domain-specific Greek model was trained in order to improve recognition performance over other models. The new model is trained on recordings of people asking questions whose content is relevant to the 3 services above. The conducted experiments show a significant improvement of the application-specific speech recognition system over a generic system, as well as the efficient response of the application to the user's questions.
Fengomytis Thomas
Source code quality analysis in multi-language software projects
The rapid development of technology and the widespread use of the internet have resulted in the evolution of software development, which is now dominated by the concept of code reuse. This has been aided by the numerous open source software projects which are distributed freely in online repositories and are easily accessible to developers. Reusing code to develop new software creates the need for quality evaluation of software components. Additionally, the wide adoption of component-based software engineering techniques has led to the emergence of multi-language software projects, namely software projects containing code fragments written in different programming languages, where developers aspire to optimally exploit the capabilities of each language. This further increases the complexity of software quality evaluation. Existing practices focus on analyzing and evaluating the source code of single-language software projects; in practice they employ static analysis tools that can evaluate software projects in only one programming language. Within the context of this diploma thesis, a quality evaluation system for multi-language software projects was designed that takes into account the calls between source code sections written in different programming languages. The methodology of this thesis is based on adapting static metric analysis techniques for quality evaluation, differentiating the calculation of static metrics based on the various multi-language calls. Applying our methodology to multi-language software projects (implemented in Python and Java) has shown that the system is able to provide a comprehensive and representative quality evaluation model and can therefore be a useful tool for developers.
Eystratios Narlis
User-perceived quality evaluation of user interfaces in web applications through the identification of dominant design patterns
Web pages have become an indispensable part of gathering and providing information in all areas of everyday life. Whether a user is working on a computer at the office, playing on a video game console, communicating with others on a smartphone, or entering an address in a GPS device while driving a car, people are constantly interacting with GUIs through which they exchange information. The plethora of available web applications has led to a new reality where each user can find applications that meet every need. In the majority of cases, the applications available for a certain functionality number in the dozens, which makes the design of the interfaces crucial to the end user's choice. Better design, in terms of both attractiveness and usability, significantly increases the eligibility of an application for end users. Based on the above, this diploma thesis aims to help improve the design of graphical interfaces of web applications by proposing an automated tool that can evaluate the design of a website by modeling how good design is perceived by end users. To this end, training data was collected using data mining techniques on a dataset containing the 5000 most popular websites. Static analysis was performed on these websites in order to identify common design patterns for their structural elements, and these patterns were then used to create a tree-based model for website design assessment. The system developed, in addition to rating websites, is able to propose specific design changes based on the prevailing patterns that have been extracted and can therefore be a useful tool for developers.
Dimosthenis Kitsios
Automatic Code Generation of Behavior for the NAO Robot Platform
In computer science, the term automatic programming identifies a kind of programming in which a mechanism creates a program, allowing scientists to build software applications at a higher level of abstraction. Model-driven engineering is a software development methodology that focuses on creating and exploiting models, which are conceptual models of all issues associated with a specific problem. Therefore, it emphasizes and aims at abstract representations of the activities that govern a particular field of application, instead of computational (i.e. algorithmic) concepts. In this diploma thesis, an Eclipse-based method of automated code generation was designed that allows the use of model-driven approaches and is used to create executable code from Ecore models defined by a metamodel. Model-based software technology aims to reduce development effort by creating executable code from high-level models. The aim of the diploma thesis is to create a pleasant and friendly graphical user interface through which the user interacts and selects the functions that define the commands to be executed by a robot. To test and evaluate the system that was designed, experiments were performed on the NAO robot, an autonomous, programmable humanoid robot developed by Aldebaran Robotics in 2006.

2017

Sofia Sysourka
Design and development of an aesthetics quality evaluation system of web applications based on structural analysis
Graphical User Interfaces (GUIs) form a communication channel between man and machine, and they aim to offer an effective and easy way to serve the functional requirements of software. Typical examples of such software are web applications, which constantly grow in both number and popularity. As a result, web application providers strive to build user interfaces that offer attractive aesthetic design and ease of navigation and access to the information that users are looking for. The question raised, which constitutes the basic research field of this diploma thesis, is the following: how can the aesthetic design of a webpage GUI be evaluated? The above question dictates the design and development of a reliable mechanism for evaluating and modeling the design of the GUIs of web applications. This diploma thesis aims to contribute to the aforementioned question by identifying design patterns related to the aesthetic characteristics of GUIs, as well as specialized patterns which are implemented on webpages of specific content. The process of finding the aforementioned design patterns is based on the way end users perceive aesthetics (user-perceived aesthetics), indirectly reflected in web application popularity. Towards this end, a data collection and processing system was developed, which led to the development of a model for evaluating the aesthetic design quality of webpages. The training data comprise 75 popular webpages from three different domains (e-shopping, news, search engines). Static analysis was performed on these webpages in order to collect useful information regarding the GUI components (i.e. the number of elements in each webpage and the way they are distributed among the layers of the view), as well as to calculate a series of metrics used widely in the bibliography. Classification and clustering techniques were applied on the collected data, which resulted in a combined aesthetics evaluation model. The results successfully incorporate the notion of aesthetics as it is perceived by the end users, and therefore the model can be a useful tool for programmers.
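The static-analysis step can be approximated by the sketch below, which collects a few simple layout quantities from a page's DOM using requests and BeautifulSoup; the metric set is a hypothetical simplification of the ones used in the thesis, and the URL is a placeholder.

    import requests
    from bs4 import BeautifulSoup

    def layout_metrics(url):
        """Count a few GUI-related quantities that serve as crude aesthetics metrics."""
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        elements = soup.find_all(True)
        return {
            "total_elements": len(elements),
            "images": len(soup.find_all("img")),
            "links": len(soup.find_all("a")),
            "text_length": len(soup.get_text(" ", strip=True)),
        }

    print(layout_metrics("https://example.com"))   # placeholder URL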
Bagia Rousopoulou
Automatic user-perceived usability evaluation of web applications through the identification of dominant design patterns in user interface elements
In recent years, the rapid development of the internet has become apparent, resulting in a plethora of web applications that have become a source of information for millions of users. People, regardless of their age group and their economic and social status, use the web on a daily basis for multiple purposes such as information retrieval, entertainment, communication, business etc. The continually increasing number of web applications, in conjunction with the existence of multiple tools which automate the design of user interfaces, necessitates the development of methods that assess user-perceived usability. The current diploma thesis aims to contribute towards the improvement of web application graphical interface design, by proposing an automated evaluation model for user interfaces based on crowdsourced information regarding the way usability is perceived by end-users. Towards this direction, the proposed system applies static analysis techniques to a number of popular websites in order to calculate a series of metrics closely related to UI aesthetics and visual complexity. These metrics constitute the information basis upon which design patterns are extracted using artificial intelligence and data mining techniques. The identified design patterns are used to create a rule-based system for the evaluation of user interfaces. Preliminary results regarding the usage of the proposed system indicate that it can be a useful tool for developers and interface designers.
Dimitrios Dontsios
Meta-modeling of Non-Functional Software Requirements for RESTful Services
The REST design pattern was first introduced in Roy Fielding's dissertation in 2000. This pattern is in fact a set of well-defined rules and constraints which, when applied to a given web service, make it more appealing by improving performance and enabling scalability, modifiability and easy grasping of the functionality of the service. The basic idea of REST is that every object of the service is a resource that can be easily created and destroyed using Uniform Resource Identifiers (URIs), namely web links. These resources can be modified by a well-defined set of actions, the HTTP verbs, and they can be shared between clients and servers using strict representation forms and protocols. Due to its simplicity, REST became so popular that more and more services are built guided by its architectural style. Thus, an increasing need for tools that automate the process of building RESTful web services is evident. Many such tools are easy to use, but lack in some aspects. Some of them manage to meet more of the constraints that REST demands, while others achieve better automation of the process. However, what is common in all existing tools is that they aim to fulfill only functional requirements, whereas they disregard the importance of non-functional requirements. The aim of this thesis is the design and development of a tool that automates the production of a RESTful API, while taking into account non-functional requirements apart from the functional ones. This tool is an extension of the S-CASE MDE Engine that semi-automatically produces RESTful APIs by employing model-driven engineering. The implemented extension uses the MDA process, a model-driven engineering process introduced by the OMG (Object Management Group). The generated code conforms to the MVC architecture using JAVA EE. It also complies with all the rules that the S-CASE tool introduces, thus conforming to Richardson's Maturity Model. The non-functional requirements are satisfied by modeling design patterns and integrating them into the S-CASE MDE engine, and hence into the produced code.
Athanasios Lelis
Deep auto-encoders for source code retrieval and visualization
Dimanidis Anastasios
RESTful Web API Development using the Gherkin language and the OpenAPI Specification
The problem of effectively satisfying customer requirements in the typical software development lifecycle has been of major concern, not only to the software industry, but also to the academic world. Thus, new software development methodologies like Behavior-Driven Development and the Agile manifesto have been introduced, dictating continuous and detailed communication between the software engineer and the customer. At the same time the World Wide Web is maturing. The concept of "The Web as an Application Platform" is widely adopted. Inevitably, web developers and industry specialists are discussing methods of effectively designing and developing web applications. The current state of the industry shows that technologies like REST might be the answer to those discussions. This thesis sets two major goals: a) to design a methodology where RESTful Web API functional requirements are described in a customer-friendly format and in natural language, b) to develop a software tool that will transform the described requirements into technical information. For these goals to be met we employ Gherkin, a user requirements language, the OpenAPI Specification, a specification for REST Web APIs, and finally Natural Language Processing (NLP) mechanisms. First, it was examined how the REST concepts could be mapped to Gherkin. For that reason, members of Agile and BDD companies were contacted, API company blogs and seminars were examined and the available bibliography was thoroughly studied. Based on this research, the Resource Driven Development (RDD) methodology was designed. Per RDD, the functional requirements of a web application are organized in resources. Thus, the original way of writing Gherkin feature files was revised: the 'When' and 'Then' steps are now used to model the HTTP protocol, and the scenarios are used to describe resource and application state changes, as implied by REST. The RDD methodology is described in detail with specific examples of Gherkin files. The next step was to develop a software tool, named Gherkin2OAS, which is responsible for converting Gherkin requirements to the OpenAPI Specification. The software is written in Python 3.5. Its functionality, its functions and the NLP mechanisms it uses are thoroughly described. Gherkin2OAS can detect in natural language text HTTP verbs, parameter names, types (like string, int, float, bool, array, file, date, password and more) and properties (like required, min/max, descriptions and formats), resource linking through the HATEOAS concept, roles/users, HTTP status codes and more. It also has a separate functionality, where it organizes those technical properties into an OpenAPI Specification document. Furthermore, Gherkin2OAS has built-in messages that try to guide the user in writing Gherkin requirements, much in the way a compiler guides a programmer.
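The core When/Then-to-OpenAPI mapping can be pictured with the following simplified Python sketch; it illustrates the idea only, not the actual Gherkin2OAS implementation, and the regular expressions and scenario text are invented:

    # Simplified sketch of mapping a Gherkin-style scenario to an OpenAPI path entry.
    import json
    import re

    scenario = """
    When I POST to /users with name and email
    Then I get a 201 response
    """

    VERBS = {"GET", "POST", "PUT", "DELETE"}
    spec = {"openapi": "3.0.0", "paths": {}}

    when = re.search(r"When I (\w+) to (\S+)", scenario)
    then = re.search(r"Then I get a (\d{3})", scenario)
    if when and then and when.group(1) in VERBS:
        verb, path, status = when.group(1).lower(), when.group(2), then.group(1)
        spec["paths"][path] = {verb: {"responses": {status: {"description": "expected outcome"}}}}

    print(json.dumps(spec, indent=2))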
Christos Psarras
Development of a Source Code Visualization System using Information Retrieval techniques
The internet has completely revolutionized the way we communicate and exchange information. It has provided the necessary infrastructure for the creation of software repositories that offer access to large collections of open source software, including software applications and libraries. Libraries provide the building blocks for the creation of larger, more complex software, by implementing useful algorithms that effectively confront specific problems. This functionality, though, comes at a price, due to the considerable time and effort required to understand and/or extend a library. Several applications have been created to analyze the structure and the documentation of a given library and present them to the developer. Even though these tools can be quite effective in some cases, the documentation for a library is often limited or even non-existent, while the structure of the code is not sufficient for deducing its functionality. As a result, there is a growing need for tools that harness the semantic content of source code, left behind by developers in identifier names and comments, in order to provide a semantic description of the functionality of an application, as well as an analysis of the cohesion of its package structure. By utilizing state-of-the-art information processing techniques we have implemented a system that analyzes the source code of a given library, extracts useful information from variable/method names and comments, and identifies semantic topics. Our system supports a set of vectorizers (count, tf-idf) and clusterers (k-means, LDA) and automatically evaluates their performance based on the purity score of the extracted topics. Furthermore, an online search is performed in order to find tags related to the top terms of each topic, and thus offer a more abstract description of the topic. Our system also provides a visualization of the distribution of packages to topics. Finally, it identifies similar topics and clusters them into semantic categories. Based on the results of a case study on Weka, as well as the application of our methodology to 5 other libraries of different sizes, we assess that the purity metric is at least 60% and in most cases over 75-80%. Furthermore, examining the retrieved tags for the topics indicates that their semantic content is described accurately. Finally, we provide a comparison between the clustering algorithms of our system, and further assess their effectiveness with respect to the selected vectorization techniques.
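The vectorize-then-cluster pipeline can be sketched in a few lines with scikit-learn; the example below uses invented identifier/comment "documents" per source file and is only an illustration of the approach:

    # Minimal sketch: count vectorization followed by LDA topic assignment.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "read arff file parse attribute instance",
        "train classifier decision tree split node",
        "plot chart visualize panel axis legend",
    ]
    X = CountVectorizer().fit_transform(docs)
    topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(X)
    print(topics.argmax(axis=1))  # dominant semantic topic per source file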
Themistoklis Papabasileiou
Extracting API usage examples from software repositories
In the era of the Internet, information sharing is an everyday phenomenon. The sheer amount of data shared makes its effective usage mandatory. Software, as a form of information, exists in abundance online, mainly in software repositories. However, the vastness of this information usually makes searching for code usage examples hard, while the usage of software libraries is further obscured by the lack of sufficient documentation. These library usage examples consist mainly of Application Programming Interface (API) usages, for which documentation is not always available and, even when it is, no guarantees are provided regarding its quality. More precisely, conducting such a search through common search engines proves cumbersome and time consuming. This problem is addressed by Code Search Engines (CSEs) that mine useful code information to provide relevant results. However, they also fail to solve this search problem effectively. Recommendation Systems in Software Engineering (RSSEs), especially those regarding API usage mining, offer a more specialized solution to the aforementioned problem. These systems provide relevant usage examples that match the queries given by the user. Still, most of them do not perform any checks whatsoever on the quality of the results returned, and produce redundant examples or cover only a small part of the API under examination. The need to systematically confront the problem of API usage mining led us to design and implement an RSSE system in order to effectively search for usage examples for a given API. Our system checks whether the retrieved code is compilable and employs a Frequent Closed Sequence mining algorithm in order to ensure that the produced results are of high quality and the API is covered effectively. Moreover, the rejection of redundant information at the mining stage makes our results cohesive. As output, our system can summarize an API by providing general examples for its methods as well as process queries regarding specific methods. We evaluate our system with respect to the percentage of API methods that are covered by the produced examples and further assess the quality of these examples by calculating their variety and cohesion. In addition, we conduct a case study for the machine learning and data mining library Weka, where our system is tested in a real-life scenario. The results of the evaluation are quite encouraging, indicating sufficient coverage of API methods while producing cohesive examples in a timely manner.
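As a much simplified stand-in for the mining step (frequent consecutive-pair counting rather than true frequent closed sequence mining, with invented method names), the idea of surfacing common API call patterns looks like this:

    # Simplified stand-in: count frequent consecutive API call pairs across snippets.
    from collections import Counter

    snippets = [
        ["Instances.new", "Classifier.buildClassifier", "Classifier.classifyInstance"],
        ["Instances.new", "Classifier.buildClassifier", "Evaluation.evaluateModel"],
    ]
    pairs = Counter(tuple(calls[i:i + 2]) for calls in snippets for i in range(len(calls) - 1))
    print(pairs.most_common(1))  # the most common consecutive API call pair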
Ioannis Zafeiriou
Software Engineer Profile Recognition Through Application of Data Mining Techniques on GitHub Repository Source Code and Comments
Software development methodologies, or process models, attempt to describe the steps that should be followed along the way from conception to deployment of software. There are traditional approaches that focus on a sequence of discrete and well-defined steps, like the Waterfall model, where communication channels are realized by passing documents, and others, like the Agile model, which emphasize the need for flexibility and constant, direct communication between team members. These newer models are very popular with software teams of varying sizes. Due to the importance and the means of communication described by these models, it is desirable to recruit people that possess both technical and communication skills. The problem that arises when looking for such people, though, lies in the difficulty of assessing these skills. Within the context of this diploma thesis we focus on this issue. To do so, we employ data mining techniques for identifying different team roles and also assess the activities of team members within the software development and operations process. The implemented system draws user activity data from the GitHub web platform and uses them as input to cluster team members. In this way we attempt to provide insight into the different team member roles that appear in open source projects, like the ones on GitHub, and into the performance of the users that act under these roles. After extensive experimentation with different combinations of datasets and evaluation features, the final results are considered to offer critical insight into these matters.
Eirini Chatzieleftheriou
Design and Development of a Refactoring-Based Quality Enhancement System
Marina Gerali
Automated Test Case Generation using Source Code Repositories
Recently, programmers and software engineers have started trying to take advantage of the abundance of information on the internet, in order to reuse code snippets which fit their projects, thus saving time and effort. To do so, Code Search Engines (CSEs) were developed, which acquire code snippets from various software repositories and, using data mining algorithms, attempt to present the user with results as relevant to his/her needs as possible. The process of relevant code search is facilitated by the use of the so-called Recommendation Systems in Software Engineering (RSSEs), which cooperate with CSEs, respond to more complex queries than CSEs do, take the developed project into consideration and apply sophisticated data mining techniques in order to present results to the end user. Despite the contribution of CSEs and RSSE systems to the field of code reuse, they are unable to solve the problem that is the subject of the current thesis, that is, searching for reusable test methods and automated test case generation. This thesis aims to present an RSSE system which receives the user's source code and constructs appropriate queries in order to search for test cases in online source code repositories, such as GitHub and AGORA. Using sophisticated techniques, which are presented in detail in later chapters, our system mines data from the retrieved code snippets, evaluates them based on their relevance to the query and checks whether they compile and run successfully. For each method that a user requests, the retrieved test methods are presented to the user, ranked in descending order. The user may select the test methods he/she prefers, so as to construct his/her own test case. Furthermore, he/she can select one of the proposed test cases, which arise from all the possible combinations of compilable test methods. After submitting a set of queries to our system and evaluating its performance, we believe that it produces satisfactory results, since in most cases more than one relevant result is retrieved, whether the user searches for single test methods or for complete test cases.
Ioannis Malamas
Design and Development of a Web Analytics System Based on Monitoring and Analyzing Users' Behavior
The continuous outspread of the Internet is accompanied by its active presence in all areas of human activity. Information pages, e-commerce platforms, social media and other websites are an integral part of modern reality. Everyone interacts with them, more or less, depending on their age, familiarity and particular needs. This new reality feeds the ever-growing trend for new websites and web applications that aspire to attract as many users as possible. Developing webpages and web applications that meet the ever-increasing demands of users is not an easy task: rather, it is a multifaceted problem. Its difficulty lies in the fact that different user categories imply different requirements. In addition, existing tools and recommendation systems that provide suggestions on optimal design have the disadvantage that they provide general assumptions, without taking into account the scope of each website. Thus, the following question arises: "How can a personalized assessment methodology be developed for the design of a website?" The answer to the above question lies in the use of information that originates from the end users themselves. Thus, this diploma thesis aims to contribute to the above research question through the development of a system for recording and analyzing the behavior of website users, in order to come up with useful conclusions regarding the user-perceived optimal design. Recording user behavior can be achieved through the collection of data that reflects how users interact with the website. Typical examples are data related to mouse movements, clicks, the subsections of the website they are accessing, and more. The collected data can then be analyzed to draw conclusions on how users browse the website and how the user experience could be improved. The system implemented in the context of this diploma thesis is called "Synopsis" and is responsible for recording and modeling user interaction within web pages. "Synopsis" was developed as an online application and tested in a real environment, where it was used to track the behavior of e-shop users. The results indicate that it can provide valuable information and contribute to the optimization of web page design.
Vasilis Bountris
Towards Source Code Generation with Recurrent Neural Networks
The evolution of the Machine Learning and Data Science disciplines has been rapid during the last decade. As computer engineers, we are looking for ways to take advantage of this evolution. In this diploma thesis we examine the potential of recurrent neural networks to generate source code, given their effectiveness at handling sequences. We propose two approaches, based on per-character analysis of software repositories. Following appropriate code pre-processing and network training, the models generate source code through a stochastic process. We perform static code analysis on the models' output in order to examine the performance of the approaches. We have applied our approach to the JavaScript language. The analysis shows the great representational power of recurrent neural networks, but also the inability of our approaches to satisfactorily address the problem of automatic programming. Based on these findings, we propose further research directions and ways of exploiting the models that were designed.
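A character-level model of the kind described can be sketched as follows (a hedged illustration in PyTorch, not the thesis' actual architecture; the vocabulary size, hidden size and sampling step are arbitrary):

    # Minimal character-level LSTM sketch: predict and sample the next character id.
    import torch
    import torch.nn as nn

    class CharRNN(nn.Module):
        def __init__(self, vocab_size, hidden=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
            self.head = nn.Linear(hidden, vocab_size)

        def forward(self, x, state=None):
            out, state = self.lstm(self.embed(x), state)
            return self.head(out), state

    model = CharRNN(vocab_size=96)
    context = torch.randint(0, 96, (1, 32))      # a batch with 32 character ids
    logits, _ = model(context)
    next_char = torch.distributions.Categorical(logits=logits[0, -1]).sample()
    print(int(next_char))  # id of the stochastically sampled next character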
Eleni Nisioti
Automated Data Scientist
The science of machine learning has achieved, based on solid mathematical tools, to transform the current data deluge into the understanding of underlying social, economical and natural mechanisms and the generation of related predictive models. However, the presence of computa- tionally demanding problems and the current inability to automatically transfer the knowledge on how to apply machine learning on new applications and new problems, delays the evolution of knowledge itself. The necessity of discovering paths that lead to a deeper understanding of the machine learning mechanisms is evident, bearing the ambition of training models that optimize the very process of learning, instead of individual applications. AutoML, that has recently emer- ged, attempts to automate the application of machine learning. Its most apparent manifestations include software systems that serve as productivity tools, instruments to make experts more efficient and effective, but not eliminate them. A common feature of these systems is the embed- ding of meta-knowledge, namely knowledge produced by the application of machine learning in past experiments, a trait that adds experience and adaptability to the system. This diploma thesis aims at implementing a software tool to facilitate the AutoML process. Exploiting current technologies, such as the rich CRAN repository, we explored opportunities offered by machine learning techniques and have attempted to push forward the state of the art by embedding meta- learning for optimal hyperparameter selection and forward model selection ensembles to our system. Main aspiration of our work consisted in designing and implementing an experienced, intuitive and expandable automated data analyst. The experiments seem promising, and we argue that the implemented tool could constitute an informative contribution to the area of AutoML.
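Forward model selection ensembling can be illustrated with a small greedy sketch (written in Python for illustration, although the thesis builds on the R/CRAN ecosystem; the model names and validation predictions are invented):

    # Greedy forward ensemble selection: repeatedly add the model that most
    # improves validation accuracy of the averaged ensemble (models may repeat).
    import numpy as np

    y_val = np.array([1, 0, 1, 1, 0])
    preds = {                       # per-model predicted probability of class 1
        "rf":  np.array([0.9, 0.4, 0.6, 0.7, 0.2]),
        "svm": np.array([0.6, 0.1, 0.8, 0.4, 0.5]),
        "knn": np.array([0.7, 0.6, 0.9, 0.8, 0.3]),
    }

    def acc(p):
        return np.mean((p > 0.5).astype(int) == y_val)

    chosen, ensemble = [], np.zeros(len(y_val))
    for _ in range(3):
        best = max(preds, key=lambda m: acc((ensemble * len(chosen) + preds[m]) / (len(chosen) + 1)))
        chosen.append(best)
        ensemble = np.mean([preds[m] for m in chosen], axis=0)
    print(chosen, acc(ensemble))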
Maria Kouiroukidou
Automatic generation of user interfaces for RESTful web services
Over the last decade, the architectural style that has prevailed for the development of web applications is the one introduced in Roy Fielding's thesis in 2000, the REST architectural style. Since then, thanks to its simplicity and power, the REST architectural style has conquered the field of web applications and is practically dominant as far as web service development is concerned. For this reason, the growing demand for and use of REST APIs is accompanied by the tendency to create automated processes that can produce an application that consumes RESTful web services, minimizing the time and cost needed for their development. Many automation tools have been created in recent years; however, many of them are unable to produce ready-to-run applications and require software developer intervention. This diploma thesis aspires to make the first steps towards automating the development of fully functional and ready-to-run web client applications. In order to automate the process of generating web client applications, this diploma thesis employs the MDA (Model Driven Architecture) approach. MDA defines a set of clearly defined templates and tools, and describes a development process where, once an initial abstract model has been defined, a series of transformations takes place, resulting in a fully functional application. This is intended to speed up the process of software development and to generate more reliable software. The tool developed in this diploma thesis, CREATE (Client for RESTful Api Automated Engine), implements an automated graphical user interface development tool. It automatically produces web client applications that consume RESTful web services as generated by S-CASE, manage CRUD (Create, Read, Update, and Delete) requests and receive, process, and present their responses. These web client applications provide features such as database search, user authentication and communication with external services. Graphical interface features are also provided, such as pop-ups for confirming or updating user actions, navigation menus, image integration, etc. Finally, documentation is produced to better explain the code. CREATE is implemented using the AngularJS framework.
Aspa Karanasiou, Chrisa Gouniotou
Interactive detection, tracking and localization of QR tags employing the NAO humanoid robot
Nowadays, robotics is one of the most rapidly progressing technological industries. Using an automatic robotic vehicle, many processes can now be accomplished. One of the most desirable characteristics a robotic vehicle should possess in order to complete a task is the ability of autonomous indoor navigation. The issue considered in this thesis is the calculation of the most efficient path between two points and its safe traversal. Specifically, the implementation focuses on indoor environments, which are known in advance and contain static and dynamic obstacles. A method for designing the most efficient path was developed, with the path planned so as to avoid any possible collisions with the obstacles. Furthermore, a localization method was developed so that the robot's position is known in a dynamic environment. To achieve this, a way to distinguish the type of each obstacle first had to be determined. Moreover, the method used to re-plan the initial path whenever a dynamic obstacle is observed is analyzed, so that the robotic vehicle can avoid such a collision. Finally, in order to check and evaluate these methods, a series of experiments was conducted.
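The thesis does not tie the approach to one specific planning algorithm; purely as an illustration of grid-based path planning, a breadth-first search over a small occupancy grid finds a shortest obstacle-free path:

    # Illustrative only: BFS shortest path on a toy occupancy grid (0 = free, 1 = obstacle).
    from collections import deque

    grid = [
        [0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
    ]
    start, goal = (0, 0), (2, 3)

    def shortest_path(grid, start, goal):
        rows, cols = len(grid), len(grid[0])
        queue, parent = deque([start]), {start: None}
        while queue:
            cell = queue.popleft()
            if cell == goal:                       # reconstruct the path back to start
                path = []
                while cell is not None:
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                nr, nc = nxt
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and nxt not in parent:
                    parent[nxt] = cell
                    queue.append(nxt)
        return None

    print(shortest_path(grid, start, goal))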
Grigorios Christainas
A restification methodology for client-server architectures. Application on the PowerTAC platform
The REST design pattern (Representational State Transfer) is a set of principles and rules for designing web services that was first introduced in Roy Fielding's dissertation «Architectural Styles and the Design of Network-based Software Architectures» in 2000. These principles are in fact a set of rules and constraints that, when applied in the process of designing a web service, make it more appealing and enable scalability. The basic concept of REST is the representation of information and objects as resources. Every object is in fact a resource and it can be easily created or deleted through URIs (Uniform Resource Identifiers). Through a well-defined set of HTTP actions, a client is able to access and modify information through a strict set of representation forms and protocols. The ever-increasing demand for web services that are governed by a RESTful architecture is accompanied by a tendency to evolve their creation and operation techniques. Client-server architectures based on outdated architectural patterns tend to be replaced by RESTful architectures and approaches. This diploma thesis deals with the study of a real problem of transforming a client-server architecture to REST. Specifically, the transformation of the PowerTAC platform is considered, a platform that constitutes a competitive simulation of an energy market where competing entities called brokers offer energy services to customers through contracts and are then asked to maximize their profit by buying and selling energy in order to satisfy their customers. The diploma thesis presents the main problems encountered in the restification process of the platform, as well as the solutions given, aiming at a general presentation of solutions for design problems encountered during restification.
Dimitrios Gouris
Autodiscovery of Web services utilizing the Semantic Web
The Web is adapting in order to handle the magnitude of ever-increasing data. The Semantic Web, as envisioned by Tim Berners-Lee, is emerging slowly, although the required methods and technologies are already there to be applied at large scale. The current thesis is focused on automating the discovery and usage of Web services. We argue that in a dialogue between participants, whether they are humans or machines, there has to be a common context between them. This context guarantees the soundness of their communication. This fundamental context is described through technologies offered by the Semantic Web toolchain. We use the Resource Description Framework (RDF) as our data format. Additionally, a variety of vocabularies is provided in order to assign meaning to data models and services. A network of servers offering Web services is implemented. Their content is described with terms from the HYDRA vocabulary. The data model is built upon schema.org terms rather than being simply annotated. The generic client is able to understand those terms and communicate with the servers with the aid of RDF graphs, instead of direct calls to their URLs. In the middle of this communication lies the API-Resolver, a server equipped with an RDF parser and a SPARQL endpoint, aspiring to resolve and match the requests from the client to the desired server. The goal of the current thesis is twofold: the evaluation of this proof-of-concept implementation and the exhibition of the Semantic Web's potential. However, only with its large-scale adoption will the automation of many processes and the extension of the functionality of the current Web become feasible.
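The flavour of describing resources with schema.org terms in RDF can be shown with a short sketch (assuming the rdflib Python library; the resource URI and properties are invented examples, not the thesis' actual data model):

    # Sketch: describing an API resource with schema.org terms using rdflib.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    SCHEMA = Namespace("http://schema.org/")
    g = Graph()
    book = URIRef("http://example.org/api/books/1")    # example resource URI
    g.add((book, RDF.type, SCHEMA.Book))
    g.add((book, SCHEMA.name, Literal("REST in Practice")))
    print(g.serialize(format="turtle"))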
Andreas Hadjithomas
Design and implementation of a ChatOps Bot using the Hubot Framework
The technology, communications and information industry has been evolving rapidly in recent years. This is due to the fact that meeting most natural needs depends mainly on technological achievements. Even the main sectors of health, industry, nutrition, mass transportation and communication are based on advanced technology products, which require properly designed software in order to operate. The development and maintenance of software is a versatile and complex process, especially when it comes to large-scale software that requires the collaboration of many people and the combination of various services, tools and technologies relevant to its development. Collaboration between teams, constant updates on work progress and the automation of everyday processes are keys to success in software development. This diploma thesis deals with the implementation of a ChatOps bot for chat-driven software development within the group chat tool Slack. Its main goal is to provide development teams with the ability to automate tasks and cooperate in an easier manner. The bot provides users with the ability to manage the services of GitHub, Trello and Jenkins, and to update and exchange information with the rest of the group in common Slack channels about the progress of the work undertaken, without leaving Slack.
Dimitris Niras
Development of a web recorder for automating tests in web applications
Nowadays, spending time on the internet is a daily task and is related to almost every area of human activity. People of all ages and of different educational, social, and economic backgrounds visit a wide range of websites daily for information, entertainment, communication at different levels, and the development of their business activities. As a result, the formation of this new reality has led to the increasing creation of webpages and web applications that aim to attract as many users as possible. The creation of web applications, as well as their continuous maintenance, is a strenuous process, which requires constant monitoring of the changes that take place on them. In order to achieve this, it is necessary to develop a fairly large number of tests, which constantly validate the proper functioning of the website. However, this manual process proves to be extremely time-consuming, since for each service of the application the developer has to create his/her own tests, which should also be changed whenever any of the website's elements change. This diploma thesis aims to contribute to the automation of test creation and the testing process. To this end, a Chrome extension has been developed, which is responsible for "filming" a user's actions on a website, recording all the available information that the user encounters, such as HTML, CSS and JS code, API calls, content, as well as screenshots of the web pages. In addition, a web application was created, which is responsible for the presentation and programming of the various test results. Both the extension and the application were tested on real-world websites, and the results showed the system to be a very useful tool for developers, saving them valuable time.
Dimitrios Tampakis
Design and development of a Conversational Bot for a User personalized web
The Internet has nowadays become an integral part of people's lives. On a daily basis, users consume the services provided by the Internet for professional, recreational and other reasons. Users interact with computers via appropriately designed interfaces (user interfaces) in order to satisfy their needs. User experience (UX) is the most fundamental metric used to assess human-computer interaction. UX is defined as "a person's perceptions and responses that result from the use or anticipated use of a product, system or service". The notion of UX is directly associated with the user him/herself and with every user individually. However, each user is characterized by a different level of knowledge and experience as far as the use of the internet is concerned. The user has his/her own interests and preferences that match his/her personality. With a focus on improving UX and better satisfying the user's needs, the exploitation of information gathered from the user is deemed necessary. Information such as gender, age, demographic characteristics and the content of websites visited by the user can be used to identify the user's interests and create a corresponding internet profile. The main goal of this diploma thesis is to design a system which allows the identification of the user's interests and provides a mechanism to dynamically re-assess these interests. Initially, information is gathered from the user's internet history, via a Chrome extension, and from the interaction of the user with a Messenger bot. This information is used in order to identify the user's interests and create an internet profile. Subsequently, a personalized news feed provided by the Messenger bot enables us to dynamically re-assess the user's internet profile. Within the context of this thesis we present relevant applications and describe the implemented system and its components in detail. In addition, we present results from the system's use by a real user.
Ioannis Agrotis
Design and development of a software quality optimization system using automated correction of coding violations
The ever-growing penetration of the internet into our lives could not leave the software development process unaffected. Broad and easy access to all kinds of information has provided an opportunity for software developers around the world to create a collaborative community for building new software projects, also known as the "open source community". It is now a fact that software development requires systematic source code reuse in order to help create better quality software faster and at a lower cost. However, most of the source code which is located in open source repositories and is available for reuse does not necessarily meet specific quality standards, a fact that makes the development of mechanisms for quality monitoring necessary. Towards this direction, in an effort to model quality, a set of standards has been proposed that decompose software quality into a number of characteristics. Similarly, in an attempt to find common ground between software developers, a number of best coding practices have been proposed to supervise quality at the source code level. For this purpose, static source code analysis tools have been developed that detect and report coding violations; however, they do not offer the ability of automatic correction. In this context we propose the development of an automated code quality improvement system based on the automatic correction of code quality violations, whose primary objective is to be a reliable and useful tool for developers. The first results of applying the system to a series of open source projects lead to the conclusion that our approach can correct a large number of violations and thereby substantially contribute to improving software quality.
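The automatic-correction idea can be pictured with a toy sketch (not the actual tool, and the "violation" shown is an invented example): detect a simple coding violation and rewrite it to the preferred form:

    # Toy sketch: detect 'comparison to None with ==' and rewrite it automatically.
    import re

    source = "if value == None:\n    handle()\n"
    violations = [(m.start(), "comparison to None should use 'is'")
                  for m in re.finditer(r"==\s*None", source)]
    fixed = re.sub(r"==\s*None", "is None", source)

    print(violations)
    print(fixed)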
Ioannis Iakovidis
Applying reinforcement learning for structured prediction
During the last few years, the increased popularity of the internet, the proliferation of embedded computers and the continuously growing research community have generated an explosive increase in the number and size of the available data collections. At the same time, the increase in available computing power and storage and the huge interest in fields that demand a large amount of data have made the analysis of such datasets possible. In practice, though, the exploitation and integration of data from a large number of sources has proved to be a very hard and time-consuming process. Even when working only with data collections that contain similar data, these are seldom in common formats. On the contrary, the greater the variety of data we wish to use, the more effort it requires to bring the data into a common structure. Furthermore, a huge category of data that cannot be utilized easily is that of semi-structured data. This category includes data collections that exhibit a loose structure, such as HTML trees (websites). The exploitation of such data is often prohibitively complicated or even impossible if manual data processing is used. The above reasons make clear the need for the development of flexible algorithms capable of handling data processing and manipulation with limited or even no human help. Even though a variety of artificial intelligence methods have been used to solve the above problem with promising results, there still exists a large margin for improvement. Algorithms that belong to the field of reinforcement learning are especially interesting, since we believe that their structure makes them ideal for the task of processing data of various structures. In this diploma thesis we elaborate on the performance of reinforcement learning algorithms on a variety of problems focused on structured prediction.
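As a generic illustration of the reinforcement learning machinery involved (a toy tabular Q-learning example, far simpler than the structured-prediction tasks examined in the thesis):

    # Toy tabular Q-learning on a 5-state chain; the goal is to learn "always move right".
    import random

    n_states, n_actions, goal = 5, 2, 4        # actions: 0 = left, 1 = right
    Q = [[0.0] * n_actions for _ in range(n_states)]
    alpha, gamma, eps = 0.5, 0.9, 0.5

    for _ in range(300):                       # episodes
        s = 0
        for _ in range(50):                    # step cap per episode
            greedy = max(range(n_actions), key=lambda a: Q[s][a])
            a = random.randrange(n_actions) if random.random() < eps else greedy
            s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
            r = 1.0 if s2 == goal else 0.0
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if s == goal:
                break

    print([max(range(n_actions), key=lambda a: Q[s][a]) for s in range(goal)])  # expected: [1, 1, 1, 1]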
Ioannis Tsafaras
Design and development of an automatic mechanism for Continuous Integration
The progress of cloud computing in recent years has been rapid. Given the advantages that cloud computing offers, it is being used more and more by businesses and, accordingly, there are many providers that offer cloud computing services. Together with the advantages of cloud computing, there are several challenges, for example related to data security. These challenges vary depending on each provider's implementation. An important part of the software development process is Continuous Integration (CI). CI aspires to minimize errors and accelerate the progress of software project development and evolution. Testing is automatically performed through CI systems and, upon successfully running the automated tests, the latest version of the code is automatically delivered to a production or pre-production (staging) environment through Continuous Deployment (CD) and Continuous Delivery (CDE), which are extensions to CI systems. Numerous cloud-based CI implementations are available as-a-Service, but the services provided differ depending on whether the software project is closed or open source, while data (code) security challenges arise, especially for closed source projects. Moreover, the adaptability of these systems to users' requirements is limited. The process of implementing an integrated, customizable, automated CI + CD/CDE system using cloud infrastructure is time-consuming and requires know-how. The subject of this thesis is, after comparing cloud providers, to develop a service for automating the installation, configuration and running of a CI + CD/CDE system. Our approach also integrates static code analysis and evaluation. CI is implemented through Jenkins, an open source software, while static analysis is performed through SonarQube. Automation of the CI creation workflow, as well as of the CD/CDE processes, is performed through the Ansible software configuration management tool. The outcome of the thesis is a user-friendly web interface that enables, after inserting the appropriate variables, the creation of a CI system which is compatible with the cloud infrastructure of multiple providers, as well as with the use of local servers. The product can be used by companies or individual application developers.
Giorgos Karagiannopoulos
Design and Development of a Recommendation System for Extracting Source Code Snippets from Online Sources
The outspread of the Internet has facilitated the search for useful code in online software repositories, therefore fundamentally changing the way software is developed and maintained. Software engineers focus their effort on combining the best examples and interfaces in order to achieve optimal solutions. Nevertheless, even with a huge variety of available choices, developers are often forced to leave their programming environment and resort to search engines in order to find useful code and examples. As a result, their productivity and concentration are reduced. Lately, the research area of Recommendation Systems in Software Engineering has been developed in an attempt to confront these challenges. These are systems that receive queries from the developer and, through data mining techniques, aspire to provide ready-to-use solutions, such as reusable code. In the current literature, there are several systems that receive some form of query and return ready-to-use code snippets. Nevertheless, most of these systems use complex query languages, thus requiring significant effort for properly constructing a query. Furthermore, the presentation of the results is often limited, as the developer is only given a list of snippets, without any grouping and without any further information regarding their quality. In this work, we design and develop a new recommendation system in order to confront the aforementioned challenges. Our system receives queries in natural language and searches for useful snippets in multiple online sources. After that, data mining and machine learning techniques are employed in order to assess and cluster the snippets. The results are assessed both for their usefulness and for their quality (readability), while their presentation allows the developer to easily identify the most suitable implementation. Finally, we evaluate our system on a set of queries to confirm its proper functionality.
Vasilis Remmas
Automatic Build and Deployment of Robotic Microservices at Cloud
Generated data volumes are constantly increasing, dictating the need for more sophisticated algorithms and mathematical models to achieve faster and more accurate processing of this data volume. The execution requirements of these algorithms/models often demand increased computational resources, which entails increased energy consumption and cost. It is evident that, as data continue to grow, executing such processing algorithms on robotic vehicles that do not have the necessary computational power and energy autonomy will be impossible. This diploma thesis focuses on the implementation of a system that aims to offload some robotic vehicle operations to a computer cluster. This way, robots can execute algorithms that, due to computational resource and energy requirements, would otherwise be impossible to run. The proposed system allows developers that do not have robotic programming skills to treat robotic systems under a software-as-a-service prism.
Panagiotis Doxopoulos
Providing robotic web services through a hardware node and interfacing with IoT platforms
The rapid development of technology over the last decade has greatly influenced people's daily habits. Nowadays, due to the development of robotics and the Internet, we are entering the 4th Industrial Revolution (Industry 4.0). The communication and collaboration of Cyber-Physical Systems, including machines and robots, among themselves and with humans, is expected to attract researchers' interest for at least the next decades. A key element of the 4th Industrial Revolution is the Internet of Things (IoT). The idea of IoT initially appeared at Carnegie Mellon University in 1982, where a network of smart objects was envisioned in an effort to connect a soft drink vending machine to the network. Nowadays, IoT is entering a period of rapid growth, with scientists estimating that by 2020 smart objects will reach the number of 50 billion. This diploma thesis presents the development of an IoT system, through which various devices and smart objects come into contact, either with each other or with people. The most important part of the system is a router, crossbar, which allows the connection of smart objects, robots and people to the system. Communication is accomplished using Remote Procedure Calls (RPCs) and Publish/Subscribe (PubSub). The first refers to remote calls offered by a device, while with the PubSub protocol asynchronous messaging between independent nodes on an IoT network is achieved. Specifically, the WAMP and REST over HTTP protocols were used. In the current thesis, connectivity to a NAO robot, a Raspberry Pi (RPi), the REMEDES system and an Arduino embedded device was achieved. An equally important part of the overall system is the implementation of a hardware node which serves robot-oriented web services. This node was implemented on a Raspberry Pi and hosts a server that was created with the help of the Swagger framework tools. The server provides RESTful web services for utilization by robots. The Raspberry Pi is connected to the IoT system, allowing robots to contact the service indirectly; of course, direct communication of robots with the RPi is also possible. Also, a large number of experiments have been carried out, demonstrating the satisfactory operation of the system and drawing useful conclusions. Finally, some applications have been implemented that show the potential of the system.

2016

Ioannis Gkiliris
Emergent programming practices through crowd intelligence
The rapid expansion of the Internet in recent years has undoubtedly affected people's daily lives, as well as the way they carry out their working duties. Access to a vast volume of information is unprecedentedly convenient and fast. Thus, the whole procedure by which software is developed has entered a new era, where collectiveness and collaboration quite profoundly determine the final outcome. Due to this evolvement, substantial research has been carried out in searching and reusing existing, open source code snippets, since platforms like GitHub seem suitable for pursuing such purposes. Collective Intelligence is therefore something tangible for the software engineering field as well, bearing in mind that modern-day programmers receive considerable support in their endeavors. The purpose of this diploma thesis is to develop an open source tool (statLint) that is capable of recognizing the user's programming practices that deviate from those of other developers in well-known open source software projects. Those emerging practices have resulted from the analysis of numerous packages and applications freely available on GitHub. During the initial stage of this process, a suitable system was developed in order to collect and analyze the useful data and, finally, to efficiently provide the summarized knowledge. The statLint package, which is used as a plugin for the Atom IDE, then uses that knowledge in order to evaluate the user's practices and, if necessary, to inform them accordingly. A variety of experiments validated the proper functioning of the system, not only from a practical point of view, but also from an overall perspective. Moreover, a strong correlation was ascertained between the quality of our tool's evaluation capability and that of another, similar one. To conclude, the main point of this work was to provide evidence that new tools that keep up with modern programming conventions can be established successfully. This is quite essential considering that, as with natural languages, programming languages are used by human society, and that alone renders the evolution of current practices inevitable.
Natalia Michailidou
Design and Development of Web Client for RESTful Web Services
The REST architectural style was introduced for the first time by Roy T. Fielding in 2000, in an attempt to generalize the fundamental principles of the web's architecture and to present them as a specific set of constraints. Since then, the REST architectural style has become extremely popular for web service-oriented development and is practically dominant as far as web service development is concerned. The development of web client applications that must consume RESTful web services is, however, limited to web client libraries and involves heavy front-end developer effort to become fully functional. The current thesis aspires to take the first steps towards automating the process of developing the front-end of web client applications. In order to automate the process of generating web clients, MDA principles are applied (introduced by the OMG group). The MDA approach supports the definition of models at different levels of abstraction and thus permits software development based on design objectives related to the problem rather than the underlying computing environment. In this way, acceleration of the software development process and production of software with higher reliability, extensibility and interoperability are pursued. Within the context of this thesis, the Automated Client Engine is designed and developed. It produces web client applications that consume RESTful web services, as generated by S-CASE (http://s-case.github.io). These web client applications manage CRUD (create, read, update and delete) requests as defined at the RESTful service level. In addition, the generated web clients provide authentication features, and embed UI/UX elements and CSS styling. They are developed in AngularJS and HTML and are ready to deploy.
Sotirios Angelis
User Experience evaluation of On-Screen Interaction Techniques and Semantic Content
In the last decade, web application development has gained increased popularity. Web services support computer interaction and data exchange via HTTP, and play an important role in the development of these applications. The predominant choice for web services is RESTful web APIs. These are application programming interfaces that follow the Representational State Transfer (REST) architectural style. The REST architecture became popular because of its simplicity and the ease of processing and extending applications based on it. The increasing demand for and use of REST APIs is followed by a trend towards developing new tools and techniques for their generation and consumption. These tools focus on minimizing the time and cost required for the development of a web application and/or the implementation of interfaces that effectively serve the functionality of the application. In order to be considered successful, REST APIs, like any other software product, have to be characterized by a reasonable ease of use that allows people to handle the functionality of the application effectively. Based on the above, the current thesis deals with the design and development of Interact, a graphical user interface tool that is fully adaptable to the structure of a given REST API. Practically, it produces a complete user interface that does not require any knowledge of front-end development. Interact is compatible with S-CASE (http://s-case.github.io), a software platform that includes, among other things, an automated code generation engine (employing MDA primitives) that generates RESTful Web APIs. Interact is developed in AngularJS. It can be used for testing a REST API, as a website or as a prototype presentation of the API's operations.
Nikiforos Sakkas
Design and Implementation of an interactive system in order to evaluate User Experience (UX) of Interaction Techniques and Semantic Content
Undoubtedly, the Internet has become an integral part of everyday life. As a consequence, a continuous interaction between humans and their computers takes place on an almost daily basis. Human-Computer Interaction (HCI) is the scientific field of information technology that focuses on the interaction between people (users) and computers. It is regarded as the intersection of Computer Science, Cognitive and Social Psychology, Linguistics, Industrial Design and many more disciplines. The interaction between users and computers is implemented at the level of user interfaces, through the appropriate software and hardware implementations. The most important metric of human-computer interaction assessment, as far as the Internet is concerned, is User Experience (UX). User Experience is a holistic metric of the experience that a website or an application offers to the user. UX is composed of and influenced by many factors, some of which are the various interaction techniques and the use of semantic content. Within the context of this thesis we focus on evaluating User Experience related to different interaction techniques and semantic content. More specifically, six similar webpages were developed, differentiated only by the type of interaction technique that they offer to users. Subsequently, six additional webpages were developed, which identify and pinpoint the most significant semantic entities to the user, providing in that way additional material to interact with. Various UX evaluation experiments were performed, and the results are discussed with respect to their significance for UX and to user familiarity with the web.
Georgios Ouzounis
Personalized Automatic Speech Recognition
The goal of this thesis is to increase the usability of electronic devices in everyday life. A step towards this goal is the transformation of the communication interface between a human and an electronic device, in order to further approach the natural communication interface between humans. For this purpose, the Personalized Automatic Speech Recognition (PASR) desktop application has been designed and developed. By using this application, a user can compose e-mails in English by simply dictating them. The proposed methodology comprises two stages. During the first stage, called the Automatic Speech Recognition (ASR) stage, the user's voice is transformed into plain text. This stage makes use of the open source speech recognition toolkit CMU Sphinx. During the second stage, the output of the first stage is syntactically corrected based on a set of existing e-mails that the user has previously provided. This stage is called the Post-Processing stage and makes use of Natural Language Processing (NLP) techniques to alter the ASR output text. Experiments performed on both the ASR and the Post-Processing system indicate that the latter introduces a significant increase in the performance of the whole application. These experiments, along with their results, are discussed in two separate chapters of this document. Finally, the last chapter contains a discussion of future work concerning this application.
Paraskevas Lagakis
Venuetrack: a smart search engine of points of interest in Thessaloniki, with evaluation capabilities based on sentiment analysis of comments
The world wide web has been rapidly expanding over the last decade, and today more than 40% of the world population uses it on a daily basis. Social media have played a very important part in this increase of the internet's popularity, since for many people social media is one of the few, if not the only, reason to go online. As a result of this explosion, large quantities of raw data have been produced, and their analysis is a huge challenge for the scientific community. In this context, sentiment analysis and natural language processing are at the center of the scientific status quo, presenting great interest and vast new opportunities. This is especially true in Greece, where these fields are still evolving at a relatively slow pace. For that reason, this thesis develops a sentiment analysis system using natural language processing methods. The aim of this thesis is to apply sentiment analysis in order to evaluate points of interest (or venues) in the city of Thessaloniki, by evaluating users' comments. These comments are categorized as positive or negative by a classifier that was developed and trained using a relevant dataset. By using the classifier to evaluate each venue's comments, we then decide whether each venue offers a positive or negative experience to the visitor. The results of this NLP system are presented in a web application named Venuetrack. Venuetrack is a smart and easy-to-use search engine for venues in the city of Thessaloniki, in which users can search for points of interest on the map of Thessaloniki and view their information, as well as the classification produced by the NLP classifier.
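A minimal version of such a comment classifier can be sketched with scikit-learn (the training comments below are invented English stand-ins for the thesis' Greek-language dataset):

    # Minimal sketch: tf-idf features + logistic regression for comment polarity.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    comments = ["great food and service", "terrible wait, rude staff",
                "lovely view, will return", "overpriced and noisy"]
    labels = [1, 0, 1, 0]      # 1 = positive, 0 = negative

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(comments, labels)
    print(clf.predict(["friendly staff and great view"]))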
Odysseas Doumas
Design of a Platform Specific Model of RESTful Web Services and automatic code generation from this model
In the last decade, the REST architectural style has dramatically changed the way web services are developed. RESTful Web APIs have conquered the programmable web, due to the simplicity and flexibility they provide, with a resulting rise in the demand for Web API development. This demand has given birth to several development frameworks that allow rapid development and aspire to automate parts of the development process. However, most of these frameworks fail to generate ready-to-run applications, while the end program is usually not REST compliant. With this rise in demand in mind, this diploma thesis examines the MDA architecture, an OMG initiative. MDA falls into the category of Model Driven Engineering techniques, whose main characteristic is the systematic use of abstract models and model transformations as active parts of the development process. MDA comprises a set of strict standards and tools, and describes a development process where an initial abstract model is designed, a sequence of model transformations takes place and, finally, a fully functional application is generated. The MDA architecture promises a boost in productivity, an improvement in understandability and in the communication between the different members involved in the development process, and also an improvement in the reliability, quality, extensibility and interoperability of the developed software. Initially, the present thesis offers a thorough exploration of the basic concepts and philosophy of MDA, alongside a brief explanation of the REST architectural style and its principles. Afterwards, a development tool is designed and developed, which incorporates the MDA architecture and can be used for the development of RESTful Web APIs. This tool was designed to be compatible with S-CASE, a software project that includes, among others, an automatic code generation engine following the MDA architecture that generates RESTful Web APIs running on the Java environment. The tool developed in this thesis essentially forks the functionality of the S-CASE engine, in the sense that it accepts as input a PIM, an abstract, platform-independent model that describes the functionality of a RESTful Web Service, generated by the S-CASE engine, and it generates a RESTful Web API described by that PIM, but designed and implemented to run on Microsoft's .NET platform.
Konstantinos Sideris
Developing a web application for static analysis of software repositories
In recent years there has been a rapid expansion of the JavaScript ecosystem. Both the establishment of the internet as a development platform for web applications and the Node.js platform, which offered the opportunity to use the language outside the browser for the development of any kind of application, have contributed to this fact. Node's success led to the creation of the first package manager for JavaScript, which now hosts tens of thousands of JavaScript packages. The accumulation of a large number of software packages in a relatively short time has made the selection and detection of valuable packages difficult. The large volume of information and data available both in repositories and in source code could be utilized for the improvement and evaluation of the available software. This thesis deals with the development of a web application (npm-miner) which aims to enhance the software package selection process through the use of quality metrics for assessment. The user has the ability to search and compare software, as well as explore statistics on ecosystem quality.
Klearchos Thomopoulos
QualBoa: A Source Code Recommendation System using Software Reusability Metrics
Undoubtedly, the digital era could be characterized by the widespread adoption of the Internet, which has facilitated information sharing. A reasonable problem that arises is the efficient exploitation of this information, part of which refers to software that can be widely found in open source software repositories. All this information could be of use to software developers in order to support software reuse. To effectively exploit online source code information, alongside conventional search engines, Code Search Engines (CSEs) have been developed. However, these are not adequate for completely addressing the problem of finding reusable source code, as it is not possible to adequately describe the user's query, and moreover they cannot guarantee the functionality and the reusability of the retrieved results. As a result, more sophisticated systems were developed, named Recommendation Systems in Software Engineering (RSSEs). These systems aspire to automate the extraction of the query from the code of the developer and evaluate the functionality of the retrieved results. However, current systems do not consider the non-functional characteristics of source code components, which essentially refer to their reusability. The incapability of these systems to address the problem led us to the development of an RSSE that covers both the functional and the quality aspects of software component reuse. Our system employs the Boa language and infrastructure, which comprises processed information of software repositories accessed using a query language. Specifically, our system first extracts the query from the source code of the developer and translates it to a query for Boa, in order to find relevant results. Moreover, our system extracts quality metrics and uses them in a model to measure the reusability of each retrieved component. Thus, upon retrieving components, our system provides a ranking that involves not only functional matching to the query, but also a reusability score. Based on the evaluation, the system yields satisfactory outcomes, both in terms of quality and of accuracy. It is safe to conclude that our system can be effectively used for recommending reusable source code components.
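The ranking step described above, combining a functional match score with a metric-based reusability score, can be illustrated with the toy sketch below. The weighting scheme, metric names and the linear reusability model are assumptions for illustration, not the thesis' actual model.

```python
# Illustrative sketch of QualBoa-style ranking: each retrieved component gets
# a functional-match score (similarity to the query) and a reusability score
# (derived from quality metrics); the final ranking combines the two.
# The weights and the metric-based reusability model are assumptions.

def reusability(metrics, weights=None):
    """Toy reusability model: weighted sum of normalised metric values in [0, 1]."""
    weights = weights or {m: 1.0 / len(metrics) for m in metrics}
    return sum(weights[m] * v for m, v in metrics.items())

def rank(components, alpha=0.6):
    """Combine functional match and reusability; alpha is a hypothetical weight."""
    scored = [(c["name"],
               alpha * c["match"] + (1 - alpha) * reusability(c["metrics"]))
              for c in components]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Hypothetical retrieved components with normalised metric values.
candidates = [
    {"name": "StackImpl", "match": 0.92,
     "metrics": {"documentation": 0.4, "low_complexity": 0.5, "cohesion": 0.6}},
    {"name": "ArrayStack", "match": 0.85,
     "metrics": {"documentation": 0.9, "low_complexity": 0.8, "cohesion": 0.9}},
]

for name, score in rank(candidates):
    print(f"{name}: {score:.3f}")
```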
Ioannis Likartsis
Planning and monitoring operations in robotic missions
Mobile robots are increasingly used to accomplish critical goals where human presence is not considered safe. Such goals include search and rescue missions, surveillance and reconnaissance of dangerous areas, space missions, etc. One more advantage that derives from the use of robots in such conditions is the dramatic reduction in cost (far less strict security measures). Human-robot cooperation enables the utilization of the best qualities of each: the robot's ability to perform extremely fast calculations and make educated decisions, provided it can gather enough information, and the human's ability to conceive situations and take actions with limited data. To achieve the latter when the robot is in the field of the mission, a system for human-robot communication is required. The field that studies such systems is Human-Robot Interaction (HRI). The goal of HRI is to optimize human-robot cooperation to carry out missions across a range of possibilities, from full autonomy to teleoperation. The problem studied in this thesis is that of developing a graphical user interface (GUI) that allows the operators on the ground to control and monitor the tasks of robotic assets located in the field of the mission. To this end the application GRASP was developed. GRASP allows the accomplishment of missions using remote robotic assets. GRASP is designed in a way that it can control various types of robots (rovers, UAVs, arms, etc.). Furthermore, it is extendable when it comes to changes in the software of the controlled robot. The functionalities mentioned above are achieved using configuration files, which also allow GRASP to adapt to potential changes of the on-board computer. Through GRASP the operator has the ability to create mission plans, send them to the robot and monitor their execution. GRASP helps the operator to carry out the necessary tasks, providing situational awareness (conditions and environment) by utilizing the incoming telemetry messages. The type and quality of situational awareness provided to the operator depend completely on the sensors (stereo/thermal cameras, 2D/3D maps, etc.) and the software of each robot (object recognition, digital elevation maps, etc.).
Miltiadis Siavvas
Design and development of a framework for the evaluation of software project quality based on static analysis and fuzzy multi-criteria decision-making techniques
Our era is characterized by rapid technological development and the continuous digitization of information. Software products are continuously being developed in order to help people achieve their goals more easily, faster and more efficiently. This raises the issue of software quality as a major concern both for end users and for software development companies that wish to offer their customers high-quality services. A lot of research has been carried out in recent years in order to design and develop a universally accepted mechanism for software quality assessment. However, no efficient generic model exists. This is why contemporary research is now focused on seeking mechanisms able to produce quality models that are easily adjusted to custom needs. Within the context of this diploma thesis we focused on the design and development of a system that enables the quality assessment of software products according to a particular set of design aspects. In order to achieve this, a tool chain was developed that allows the production of quality models by applying static analysis to a desired benchmark repository and then using these models to assess the quality of software products written in Java. Multithreading is applied to accelerate the time-consuming process of static analysis, while fuzzy multi-criteria decision-making techniques are adopted in order to model the uncertainty imposed by human judgement. The system produces a carefully calibrated and reliable quality model, the base model, which is utilized to verify the system and serves as a guide for implementing further similar quality models. Finally, an online service was designed and developed that offers quality assessment of open-source software products hosted on GitHub, with the ultimate goal of becoming a reliable code quality certification service. The experiments performed ascertained proper system operation and the independence of the models with regards to the size of the product under assessment. By assessing both automatically generated and user-developed software products, the contribution of quality models towards the improvement of software product quality was highlighted. Afterwards, a comparison between the serial and the parallel implementation was made, leading to the conclusion that the parallel implementation greatly accelerates the static analysis process. Finally, a comparison of the fuzzy weight generation technique with its deterministic counterpart showed a close correlation between the results of the two methods.
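As a rough illustration of how fuzzy multi-criteria weighting can feed a quality score, the sketch below defuzzifies triangular fuzzy weights and aggregates normalised static-analysis values. The properties, weights and aggregation formula are hypothetical and only hint at the general idea.

```python
# Illustrative sketch of fuzzy weighting: each quality property gets a
# triangular fuzzy weight (low, mode, high) elicited from expert judgement,
# which is defuzzified (centroid of the triangle) before aggregating
# normalised static-analysis values into a single quality score.
# The properties, weights and metric values below are hypothetical.

def defuzzify(tri):
    """Centroid of a triangular fuzzy number (a, b, c)."""
    a, b, c = tri
    return (a + b + c) / 3.0

def quality_score(metric_values, fuzzy_weights):
    crisp = {p: defuzzify(w) for p, w in fuzzy_weights.items()}
    total = sum(crisp.values())
    return sum(crisp[p] / total * metric_values[p] for p in metric_values)

fuzzy_weights = {          # hypothetical expert-elicited triangular weights
    "complexity":    (0.2, 0.3, 0.4),
    "documentation": (0.1, 0.2, 0.3),
    "coupling":      (0.3, 0.4, 0.5),
}
metric_values = {          # normalised static-analysis results in [0, 1]
    "complexity": 0.7, "documentation": 0.5, "coupling": 0.8,
}

print(f"quality score: {quality_score(metric_values, fuzzy_weights):.3f}")
```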
Christos Zalidis
Augmenting perception for unmanned ground vehicles for efficient exploration and navigation in rough terrain
Robotics is currently one of the most rapidly evolving scientific areas, where significant advances in the development of autonomous robotic agents have enabled the execution of complex tasks that were not possible before. A fundamental feature robotic agents need to exhibit, in order to interact with their surrounding environment, is perception. This thesis tackles the problem of modeling and representing an environment which is not known in advance and consists of uneven surfaces and rough terrain. Specifically, we are interested in unmanned ground vehicles that use the representation of the environment for efficient navigation and exploration. We developed a unified system that performs the task of robot localization in three-dimensional space, uses elevation maps to represent the environment, extracts traversability features from that representation and finally performs autonomous and safe navigation. Full three-dimensional robot localization is achieved through the combination of various state estimation algorithms and raw sensor data, using an extended Kalman filter. Building upon an onboard range measurement sensor and an existing robot pose estimate, we formulate a novel elevation mapping method from a robot-centric perspective. This formulation can explicitly handle the drift of the robot pose estimate which occurs for many autonomous robots. Additionally, we extract terrain features useful for navigation and path planning by performing traversability analysis based on the elevation map representation, which leads to a new representation of the environment, traversability maps. Moreover, we extend the representation of classic elevation maps, adding the ability to model environments that contain multiple overlapping structures, such as bridges, underpasses and buildings. This new representation leads to the development of multi-level elevation-surface maps. The proposed architecture builds upon the navigation module of ROS (Robot Operating System), one of the most popular navigation frameworks in the robotics community. The representation used by this module is not sufficient for navigating in environments that contain uneven surfaces or generally rough terrain. Therefore, using the above-mentioned representations we extend the ROS navigation system, adding new capabilities and enabling navigation in such environments. Finally, extensive experiments examine and evaluate the proposed method's performance in diverse environments.
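The elevation-map idea can be illustrated with the minimal grid-update sketch below, which fuses height measurements into cells with a scalar Kalman-style update. Cell size, variances and measurements are placeholders; the robot-centric drift handling described in the thesis is not reproduced here.

```python
# Minimal sketch of a grid-based elevation map update: each range measurement
# (x, y, z) in the map frame updates the height estimate of its grid cell with
# a simple variance-weighted (Kalman-style) fusion. All values are placeholders.
import numpy as np

CELL = 0.25                                   # cell size in metres
heights, variances = {}, {}                   # sparse map: cell index -> estimate

def update(x, y, z, meas_var=0.01):
    idx = (int(np.floor(x / CELL)), int(np.floor(y / CELL)))
    if idx not in heights:
        heights[idx], variances[idx] = z, meas_var
        return
    h, v = heights[idx], variances[idx]
    k = v / (v + meas_var)                    # Kalman gain for a scalar state
    heights[idx] = h + k * (z - h)
    variances[idx] = (1 - k) * v

for x, y, z in [(1.0, 2.0, 0.10), (1.1, 2.05, 0.12), (3.0, 0.5, 0.40)]:
    update(x, y, z)
print(heights)
```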

2015

Vasileios Lolis
Design and implementation of an Android application for the sentiment and category analysis of user web browsing content
Mobile devices have become a part of our lives and offer us the opportunity to access the Internet anywhere; we browse the internet more and more, whenever and wherever. Companies such as Google gather information about us and, with proper processing, can create a complete profile of who we really are and what we like to do. When a user browses the web through his/her Android device, Google collects the history of the URLs he/she visits. These data are usually stored locally as well as on the Google Cloud. Within the context of this diploma thesis a review of the REST architectural style is initially performed, as well as a review of current RESTful web services that offer sentiment and category text analysis. The thesis introduces C.H.A.T. (Chrome History Analysis Tool), a mobile Android app that extracts the browsing history data from Google Chrome on a user's mobile device, employs a web service for the sentiment and category analysis and stores the results in a remote database, while also ensuring user anonymity and the safety of personal information. C.H.A.T. provides the user with diagrams, correlations, statistics and pictures in a user-friendly manner, also enabling him/her to choose specific time periods for the analysis. Conclusions and future work are discussed at the end of the thesis.
Alexandra Baltzi
Applying Test-Driven Development and Code Transformation Techniques to improve Code Reuse
Undoubtedly, the digital era has contributed to the easier transmission of information through the widespread adoption of the internet. The proper exploitation of this information is a difficult challenge. An interesting form of information is the source code provided in open source software repositories. As a result, a new objective is the exploitation of existing software by developers, or of the knowledge resulting from it, in order to create new software. Thus, alongside conventional search engines, Code Search Engines (CSEs) were also developed to extract code from open source software repositories. However, these are not always sufficient to address the problem, as they fail to adequately describe the query of the developer and cannot guarantee that the results returned are indeed functional. Later on, more sophisticated systems were developed, namely Recommendation Systems in Software Engineering (RSSEs) and, more particularly, those which make use of Test-Driven Development. Although these systems support the creation of complex queries by the user, they often have no way of controlling the utility and functionality of the final result, while most of them are no longer operational. The incapability of these systems to address the problem has led us to develop our own RSSE, which uses a dynamically renewable repository for code search. Initially our system extracts the query from the developer's code, and then it searches in the CSE AGORA and applies a mining model on the retrieved results. Furthermore, our system applies transformations on these results in order for them to match the original query, therefore providing more useful and functional results in comparison with other test-driven RSSEs. Finally, our system provides the user with information about each result, regarding its relevance to the user's initial query, its complexity and its functionality. The comparison of our system with other known RSSEs shows that our results are satisfactory in terms of quality and accuracy, and that the system remains efficient as far as response time is concerned. Additionally, the code transformations performed by our system further improve the results.
Michail Papamichail
Design and development of a source code quality estimation system using static analysis metrics and machine learning techniques
The most representative description of today's age in one phrase would be “the information age”. In contrast to previous ages, when access to information was extremely difficult, time consuming or in many cases even impossible, today, due to the evolution of technology and the spread of the internet, information is just a few clicks away. This fact is present in every aspect of our everyday life and, of course, the software development process could not be left behind. The exploitation of numerous open source software projects, both from software repositories and through search engines or specialized code retrieval systems, facilitates the process of software development. However, code reuse is beneficial as long as the reused components meet the requirements of the developer's project and fulfill certain quality standards. Searching in a software repository, one may notice that hundreds if not thousands of results are retrieved for a query. This vastness of retrieved code raises the question: “How can one choose the highest quality source code results to reuse?” The contribution of this diploma thesis lies in answering the above question in a reliable manner by proposing a source code quality estimation mechanism that relies on static analysis metrics. For this purpose, a system has been designed with the primary goal of estimating source code quality based on static analysis metrics. To this end, the system uses two models, a one-class SVM classifier and a Neural Network model. The former is used to determine whether the examined files meet a fundamental quality threshold, while the latter provides a score for the quality of the files. These two models were trained using 24,930 Java source code files included in the 100 most popular GitHub repositories. Finally, upon successfully evaluating our system for ranking new files, we conclude that it can be a valuable asset for the developer.
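The two-stage scheme described above, a one-class SVM gate followed by a neural network scorer, can be sketched as follows; the feature vectors and target scores are synthetic stand-ins for the static-analysis metrics of the actual training files.

```python
# Illustrative two-stage sketch following the abstract's description: a
# one-class SVM filters out files that do not meet a baseline quality
# threshold, and a neural network then scores the remaining files.
# The static-analysis feature vectors and target scores are synthetic.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))          # metric vectors of "good" files
y_train = rng.uniform(0, 1, size=200)        # hypothetical quality scores

gate = OneClassSVM(nu=0.1, gamma="scale").fit(X_train)
scorer = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                      random_state=0).fit(X_train, y_train)

X_new = rng.normal(size=(5, 5))              # metrics of files to be ranked
accepted = gate.predict(X_new) == 1          # +1 means the file passes the gate
scores = np.where(accepted, scorer.predict(X_new), 0.0)
print(scores)
```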
Konstantinos Papangelou
Personalizing web search results by incorporating user behavior and semantic data
The most important step towards understanding and satisfying web users' needs is the analysis of their behavior and the use of the data they provide in order to implement personalized services. Several types of information, such as the gender, age and location of users as well as the webpages they visit, can be used by a plethora of web applications in order to identify users' interests and provide them with better services. These kinds of applications are already part of the web, with the most prominent example being the personalized search offered by some commercial search engines. The main goal of this diploma thesis is to present a complete method that identifies users' interests based on their browsing history. For this purpose we have implemented two systems. The first system creates profiles relevant to various domains, while the second one assigns these profiles to users based on the content of the webpages they visit. In particular, for the first system, we collect webpages relevant to some subject (e.g. a music genre) using the search API of a commercial search engine and we perform thematic analysis on them using Latent Dirichlet Allocation. We use the results of LDA in order to find the most dominant topics and, for each one of them, the most probable words. We use these words to form a vocabulary relevant to the corresponding subject. We are also interested in forming profiles that describe the user's level of expertise in each subject. For the second system, we extract the user's browsing history and, for each webpage-profile pair, we calculate a score based on the number of matching words. To further improve our scoring system we include a measure that captures the semantic similarity between webpages and profiles. Finally, for every webpage we find the profile that has the maximum score, and the set of resulting profiles is assigned to the user. Within the context of the thesis we present relevant applications and describe the implemented systems. We also present results of the first system in two popular domains, music and sports, as well as an example of a user's browsing history analysis. The results are promising and allow us to draw some conclusions.
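The profile-building step can be illustrated with the following sketch, which runs LDA over a handful of placeholder pages for one subject, keeps the most probable topic words as that subject's vocabulary, and scores a visited page by word matches. Parameters and documents are assumptions for illustration.

```python
# Illustrative sketch of the profile-building step: run LDA over webpages
# collected for one subject, keep the most probable words of the dominant
# topics as that subject's vocabulary, and score a visited page by counting
# matching words. The documents below are placeholders for real page text.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

subject_pages = [
    "guitar riff album rock band concert",
    "band tour album rock guitar solo",
    "concert tickets rock festival lineup",
]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(subject_pages)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vec.get_feature_names_out()
vocabulary = set()
for topic in lda.components_:                # keep the top words of each topic
    vocabulary.update(terms[i] for i in topic.argsort()[-5:])

visited_page = "watched a rock band play a guitar solo at the concert"
score = sum(1 for w in visited_page.split() if w in vocabulary)
print(sorted(vocabulary), "score:", score)
```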
Antonis Noutsos
DPDHMMY: A Design Pattern Detection tool
Currently, designing and developing software has grown to be a demanding task. The ultimate goal of the software development process includes not only satisfying the desired functionality of a software product, but also various, sometimes critical, non-functional requirements. Extensibility, robustness, usability, maintainability, portability, testability and reusability are concepts that developers have to take into account during the development of their project. A methodology that ensures properly structured and good quality code is the application of various code design patterns, which are acknowledged for their added value. In this context, design pattern detection can improve the understanding of code architectures, while it can also be an asset in applying patterns to existing code, while ensuring the satisfaction of non-functional criteria. This thesis proposes a new way of representing well-known, but also custom, design patterns. The methodology builds upon the connections among classes inside a project, in order to match them to predefined design pattern structures. Following the above methodology, we designed and developed DPDHMMY (pronounced dee-pee-dee-mee), a user-friendly application for design pattern detection. One of the main advantages of DPDHMMY is the capability of detecting patterns even in non-compilable code. Using DPDHMMY, design patterns can be detected in incomplete code or code with errors, so that the programmer can fix and improve it. Thus, the tool can be used when structuring the source code of an application (e.g. during the definition of interfaces) in order to determine whether proper design patterns are taken into consideration. Furthermore, it provides users with the option to define their own design patterns, thus promoting high extensibility.
Ioannis Antoniadis
Interactive Question Answering using Topic Models
Bridging the gap between humans and machines in the scope of information retrieval has always been a challenging task. Search Engine Optimization (SEO) has made a lot of progress to that end, but the gap still remains wide. Search engines are incapable of capturing the content semantics of either the information resources or the user's query. Question Answering systems were proposed a couple of decades ago in order to cope with this challenge and a lot of knowledge has come to light since then. QA systems attempt to capture the semantics of a user's question and provide a specific, suitable answer. Many different Natural Language Processing (NLP) techniques, such as linguistic and probabilistic techniques, have been incorporated into Question Answering with success. The main focus of this thesis is the proposal of a Question Answering mechanism that aims at providing improved answers to user queries. The proposed mechanism incorporates content semantic analysis and probabilistic topic modelling techniques to capture the latent thematic structure of the document collection from which the answer is derived. The evaluation process includes a comparison of the proposed, topic-based ranking mechanism with a standard search engine ranking mechanism and demonstrates its validity.
Emmanouil Krasanakis
Automatic Code Generation
This diploma thesis aims to bridge the gap between logical model generation and second-order logic. In particular, it develops methods to manipulate programmatically equivalent logical models. After developing the necessary mathematical tools, we delve deeper into that area, attempting to replace parts of a given model with ones from a set I without losing programmatic equivalence. This process is given in the form of an algorithm that tries to minimize a certain quantity. If the set I contains only programmatically implementable models, we effectively approximate the implementation of the given model. Afterwards we develop methods for comparing (and thus interpreting) loosely-defined model descriptions, as well as importing already existing Python libraries to generate the mapping between comments and their implementation. The end result is the automatic generation of code for a given problem by replacing similar comments with their implementation. Finally, after discussing areas for future development, we present a fully developed environment that implements all the developed algorithms.
Alaoui Tzamali Zakia
Genome Data Analysis by Computational Intelligence Methods and Applications in R
Advances in gene profiling technologies, heralded by the completion of the human genome, have revolutionized the field of molecular biology by producing large amounts of genetic data that require powerful bioinformatics tools for a meaningful interpretation of the genetic abnormalities that occur in a specific disease state. In recent years, a widely used technique for gene profiling has been the Affymetrix microarray technology, which enables the study of the expression of thousands of genes simultaneously in a single experiment, creating a huge set of data. In this context, the laboratory of our collaborator Prof. Moulay Jamali (Faculty of Medicine, McGill University, Canada) has used this technology to investigate genes that are selectively regulated by the cooperation between the overexpression of an oncogenic receptor called ErbB2 and a tumor suppressor gene called p53. This was achieved by overexpression of ErbB2 in colorectal cancer cells deficient or proficient for p53. Genomic data was generated using the Affymetrix method. My thesis work, which focuses on analyzing this novel gene expression data using clustering methods, has revealed novel biological knowledge relative to gene regulation in these cell models. In particular, the clustering algorithms presented are K-means, SOM (Self-Organizing Map) and, finally, SOTA (Self-Organizing Tree Algorithm), an algorithm that manages to automatically determine the optimal number of clusters. From the results of the clustering, differentially expressed genes were identified in each comparison. These candidate genes have great potential for understanding key mechanisms and functions that may contribute to disease development and progression, in relation to the cooperation between ErbB2 and p53.

2014

Georgios Voulgarakis
Simulator software capable of simulating various aspects of Beam Position Monitors
The topic of this thesis is the development of simulator software capable of simulating various aspects of Beam Position Monitors. The software is intended to be used for educational purposes in the CERN Accelerator Schools. It is capable of simulating both the Beam Position Monitors and the signal processing electronics which follow them, thus allowing the user to define and simulate his or her own BPM processing circuitry. The simulator has been developed in MATLAB, due to the ease of coding, the ability to easily make changes, and the speed of array operations, a general advantage of interpreted languages.
Nikolaos Katirtzis
Mining Software Repositories for Test-Driven Reuse
The digital age, fairly characterized as the information age, has brought about significant changes in everyday life. The widespread adoption of the internet has facilitated information sharing, and the question that now arises is how we can exploit it. Open source software repositories provide an interesting kind of information waiting to be exploited. This kind of information could specifically be of use to developers in order to support software reuse. Since traditional search engines cannot solve this task, more specialized search engines, namely Code Search Engines (CSEs), have emerged. However, they also fail to address the problem, as it is not possible for them to adequately describe the user's query, due to its complex structure, and moreover they cannot guarantee that the results are indeed functional. A more recent approach to the problem are the so-called Recommendation Systems for Software Engineering (RSSEs), and particularly those that make use of Test-Driven Development (TDD). Such systems allow a better description of the developer's queries, while they sometimes also check the functionality of the results. However, most of them do not make use of dynamic software repositories, their results are of poor quality and their response time is not satisfactory. The failure of existing systems to address the problem led us to the development of our own RSSE, which allows code searching in growing repositories. A CSE named AGORA, or a subsystem that uses the CSE Searchcode and enables code searching in GitHub, can be used to search for available code. Our system constructs the user's query by extracting it from the user's code, and uses the Vector Space Model (VSM) to compare the query with the results. Various techniques from the areas of Information Retrieval (IR) and Natural Language Processing (NLP) have been employed in order to make this comparison as effective as possible. The user is informed about the relevance of the results, their complexity and their functionality. The comparison between our system and some popular CSEs shows that its results are satisfactory in terms of quality and accuracy, while the integration of the CSEs AGORA and GitHub is considered successful. Also, the comparison between our system and some popular RSSE systems proves once more its effectiveness in terms of the quality of the results, as well as its stability in response time.
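The Vector Space Model comparison mentioned above can be sketched as follows: the query extracted from the developer's code and the retrieved results are embedded as TF-IDF vectors and ranked by cosine similarity. The identifier strings below are hypothetical stand-ins for real extracted signatures.

```python
# Illustrative sketch of the VSM comparison step between a query and results.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

query = "stack push pop peek integer"
results = [
    "stack implementation push pop peek size",
    "queue enqueue dequeue linked list",
    "generic stack push pop array resize",
]

tfidf = TfidfVectorizer()
matrix = tfidf.fit_transform([query] + results)
sims = cosine_similarity(matrix[0:1], matrix[1:]).ravel()

for doc, sim in sorted(zip(results, sims), key=lambda p: p[1], reverse=True):
    print(f"{sim:.3f}  {doc}")
```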
Chariton Karamitas
Synapse
With the advances in computer technology and telecommunications, computer systems, which were once machines designed to carry out simple mathematical operations, have become composite multiuser systems implemented to execute several multithreaded scientific applications in parallel. This advancement has also resulted in a simultaneous increase in the number of threats that a computer system is exposed to, as a result of either physical or remote access to it. Such a hostile environment raises concerns about the essential problem of recognizing delinquent behaviors on a computer system and computing the set of subsystems that may have been affected by malicious actions. In an attempt to deal with this problem, government organizations and secret services from the US, France and Germany have published a series of standards regarding the design, development and evaluation of the security features of computer systems designed for use in critical state infrastructure. The aforementioned standards, among others, require the implementation of advanced mechanisms for monitoring the actions taking place on such computer systems. This diploma thesis deals with the aforementioned mechanisms, focuses on OpenBSM, evaluates its capabilities and proposes modifications that improve its functionality. Last but not least, this thesis presents Synapse, a tool that, using OpenBSM, is able to determine forms of communication between applications running in parallel on a computer system. For a system administrator who knows that a certain application has been compromised, Synapse is a powerful tool that can aid in detecting the set of further applications that may have been affected by the compromise in question.
Evagellos Karvounis
Optimizing the performance of a Web Crawling mechanism based on semantic content
It is common knowledge that surfing the World Wide Web has become a daily activity for almost everyone. Each day Internet users continuously interact with websites. Extracting valuable observations and understanding the real link between the user and the website has thus become an important research question. In order to capture this information, Search Engines depend upon Web Crawlers that traverse the Web by following URLs and their hyperlinks. Then, they process and store the website content in repositories that can later be indexed for more efficient execution of user queries. The evolution of the Web and the development of related frameworks and standards have helped make data machine-understandable, leading to Web 3.0 (aka the Semantic Web). Apart from other facets, the transition to the Semantic Web dictates the development of Web Crawlers that can handle and process semantic information. Within the context of this diploma thesis we have focused on optimizing the performance of an existing Web Crawler, Apache Nutch, so that it optimally handles semantic data. The enhanced Web Crawler, SpiTag, traverses the Web focusing on the semantic content of the webpages and aims at finding a more efficient traversal path than Nutch in terms of semantic content. Experiments and results show that SpiTag indeed performs better than Nutch and acquires more, and better, semantic information.
Elli Kasparidou
A web-based tool for visualization and analysis of social network data
Feeding social network platforms with data is a daily routine for millions of people all over the world, and at the same time one of the major means of reproducing any kind of information. The wealth of information hidden in this massive volume of produced data is an asset to anyone who understands how to process it. The main goal is to produce structures able to semantically represent the users' network; then, the update of information becomes easier. Another issue of concern is the identification of appropriate methods for representing the respective networks in a user-friendly way. So far, a multitude of applications have been designed and developed in an effort to provide solutions to the visualization problem of networks (in general, but also of social networks in specific), each one with specific advantages and drawbacks. This dissertation aims at developing a web-based visualization system for social networks, capable of updating its data automatically and exporting the graph structures, while allowing the end user to interact with the network. Through the developed tool the end user is able to interact with a hierarchically structured network and reform it in terms of semantic significance. Obviously, constraints related to graph size and system response have been taken into consideration during the design and development of the presented tool.
Stylianos Moschoglou
Development of a multi-agent platform of a stock market for the exploration of the causes of financial crashes
Advances in technology and increased Internet penetration have simplified access to stock markets and, especially in recent years, have increased the number of people willing to invest in them. Stock markets once consisted exclusively of stock brokers; this, however, has changed radically and, as a result, the investor mix has expanded demographically as well as qualitatively, including a wide spectrum of social groups varying in education, social behaviors, etc. This diversity in stock markets has led to deviations from the old classical models that economists employed in order to analyze them. As a result, a huge interest arose in finding new models to replace the old ones. One of the main targets of these new models was to represent reality in a better way by eliminating the problems generated by the complexity of the stock markets. In order for this to be achieved, there was interdisciplinary cooperation with scientific fields beyond behavioral finance, such as software engineering and applied mathematics. The outcome of this interdisciplinary collaboration was that stock market models, mainly from the late 90's, were developed as multi-agent platforms. These multi-agent platforms, specially designed software to simulate a wide range of emergent social and scientific phenomena, contributed profoundly to the optimization of stock market modeling. Researchers, with the assistance of multi-agent platforms and based on the theory of behavioral finance, were able to construct models that embed and simulate different social groups with a wide range of behaviors and strategies. Within the context of the current diploma thesis we present a stock market modeling and simulation multi-agent platform, which is based on the original SimStockExchange model. Our platform includes a plethora of different behaviors and possesses all the necessary mechanisms to validate the outputs of our modified model against those of the original one. We focus our attention on the conditions which, when fulfilled, could trigger a financial crash in the stock markets. More precisely, we study different types of behaviors among investors, so as to figure out which specific types could precipitate a financial crash. Finally, we present a mechanism with which wealthy and experienced investors in the stock market could impose a financial crash. Results generated from simulation correspond quite well with those from real stock markets. There are many potential extensions of the model and some of them are mentioned in the penultimate chapter.
Michail Karasavvas
Fault Detection on Sensor Systems Based on Adaptive Outliers Detection Techniques
Anomaly detection is a very important aspect of contemporary systems, since an increasing number of human tasks, from simple everyday activities to complex business and industry workflows, are becoming more and more automated. In this context, the detection of anomalies in the behavior of systems is an intriguing research topic, where researchers are striving to reduce system "failures" to a minimum, thus requiring as little human intervention as possible. Through proper design, a system should operate with the highest possible accuracy. Within the context of the current thesis, the data generated by real-life operating air-conditioning units are analyzed and explored. The data comprise mainly measurements from sensor-based systems. The conclusions derived during the analysis are then used for the construction of machine learning models. The aim of the machine learning models is the prediction and detection of the unit's faulty behaviors. The methodology and the techniques employed here could be applied to any sensor system for fault detection analysis.
Stylianos Tsampas
Vulnerable System and Wargame
The increasing dependence on technology has pushed humanity to invest more and more resources in the field of Computer Security. This field evolves rapidly, with new technologies, defensive or offensive, being introduced at a very high frequency. This constant flow of new technology means that the field is in an ever-changing and unstable state. Systems thought to be adequately secured may become insecure in an instant. One of the most interesting approaches in Computer Security is Intrusion Detection Systems (IDS). Their purpose is the analysis of incoming inputs to a computer system or network and their subsequent classification as malicious or safe. IDS are, at least theoretically, very powerful defense mechanisms, since they can detect attacks regardless of their type. An Intrusion Detection System's detection scheme might be based either on heuristics or on training using a representative set of data (dataset). As a result, in the second case, the dataset must resemble real-life network traffic found in the wild and, in the case of re-training, a new one should be easy to produce. An environment which simulates realistic attacking conditions eases the collection of attack data and as a result eases the training of an IDS. However, related efforts so far have been hindered by the lack of virtualization technology. The current Diploma Thesis focuses on the creation of a simple, easy-to-use, easy-to-install and vulnerable system/framework for the simulation of realistic attacking conditions. At the same time, it focuses on implementing a set of realistic attacks exploiting the vulnerabilities present. The result is a complete system, capable of simulating real-life networks, which combines state-of-the-art vulnerabilities and attacks with ease of extension and modification. Finally, it can also be used for educational purposes.
Vasiliki Gerokosta
Empirical Validation of the Efficiency of Change Metrics and Static Code Attributes in Software Projects for Defect Prediction
Defect prediction is an important issue in Software Engineering, and thus it has generated widespread interest for a considerable period of time. Defects in software become increasingly expensive to fix as the software progresses through its life-cycle. Quality assurance via rigorous testing before releasing the product is crucial to keep such costs low. Nevertheless, time and manpower are finite resources. Thus, it makes sense to assign personnel and/or resources to areas of a software system with a higher probable quantity of bugs. Several defect prediction models have been developed by researchers in order to reliably identify defect-prone entities. Machine learning addresses this problem, since its purpose is to develop algorithms capable of improving their own performance by exploiting existing data, stored in huge databases, in order to discover knowledge and interpret several phenomena. The aim in this case is to create a defect prediction model using different software metrics, which shall be able to predict the presence or absence of bugs for each part of the software. In this diploma thesis, we utilized the predictive power of change metrics, i.e. metrics that reflect the changes in the software's source code, originating from bug repositories and the version database of Eclipse. We implement the classification models of Logistic Regression, Naïve Bayes and Decision Trees on the dataset and evaluate their efficiency. In an attempt to improve their performance, we apply the theory of Ensemble Learning, specifically boosting, through the implementation of the AdaBoost algorithm. Our results illustrate how these metrics can be useful in predicting bugs, as long as they are utilized correctly by an appropriate algorithm.
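A minimal sketch of the boosting step follows: an AdaBoost classifier is trained and cross-validated on synthetic change-metric vectors standing in for the Eclipse dataset; the feature semantics and the evaluation setup are assumptions for illustration.

```python
# Illustrative sketch of boosting for defect prediction: an AdaBoost ensemble
# is trained on change-metric feature vectors with binary defect labels.
# Features and labels are synthetic placeholders for the Eclipse dataset.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))        # e.g. revisions, authors, churn, bug-fixes
y = (X[:, 2] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)

model = AdaBoostClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="f1")
print(f"mean F1 over 5 folds: {scores.mean():.3f}")
```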
Nina Eleutheriadou
Empirical validation of object-oriented metrics on open source software for fault prediction
The importance of open source software systems has been felt both in the software industry and in research. Numerous software projects are developed using open source tools, and companies invest in open source projects and also use such software in their own work. Much research is performed on or using open source software systems because such software is not monopolized and is free from licensing issues. Since open source software is developed following a style different from the conventional one, there arises a need to measure its quality and reliability. Hence, the characteristics of the source code of these projects need to be measured to obtain more information about them. Faults in software systems are a major problem. Knowing the causes of possible defects, as well as identifying general software process areas that may need attention from the initialization of a project, could save money, time and work. The possibility of estimating the potential faultiness of software early could help in planning, controlling and executing software development activities. Furthermore, there are available metrics for predicting fault-prone classes, which may help software organizations in planning and performing testing activities. Fault-proneness of a software module is the probability that the module contains faults; exploiting it makes possible the allocation of resources to fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but the empirical validation of these metrics is always a great challenge. In this diploma thesis, we study which prediction algorithm is more effective for detecting faults in classes of open source software. Our ultimate goal is the evaluation of several prediction algorithms, which were implemented and investigated within the objectives of this diploma thesis. Specifically, these algorithms are Logistic and Linear Regression, Decision Trees and AdaBoost. Their effectiveness was the benchmark for fulfilling the expectations of this thesis. Furthermore, our goal is the evaluation of each category of metrics (Change Metrics, Source Code Metrics, Complexity Metrics, Bug Metrics, Churn of Source Code Metrics, Entropy of Source Code Metrics), in order to decide which one is more effective in fault prediction.

2013

Giorgos Kordopatis-Zilos
Design and Development of a Mechanism for the Automated Geotagging of Multimedia
The problem of geotagging has emerged because of the ever-increasing number of images and videos found on the web, and constitutes a matter of concern among members of the scientific community. The creation of a system that achieves this goal is the primary purpose of this diploma thesis. The system receives a set of training and test media together with their metadata, and after appropriate processing it becomes capable of estimating the geographical location of each query medium. For this task, two systems, based on theoretical models, are implemented in order to achieve the main goal. Initially, an approach that uses language models is built, so that the metadata of the training set can be analyzed. The outcome of this analysis is the formation of distinctive vocabularies with respect to wider geographical areas. Afterwards, the query media are assigned to the above areas. Finally, the estimation of the final position of each query medium is based on the media that belong to this particular area. Furthermore, an additional method for the location estimation of the media is developed, which relies on the semantic analysis of the training set's metadata and the visual analysis of the media. By means of the semantic analysis, a bag-of-excluded-words (BoEW) is formed, in accordance with which the metadata of the media of the test set are filtered. The location estimation of each query medium is then established in a similar manner to the one described for the previous model. From the implementation of the above approaches, useful remarks arise regarding the performance and the sensitivity to the input data sets. Regarding the achievement of the final goal, the semantic analysis of the media's metadata appears to be effective.
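A minimal sketch of the language-model assignment step is given below: a smoothed unigram model is built per geographic area from training tags, and a query item is assigned to the area that maximises the log-likelihood of its tags. The areas, tags and smoothing scheme are hypothetical.

```python
# Illustrative sketch of the language-model step: build a smoothed unigram
# model over tags for each geographic area, then assign a query item to the
# area that maximises the log-likelihood of its tags. Data are placeholders.
import math
from collections import Counter

area_tags = {
    "paris":  ["eiffel", "louvre", "seine", "eiffel", "museum"],
    "london": ["thames", "bigben", "museum", "tube", "thames"],
}
models = {}
for area, tags in area_tags.items():
    counts, total, vocab = Counter(tags), len(tags), len(set(tags))
    # Laplace-smoothed unigram probability for each (possibly unseen) word
    models[area] = lambda w, c=counts, t=total, v=vocab: (c[w] + 1) / (t + v + 1)

query_tags = ["eiffel", "museum", "sunset"]
scores = {area: sum(math.log(p(w)) for w in query_tags)
          for area, p in models.items()}
print(max(scores, key=scores.get), scores)
```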
Spyridon Skoumpakis
Workflow Extraction from UML Activity Diagrams
In the context of Software Engineering every system has two aspects, a static and a dynamic one. When investigating the dynamic part of a system, every engineer seeks to automate the extraction of the workflow in a form that is both human and machine processable, as the workflow constitutes essentially all the available information on a system from the perspective of its dynamic aspect. One problem often faced by Software Engineers throughout the development of a new software project is the fact that they cannot easily find (online) the workflow for similar types of systems. All they can find are .jpg images of UML Activity Diagrams of corresponding software projects, which graphically represent a system's workflow. As part of this diploma thesis a potential solution to the above problem is developed, a program named UADxTractor. This software-based tool receives images of Activity Diagrams as input and subjects them to several levels of processing. The first level aims to enable the identification of the main entities of each diagram and the relationships between them. The second processing level involves the detection and storage of the text included in each diagram, as well as the detection of the workflow's direction between its entities. In the final level, the proposed system, having already obtained all the available information included (graphically) in the input image, stores it in a semantically aware structure (an ontology) named Workflow_RDF. To verify the proper operation of UADxTractor several experiments were performed, the results of which are presented along with the necessary conclusions at the end of this thesis.
Anastasia Herodotou
Applying Machine Learning Techniques on Software Systems for Fault Diagnosis and Anomaly Detection
Anomaly detection is a very important aspect of contemporary systems, since all areas of human activity, from simple everyday activities to the most complex business and industry workflows, are becoming more and more automated. In this context, the detection of anomalies in the behavior of systems is an intriguing research topic, where researchers are striving to reduce system "failures" to a minimum, thus requiring as little human intervention as possible. Through proper design, a system should operate with the maximum possible accuracy, while humans are prone to inadvertent errors. Within the context of this dissertation we employ machine learning techniques, and specifically classification, in order to assess the ability to detect deviations in behavior on a system built on ROS, a popular middleware framework for robotics. The system implements the "dining philosophers" problem, well known from operating systems, which inherently involves concurrency and shared resources. Initially, an in-depth discussion on anomaly detection methods is performed and then a thorough examination of SVM (Support Vector Machine) and GMM (Gaussian Mixture Model) classifiers is provided. Next, the developed "dining philosophers" system is analyzed, as well as the developed methodology for the detection of abnormalities in the behaviors of the philosophers. Finally, the results of the experiments that were performed are presented, followed by a commentary on the performance of the two classifiers.
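One possible realisation of the two detectors compared above is sketched below, using a one-class formulation for the SVM and a log-likelihood threshold for the GMM; whether the thesis trained its classifiers on labelled anomalies is not stated here, so both choices are assumptions.

```python
# Illustrative sketch of SVM- and GMM-based anomaly detection: both models are
# fitted on "normal" behaviour features; new observations are flagged when the
# SVM rejects them or their GMM log-likelihood falls below a threshold.
# The behaviour features are synthetic.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 3))   # normal behaviour
test = np.vstack([rng.normal(size=(5, 3)),               # normal samples
                  rng.normal(loc=6.0, size=(5, 3))])     # anomalous samples

svm = OneClassSVM(nu=0.05, gamma="scale").fit(normal)
gmm = GaussianMixture(n_components=2, random_state=0).fit(normal)
threshold = np.percentile(gmm.score_samples(normal), 5)  # 5th-percentile cut-off

svm_flags = svm.predict(test) == -1
gmm_flags = gmm.score_samples(test) < threshold
print("SVM anomalies:", svm_flags.astype(int))
print("GMM anomalies:", gmm_flags.astype(int))
```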
Christoforos Zolotas
Leonardo Software Entity Semantic Search Engine
Throughout the evolution of Software Engineering great progress has taken place in the management of software projects, as well as in the methodology that is followed from the very first sketch of a new product up to its delivery. This progress has allowed the production of relatively complex software systems, in due time and with fairly low cost. The aforementioned evolution is primarily the outcome of personal talent, imaginative breakthroughs and the prior experience of the researchers/engineers in charge. However, many challenges still reside in the realm of Software Engineering, both during the design and the construction phase of a software project. As a result, many projects fail to serve their goals or are even abandoned prior to completion. The primary reason is the arbitrary nature of software, which is among the most visionary and abstract human inventions, accompanied by the ambiguities in human communication. In the late 2000s, OMG announced a new initiative named MDA, which aims at the full, or at least partial, obviation of the above-mentioned challenges. MDA's primary idea is to shift a remarkable portion of the software engineer's involvement from low-level code production to more abstract models, of varying levels of detail, which isolate the engineer from low-level implementation details. The core idea is the production of an initial model of the product, free of specific implementation or platform details, called the CIM, which in turn is formally transformed into a platform-independent model (PIM) and then into a platform-specific one (PSM) that allows the production of fully or partially executable software. Initially, this diploma thesis offers a thorough exploration of the core MDA ideas alongside a brief discussion of ontologies. Subsequently, the design and implementation of a software entity semantic search engine tool is presented. The search resource of this tool is an ontology populated with existing, semantically annotated software projects, in such a way that complete UML model retrieval is possible. The input of the semantic search engine is a functional requirement, which triggers the quest for a UML model annotated in the ontology as one that satisfies it. This quest is carried out through a series of SPARQL queries, each of which retrieves a small part of the needed UML model. Depending on user input, the semantic search engine implemented as a product of this diploma thesis is capable of retrieving function, class or package UML models, with the principal goal of being as complete as possible while abiding by Software Engineering rules. The output of the tool is an XMI file containing the desired models, which is compatible with the popular open source tool ArgoUML. As an extra feature, it is possible to generate “structural” code which could be the basis for the implementation of the retrieved model.
Rafaila Grigoriou
PYTHIA: a Question Answering System for Assisting Software Engineers in Searching Software Projects
Software Engineering has progressed at an accelerating pace during the last decades. Software Engineers who are trying to make decisions related to their systems' design and development, such as the optimal set of functional requirements and the proper system design and code structure, have to deal with huge amounts of information, which usually are not easy to access. Were this information stored and annotated, defining the proper software architecture could become much easier and could result in much more efficient software. In this context, developers would be able to consult other Software Engineers' solutions to similar projects and could reuse them as off-the-shelf components, or could adjust them to their own needs. Within the context of this diploma thesis PYTHIA (Programmer's dYnamic Thematic Interactive Advisor) has been designed and developed. PYTHIA is a driver-tool that can provide guidance through the requirements elicitation and class modeling of a software project. Information related to already implemented software projects is stored in two ontologies and offers engineers the ability to access previous design paradigms, reuse them, or even evolve them. RequirementsOnt contains information related to the user requirements phase; UMLOnt stores information related to the system specification phase. PYTHIA employs both ontologies, and uses natural language processing techniques in order to assist software developers in defining the proper queries to be sent to the software search engine. It is a web application which provides users with the opportunity to query the ontology either using natural language, or by compiling advanced queries through the corresponding view of the interface. In both cases, due to the use of external dictionaries, the system is able to deal with term disambiguation. PYTHIA may be considered as a platform that promotes the reusability of software elements (requirements, classes, components), assisting software engineers to avoid “reinventing the wheel” and thus make their work more efficient and effective.
Ioannis Goutas
A Multi-Agent Simulation Framework for the Societal and Behavioral Modeling of Stock Markets
Nowadays, all types of information are available online and in real time. Data regarding politics, financial regimes and legislation are constantly changing and evolving, thus dictating the need for adaptability in order for someone to advance professionally, or even personally. Given that Complex Adaptive Systems have been widely applied to simulate and monitor societal phenomena, these parameters have to be taken into account. Complex adaptive systems comprise software agents (Complex Multi-Agent Systems) that simulate the desired societal or economic activity and adapt their behavior and decision making based on information from their environment. Thus, in such systems, agents that are more influenced by politics, their close environment and mass media usually adjust to changes more eagerly, whereas others that are less prone to this type of influence show more stable behavior. Obviously, an agent society that comprises different agent types in different mixes may behave in interesting ways, which deserve further investigation. In the context of this thesis, FinanceCity has been developed: a Complex Adaptive Multi-Agent System that emphasizes the behavioral changes of different types of populations in constantly evolving environments. As in the real world, in FinanceCity no agent follows a static strategy; rather, it adapts its decision-making process based on external stimuli and the initial goal defined. As its name implies, FinanceCity focuses on the analysis of such a complex environment in the financial domain: a stock market environment is simulated, where agents buy and sell stocks based on their character, their goal and the societal snapshot. In short, an agent’s behavior is defined by its Static and Dynamic characteristics, as well as its Status Portfolio. The focus of the current work is on the study of the agents’ behavioral changes based on their characteristics, as well as on the study of the overall system’s balance.
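As a purely illustrative sketch of such an adaptive agent (the class, attribute and method names below are assumptions, not the actual FinanceCity implementation), an agent can be modeled as combining static traits, dynamic state and a portfolio, and adjusting its buy/sell decisions to external stimuli such as media sentiment:

    # Illustrative sketch of an adaptive trading agent; names are hypothetical,
    # not the actual FinanceCity classes.
    from dataclasses import dataclass

    @dataclass
    class Agent:
        # Static characteristics: fixed personality traits.
        risk_aversion: float     # 0 = reckless, 1 = very cautious
        media_influence: float   # how strongly news sentiment sways decisions
        # Dynamic characteristics and status portfolio.
        confidence: float = 0.5
        cash: float = 1000.0
        shares: int = 0

        def perceive(self, news_sentiment: float) -> None:
            """Adapt internal state to external stimuli (e.g. media sentiment)."""
            self.confidence += self.media_influence * news_sentiment
            self.confidence = min(max(self.confidence, 0.0), 1.0)

        def decide(self, price: float) -> str:
            """Buy when confident enough relative to risk aversion, else sell or hold."""
            if self.confidence > self.risk_aversion and self.cash >= price:
                self.cash -= price
                self.shares += 1
                return "buy"
            if self.confidence < self.risk_aversion / 2 and self.shares > 0:
                self.cash += price
                self.shares -= 1
                return "sell"
            return "hold"

    # A small mixed society: an easily influenced agent and a stable one.
    society = [Agent(risk_aversion=0.3, media_influence=0.4),
               Agent(risk_aversion=0.8, media_influence=0.05)]
    for agent in society:
        agent.perceive(news_sentiment=0.2)
        print(agent.decide(price=10.0))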
Sotirios Beis
Clustering Evolving Social Networks with Community Detection Techniques
Christina Mpoididou
Use of Natural Language Processing and Data Mining Techniques to Relate Software Quality Characteristics with Bug Reports
The subject of the current diploma thesis lies in the field of Data Mining, and especially Data Mining for Software Engineering data. The information contained in such data creates the need to extract knowledge that can support further study. In this thesis, we study a dataset of bug reports using techniques from Semantic Analysis and Natural Language Processing, in order to identify the words users most frequently choose when writing reports and to combine them into groups. The aim of word grouping is to understand, in general terms, how people describe software problems. In addition, we apply Semantic Relatedness measures to these word groups in order to relate them semantically to software quality attributes. The final goal of this thesis is to extract knowledge about the content of bug reports and the kind of bugs they describe; identifying the type of content in reports is very useful for the development process of a software application.
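As an illustration of the word-grouping idea (a sketch under assumed preprocessing, using scikit-learn rather than the exact pipeline of the thesis), one can vectorize the report texts with tf-idf and then cluster terms by their co-occurrence profiles across reports:

    # Illustrative sketch (not the thesis' exact pipeline): group bug-report terms
    # by clustering their tf-idf profiles across reports.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    bug_reports = [
        "application crashes when saving large files",
        "ui freezes while loading the settings dialog",
        "crash on startup after latest update",
        "slow response when opening large projects",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")
    doc_term = vectorizer.fit_transform(bug_reports)   # reports x terms
    term_profiles = doc_term.T.toarray()               # terms x reports

    # Cluster terms that tend to appear in the same reports.
    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(term_profiles)

    terms = vectorizer.get_feature_names_out()
    for cluster_id in range(3):
        group = [t for t, label in zip(terms, kmeans.labels_) if label == cluster_id]
        print(cluster_id, group)

Each resulting group of terms could then be compared, via a semantic relatedness measure, against the vocabulary of a software quality attribute.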
Alexander Adamos
Analysis and Development of an Algorithm to Detect Slow Attacks in Intrusion Detection Systems
With the continuous evolution of computer and network technology, the need for protection of information systems is ever growing. In order to detect network attacks, administrators use Intrusion Detection Systems (IDS). Nevertheless, their effectiveness is questionable, and an attacker with the required skills may bypass them. In recent years a new trend has emerged in the world of hacking, known as “Slow HTTP DoS” attacks. These attacks use the HTTP protocol and manage to occupy all the connections offered by a web server through legitimate use of TCP/IP. Moreover, the attacker does not need to send any crafted or malformed packets to the victim; he/she simply takes advantage of known vulnerabilities at the Application Layer (Layer 7 of the OSI model). These attacks are known as extremely stealthy, low-bandwidth Denial-of-Service attacks. Within the context of this diploma thesis, we propose the enhancement of a popular Intrusion Detection System to detect the aforementioned attacks. At first, we examine the software tools that can perform “Slow HTTP DoS” attacks. Having understood their mode of operation, we selected the Snort IDS in order to implement our detection plugin. We have developed a new Snort Preprocessor, the Slow Preprocessor, which incorporates the Slow HTTP DoS Module. This module implements the Slow HTTP DoS Detection Algorithm and calculates the Attack Entropy metric, by which we estimate the likelihood of an ongoing attack. Furthermore, we produce network and server statistics based on the NetFlow protocol. The latter lays the groundwork for a future implementation of a complete detection module that alerts users at the very early stages of such attacks. Our mechanism was subjected to a number of experiments based on different types of settings. Results are presented and discussed from the perspective of network security.
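As a rough illustration of an entropy-based indicator (a sketch only; it does not reproduce the actual Attack Entropy computation of the Slow Preprocessor), one could monitor the Shannon entropy of the distribution of open connections over client IPs: during a Slow HTTP DoS attack a few hosts hold a disproportionate share of the server's connection slots, so the entropy drops sharply.

    # Illustrative sketch of an entropy-based indicator for Slow HTTP DoS detection;
    # this is NOT the thesis' actual Attack Entropy algorithm.
    from collections import Counter
    from math import log2

    def connection_entropy(client_ips: list) -> float:
        """Shannon entropy of the distribution of open connections over client IPs."""
        counts = Counter(client_ips)
        total = sum(counts.values())
        return -sum((c / total) * log2(c / total) for c in counts.values())

    # Normal traffic: connections spread across many clients -> higher entropy.
    normal = ["10.0.0.%d" % i for i in range(1, 51)]
    # Attack traffic: one host hogs most of the connection slots -> low entropy.
    attack = ["10.0.0.66"] * 45 + ["10.0.0.%d" % i for i in range(1, 6)]

    print(round(connection_entropy(normal), 2))   # ~5.64 bits
    print(round(connection_entropy(attack), 2))   # much lower

    THRESHOLD = 2.0  # hypothetical alert threshold
    if connection_entropy(attack) < THRESHOLD:
        print("possible Slow HTTP DoS: connection entropy below threshold")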

2012

Chrisafenia Mallia-Stolidou and Asterios Vounotripidis
Software Platform for the Design and Development of 3D Role Playing Games
Evagelia Diamantidou
Creation of a Reliable Trust and Reputation Mechanism in Open Transaction Networks
Dimitra Miha and Nikolaos Chandolias
Mechanism for Data Mining from Text Using Natural Language Processing, with 3D Visualization of the Process
Grigoris Athanasiadis
Use of Artificial Intelligence Techniques for the Analysis and Development of an Intelligent Software Agent for E-Commerce Auctions
Marina-Eirini Stamatiadou
Analysis and Development of a Service-Oriented Architecture for RFID Systems
Nikolaos Tsiotskas
Virtual Project 3D: A Tool for the Graphical Presentation of Software Projects in 3D
Ioannis Papastergiou
Creation of a System for Logging and Simulating Network Traffic in Order to Test Attack Detection Mechanisms Using Markov Models
Anna Adamopoulou
Analysis and Development of a Multi-Criteria Trust and Reputation Algorithm for E-Shop Systems with Software Agents

2011

Anastasia Mourka
Extraction of Software Requirements from UML Use Case Diagrams
Ioanna Kampilauka
Use of Data Mining Techniques to Classify and Label Electrical Power Consumers
Themistoklis Diamantopoulos
Analysis and Development of Auction House Algorithms for Use in the Power TAC Competition
Konstantina Valogianni
Analysis of an Agent Architecture for Participation in the Energy Stock Market
Ioannis Stamkos
Creation of Profiles for Second Life Users Using Data Mining
Ioannis Gounaris
Creation of Real-Time Systems with Real-Time Java
Anastasia Skantza and Vasileia Tzamtzi
Creation of a Bidding Mechanism for E-Commerce Auction Systems
Themistoklis Mavridis
An LDA-based mechanism for the optimization of website ranking in search engines

2010

Nikolaos Stasinopoulos
Algorithm for Extracting Semantic Knowledge from Software Repositories
Theano Mintsi
Algorithm for Extracting Relations between Software Requirements Using Natural Language Processing and Data Mining Techniques
Emmanouil Spanoudakis
ezHome – A Simulation and Control System for a Smart House Using Agents