| Sotirios-Filippos Tsarouchis, Maria Th. Kotouza, Fotis E. Psomopoulos and Pericles A. Mitkas
"A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences"
IFIP International Conference on Artificial Intelligence Applications and Innovations, pp. 189-199, Springer, Cham, 2018 May
  The identification of meaningful groups of proteins has always been a major area of interest for structural and functional genomics. Successful protein clustering can lead to significant insight, assisting in both tracing the evolutionary history of the respective molecules as well as in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences, which allows the construction of subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method utilizes the metrics of sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied on a real-world dataset consisting of clonotypic immunoglobulin (IG) sequences from Chronic lymphocytic leukemia (CLL) patients, showing promising results. @inproceedings{2018Tsarouchis, author={Sotirios-Filippos Tsarouchis and Maria Th. Kotouza and Fotis E. Psomopoulos and Pericles A. Mitkas}, title={A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences}, booktitle={IFIP International Conference on Artificial Intelligence Applications and Innovations}, pages={189-199}, publisher={Springer}, address={Cham}, year={2018}, month={05}, date={2018-05-22}, doi={https://doi.org/10.1007/978-3-319-92016-0_18}, isbn={978-3-319-92016-0}, abstract={The identification of meaningful groups of proteins has always been a major area of interest for structural and functional genomics. Successful protein clustering can lead to significant insight, assisting in both tracing the evolutionary history of the respective molecules as well as in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences, which allows the construction of subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method utilizes the metrics of sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied on a real-world dataset consisting of clonotypic immunoglobulin (IG) sequences from Chronic lymphocytic leukemia (CLL) patients, showing promising results.} } |
| Maria Th. Kotouza, Konstantinos N. Vavliakis, Fotis E. Psomopoulos and Pericles A. Mitkas
"A Hierarchical Multi-Metric Framework for Item Clustering"
5th International Conference on Big Data Computing Applications and Technologies, pp. 191-197, IEEE/ACM, Zurich, Switzerland, 2018 Dec
   Item clustering is commonly used for dimensionality reduction, uncovering item similarities and connections, gaining insights of the market structure and recommendations. Hierarchical clustering methods produce a hierarchy structure along with the clusters that can be useful for managing item categories and sub-categories, dealing with indirect competition and new item categorization as well. Nevertheless, baseline hierarchical clustering algorithms have high computational cost and memory usage. In this paper we propose an innovative scalable hierarchical clustering framework, which overcomes these limitations. Our work consists of a binary tree construction algorithm that creates a hierarchy of the items using three metrics, a) Identity, b) Similarity and c) Entropy, as well as a branch breaking algorithm which composes the final clusters by applying thresholds to each branch of the tree. ?he proposed framework is evaluated on the popular MovieLens 20M dataset achieving significant reduction in both memory consumption and computational time over a baseline hierarchical clustering algorithm. @inproceedings{KotouzaVPM18, author={Maria Th. Kotouza and Konstantinos N. Vavliakis and Fotis E. Psomopoulos and Pericles A. Mitkas}, title={A Hierarchical Multi-Metric Framework for Item Clustering}, booktitle={5th International Conference on Big Data Computing Applications and Technologies}, pages={191-197}, publisher={IEEE/ACM}, address={Zurich, Switzerland}, year={2018}, month={12}, date={2018-12-17}, url={http://issel.ee.auth.gr/wp-content/uploads/2019/02/BDCAT_2018_paper_24_Proceedings.pdf}, doi={http://10.1109/BDCAT.2018.00031}, abstract={Item clustering is commonly used for dimensionality reduction, uncovering item similarities and connections, gaining insights of the market structure and recommendations. Hierarchical clustering methods produce a hierarchy structure along with the clusters that can be useful for managing item categories and sub-categories, dealing with indirect competition and new item categorization as well. Nevertheless, baseline hierarchical clustering algorithms have high computational cost and memory usage. In this paper we propose an innovative scalable hierarchical clustering framework, which overcomes these limitations. Our work consists of a binary tree construction algorithm that creates a hierarchy of the items using three metrics, a) Identity, b) Similarity and c) Entropy, as well as a branch breaking algorithm which composes the final clusters by applying thresholds to each branch of the tree. ?he proposed framework is evaluated on the popular MovieLens 20M dataset achieving significant reduction in both memory consumption and computational time over a baseline hierarchical clustering algorithm.} } |
| Konstantinos N. Vavliakis, Maria Th. Kotouza, Andreas L. Symeonidis and Pericles A. Mitkas
"Recommendation Systems in a Conversational Web"
Proceedings of the 14th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,, pp. 68-77, SciTePress, 2018 Jan
   In this paper we redefine the concept of Conversation Web in the context of hyper-personalization. We argue that hyper-personalization in the WWW is only possible within a conversational web where websites and users continuously “discuss” (interact in any way). We present a modular system architecture for the conversational WWW, given that adapting to various user profiles and multivariate websites in terms of size and user traffic is necessary, especially in e-commerce. Obviously there cannot be a unique fit-to-all algorithm, but numerous complementary personalization algorithms and techniques are needed. In this context, we propose PRCW, a novel hybrid approach combining offline and online recommendations using RFMG, an extension of RFM modeling. We evaluate our approach against the results of a deep neural network in two datasets coming from different online retailers. Our evaluation indicates that a) the proposed approach outperforms current state-of-art methods in small-medium datasets and can improve performance in large datasets when combined with other methods, b) results can greatly vary in different datasets, depending on size and characteristics, thus locating the proper method for each dataset can be a rather complex task, and c) offline algorithms should be combined with online methods in order to get optimal results since offline algorithms tend to offer better performance but online algorithms are necessary for exploiting new users and trends that turn up. @conference{webist18, author={Konstantinos N. Vavliakis and Maria Th. Kotouza and Andreas L. Symeonidis and Pericles A. Mitkas}, title={Recommendation Systems in a Conversational Web}, booktitle={Proceedings of the 14th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST,}, pages={68-77}, publisher={SciTePress}, year={2018}, month={01}, date={2018-01-01}, url={https://issel.ee.auth.gr/wp-content/uploads/2019/02/WEBIST_2018_29.pdf}, doi={http://10.5220/0006935300680077}, isbn={978-989-758-324-7}, abstract={In this paper we redefine the concept of Conversation Web in the context of hyper-personalization. We argue that hyper-personalization in the WWW is only possible within a conversational web where websites and users continuously “discuss” (interact in any way). We present a modular system architecture for the conversational WWW, given that adapting to various user profiles and multivariate websites in terms of size and user traffic is necessary, especially in e-commerce. Obviously there cannot be a unique fit-to-all algorithm, but numerous complementary personalization algorithms and techniques are needed. In this context, we propose PRCW, a novel hybrid approach combining offline and online recommendations using RFMG, an extension of RFM modeling. We evaluate our approach against the results of a deep neural network in two datasets coming from different online retailers. Our evaluation indicates that a) the proposed approach outperforms current state-of-art methods in small-medium datasets and can improve performance in large datasets when combined with other methods, b) results can greatly vary in different datasets, depending on size and characteristics, thus locating the proper method for each dataset can be a rather complex task, and c) offline algorithms should be combined with online methods in order to get optimal results since offline algorithms tend to offer better performance but online algorithms are necessary for exploiting new users and trends that turn up.} } |