CentraleSupélec is an internationally renowned higher education and research institution. Its excellence lies in combining fundamental and applied sciences to drive innovation with societal impact. For almost two centuries, CentraleSupélec's top engineers have applied their skills and knowledge to the development of corporate institutions and public organizations.
Professor Fragkiskos Malliaros of our CVN laboratory receives best paper award
The work of Professor Fragkiskos Malliaros of our CVN laboratory, carried out with Konstantinos Skianis (Ph.D. student at LIX, Ecole Polytechnique) and Michalis Vazirgiannis (Professor at LIX, Ecole Polytechnique) on graph-based text categorization, received the best paper award at the Annual Conference of the North American Chapter of the Association for Computational Linguistics, held in New Orleans, USA, from June 1 to June 6, 2018.
Title of the paper: 'Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification'.
Summary: With the rapid growth of social media and networking platforms, the volume of available textual resources has increased. Text categorization refers to the machine learning task of assigning a document to one or more categories (or classes) from a predefined set of two or more. A particularly well-known application is sentiment analysis, i.e., the process of computationally identifying and categorizing the opinions expressed in a piece of text (e.g., positive, negative, or neutral). The goal of our work is to use graph mining techniques to enhance the task of text categorization. Building on the fact that graphs can be used to represent textual content, we propose a graph-based framework for text categorization. In particular, we extract graphs from text, where the nodes correspond to terms and the edges capture term co-occurrence relationships, thereby addressing the term-independence assumption widely made in many natural language processing tasks. Our methods outperform existing text categorization approaches in various applications (e.g., subjectivity detection and opinion mining in movie reviews), and could further be applied in other text analytics domains, such as information retrieval and web search.
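To make the idea of extracting a graph from text more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the regular-expression tokenizer, the sliding window of three terms, and the use of node degree as a term weight are all assumptions chosen for illustration. The sketch only shows how terms can become nodes and how co-occurrence within a small window can become weighted edges.

```python
import re
from collections import defaultdict


def build_graph_of_words(text, window=3):
    """Build an undirected term co-occurrence graph.

    Returns a dict mapping each term to a dict {neighbor: co-occurrence count}.
    The window size and the tokenizer are illustrative assumptions.
    """
    # Naive tokenizer (assumption): lowercase alphabetic runs only.
    tokens = re.findall(r"[a-z]+", text.lower())
    graph = defaultdict(lambda: defaultdict(int))
    for i, term in enumerate(tokens):
        graph[term]  # make sure every term appears as a node
        # Link the term to the next (window - 1) terms in the text.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                graph[term][other] += 1
                graph[other][term] += 1
    return {term: dict(neighbors) for term, neighbors in graph.items()}


def degree_weights(graph):
    """Weight each term by its number of distinct neighbors (hypothetical choice)."""
    return {term: len(neighbors) for term, neighbors in graph.items()}


graph = build_graph_of_words("the movie was good and the acting was good")
weights = degree_weights(graph)
print(sorted(weights.items(), key=lambda item: -item[1]))
```

Unlike a plain bag-of-words count, the weight of a term here depends on which other terms surround it, which is the sense in which a graph representation relaxes the term-independence assumption. Such weights could then serve as document features for any standard classifier.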