WEMB


From Cognitive Linguistics to Language Engineering, and Back

Over the past decade, advances in deep learning have substantially improved performance on natural language processing (NLP) and text mining tasks that depend on semantic analysis, such as text classification, word sense disambiguation, machine translation, text summarization, question answering, and sentiment analysis. Much of this progress is attributed to word embeddings, which represent a word's meaning as a vector of numeric coordinates. Current word embeddings, derived from large textual corpora, have proven effective, but it remains unclear how well they align with human language processing. The WEMB project addresses this question through two objectives: first, to gain a deeper understanding of how word embeddings align with human language processing; second, to leverage this understanding to develop a new generation of embeddings for NLP tasks.
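As a rough illustration of what "meaning as a vector" amounts to in practice, the sketch below compares words by the cosine similarity of their vectors. The three-dimensional vectors and the names `toy_embeddings` and `cosine_similarity` are invented for this example; they are assumptions and do not come from the WEMB project or from any trained model, where embeddings typically have hundreds of dimensions.

```python
import math

# Toy vectors, invented for illustration only (not real embeddings).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Words with related meanings are expected to lie closer together.
sim_cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
sim_cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs dog: {sim_cat_dog:.3f}")
print(f"cat vs car: {sim_cat_car:.3f}")
```

With these hand-picked values, "cat" and "dog" come out as more similar than "cat" and "car", which is the kind of geometric regularity that embeddings learned from corpora are meant to capture.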

The project follows a “from mind to application and back” approach, combining UniBo’s expertise in language processing with ISTI-CNR’s expertise in NLP. WEMB focuses on three key aspects:

Embeddings and Cross-Modality: Investigating how embeddings that incorporate cross-modal information (e.g., from text and images) relate to traditional text-based embeddings in language processing.

Embeddings and Misspellings: Exploring the connection between embeddings and misspellings, a linguistic behavior that is increasingly common in texts for a variety of reasons.

Embeddings and Word Senses: Examining the relationship between embeddings and word senses, in particular among the different embeddings associated with the different senses of the same ambiguous word.

Through these investigations, WEMB aims to contribute both to a theoretical understanding of word embeddings in human language processing and to the practical development of improved embeddings for NLP applications.


Members

Fabrizio Sebastiani
PI, Head of RL ISTI
CNR-ISTI
Marianna Bolognesi
Vice PI, Head of RL UniBo
UniBo
Andrea Esuli
Core Member
CNR-ISTI
Alejandro Moreo
Core Member
CNR-ISTI
Gianluca Sperduti
Core Member
CNR-ISTI
Andrea Pedrotti
Core Member
CNR-ISTI
Adele Loia
Core Member
UniBo/IUSS
Giulia Rambelli
External Collaborator
UniBo
Giovanni Puccetti
External Collaborator
CNR-ISTI
Caterina Villani
External Collaborator
UniBo