Università per Stranieri di Siena. Ricerche del CESIM e dell’Osservatorio dell’italiano diffuso fra stranieri e delle lingue immigrate in Italia
Vol. LIV, 3.2025
The IncluInstIt corpus: initial considerations for tagging gender-inclusive language in Italian
Abstract
In this paper we introduce IncluInstIT, a novel corpus of Italian gender-inclusive language curated from Instagram, alongside an innovative tagging system tailored to identify inclusive morphological strategies. Unlike existing corpora, IncluInstIT captures emergent gender-inclusive language forms (such as universal feminines, ə, u, x, and split forms) used in informal digital communication. Comprising over 4,800 pre-processed posts, the corpus reflects a dynamic spectrum of inclusive expressions across hashtags, offering a diachronic view of evolving gender representations. Here we present an initial annotation scheme, enriched with newly defined gender tags, with the goal of discussing ways in which NLP tools can investigate fairness and inclusivity.
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
