In recent years, hatred directed against women has spread exponentially, especially in online social media. Although this alarming phenomenon has given rise to many studies from the viewpoints of both computational linguistics and machine learning, less effort has been devoted to analysing whether models for the detection of misogyny are affected by bias. An emerging topic that challenges traditional approaches to the creation of corpora is the presence of social bias in natural language processing (NLP). Many NLP tasks are subjective, in the sense that a variety of valid beliefs exist about what the correct data labels should be; some tasks, for example misogyny detection, are highly subjective, as different people have very different views about what should or should not be labelled as misogynous. An increasing number of scholars have proposed strategies for assessing the subjectivity of annotators, in order to reduce bias both in computational resources and in NLP models. In this work, we present two corpora: a corpus of messages posted on Twitter after the liberation of Silvia Romano on 9 May 2020, and a corpus of comments built from Facebook posts that contained misogyny, developed through an experimental annotation task to explore annotators’ subjectivity. For a given comment, the annotation procedure consists of selecting one or more chunks of the text that are regarded as misogynistic and establishing whether a gender stereotype is present. Each comment is annotated by at least three annotators in order to better analyse their subjectivity. The annotation was carried out by trainees engaged in an internship programme. We propose a qualitative-quantitative analysis of the resulting corpus, which may include non-harmonised annotations.
University of Chieti-Pescara G. D'Annunzio, Italy
University of Chieti-Pescara G. D'Annunzio, Italy - ORCID: 0009-0000-5408-0104
University of Turin, Italy - ORCID: 0000-0001-9337-7250
University of Turin, Italy - ORCID: 0000-0001-8110-6832
University of Chieti-Pescara G. D'Annunzio, Italy
University of Chieti-Pescara G. D'Annunzio, Italy - ORCID: 0000-0002-5441-0035
Chapter title
An experimental annotation task to investigate annotators’ subjectivity in a Misogyny dataset
Authors
Alice Tontodimamma, Stefano Anzani, Marco Antonio Stranisci, Valerio Basile, Elisa Ignazzi, Lara Fontanella
Language
English
DOI
10.36253/979-12-215-0106-3.49
Peer-reviewed work
Year of publication
2023
Copyright
© 2023 Author(s)
Book title
ASA 2022 Data-Driven Decision Making
Book subtitle
Book of short papers
Editors
Enrico di Bella, Luigi Fabbris, Corrado Lagazio
Peer-reviewed work
Year of publication
2023
Copyright
© 2023 Author(s)
Publisher
Firenze University Press, Genova University Press
DOI
10.36253/979-12-215-0106-3
eISBN (pdf)
979-12-215-0106-3
eISBN (xml)
979-12-215-0107-0
Series
Proceedings e report
Series ISSN
2704-601X
Series e-ISSN
2704-5846