Use this identifier to cite or link to this record: http://hdl.handle.net/10071/16796
Full record
DC Field | Value | Language
dc.contributor.author | Vicente, M. | -
dc.contributor.author | Batista, F. | -
dc.contributor.author | Carvalho, J. P. | -
dc.contributor.editor | Kóczy, László T. and Medina-Moreno, Jesús; Ramírez-Poussa, Eloísa | -
dc.date.accessioned | 2018-11-29T16:22:11Z | -
dc.date.available | 2018-11-29T16:22:11Z | -
dc.date.issued | 2018 | -
dc.identifier.isbn | 978-3-030-01632-6 | -
dc.identifier.issn | 1860-949X | -
dc.identifier.uri | https://ciencia.iscte-iul.pt/id/ci-pub-50916 | -
dc.identifier.uri | http://hdl.handle.net/10071/16796 | -
dc.description.abstract | Twitter provides a simple way for users to express feelings, ideas and opinions, makes the user generated content and associated metadata, available to the community, and provides easy-to-use web and application programming interfaces to access data. The user profile information is important for many studies, but essential information, such as gender and age, is not provided when accessing a Twitter account. However, clues about the user profile, such as the age and gender, behaviors, and preferences, can be extracted from other content provided by the user. The main focus of this paper is to infer the gender of the user from unstructured information, including the username, screen name, description and picture, or by the user generated content. We have performed experiments using an English labelled dataset containing 6.5 M tweets from 65 K users, and a Portuguese labelled dataset containing 5.8 M tweets from 58 K users. We have created four distinct classifiers, trained using a supervised approach, each one considering a group of features extracted from four different sources: user name and screen name, user description, content of the tweets, and profile picture. Features related with the activity, such as number of following and number of followers, were discarded, since these features were found not indicative of gender. A final classifier that combines the prediction of each one of the four previous individual classifiers achieves the best performance, corresponding to 93.2% accuracy for English and 96.9% accuracy for Portuguese data. | eng
dc.language.iso | eng | -
dc.publisher | Springer International Publishing | -
dc.relation | info:eu-repo/grantAgreement/FCT/3599-PPCDT/132048/PT | -
dc.relation | SFRH/BSAB/136312/2018 | -
dc.relation | info:eu-repo/grantAgreement/FCT/5876/147282/PT | -
dc.rights | openAccess | -
dc.subject | Gender classification | eng
dc.subject | Twitter users | eng
dc.subject | Gender database | eng
dc.subject | Text mining | eng
dc.title | Gender detection of Twitter users based on multiple information sources | eng
dc.type | bookPart | -
dc.event.location | Cham | eng
dc.event.date | 2018 | -
dc.pagination | 39 - 54 | -
dc.peerreviewed | yes | -
dc.journal | Interactions Between Computational Intelligence and Mathematics Part 2. Studies in Computational Intelligence | -
dc.volume | 794 | -
degois.publication.firstPage | 39 | -
degois.publication.lastPage | 54 | -
degois.publication.location | Cham | eng
degois.publication.title | Gender detection of Twitter users based on multiple information sources | eng
dc.description.version | info:eu-repo/semantics/acceptedVersion | -
dc.identifier.doi | 10.1007/978-3-030-01632-6_3 | -
dc.subject.fos | Domínio/Área Científica::Ciências Naturais::Ciências da Computação e da Informação | por
dc.date.embargo | 2019-11-29 | -
Appears in collections: CTI-CLI - Capítulos de livros internacionais

Files in this record:
File | Description | Size | Format
gender_detection.pdf | Post-print | 776.32 kB | Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.