Use this identifier to reference this record: http://hdl.handle.net/10071/25429
Full record
DC Field | Value | Language
dc.contributor.author | Vicente, M. | -
dc.contributor.author | Batista, F. | -
dc.contributor.author | Carvalho, J. P. | -
dc.contributor.editor | Carvalho, J. P., Lesot, M.-J., Kaymak, U., Vieira, S., Bouchon-Meunier, B., and Yager, R. R. | -
dc.date.accessioned | 2022-05-17T14:33:09Z | -
dc.date.available | 2022-05-17T14:33:09Z | -
dc.date.issued | 2016 | -
dc.identifier.isbn | 978-3-319-40581-0 | -
dc.identifier.issn | 1865-0929 | -
dc.identifier.uri | http://hdl.handle.net/10071/25429 | -
dc.description.abstract | The gender information of a Twitter user is not known a priori when analysing Twitter data, because user registration does not include gender information. This paper proposes an approach for creating extended gender labelled datasets of Twitter users. The process involves creating a smaller database of active Twitter users and to manually label the gender. The process follows by extracting features from unstructured information found on each user profile and by creating a gender classification model. The model is then applied to a larger dataset, thus providing automatic labels and corresponding confidence scores, which can be used to estimate the most accurately labeled users. The resulting databases can be further enriched with additional information extracted, for example, from the profile picture and from the user location. The proposed approach was successfully applied to English and Portuguese users, leading to two large datasets containing more than 57K labeled users each. | eng
dc.language.iso | eng | -
dc.publisher | Springer | -
dc.relation | info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UID%2FCEC%2F50021%2F2013/PT | -
dc.rights | openAccess | -
dc.subject | Gender classification | eng
dc.subject | Twitter users | eng
dc.subject | Gender database | eng
dc.subject | Text mining | eng
dc.title | Creating extended gender labelled datasets of Twitter users | eng
dc.type | conferenceObject | -
dc.event.title | 16th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, IPMU 2016 | -
dc.event.type | Conferência | pt
dc.event.location | Eindhoven | eng
dc.event.date | 2016 | -
dc.pagination | 690 - 702 | -
dc.peerreviewed | yes | -
dc.journal | Information Processing and Management of Uncertainty in Knowledge-Based Systems. Communications in Computer and Information Science | -
dc.volume | 611 | -
degois.publication.firstPage | 690 | -
degois.publication.lastPage | 702 | -
degois.publication.location | Eindhoven | eng
degois.publication.title | Creating extended gender labelled datasets of Twitter users | eng
dc.date.updated | 2022-05-17T15:30:42Z | -
dc.description.version | info:eu-repo/semantics/acceptedVersion | -
dc.identifier.doi | 10.1007/978-3-319-40581-0_56 | -
dc.subject.fos | Domínio/Área Científica::Ciências Naturais::Matemáticas | por
dc.subject.fos | Domínio/Área Científica::Ciências Naturais::Ciências da Computação e da Informação | por
iscte.identifier.ciencia | https://ciencia.iscte-iul.pt/id/ci-pub-30894 | -
iscte.alternateIdentifiers.wos | WOS:WOS:000387430000056 | -
Appears in collections: IT-CRI - International conference papers

Files in this record:
File | Description | Size | Format
conferenceobject_30894.pdf | Accepted version | 1.84 MB | Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.