Imbalanced learning in assessing the risk of corruption in public administration

Full metadata record
Metadata | Description | Language
Author(s): dc.creator | Vasconcelos, Marcelo Oliveira | -
Author(s): dc.creator | Chaim, Ricardo Matos | -
Author(s): dc.creator | Cavique, Luís | -
Date of acceptance: dc.date.accessioned | 2022-02-15T14:11:06Z | -
Date made available: dc.date.available | 2022-02-15T14:11:06Z | -
Date of submission: dc.date.issued | 2021-12-21 | -
Date of submission: dc.date.issued | 2020 | -
Full source of the material: dc.identifier | http://hdl.handle.net/10400.2/11541 | -
Source: dc.identifier.uri | http://educapes.capes.gov.br/handle/10400.2/11541 | -
Description: dc.description | This research aims to identify corruption among civil servants in the public administration of the Federal District, Brazil. For this purpose, a predictive model was built by integrating data from eight different systems and applying logistic regression to real datasets. By their nature, these datasets contain a low percentage of the examples of interest for pattern identification in machine learning, a situation known as class imbalance. In this study, the class imbalance was considered extreme, at a ratio of 1:707 or, in percentage terms, 0.14% of the class of interest relative to the population. Two approaches were used: balancing the data with resampling, using the synthetic minority oversampling technique (SMOTE), and applying algorithms with specific parameterization to capture the desired patterns of the minority class without introducing bias from the dominant class. The best modeling result was obtained with the second approach, yielding an area under the ROC curve of around 0.69. Based on sixty-eight features, the respective coefficients, which correspond to the risk factors for corruption, were found. A subset of twenty features is discussed in order to find practical utility after the discovery process. | -
Description: dc.description | L. Cavique would like to thank the FCT Projects of Scientific Research and Technological Development in Data Science and Artificial Intelligence in Public Administration, 2018–2022 (DSAIPA/DS/0039/2018), for their support. | -
Description: dc.description | info:eu-repo/semantics/publishedVersion | -
Language: dc.language | en | -
Relation: dc.relation | DSAIPA/DS/0039/2018 | -
Rights: dc.rights | openAccess | -
Keywords: dc.subject | Data enrichment | -
Keywords: dc.subject | Imbalanced learning | -
Keywords: dc.subject | Corruption | -
Keywords: dc.subject | Public administration | -
Title: dc.title | Imbalanced learning in assessing the risk of corruption in public administration | -
File type: dc.type | digital lesson | -
Appears in collections: Repositório Aberto - Universidade Aberta (Portugal)

There are no files associated with this item.
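
The imbalance figures and the ROC measure mentioned in the abstract can be illustrated with a short sketch. It is not the authors' code. It assumes that 1:707 means one example of interest per 707 others, and it shows inverse-frequency class weighting as one common form of imbalance-aware parameterization; the record does not say which parameterization the study used. All function names are hypothetical, and the labels and scores in the usage lines are made up for illustration.

```python
# Illustrative sketch only: helper names are hypothetical and the
# sample labels/scores are invented, not taken from the study.

def minority_share(ratio_majority: int) -> float:
    """Fraction of minority examples when the class ratio is 1:ratio_majority."""
    return 1.0 / (1 + ratio_majority)


def balanced_class_weights(n_minority: int, n_majority: int) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this is one common way to weight classes; the study's
    actual parameterization is not given in this record.
    """
    n_samples = n_minority + n_majority
    return {
        0: n_samples / (2 * n_majority),  # majority (dominant) class
        1: n_samples / (2 * n_minority),  # minority (interest) class
    }


def roc_auc(labels, scores) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive example scores higher than a random negative one (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(f"{minority_share(707) * 100:.2f}%")      # 0.14%
    print(balanced_class_weights(1, 707)[1])        # 354.0
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this weighting, each minority example counts about 707 times as much as a majority example during fitting, which is the general idea behind learning minority-class patterns without resampling the data.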