Statistical approaches enabling technology-specific assay interference prediction from large screening data sets

Full metadata record
Metadata | Description | Language
Author(s): dc.contributor | University of Vienna | -
Author(s): dc.contributor | Universidade de São Paulo (USP) | -
Author(s): dc.contributor | Bayer AG | -
Author(s): dc.contributor | Universidade Estadual Paulista (UNESP) | -
Author(s): dc.creator | Palmacci, Vincenzo | -
Author(s): dc.creator | Hirte, Steffen | -
Author(s): dc.creator | Hernández González, Jorge Enrique | -
Author(s): dc.creator | Montanari, Floriane | -
Author(s): dc.creator | Kirchmair, Johannes | -
Acceptance date: dc.date.accessioned | 2025-08-21T19:57:13Z | -
Date available: dc.date.available | 2025-08-21T19:57:13Z | -
Submission date: dc.date.issued | 2025-04-29 | -
Submission date: dc.date.issued | 2024-06-01 | -
Full source of the material: dc.identifier | http://dx.doi.org/10.1016/j.ailsci.2024.100099 | -
Full source of the material: dc.identifier | https://hdl.handle.net/11449/304131 | -
Source: dc.identifier.uri | http://educapes.capes.gov.br/handle/11449/304131 | -
Description: dc.description | High-throughput screening (HTS) technologies allow the biological testing of hundreds of thousands of compounds per day. Typically, a substantial proportion of the initial hits obtained by HTS are artifacts caused by assay interference. Therefore, global and technology-specific in silico models for identifying and predicting compounds that interfere with biological assays have been developed. The global models benefit from training on large screening data sets, while the specialized models benefit from training on assay technology-specific experimental data. In this work, we develop and explore strategies for generating better predictors of technology-specific assay interference by utilizing the large bioactivity data matrices that global models are trained on and by employing partially new compound labeling approaches to maintain the assay technology awareness of specialized models. We demonstrate the utility of the statistically derived interference labels in machine learning using fluorescence-based assay interference as a representative example. Our random forest and multi-layer perceptron classifiers showed improved performance compared to existing models, achieving Matthews correlation coefficients (MCCs) of up to 0.47 on holdout data and up to 0.45 on an external test set. These results demonstrate that accurate assay-specific interference labels can be derived from large bioactivity data matrices, enabling the development of new machine-learning models without the need for further experimental data. | -
Description: dc.description | Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna | -
Description: dc.description | Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna | -
Description: dc.description | Department of Machine Learning Research, Bayer AG | -
Description: dc.description | Department of Physics, Sao Paulo State University, Rua Cristóvão Colombo 2265, São José do Rio Preto, CEP | -
Description: dc.description | Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna | -
Language: dc.language | en | -
Relation: dc.relation | Artificial Intelligence in the Life Sciences | -
Source: dc.source | Scopus | -
Keywords: dc.subject | Assay interfering compounds | -
Keywords: dc.subject | Biological assays | -
Keywords: dc.subject | Fluorescence | -
Keywords: dc.subject | High-throughput screening | -
Keywords: dc.subject | Machine learning | -
Title: dc.title | Statistical approaches enabling technology-specific assay interference prediction from large screening data sets | -
File type: dc.type | digital book | -
Appears in collections: Repositório Institucional - Unesp

There are no files associated with this item.
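
The description above reports classifier performance as Matthews correlation coefficients (MCCs). For readers unfamiliar with the metric, the following is a minimal Python sketch of how an MCC is computed from the four counts of a binary confusion matrix. The function name `matthews_corrcoef_from_counts` and the example counts are hypothetical and chosen for illustration only; they are not taken from the article, and returning 0.0 when the denominator is zero is a common convention assumed here, not something this record specifies.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the MCC for a binary confusion matrix.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. By convention (an assumption of this sketch),
    0.0 is returned when the denominator is zero.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for an imbalanced data set, for example
# 70 interfering compounds among 1000 tested (illustrative only).
example_mcc = matthews_corrcoef_from_counts(tp=40, tn=900, fp=30, fn=30)
print(f"MCC = {example_mcc:.3f}")
```

The MCC ranges from -1 (complete disagreement) through 0 (no better than chance) to +1 (perfect prediction). Because it uses all four cells of the confusion matrix, it remains informative on imbalanced data such as screening sets in which interfering compounds are a minority.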