Facial expressions recognition in sign language based on a two-stream swin transformer model integrating RGB and texture map images.

Full metadata record
Author(s): dc.creator: Ramírez Cerna, Lourdes
Author(s): dc.creator: Rodríguez Melquiades, José Antonio
Author(s): dc.creator: Escobedo Cárdenas, Edwin Jonathan
Author(s): dc.creator: Cámara Chávez, Guillermo
Author(s): dc.creator: Miranda, Dayse Garcia
Date accepted: dc.date.accessioned: 2025-08-21T15:56:45Z
Date available: dc.date.available: 2025-08-21T15:56:45Z
Date issued: dc.date.issued: 2025-08-06
Date issued: dc.date.issued: 2024
Full source: dc.identifier: https://www.repositorio.ufop.br/handle/123456789/20748
Full source: dc.identifier: https://www.cys.cic.ipn.mx/ojs/index.php/CyS/article/view/5119/3974
Full source: dc.identifier: http://dx.doi.org/10.13053/CyS-29-2-5119
Source: dc.identifier.uri: http://educapes.capes.gov.br/handle/capes/1028193
Description: dc.description: The study of facial expressions in sign language has become a significant research area, as these expressions not only convey personal states but also enhance the meaning of signs within specific contexts. The absence of facial expressions during communication can lead to misinterpretations, underscoring the need for datasets that include facial expressions in sign language. To address this, we present the Facial-BSL dataset, which consists of videos capturing eight distinct facial expressions used in Brazilian Sign Language. Additionally, we propose a two-stream model designed to classify facial expressions in a sign language context. This model uses RGB images to capture local facial information and texture map images to record facial movements. We assessed the performance of several deep learning architectures within this two-stream framework, including Convolutional Neural Networks (CNNs) and Vision Transformers. In addition, experiments were conducted on public datasets such as CK+, KDEF-dyn, and LIBRAS. The two-stream architecture based on the Swin Transformer model demonstrated superior performance on the KDEF-dyn and LIBRAS datasets and achieved a second-place ranking on the CK+ dataset, with an accuracy of 97% and an F1-score of 95%.
Format: dc.format: application/pdf
Language: dc.language: en
Rights: dc.rights: restricted (restrito)
Keywords: dc.subject: Facial expressions in sign language
Keywords: dc.subject: Two-stream architecture
Keywords: dc.subject: Swin transformer
Title: dc.title: Facial expressions recognition in sign language based on a two-stream swin transformer model integrating RGB and texture map images.
Appears in collections: Repositório Institucional - UFOP

There are no files associated with this item.
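The abstract above describes a two-stream design in which one stream processes RGB frames and the other processes texture map images, with the streams combined for the final classification. A minimal sketch of how such a two-stream classifier might fuse its outputs is shown below; the per-stream logits, the fusion weight `alpha`, and the late-fusion-by-averaging scheme are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores to class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_streams(rgb_logits, texture_logits, alpha=0.5):
    """Weighted average of the RGB-stream and texture-map-stream
    class probabilities (a common late-fusion scheme; the actual
    fusion used in the paper is an assumption here)."""
    p_rgb = softmax(rgb_logits)
    p_tex = softmax(texture_logits)
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_rgb, p_tex)]

# Hypothetical logits for the eight facial-expression classes
# of the Facial-BSL dataset (values are made up for illustration).
rgb = [0.2, 1.5, 0.1, 0.0, 0.3, 0.2, 0.1, 0.4]
tex = [0.1, 0.9, 0.2, 0.1, 0.8, 0.0, 0.2, 0.3]
fused = fuse_streams(rgb, tex)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Each stream (e.g. a Swin Transformer backbone per modality) would produce the logits; the fused probabilities then give a single prediction over the eight expression classes.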