ISSN: 0186-1042 ISSN-e: 2448-8410
Methodology for the management of scientific literature in Spanish through information retrieval approaches using natural language processing

Keywords

information management
natural language processing in Spanish texts
text-similarity
semantic and probabilistic approaches
recovering scientific documents

How to cite

Padilla Cuevas, J., Cuevas-Rasgado, A. D., Reyes-Ortiz, J. A., & Bravo, M. (2026). Methodology for the management of scientific literature in Spanish through information retrieval approaches using natural language processing. Contaduría Y Administración, 71(2), e552. https://doi.org/10.22201/fca.24488410e.2026.5282 (Original work published February 17, 2026)

Abstract

Scientific papers are written in natural language, a significant proportion of them in Spanish, and lack a structure that computers can process, which makes manual analysis tedious and time-consuming. Managing scientific texts in Spanish is therefore a challenge that requires advanced computational methods. This paper presents a novel methodology comprising three Information Retrieval (IR) approaches based on Natural Language Processing (NLP). Its main aim is to manage information from scientific documents in Spanish. The IR approaches implemented in the methodology rely on textual, probabilistic, and semantic similarity to retrieve documents relevant to a question. The proposed methodology is applied to the Spanish-language scientific literature generated during the COVID-19 pandemic. An evaluation based on 100 queries over 249,474 scientific documents was carried out to assess how well relevant documents are retrieved. The results show that the probabilistic approach, supported by the Latent Dirichlet Allocation (LDA) topic discovery algorithm, achieved an F-measure of 85%. Finally, the proposed methodology is considered domain-independent for retrieving documents in Spanish.
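
To make the textual-similarity idea and the F-measure concrete, the following is a minimal sketch only, not the authors' implementation. It assumes a TF-IDF weighting with cosine similarity as a stand-in for the textual approach, which the abstract does not specify, and it omits the probabilistic (LDA) and semantic approaches entirely. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace. A real Spanish pipeline would
    # also remove stop words and lemmatize; that is omitted here.
    return text.lower().split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, docs, top_k=2):
    # Rank documents by similarity to the query; return their indices.
    vectors, idf = tfidf_vectors(docs)
    counts = Counter(tokenize(query))
    q = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = sorted(
        ((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True
    )
    return [i for score, i in scored[:top_k] if score > 0]


def f_measure(retrieved, relevant):
    # Harmonic mean of precision and recall (F1).
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    if tp == 0:
        return 0.0
    precision = tp / len(retrieved)
    recall = tp / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Toy corpus (hypothetical, not taken from the paper's dataset).
docs = [
    "vacunas contra el covid-19 en adultos mayores",
    "impacto económico de la pandemia en pequeñas empresas",
    "eficacia de las vacunas de arn mensajero",
]
ranked = retrieve("eficacia de vacunas", docs, top_k=2)
score = f_measure(ranked, relevant=[0, 2])
```

In an evaluation like the one described, a score of this kind would be computed per query against human relevance judgments and then averaged over all queries; the averaging scheme used in the paper is not stated in the abstract.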

https://doi.org/10.22201/fca.24488410e.2026.5282

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright 2026 Contaduría y Administración
