Categorization and Integration of Opinion Columns Content in Web Pages Applying Natural Language Processing Techniques
This article presents the application of Natural Language Processing techniques to text analysis, describing the process from data extraction through the automatic identification and detection of opinions. The texts analyzed were opinion columns that reflect people's views on current issues. The aim is to provide an agile way to identify topics of interest in the community and to give interested readers a summary of what is being said about those topics. To this end, one algorithm was implemented to extract information accurately and cleanly from web pages, and a second algorithm performs the automatic categorization of the extracted information, generating an accurate summary of the main topics in each text.
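The abstract does not say which extraction or categorization methods were used, so the following is only a minimal sketch of the two-stage idea under stated assumptions: paragraph text is pulled from a page with Python's standard-library HTML parser, and a category is assigned by keyword overlap. The names `STOPWORDS`, `CATEGORIES`, `ParagraphExtractor`, `extract_text`, `top_terms` and `categorize` are hypothetical and are not taken from the article.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stopword list and category keywords, for illustration only.
STOPWORDS = {"the", "and", "for", "that", "this", "with", "are", "its"}
CATEGORIES = {
    "economy": {"inflation", "taxes", "budget", "prices"},
    "health": {"hospital", "vaccine", "clinic", "patients"},
}


class ParagraphExtractor(HTMLParser):
    """Collect only the text that appears inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False
            self.chunks.append(" ")  # keep adjacent paragraphs separated

    def handle_data(self, data):
        if self.in_paragraph:
            self.chunks.append(data)


def extract_text(html):
    """Stage 1: return the page's paragraph text with whitespace normalized."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[^\W\d_]+", text.lower())


def top_terms(text, n=5):
    """Return up to n of the most frequent non-stopword terms, as a crude topic summary."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]


def categorize(text):
    """Stage 2: pick the category whose keyword set overlaps the text most."""
    words = set(tokenize(text))
    scores = {name: len(words & keywords) for name, keywords in CATEGORIES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"
```

Calling `extract_text` on a page's HTML and passing the result to `categorize` and `top_terms` yields a category label and a short list of frequent terms. A real system would likely replace the keyword overlap with a trained classifier and a language-appropriate stopword list.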
Accepted 2023-09-06
Published 2023-06-26
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors grant the journal and Universidad del Valle the economic rights over accepted manuscripts, but may make any reuse they deem appropriate for professional, educational, academic or scientific reasons, in accordance with the terms of the license granted by the journal to all its articles.