Evaluating the Effectiveness of the Lexrank and LSA Algorithm in Automatic Text Summarization for Indonesian Language

Galih Wiratmoko

doi:10.59188/eduvest.v5i2.1663

Authors

Galih Wiratmoko, Universitas Muhammadiyah Surakarta

DOI:

https://doi.org/10.59188/eduvest.v5i2.1663

Keywords:

Automatic text summarization, Latent semantic analysis, LexRank, Bahasa Indonesia

Abstract

This study evaluates how effective the LexRank algorithm and Latent Semantic Analysis (LSA) are at automatic text summarization for the Indonesian language. The work falls within natural language processing and addresses the handling of excessive volumes of data. Both algorithms were applied to generate summaries on the INDOSUM dataset, which contains about 20,000 Indonesian news articles paired with manually written summaries. Performance was assessed with the ROUGE metric, reported as precision, recall, and F1 score. LSA outperformed LexRank on every metric tested: LSA achieved a precision of 0.57, a recall of 0.67, and an F1 score of 0.59, whereas LexRank achieved a precision of 0.46, a recall of 0.52, and an F1 score of 0.48. These results indicate that LSA captures the important information in the original text better than LexRank.
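
To make the three reported quantities concrete, the following is a minimal sketch of a ROUGE-1 style computation (unigram overlap) between a generated summary and a reference summary. It is an illustration only: the abstract does not state which ROUGE variant or tooling was used, and the function name `rouge_1`, the whitespace tokenization, and the example sentences are all assumptions introduced here.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (hypothetical helper).

    Assumption: lowercased whitespace tokens, no stemming or stopword removal.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens)  # share of the candidate found in the reference
    recall = overlap / len(ref_tokens)      # share of the reference covered by the candidate
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up Indonesian sentences, used purely for illustration.
scores = rouge_1(
    "pemerintah menaikkan harga bbm",
    "pemerintah resmi menaikkan harga bbm bersubsidi",
)
print(scores)
```

Corpus-level scores are typically averages of such per-document scores, so a reported F1 need not equal the harmonic mean of the reported precision and recall.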

