Résumés
Résumé
À partir d’une analyse des 829 résumés et titres publiés de la revue Management international de 2009 à 2023, cette note de recherche souligne le défi que représente l’interprétation de données textuelles non structurées et propose un outil d’analyse automatisé. L’étude utilise la modélisation des thèmes pour dévoiler les structures thématiques cachées à l’aide de la méthode LDA (Latent Dirichlet Allocation), contribuant à l’affinement du cadre thématique de la revue. Il en ressort les étapes de pré-traitement, la validation et la visualisation des données comme des aspects cruciaux de la conduite d’une analyse avec cette méthode. Cette étude propose un ensemble de bonnes pratiques en matière de modélisation thématique permettant d’identifier les tendances afin d’informer et éventuellement actualiser les stratégies éditoriales.
Mots-clés :
- analyse de données textuelles,
- modélisation des thèmes,
- LDA,
- analyse de contenu,
- traitement du langage naturel (NLP),
- nuages de mots
Abstract
Based on an analysis of 829 abstracts and titles published in Management international between 2009 and 2023, this research note highlights the challenge of interpreting unstructured textual data and proposes an automated analysis tool. The study uses topic modelling, specifically Latent Dirichlet Allocation (LDA), to uncover hidden thematic structures, which helps refine the journal's thematic framework. Data pre-processing, validation and visualisation emerge as crucial steps in conducting an analysis with this method. The study proposes a set of best practices for topic modelling that make it possible to identify trends in order to inform, and where appropriate update, editorial strategies.
Keywords:
- textual data analysis,
- topic modelling,
- LDA,
- content analysis,
- natural language processing (NLP),
- word clouds
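To make the pipeline named in the abstract concrete, here is a minimal sketch of pre-processing followed by LDA topic modelling. It is an illustration under stated assumptions, not the authors' tool: the abstract does not say which implementation was used, so the sketch swaps in a small collapsed Gibbs sampler in plain Python. The toy corpus, the stop-word list, the function names (`preprocess`, `fit_lda`) and the hyperparameter values are all hypothetical.

```python
import random
import re
from collections import Counter

# Hypothetical toy corpus standing in for article titles and abstracts.
DOCS = [
    "Family firms and governance: board structure in family firms",
    "Governance of family firms and board independence",
    "Innovation networks and technology transfer in clusters",
    "Technology clusters, innovation and knowledge networks",
]
STOP_WORDS = {"and", "of", "in", "the", "a"}  # illustrative list only


def preprocess(text):
    """Lowercase, keep runs of ASCII letters, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def fit_lda(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (topic_word, doc_topic): per-topic word counts and
    per-document topic counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, doc_topic


if __name__ == "__main__":
    tokenised = [preprocess(text) for text in DOCS]
    topic_word, doc_topic = fit_lda(tokenised)
    for k, counts in enumerate(topic_word):
        print(f"topic {k}:", [w for w, _ in counts.most_common(4)])
```

The most frequent words per topic are what a word cloud or similar visualisation would then display. On a real corpus, the number of topics and the pre-processing choices would need to be validated rather than fixed in advance as they are here, and the tokeniser would need to handle accented characters.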
Resumen
Sobre la base de un análisis de 829 resúmenes y títulos publicados en la revista Management international entre 2009 y 2023, este informe de investigación pone de relieve el desafío de interpretar datos textuales no estructurados y propone una herramienta de análisis automatizada. El estudio utiliza la modelización de temas para descubrir estructuras temáticas ocultas sirviéndose del método de la Asignación Latente de Dirichlet (ALD), lo que contribuye al ajuste del marco temático de la revista. Se destacan las etapas de preprocesamiento, validación y visualización de datos como aspectos cruciales para realizar un análisis con este método. Este estudio propone un conjunto de mejores prácticas en materia de modelización temática para identificar tendencias con el fin de informar y, eventualmente, actualizar las estrategias editoriales.
Palabras clave:
- análisis de datos textuales,
- modelización de temas,
- modelización temática,
- LDA,
- análisis de contenido,
- procesamiento del lenguaje natural (NLP),
- nubes de palabras