This research aimed to prepare guidelines for authors by investigating the forms and functions of author-assigned keywords in theses and dissertations defended in 2023 in the Graduate Program in Information Science at Unesp. The exploratory and descriptive study used a sample collected from the Unesp Institutional Repository. The corpus of 31 theses and 14 dissertations yielded 213 keywords in Portuguese counting duplicates (an average of 4.7 per document) and 183 once duplicates were removed. The analysis first identified that the Repository offers a tutorial on using the Unesp Thesaurus for vocabulary control and that authors use natural language to assign keywords. The findings reveal that, of the 183 keywords, 89 (48%) are exclusive, singular and specific to the area of Information Science, making them candidates for descriptors in the Unesp Thesaurus. Of the other 94 keywords (51.3%), 40 (21.3%) are exact descriptors, and the remaining 54 (29.5%) present forms and functions that serve as examples for inclusion in the tutorial instructions. It is concluded that the 21% overlap between keywords and descriptors reveals that authors consulted the Unesp Thesaurus when filling out keyword metadata, and that the low number of exact descriptors and the presence of exclusive keywords indicate that the latter need to be included as new terms. It is therefore recommended to define an Indexing Policy that considers the need for hybrid coexistence between natural language and vocabulary control.
Keywords:
- Theses and dissertations
- Keywords
- Author-supplied keywords
- Controlled vocabularies
- Institutional repository
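The comparison summarized in the abstract (deduplicating author keywords, checking them against thesaurus descriptors, and reporting shares) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the helper names `normalize`, `split_by_thesaurus` and `share`, the accent-stripping normalization, and the toy keyword and descriptor lists are all assumptions made for illustration. Only the counts 213, 31 and 14 come from the abstract.

```python
import unicodedata


def normalize(term: str) -> str:
    """Lowercase, collapse whitespace and strip accents so that trivially
    different spellings compare equal (an assumed matching rule)."""
    decomposed = unicodedata.normalize("NFKD", term)
    without_accents = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return " ".join(without_accents.lower().split())


def split_by_thesaurus(keywords, descriptors):
    """Deduplicate keywords, then split them into exact descriptor
    matches and remaining natural-language terms."""
    unique = {normalize(k) for k in keywords}
    controlled = {normalize(d) for d in descriptors}
    exact = unique & controlled
    return unique, exact, unique - exact


def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical toy data, not taken from the study's corpus.
keywords = ["Indexação", "indexação ", "Tesauro", "Curadoria digital", "Folksonomia"]
descriptors = ["Indexação", "Tesauro", "Bibliotecas"]

unique, exact, free = split_by_thesaurus(keywords, descriptors)
print(len(unique), sorted(exact), sorted(free))
print(share(len(exact), len(unique)))

# Counts reported in the abstract: 213 keywords across 31 + 14 documents.
print(round(213 / (31 + 14), 1))
```

Stripping accents makes matching more forgiving but could conflate distinct Portuguese terms, so a real comparison might prefer stricter normalization or a manual review of near matches.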