Abstract: Corpus-based terminology is currently gaining ground internationally. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development but also follow the trend, even if they do not have the same degree of access to highly sophisticated software. The aim of this article is therefore to establish whether definitional information on key concepts can be retrieved from untagged running text using affordable and easily accessible software such as WordSmith Tools. To answer this question, a case study is conducted in Northern Sotho, using textual material on linguistics as the basis for a special-field corpus. Syntactic and lexical patterns that serve as textual markers of definitional information are identified, and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of the semi-automatic retrieval of definitional information. Finally, the article illustrates how the retrieved definitional information can be used in writing a formal terminological definition.

Keywords: TERMINOLOGY, SOUTH AFRICAN BANTU LANGUAGES, DEFINITIONAL INFORMATION, SEMI-AUTOMATIC INFORMATION RETRIEVAL, TERMINOLOGICAL DEFINITIONS, CONCEPTUAL RELATIONSHIPS, LEXICAL PATTERNS, SYNTACTIC PATTERNS, TEXTUAL MARKERS, KEYWORD-IN-CONTEXT (KWIC), WORDSMITH TOOLS

Summary: Semi-automatic retrieval of definitional information: A Northern Sotho case study. Corpus-based terminology is currently gaining ground internationally. It is therefore important that terminologists working within the South African Bantu languages not only take note of this development but also follow the trend, even if they do not have the same degree of access to sophisticated computer software. The aim of this article is therefore to establish whether definitional information on key concepts can be retrieved from untagged running text using affordable and accessible software such as WordSmith Tools. To answer this question, a case study was conducted in Northern Sotho, using textual material on linguistics as the basis for a specialised corpus. Syntactic and lexical patterns that serve as textual markers of defini-*