1996
DOI: 10.1108/eb026966

A Stemming Algorithm for Latin Text Databases

Abstract: This paper describes the design of a stemming algorithm for searching databases of Latin text. The algorithm uses a simple longest‐match approach with some recoding but differs from most stemmers in its use of two separate suffix dictionaries (one for nouns and adjectives and one for verbs) for processing query and database words. These dictionaries and the associated stemming rules are arranged in such a way that the stemmer does not need to know the grammatical category of the word that is being stemmed. It …
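
The two-dictionary idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the suffix lists are small hypothetical subsets chosen for the example, the names `strip_longest` and `stem_both` and the `min_stem` threshold are assumptions, and the recoding step mentioned in the abstract is omitted. The sketch assumes that each word is stripped once against each dictionary, so that both candidate stems are available and no part-of-speech decision is needed.

```python
# Hypothetical, abbreviated suffix dictionaries for illustration only;
# they are not the lists published in the paper.
NOUN_SUFFIXES = ["ibus", "ius", "ae", "am", "as", "em", "es", "is",
                 "os", "um", "us", "a", "e", "i", "o", "u"]
VERB_SUFFIXES = ["untur", "erunt", "mus", "tis", "tur", "unt",
                 "nt", "m", "r", "s", "t"]


def strip_longest(word, suffixes, min_stem=2):
    """Remove the longest matching suffix, keeping at least min_stem letters.

    min_stem is an assumed safeguard against over-stripping short words.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word  # no suffix matched: leave the word unchanged


def stem_both(word):
    """Return (noun/adjective stem, verb stem) without guessing the category."""
    word = word.lower()
    return (strip_longest(word, NOUN_SUFFIXES),
            strip_longest(word, VERB_SUFFIXES))


print(stem_both("humerus"))  # ('humer', 'humeru')
print(stem_both("amant"))    # ('amant', 'ama')
```

Under this reading, a query word and a database word match if they agree on the relevant stem, whichever dictionary produced it.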

Cited by 28 publications (17 citation statements)
References 14 publications
“…On the one hand, there has been generally less IR work done in these languages and on the other hand, the application of stemming algorithms requires the implementation of considerable linguistic knowledge, which is not always available. In any case, it is possible to find proposals and algorithms for specific languages, among which are Latin itself, despite its being a dead language [7], Malay [8], French [9], [10] or Arabic [11].…”
mentioning
confidence: 99%
“…We have also implemented a manual approach in read_ontology that inputs a user‐defined ontology as a text file. To link characters to terms in the ontology, we first simplify terms using the Schinke algorithm (Schinke, Greengrass, Robertson, & Willett, ), useful for Latin terms common in anatomical datasets (e.g. ‘humerus’ becomes ‘humer’).…”
Section: Overview Of the Phenotools Package
mentioning
confidence: 99%
“…Even though the Longest-match approach requires the compilation of all possible combinations of suffixes; it has less computational complexity because the arrangement of suffixes in suffix list are in their decreasing order of length and has less time complexity because it involves in single pass of the suffix match. In addition, longest-match approach is often easier to program [5,10].…”
Section: Stemming Approach
mentioning
confidence: 99%
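
The point about ordering in the statement above can be illustrated with a short sketch. It assumes a hypothetical three-entry suffix list and a helper named `first_match`, neither of which comes from the cited papers. When the list is sorted by decreasing length, the first suffix that matches in a single pass is also the longest one.

```python
def first_match(word, suffixes):
    """Strip the first suffix, in list order, that the word ends with."""
    for suffix in suffixes:
        if suffix and word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word  # nothing matched


# Hypothetical suffix list, deliberately left in an unhelpful order.
SUFFIXES = ["s", "es", "ibus"]
BY_LENGTH = sorted(SUFFIXES, key=len, reverse=True)  # ['ibus', 'es', 's']

print(first_match("regibus", SUFFIXES))   # regibu (short suffix wins by accident)
print(first_match("regibus", BY_LENGTH))  # reg    (longest match found first)
```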