Oliver Hellwig scite author profile

Oliver Hellwig

5 Publications

63 Citation Statements Received

62 Citation Statements Given

Affiliations

University of Zurich, Heidelberg University, Heinrich Heine University Düsseldorf

Publications

Sanskrit Word Segmentation Using Character-level Recurrent and Convolutional Neural Networks

Hellwig, Nehrdich

2018

The paper introduces end-to-end neural network models that tokenize Sanskrit by jointly splitting compounds and resolving phonetic merges (Sandhi). Tokenization of Sanskrit depends on local phonetic and distant semantic features that are incorporated using convolutional and recurrent elements. Contrary to most previous systems, our models do not require feature engineering or external linguistic resources, but operate solely on parallel versions of raw and segmented text. The models discussed in this paper clearly improve over previous approaches to Sanskrit word segmentation. As they are language agnostic, we will demonstrate that they also outperform the state of the art for the related task of German compound splitting.

Evaluating Neural Morphological Taggers for Sanskrit

Gupta, Krishna, Goyal et al.

2020

Neural sequence labelling approaches have achieved state-of-the-art results in morphological tagging. We evaluate the efficacy of four standard sequence labelling models on Sanskrit, a morphologically rich, fusional Indian language. As its label space can theoretically contain more than 40,000 labels, systems that explicitly model the internal structure of a label are better suited to the task, because of their ability to generalise to labels not seen during training. We find that although some neural models perform better than others, one of the common causes of error for all of these models is misprediction due to syncretism.

An NLP-based cross-document approach to narrative structure discovery

Reiter, Frank, Hellwig

2014

Lit Linguist Computing

SanskritTagger: A Stochastic Lexical and POS Tagger for Sanskrit

Hellwig

2009

SanskritTagger is a stochastic tagger for unpreprocessed Sanskrit text. The tagger tokenises text with a Markov model and performs part-of-speech tagging with a Hidden Markov model. Parameters for these processes are estimated from a manually annotated corpus of currently about 1,500,000 words. The article sketches the tagging process, reports the results of tagging a few short passages of Sanskrit text and describes further improvements of the program. The article describes the design and function of SanskritTagger, a tokeniser and part-of-speech (POS) tagger, which analyses "natural", i.e. unannotated, Sanskrit text by repeated application of stochastic models. This tagger has been developed during the last few years as part of a larger project for the digitisation of Sanskrit texts (cf. Hellwig, 2002) and is still being steadily improved. The article is organised as follows: Section 1 gives a short overview of linguistic problems found in Sanskrit texts which influenced the design of the tagger. Section 2 describes the actual implementation of the tagger. In Section 3, the performance of the tagger is evaluated on short passages of text from different thematic areas. In addition, this section describes possible improvements in future versions.

Wörterbuch der mittelalterlichen indischen Alchemie

Hellwig

2009
