Information extraction from legal texts: the potential of discourse analysis

Moens, Marie‐Francine; Uyttendaele, Caroline; Dumortier, Jos

doi:10.1006/ijhc.1999.0296

Cited by 24 publications

(17 citation statements)

References 20 publications

(25 reference statements)

“…It allows representation of text structure in the form of a text specific grammar. The use of a text grammar is appealing for several reasons [6,19], the most notable being that many text types can be decomposed into a limited set of constituents that combine with one another in regular ways.…”

Section: Knowledge Representation (mentioning)

confidence: 99%

Use of a text grammar for generating highlight abstracts of magazine articles

Moens

Dumortier

2000

Journal of Documentation

Self Cite

Browsing a database of article abstracts is one way to select and buy relevant magazine articles online. Our research contributes to the design and development of text grammars for abstracting texts in unlimited subject domains. We developed a system that parses texts based on the text grammar of a specific text type and that extracts sentences and statements which are relevant for inclusion in the abstracts. The system employs knowledge of the discourse patterns that are typical of news stories. The results are encouraging and demonstrate the importance of discourse structures in text summarisation.

“…It is acknowledged that discourse structures are important when analysing text for summarisation [4][5][6]. The typical discourse patterns of news stories can be employed for abstracting magazine articles.…”

Section: Linguistic Background (mentioning)

confidence: 99%

“…A similar kind of exploitation of genre-specific structural conventions can be found in the SALOMON system of Moens et al (1999), which extracts relevant information from criminal cases (with the eventual goal of producing short indicative summaries). The system makes use of the fact that these cases have a highly conventionalized functional structure in which for example victim and perpetrator are identified in text segments preceding the one in which the alleged offences and the opinion of the court are detailed.…”

Section: Information Extraction (mentioning)

confidence: 99%
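
The idea in the statement above, that a case text follows a conventional order of functional segments, can be illustrated with a minimal sketch. The segment labels, cue phrases, and toy case below are assumptions made for illustration only; they are not taken from SALOMON, whose actual text grammar is not reproduced here.

```python
import re

# Assumed segment order for a toy criminal case (illustrative only).
GRAMMAR_ORDER = ["parties", "offences", "opinion"]

# Hypothetical cue phrases that open each segment.
CUES = {
    "parties": re.compile(r"^(the accused|the victim)\b", re.IGNORECASE),
    "offences": re.compile(r"^alleged offences\b", re.IGNORECASE),
    "opinion": re.compile(r"^opinion of the court\b", re.IGNORECASE),
}


def segment(lines):
    """Group lines into (label, [lines]) segments using the cue phrases."""
    segments = []
    for line in lines:
        label = next((name for name, cue in CUES.items() if cue.match(line)), None)
        if label is not None and (not segments or segments[-1][0] != label):
            # A cue for a new label opens a new segment.
            segments.append((label, [line]))
        elif segments:
            # Uncued lines, or repeated cues, extend the current segment.
            segments[-1][1].append(line)
        # Lines before the first cue are ignored in this sketch.
    return segments


def follows_grammar(segments):
    """True if each label occurs once and in the order the toy grammar gives."""
    positions = [GRAMMAR_ORDER.index(label) for label, _ in segments]
    return positions == sorted(positions) and len(set(positions)) == len(positions)


case = [
    "The accused: J. Doe",
    "The victim: R. Roe",
    "Alleged offences: theft",
    "committed on an unspecified date",
    "Opinion of the court: the facts are proven",
]
segments = segment(case)
labels = [label for label, _ in segments]
```

Once the segments are labelled, an extractor could restrict its search for, say, the offences to the matching segment instead of scanning the whole text.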

“…Mizuta et al (2006) use a flat discourse structure based on the discourse zoning of Teufel and Moens (2002) for IE from biology articles. While Moens et al (1999) assume that their legal texts have a hierarchical discourse structure that can be described in terms of a text grammar (Kintsch and van Dijk 1978), their work on IE from legal texts only use its sequential upper level. In contrast, Maslennikov and Chua's (2007) IE approach uses a hierarchical discourse structure.…”

Section: Information Extraction (mentioning)

confidence: 99%

“…One can identify very general communicative roles or intentions - e.g., assist another element in fulfilling its communicative role(s) - or more specific ones - e.g., present a conclusion drawn from previous discourse elements. There are also communicative roles that are specific to particular genres - e.g., identify the alleged offenses in textual descriptions of criminal cases (Moens, Uyttendaele and Dumortier 1999), and present advantage of methodology in scientific research papers. In fact, one aspect of many textual genres is their conventionalized high-level functional structure.…”

Section: Functions (mentioning)

confidence: 99%

Discourse structure and language technology

2011

An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, its nature may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases of which discourse is structured along with some of their formal properties. It then lays out the current state-of-the-art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology.

First steps in building a model for the retrieval of court decisions

Moens

Busser

2002

International Journal of Human-Computer Studies

Self Cite

The MOSAIC project investigates a retrieval model for court decisions based on structured and unstructured (natural language) information in legal cases. This paper focuses on how relevant information in court decisions can function as a key for retrieval and on the automated construction of case representations. Techniques of automated concept learning and rhetorical structure identification are among the most promising ones.

Automated retrieval from large document collections was one of the earliest applications of computer science to law. In 1961 the US Air Force contracted with the University of Pittsburgh for building a full-text retrieval system for legal documents. As a result, the Finding Legal Information Through Electronics (FLITE) system saw its first productive use in 1964. A few years later, the US Department of Justice developed the JURIS system, which has been in use since 1971. More recently, commercial systems such as LEXIS-NEXIS and Westlaw, which offer interactive retrieval through terminals at the customer's office, have gained widespread acceptance in circles of legal professionals. In the European Union, there are many databases of court decisions, most of which can be consulted by any citizen via the World Wide Web.

Present-day retrieval systems allow users to express their query with a set of key terms. In some systems, key terms can be used in combination with Boolean operators. The result of such a search is a list of documents. These are usually sorted by 'relevance', which most of the time is simply computed as a function of the frequency of occurrence of the search terms in the documents. Documents are returned as being relevant if they contain the query terms or if keywords that were manually assigned to them match the query terms. Thus, current commercial retrieval systems for searching court decisions rely either on a manually built index of cases or on a full-text search.

In manual indexing, texts or textual passages are linked to the concepts of a pre-defined thesaurus or classification scheme; case texts are linked by means of citation links or common descriptors. The disadvantages of manual indexing are the huge cost (a problem that will only worsen with the current growth in the number of cases) and the large number of inconsistencies between different indexers. In a full-text search, each term (except for stopwords*) acts as a search key. The major disadvantage of a full-text search is that its retrieval results are quite unreliable, since the occurrence of a particular key word or key phrase in a text is no guarantee that the text is a relevant output for the search request. This is a fundamental and rather problematic issue, which bears on most retrieval tasks but is especially important when retrieving legal cases. You cannot search for meaning in a case solely by taking into account word occurrences. Another disadvantage of full-text searches is that they often overwhelm users with documents that are not or only marginally relevant.

* Stopwords are small words, like articl...
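
The frequency-based 'relevance' described above can be made concrete with a minimal sketch. The stopword list, the scoring function (a plain sum of query-term counts), and the three toy documents are all assumptions for illustration; no real retrieval system's ranking formula is implied.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "or", "to", "is", "was"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def rank(query, documents):
    """Rank documents by the summed frequency of the query terms.

    Documents that contain none of the query terms are not returned.
    """
    terms = tokenize(query)
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in terms)
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical mini-collection of case texts.
docs = {
    "case_a": "The court found the theft proven. Theft of a vehicle.",
    "case_b": "The witness mentioned a theft in passing.",
    "case_c": "A dispute over a lease.",
}
results = rank("theft", docs)
```

In this toy collection `case_b` is still returned although the query term appears only in passing, which is the weakness of pure word-occurrence matching that the passage points out.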
