Genevieve Lallich scite author profile

Genevieve Lallich

2 Publications

1 Citation Statement Received

16 Citation Statements Given

Affiliations

Pierre Mendès-France University, Claude Bernard University Lyon 1

Publications

The deconstruction of a text: the permanence of the generalized Zipf law—the inter-textual relationship between entropy and effort amount

2015

Zipf's law has intrigued people for a long time. This distribution models a certain type of statistical regularity observed in a text. George K. Zipf showed that, if a word is characterised by its frequency, then rank and frequency are not independent and approximately verify the relationship: rank × frequency ≈ constant. Various explanations have been advanced for this law. In this article, we discuss the Mandelbrot process, which includes two very different approaches. In the first approach, Mandelbrot studies language generation as the transmission of a signal and bases it on information theory, using the concept of entropy. In the second, geometric approach, he draws a parallel with fractal theory, where each word of the text is a sequence of characters framed by two separators, that is, a simple geometric pattern. This leads us to hypothesise that, since the statistical regularities observed have several possible explanations, Zipf's law carries other patterns. To verify this hypothesis, we chose a text, which we modified and degraded in several successive stages. We called T_i the text degraded at step i. We then segmented T_i into words. We found that rank and frequency were not independent and approximately verified a relationship, Eq. (1), whose coefficient b_i increases with each step i. We call Eq. (1) the generalized Zipf law. We found statistical regularities in the deconstruction of the text. We notably observed a linear relationship between the entropy H_i and the amount of effort E_i of the various degraded texts T_i. To verify our assumptions, we degraded a text of approximately 200 pages. At each step, we calculated various parameters such as entropy, the amount of effort, and the
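The rank-frequency regularity described in this abstract can be illustrated with a short Python sketch. It is not the authors' procedure: the function names are hypothetical, the power-law form frequency ≈ C / rank^b is an assumption (the abstract does not spell out its Eq. (1)), and the exponent is estimated with an ordinary least-squares fit on log-log values. The entropy shown is the standard Shannon entropy of the word distribution; the "amount of effort" is not defined in the abstract, so it is left out.

```python
import math
from collections import Counter


def rank_frequency(words):
    """Return (rank, frequency) pairs, rank 1 being the most frequent word."""
    freqs = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def fit_zipf_exponent(pairs):
    """Estimate b in the ASSUMED form frequency ~ C / rank**b.

    Uses a least-squares line through (log rank, log frequency);
    b is the negated slope. Needs at least two distinct ranks.
    """
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def shannon_entropy(words):
    """Shannon entropy (in bits) of the empirical word distribution."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Toy text whose word frequencies follow 12 / rank exactly.
    toy = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
    pairs = rank_frequency(toy)
    print(pairs)
    print(fit_zipf_exponent(pairs))
    print(shannon_entropy(toy))
```

Applying the same three functions to each degraded text T_i would give one exponent and one entropy value per step, which is the kind of per-step comparison the abstract describes.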

Talisman

Stefanini, Berrendonner, Lallich, et al. 1992

Natural language processing raises the problem of ambiguities and the multiple solutions that follow from them. The knowledge gained when using the morphosyntactic analyser CRISSTAL showed how necessary it was to overcome this issue. The architecture with sequential levels, in which each module corresponds to a linguistic level (pretreatments, morphology, syntax, semantics), has shown its limits. A sequential architecture does not allow a real exchange between different modules. As a result, the linguistic information needed to reduce ambiguities is not available at the moment it is needed. The necessity for cooperation between different modules has led us to envisage a new architecture which stems from the techniques of distributed artificial intelligence. Keywords: integration environment for linguistic tools, natural language, written French, distributed artificial intelligence, multi-agent systems, law-governed systems, communication protocol. Proc. of COLING-92, Nantes, Aug. 23-28, 1992.
