1999
DOI: 10.1023/a:1007506220214

Statistical Models for Text Segmentation

Abstract: This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which m…

Citations: 436 publications (32 citation statements)
References: 25 publications
“…To begin with, we calculate the traditional mutual information entropy [21] for x and y. Here the frequency is the number of times that herb x and herb y occur together, and I(x, y, i) is the indicator function of x and y, showing whether herb x and y coexist in formula i.…”
Section: Methods (mentioning)
confidence: 99%
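The formula itself did not survive extraction in the excerpt above. As a hedged illustration only, the sketch below computes a standard pointwise mutual information from indicator-based co-occurrence counts over a collection of formulas; the function name `pmi`, the list-of-sets representation, and the example data are assumptions made here for illustration, not taken from the cited work.

```python
import math

def pmi(x, y, formulas):
    """Pointwise mutual information of herbs x and y over a list of formulas,
    where each formula is a set of herbs.  The co-occurrence count plays the
    role of sum_i I(x, y, i), the indicator that x and y coexist in formula i."""
    n = len(formulas)
    fx = sum(1 for f in formulas if x in f)                  # occurrences of x
    fy = sum(1 for f in formulas if y in f)                  # occurrences of y
    fxy = sum(1 for f in formulas if x in f and y in f)      # co-occurrences
    if fxy == 0 or fx == 0 or fy == 0:
        return float("-inf")  # the pair (or one herb) never appears
    # PMI = log[ p(x, y) / (p(x) p(y)) ] with maximum-likelihood estimates
    return math.log((fxy / n) / ((fx / n) * (fy / n)))

# Example: three small formulas over a toy herb vocabulary
formulas = [{"ginseng", "licorice"}, {"ginseng", "ginger"},
            {"licorice", "ginger", "ginseng"}]
print(pmi("ginseng", "licorice", formulas))
```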
“…Bolzano (1) already noted the need for specific organization in scientific texts, while Ingarden devotes his book (2) to understanding the process by which a text is understood and assimilated. Modern methods (3,4) combine the work of linguists with that of computer scientists, physicists, physiologists, and researchers from many other fields to cover a wide range of texts, from the phoneme (5), going on to words (6-9) and grammar (10,11), all the way to global text analysis (12) and the evolution of language (13,14).…”
mentioning
confidence: 99%
“…The performance of the algorithms applied to the annotated corpora was calculated using four widely known metrics: Precision, Recall, Beeferman's Pk [35] and WindowDiff [36]. For the segmentation task, Precision is defined as the proportion of boundaries chosen that agree with a reference segmentation.…”
Section: Evaluation Metrics (mentioning)
confidence: 99%
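As a point of reference for the boundary-level definitions quoted above, here is a minimal sketch of segmentation Precision and Recall, assuming segmentations are given as sets of boundary positions; the representation and the function name are illustrative, not taken from the cited papers.

```python
def precision_recall(reference, hypothesis):
    """Boundary-level precision and recall for text segmentation.
    `reference` and `hypothesis` are sets of boundary positions
    (e.g. indices of the gaps between sentences where a segment ends)."""
    ref, hyp = set(reference), set(hypothesis)
    correct = len(ref & hyp)                         # proposed boundaries that match the reference
    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall

# Example: reference boundaries after units 3 and 7; hypothesis guesses 3 and 8.
# The near-miss at 8 counts as a full miss, which is the weakness noted below.
print(precision_recall({3, 7}, {3, 8}))  # (0.5, 0.5)
```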
“…However, both metrics suffer from the fact that they penalize near-misses of boundaries as full misses, causing them to drastically overestimate the error. Beeferman's Pk [35] metric attempts to correct the erroneous calculation of penalties performed by Precision and Recall by computing penalties using a sliding window of size k across the text, where k is defined as half of the mean reference segment size. Penalties are calculated by taking into account both the number of windows and whether boundaries appear in different segments in the reference and in the hypothesis segmentations for every window examined.…”
Section: Evaluation Metrics (mentioning)
confidence: 99%
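The description of Pk above maps fairly directly onto code. Below is a hedged sketch of the metric as it is commonly implemented, assuming segmentations are given as lists of segment lengths; details such as the rounding of k and the exact window count vary between implementations, and the names used here are illustrative rather than from the cited papers.

```python
def pk(reference, hypothesis, k=None):
    """Beeferman's Pk.  `reference` and `hypothesis` are sequences of segment
    lengths (e.g. [3, 4, 2] for a 9-unit text split 3/4/2).  k defaults to half
    the mean reference segment length, as in the excerpt above."""
    def labels(masses):
        # Map each text unit to the index of the segment it belongs to.
        out = []
        for seg_id, length in enumerate(masses):
            out.extend([seg_id] * length)
        return out

    ref, hyp = labels(reference), labels(hypothesis)
    assert len(ref) == len(hyp), "segmentations must cover the same text"
    if k is None:
        k = max(1, round(len(ref) / len(reference) / 2))

    disagreements = 0
    windows = len(ref) - k
    for i in range(windows):
        same_ref = ref[i] == ref[i + k]
        same_hyp = hyp[i] == hyp[i + k]
        if same_ref != same_hyp:   # the two ends of the probe fall in the same
            disagreements += 1     # segment in one segmentation but not the other
    return disagreements / windows if windows > 0 else 0.0

# Example: a 10-unit text where the hypothesis misses the second reference
# boundary by one unit; Pk charges only a small penalty for the near-miss.
print(pk([5, 3, 2], [5, 4, 1]))  # 0.125
```

Unlike Precision and Recall, the near-miss in this example is penalized only for the few probe windows that straddle the misplaced boundary, which is exactly the correction the quoted passage describes.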