Ana V. Coronado scite author profile

Level statistics of words: Finding keywords in literary texts and symbolic sequences

Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to extract automatically keywords in literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method works also in generic symbolic sequences (continuous texts without spaces), thus suggesting its general applicability.
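The clustering idea in this abstract can be illustrated with a rough sketch. The code below is not the authors' statistic: it assumes a simplified score, the standard deviation of the gaps between consecutive occurrences of a word divided by the mean gap. The function name `clustering_scores` and the `min_count` parameter are hypothetical. Evenly or randomly spread words should score low, while words that appear in bursts should score high.

```python
from collections import defaultdict
import statistics


def clustering_scores(tokens, min_count=5):
    """Score each word by how unevenly its occurrences are spaced.

    Simplified illustration: score = std(gaps) / mean(gaps), where gaps are
    the distances between consecutive occurrences of the word.
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    scores = {}
    for word, where in positions.items():
        if len(where) < min_count:
            continue  # too few occurrences for a meaningful spacing statistic
        gaps = [b - a for a, b in zip(where, where[1:])]
        scores[word] = statistics.pstdev(gaps) / statistics.mean(gaps)
    return scores


# Toy sequence: "the" is perfectly regular, "key" occurs in two tight bursts.
tokens = ["x"] * 200
for i in range(0, 200, 10):
    tokens[i] = "the"
for i in (51, 52, 53, 54, 151, 152, 153, 154):
    tokens[i] = "key"

scores = clustering_scores(tokens)
for word in sorted(scores, key=scores.get, reverse=True):
    print(word, round(scores[word], 2))
```

Because the score uses only positions within the text itself, it needs no reference corpus, which matches the single-document setting the abstract describes.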

Identifying characteristic scales in the human genome

Carpena, Bernaola‐Galván, Coronado et al. 2007

Phys. Rev. E

The scale-free, long-range correlations detected in DNA sequences contrast with characteristic lengths of genomic elements, being particularly incompatible with the isochores (long, homogeneous DNA segments). By computing the local behavior of the scaling exponent alpha of detrended fluctuation analysis (DFA), we discriminate between sequences with and without true scaling, and we find that no single scaling exists in the human genome. Instead, human chromosomes show a common compositional structure with two characteristic scales, the large one corresponding to the isochores and the other to small and medium scale genomic elements.

Size Effects on Correlation Measures

Coronado, Carpena 2005

J Biol Phys

The detection and quantification of long-range correlations in time series is a fundamental tool to characterize the properties of different dynamical systems, and is applied in many different fields, including physics, biology or engineering. Due to the diversity of applications, many techniques for measuring correlations have been designed. Here, we study systematically the influence of the length of a time series on the results obtained from several techniques commonly used to detect and quantify long-range correlations: the autocorrelation analysis, Hurst's analysis, and detrended fluctuation analysis (DFA). Using the Fourier filtering method, we generate artificial time series with known and controlled long-range correlations and with a broad range of lengths, and apply on them the different correlation measures we have studied. Our results indicate that while the DFA method is practically unaffected by the length of the time series, and almost always provides accurate results, the results from Hurst's analysis and the autocorrelation analysis strongly depend on the length of the time series.
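For readers unfamiliar with DFA, the following is a minimal sketch of the standard procedure with linear detrending, written in plain Python. It is not the authors' code, and it does not implement the Fourier filtering method mentioned in the abstract; the test signals here are simply white noise and its running sum. The function names, the chosen window sizes, and the series length are assumptions made for illustration.

```python
import math
import random


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, scales):
    """Estimate the DFA scaling exponent alpha with linear detrending."""
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-subtracted series to get the profile.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_scales, log_fluct = [], []
    for n in scales:
        xs = list(range(n))
        n_windows = len(profile) // n
        squared_residuals = 0.0
        # Step 2: remove a linear trend in each non-overlapping window.
        for w in range(n_windows):
            window = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(xs, window)
            squared_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, window)
            )
        # Step 3: root-mean-square fluctuation F(n) at this window size.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        log_scales.append(math.log(n))
        log_fluct.append(math.log(fluctuation))

    # Step 4: alpha is the slope of log F(n) against log n.
    alpha, _ = linear_fit(log_scales, log_fluct)
    return alpha


rng = random.Random(0)  # fixed seed so the run is repeatable
white_noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
random_walk, position = [], 0.0
for step in white_noise:
    position += step
    random_walk.append(position)

scales = [8, 16, 32, 64, 128]
alpha_noise = dfa_exponent(white_noise, scales)
alpha_walk = dfa_exponent(random_walk, scales)
print(f"white noise: {alpha_noise:.2f}, random walk: {alpha_walk:.2f}")
```

Uncorrelated noise is expected to give an exponent near 0.5 and its running sum one near 1.5. Rerunning the sketch with shorter or longer series is one simple way to explore the length dependence the abstract discusses.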

Improving statistical keyword detection in short texts: Entropic and clustering approaches

Carretero-Campos, Bernaola‐Galván, Coronado et al. 2013

Physica A: Statistical Mechanics and its Applications

Segmentation of time series with long-range fractal correlations

et al. 2012
