Patrick Hagge Cording scite author profile

Given a string S of length N over a fixed alphabet of σ symbols, a grammar compressor produces a context-free grammar G of size n that generates S and only S. In this paper we describe data structures to support the following operations on a grammar-compressed string: rank_c(S, i) (return the number of occurrences of symbol c before position i in S); select_c(S, i) (return the position of the i-th occurrence of c in S); and access(S, i, j) (return the substring S[i, j]). For rank and select we describe data structures of size O(nσ log N) bits that support the two operations in O(log N) time. We propose another structure that uses O(nσ log(N/n) (log N)^{1+ε}) bits and supports the two queries in O(log N / log log N) time, where ε > 0 is an arbitrary constant. To our knowledge, we are the first to study the asymptotic complexity of rank and select in the grammar-compressed setting, and we provide a hardness result showing that significantly improving the bounds we achieve would imply a major breakthrough on a hard graph-theoretical problem. Our main result for access is a method that requires O(n log N) bits of space and O(log N + m / log_σ N) time to extract m = j − i + 1 consecutive symbols from S. Alternatively, we can achieve O(log N / log log N + m / log_σ N) query time using O(n log(N/n) (log N)^{1+ε}) bits of space. This matches a lower bound stated by Verbin and Yu for strings where N is polynomially related to n.
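To make the rank and access operations concrete, here is a minimal Python sketch on a straight-line program (a grammar in which every rule is either a single terminal or a pair of rule names). It is not the data structure from the paper: it stores a full symbol-count table per rule and answers queries in time proportional to the grammar height, which is O(log N) only when the grammar is balanced. The class name `SLP`, the rule names `X1`..`X6`, and the 0-based indexing are assumptions made for illustration; select is omitted.

```python
from typing import Dict, Tuple, Union

# A rule is either a terminal character or a pair of rule names (assumed encoding).
Rule = Union[str, Tuple[str, str]]


class SLP:
    """Naive rank/access on a straight-line program (illustrative only)."""

    def __init__(self, rules: Dict[str, Rule], start: str):
        self.rules = rules
        self.start = start
        self.length: Dict[str, int] = {}             # expansion length per rule
        self.counts: Dict[str, Dict[str, int]] = {}  # symbol counts per rule
        for name in rules:
            self._measure(name)

    def _measure(self, name: str) -> None:
        if name in self.length:
            return
        rule = self.rules[name]
        if isinstance(rule, str):
            self.length[name] = 1
            self.counts[name] = {rule: 1}
            return
        left, right = rule
        self._measure(left)
        self._measure(right)
        self.length[name] = self.length[left] + self.length[right]
        merged = dict(self.counts[left])
        for c, k in self.counts[right].items():
            merged[c] = merged.get(c, 0) + k
        self.counts[name] = merged

    def access(self, i: int) -> str:
        """Return S[i] (0-based) by walking down from the start rule."""
        name = self.start
        if not 0 <= i < self.length[name]:
            raise IndexError(i)
        while not isinstance(self.rules[name], str):
            left, right = self.rules[name]
            if i < self.length[left]:
                name = left
            else:
                i -= self.length[left]
                name = right
        return self.rules[name]

    def rank(self, c: str, i: int) -> int:
        """Return the number of occurrences of c in S[0:i]."""
        name = self.start
        if not 0 <= i <= self.length[name]:
            raise IndexError(i)
        total = 0
        while i > 0:
            rule = self.rules[name]
            if isinstance(rule, str):
                total += 1 if rule == c else 0  # here i == 1
                break
            left, right = rule
            if i < self.length[left]:
                name = left
            else:
                # The whole left child lies before position i: add its count.
                total += self.counts[left].get(c, 0)
                i -= self.length[left]
                name = right
        return total


# Hypothetical example grammar generating "abaababa".
rules = {
    "X1": "b", "X2": "a",
    "X3": ("X2", "X1"), "X4": ("X3", "X2"),
    "X5": ("X4", "X3"), "X6": ("X5", "X4"),
}
slp = SLP(rules, "X6")
print("".join(slp.access(i) for i in range(slp.length["X6"])))
print(slp.rank("a", 8), slp.rank("b", 3))
```

The per-rule count table is what makes the space O(nσ) words in this sketch, matching the nσ factor in the abstract's bounds only in spirit; the structures described in the abstract are more refined.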

Detection of previously undiagnosed cases of COPD in a high-risk population identified in general practice

Løkke, Ulrik, Dahl, et al. 2012

COPD: Journal of Chronic Obstructive Pulmonary Disease

Early detection of COPD in general practice

Ulrik, Løkke, Dahl, et al. 2011

COPD

Background and aim: Early detection enables interventions to reduce the future burden of COPD. The Danish National Board of Health recommends that individuals >35 years with tobacco/occupational exposure and at least 1 respiratory symptom should be offered spirometry to facilitate early detection of COPD. The aim, therefore, was to provide evidence for the feasibility and impact of doing spirometry in this target population. Methods: Participating general practitioners (GPs) (n = 335; 10% of the Danish GPs) consecutively recruited subjects aged >35 years with tobacco/occupational exposure, no previous diagnosis of obstructive lung disease, and at least 1 of the following symptoms: cough, dyspnea, wheezing, sputum, or recurrent respiratory infection. Data on age, smoking status, pack-years, body mass index (BMI), dyspnea score (Medical Research Council, MRC), and pre-bronchodilator spirometry (FEV1, FEV1% predicted, FEV1/FVC) were obtained. Results: A total of 3,095 subjects (51% females) were included: mean age 58 years, BMI 26.3, and 31.5 pack-years. The majority of subjects (88%) reported MRC score 1 or 2. An FEV1/FVC ratio ≤ 0.7 was found in 34.8% of the subjects; the prevalence of airway obstruction increased with age and decreased with increasing BMI, and was higher in men and current smokers. According to the level of FEV1, 79% of the subjects with airway obstruction had mild to moderate COPD. Conclusions: More than one-third of the recruited subjects had airway obstruction (FEV1/FVC < 0.7). Early detection of COPD appears to be feasible through offering spirometry to adults with tobacco/occupational exposure and at least 1 respiratory symptom.

Early Detection Of COPD In General Practice

Ulrik, Løkke, Dahl, et al. 2010

Compressed Subsequence Matching and Packed Tree Coloring

2015

We present a new algorithm for subsequence matching in grammar-compressed strings. Given a grammar of size n compressing a string of size N and a pattern string of size m over an alphabet of size σ, our algorithm uses O(n + nσ/w) space and O(n + nσ/w + m log N log w · occ) or O(n + (nσ/w) log w + m log N · occ) time. Here w is the word size and occ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for occ = o(n / log N) occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings.
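The "next occurrence of a given character after a given position" query can be sketched in Python on a small straight-line program. This is an illustrative simplification, not the algorithm from the paper: it keeps an ordinary Python set of symbols per rule instead of packed bit strings, it does not use a tree-coloring structure, and the greedy loop only tests whether the pattern occurs as a subsequence at all rather than reporting occurrences. The function names, the grammar encoding, and the example grammar `RULES` are assumptions.

```python
from typing import Dict, Optional, Set, Tuple, Union

# A rule is either a terminal character or a pair of rule names (assumed encoding).
Rule = Union[str, Tuple[str, str]]


def build(rules: Dict[str, Rule]):
    """Compute, for every rule, its expansion length and the set of symbols it contains."""
    length: Dict[str, int] = {}
    chars: Dict[str, Set[str]] = {}

    def visit(name: str) -> None:
        if name in length:
            return
        rule = rules[name]
        if isinstance(rule, str):
            length[name], chars[name] = 1, {rule}
        else:
            left, right = rule
            visit(left)
            visit(right)
            length[name] = length[left] + length[right]
            chars[name] = chars[left] | chars[right]

    for name in rules:
        visit(name)
    return length, chars


def next_occurrence(rules, length, chars, name: str, c: str, i: int) -> Optional[int]:
    """Smallest 0-based position p >= i in the expansion of `name` holding c, or None."""
    if i >= length[name] or c not in chars[name]:
        return None  # pruned: out of range, or c never appears below this rule
    rule = rules[name]
    if isinstance(rule, str):
        return 0  # single terminal equal to c, and i == 0
    left, right = rule
    if i < length[left]:
        p = next_occurrence(rules, length, chars, left, c, i)
        if p is not None:
            return p
    p = next_occurrence(rules, length, chars, right, c, max(0, i - length[left]))
    return None if p is None else p + length[left]


def is_subsequence(rules: Dict[str, Rule], start: str, pattern: str) -> bool:
    """Greedy test: match each pattern symbol at its earliest possible position."""
    length, chars = build(rules)
    pos = 0
    for c in pattern:
        p = next_occurrence(rules, length, chars, start, c, pos)
        if p is None:
            return False
        pos = p + 1
    return True


# Hypothetical example grammar generating "abaababa".
RULES = {
    "X1": "b", "X2": "a",
    "X3": ("X2", "X1"), "X4": ("X3", "X2"),
    "X5": ("X4", "X3"), "X6": ("X5", "X4"),
}
LENGTH, CHARS = build(RULES)
print(next_occurrence(RULES, LENGTH, CHARS, "X6", "b", 2))
print(is_subsequence(RULES, "X6", "bbb"), is_subsequence(RULES, "X6", "bbbb"))
```

Storing one symbol set per rule costs on the order of nσ bits in total; packing those sets into machine words of w bits is where a space term such as nσ/w would come from.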
