Given a string S of length N on a fixed alphabet of σ symbols, a grammar compressor produces a context-free grammar G of size n that generates S and only S. In this paper we describe data structures to support the following operations on a grammar-compressed string: rank_c(S, i) (return the number of occurrences of symbol c before position i in S); select_c(S, i) (return the position of the ith occurrence of c in S); and access(S, i, j) (return substring S[i, j]). For rank and select we describe data structures of size O(nσ log N) bits that support the two operations in O(log N) time. We propose another structure that uses O(nσ log(N/n) log^(1+ε) N) bits and that supports the two queries in O(log N / log log N) time, where ε > 0 is an arbitrary constant. To our knowledge, we are the first to study the asymptotic complexity of rank and select in the grammar-compressed setting, and we provide a hardness result showing that significantly improving the bounds we achieve would imply a major breakthrough on a hard graph-theoretical problem. Our main result for access is a method that requires O(n log N) bits of space and O(log N + m / log_σ N) time to extract m = j − i + 1 consecutive symbols from S. Alternatively, we can achieve O(log N / log log N + m / log_σ N) query time using O(n log(N/n) log^(1+ε) N) bits of space. This matches a lower bound stated by Verbin and Yu for strings where N is polynomially related to n.
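To make the three operations concrete, here is a minimal Python sketch of a naive baseline, not the data structures from the abstract. It assumes the grammar is a straight-line program in which every rule is either a terminal or a pair of earlier rules, stores one expansion length and one table of symbol counts per rule, and answers each query by a single root-to-leaf descent. The class name `SLP`, the 0-based positions, and the reading of rank as "occurrences of c in the first i symbols" are assumptions made for illustration. Query time is proportional to the grammar height, which is O(log N) only for a balanced grammar.

```python
from typing import Dict, List, Tuple, Union

# A rule is a terminal symbol, or a pair (left, right) of earlier rule indices.
Rule = Union[str, Tuple[int, int]]


class SLP:
    """Hypothetical straight-line program; the last rule generates S."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = rules
        self.length: List[int] = []             # expansion length per rule
        self.counts: List[Dict[str, int]] = []  # symbol counts per rule
        for rule in rules:
            if isinstance(rule, str):
                self.length.append(1)
                self.counts.append({rule: 1})
            else:
                left, right = rule
                self.length.append(self.length[left] + self.length[right])
                merged = dict(self.counts[left])
                for sym, cnt in self.counts[right].items():
                    merged[sym] = merged.get(sym, 0) + cnt
                self.counts.append(merged)
        self.root = len(rules) - 1

    def access(self, i: int) -> str:
        """Return S[i] (0-based) by descending from the root."""
        if not 0 <= i < self.length[self.root]:
            raise IndexError(i)
        k = self.root
        while not isinstance(self.rules[k], str):
            left, right = self.rules[k]
            if i < self.length[left]:
                k = left
            else:
                i -= self.length[left]
                k = right
        return self.rules[k]

    def rank(self, c: str, i: int) -> int:
        """Return the number of occurrences of c in S[0:i]."""
        if not 0 <= i <= self.length[self.root]:
            raise IndexError(i)
        k, total = self.root, 0
        while i > 0:
            rule = self.rules[k]
            if isinstance(rule, str):
                # At a terminal the remaining prefix has length exactly 1.
                total += 1 if rule == c else 0
                break
            left, right = rule
            if i < self.length[left]:
                k = left
            else:
                # The whole left child lies inside the prefix.
                total += self.counts[left].get(c, 0)
                i -= self.length[left]
                k = right
        return total

    def select(self, c: str, j: int) -> int:
        """Return the 0-based position of the j-th (1-based) occurrence of c."""
        if not 1 <= j <= self.counts[self.root].get(c, 0):
            raise ValueError("no such occurrence")
        k, pos = self.root, 0
        while not isinstance(self.rules[k], str):
            left, right = self.rules[k]
            in_left = self.counts[left].get(c, 0)
            if j <= in_left:
                k = left
            else:
                j -= in_left
                pos += self.length[left]
                k = right
        return pos


# Rules 0..4 expand to "a", "b", "ab", "aba", "abaab".
slp = SLP(["a", "b", (0, 1), (2, 0), (3, 2)])
text = "".join(slp.access(i) for i in range(slp.length[slp.root]))
```

The per-rule count tables hold one counter per symbol, which is where a factor of σ in the space of a rank/select structure naturally comes from; the structures described in the abstract are more refined than this sketch.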
Of the at-risk subjects studied, 22% were diagnosed with COPD. A case-finding strategy that provides questionnaire assessment and diagnostic spirometry to high-risk subjects in primary care therefore identifies a large proportion of undiagnosed COPD patients, especially in the early stages of the disease.
Background and aim: Early detection enables interventions that can reduce the future burden of COPD. The Danish National Board of Health recommends that individuals >35 years with tobacco/occupational exposure and at least 1 respiratory symptom should be offered spirometry to facilitate early detection of COPD. The aim, therefore, was to provide evidence for the feasibility and impact of doing spirometry in this target population. Methods: Participating general practitioners (GPs) (n = 335; 10% of the Danish GPs) consecutively recruited subjects >35 years with tobacco/occupational exposure, no previous diagnosis of obstructive lung disease, and at least 1 of the following symptoms: cough, dyspnea, wheezing, sputum, or recurrent respiratory infection. Data on age, smoking status, pack-years, body mass index (BMI), dyspnea score (Medical Research Council, MRC), and pre-bronchodilator spirometry (FEV1, FEV1% predicted, FEV1/FVC) were obtained. Results: A total of 3,095 subjects (51% female) were included: mean age 58 years, BMI 26.3, and 31.5 pack-years. The majority of subjects (88%) reported an MRC score of 1 or 2. An FEV1/FVC ratio ≤ 0.7 was found in 34.8% of the subjects; the prevalence of airway obstruction increased with age and decreased with increasing BMI, and was higher in men and current smokers. According to the level of FEV1, 79% of the subjects with airway obstruction had mild to moderate COPD. Conclusions: More than one-third of the recruited subjects had airway obstruction (FEV1/FVC < 0.7). Early detection of COPD appears to be feasible through offering spirometry to adults with tobacco/occupational exposure and at least 1 respiratory symptom.
We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size n compressing a string of size N and a pattern string of size m over an alphabet of size σ, our algorithm uses O(n + nσ/w) space and O(n + nσ/w + m log N log w · occ) or O(n + (nσ/w) log w + m log N · occ) time. Here w is the word size and occ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for occ = o(n / log N) occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings.
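The role of the next-occurrence query can be illustrated with a short Python sketch. It works on an uncompressed string and answers the query with sorted position lists and binary search, so it is a stand-in for the compressed structure the abstract describes, not a rendering of it. The names `NextOccurrence` and `leftmost_embedding` are hypothetical, and "next occurrence" is taken here to mean the smallest position p >= i holding the character.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Dict, List, Optional


class NextOccurrence:
    """Next-occurrence queries over a plain (uncompressed) string."""

    def __init__(self, text: str) -> None:
        # For each character, the sorted list of positions where it occurs.
        self.positions: Dict[str, List[int]] = defaultdict(list)
        for i, ch in enumerate(text):
            self.positions[ch].append(i)

    def next(self, c: str, i: int) -> Optional[int]:
        """Return the smallest position p >= i with text[p] == c, or None."""
        plist = self.positions.get(c, [])
        k = bisect_left(plist, i)
        return plist[k] if k < len(plist) else None


def leftmost_embedding(
    index: NextOccurrence, pattern: str, start: int = 0
) -> Optional[List[int]]:
    """Greedily match pattern as a subsequence, scanning from position start.

    Returns the matched positions, or None if pattern does not occur
    as a subsequence of the text from start onwards.
    """
    pos = start
    embedding: List[int] = []
    for c in pattern:
        p = index.next(c, pos)
        if p is None:
            return None
        embedding.append(p)
        pos = p + 1
    return embedding


index = NextOccurrence("abaab")
first = leftmost_embedding(index, "ab")             # match from position 0
second = leftmost_embedding(index, "ab", start=1)   # match from position 1
missing = leftmost_embedding(index, "bbb")          # only two b's in the text
```

Each pattern character costs one next-occurrence query, which is consistent with the time bounds in the abstract charging a per-character cost for each of the m pattern symbols per occurrence.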