We address the problems faced by analysts who have to sift through large amounts of data quickly and accurately in order to make sense of the information contained in the data or the circumstances the data represent. We discuss the approach we have taken to visual analytics from the perspective of the Data-Frame Theory of Sense-making and its extension to Causal Reasoning, and how the cognitive strategies invoked in these processes need to be supported. We identify 20 problems that designers of visual analytics-type systems need to address in order to support sense-making. In particular, we discuss design issues associated with three exemplar problems: (i) Black holes, the problem of representing missing data; (ii) Keyholes, the problem of being able to access and view only a small part of a large dataset or only part of a problem; and (iii) Brown worms, the problem of dealing with and representing misleading or deceptive data.
Bacterial genomics is making an increasing contribution to the fields of medicine and public health microbiology. Consequently, accurate species identification of bacterial genomes is an important task, particularly as the number of genomes stored in online databases increases rapidly and new species are frequently discovered. Existing database entries require regular re-evaluation to ensure that species annotations are consistent with the latest species definitions. We have developed an automated method for bacterial species identification that is an extension of ribosomal multilocus sequence typing (rMLST). The method calculates an ‘rMLST nucleotide identity’ (rMLST-NI) based on the nucleotides present in the protein-encoding ribosomal genes derived from bacterial genomes. rMLST-NI was used to validate the species annotations of 11839 publicly available Klebsiella and Raoultella genomes based on a comparison with a library of type strain genomes. rMLST-NI was compared with two whole-genome average nucleotide identity methods (OrthoANIu and FastANI) and the k-mer based Kleborate software. The results of the four methods agreed across a dataset of 11839 bacterial genomes and identified a small number of entries (n=89) with species annotations that required updating. The rMLST-NI method was 3.5 times faster than Kleborate, 4.5 times faster than FastANI and 1600 times faster than OrthoANIu. rMLST-NI represents a fast and generic method for species identification using type strains as a reference.
Campylobacter jejuni (C. jejuni) is the most common causative agent of bacterial food poisoning worldwide and is known to be genetically highly diverse. C. jejuni is increasingly resistant to fluoroquinolone antibiotics, but very few studies have investigated variant-specific patterns of resistance across time. Here we use statistical modelling and clustering techniques to investigate patterns of fluoroquinolone resistance amongst 10,359 UK isolates from human disease sampled over 20 years. We observed six distinct patterns of fluoroquinolone sensitivity/resistance in C. jejuni across time, grouping by clonal complex (CC). Some CCs were fully resistant, some shifted from susceptible to resistant following a sigmoidal shape, and some remained susceptible over time. Our findings indicate that the fluoroquinolone resistance patterns of C. jejuni are complicated and cannot be analysed at the level of a single species; they must instead be divided into variant dynamics so that the factors driving resistance can be thoroughly investigated.