Purpose: Effective diagnosis of tuberculosis (TB) relies on accurate interpretation of radiological patterns found in a chest radiograph (CXR). A lack of skilled radiologists and other resources, especially in developing countries, hinders efficient diagnosis. Computer-aided diagnosis (CAD) methods provide a second opinion to radiologists on their findings and thereby assist in better diagnosis of cancer and other diseases, including TB. However, existing CAD methods for TB are based on the extraction of textural features from manually or semi-automatically segmented CXRs. These methods are prone to errors and cannot be implemented in X-ray machines for automated classification. Methods: Gabor, Gist, histogram of oriented gradients (HOG), and pyramid histogram of oriented gradients (PHOG) features extracted from the whole image can be implemented in existing X-ray machines to discriminate between TB and non-TB CXRs in an automated manner. Localized features were extracted for the above methods using various parameters, such as frequency range, blocks, and region of interest. The performance of these features was evaluated against textural features. Two digital CXR image datasets (8-bit DA and 14-bit DB) were used for evaluating the performance of these features. Results: Gist (accuracy 94.2% for DA, 86.0% for DB) and PHOG (accuracy 92.3% for DA, 92.0% for DB) features provided better results for both datasets. These features were implemented to develop a MATLAB toolbox, TB-Xpredict, which is freely available for academic use at http://sourceforge.net/projects/tbxpredict/. This toolbox provides both automated training and prediction modules and does not require expertise in image processing for operation. Conclusion: Since the features used in TB-Xpredict do not require segmentation, the toolbox can easily be implemented in X-ray machines. This toolbox can effectively be used for the mass screening of TB in high-burden areas with improved efficiency.
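To make the idea of segmentation-free, whole-image features concrete, here is a minimal sketch of a simplified HOG-style descriptor in plain Python. It is not the TB-Xpredict implementation: the function name `hog_descriptor`, the `cells` and `bins` parameters, and the single global L2 normalization are assumptions chosen for brevity. A production HOG would also use overlapping block normalization and bin interpolation.

```python
import math


def hog_descriptor(image, cells=(2, 2), bins=9):
    """Simplified HOG over the whole image (hypothetical helper, not TB-Xpredict).

    image: 2D list of grayscale intensities.
    cells: (rows, cols) grid of cells the image is divided into.
    bins:  number of unsigned orientation bins over [0, pi).
    Returns a flat, L2-normalized list of cells[0] * cells[1] * bins values.
    """
    h, w = len(image), len(image[0])
    cy, cx = cells
    hist = [[0.0] * bins for _ in range(cy * cx)]
    for y in range(h):
        for x in range(w):
            # Central differences, with indices clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            # Unsigned gradient orientation, folded into [0, pi).
            angle = math.atan2(gy, gx) % math.pi
            b = min(int(angle / math.pi * bins), bins - 1)
            # Cell that this pixel falls into.
            cell = (y * cy // h) * cx + (x * cx // w)
            # Magnitude-weighted vote for the orientation bin.
            hist[cell][b] += mag
    flat = [v for cell_hist in hist for v in cell_hist]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


# Toy 4x4 "image" with a single vertical edge down the middle.
toy_image = [[0, 0, 255, 255] for _ in range(4)]
descriptor = hog_descriptor(toy_image)
```

Because the descriptor is computed over a fixed grid on the full image, no lung segmentation step is needed before classification, which is the property the abstract relies on.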
The software gap, the discrepancy between the need for new software and the aggregate capacity of the workforce to produce it, is a serious problem for scientific software. Although users appreciate the convenience (and, thus, improved productivity)
Ziehl-Neelsen stained microscopy is a crucial bacteriological test for tuberculosis detection, but its sensitivity is poor. According to the World Health Organization (WHO) recommendation, 300 viewfields should be analyzed to augment sensitivity, but only a few viewfields are examined due to patient load. Therefore, tuberculosis diagnosis through automated capture of the focused image (autofocusing), stitching of viewfields to form mosaics (autostitching), and automatic bacilli segmentation (grading) can significantly improve the sensitivity. However, the lack of unified datasets impedes the development of robust algorithms in these three domains. Therefore, the Ziehl-Neelsen sputum smear microscopy image database (ZNSM-iDB) has been developed and is freely available. This database contains seven categories of diverse datasets acquired from three different bright-field microscopes. Datasets related to autofocusing, autostitching, and manually segmenting bacilli can be used for developing algorithms, whereas the other four datasets are provided to streamline the sensitivity and specificity. All three categories of datasets were validated using different automated algorithms. As images available in this database have distinctive presentations with high noise and artifacts, this referral resource can also be used for the validation of robust detection algorithms. The ZNSM-iDB also assists in the development of methods for automated microscopy.
Telescoping languages is a strategy to automatically generate highly optimized domain-specific libraries. The key idea is to create specialized variants of library procedures through extensive offline processing. This paper describes a telescoping system, called ARGen, which generates high-performance Fortran or C libraries from prototype Matlab code for the linear algebra library, ARPACK. ARGen uses variable types to guide procedure specializations on possible calling contexts. ARGen needs to infer Matlab types in order to speculate on the possible variants of library procedures, as well as to generate code. This paper shows that our type-inference system is powerful enough to generate all the variants needed for ARPACK automatically from the Matlab development code. The ideas demonstrated here provide a basis for building a more general telescoping system for Matlab.
While the popularity of high-level programming languages such as MATLAB for scientific and engineering applications continues to grow, their poor performance compared to traditional languages such as Fortran or C continues to impede their deployment in full-scale simulations and data analysis. Additionally, poor memory behavior further limits their performance. To address this, we have been developing a MATLAB and Octave compiler that improves the performance of MATLAB code by performing type inference and using the resulting type information to remove common bottlenecks. We observe that, unlike past results, scalarizing array statements, instead of vectorizing scalar statements, is more fruitful when compiling MATLAB to C or C++. Two important situations where such scalarization helps are expressions containing array subscripts and sequences of related array statements. In both cases, it is possible to generate fused loops and replace array temporaries by scalars, thus reducing the memory bandwidth pressure. Additional array temporaries are obviated in the case of array subscripts. Further, starting with vectorized statements guarantees that the resulting loops can be parallelized, creating opportunities for a mix of thread-level and instruction-level parallelism as well as GPU execution. We have implemented this strategy in a MATLAB compiler that compiles portions of MATLAB to C++ or CUDA C. Evaluation results on a set of benchmarks selected from diverse domains show speed improvements ranging from 1.5x to almost 17x on an eight-core Intel Core 2 Duo machine.
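The scalarization and loop-fusion idea described above can be illustrated with a small sketch. The compiler in the abstract emits C++ or CUDA C; Python is used here only to show the shape of the transformation, and the function names and the example expression `d = (a .* b + c) .^ 2` are assumptions made for illustration. The first function mimics a sequence of array statements that each materialize a full temporary; the second is the fused form, where the temporaries collapse into one scalar per iteration.

```python
def array_statements(a, b, c):
    """Array-at-a-time evaluation: each statement allocates a full temporary."""
    t1 = [x * y for x, y in zip(a, b)]    # t1 = a .* b
    t2 = [x + y for x, y in zip(t1, c)]   # t2 = t1 + c
    return [x * x for x in t2]            # d  = t2 .^ 2


def scalarized_fused(a, b, c):
    """Fused loop: the array temporaries t1 and t2 become a single scalar."""
    n = len(a)
    d = [0.0] * n
    for i in range(n):
        t = a[i] * b[i] + c[i]   # scalar temporary, never stored as an array
        d[i] = t * t
    return d


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [0.5, 0.5, 0.5]
fused_result = scalarized_fused(a, b, c)
```

Each iteration of the fused loop touches only element `i` of every array, so iterations are independent. This is why starting from array statements makes the resulting loop straightforward to parallelize, while also reading and writing each array only once.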