The rapidly-changing deep learning landscape presents a unique opportunity for building inference accelerators optimized for specific datacenter-scale workloads. We propose Full-stack Accelerator Search Technique (FAST), a hardware accelerator search framework that defines a broad optimization environment covering key design decisions within the hardware-software stack, including hardware datapath, software scheduling, and compiler passes such as operation fusion and tensor padding. In this paper, we analyze bottlenecks in state-of-the-art vision and natural language processing (NLP) models, including EfficientNet [91] and BERT [19], and use FAST to design accelerators capable of addressing these bottlenecks. FAST-generated accelerators optimized for single workloads improve Perf/TDP by 3.7× on average across all benchmarks compared to TPU-v3. A FAST-generated accelerator optimized for serving a suite of workloads improves Perf/TDP by 2.4× on average compared to TPU-v3. Our return on investment analysis shows that FAST-generated accelerators can potentially be practical for moderate-sized datacenter deployments.
CCS CONCEPTS • Hardware → Electronic design automation; • Computer systems organization → Parallel architectures.
Automated segmentation and dermoscopic hair detection are among the significant challenges in computer-aided diagnosis (CAD) of melanocytic lesions. Additionally, artifacts, variation in skin texture, and smooth lesion boundaries hamper the accuracy of such methods. The objective of this research is to develop an automated hair detection and lesion segmentation algorithm that uses lesion-specific properties to improve accuracy. This objective is achieved in two ways. First, a novel hair detection algorithm is designed by considering the properties of dermoscopic hair. Second, a novel chroma-based geometric deformable model is used to effectively differentiate the lesion from the surrounding skin. The speed function incorporates the chrominance properties of the lesion to stop evolution at the lesion boundary. Automatic initialization of the contour and the chrominance-based speed function aid in providing robust and flexible segmentation. The proposed approach is tested on 200 images from the PH2 dataset and 900 images from the ISBI 2016 dataset. Average accuracy, sensitivity, specificity, and overlap scores of 93.4, 87.6, 95.3, and 11.52%, respectively, are obtained for the PH2 dataset. Similarly, the proposed method achieves average accuracy, sensitivity, specificity, and overlap scores of 94.6, 82.4, 97.2, and 7.20%, respectively, for the ISBI 2016 dataset. Statistical and quantitative analyses prove the reliability of the algorithm for incorporation in CAD systems. Graphical abstract: overview of the proposed system.
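The accuracy, sensitivity, and specificity figures reported above are, under their standard definitions, pixel-wise ratios over a confusion matrix between a predicted mask and a ground-truth mask. The sketch below illustrates those standard definitions only; the function name `segmentation_metrics` and the flat 0/1 mask representation are assumptions made for illustration, not the authors' evaluation code, and the overlap score is omitted because the abstract does not define it.

```python
from typing import Dict, Iterable


def segmentation_metrics(pred: Iterable[int], truth: Iterable[int]) -> Dict[str, float]:
    """Pixel-wise accuracy, sensitivity, and specificity for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (hypothetical representation chosen for this sketch).
    """
    tp = tn = fp = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # lesion pixel correctly labelled as lesion
        elif not p and not t:
            tn += 1          # skin pixel correctly labelled as skin
        elif p and not t:
            fp += 1          # skin pixel wrongly labelled as lesion
        else:
            fn += 1          # lesion pixel missed by the prediction

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    print(segmentation_metrics(predicted, reference))
```

In practice the masks would be 2-D image arrays; flattening them first gives the same counts.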
Cerebral blood vessels supply the human brain with the required nutrients and oxygen. Small vessels are a vulnerable part of the cerebral blood supply, and their pathology can cause serious problems such as Cerebral Small Vessel Diseases (CSVD). It has also been shown that CSVD is related to neurodegeneration, such as Alzheimer’s disease. With the advancement of 7 Tesla MRI systems, higher spatial image resolution can be achieved, enabling the depiction of very small vessels in the brain. Non-deep-learning approaches to vessel segmentation, e.g., Frangi’s vessel enhancement with subsequent thresholding, are capable of segmenting medium to large vessels but often fail to segment small vessels. The sensitivity of these methods to small vessels can be increased by extensive parameter tuning or by manual corrections, but this makes them time-consuming, laborious, and infeasible for larger datasets. This paper proposes a deep learning architecture to automatically segment small vessels in 7 Tesla 3D Time-of-Flight (ToF) Magnetic Resonance Angiography (MRA) data. The algorithm was trained and evaluated on a small, imperfect, semi-automatically segmented dataset of only 11 subjects: six for training, two for validation, and three for testing. The deep learning model, based on U-Net Multi-Scale Supervision, was trained using the training subset and was made equivariant to elastic deformations in a self-supervised manner using deformation-aware learning to improve the generalisation performance. The proposed technique was evaluated quantitatively and qualitatively against the test set and achieved a Dice score of 80.44 ± 0.83. Furthermore, the result of the proposed method was compared against a selected manually segmented region (62.07 resultant Dice) and has shown a considerable improvement (18.98%) with deformation-aware learning.
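For readers unfamiliar with the Dice score used above, it measures the overlap between a predicted and a reference segmentation as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch of that standard definition, not the authors' evaluation code: the name `dice_score`, the flat 0/1 voxel lists, and the convention of returning 1.0 when both masks are empty are all assumptions. The abstract appears to report Dice on a 0 to 100 scale, whereas this sketch returns a value in [0, 1].

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 voxel labels of equal length
    (hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    # Assumed convention: two empty masks agree perfectly.
    if size_sum == 0:
        return 1.0
    return 2.0 * intersection / size_sum


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    print(f"Dice: {dice_score(predicted, reference):.4f}")
```

Because the Dice score ignores true negatives, it is less inflated than accuracy when the foreground (here, small vessels) occupies only a small fraction of the volume.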