With the introduction of fully convolutional neural networks, deep learning has raised the benchmark for medical image segmentation on both speed and accuracy, and different networks have been proposed for 2D and 3D segmentation with promising results. Nevertheless, most networks only handle relatively small numbers of labels (<10), and there are very limited works on handling highly unbalanced object sizes especially in 3D segmentation. In this paper, we propose a network architecture and the corresponding loss function which improve segmentation of very small structures. By combining skip connections and deep supervision with respect to the computational feasibility of 3D segmentation, we propose a fast converging and computationally efficient network architecture for accurate segmentation. Furthermore, inspired by the concept of focal loss, we propose an exponential logarithmic loss which balances the labels not only by their relative sizes but also by their segmentation difficulties. We achieve an average Dice coefficient of 82% on brain segmentation with 20 labels, with the ratio of the smallest to largest object sizes as 0.14%. Less than 100 epochs are required to reach such accuracy, and segmenting a 128×128×128 volume only takes around 0.4 s.
IMPORTANCE Chest radiography is the most common diagnostic imaging examination performed in emergency departments (EDs). Augmenting clinicians with automated preliminary read assistants could help expedite their workflows, improve accuracy, and reduce the cost of care. OBJECTIVE To assess the performance of artificial intelligence (AI) algorithms in realistic radiology workflows by performing an objective comparative evaluation of the preliminary reads of anteroposterior (AP) frontal chest radiographs performed by an AI algorithm and radiology residents. DESIGN, SETTING, AND PARTICIPANTS This diagnostic study included a set of 72 findings assembled by clinical experts to constitute a full-fledged preliminary read of AP frontal chest radiographs. A novel deep learning architecture was designed for an AI algorithm to estimate the findings per image. The AI algorithm was trained using a multihospital training data set of 342 126 frontal chest radiographs captured in ED and urgent care settings. The training data were labeled from their associated reports. Image-based F1 score was chosen to optimize the operating point on the receiver operating characteristics (ROC) curve so as to minimize the number of missed findings and overcalls per image read. The performance of the model was compared with that of 5 radiology residents recruited from multiple institutions in the US in an objective study in which a separate data set of 1998 AP frontal chest radiographs was drawn from a hospital source representative of realistic preliminary reads in inpatient and ED settings. A triple consensus with adjudication process was used to derive the ground truth labels for the study data set. The performance of AI algorithm and radiology residents was assessed by comparing their reads with ground truth findings. All studies were conducted through a web-based clinical study application system. The triple consensus data set
A first step towards a n understanding of the semantic content in a video is the reliable detection and recognition of actions performed by objects. This is a dificult problem due t o the enormous vaeability in a n action's appearance when seen from different viewpoints and/or at different times. In this paper we address the recognition of actions by taking a novel approach that models actions as special types of 3d objects. Specifically, we observe that any action can be represented as a generalized cylinder, called the action cylinder. Reliable recognition is achieved by recovering the viewpoint transformation between reference (model) and given action cylinders. A set of 8 corresponding points from time-wise corresponding cross-sections is shown t o be suficient t o align the two cylinders under perspective projection. A surprising conclusion from visualizing actions as objects i s that rigid, articulated, and nonrigid actions can all be modeled an a uniform framework.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.