Convolutional neural networks have been successfully applied to semantic segmentation problems. However, many problems that are inherently not pixel-wise classification problems are nevertheless frequently formulated as semantic segmentation. This ill-posed formulation necessitates hand-crafted, scenario-specific, and computationally expensive post-processing methods to convert the per-pixel probability maps to the final desired outputs. Generative adversarial networks (GANs) can be used to make the output of the semantic segmentation network more realistic or better structure-preserving, decreasing the dependency on potentially complex post-processing. In this work, we propose EL-GAN: a GAN framework that mitigates this problem using an embedding loss. With EL-GAN, we discriminate based on learned embeddings of both the labels and the prediction at the same time. Seeing both 'fake' and 'real' predictions at the same time gives the discriminator better discriminative information, which substantially stabilizes the adversarial training process. We use the TuSimple lane marking challenge to demonstrate that our proposed framework can overcome the inherent anomalies of posing the task as a semantic segmentation problem. Not only is the output considerably more similar to the labels than that of conventional methods, the subsequent post-processing is also simpler and crosses the competitive 96% accuracy threshold.
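The embedding-loss idea in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the embedding network or the distance used, so everything here is an assumption for illustration: `embed` is a hypothetical stand-in (a fixed linear map) for the discriminator's learned embedding, and the mean squared difference is just one plausible distance between the label embedding and the prediction embedding.

```python
from typing import List, Sequence

Vector = List[float]


def embed(mask: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical embedding: a linear map standing in for a learned network."""
    return [sum(w * x for w, x in zip(row, mask)) for row in weights]


def embedding_loss(
    label: Sequence[float],
    prediction: Sequence[float],
    weights: Sequence[Sequence[float]],
) -> float:
    """Mean squared difference between label and prediction embeddings.

    Assumption: squared L2 distance; the abstract does not name the distance.
    """
    e_label = embed(label, weights)
    e_pred = embed(prediction, weights)
    return sum((a - b) ** 2 for a, b in zip(e_label, e_pred)) / len(e_label)


# Toy example: a flattened 4-pixel mask and a 2-dimensional embedding.
weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
label = [1.0, 0.0, 1.0, 0.0]
blurry_prediction = [0.5, 0.5, 0.5, 0.5]

loss_same = embedding_loss(label, label, weights)
loss_blurry = embedding_loss(label, blurry_prediction, weights)
```

In this toy setup a prediction identical to the label gives zero loss, while the uniform 0.5 prediction is penalized because its embedding differs from that of the label. In an actual GAN setting the embedding would be learned rather than fixed.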
Studying joint kinematics is of interest to improve prosthesis design and to characterize postoperative motion. State-of-the-art techniques register bones segmented from prior computed tomography or magnetic resonance scans with X-ray fluoroscopic sequences. Eliminating the prior 3D acquisition could potentially lower costs and radiation dose. Therefore, we propose to substitute the segmented bone surface with an estimate based on a statistical shape model. We developed a dedicated dynamic reconstruction and tracking algorithm that estimates the shape from all frames and the pose per frame. The algorithm minimizes the difference between the projected bone contour and image edges. To increase robustness, we employ a dynamic prior, image features, and prior knowledge about bone edge appearances. This enables tracking and reconstruction from a single initial pose per sequence. We evaluated our method on the distal femur using eight biplane fluoroscopic drop-landing sequences. The proposed dynamic prior and features increased the convergence rate of the reconstruction from 71% to 91%, using a convergence limit of 3 mm. The achieved root mean square point-to-surface accuracy at the converged frames was 1.48 ± 0.41 mm. The resulting tracking precision was 1 to 1.5 mm, with the largest errors occurring in the rotation around the femoral shaft (about 2.5° precision).
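The two evaluation quantities reported in the abstract above, a root mean square point-to-surface error and a convergence rate under a 3 mm limit, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the point-to-surface distances are taken as given inputs (the abstract does not describe how they are computed), and treating a frame as converged when its error is less than or equal to the limit is an assumed convention.

```python
import math
from typing import Sequence


def rms(distances_mm: Sequence[float]) -> float:
    """Root mean square of point-to-surface distances (in mm)."""
    return math.sqrt(sum(d * d for d in distances_mm) / len(distances_mm))


def convergence_rate(frame_errors_mm: Sequence[float], limit_mm: float = 3.0) -> float:
    """Fraction of frames whose error is within the convergence limit.

    Assumption: a frame counts as converged when error <= limit_mm.
    """
    converged = sum(1 for e in frame_errors_mm if e <= limit_mm)
    return converged / len(frame_errors_mm)


# Made-up numbers purely for illustration, not data from the study.
frame_errors = [1.0, 2.9, 3.5, 10.0]
rate = convergence_rate(frame_errors)      # two of four frames are within 3 mm
uniform_rms = rms([1.0, 1.0, 1.0, 1.0])    # all distances equal, so RMS equals 1.0
```

Because RMS squares each distance before averaging, a few large outliers dominate the result, which is one reason such a figure is typically reported only over converged frames, as the abstract does.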