The automatic detection and tracking of general objects (such as persons, animals, or cars), text, and logos in video is crucial for many video understanding tasks, and real-time processing is usually required. We propose OmniTrack, an efficient and robust algorithm that automatically detects and tracks objects, text, and brand logos in real time. It combines a powerful deep-learning-based object detector (YoloV3) with high-quality optical flow methods. Starting from the reference YoloV3 C++ implementation, we implemented several important performance optimizations, which we describe. We present the major steps of the training procedure for the combined text and logo detector. We then describe the OmniTrack algorithm itself, which consists of the phases preprocessing, feature calculation, prediction, matching, and update. Several performance optimizations have been implemented there as well, such as running object detection and optical flow calculation asynchronously. Experiments show that the proposed algorithm runs in real time for standard-definition (720x576) video on a PC with a Quadro RTX 5000 GPU.
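The abstract gives no implementation details, so the following is only a minimal Python sketch of the asynchronous detect-and-track loop it outlines: a background detector (standing in for YoloV3) runs while per-frame sparse optical flow propagates existing tracks. The `detect_yolov3` and `match_and_update` helpers are hypothetical placeholders, not the authors' code, and OpenCV's pyramidal Lucas-Kanade is used here as one plausible choice of optical flow method.

```python
import concurrent.futures

import cv2
import numpy as np

def detect_yolov3(frame):
    """Placeholder for a YOLOv3 forward pass; returns [(x, y, w, h, label), ...]."""
    return []

def match_and_update(tracks, detections):
    # Simplistic stand-in for the matching/update phases: replace the track
    # list with fresh detections, seeding each track with its box center so
    # the flow-based prediction step has a point to propagate.
    return [{"label": lbl, "points": np.float32([[x + w / 2, y + h / 2]])}
            for (x, y, w, h, lbl) in detections]

def track_video(capture):
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    ok, frame = capture.read()
    if not ok:
        return []
    prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pending = executor.submit(detect_yolov3, frame)  # detection runs off the main loop
    tracks = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Prediction phase: propagate each track's points with sparse optical flow.
        for tr in tracks:
            if len(tr["points"]) == 0:
                continue
            pts = tr["points"].reshape(-1, 1, 2)
            new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
            tr["points"] = new_pts[status.flatten() == 1].reshape(-1, 2)
        # Matching/update phases: fold in detector output whenever a result is
        # ready, then immediately queue detection on the current frame so it
        # overlaps with the optical flow computation of subsequent frames.
        if pending.done():
            tracks = match_and_update(tracks, pending.result())
            pending = executor.submit(detect_yolov3, frame)
        prev_gray = gray
    executor.shutdown()
    return tracks
```

Running detection asynchronously in this way means the (slow) detector never stalls the per-frame flow loop, which is one way the real-time constraint mentioned above can be met.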
Computational pathology is revolutionizing the field of pathology by integrating advanced computer vision and machine learning technologies into diagnostic workflows. Recently, self-supervised learning (SSL) has emerged as a promising way to learn representations from histology patches, leveraging large volumes of unannotated whole slide images (WSI). In particular, Masked Image Modeling (MIM) has shown remarkable results and greater robustness than purely contrastive learning methods. In this work, we explore the application of MIM to histology using iBOT, a self-supervised transformer-based framework. Through a wide range of downstream tasks over seven cancer indications, we provide recommendations on the pre-training of large models for histology data using MIM. First, we demonstrate that in-domain pre-training with iBOT outperforms both ImageNet pre-training and a model pre-trained with a purely contrastive learning objective, MoCo V2. Second, we show that Vision Transformer models (ViT), when scaled appropriately, have the capability to learn pan-cancer representations that benefit a large variety of downstream tasks. Finally, our iBOT ViT-Base model, pre-trained on more than 40 million histology images from 16 different cancer types, achieves state-of-the-art performance on most weakly-supervised WSI classification tasks compared to other SSL frameworks.
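As a rough illustration of the MIM objective underlying iBOT, here is a simplified PyTorch sketch. It is not the authors' training code and omits iBOT's multi-crop augmentation, [CLS]-token DINO loss, and target centering; all names are illustrative. The core idea shown is that a momentum "teacher" produces soft token targets on the intact image, and the student is trained to match them at the masked patch positions.

```python
import torch
import torch.nn.functional as F

def ibot_mim_loss(student, teacher, images, mask, temp_s=0.1, temp_t=0.04):
    # mask: (B, N) boolean, True where a patch token is masked for the student.
    with torch.no_grad():
        t_logits = teacher(images)             # (B, N, K) patch-token logits, intact view
        t_probs = F.softmax(t_logits / temp_t, dim=-1)
    s_logits = student(images, mask=mask)      # student sees the masked view
    s_logp = F.log_softmax(s_logits / temp_s, dim=-1)
    # Cross-entropy between teacher and student distributions, averaged
    # over the masked token positions only.
    ce = -(t_probs * s_logp).sum(dim=-1)       # (B, N)
    return (ce * mask).sum() / mask.sum().clamp(min=1)

@torch.no_grad()
def momentum_update(student, teacher, m=0.996):
    # EMA update of the teacher from the student, as in DINO/iBOT.
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.data.mul_(m).add_(ps.data, alpha=1 - m)
```

The asymmetric temperatures (sharper teacher than student) and the EMA teacher are what keep this self-distillation objective from collapsing to a trivial solution.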
The need for developing new biomarkers is increasing with the emergence of many targeted therapies. In this study, we used artificial intelligence (AI) to develop a multimodal model (PULS-AI) predicting the survival of solid tumor patients treated with antiangiogenic treatments. Our retrospective, multicentric study included 616 patients with 7 different cancer types: renal cell carcinoma, colorectal carcinoma, hepatocellular carcinoma, gastrointestinal carcinoma, melanoma, breast cancer, and sarcoma. A set of 196 patients was held out for validation. Clinical data including patient, treatment, and cancer metadata were collected at baseline for all patients, as well as computed tomography (CT) and ultrasound (US) images. Radiologists annotated all metastases on the CT images and the visible tumor lesion on the US images. AI models were used to extract relevant features from the regions of interest on CT and US images. In addition, handcrafted features related to the tumor burden were extracted from the annotations of all lesions on CT, such as the number of lesions and the tumor burden volume per organ (lungs, liver, skull, bone, other). Finally, a Cox regression model was fitted to the set of imaging features and clinical features. The annotation process led to 1147 annotated US images with lesion delineations and 4564 reviewed CTs, of which 989 were selected and fully annotated, for a total of 9516 annotated lesions. The developed model reaches an average concordance index of 0.71 (95% CI, 0.67-0.75). Using a risk threshold of 50%, the PULS-AI model significantly separates (log-rank test P-value < 0.001) high-risk patients from low-risk patients (median OS of 12 and 32 months, respectively), with a hazard ratio of 3.52 (95% CI, 2.35-5.28). The results of this study show that AI algorithms are able to extract relevant information from radiology images and to aggregate data from multiple modalities to build powerful prognostic tools. Such tools may provide assistance to oncology clinicians in therapeutic decision-making. Citation Format: Kathryn Schutte, Fabien Brulport, Sana Harguem-Zayani, Jean-Baptiste Schiratti, Ridouane Ghermi, Paul Jehanno, Alexandre Jaeger, Talal Alamri, Raphael Naccache, Leila Haddag-Miliani, Teresa Orsi, Jean-Philippe Lamarque, Isaline Hoferer, Littisha Lawrance, Baya Benatsou, Imad Bousaid, Mickael Azoulay, Antoine Verdon, François Bidault, Corinne Balleyguier, Victor Aubert, Etienne Bendjebbar, Charles Maussion, Nicolas Loiseau, Benoit Schmauch, Meriem Sefta, Gilles Wainrib, Thomas Clozel, Samy Ammari, Nathalie Lassau. PULS-AI: A multimodal artificial intelligence model to predict survival of solid tumor patients treated with antiangiogenics [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1924.
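The final modeling stage described above, a Cox proportional-hazards regression on the pooled imaging and clinical features followed by risk stratification, can be sketched with the lifelines library as below. The data frame, column names, and the median-risk split are illustrative assumptions, not the study's actual pipeline or its 50% risk threshold.

```python
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.statistics import logrank_test

def fit_and_stratify(df: pd.DataFrame) -> CoxPHFitter:
    # df: one row per patient, with imaging + clinical feature columns plus
    # "duration" (months of follow-up) and "event" (1 = death observed).
    cph = CoxPHFitter(penalizer=0.1)  # light regularization for many features
    cph.fit(df, duration_col="duration", event_col="event")
    print("concordance index:", cph.concordance_index_)

    # Split at the median predicted partial hazard (a stand-in for the
    # study's risk threshold) and compare the two survival curves.
    risk = cph.predict_partial_hazard(df)
    high = risk > risk.median()
    result = logrank_test(
        df.loc[high, "duration"], df.loc[~high, "duration"],
        event_observed_A=df.loc[high, "event"],
        event_observed_B=df.loc[~high, "event"],
    )
    print("log-rank p-value:", result.p_value)
    return cph
```

The concordance index reported by the fit corresponds to the 0.71 metric quoted in the abstract, and the log-rank test mirrors the high-risk vs. low-risk comparison that yielded the hazard ratio of 3.52.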