OmniTrack: Real-Time Detection and Tracking of Objects, Text and Logos in Video

Fassold, Hannes; Ghermi, Ridouane

doi:10.1109/ism46123.2019.00057

Cited by 8 publications

(7 citation statements)

References 10 publications


“…Fassold and Germi [31] proposed a method for video text tracking in a real-time environment. Their approach combines deep learning and an object detector to achieve improved results.…”

Section: B Text Localization In Video (mentioning)

confidence: 99%

A New Deep Wavefront Based Model for Text Localization in 3D Video

Nandanwar, Shivakumara, Ramachandra et al. 2022

IEEE Trans. Circuits Syst. Video Technol.


With the evolution of electronic devices, such as 3D cameras, addressing the challenges of text localization in 3D video (e.g., for indexing) is increasingly drawing the attention of the multimedia and video processing community. Existing methods focus on 2D video and their performance in the presence of the challenges in 3D video, such as shadow areas associated with text and irregularly sized and shaped text, degrades. This paper proposes the first approach that successfully addresses the challenges of 3D video in addition to those of 2D. It employs a number of innovations, among which, the first is the Generalized Gradient Vector Flow (GGVF) for dominant points detection. The second is the Wavefront concept for text candidate point detection from those dominant points. In addition, an Adaptive B-Spline Polygon Curve Network (ABS-Net) is proposed for accurate text localization in 3D videos by constructing tight fitting bounding polygons using text candidate points. Extensive experiments on custom (3D video) and standard datasets (2D video and scene text) show that the proposed method is practical and useful, and overall outperforms existing state-of-the-art methods.


“…The automatic detection and tracking of general objects in a video provides semantic information which is crucial for many high-level computer vision tasks in various application areas like surveillance, autonomous driving, automatic video annotation and brand monitoring. Our proposed Detic-Track algorithm for object detection and tracking is based upon the OmniTrack algorithm [1], which combines the YoloV4 object detector with TV-L1 optical flow and is real-time capable. For the Detic-Track algorithm, we have extended this algorithm in several ways.…”

Section: Algorithm (mentioning)

confidence: 99%

“…In contrast to YoloV4 (which detects only the 80 MS-COCO classes), Detic is able to detect significantly more object categories. Specifically, we employ a pretrained Detic model which detects the 1,203 object classes from the LVIS dataset. Furthermore, instead of using the whole bounding box for tracking a detected object, we utilize only the part of the bounding box corresponding to the object mask.…”

Section: Algorithm (mentioning)

confidence: 99%

Detic-Track: Robust Detection and Tracking of Objects in Video

Fassold 2022

2022 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Self Cite


The automatic detection and tracking of objects in a video is crucial for many video understanding tasks. We propose a novel deep learning based algorithm for object detection and tracking, which is able to detect more than 1,000 object classes and tracks them robustly, even for challenging content. The robustness of the tracking is due to the usage of optical flow information. Additionally, we utilize only the part of the bounding box corresponding to the object shape for the tracking.
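The idea of tracking with only the masked part of the bounding box can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `propagate_box`, the nested-list data layout, and the use of the median to aggregate flow vectors are all assumptions made for illustration. The sketch shifts a box by the median optical-flow vector of the pixels that lie inside both the box and the object mask, so that background pixels in the box do not dilute the motion estimate.

```python
from statistics import median

def propagate_box(box, flow, mask):
    """Shift a box by the median flow of the masked pixels inside it.

    Hypothetical helper for illustration only.
    box  : (x0, y0, x1, y1), integer pixel coordinates, x1/y1 exclusive
    flow : flow[y][x] is a (dx, dy) displacement for that pixel
    mask : mask[y][x] is True where the pixel belongs to the object
    """
    x0, y0, x1, y1 = box
    dxs, dys = [], []
    for y in range(y0, y1):
        for x in range(x0, x1):
            if mask[y][x]:
                dx, dy = flow[y][x]
                dxs.append(dx)
                dys.append(dy)
    if not dxs:
        return box  # no object pixels in the box: leave it unchanged
    # Median aggregation is an assumption; the source does not specify it.
    mx, my = median(dxs), median(dys)
    return (x0 + mx, y0 + my, x1 + mx, y1 + my)

# Toy 4x4 frame: a 2x2 object in the centre moves by (2, 1); background is static.
size = 4
mask = [[1 <= x <= 2 and 1 <= y <= 2 for x in range(size)] for y in range(size)]
flow = [[(2.0, 1.0) if mask[y][x] else (0.0, 0.0) for x in range(size)]
        for y in range(size)]
full = [[True] * size for _ in range(size)]

masked_result = propagate_box((0, 0, 4, 4), flow, mask)  # uses object pixels only
whole_result = propagate_box((0, 0, 4, 4), flow, full)   # uses every pixel in the box
```

In this toy frame the masked variant follows the object, while the whole-box variant is dominated by the static background and does not move. A real system would obtain the mask from an instance segmentation model and the flow from a dense optical flow method.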


“…Furthermore, even on devices capable of consuming 360° videos interactively, an user might prefer a lean-back mode, without the need to navigate around actively to explore the content. The initial prototype of the automatic camera path generator (more details can be found in [12]) is based on the information about the scene objects (persons, animals, ...), which is extracted with the method given in [13]. For each scene object, a saliency score is calculated based on several influencing factors (object class and size, motion magnitude, neighbours of object), which indicates the interestingness of the object.…”

Section: Automatic Camera Path Generator (mentioning)

confidence: 99%

The Hyper360 toolset for enriched 360° video

Fassold, Karakottas, Tsatsou et al. 2020

Preprint

Self Cite


360° video is a novel media format, rapidly becoming adopted in media production and consumption as part of today's ongoing virtual reality revolution. Due to its novelty, there is a lack of tools for producing highly engaging 360° video for consumption on a multitude of platforms. In this work, we describe the work done so far in the Hyper360 project on tools for 360° video. Furthermore, the first pilots which have been produced with the Hyper360 tools are presented.
