Deep Learning (DL) has become a crucial technology for multimedia computing. It offers a powerful instrument to automatically produce high-level abstractions of complex multimedia data, which can be exploited in a number of applications, including object detection and recognition, speech-to-text, media retrieval, and multimodal data analysis. The availability of affordable large-scale parallel processing architectures, together with the sharing of effective open-source code implementing the basic learning algorithms, has caused a rapid diffusion of DL methodologies, bringing a number of new technologies and applications that in most cases outperform traditional machine learning techniques. In recent years, the possibility of implementing DL technologies on mobile devices has attracted significant attention. Thanks to this technology, portable devices may become smart objects capable of learning and acting. The path towards these exciting future scenarios, however, entails a number of important research challenges. DL architectures and algorithms are not easily adapted to the storage and computation resources of a mobile device. Therefore, there is a need for new generations of mobile processors and chipsets, small-footprint learning and inference algorithms, new models of collaborative and distributed processing, and a number of other fundamental building blocks. This survey reports the state of the art in this exciting research area, looking back at the evolution of neural networks and arriving at the most recent results in terms of methodologies, technologies, and applications for mobile environments.
In this study we compare three different fine-tuning strategies in order to investigate the best way to transfer the parameters of popular deep convolutional neural networks that were trained for a visual annotation task on one dataset, to a new, considerably different dataset. We focus on the concept-based image/video annotation problem and use ImageNet as the source dataset, while the TRECVID SIN 2013 and PASCAL VOC-2012 classification datasets are used as the target datasets. A large set of experiments examines the effectiveness of three fine-tuning strategies on each of three different pre-trained DCNNs and each target dataset. The reported results give rise to guidelines for effectively fine-tuning a DCNN for concept-based visual annotation.
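Although the abstract does not name the three strategies, the core decision in any fine-tuning setup is which pre-trained layers to keep frozen and which to update on the target dataset. The toy sketch below illustrates that decision space; the strategy names and layer list are illustrative assumptions, not the paper's actual experimental setup:

```python
def select_trainable(layers, strategy):
    """Return the layer names whose weights would be updated during
    fine-tuning under a given (hypothetical) strategy."""
    if strategy == "classifier_only":   # freeze all pre-trained layers,
        return [layers[-1]]             # retrain only the new classifier
    if strategy == "top_layers":        # also adapt the topmost layers
        return layers[-3:]
    if strategy == "full_network":      # retrain every layer end-to-end
        return list(layers)
    raise ValueError(f"unknown strategy: {strategy}")

# A toy stand-in for a pre-trained DCNN's layer list (AlexNet-like naming).
dcnn = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]

for s in ("classifier_only", "top_layers", "full_network"):
    print(s, "->", select_trainable(dcnn, s))
```

In practice, retraining fewer layers is cheaper and less prone to overfitting on a small target dataset, while retraining more layers helps when the target dataset differs substantially from the source, which is the trade-off such comparisons probe.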