Nikolaos Tsipas scite author profile

Machine-Assisted Learning in Highly-Interdisciplinary Media Fields: A Multimedia Guide on Modern Art

Art and technology have always been tightly intertwined, each strongly influencing the other. Technological evolution has led to today’s digital media landscape, with mediated communication tools that provide new creative means of expression (i.e., new-media art). Although art cannot easily be enclosed in classical teaching procedures, rich-media interaction can turn the process into an augmented schooling experience. The current work focuses on the deployment of a modern-art web guide that aims to enhance traditional approaches with machine-assisted blended learning. In this perspective, the “machine” has a twofold goal: to offer highly interdisciplinary multimedia services for both in-class demonstration and self-training support, and to crowdsource users’ feedback in order to train artificial intelligence systems on the semantics of painting movements. The paper presents the implementation of the “Istoriart” website through the main phases of Analysis, Design, Development, and Evaluation, while also answering typical questions regarding its impact on the targeted audience. Building on this case study, the initial hypotheses on the multidisciplinary usefulness and contribution of the new digital services are put to the test and verified.

Crowdsourcing Audio Semantics by Means of Hybrid Bimodal Segmentation with Hierarchical Classification

Vrysis, Tsipas, Dimoulas et al. 2016

J. Audio Eng. Soc.

1D/2D Deep CNNs vs. Temporal Feature Integration for General Audio Classification

Vrysis, Tsipas, Thoidis et al. 2020

J. Audio Eng. Soc.

Audiovisual production, restoration-archiving and content management methods to preserve local tradition and folkloric heritage

Dimoulas, Kalliris, Chatzara et al. 2014

Journal of Cultural Heritage

Efficient audio-driven multimedia indexing through similarity-based speech / music discrimination

Tsipas, Vrysis, Dimoulas et al. 2017

Multimed Tools Appl

Semi-supervised audio-driven TV-news speaker diarization using deep neural embeddings

Tsipas, Vrysis, Konstantoudakis et al. 2020

In this paper, an audio-driven, multimodal approach for speaker diarization in multimedia content is introduced and evaluated. The proposed algorithm is based on semi-supervised clustering of audio-visual embeddings, generated using deep learning techniques. The two modes, audio and video, are separately addressed; a long short-term memory Siamese neural network is employed to produce embeddings from audio, whereas a pre-trained convolutional neural network is deployed to generate embeddings from two-dimensional blocks representing the faces of speakers detected in video frames. In both cases, the models are trained using cost functions that favor smaller spatial distances between samples from the same speaker and greater spatial distances between samples from different speakers. A fusion stage, based on hypotheses derived from the established practices in television content production, is deployed on top of the unimodal sub-components to improve speaker diarization performance. The proposed methodology is evaluated against VoxCeleb, a large-scale dataset with hundreds of available speakers and AVL-SD, a newly developed, publicly available dataset aiming at capturing the peculiarities of TV news content under different scenarios. In order to promote reproducible research and collaboration in the field, the implemented algorithm is provided as an open-source software package.
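The abstract above describes cost functions that pull embeddings of the same speaker together and push embeddings of different speakers apart, but it does not name the exact loss. A minimal sketch of one common choice with that property, a pairwise contrastive loss, is shown below. The function names, the `margin` parameter, and the use of Euclidean distance are illustrative assumptions, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_speaker, margin=1.0):
    """Pairwise contrastive loss (assumed here; the paper's loss may differ).

    Same-speaker pairs are penalized by their squared distance, so the loss
    shrinks as the embeddings move closer. Different-speaker pairs are
    penalized only while they sit closer than `margin`.
    """
    d = euclidean(emb_a, emb_b)
    if same_speaker:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings, for illustration only.
near = contrastive_loss([0.0, 0.0], [0.0, 0.0], same_speaker=True)
far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_speaker=False)
```

In this sketch both example pairs incur zero loss: the same-speaker pair coincides, and the different-speaker pair is already farther apart than the margin.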

Web Radio Automation for Audio Stream Management in the Era of Big Data

2020

Radio is evolving in a changing digital media ecosystem. Audio-on-demand has shaped the landscape of big unstructured audio data available online. In this paper, a framework for knowledge extraction is introduced, to improve discoverability and enrichment of the provided content. A web application for live radio production and streaming is developed. The application offers typical live mixing and broadcasting functionality, while performing real-time annotation as a background process by logging user operation events. For the needs of a typical radio station, a supervised speaker classification model is trained for the recognition of 24 known speakers. The model is based on a convolutional neural network (CNN) architecture. Since not all speakers are known in radio shows, a CNN-based speaker diarization method is also proposed. The trained model is used for the extraction of fixed-size identity d-vectors. Several clustering algorithms are evaluated, having the d-vectors as input. The supervised speaker recognition model for 24 speakers scores an accuracy of 88.34%, while unsupervised speaker diarization scores a maximum accuracy of 87.22%, as tested on an audio file with speech segments from three unknown speakers. The results are considered encouraging regarding the applicability of the proposed methodology.

Augmenting Social Multimedia Semantic Interaction through Audio-Enhanced Web-TV Services

Tsipas, Zapartas, Vrysis et al. 2015
