Amir Shirian scite author profile

Amir Shirian

4 Publications

37 Citation Statements Received

192 Citation Statements Given

How they've been cited: 162

How they cite others: 192

Affiliations

University of Warwick, Coventry (United Kingdom), University of Tehran

Publications

Compact Graph Architecture for Speech Emotion Recognition

Shirian

Guha

2021

We propose a deep graph approach to address the task of speech emotion recognition. A compact, efficient and scalable way to represent data is in the form of graphs. Following the theory of graph signal processing, we propose to model a speech signal as a cycle graph or a line graph. Such a graph structure enables us to construct a Graph Convolution Network (GCN)-based architecture that can perform an accurate graph convolution, in contrast to the approximate convolution used in standard GCNs. We evaluated the performance of our model for speech emotion recognition on the popular IEMOCAP and MSP-IMPROV databases. Our model outperforms standard GCN and other relevant deep graph architectures, indicating the effectiveness of our approach. When compared with existing speech emotion recognition methods, our model achieves comparable performance to the state-of-the-art with significantly fewer learnable parameters (∼30K), indicating its applicability in resource-constrained devices. Our code is available at https://github.com/AmirSh15/Compact_SER.

Dynamic Emotion Modeling With Learnable Graphs and Graph Inception Network

Shirian

Tripathi

Guha

2022

IEEE Trans. Multimedia

Self-Supervised Graphs for Audio Representation Learning With Limited Labeled Data

Shirian

Somandepalli

Guha

2022

IEEE J. Sel. Top. Signal Process.

Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data

Shirian

Somandepalli

Guha

2022

Preprint

Large-scale databases with high-quality manual annotations are scarce in the audio domain. We thus explore a self-supervised graph approach to learning audio representations from highly limited labeled data. Considering each audio sample as a graph node, we propose a subgraph-based framework with novel self-supervision tasks that can learn effective audio representations. During training, subgraphs are constructed by sampling the entire pool of available training data to exploit the relationship between the labeled and unlabeled audio samples. During inference, we use random edges to alleviate the overhead of graph construction. We evaluate our model on three benchmark audio databases and two tasks: acoustic event detection and speech emotion recognition. Our semi-supervised model performs better than or on par with fully supervised models and outperforms several competitive existing models. Our model is compact (240k parameters) and can produce generalized audio representations that are robust to different types of signal noise.
