Deep convolutional neural networks (DCNNs) and recurrent neural networks (RNNs) have proven to be an important research area in multimedia understanding and have achieved remarkable action recognition performance. However, videos contain rich motion information at varying scales, and existing recurrence-based pipelines fail to capture long-term motion dynamics in videos with diverse motion scales and complex actions performed by multiple actors. Attending to contextual and salient features is more important than mapping a video frame into a static video representation. This work presents a novel pipeline that analyzes and processes video information using a 3D convolutional (C3D) network and a newly introduced deep bidirectional LSTM. Like the popular two-stream ConvNet, we introduce a two-stream framework, with one modification: we replace the optical flow stream with a saliency-aware stream to avoid its computational cost. First, we generate a saliency-aware video stream by applying a saliency detection method. Second, a two-stream 3D convolutional (C3D) network is applied to the two streams, i.e., the RGB stream and the saliency-aware video stream, to extract both spatial and semantic temporal features. Next, a deep bidirectional LSTM network learns sequential deep temporal dynamics. Finally, a time-series pooling layer and a softmax layer classify human activity and behavior. The proposed system can learn long-term temporal dependencies and predict complex human actions. Experimental results demonstrate significant improvements in action recognition accuracy on several benchmark datasets.
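The back end of the pipeline above — a bidirectional LSTM over per-clip features, followed by time-series pooling and a softmax — can be sketched as follows. This is a minimal illustration with random weights, not the authors' architecture: the C3D feature extractor is stubbed out as a random feature sequence, and the layer sizes are arbitrary.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: input, forget, output gates and candidate cell,
    computed from input x and previous hidden state h."""
    z = W @ x + U @ h + b                      # stacked gate pre-activations
    H = h.shape[0]
    i, f, o = (1 / (1 + np.exp(-z[k*H:(k+1)*H])) for k in range(3))
    g = np.tanh(z[3*H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def bidirectional_lstm(seq, W, U, b):
    """Run an LSTM forward and backward over a feature sequence and
    concatenate the two hidden states at each time step."""
    H = U.shape[1]
    out_f, out_b = [], []
    h, c = np.zeros(H), np.zeros(H)
    for x in seq:                              # forward pass
        h, c = lstm_step(x, h, c, W, U, b)
        out_f.append(h)
    h, c = np.zeros(H), np.zeros(H)
    for x in reversed(seq):                    # backward pass
        h, c = lstm_step(x, h, c, W, U, b)
        out_b.append(h)
    out_b.reverse()
    return [np.concatenate([f_, b_]) for f_, b_ in zip(out_f, out_b)]

def classify(seq_features, W_cls):
    """Time-series pooling (mean over time) followed by softmax."""
    pooled = np.mean(seq_features, axis=0)
    logits = W_cls @ pooled
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
D, H, T, C = 8, 4, 5, 3                        # feature dim, hidden, steps, classes
W = rng.normal(size=(4*H, D))
U = rng.normal(size=(4*H, H))
b = np.zeros(4*H)
seq = [rng.normal(size=D) for _ in range(T)]   # stand-in for C3D clip features
probs = classify(bidirectional_lstm(seq, W, U, b), rng.normal(size=(C, 2*H)))
print(probs)
```

Because the backward pass sees the frames in reverse order, each time step's concatenated state carries context from both past and future frames, which is what allows the model to capture long-term temporal dependencies.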
With an astounding five million fatal cases every year, lung cancer is among the leading causes of mortality worldwide for both men and women. Computed tomography (CT) scans provide information that can aid the diagnosis of lung disease. The main goals of this study are to diagnose lung cancer and its severity and to identify malignant lung nodules in a given input lung image. This paper applies dedicated deep learning techniques to locate malignant lung nodules precisely. Using a DenseNet model, mixed ground-glass nodules (mGGNs) are analyzed in low-dose, low-resolution CT images with a slice thickness of 5 mm in order to classify and identify histological subtypes of lung cancer. Low-resolution CT scans are used to pathologically classify invasive adenocarcinoma (IAC) and minimally invasive adenocarcinoma (MIA). 105 low-resolution CT images with 5 mm thick slices from 105 patients at Lishui Central Hospital were selected. To detect and distinguish IAC and MIA, extended and enhanced two- and three-dimensional DenseNet models are used. The two-dimensional DenseNet model performed considerably better than the three-dimensional one, with a classification accuracy of 76.67%, sensitivity of 63.3%, specificity of 100%, and area under the receiver operating characteristic curve of 0.88. Identifying the histological subtypes of lung cancer patients should help doctors make a more precise diagnosis, even when image quality is not outstanding.
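The evaluation metrics reported above (accuracy, sensitivity, specificity) all follow from a binary confusion matrix, with IAC taken as the positive class. The sketch below shows how they are computed; the counts are hypothetical and chosen only so that sensitivity matches the reported 63.3% (19/30) and specificity is 100%, not to reproduce the study's actual data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity (true positive rate), and specificity
    (true negative rate) from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration (IAC = positive class):
acc, sens, spec = binary_metrics(tp=19, fn=11, tn=50, fp=0)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

A specificity of 100% means the model produced no false positives on the negative (MIA) class, which is why sensitivity, not specificity, is the limiting metric in the reported results.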
Communication with a hearing-impaired individual is a major challenge for a hearing person. Hearing-impaired people use hand-gesture language (sign language) to communicate with each other, which is difficult for an untrained hearing person to understand. This communication gap creates serious problems for hearing-impaired individuals while shopping, during hospitalization, and at school and home. In emergencies in particular, it is very difficult to understand a hearing-impaired person who uses sign language. Over the last few years, researchers and developers around the world have presented various ideas and systems to address this problem, but no available solution resolves the issue by enabling two-way communication between hearing-impaired and hearing persons. This paper presents a detailed description of a two-way communication system based on Pakistan Sign Language (PSL). This duplex system converts simple English text into hand gestures and vice versa; conversion from hand gestures is available not only as text but also as voice, providing more convenience to the hearing person. The main objective is to serve a large population and make hearing-impaired persons a vital part of society. A hearing person enters text (a sentence) into the application; after spelling and grammar checking, the text is divided into tokens and sub-tokens. A token is a gesture for each word of the text, while sub-tokens are the gestures for each character of a word. The combination of tokens produces the gesture sequence for the text. Conversely, when gestures are input into the application, image processing techniques recognize the hand gestures and convert them into the corresponding text or voice.
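The token/sub-token split described above can be sketched as a simple fallback scheme: words with a dedicated gesture in the lexicon become a single token, and out-of-lexicon words fall back to per-character sub-tokens (fingerspelling). The lexicon and words below are hypothetical examples, not the system's actual PSL gesture set.

```python
def tokenize(sentence, gesture_lexicon):
    """Map a sentence to gesture tokens: one token per known word,
    per-character sub-tokens for words without a dedicated gesture."""
    gestures = []
    for word in sentence.lower().split():
        word = word.strip(".,!?")
        if word in gesture_lexicon:
            gestures.append(("token", word))                    # whole-word gesture
        else:
            gestures.extend(("sub-token", ch) for ch in word)   # fingerspell it
    return gestures

# Hypothetical lexicon of words that have dedicated PSL gestures.
lexicon = {"hello", "help", "where"}
result = tokenize("Hello, where is Ali?", lexicon)
print(result)
```

Playing the resulting list in order yields the gesture sequence for the sentence; the reverse direction (gesture recognition to text or voice) is the image-processing side of the duplex system and is not sketched here.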
This paper discusses object shape identification using the local binary pattern (LBP) technique. Since LBP is computationally simple, it has been used successfully for recognition of various objects. LBP, which has the potential to be used in various identification-related fields, was applied to a number of differently shaped objects. The process converts the given image into 3x3 binary matrices, and several rounds of computation yield the final decision parameter, known as the merit function. This parameter is then used to uniquely identify the shape of different objects.
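The 3x3 binary matrix at the core of LBP is obtained by thresholding each pixel's eight neighbours against the centre pixel and reading the result as an 8-bit code. The sketch below shows the standard LBP code for a single 3x3 patch; the paper's merit function built on top of these codes is specific to that work and is not reproduced here.

```python
import numpy as np

def lbp_code(patch):
    """LBP code of a 3x3 patch: threshold the 8 neighbours against the
    centre pixel and read the resulting bits as one binary number."""
    center = patch[1, 1]
    # Clockwise neighbour order starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[r, c] >= center else 0 for r, c in offsets]
    return sum(b << (7 - k) for k, b in enumerate(bits))

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_code(patch))  # bits 10001111 -> 143
```

Sliding this window over the whole image and histogramming the codes gives the texture descriptor that downstream computations (such as a merit function) can operate on.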
Detecting rare and complex events in large video datasets or in unconstrained user-uploaded internet videos is a challenging task. Irregular camera movement, viewpoint changes, illumination variations, and significant background changes make it extremely difficult to capture the underlying motion in videos. In addition, extracting features via different modalities (separate streams) can add computational cost and introduce confusing, irrelevant spatial and semantic features. To address this problem, we present a single-stream (RGB-only) approach based on the fusion of spatial and semantic features extracted by a modified 3D residual convolutional network. We combine the spatial and semantic features under the assumption that the difference between the two types of features can reveal the accurate and relevant ones. Moreover, a temporal encoding builds relationships between consecutive video frames to discover discriminative long-term motion patterns. We conduct extensive experiments on prominent publicly available datasets. The results demonstrate the strength of our proposed model and improved accuracy compared with existing state-of-the-art methods.
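The two ideas above — using the spatial/semantic feature difference as a relevance cue, and encoding relationships between consecutive frames — can be illustrated as follows. This is a loose sketch under stated assumptions: the difference is used as a simple per-dimension weight, and the temporal encoding is reduced to first-order frame differences; the paper's actual fusion and encoding are not specified at this level of detail.

```python
import numpy as np

def fuse(spatial, semantic):
    """Difference-based fusion (illustrative): the gap between spatial
    and semantic features is treated as a relevance cue and used to
    weight the spatial features per dimension."""
    diff = np.abs(spatial - semantic)
    w = diff / (diff.sum(axis=-1, keepdims=True) + 1e-8)
    return w * spatial

def temporal_encode(frames):
    """Minimal temporal encoding: first-order differences between
    consecutive frame features, linking each frame to the next."""
    return np.diff(frames, axis=0)

rng = np.random.default_rng(1)
spatial  = rng.normal(size=(6, 16))   # 6 frames, 16-D spatial features
semantic = rng.normal(size=(6, 16))   # matching semantic features
fused    = fuse(spatial, semantic)
encoded  = temporal_encode(fused)
print(fused.shape, encoded.shape)
```

Chaining the per-frame differences across a clip is what lets such an encoding accumulate evidence of long-term motion patterns rather than treating each frame in isolation.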