Diabetic Retinopathy (DR) is one of the major causes of visual impairment and blindness across the world. It is usually found in patients who have suffered from diabetes for a long period. The major focus of this work is to derive an optimal representation of retinal images that helps improve the performance of DR recognition models. To extract this representation, features from multiple pre-trained ConvNet models are blended using the proposed multi-modal fusion module. These final representations are used to train a Deep Neural Network (DNN) for DR identification and severity level prediction. As each ConvNet extracts different features, fusing them using 1-D pooling and cross pooling leads to better representations than features extracted from a single ConvNet. Experimental studies on the benchmark Kaggle APTOS 2019 contest dataset reveal that the model trained on the proposed blended feature representations is superior to existing methods. In addition, we observe that cross average pooling based fusion of features from Xception and VGG16 is the most appropriate for DR recognition. With the proposed model, we achieve an accuracy of 97.41% and a kappa statistic of 94.82 for DR identification, and an accuracy of 81.7% and a kappa statistic of 71.1% for severity level prediction. Another interesting observation is that a DNN with dropout at the input layer converges faster when trained using blended features than the same model trained using uni-modal deep features.
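The blended-feature pipeline can be pictured with a short sketch. The snippet below is only one plausible reading of the abstract, not the authors' implementation: the definitions of 1-D pooling and cross average pooling, the 512-dimensional fused width, the dropout rate, and the classifier head sizes are all assumptions.

```python
# Sketch: fuse Xception and VGG16 deep features, then train a DNN head.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import Xception, VGG16
from tensorflow.keras.applications.xception import preprocess_input as pre_xcp
from tensorflow.keras.applications.vgg16 import preprocess_input as pre_vgg

# Frozen backbones reduced to feature vectors with global average pooling.
xcp = Xception(weights="imagenet", include_top=False, pooling="avg")
vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")

def one_d_pool(f1, f2, width=512):
    """1-D pooling (assumed): concatenate the two feature vectors and
    average consecutive blocks so the fused vector has a fixed width."""
    cat = np.concatenate([f1, f2], axis=-1)
    cat = cat[: (cat.size // width) * width]          # drop any remainder
    return cat.reshape(width, -1).mean(axis=1)

def cross_average_pool(f1, f2):
    """Cross average pooling (assumed): element-wise mean across the two
    models after trimming to a common length."""
    d = min(f1.shape[-1], f2.shape[-1])
    return (f1[..., :d] + f2[..., :d]) / 2.0

# One fundus image (dummy data here) run through both backbones.
img = np.random.rand(1, 299, 299, 3).astype("float32") * 255.0
f_xcp = xcp.predict(pre_xcp(img.copy()), verbose=0)[0]                  # (2048,)
f_vgg = vgg.predict(pre_vgg(tf.image.resize(img, (224, 224)).numpy()),
                    verbose=0)[0]                                       # (512,)
blended = cross_average_pool(f_xcp, f_vgg)                              # (512,)

# DNN head with dropout at the input layer, as described in the abstract.
dnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),   # 5 severity levels
])
dnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
```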
Coronavirus Disease 2019 (COVID-19) is a deadly infection that affects the respiratory organs of humans as well as animals. By 2020, the disease had turned into a pandemic affecting millions of individuals across the globe. Conducting rapid tests on a large number of suspected cases to prevent the spread of the virus has become a challenge. In the recent past, several deep learning based approaches have been developed to automate the detection of COVID-19 infection from lung Computerized Tomography (CT) scan images. However, most of them rely on a single model's prediction for the final decision, which may or may not be accurate. In this paper, we propose a novel ensemble approach that aggregates the strengths of multiple deep neural network architectures before arriving at the final decision. We take pre-trained models such as VGG16, VGG19, InceptionV3, ResNet50, ResNet50V2, InceptionResNetV2, Xception, and MobileNet and fine-tune them on lung CT scan images. All these trained models are then used to build a strong ensemble classifier that makes the final prediction. Our experiments show that the proposed ensemble approach is superior to existing ensemble approaches and sets state-of-the-art results for detecting COVID-19 infection from lung CT scan images.
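A minimal sketch of the ensemble step is given below, assuming each backbone has already been fine-tuned on the CT images and saved to disk; the filenames, the soft-voting aggregation rule, and the binary output head are assumptions, not the paper's exact design.

```python
# Sketch: soft-voting ensemble over several fine-tuned CT-scan classifiers.
import numpy as np
import tensorflow as tf

MODEL_PATHS = [
    "vgg16_ct.h5", "vgg19_ct.h5", "inceptionv3_ct.h5", "resnet50_ct.h5",
    "resnet50v2_ct.h5", "inceptionresnetv2_ct.h5", "xception_ct.h5",
    "mobilenet_ct.h5",
]  # hypothetical filenames for the fine-tuned models

def ensemble_predict(ct_batch):
    """Average per-model class probabilities (soft voting) and return
    the winning class for each CT image in the batch."""
    probs = []
    for path in MODEL_PATHS:
        model = tf.keras.models.load_model(path)
        probs.append(model.predict(ct_batch, verbose=0))   # (N, 2) each
    avg = np.mean(probs, axis=0)                            # (N, 2)
    return np.argmax(avg, axis=1)                           # 0 = non-COVID, 1 = COVID
```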
With the advent of social networking and the internet, it has become very common for people to share reviews or feedback on the products they purchase, the services they use, or their opinions on an event. These reviews can be useful to others if analyzed properly, but analyzing such an enormous amount of textual information manually is impossible, so automation is required. The objective of sentiment analysis is to determine whether the reviews or opinions given by people express a positive or a negative sentiment, predicted from the given textual information in the form of reviews or ratings. Earlier, linear regression and SVM based models were used for this task, but the introduction of deep neural networks has displaced these classical methods and achieved greater success on the problem of automatically inferring sentiment from textual descriptions. Most recent progress on this problem has come from employing recurrent neural networks (RNNs). Though RNNs give state-of-the-art performance on tasks like machine translation, caption generation, and language modeling, they suffer from vanishing or exploding gradients when used with long sentences. In this paper we use LSTMs, a variant of RNNs that is well suited to modeling very long sequences, to predict sentiment for movie review analysis. The problem is posed as a binary classification task where a review can be either positive or negative, and sentence vectorization methods are used to deal with the variability of sentence length. We investigate the impact of hyperparameters such as dropout, the number of layers, and activation functions, analyze the performance of the model under different neural network configurations, and report the performance of each configuration. The IMDB benchmark dataset is used for the experimental studies.
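A representative configuration of this kind of LSTM sentiment classifier is sketched below; the vocabulary size, sequence length, embedding width, dropout rate, and layer sizes are illustrative assumptions rather than the configurations evaluated in the paper.

```python
# Sketch: LSTM binary sentiment classifier on the IMDB benchmark.
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

VOCAB, MAXLEN = 10_000, 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB)
# Pad/truncate reviews to a fixed length to handle variable sentence length.
x_train = pad_sequences(x_train, maxlen=MAXLEN)
x_test = pad_sequences(x_test, maxlen=MAXLEN)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 64, input_length=MAXLEN),
    tf.keras.layers.LSTM(64, dropout=0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # positive vs. negative
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, batch_size=64,
          validation_data=(x_test, y_test))
```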
A machine learning model is introduced to recognize the severity level of Diabetic Retinopathy (DR), a disease observed in people who have suffered from diabetes for a long time and one of the causes of vision loss and blindness. The major objective of this approach is to generate an effective feature representation of the fundus images so that the severity level can be identified with less effort and a limited number of training samples. Color fundus images of the retina are collected, preprocessed, and fed to a deep convolutional network, the Neural Architecture Search Network (NASNet), which searches for the best convolutional layer (or "cell") in the NASNet search space, to extract deep features. The representations of the retinal images in this deep feature space are given as input to a classification model to obtain the severity level of the disease. The proposed model is applied to the benchmark APTOS 2019 retinal fundus image dataset to evaluate its performance. Our experimental studies indicate that a ν-Support Vector Machine (ν-SVM) trained on the projected deep features achieves higher accuracy than other machine learning models for fundus image classification. In addition, the experiments show that deep features from NASNet give a better representation than handcrafted features and features obtained using other projections. We observe that deep features transformed using t-distributed stochastic neighbor embedding (t-SNE) give more discriminative representations of retinal images and help achieve an accuracy of 77.90%.
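The pipeline (NASNet features, t-SNE projection, ν-SVM classification) can be sketched as below. The image size, NASNet variant, t-SNE dimensionality, and the value of ν are assumptions, and the image and label arrays are placeholders standing in for the preprocessed APTOS 2019 data.

```python
# Sketch: NASNet deep features -> t-SNE projection -> nu-SVM classifier.
import numpy as np
import tensorflow as tf
from sklearn.manifold import TSNE
from sklearn.svm import NuSVC

# Deep feature extractor: NASNet-Mobile with global average pooling.
nasnet = tf.keras.applications.NASNetMobile(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))

X_images = np.random.rand(100, 224, 224, 3).astype("float32")   # placeholder fundus images
y_labels = np.random.randint(0, 5, size=100)                     # placeholder severity levels

feats = nasnet.predict(
    tf.keras.applications.nasnet.preprocess_input(X_images * 255.0),
    verbose=0)                                                    # (N, 1056)

# Project the deep features to a low-dimensional space with t-SNE.
emb = TSNE(n_components=2, random_state=0).fit_transform(feats)

# Classify the projected features with a nu-SVM.
clf = NuSVC(nu=0.25, kernel="rbf")
clf.fit(emb, y_labels)
print("training accuracy:", clf.score(emb, y_labels))
```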