Hepatic cancer is a leading cause of cancer-related death worldwide. Detecting hepatic cancer early using computed tomography (CT) could prevent millions of patient deaths every year. However, reading tens or even hundreds of these CT scans is an enormous burden for radiologists. There is therefore an immediate need to read, detect, and evaluate CT scans automatically, quickly, and accurately. Liver segmentation and extraction from CT scans, however, is a bottleneck for any such system and remains a challenging problem. In this work, a deep learning technique originally proposed for semantic pixel-wise classification of road scenes is adopted and modified to fit liver CT segmentation and classification. The architecture, named SegNet, is a deep convolutional encoder–decoder consisting of a hierarchical correspondence of encoder–decoder layers. The proposed architecture was tested on a standard liver CT dataset and achieved tumor accuracy of up to 99.9% in the training phase.
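A distinctive feature of SegNet's encoder–decoder correspondence is that the decoder upsamples using the argmax indices recorded by the encoder's max-pooling layers. The following NumPy sketch (a toy illustration, not the authors' implementation) shows 2×2 max pooling with index recording and the matching index-based unpooling:

```python
import numpy as np

def maxpool2x2(x):
    """2x2 max pooling that also records argmax positions (SegNet-style)."""
    h, w = x.shape
    out = np.zeros((h // 2, w // 2))
    idx = np.zeros((h // 2, w // 2), dtype=int)  # flat index into x
    for i in range(h // 2):
        for j in range(w // 2):
            block = x[2*i:2*i+2, 2*j:2*j+2]
            k = int(np.argmax(block))            # winner position in block
            out[i, j] = block.flat[k]
            idx[i, j] = (2*i + k // 2) * w + (2*j + k % 2)
    return out, idx

def unpool2x2(y, idx, shape):
    """SegNet decoder upsampling: place each pooled value back at its
    recorded argmax location; all other positions stay zero."""
    x = np.zeros(shape)
    x.flat[idx.ravel()] = y.ravel()
    return x
```

In a real network these operations act per feature channel, and the zeros produced by unpooling are densified by subsequent convolutions.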
Sign language is the most natural and effective way for communication between deaf and hearing people. American Sign Language (ASL) alphabet recognition (i.e., fingerspelling) using a marker-less vision sensor is a challenging task due to difficulties in hand segmentation and appearance variations among signers. Existing color-based sign language recognition systems suffer from many challenges, such as complex backgrounds, hand segmentation, and large inter-class and intra-class variations. In this paper, we propose a new user-independent recognition system for the American Sign Language alphabet using depth images captured by the low-cost Microsoft Kinect depth sensor. Exploiting depth information instead of color images overcomes many problems, owing to its robustness against illumination and background variations. The hand region can be segmented by applying a simple preprocessing algorithm to the depth image. Feature learning with convolutional neural network architectures is applied instead of classical handcrafted feature extraction methods. Local features extracted from the segmented hand are effectively learned using a simple unsupervised Principal Component Analysis Network (PCANet) deep learning architecture. Two strategies for learning the PCANet model are proposed: training a single PCANet model on samples from all users, and training a separate PCANet model for each user. The extracted features are then recognized using a linear Support Vector Machine (SVM) classifier. The performance of the proposed method is evaluated on a public dataset of real depth images captured from various users. Experimental results show that the proposed method outperforms state-of-the-art recognition accuracy under a leave-one-out evaluation strategy.
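The core unsupervised step in PCANet is learning convolution filters as the leading principal components of mean-removed image patches. The following NumPy sketch illustrates one such filter-learning stage under simplifying assumptions (grayscale images, dense patch extraction; not the paper's exact pipeline):

```python
import numpy as np

def pcanet_filters(images, k=3, n_filters=4):
    """One PCANet stage: filters are the top principal components of
    mean-removed k x k patches collected from all training images."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                p = img[i:i+k, j:j+k].ravel()
                patches.append(p - p.mean())        # patch-mean removal
    X = np.array(patches)                           # (num_patches, k*k)
    vals, vecs = np.linalg.eigh(X.T @ X)            # ascending eigenvalues
    top = vecs[:, ::-1][:, :n_filters]              # leading eigenvectors
    return top.T.reshape(n_filters, k, k)           # each row -> k x k filter
```

The learned filters are then convolved with the input, and (in the full PCANet) the responses are binarized and summarized by block histograms before the SVM.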
Deep Convolutional Neural Networks (DCNNs) are currently the predominant technique for learning visual features from images. However, the complex structure of most recent DCNNs imposes two major requirements: a huge labeled dataset and high computational resources. In this paper, we develop a new, efficient deep unsupervised network that learns invariant image representations from unlabeled visual data. The proposed Deep Convolutional Self-Organizing Maps (DCSOM) network comprises a cascade of convolutional SOM layers trained sequentially to represent multiple levels of features. The 2D SOM grid is commonly used for either data visualization or feature extraction; this work instead employs a high-dimensional map to create a new deep network. The N-dimensional SOM (ND-SOM) grid is trained to extract abstract visual features using the classical competitive learning algorithm. The topological order of the features learned by the ND-SOM helps absorb the local transformations and deformations exhibited in visual data. The input image is divided into overlapping local patches, and each patch is represented by the N coordinates of the winning neuron in the ND-SOM grid. Each dimension of the ND-SOM can be considered a non-linear principal component, and hence can be exploited to represent the input image using a bank of N Feature Index Images (FIIs). Multiple convolutional SOM layers can be cascaded to create a deep network structure. The output layer of the DCSOM network computes local histograms of each FII bank in the final convolutional SOM layer. A set of experiments on the MNIST handwritten digit database and all its variants is conducted to evaluate the robustness of the representation learned by the proposed DCSOM network. Experimental results reveal that DCSOM outperforms state-of-the-art methods on noisy digits and achieves comparable performance to more complex deep learning architectures on the other image variations.
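The building block of DCSOM is classical SOM competitive learning, in which each input (here, a patch vector) is encoded by the grid coordinates of its best-matching unit. The sketch below uses a 2D grid for brevity (the paper generalizes to N dimensions) and fixed learning parameters chosen for illustration only:

```python
import numpy as np

def train_som(data, grid_shape=(4, 4), epochs=20, lr=0.3, sigma=1.0, seed=0):
    """Classical competitive learning on a 2D SOM grid (the DCSOM building
    block generalizes this to an N-dimensional grid)."""
    rng = np.random.default_rng(seed)
    n_units = int(np.prod(grid_shape))
    w = rng.normal(size=(n_units, data.shape[1]))        # codebook vectors
    coords = np.array(np.unravel_index(np.arange(n_units), grid_shape)).T
    for _ in range(epochs):
        for x in data:
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))  # winner neuron
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2 * sigma ** 2))           # grid neighborhood
            w += lr * h[:, None] * (x - w)               # pull toward x
    return w, coords

def winner_coords(x, w, coords):
    """A patch is encoded by the grid coordinates of its best-matching unit;
    stacking one coordinate per patch yields the Feature Index Images."""
    return coords[np.argmin(((w - x) ** 2).sum(axis=1))]
```

Because nearby grid units learn similar codebook vectors, small deformations of a patch move its winner only slightly on the grid, which is the topological robustness the abstract refers to.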
A radio mean labeling of a connected graph G is an injective function f from the vertex set V(G) to the set of natural numbers such that for any two distinct vertices u and v of G, d(u, v) + ⌈(f(u) + f(v))/2⌉ ≥ 1 + diam(G). The radio mean number of f, rmn(f), is the maximum number assigned to any vertex of G. The radio mean number of G, rmn(G), is the minimum value of rmn(f), taken over all radio mean labelings f of G. This work makes three contributions. The first is proving two theorems that determine the radio mean number for cycles and paths. The second is proposing an approximate algorithm that finds an upper bound on the radio mean number of a given graph. The third is introducing a novel integer linear programming formulation of the radio mean problem. Finally, analysis of the experimental results and a statistical test showed that the integer linear programming model outperformed the proposed approximate algorithm in CPU time only; both the integer linear programming model and the proposed approximate algorithm produced the same upper bound on the radio mean number of G.
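The radio mean condition stated above can be checked mechanically. The following sketch (assuming the condition d(u, v) + ⌈(f(u) + f(v))/2⌉ ≥ 1 + diam(G) from the radio-mean literature) verifies a candidate labeling on the cycle C_n, where distances and diameter have a closed form:

```python
import math
from itertools import combinations

def cycle_distance(u, v, n):
    """Shortest-path distance between vertices u and v on the cycle C_n."""
    d = abs(u - v) % n
    return min(d, n - d)

def is_radio_mean_labeling(f, n):
    """Check the radio mean condition on C_n for labeling f (f[u] is the
    label of vertex u); diam(C_n) = n // 2."""
    diam = n // 2
    for u, v in combinations(range(n), 2):
        if cycle_distance(u, v, n) + math.ceil((f[u] + f[v]) / 2) < diam + 1:
            return False
    return True
```

For example, on C_4 (diameter 2) the labeling 1, 2, 3, 4 satisfies the condition for every vertex pair, whereas on C_6 (diameter 3) the same consecutive labels fail for adjacent small labels.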
Deep neural networks (DNNs) are among the most promising techniques used across the sciences. A special type of DNN, the convolutional neural network (CNN), consists of several convolutional layers, each followed by an activation function and a pooling layer. The pooling layer, an important component, downsamples the feature map of the previous layer to create a new feature map with condensed resolution, significantly reducing the spatial dimension of the input. Pooling accomplishes two main goals: first, it reduces the number of parameters or weights, minimizing computational cost; second, it helps prevent overfitting of the network. In addition, pooling techniques can significantly reduce model training time. This paper provides a critical review of traditional and modern pooling techniques and highlights their strengths and weaknesses for the reader. Moreover, the performance of pooling techniques on different datasets is qualitatively evaluated and reviewed. This study is expected to contribute to a comprehensive understanding of the importance of CNNs and pooling techniques in computer vision challenges.
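The two traditional techniques that most pooling variants build on are max pooling and average pooling. A minimal NumPy sketch of non-overlapping 2D pooling (single channel, stride equal to window size, edge rows/columns that do not fill a block dropped) illustrates the downsampling:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping 2D pooling: reshape the map into size x size blocks,
    then reduce each block by max or mean."""
    h, w = x.shape
    blocks = x[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))      # keep strongest activation
    return blocks.mean(axis=(1, 3))         # average pooling
```

A 4×4 input becomes a 2×2 output, quartering the spatial resolution; max pooling keeps the strongest response per block, while average pooling smooths over the block.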
Radio frequency identification (RFID) is a rapidly developing technology, and RFID sensors have become important components in many common technology applications. The passive ultra-high frequency (UHF) tags used in RFID sensors offer high data transfer rates and long read ranges, and usually come in small, portable, application-specific designs. However, these tags suffer from significant frequency interference when mounted on metallic materials or placed near liquid surfaces. This paper presents recent advancements in passive UHF-RFID tag designs proposed to resolve these interference problems. We focus on designs intended to improve antenna read range, as well as scalable designs for miniaturized applications.
Echocardiography is an ultrasound-based imaging modality that helps physicians visualize heart chambers and valve motion. Recently, deep learning has come to play an important role in several clinical computer-assisted diagnostic systems, and there is a real need to employ deep learning methodologies to improve such systems. In this paper, we propose a deep learning system that classifies several echocardiography views and identifies their physiological location. First, spatial CNN features are extracted from each frame of the echo motion. Second, we propose novel temporal features based on neutrosophic sets, extracted from the echo motion activity. To extract the deep CNN features, we activate a pre-trained deep ResNet model. The spatial and neutrosophic temporal CNN features are then fused by feature concatenation. Finally, the fused CNN features are fed into a deep long short-term memory (LSTM) network to classify echocardiography views and identify their location. In our experiments, we employed a public echocardiography dataset consisting of 432 videos covering eight cardiac views. We investigated the performance of several pre-trained network activations; the ResNet architecture achieved the best accuracy among them. The proposed system, based on fused spatial and neutrosophic temporal deep features, achieved 96.3% accuracy and 95.75% sensitivity, and 99.1% accuracy for classifying the cardiac view location. The proposed system achieved higher accuracy than previous deep learning methods, with a significant decrease in training time cost. The experimental results are promising for our proposed approach.
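The fusion step described above can be sketched in a few lines. Since the exact neutrosophic temporal formulation is not given here, the sketch substitutes a plain frame-difference motion feature as a labeled stand-in; only the concatenation-based fusion mirrors the abstract:

```python
import numpy as np

def temporal_motion(frames):
    """Frame-difference motion features: a simple stand-in for the paper's
    neutrosophic temporal features (exact formulation not reproduced here)."""
    return np.abs(np.diff(frames, axis=0))   # one feature row per transition

def fuse(spatial_feats, temporal_feats):
    """Feature-level fusion by concatenation along the feature axis; the
    fused sequence is what a downstream LSTM classifier would consume."""
    assert spatial_feats.shape[0] == temporal_feats.shape[0]
    return np.concatenate([spatial_feats, temporal_feats], axis=1)
```

With T frames, the temporal branch yields T-1 feature rows, so the spatial features would be aligned to the same T-1 transitions before fusion.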