Office-based laryngeal surgery performed using a CO2 laser was shown to be a feasible treatment option for various types of vocal lesions. However, patients should not undergo this procedure if they have multiple bulky lesions or lesions involving the subglottic area, the laryngeal ventricle, or (in cases of inadequate laryngeal stability) the free edge of a vocal fold.
This paper presents an algorithm that specifically enhances the maxillary sinuses in occipitomental view radiographs, using a novel contrast enhancement technique based on adaptive morphological texture analysis. First, the skull X-ray (SXR) is decomposed into rotational blocks (RBs). Second, each RB is rotated in various directions and processed using morphological kernels to obtain the dark and bright features. Third, a gradient-based block segmentation decomposes the interpolated feature maps into feature blocks (FBs). Finally, the histograms of the FBs are equalized and overlaid locally on the input SXR. The performance of the proposed method was evaluated on an independent dataset comprising 145 occipitomental view human SXR images. According to the experimental results, the proposed method is able to increase the diagnostic accuracy by 83.45% when compared against the computed tomography modality as the gold standard.
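Two of the building blocks named above, morphological extraction of bright and dark features and histogram equalization of a block, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed flat square kernel and 8-bit pixel values, and it omits the rotational blocks, the adaptive kernel selection, and the gradient-based segmentation. The function names (`bright_features`, `dark_features`, `equalize`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of integer pixel values


def _window_filter(img: Image, k: int, op: Callable[[List[int]], int]) -> Image:
    """Apply op (min or max) over a k x k flat window, clipped at the borders."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = op(window)
    return out


def erode(img: Image, k: int = 3) -> Image:
    return _window_filter(img, k, min)


def dilate(img: Image, k: int = 3) -> Image:
    return _window_filter(img, k, max)


def bright_features(img: Image, k: int = 3) -> Image:
    """White top-hat: image minus its opening (erosion then dilation)."""
    opened = dilate(erode(img, k), k)
    return [[p - o for p, o in zip(row, orow)] for row, orow in zip(img, opened)]


def dark_features(img: Image, k: int = 3) -> Image:
    """Black top-hat: closing (dilation then erosion) minus the image."""
    closed = erode(dilate(img, k), k)
    return [[c - p for p, c in zip(row, crow)] for row, crow in zip(img, closed)]


def equalize(block: Image, levels: int = 256) -> Image:
    """Histogram-equalize one block using its cumulative distribution."""
    flat = [p for row in block for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant block: nothing to stretch
        return [row[:] for row in block]
    scale = (levels - 1) / (n - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in block]
```

In this sketch, a small bright structure on a uniform background appears only in the output of `bright_features`, while `dark_features` stays at zero; a dark structure behaves the other way around. How the paper combines these maps across rotations is not specified in the abstract and is left out here.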
BACKGROUND
Dysphonia influences the quality of life by interfering with communication. However, laryngoscopic examination is expensive and not readily accessible in primary care units. Experienced laryngologists are required to achieve an accurate diagnosis.
OBJECTIVE
This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence.
METHODS
We collected 29 normal voice samples and 527 samples from individuals with voice disorders, including vocal atrophy (n=210), unilateral vocal paralysis (n=43), organic vocal fold lesions (n=244), and adductor spasmodic dysphonia (n=30). The 556 samples were divided into two sets: 440 samples as the training set and 116 samples as the testing set. A convolutional neural network approach was applied to train the model, and its findings were compared with those of human specialists.
RESULTS
The convolutional neural network model achieved a sensitivity of 0.70, a specificity of 0.90, and an overall accuracy of 65.5% for distinguishing normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. By comparison, the overall accuracy of the human specialists was 58.6% and 49.1% for the two laryngologists, and 38.8% and 34.5% for the two general ear, nose, and throat doctors.
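For readers unfamiliar with how such figures are derived in a multi-class setting, the following is a minimal sketch of computing overall accuracy and one-vs-rest sensitivity and specificity from a confusion matrix. The abstract does not state how sensitivity and specificity were aggregated across the five classes, so the one-vs-rest formulation here is an assumption, and the matrix `toy_cm` contains made-up numbers for three classes, not data from the study.

```python
from typing import Dict, List

ConfusionMatrix = List[List[int]]  # cm[true_class][predicted_class]


def overall_accuracy(cm: ConfusionMatrix) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def one_vs_rest(cm: ConfusionMatrix, k: int) -> Dict[str, float]:
    """Sensitivity and specificity for class k against all other classes."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # class k samples predicted as another class
    fp = sum(row[k] for row in cm) - tp   # other samples predicted as class k
    tn = total - tp - fn - fp
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp)}


# Made-up 3-class example, purely illustrative (not study data).
toy_cm = [
    [5, 1, 0],
    [2, 6, 2],
    [0, 1, 3],
]
accuracy = overall_accuracy(toy_cm)   # 14 correct out of 20
class_1 = one_vs_rest(toy_cm, 1)      # TP=6, FN=4, FP=2, TN=8
```

As a plausibility check on the reported figure, an overall accuracy of 65.5% on the 116 test samples is consistent with 76 correct predictions (76/116 ≈ 0.655), although the abstract does not report the count itself.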
CONCLUSIONS
We developed an artificial intelligence-based screening tool for common vocal fold diseases, which showed high specificity after training with our Mandarin pathological voice database. This approach has clinical potential for general vocal fold disease screening via voice, for example as a quick survey during a general health examination. It can also be applied in telemedicine for areas where primary care units lack laryngoscopic capabilities.