DNA sequence classification is an important challenge in genomic studies because the DNA oxidation signals of the Adenine, Cytosine, Guanine, and Thymine bases behave in a non-linear and chaotic manner. To identify the genotype of samples derived from biological sources accurately, Machine Learning (ML) methods are commonly preferred over expert-based methods because they can handle such complex-structured biological sequences. Reducing the dimension of the data without discarding information that matters for classification is an important task in ML applications. This study presents a new feature extraction method to detect two sub-types of hepatitis from nucleic acid trace files. The proposed method combines the discrete wavelet transform (DWT) with entropy. The DWT decomposes the base signals up to three levels, with the aim of capturing the information hidden in both the spatial and frequency domains. To summarize DNA trace files of different lengths, multi-scale permutation entropy (MPE) measures are then computed from the approximation and detail coefficients of the signals stored in the sub-bands. Different feature sets are extracted with the proposed method from real data covering 200 hepatitis DNA trace files and then fed to a simple memory-based learning classifier, k-NN. The classification performance of the proposed feature extraction method is compared against a method based on MPE features without wavelet decomposition. The results indicate that, in classifying hepatitis DNA trace files, the average accuracy reaches nearly 99% with feature sets based on the proposed method, even when only 30% of the samples are used for training. Cite this article as: Yiğit ÖE, Öz E. Feature extraction for DNA capillary electrophesis signals based on discrete wavelet transform combined with multi-scale permutation entropy. Sigma
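The feature extraction pipeline described in the abstract above (three-level DWT, then MPE on each sub-band, then k-NN) can be illustrated with a minimal, dependency-free sketch. Several choices here are assumptions, not details from the abstract: the Haar wavelet, the scales `(1, 2, 3)`, the embedding order `3`, Euclidean distance in the classifier, and every function name. The signals are synthetic stand-ins for trace data.

```python
import math
from collections import Counter


def haar_dwt(signal):
    """One Haar DWT level: return (approximation, detail) coefficient lists."""
    n = len(signal) - len(signal) % 2  # ignore a trailing unpaired sample
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, n, 2)]
    return approx, detail


def wavelet_subbands(signal, levels=3):
    """Return sub-bands [cA_L, cD_L, ..., cD_1] after `levels` decompositions."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return [approx] + details[::-1]


def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    return [sum(x[i * scale:(i + 1) * scale]) / scale
            for i in range(len(x) // scale)]


def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy in [0, 1]; 0.0 if x is too short."""
    n = len(x) - (order - 1) * delay
    if n <= 0:
        return 0.0
    patterns = Counter()
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    entropy = -sum((c / n) * math.log(c / n) for c in patterns.values())
    return entropy / math.log(math.factorial(order))


def dwt_mpe_features(signal, levels=3, scales=(1, 2, 3), order=3):
    """One entropy value per (sub-band, scale) pair: a fixed-length vector."""
    feats = []
    for band in wavelet_subbands(signal, levels):
        for s in scales:
            feats.append(permutation_entropy(coarse_grain(band, s), order))
    return feats


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy signals of different lengths (synthetic, not real trace data).
short = [math.sin(0.3 * t) for t in range(200)]
long_ = [math.sin(0.3 * t) + 0.5 * math.sin(2.1 * t) for t in range(500)]
print(len(dwt_mpe_features(short)), len(dwt_mpe_features(long_)))
```

With these assumed settings, every signal maps to 4 sub-bands × 3 scales = 12 features regardless of its length, which is what lets trace files of different lengths share one classifier.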
The city indices calculated by Borsa İstanbul (BIST) are an important guide for investors who want to invest in a particular region. The financial performance of 13 provinces in Turkey has been reflected by these indices since the beginning of 2009. Unlike previous studies, which mostly focus on the volatility of city index series, this study proposes a composite procedure that integrates technical indicators into different machine learning models and can be used to predict the direction of movement of city index series. The proposed procedure is applied to the Istanbul city index (XSIST) series, which has the highest number of stocks traded on BIST. Thirty-eight technical indicators based on volume, volatility, trend, and momentum are calculated, and the indicators with the greatest effect on the daily change of the XSIST series are selected as inputs to 6 different machine learning models. The performance of the learning models is compared using metrics based on confusion matrices.
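As a rough illustration of the evaluation step, the sketch below labels daily direction from a closing-price series and scores predictions with confusion-matrix metrics. The abstract names neither the 38 indicators nor the 6 models, so everything here is an assumption: the one-day momentum indicator, the "predict up when momentum is positive" rule that stands in for a trained model, the function names, and the made-up prices.

```python
def direction_labels(closes):
    """Label day i with 1 if the next close is higher, else 0."""
    return [1 if closes[i + 1] > closes[i] else 0
            for i in range(len(closes) - 1)]


def momentum(closes, window=1):
    """Close minus the close `window` days earlier (None until available)."""
    return [None if i < window else closes[i] - closes[i - window]
            for i in range(len(closes))]


def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 ("up") as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up closing prices, for illustration only.
closes = [10, 11, 12, 11, 10, 11, 13, 12]
labels = direction_labels(closes)
mom = momentum(closes, window=1)

# Keep only days that have both an indicator value and a next-day label.
days = [i for i in range(len(labels)) if mom[i] is not None]
y_true = [labels[i] for i in days]
y_pred = [1 if mom[i] > 0 else 0 for i in days]  # placeholder rule, not a model
print(metrics(y_true, y_pred))
```

Computing the same metrics for each candidate model on a held-out period would give the kind of side-by-side comparison the abstract describes.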