“…Data classification is one of the most important tasks in applications such as text categorization, tone recognition, image classification, microarray gene expression analysis, and protein structure prediction (Choi et al., 2017; Johnson and Zhang, 2017; Malhotra et al., 2017; Aggarwal et al., 2018; Fang et al., 2018; Mikołajczyk and Grochowski, 2018; Kerkeni et al., 2019; Saritas and Yasar, 2019; Yildirim et al., 2019; Chandrasekar et al., 2020). Many types of information (e.g., language, music, and genes) can be represented as sequential data, which often contain related information separated by many time steps. These long-term dependencies are difficult to model because information from the whole sequence must be retained, which increases the complexity of the model (Trinh et al., 2018; Liu et al., 2019; Shewalkar, 2019; Yu et al., 2019; Zhao et al., 2020).…”