“…A theoretical study of such methods from the pre-Deep Learning era is provided in [26]. Rank-level decision-level fusion methods, such as Borda Count voting [27], [28], [29] and Reciprocal Rank Voting [30], are less popular but have been successfully applied in the field of biometric identification [31], [32]. Several works target multimodal fusion through learning-based methods, e.g., using SVM, LSTM, or neural network fusion layers [33], [34], [35], [7], [36], [37].…”
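
To make the two rank-level schemes named above concrete, the following is a minimal Python sketch, not the implementation used in any of the cited works. It assumes each modality outputs a full ranking of the same candidates, best first. The function names `borda_count` and `reciprocal_rank_voting`, the alphabetical tie-breaking, and the example candidates are illustrative assumptions; the exact scoring in [27]-[30] may differ.

```python
from collections import defaultdict


def borda_count(rankings):
    """Fuse rankings with Borda count: in a list of n candidates,
    the candidate at 0-based position p receives n - 1 - p points."""
    scores = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    # Highest total first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda c: (-scores[c], c))


def reciprocal_rank_voting(rankings):
    """Fuse rankings by summing 1 / rank (1-based) per candidate."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate in enumerate(ranking, start=1):
            scores[candidate] += 1.0 / rank
    # Highest total first; ties broken alphabetically (an assumption).
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical rankings from three modalities, best candidate first.
modality_rankings = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["bob", "carol", "alice"],
]

fused_borda = borda_count(modality_rankings)
fused_rr = reciprocal_rank_voting(modality_rankings)
```

Both schemes use only the rank positions produced by each modality, so they need no score normalization across modalities, in contrast to the learning-based fusion layers mentioned afterwards.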