“…In this group, the baselines use different approaches to deal with the multi-modal issue without considering the label dependence issue. Specifically, in these approaches, a linear layer of L dimensions with sigmoid activation is used to predict the emotions. (7) GMFN (Zadeh et al, 2018b), which explicitly models the multi-modal interactions by capturing uni-modal, bi-modal and tri-modal interactions.…”

Approaches                       Acc    HL     F1
BR (Shen et al, 2004)            0.222  0.371  0.386
CC (Read et al, 2011)            0.225  0.377  0.386
RAkLA (Tsoumakas et al, 2011)    0.242  0.376  0.397
AC (Kim et al, 2018)             0.388  0.240  0.492
LSAN (Xiao et al, 2019)          0.393  0.209  0.501
DRS2S                            0.436  0.215  0.523
GMFN (Zadeh et al, 2018b)        0.396  0.195  0.517
RAVEN                            0.416  0.195  0.517
MulT (Tsai et al, 2019)          0
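
The prediction head described in the excerpt (a linear layer of L dimensions followed by a sigmoid) can be sketched as below. This is a minimal illustration, not the authors' implementation. It assumes that L is the number of emotion labels, that the input is an already fused multi-modal feature vector, and that a label is predicted when its probability reaches 0.5. The names `predict_emotions`, `features`, `weights`, `bias`, and `threshold` are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_emotions(features, weights, bias, threshold=0.5):
    """Linear layer with L outputs followed by an element-wise sigmoid.

    features:  list of d floats (assumed fused multi-modal representation)
    weights:   L rows of d floats, one row per emotion label
    bias:      list of L floats
    threshold: assumed decision cut-off; not stated in the excerpt
    Returns (probabilities, binary labels), each of length L.
    """
    probs = []
    for w_row, b in zip(weights, bias):
        logit = sum(w * x for w, x in zip(w_row, features)) + b
        probs.append(sigmoid(logit))
    # Each label is decided independently of the others.
    labels = [1 if p >= threshold else 0 for p in probs]
    return probs, labels


if __name__ == "__main__":
    # Toy example with d = 2 features and L = 3 labels.
    probs, labels = predict_emotions(
        [1.0, -2.0],
        [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        [0.0, 0.0, 0.0],
    )
    print([round(p, 3) for p in probs], labels)
```

Because each output passes through its own sigmoid, every label is scored independently, which is consistent with the excerpt's remark that these baselines do not consider label dependence.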