“…However, many of the widely spoken languages in the world are still low-resourced due to the lack of standard large-scale corpora. The LID research with those low-resource languages is commonly conducted with independently created small-scale in-house corpora, which have limited non-lingual variations [18]. As a consequence, training state-of-theart DNN-based LID systems using such small corpus is prone to overfitting.…”