“…Although it is an under-resourced language, the presence of Malayalam on the internet, in the form of articles and data repositories, has been growing steadily over the years. It has featured in a limited number of NLP tasks, including morphological analysis (Bhavukam et al., 2018), POS tagging (Akhil et al., 2020) and NER (Ajees and Idicula, 2018). However, many studies use small, locally generated data sets (Nambiar et al., 2019) or domain-specific data sets (Kumar et al., 2019; Devi et al., 2016), which are usually not freely available.…”