The upsurge in the volume of unwanted emails called spam has created an intense need for the development of more dependable and robust antispam filters. Machine learning methods of recent are being used to successfully detect and filter spam emails. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review covers survey of the important concepts, attempts, efficiency, and the research trend in spam filtering. The preliminary discussion in the study background examines the applications of machine learning techniques to the email spam filtering process of the leading internet service providers (ISPs) like Gmail, Yahoo and Outlook emails spam filters. Discussion on general email spam filtering process, and the various efforts by different researchers in combating spam through the use machine learning techniques was done. Our review compares the strengths and drawbacks of existing machine learning approaches and the open research problems in spam filtering. We recommended deep leaning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails.
The field of natural language processing enables machines to read and understand the languages human being speaks. There are three major languages in Nigeria: Yorùbá, Igbo and Hausa. Yorùbá, a major Nigeria language spoken by over fifty million people which has the potentials of serving as medium for scientific and technological development deserves more recognition than it is in Nigeria today. Developing a computational model for English language and Yoruba language noun-phrases involve a profound understanding of the syntactic and grammatical features of the two languages as well as their vocabularies since they are not related syntactically and grammatically. Twenty nine rules were formulated for the noun phrase translations which were specified using the context free grammar (CFG). We then modeled and recognized the grammar of the language using the finite state automata (FSA) whose operations was based on the first set techniques. The first sets techniques allow the parser to choose which production rule to apply based on the first input word of an input phrase. We also developed a bilingual lexicon which is made up of words in English language with their corresponding Yoruba counterparts and their equivalent part of speech. The model was implemented using PHP Hypertext Preprocessor (PhP) programming language and my structured query language (SQL) and was tested on four-hundred randomly selected noun-phrases and gives accuracy of 91% which is quite encouraging.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.