Journal of Fundamental and Applied Sciences
http://www.jfas.info
Research Article, Special Issue
http://dx.doi.org/10.4314/jfas.v9i6s.62
Creative Commons Attribution-NonCommercial 4.0

ON CORPUS DEVELOPMENT FOR MALAY INVESTMENT FRAUD DETECTION IN WEBSITE

N. Hashim et al.
Faculty of Computing, Universiti Malaysia
E-mail: mazura@utm.my

ABSTRACT
In the online world, fraudsters can easily manipulate people for gain, usually monetary gain. Corpus development research can be used to identify the keywords that fraudsters use online and so help prevent the crime. The aim of this research is to develop a corpus for Malay investment fraud that can be used in the detection and classification of investment fraud on Malay websites, and to compare techniques in order to identify the most suitable one. In this research, a Part-of-Speech (POS) tagger and a Named Entity Recognition (NER) tagger are selected. The proposed methodology consists of corpus development, training and development of the dataset using Naïve Bayes, and performance evaluation. The dataset used in this research is drawn from online news archives and discussion forums. This research can help law enforcement agencies collect and flag the keywords used by fraudsters so that they can take legal action.

Keywords: corpus development; information extraction; part-of-speech; named entity
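As an illustration of the Naïve Bayes training and performance evaluation steps named in the abstract, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with Laplace smoothing, written in plain Python. It is not the authors' implementation. The example sentences, the labels "fraud" and "legit", and the helper names tokenize, train, predict and accuracy are all assumptions introduced here for illustration; whitespace tokenization stands in for the POS and NER tagging used in the research.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples, invented for illustration only. They are NOT
# taken from the paper's corpus of news archives and discussion forums.
TRAIN = [
    ("pelaburan untung cepat dijamin pulangan tinggi", "fraud"),
    ("skim pelaburan tanpa risiko pulangan dijamin", "fraud"),
    ("bank negara umum kadar faedah baharu", "legit"),
    ("laporan tahunan syarikat tunjuk untung sederhana", "legit"),
]


def tokenize(text):
    # Assumption: simple lowercase whitespace split, no POS/NER tagging.
    return text.lower().split()


def train(examples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        # Log prior of the class.
        score = math.log(class_docs[label] / total_docs)
        class_total = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothed word likelihood.
            count = word_counts[label][tok] + 1
            score += math.log(count / (class_total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of labelled examples that the model classifies correctly."""
    correct = sum(1 for text, label in examples if predict(model, text) == label)
    return correct / len(examples)


model = train(TRAIN)
```

Under these assumptions, predict(model, "pulangan dijamin tanpa risiko") scores the text against both classes and returns the more probable label, and accuracy(model, held_out) gives a simple evaluation figure for a list of labelled held-out texts. In practice a larger annotated corpus and additional measures such as precision and recall would be needed.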