Abstract. Text mining has become an effective tool for analyzing text documents automatically. Clustering, classification and searching of legal documents to identify patterns in law corpora are of key interest because they aid law experts and police officers in their analyses. In this paper, we develop a document classification, clustering and search methodology based on neural network technology that helps law enforcement departments manage criminal written judgments more efficiently. To keep the number of independent Chinese keywords manageable, we use a term extraction scheme to select the top-n keywords with the highest frequency as inputs to a Back-Propagation Network (BPN), and select seven criminal categories as its target outputs. Related legal documents are then used to train and test the neural network models automatically. In addition, we use the Self-Organizing Map (SOM) method to cluster criminal written judgments. Our results show that the automatic classification and clustering modules classify and cluster legal documents with very high accuracy. Finally, the search module, which builds on these results, helps users find relevant written judgments of criminal cases.
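The keyword-selection step described in this abstract can be illustrated with a minimal sketch. It assumes the documents have already been segmented into terms; the abstract does not say how the Chinese text is segmented, so the English tokens, the function names `top_n_keywords` and `to_feature_vector`, and the alphabetical tie-breaking rule are all hypothetical. The resulting count vectors stand in for the BPN inputs, and the network itself is not shown.

```python
from collections import Counter


def top_n_keywords(tokenized_docs, n):
    """Return the n most frequent terms across all documents."""
    counts = Counter(term for doc in tokenized_docs for term in doc)
    # Sort by descending frequency, then alphabetically so ties are deterministic
    # (the tie-breaking rule is an assumption, not taken from the paper).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def to_feature_vector(doc, keywords):
    """Map one tokenized document to a vector of keyword counts."""
    counts = Counter(doc)
    return [counts[k] for k in keywords]


# Toy, already-tokenized judgments (hypothetical placeholder data).
docs = [
    ["theft", "property", "theft", "defendant"],
    ["fraud", "defendant", "property"],
    ["theft", "defendant", "fraud", "fraud"],
]

keywords = top_n_keywords(docs, 3)
vectors = [to_feature_vector(d, keywords) for d in docs]
print(keywords)
print(vectors)
```

Each vector has one entry per selected keyword, so the choice of n fixes the size of the network's input layer; the seven criminal categories would then be encoded as the target outputs.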
Purpose - The purpose of this paper is to establish a new approach for solving the expansion term problem. Design/methodology/approach - This study develops an expansion term weighting function derived from the valuable concepts used by previous approaches. These concepts include probability measurement, adjustment according to situations, and summation of weights. Formal tests have been conducted to compare the proposed weighting function with the baseline ranking model and other weighting functions. Findings - The results reveal stable performance by the proposed expansion term weighting function. It proves more effective than the baseline ranking model and outperforms other weighting functions. Research limitations/implications - Testing on additional data sets and in real working situations is required before the generalisability and superiority of the proposed expansion term weighting function can be asserted. Originality/value - The stable performance and acceptable level of effectiveness of the proposed expansion term weighting function indicate the potential for further study and development of this approach. This would add to the current methods studied by the information retrieval community for culling information from documents.