The Digital India program encourages Indian citizens to become conversant with e-services, which are primarily English-language services. However, the vast majority of the Indian population is more comfortable with vernacular languages such as Bengali, Assamese, and Hindi. Rural villagers are therefore unable to interact with a relational database management system in their native language. This motivates a system that produces SQL queries from natural language queries in Bengali, including queries that contain ambiguous words. This paper proposes a Bengali query processor named Extended Bengali Language Query Processing System (XBLQPS) to handle queries containing ambiguous words posed to a Healthcare Information database in the electronic domain. The Healthcare Information database contains doctor, hospital and department details in the Bengali language. The proposed system helps the Bengali-speaking rural population of India fetch the required information from the database efficiently. The proposed system extracts the Bengali root word by removing the inflectional part and assigns it to a specific part of speech (POS) using a modified Bengali WordNet. The part of speech of each word is detected using manual annotations based on Bengali WordNet. Patterns of noun phrases are generated to detect the correct noun phrase as well as the entity and its attribute(s). The entity and attributes are used to prepare the semantic table, which is then used to create the Structured Query Language (SQL) query. The simplified Lesk method is used to resolve ambiguous Bengali phrases in this query processing system. The accuracy, precision, recall and F1 score of the system are measured as 70%, 74%, 73%, and 73%, respectively.
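The simplified Lesk step mentioned in the abstract above can be illustrated with a minimal sketch. It picks the sense whose gloss shares the most words with the query context. The sense inventory, the stop-word list, and the English stand-in words are hypothetical and chosen only for illustration; the paper's system works on Bengali text with Bengali WordNet glosses, and its actual implementation may differ.

```python
from typing import Dict, Set

# Hypothetical stop-word list, for illustration only.
STOP_WORDS: Set[str] = {
    "the", "a", "an", "of", "in", "is", "to", "for",
    "who", "and", "or", "that", "where",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase, strip basic punctuation, and drop stop words."""
    words = (w.strip(".,?!;:").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}


def simplified_lesk(word: str, context: str, senses: Dict[str, str]) -> str:
    """Return the sense id whose gloss overlaps most with the context.

    `senses` maps a sense id to its gloss (dictionary definition).
    Ties are resolved in favour of the sense listed first.
    """
    if not senses:
        raise ValueError("at least one candidate sense is required")
    # The ambiguous word itself carries no disambiguating information.
    context_words = tokenize(context) - {word.lower()}
    best_sense, best_overlap = "", -1
    for sense_id, gloss in senses.items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


# Hypothetical sense inventory for an English stand-in word.
HEAD_SENSES = {
    "head_leader": "person who leads a department or organization",
    "head_body": "upper part of the body containing the brain",
}

if __name__ == "__main__":
    query = "Who is the head of the cardiology department?"
    print(simplified_lesk("head", query, HEAD_SENSES))
```

In a pipeline like the one described, the selected sense would then decide which entity or attribute the ambiguous word maps to in the semantic table.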
This paper proposes a disease detection system that receives a query in the form of disease symptoms described in the Bengali language. The system is able to handle natural language queries in Bengali. It assists a layperson in detecting a probable disorder or disease in their body from its symptoms. The research is challenging because resources for vernacular languages like Bengali are insufficient. The system receives a description of the patient's symptoms in Bengali and, after processing the natural language text, detects any potential disorders or diseases that may be present. The work has been implemented separately using two of the most popular sequential prediction models: Bi-directional Long Short-Term Memory (Bi-LSTM) and Bi-directional Gated Recurrent Unit (Bi-GRU). Both Bi-GRU and Bi-LSTM have provided significant results on a dataset of 3714 samples. The raw clinical text categorization data used to build the detection model has been gathered from Kaggle. The disease detection performance of both models has been measured using precision, recall and F1-score. The accuracies of the proposed system using the Bi-LSTM and Bi-GRU models are 97.85% and 99.73%, respectively.
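The evaluation metrics named in the abstract above follow standard definitions, which the sketch below computes per class from true and predicted labels. The disease labels and predictions are hypothetical toy data, not results from the paper, and the paper does not state whether its scores are per class or averaged.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(
    y_true: List[str], y_pred: List[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} for every label seen."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must be of equal length")
    scores: Dict[str, Tuple[float, float, float]] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted
        # or never occurs in the ground truth.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[label] = (precision, recall, f1)
    return scores


if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    truth = ["flu", "flu", "cold", "cold"]
    predicted = ["flu", "cold", "cold", "cold"]
    print(accuracy(truth, predicted))
    print(per_class_scores(truth, predicted))
```

Reporting precision and recall alongside accuracy matters for disease detection because a model can reach high accuracy on an imbalanced dataset while still missing rare classes.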