Motivation: With an overwhelming amount of textual information in molecular biology and biomedicine, there is a need for effective and efficient literature mining and knowledge discovery that can help biologists gather and use the knowledge encoded in text documents. To make organized and structured information available, automatic recognition of biomedical entity names is critical for information retrieval, information extraction and automated knowledge acquisition. Results: In this paper, we present a named entity recognition system for the biomedical domain, called PowerBioNE. To deal with the special naming conventions of the biomedical domain, we propose various evidential features: (1) word formation pattern; (2) morphological pattern, such as prefix and suffix; (3) part-of-speech; (4) head noun trigger; (5) special verb trigger and (6) name alias feature. All the features are integrated effectively and efficiently through a hidden Markov model (HMM) and an HMM-based named entity recognizer. In addition, a k-nearest neighbor (k-NN) algorithm is proposed to resolve the data sparseness problem in our system. Finally, we present a pattern-based post-processing step that automatically extracts rules from the training data to deal with the cascaded entity name phenomenon. To the best of our knowledge, PowerBioNE is the first system that deals with the cascaded entity name phenomenon. Evaluation shows that our system achieves F-measures of 66.6 and 62.2 on the 23 classes of GENIA V3.0 and V1.1, respectively. In particular, our system achieves an F-measure of 75.8 on the 'protein' class of GENIA V3.0. For comparison, our system outperforms the best published result by 7.8 on GENIA V1.1, without the help of any dictionaries.
It also shows that our HMM and the k-NN algorithm outperform other models, such as back-off HMM, linear interpolated HMM, support vector machines, C4.5, C4.5 rules and RIPPER, by effectively capturing the local context dependency and resolving the data sparseness problem. Moreover, evaluation on GENIA V3.0 shows that the post-processing for the cascaded entity name phenomenon improves the F-measure by 3.9. Finally, error analysis shows that about half of the errors are caused by the strict annotation scheme and the annotation inconsistency in the GENIA corpus. This suggests that our system achieves an acceptable F-measure of 83.6 on the 23 classes of GENIA V3.0 and in particular 86.2 on the 'protein' class, without the help of any dictionaries. We think that an F-measure of 90 on the 23 classes of GENIA V3.0, and in particular 92 on the 'protein' class, can be achieved by refining the annotation scheme of the GENIA corpus (for example, a more flexible annotation scheme and better annotation consistency) and by including a reasonable biomedical dictionary.
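As an illustration of the first evidential feature listed above, the word formation pattern, the following is a minimal sketch of how a token might be mapped to a coarse orthographic class. The abstract does not give the actual pattern inventory used in PowerBioNE, so the label names (`CapsAndDigits`, `GreekLetter`, and so on), the small Greek-letter list and the function name `word_formation_pattern` are all assumptions made for this example.

```python
# Hypothetical label set: the abstract does not enumerate the real patterns.
GREEK_LETTERS = {"alpha", "beta", "gamma", "delta", "kappa"}


def word_formation_pattern(token: str) -> str:
    """Map a token to a coarse orthographic class (illustrative only)."""
    if token.lower() in GREEK_LETTERS:
        return "GreekLetter"
    if token.isdigit():
        return "Digits"

    has_upper = any(c.isupper() for c in token)
    has_lower = any(c.islower() for c in token)
    has_digit = any(c.isdigit() for c in token)

    # Letters mixed with digits, e.g. "IL-2" or "p53".
    if has_digit and has_upper and not has_lower:
        return "CapsAndDigits"
    if has_digit and (has_upper or has_lower):
        return "LettersAndDigits"

    # Purely alphabetic shapes, e.g. "DNA", "Protein", "protein".
    if token.isalpha() and token.isupper():
        return "AllCaps"
    if token[:1].isupper() and token[1:].islower():
        return "InitCap"
    if has_upper and has_lower:
        return "MixedCase"
    if token.isalpha() and token.islower():
        return "Lowercase"
    return "Other"


if __name__ == "__main__":
    for tok in ["IL-2", "NF-kappaB", "kappa", "DNA", "Protein", "p53", ","]:
        print(tok, word_formation_pattern(tok))
```

In an HMM-based tagger, a class like this would typically be one component of each token's observation, alongside the other features; how PowerBioNE combines them is not specified in the abstract.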
A tumor is an abnormal tissue that can appear in any part of the body. It can be classified as either benign or malignant. Breast tumors are among the most common tumors in women. Various benign disorders, such as the development of cysts in a woman's breast, occur due to hormonal changes and carry the risk of becoming malignant. A number of thermal models have been reported that differentiate between normal and malignant breast tissue. However, no thermal model has been reported in the literature that studies the effect of benign disorders or distinguishes between benign and malignant disorders in a woman's breast. This paper attempts to study the thermal disturbances caused by cysts and malignant tumors in the fat tissue of a woman's breast. The model is developed for a two-dimensional steady-state case using Pennes' bioheat equation and incorporates parameters such as thermal conductivity, blood mass flow rate and self-controlled metabolic heat generation. Appropriate adiabatic boundary conditions have been framed for various environmental conditions. The finite element method has been employed to obtain the solution. Results have been obtained for different sizes of spherical cysts and different tissue depths in a hemispherical breast. The relation of the size and position of the cysts to the thermal distribution in the various tissue layers of the breast has been studied. The thermal profiles for cysts and malignant tumors in a woman's breast have been compared. A contrast is observed between the thermal behavior of a cyst and that of a malignant tumor, which can be useful for distinguishing between the two and for preventing false-positive tests for malignant tumors. Accordingly, this study found that various factors could affect cancer classification and prediction.
Therefore, in this study, breast cancer data classification was performed using three classification techniques: Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest (RF). To improve performance, the models were trained with features selected according to the analysis.
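For the thermal model summarized above, the steady-state Pennes bioheat equation balances conduction, blood perfusion and metabolic heat. In one dimension with constant properties it is commonly written as k T''(x) + beta (T_a - T(x)) + q_m = 0, where beta is the product of blood mass flow rate and blood specific heat. The sketch below is a deliberately simplified stand-in, not the paper's method: it uses one dimension and finite differences instead of two dimensions and finite elements, fixed-temperature (Dirichlet) boundaries instead of the boundary conditions framed in the paper, and the function names and all parameter values are assumptions chosen only for illustration.

```python
import math


def solve_pennes_1d(k, beta, t_art, q_met, length, t_left, t_right, n_interior):
    """Finite-difference solution of k*T'' + beta*(t_art - T) + q_met = 0.

    Returns temperatures at n_interior + 2 equally spaced nodes on
    [0, length], with T(0) = t_left and T(length) = t_right (assumed).
    """
    h = length / (n_interior + 1)
    off = -k / h**2                # sub- and super-diagonal entries
    diag = 2.0 * k / h**2 + beta   # main diagonal entry
    rhs = [beta * t_art + q_met] * n_interior
    rhs[0] -= off * t_left         # fold boundary temperatures into the RHS
    rhs[-1] -= off * t_right

    # Thomas algorithm: forward elimination for the tridiagonal system.
    c_prime = [0.0] * n_interior
    d_prime = [0.0] * n_interior
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, n_interior):
        denom = diag - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (rhs[i] - off * d_prime[i - 1]) / denom

    # Back substitution.
    temps = [0.0] * n_interior
    temps[-1] = d_prime[-1]
    for i in range(n_interior - 2, -1, -1):
        temps[i] = d_prime[i] - c_prime[i] * temps[i + 1]
    return [t_left] + temps + [t_right]


def exact_pennes_1d(x, k, beta, t_art, q_met, length, t_left, t_right):
    """Closed-form solution of the same constant-coefficient problem."""
    t_inf = t_art + q_met / beta   # temperature far from any boundary
    m = math.sqrt(beta / k)
    theta_left = t_left - t_inf
    theta_right = t_right - t_inf
    return t_inf + (
        theta_left * math.sinh(m * (length - x))
        + theta_right * math.sinh(m * x)
    ) / math.sinh(m * length)


if __name__ == "__main__":
    # Illustrative values only (SI units); not taken from the paper.
    params = dict(k=0.21, beta=800.0, t_art=37.0, q_met=400.0,
                  length=0.05, t_left=37.0, t_right=33.0)
    profile = solve_pennes_1d(n_interior=199, **params)
    print("mid-depth temperature:", round(profile[100], 3))
```

Comparing the numerical profile against `exact_pennes_1d` is a quick sanity check on the discretization before moving to geometries, such as a cyst embedded in a hemispherical domain, where no closed form is available.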