Background. Nasopharyngeal carcinoma (NPC), particularly in tumors endemic to the Far East, commonly harbors Epstein-Barr virus (EBV), which is thought to serve as an important oncogenic promoter. Human papillomavirus (HPV) is associated with a proportion of upper aerodigestive tract carcinomas. We hypothesized that HPV might also contribute to the pathogenesis of NPC, and we queried whether geographic and racial distinctions could be identified between NPC of the Far East and NPC diagnosed in Caucasian American patients with regard to the interrelationship of histologic subtype and viral infection. Materials and Methods. Formalin-fixed paraffin-embedded tissue (FFPET) from 30 patients (6 Caucasian Americans, 1 Chinese American, and 14 and 9 patients from Korea and China, respectively) was studied using the ligation-dependent polymerase chain reaction (LD-PCR). These cases were histologically classified according to the World Health Organization (WHO) schema for NPC. Consensus target probes complementary to the L1 region of over 30 HPV types, as well as target probes complementary to EBER-1 (EBV-related nontranslated latency-associated RNA), were used to amplify target sequences. Results. Seven of 30 NPC cases (23%) contained HPV sequences. Of the 6 Caucasian American patients with NPC, 3 (50%) were HPV positive (HPV+).
Random forest is an ensemble learning method composed of multiple decision trees, each grown on a random sample of the input data and split at each node on a random subset of features. Owing to its good classification and generalization ability, random forest has been applied successfully in various domains. However, random forest generates many noisy trees when it learns from a high-dimensional data set with many noise features. These noisy trees degrade classification accuracy and can even lead to wrong decisions for new instances. In this paper, we present a new approach to this problem, named Trees Weighting Random Forest (TWRF), which weights the trees according to their classification ability. The Out-Of-Bag data, the subset of training data left out by Bagging and not involved in building a given decision tree, is used to evaluate that tree. For simplicity, we take accuracy as the index of a tree's classification ability and set it as the tree's weight. Experiments show that TWRF performs better than the original random forest and other traditional methods, such as C4.5 and Naïve Bayes.
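The weighting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: one-split decision stumps stand in for full decision trees to keep the example short, and the names `fit_stump`, `train_twrf`, and `predict_twrf`, along with the toy data, are hypothetical. Each tree is fit on a bootstrap sample, scored by its accuracy on the Out-Of-Bag rows, and that accuracy is used as its vote weight.

```python
import random
from collections import Counter, defaultdict


def fit_stump(X, y, features):
    """Fit a one-split stump (an assumed stand-in for a full decision tree)."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            correct = left.count(l_lab) + right.count(r_lab)
            if best is None or correct > best[0]:
                best = (correct, f, t, l_lab, r_lab)
    if best is None:
        # No usable split: always predict the majority class.
        majority = Counter(y).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def train_twrf(X, y, n_trees, n_features, seed=0):
    """Return a list of (stump, weight) pairs; weight = Out-Of-Bag accuracy."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        # Bagging: draw n rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        # Random feature subset (per tree here, since a stump has one node).
        feats = rng.sample(range(len(X[0])), n_features)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        if oob:
            hits = sum(predict_stump(stump, X[i]) == y[i] for i in oob)
            weight = hits / len(oob)
        else:
            # Assumption: a tree with no Out-Of-Bag rows gets zero weight.
            weight = 0.0
        forest.append((stump, weight))
    return forest


def predict_twrf(forest, row):
    """Weighted vote: each tree adds its weight to the class it predicts."""
    votes = defaultdict(float)
    for stump, weight in forest:
        votes[predict_stump(stump, row)] += weight
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Toy data: feature 0 is informative, features 1 and 2 are noise.
    X = [[i, i % 2, (i * 7) % 5] for i in range(20)]
    y = [0 if i < 10 else 1 for i in range(20)]
    forest = train_twrf(X, y, n_trees=25, n_features=2, seed=0)
    for stump, weight in forest[:5]:
        print(stump, round(weight, 2))
    print(predict_twrf(forest, [3, 1, 1]))
```

In this sketch, stumps that happen to split on a noise feature tend to score poorly on their Out-Of-Bag rows and therefore contribute less to the final vote, which is the effect the abstract attributes to tree weighting.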