Large-scale industrial recommender systems usually face computational bottlenecks because of their enormous corpus size. To retrieve and recommend the most relevant items to users under response-time limits, an efficient index structure is an effective and practical solution. The previous work, Tree-based Deep Model (TDM) [34], greatly improves recommendation accuracy using a tree index. By indexing items in a tree hierarchy and training a user-node preference prediction model that satisfies a max-heap-like property in the tree, TDM provides logarithmic computational complexity w.r.t. the corpus size, enabling the use of arbitrary advanced models in candidate retrieval and recommendation. In tree-based recommendation methods, the quality of both the tree index and the user-node preference prediction model largely determines recommendation accuracy. We argue that the learning of the tree index and the learning of the preference model are interdependent. Our purpose in this paper is to develop a method that jointly learns the index structure and the user preference prediction model. In our proposed joint optimization framework, the index and the user preference prediction model are learned under a unified performance measure. In addition, we propose a novel hierarchical user preference representation that utilizes the tree index hierarchy. Experimental evaluations on two large-scale real-world datasets show that the proposed method improves recommendation accuracy significantly.
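As a rough illustration of why a tree index with a max-heap-like property gives logarithmic retrieval cost, the sketch below runs a level-by-level beam search over a toy perfect binary tree. It is not the TDM implementation: the heap-style node numbering, the function name `beam_search_retrieve`, and the hand-built score table (standing in for a learned user-node preference model) are all assumptions made for this example.

```python
import heapq
from typing import Callable, List


def beam_search_retrieve(num_leaves: int,
                         score: Callable[[int], float],
                         k: int) -> List[int]:
    """Return up to k leaf node ids, best first.

    Assumed layout (hypothetical, for illustration only): a perfect binary
    tree stored heap-style, so node n has children 2n+1 and 2n+2, and the
    leaves (items) are ids num_leaves-1 .. 2*num_leaves-2.
    num_leaves must be a power of two.
    """
    first_leaf = num_leaves - 1
    frontier = [0]  # start at the root
    # All frontier nodes sit on the same level, so checking one is enough.
    while frontier[0] < first_leaf:
        children = [c for n in frontier for c in (2 * n + 1, 2 * n + 2)]
        # Keep only the k best-scoring children: at most 2k scores per level.
        frontier = heapq.nlargest(k, children, key=score)
    return frontier


# Toy preference scores for 8 items (made-up numbers).
leaf_pref = [0.1, 0.7, 0.3, 0.9, 0.2, 0.4, 0.8, 0.5]
L = len(leaf_pref)
scores = [0.0] * (2 * L - 1)
scores[L - 1:] = leaf_pref
# Enforce a max-heap-like property: a parent scores as high as its best child.
for n in range(L - 2, -1, -1):
    scores[n] = max(scores[2 * n + 1], scores[2 * n + 2])

top_nodes = beam_search_retrieve(L, scores.__getitem__, k=2)
top_items = [n - (L - 1) for n in top_nodes]  # convert node ids to item indices
print(top_items)
```

In this toy setting the search scores at most 2k nodes on each of the log2(num_leaves) levels, and because the parent scores upper-bound the child scores, the beam recovers the two highest-preference items. With a learned model the heap property holds only approximately, which is one reason the quality of the index and of the model both matter.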
Ginseng, which contains ginsenosides as bioactive compounds, has been regarded as an important traditional medicine for several millennia. However, the genetic background of ginseng remains poorly understood, partly because of the plant's large and complex genome composition. We report the entire genome sequence of Panax ginseng using next-generation sequencing. The 3.5-Gb nucleotide sequence contains more than 60% repeats and encodes 42 006 predicted genes. Twenty-two transcriptome datasets and mass spectrometry images of ginseng roots were adopted to precisely quantify the functional genes. Thirty-one genes were identified to be involved in the mevalonic acid pathway. Eight of these genes were annotated as 3-hydroxy-3-methylglutaryl-CoA reductases, which displayed diverse structures and expression characteristics. A total of 225 UDP-glycosyltransferases (UGTs) were identified, and these UGTs accounted for one of the largest gene families of ginseng. Tandem repeats contributed to the duplication and divergence of UGTs. Molecular modeling of UGTs in the 71st, 74th, and 94th families revealed a regiospecific conserved motif located at the N-terminus. Molecular docking predicted that this motif captures ginsenoside precursors. The ginseng genome represents a valuable resource for understanding and improving the breeding, cultivation, and synthesis biology of this key herb.
Supplementary data are available at Bioinformatics online.
Intimate coupling of photocatalysis and biodegradation (ICPB) offers potential for degrading biorecalcitrant and toxic organic compounds. This study reports on a novel sponge-type, TiO2-coated biofilm carrier that showed significant adherence of TiO2 and ability to accumulate biomass in its interior. This carrier was tested for ICPB in a continuous-flow photocatalytic circulating-bed biofilm reactor (PCBBR) to mineralize 2,4,5-trichlorophenol (TCP), which is biorecalcitrant. Four mechanisms possibly acting in ICPB were tested separately: TCP adsorption to the carrier, UV photolysis, UV photocatalysis, and biodegradation by biofilm inside the carrier. The carrier exhibited strong TCP adsorption that followed a Freundlich isotherm with an exponent near 2. Whereas UV photolysis was negligible, photocatalysis produced TCP-degradation products that could be mineralized, and the strong adsorption of TCP to the carrier enhanced biodegradation by relieving toxicity. Validating the ICPB concept, biofilm was protected inside the carriers, although biomass originally on the outer surface of the carriers was eliminated. ICPB significantly lowered the diversity of the bacterial community, but five genera known to biodegrade chlorinated phenols (Ralstonia, Bradyrhizobium, Methylobacterium, Cupriavidus, and Pandoraea) were markedly enriched.
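For reference, the Freundlich isotherm mentioned above has the general form below. The symbols are assumptions introduced here, not notation from the abstract: q_e is the adsorbed TCP per unit mass of carrier at equilibrium, C_e is the equilibrium TCP concentration in solution, K_F is the Freundlich capacity constant, and a is the exponent that the abstract reports as being near 2. Some texts write the exponent as 1/n instead.

```latex
q_e = K_F \, C_e^{\,a}, \qquad a \approx 2
\quad\Longleftrightarrow\quad
\log q_e = \log K_F + a \log C_e
```

The log-linear form is how such an exponent is typically estimated: it is the slope of a straight-line fit of log q_e against log C_e.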
Extraction of the tongue body from digital images is essential for automated tongue diagnosis in traditional Chinese medicine. This paper presents a fully automated active contour initialization method that utilizes prior knowledge of the tongue shape and its location in tongue images. Color-space information is then introduced to control curve evolution. Combining the geometrical Snake model with the parameterized GVFSnake model, a novel approach for automatic tongue segmentation is proposed: C2G2FSnake (color control-geometric & gradient flow Snake). This method increases the curve velocity while decreasing the complexity. C2G2FSnake greatly extends the practical usability of tongue segmentation while also increasing precision. Compared with other state-of-the-art methods on tongue images of different colors, C2G2FSnake achieves automatic tongue segmentation with greatly improved accuracy.
Background: Coronary heart disease (CHD) is a common cardiovascular disease that is extremely harmful to humans. In Traditional Chinese Medicine (TCM), the diagnosis and treatment of CHD have a long history and ample experience. However, non-standard inquiry information influences diagnosis and treatment in TCM to a certain extent. In this paper, we study the standardization of inquiry information in the diagnosis of CHD and design a diagnostic model to provide a methodological reference for constructing quantitative diagnosis of CHD syndromes. In the TCM diagnosis of CHD, one patient may present several syndrome patterns, whereas conventional single-label data mining techniques can build only one model at a time. Here a novel multi-label learning (MLL) technique is explored to solve this problem.

Methods: A standardization scale for inquiry diagnosis of CHD in TCM is designed, and the inquiry diagnostic model is constructed from the collected data using MLL techniques. In this study, one popular MLL algorithm, ML-kNN, is compared with two other MLL algorithms, RankSVM and BPMLL, as well as one commonly used single-label learning algorithm, the k-nearest neighbour (kNN) algorithm. Furthermore, the influence of symptom selection on the diagnostic model is investigated. Symptoms are removed in order of increasing frequency, and diagnostic models are constructed on the remaining symptom subsets.

Results: A total of 555 cases are collected for modelling the inquiry diagnosis of CHD. The patients are diagnosed clinically by fusing inspection, pulse feeling, palpation and the standardized inquiry information. Models of six syndromes are constructed by ML-kNN, RankSVM, BPMLL and kNN, whose mean diagnostic accuracies reach 77%, 71%, 75% and 74% respectively. After removing low-frequency symptoms, the mean accuracies of ML-kNN, RankSVM, BPMLL and kNN reach 78%, 73%, 75% and 76% when 52 symptoms remain.

Conclusions: The novel MLL techniques facilitate building standardized inquiry models in CHD diagnosis and offer a practical approach to labelling multiple syndromes simultaneously.
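To make the multi-label setting concrete, the sketch below predicts several syndrome labels at once from binary symptom vectors by majority vote among the k nearest training cases. This is a deliberately simplified neighbour-voting scheme, not ML-kNN itself (which additionally applies a Bayesian estimate to the neighbour label counts), and the function name, the four symptoms and the two syndromes are hypothetical toy data rather than anything from the study.

```python
from typing import List, Sequence


def knn_multilabel_predict(train_x: Sequence[Sequence[int]],
                           train_y: Sequence[Sequence[int]],
                           query: Sequence[int],
                           k: int = 3) -> List[int]:
    """Predict a 0/1 vector of labels for one query case.

    Each label is switched on when more than half of the k nearest
    training cases (squared Euclidean distance) carry that label.
    """
    def dist(i: int) -> int:
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))

    # sorted() is stable, so ties are broken by training-set order.
    neighbours = sorted(range(len(train_x)), key=dist)[:k]
    n_labels = len(train_y[0])
    votes = [sum(train_y[i][j] for i in neighbours) for j in range(n_labels)]
    return [1 if 2 * v > k else 0 for v in votes]


# Hypothetical toy data: rows are patients, columns are 4 yes/no symptoms.
train_x = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0],
           [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
# Two hypothetical syndromes; a patient may have both at once.
train_y = [[1, 1], [1, 0], [1, 1],
           [0, 1], [0, 1], [0, 1]]

both = knn_multilabel_predict(train_x, train_y, [1, 1, 1, 0], k=3)
second_only = knn_multilabel_predict(train_x, train_y, [0, 0, 1, 1], k=3)
print(both, second_only)
```

A single-label classifier would have to choose one syndrome per patient or be trained once per syndrome; here one call returns the full label vector, so a patient can be assigned both syndromes together.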