Shaojun Wang scite author profile

Augmenting Naive Bayes Classifiers with Statistical Language Models

* Most research was conducted while the authors were at the School of Computer Science at University of Waterloo, Canada.

We augment naive Bayes models with statistical n-gram language models to address shortcomings of the standard naive Bayes text classifier. The result is a generalized naive Bayes classifier which allows for a local Markov dependence among observations; a model we refer to as the Chain Augmented Naive Bayes (CAN) classifier. CAN models have two advantages over standard naive Bayes classifiers. First, they relax some of the independence assumptions of naive Bayes (allowing a local Markov chain dependence in the observed variables) while still permitting efficient inference and learning. Second, they permit straightforward application of sophisticated smoothing techniques from statistical language modeling, which allows one to obtain better parameter estimates than the standard Laplace smoothing used in naive Bayes classification. In this paper, we introduce CAN models and apply them to various text classification problems. To demonstrate the language-independent and task-independent nature of these classifiers, we present experimental results on several text classification problems (authorship attribution, text genre classification, and topic detection) in several languages (Greek, English, Japanese and Chinese). We then systematically study the key factors in the CAN model that can influence the classification performance, and analyze the strengths and weaknesses of the model.
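As a rough illustration of the idea in the abstract above, the sketch below scores a text under one character-bigram Markov chain per class and picks the highest-scoring class. It is a minimal sketch under stated assumptions, not the authors' implementation: the class name `ChainAugmentedNB`, the `^` start marker, the bigram order, and the add-alpha smoothing are all choices made here for brevity, and the abstract itself argues for more sophisticated language-model smoothing than this.

```python
import math
from collections import Counter, defaultdict


class ChainAugmentedNB:
    """Hypothetical sketch: one smoothed character-bigram chain per class."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                    # add-alpha constant (assumption)
        self.bigrams = defaultdict(Counter)   # class -> counts of (prev, cur)
        self.contexts = defaultdict(Counter)  # class -> counts of prev
        self.docs = Counter()                 # class -> number of training docs
        self.vocab = set()                    # all characters seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.docs[label] += 1
            padded = "^" + text  # "^" marks the start of a document
            self.vocab.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[label][(prev, cur)] += 1
                self.contexts[label][prev] += 1
        return self

    def log_score(self, text, label):
        # log P(class) + sum_i log P(char_i | char_{i-1}, class)
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        score = math.log(self.docs[label] / sum(self.docs.values()))
        padded = "^" + text
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigrams[label][(prev, cur)] + self.alpha
            den = self.contexts[label][prev] + self.alpha * v
            score += math.log(num / den)
        return score

    def predict(self, text):
        return max(self.docs, key=lambda c: self.log_score(text, c))


# Toy usage with made-up data
clf = ChainAugmentedNB().fit(
    ["the cat sat on the mat", "this is the thing", "zzz qqq zzz", "qzq zqz qqz"],
    ["en", "en", "xx", "xx"],
)
print(clf.predict("the hat"))
```

Replacing the add-alpha estimate in `log_score` with an interpolated or back-off estimate would be the natural place to plug in the language-modeling smoothing techniques the abstract refers to.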

Ultra‐Deep Desulfurization of Diesel: Oxidation with a Recoverable Catalyst Assembled in Emulsion

Li, Jiang, Gao et al. 2004. Chemistry A European J

A [(C₁₈H₃₇)₂N⁺(CH₃)₂]₃[PW₁₂O₄₀] catalyst, assembled in an emulsion in diesel, can selectively oxidize the sulfur-containing molecules present in diesel into their corresponding sulfones by using H₂O₂ as the oxidant under mild conditions. The sulfones can be readily separated from the diesel using an extractant, and the sulfur level of the desulfurized diesel can be lowered from about 500 ppm to 0.1 ppm without changing the properties of the diesel. The catalyst demonstrates high performance (≥96 % efficiency of H₂O₂ and approximately 100 % selectivity to sulfones) and is easily recycled. Metastable emulsion droplets (water in oil) act like a homogeneous catalyst and are formed when the catalyst (as the surfactant) and H₂O₂ (30 %) are mixed in the diesel. However, the catalyst can be separated from the diesel after demulsification.

Non‐Radiative Energy Transfer Mediated by Hybrid Light‐Matter States

et al. 2016

We present direct evidence of enhanced non-radiative energy transfer between two J-aggregated cyanine dyes strongly coupled to the vacuum field of a cavity. Excitation spectroscopy and femtosecond pump-probe measurements show that the energy transfer is highly efficient when both the donor and acceptor form light-matter hybrid states with the vacuum field. The rate of energy transfer is increased by a factor of seven under those conditions as compared to the normal situation outside the cavity, with a corresponding effect on the energy transfer efficiency. The delocalized hybrid states connect the donor and acceptor molecules and clearly play the role of a bridge to enhance the rate of energy transfer. This finding has fundamental implications for coherent energy transport and light-energy harvesting.

When an exciton transition and a resonant optical mode exchange energy faster than any competing dissipation process, it can lead to light-matter strong coupling and the generation of two new hybrid (polaritonic) eigenstates, P+ and P−, separated by the so-called Rabi splitting (Figure 1a). This brings about interesting properties possessed by neither the original exciton nor the optical mode, [1-19] and leads to new possibilities, such as modified chemical reactivity [5,6] and enhanced conductivity of organic semiconductors. [9] In the latter case, the enhancement stems from the delocalized nature of the hybrid states over the spatial extent of the optical mode [10,11] which is expected to affect energy transport according to recent theoretical studies. [12,13] In this context, it is interesting to consider how such hybrid states would affect energy transfer between donor and acceptor molecules.

Energy transfer is a non-radiative process which has been extensively studied over the last century and typically involves either Coulombic interactions (Förster) or electronic exchange (Dexter). [20] A key confirmation of energy transfer is of course a reduction in the lifetime of the donor concomitant with the rise of the acceptor excited state population. Other factors that affect energy transfer include molecular aggregation, the presence of bridges between the donor and acceptor, and the density of optical states. [20-25] Strong coupling could provide an alternate effective path for energy transfer in analogy with chemically bridged donors and acceptors where the linker mediates the interactions by an effective overlap between the wave functions of both the donor and the acceptor. In the strong coupling case, it is the polaritonic states, which are by construction either donor- or acceptor-like, that mediate the interactions in the system due to their delocalized nature. Recently, energy transfer under strong coupling based on steady-state fluorescence excitation spectroscopy of the acceptor was studied. [17] However no

Figure 1. a) Schematic representation of strong coupling successively with the donor resonant with a cavity mode ħω_c, and then the a...

Reduced miR-126 expression facilitates angiogenesis of gastric cancer through its regulation on VEGF-A

et al. 2014

miR-126 is an endothelial-specific microRNA essential for governing vascular integrity and angiogenesis. Its role in tumor angiogenesis of gastric cancer (GC) is unclear. This study aimed to determine the role of miR-126 in GC angiogenesis. Down-regulation of miR-126 was found to correlate inversely with increased microvessel density (MVD) and vascular endothelial growth factor A (VEGF-A) expression in gastric cancer tissues. Bioinformatics analysis and a luciferase reporter assay revealed that miR-126 directly targeted the 3′-untranslated region (3′-UTR) of VEGF-A mRNA. In addition, the restoration of miR-126 expression by lentivirus-miR-126 (Lenti-miR-126) transfection markedly reduced the expression of VEGF-A and the activation of its downstream genes, Akt, mTOR and Erk1/2, in the gastric cancer cell lines SGC-7901, MKN-28 and MKN-45. In contrast, the down-regulation of miR-126 expression by lentivirus-anti-miR-126 (Lenti-anti-miR-126) transfection markedly up-regulated the expression of VEGF-A and its downstream signaling pathways. In vivo xenograft mouse model experiments demonstrated the down-regulation of VEGF-A and MVD as well as inhibition of tumor growth by up-regulation of miR-126. Overall, the results from our study suggested that miR-126 could suppress tumor growth and tumor angiogenesis of GC through VEGF-A signaling, and that it is a novel potential therapeutic target for GC.

Supply chain models for perishable products under inflation and permissible delay in payment

Sarker, Jamal, Wang 2000. Computers & Operations Research

Semi-supervised conditional random fields for improved sequence segmentation and labeling

Jiao, Wang, Lee et al. 2006

We present a new semi-supervised training procedure for conditional random fields (CRFs) that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Our approach is based on extending the minimum entropy regularization framework to the structured prediction case, yielding a training objective that combines unlabeled conditional entropy with labeled conditional likelihood. Although the training objective is no longer concave, it can still be used to improve an initial model (e.g. obtained from supervised training) by iterative ascent. We apply our new training algorithm to the problem of identifying gene and protein mentions in biological texts, and show that incorporating unlabeled data improves the performance of the supervised CRF in this case.
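To make the shape of such a training objective concrete, the sketch below evaluates labeled conditional log-likelihood plus a weighted negative conditional entropy on unlabeled inputs. It is a simplified, non-structured illustration under stated assumptions: each example has a plain list of class scores rather than a CRF over label sequences, there is no parameter prior term, and the names `softmax`, `semi_supervised_objective`, and the weight `gamma` are hypothetical choices made here, not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def semi_supervised_objective(labeled, unlabeled, gamma):
    """Objective to maximize.

    labeled:   list of (scores, gold_index) pairs
    unlabeled: list of scores (no gold label)
    gamma:     weight on the entropy term (assumed hyperparameter)
    """
    # Conditional log-likelihood of the gold labels.
    log_lik = sum(math.log(softmax(s)[y]) for s, y in labeled)
    # Negative conditional entropy on unlabeled inputs: sum_y p(y|x) log p(y|x).
    # Maximizing it pushes the model toward confident predictions.
    neg_entropy = sum(
        p * math.log(p) for s in unlabeled for p in softmax(s) if p > 0.0
    )
    return log_lik + gamma * neg_entropy


# Toy usage: a confident unlabeled prediction scores higher than a uniform one.
labeled = [([2.0, 0.0], 0)]
print(semi_supervised_objective(labeled, [[5.0, 0.0]], gamma=1.0))
print(semi_supervised_objective(labeled, [[0.0, 0.0]], gamma=1.0))
```

Because the entropy term is not concave in the model parameters, an optimizer applied to this kind of objective can only be expected to improve on its starting point, which is consistent with the abstract's suggestion of initializing from a supervised model.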

Electron Beam Computed Tomographic Coronary Calcium as a Predictor of Coronary Events

et al. 1997

Language independent authorship attribution using character level language models

et al. 2003

What is the information captured by neural network models of language? We address this question in the case of character-level recurrent neural language models. These models do not have explicit word representations; do they acquire implicit ones? We assess the lexical capacity of a network using the lexical decision task common in psycholinguistics: the system is required to decide whether or not a string of characters forms a word. We explore how accuracy on this task is affected by the architecture of the network, focusing on cell type (LSTM vs. SRN), depth and width. We also compare these architectural properties to a simple count of the parameters of the network. The overall number of parameters in the network turns out to be the most important predictor of accuracy; in particular, there is little evidence that deeper networks are beneficial for this task.
