We describe the cross-linking of poly(4-styrene-sulfonic acid) (PSS) by exposure to ultraviolet (UV) light (λ = 255 nm) under vacuum. Fourier transform infrared (FT-IR) spectroscopy and X-ray photoelectron spectroscopy (XPS) showed that the photo-cross-linking of PSS resulted from coupling between radicals generated in the polymer chains by UV excitation. The photo-cross-linkable character of PSS was exploited to fabricate solution-processable, photopatternable, and conductive composite thin films of PSS-wrapped multiwalled carbon nanotubes (MWNTs), prepared by wrapping MWNTs with PSS in water. During photo-cross-linking, the work function of the PSS-wrapped MWNTs decreased from 4.83 to 4.53 eV owing to cleavage of a significant number of sulfonic acid groups. Despite this decreased work function, the photopatterned PSS-wrapped MWNTs served as good source/drain electrodes for organic field-effect transistors (OFETs): TIPS-PEN transistors fabricated with PSS-wrapped MWNT electrodes exhibited a mobility of 0.134 ± 0.056 cm²/(V s), higher than that of gold-based transistors (0.011 ± 0.004 cm²/(V s)).
Pre-processing and post-processing are significant aspects of natural language processing (NLP) application software. In neural machine translation (NMT), pre-processing includes subword tokenization to alleviate the problem of unknown words, parallel corpus filtering to retain only data suitable for training, and data augmentation to ensure that the corpus contains sufficient content. Post-processing includes automatic post-editing and the application of various decoding strategies during translation. Most recent NLP research is based on the Pretrain-Finetuning Approach (PFA). However, when small and medium-sized organizations with insufficient hardware attempt to provide NLP services, throughput and memory problems often occur. These difficulties increase when applying PFA to low-resource languages, as PFA requires large amounts of data, and data for low-resource languages are often insufficient. Building on the premise that NMT performance can be enhanced through various pre-processing and post-processing strategies without changing the model, we applied various decoding strategies to Korean–English NMT, a low-resource language pair. Through comparative experiments, we showed that translation performance can be enhanced without changes to the model. We experimentally examined how performance changed in response to beam-size changes and n-gram blocking, and whether performance improved when a length penalty was applied. The results showed that various decoding strategies enhance performance and compare favorably with previous Korean–English NMT approaches. Therefore, the proposed methodology can improve the performance of NMT models without the use of PFA, presenting a new perspective for improving machine translation performance.
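The decoding strategies mentioned above can be illustrated with a minimal sketch. The function names, the GNMT-style length-penalty formula, and the example scores below are illustrative assumptions, not the paper's actual implementation:

```python
def length_penalty(length, alpha=0.6):
    # GNMT-style length penalty: lp(Y) = ((5 + |Y|) / 6) ** alpha.
    # Dividing a hypothesis's log-probability by lp(Y) keeps beam
    # search from systematically preferring short translations.
    return ((5 + length) / 6) ** alpha

def blocks_repeat(tokens, next_token, n=3):
    # n-gram blocking: forbid next_token if appending it would
    # recreate an n-gram already present in the hypothesis.
    if len(tokens) < n - 1:
        return False
    candidate = tuple(tokens[-(n - 1):] + [next_token])
    existing = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    return candidate in existing

# Rescoring two hypothetical beam candidates: the raw log-probability
# favors the shorter one (-2.0 > -2.6), but after length normalization
# the longer hypothesis can overtake it.
short_score = -2.0 / length_penalty(4)
long_score = -2.6 / length_penalty(12)
```

In a full decoder these two pieces plug into beam expansion (pruning repeated n-grams) and final candidate ranking (length-normalized scoring), respectively.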
The success of the Transformer architecture has led to increased interest in machine translation (MT). The translation quality of neural network-based MT now surpasses that of translations derived using statistical methods. This growth in MT research has entailed the development of accurate automatic evaluation metrics that allow us to track the performance of MT systems. However, automatically evaluating and comparing MT systems is a challenging task. Several studies have shown that traditional metrics (e.g., BLEU, TER) perform poorly in capturing the semantic similarity between MT outputs and human reference translations. To improve on them, various evaluation metrics based on the Transformer architecture have been proposed. However, a systematic and comprehensive literature review of these metrics is still missing. It is therefore necessary to survey the existing automatic evaluation metrics for MT, enabling both established and new researchers to quickly grasp the trends in MT evaluation over the past few years. In this survey, we present these trends and, to better organize developments in the field, provide a taxonomy of the automatic evaluation metrics. We then explain the key contributions and shortcomings of the metrics. In addition, we select representative metrics from the taxonomy and conduct experiments to analyze related problems. Finally, we discuss the limitations of current automatic-metric studies in light of these experiments, and offer suggestions for further research to improve automatic evaluation metrics.
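The weakness of surface-overlap metrics noted above can be demonstrated with a small sketch of BLEU's clipped n-gram precision (single reference, unigrams only; the example sentences are illustrative):

```python
from collections import Counter

def modified_ngram_precision(candidate, reference, n=1):
    # Clipped n-gram precision as used in BLEU (single reference):
    # each candidate n-gram is credited at most as often as it
    # appears in the reference.
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = [tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)]
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    return clipped / max(len(cand), 1)

ref = "the cat sat on the mat".split()
# A semantically equivalent paraphrase shares almost no surface tokens,
# so its n-gram precision is very low despite preserving the meaning.
paraphrase = "a feline rested upon the rug".split()
```

Here `modified_ngram_precision(paraphrase, ref)` is only 1/6 (just "the" overlaps), which is exactly the failure mode that motivates the Transformer-based embedding metrics this survey covers.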
Translations of ancient languages can serve as source material for various digital media and can be helpful in fields such as the study of natural phenomena, medicine, and science. Owing to these needs, there has been a global movement to translate ancient languages, but expert knowledge is required for this purpose. It is difficult to train language experts and, more importantly, manual translation is a slow process. Consequently, the recovery of ancient texts using machine translation has recently been investigated, but there is currently no literature on the machine translation of ancient Korean. This paper proposes the first ancient-Korean neural machine translation model, based on a Transformer. The model can improve a translator's efficiency by quickly providing draft translations for the many untranslated ancient documents. Furthermore, a new subword tokenization method, Share Vocabulary and Entity Restriction Byte Pair Encoding, is proposed based on the characteristics of ancient Korean sentences. The proposed method improves on conventional subword tokenization methods such as byte pair encoding by 5.25 BLEU points. In addition, various decoding strategies such as n-gram blocking and ensemble models further improve the performance by 2.89 BLEU points. The model has been made publicly available as a software application.
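For context, standard byte pair encoding (the baseline the proposed method improves on) learns merges greedily from pair frequencies. The following is a minimal sketch of that conventional algorithm, not of the paper's Share Vocabulary and Entity Restriction variant; the toy word list is illustrative:

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    # Conventional BPE: start from characters, then repeatedly merge
    # the most frequent adjacent symbol pair across the corpus.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

# On ["low", "low", "lot"], the pair ('l', 'o') occurs 3 times and is
# merged first; the resulting 'lo' then merges with 'w'.
merges = learn_bpe_merges(["low", "low", "lot"], 2)
```

Shared-vocabulary variants, as in this paper, train such merges on source and target text jointly so that both sides use one subword inventory.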
With the growing popularity of smart speakers such as Amazon Alexa, speech is becoming one of the most important modes of human–computer interaction. Automatic speech recognition (ASR) is arguably the most critical component of such systems, as errors in speech recognition propagate to downstream components and drastically degrade the user experience. A simple and effective way to improve speech recognition accuracy is to apply an automatic post-processor to the recognition result. However, training a post-processor requires parallel corpora created by human annotators, which are expensive and not scalable. To alleviate this problem, we propose Back TranScription (BTS), a denoising-based method that can create such corpora without human labor. Starting from a raw text corpus, BTS corrupts the text by passing it through Text-to-Speech (TTS) and Speech-to-Text (STT) systems. A post-processing model can then be trained to reconstruct the original text from the corrupted input. Quantitative and qualitative evaluations show that a post-processor trained using our approach is highly effective at fixing non-trivial speech recognition errors such as the mishandling of foreign words. We release the generated parallel corpus and post-processing platform to make our results publicly available.
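The BTS corpus-construction loop can be sketched as follows. The stub `demo_tts`/`demo_stt` functions below are placeholders standing in for real TTS and STT systems; they are assumptions for illustration, and the substitution they perform mimics the foreign-word mishandling mentioned above:

```python
def build_bts_corpus(raw_sentences, tts, stt):
    # Back TranScription: corrupt clean text via a TTS -> STT round trip.
    # The (corrupted, original) pairs then train a seq2seq post-processor
    # to denoise real ASR output, with no human annotation required.
    pairs = []
    for sentence in raw_sentences:
        audio = tts(sentence)   # synthesize speech from the clean text
        noisy = stt(audio)      # re-transcribe; introduces ASR-style errors
        pairs.append((noisy, sentence))  # (source, target) training pair
    return pairs

# Placeholder systems: the "STT" stage garbles a foreign word the way
# a real recognizer might spell it out phonetically.
demo_tts = lambda text: text  # stands in for an audio waveform
demo_stt = lambda audio: audio.replace("BTS", "bee tee ess")

pairs = build_bts_corpus(["the BTS method"], demo_tts, demo_stt)
```

With real TTS and STT engines plugged in, the same loop scales to millions of sentences, since it consumes only monolingual text.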