Modeling and simulation (M&S) has long played an important role in developing tactics and evaluating measures of effectiveness (MOEs) for underwater warfare systems. In simulation-based acquisition, M&S technology facilitates decisions about future equipment procurements, such as a mobile decoy or a torpedo. In addition, the assessment of submarine tactics developed for engagements against a torpedo can be conducted using M&S techniques. This paper presents a case study that applies Discrete Event System Specification (DEVS)-based M&S technology to develop a simulation of an underwater warfare system, specifically an anti-torpedo combat system, in order to analyze the system's MOEs. The entity models required for M&S are divided into three sub-models: a controller, a maneuver, and a sensor model. The developed simulation allows us to conduct a statistical evaluation of the overall underwater warfare system under consideration, an assessment of the anti-torpedo countermeasure's effectiveness, and an assessment of tactics development for the underwater vehicle. Moreover, it can be used to support the decision-making process for future equipment procurements. To analyze system effectiveness, we performed extensive combat experiments, varying parameters such as tactics and weapon performance. The experimental results show how these factors influence the MOEs of the underwater warfare system.
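The abstract does not include the simulation itself. As a toy illustration of discrete-event simulation and Monte Carlo MOE estimation in this setting, the sketch below processes (time, event) pairs from a priority queue for a minimal torpedo-vs-decoy engagement and estimates a survival MOE over many runs. All event names, timings, and the decoy-seduction probability are hypothetical, not taken from the paper.

```python
import heapq
import random

def run_engagement(decoy_effect=0.6, rng=None):
    """Toy discrete-event engagement: events are (time, name) pairs
    processed in time order from a priority queue. All parameters
    are illustrative placeholders."""
    rng = rng or random.Random()
    events = [(0.0, "torpedo_detected")]
    survived = False
    while events:
        t, name = heapq.heappop(events)
        if name == "torpedo_detected":
            # The submarine reacts by launching a decoy after a delay.
            heapq.heappush(events, (t + 2.0, "decoy_launched"))
        elif name == "decoy_launched":
            # The decoy seduces the torpedo with some probability.
            outcome = "torpedo_seduced" if rng.random() < decoy_effect else "impact"
            heapq.heappush(events, (t + 5.0, outcome))
        elif name == "torpedo_seduced":
            survived = True
        # "impact": no further events; the engagement ends.
    return survived

# Monte Carlo estimate of the survival MOE.
rng = random.Random(0)
trials = [run_engagement(decoy_effect=0.6, rng=rng) for _ in range(1000)]
moe = sum(trials) / len(trials)
```

Varying `decoy_effect` (or the reaction delay) and re-estimating `moe` mirrors, in miniature, the paper's parameter-sweep experiments over tactics and weapon performance.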
Recent advances in self-supervised learning combined with the Transformer architecture have enabled natural language processing (NLP) models to achieve extremely low perplexity. Such powerful models, however, demand ever-increasing size, and thus large amounts of computation and memory. In this paper, we propose an efficient inference framework for large-scale generative language models. As the key to reducing model size, we quantize weights with a non-uniform quantization method. Quantized matrix multiplications are then accelerated by our proposed kernel, nuQmm, which allows a wide trade-off between compression ratio and accuracy. nuQmm reduces not only the latency of each GPU but also that of the entire inference pipeline for large LMs, because a high compression ratio (from low-bit quantization) lowers the minimum number of GPUs required. We demonstrate that nuQmm can accelerate inference of the GPT-3 (175B) model by about 14.4× and reduce energy consumption by 93%.
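The nuQmm kernel itself is GPU code and is not reproduced in the abstract. A common form of non-uniform quantization, and a reasonable mental model here, is a learned codebook fitted by 1D k-means: every weight is replaced by its nearest of 2^b centroids, so only the small codebook and per-weight indices need to be stored. The sketch below is a minimal NumPy version; the function name and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

def nonuniform_quantize(weights, n_bits=3, n_iters=20):
    """Quantize a weight tensor with a learned (non-uniform) codebook.

    Runs 1D k-means over the flattened weights; each weight maps to
    its nearest centroid, leaving at most 2**n_bits distinct values.
    Returns (integer codes, codebook) for storage/dequantization.
    """
    flat = weights.ravel()
    k = 2 ** n_bits
    # Initialize centroids evenly across the observed value range.
    centroids = np.linspace(flat.min(), flat.max(), k)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for j in range(k):
            members = flat[idx == j]
            if members.size:
                centroids[j] = members.mean()
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return idx.reshape(weights.shape).astype(np.uint8), centroids

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
codes, codebook = nonuniform_quantize(w, n_bits=3)
w_hat = codebook[codes]  # dequantized weights
```

At 3 bits the codes are 1/10th the size of fp32 weights (plus a negligible 8-entry codebook), which is the kind of compression that lets a large LM fit on fewer GPUs.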
Introduction:
This study examined the perceived competence of Clinical Research Coordinators (CRCs) using several conceptual frameworks. Accurate self-assessment of one’s professional competence is a critical component in the career navigation process and contributes to (a) identifying and securing professional development (training), (b) leveraging professional strengths, and (c) integrating self-knowledge into a comprehensive career plan.
Method:
A survey design gathered responses from a sample of 119 CRCs in a southeastern region of the USA. Two conceptual frameworks were used to represent aspects of CRC professional competence: the eight Joint Task Force (JTF) competence domains, and perceptions of strengths and training needs from a list of 12 task categories.
Results:
The JTF domain with the lowest competence level was Development and Regulations, while the highest was Communication. Perceived competence increased incrementally with years of experience. Top strengths involved direct patient interaction and data management. Tasks in need of training included project management and reporting issues. Variations in responses were based on years of experience as a CRC.
Conclusion:
Our results demonstrate an association between the self-reported strengths and training needs of CRCs and experience. This information can contribute to the self-directed career navigation of CRCs.
Deploying the widely used Transformer architecture is challenging because of its heavy computation load and memory overhead during inference, especially when the target device has limited computational resources, such as a mobile or edge device. Quantization is an effective technique for addressing these challenges. Our analysis shows that, for a given number of quantization bits, each block of the Transformer contributes to translation quality and inference computation in a different manner. Moreover, even inside an embedding block, individual words present vastly different contributions. Accordingly, we propose a mixed precision quantization strategy that represents Transformer weights with an extremely low number of bits (e.g., under 3 bits). For example, for each word in an embedding block, we assign a different number of quantization bits based on its statistical properties. Our quantized Transformer model is 11.8× smaller than the baseline model, with a BLEU degradation of less than 0.5. We achieve an 8.3× reduction in run-time memory footprint and a 3.5× speedup (on a Galaxy N10+), so our proposed compression strategy enables efficient on-device NMT.
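The abstract does not specify which statistical property drives the per-word bit assignment. The sketch below illustrates the general idea of mixed precision inside an embedding block under one plausible assumption, word frequency: frequent words keep more bits, rare words fewer, so the average bit-width stays well under 3. The bit-widths, the top fraction, and the function name are hypothetical.

```python
import numpy as np

def assign_embedding_bits(word_freqs, high_bits=4, low_bits=1, top_frac=0.1):
    """Assign a per-word quantization bit-width from a frequency statistic:
    the most frequent `top_frac` of words keep `high_bits`, the rest get
    `low_bits`. (Frequency is an assumed proxy for a word's contribution.)"""
    order = np.argsort(word_freqs)[::-1]          # most frequent first
    n_high = max(1, int(len(word_freqs) * top_frac))
    bits = np.full(len(word_freqs), low_bits, dtype=np.int8)
    bits[order[:n_high]] = high_bits
    return bits

# Toy corpus frequencies for a 10-word vocabulary.
freqs = np.array([900, 5, 40, 300, 2, 75, 1, 620, 13, 8])
bits = assign_embedding_bits(freqs, top_frac=0.2)
```

With `top_frac=0.2`, only the two most frequent words get 4 bits and the rest get 1, for an average of 1.6 bits per embedding row.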
Simulation of a system-of-systems (SoS) model, which consists of a combat model and a network model, has been used to analyze the performance of network-centric warfare in detail. However, finding the combat model parameters that satisfy the required combat power using simulation can take a long time for two reasons: (1) the prolonged execution time per simulation run and (2) the enormous number of simulation runs. This paper proposes a simulation-based optimization method for the SoS-based simulation model to overcome these problems. The method consists of two processes: (1) transforming the SoS-based model into an integrated model using a neural network to reduce the execution time, and (2) optimizing the integrated model using a genetic algorithm with ranking and selection to decrease the number of simulation runs. The experimental results reveal that the proposed method significantly reduced the time needed to find the optimal combat parameters with an acceptable level of accuracy.
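Neither the trained neural-network surrogate nor the combat model is available from the abstract, so the sketch below substitutes a hypothetical analytic surrogate for the integrated model and runs an elitist genetic search over three illustrative combat parameters. Elitist truncation selection stands in, as a simplification, for the paper's ranking-and-selection procedure; all names and parameter values are assumptions.

```python
import random

def surrogate_combat_power(params):
    """Hypothetical stand-in for the neural-network surrogate of the
    SoS simulation (the real surrogate is trained on simulation runs).
    Peaks at the illustrative optimum (0.6, 0.3, 0.8)."""
    speed, sensor_range, ammo = params
    return -((speed - 0.6) ** 2 + (sensor_range - 0.3) ** 2 + (ammo - 0.8) ** 2)

def genetic_search(fitness, dim=3, pop_size=30, gens=60, elite=5, sigma=0.1):
    """Elitist GA over the unit cube: keep the top `elite` candidates,
    then refill the population via one-point crossover and Gaussian
    mutation of elite parents."""
    pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)   # rank by surrogate fitness
        parents = pop[:elite]                 # truncation selection
        children = []
        while len(children) < pop_size - elite:
            a, b = random.sample(parents, 2)
            cut = random.randrange(dim)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clipped back into the unit cube.
            child = [min(1.0, max(0.0, g + random.gauss(0, sigma))) for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

random.seed(0)
best = genetic_search(surrogate_combat_power)
```

Because every fitness call hits the cheap surrogate rather than the full SoS simulation, the search cost per candidate drops by orders of magnitude, which is the core speedup the paper's two-process method targets.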