2022
DOI: 10.48550/arxiv.2207.11280
Preprint

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

Abstract: We present PANGU-CODER, a pretrained decoder-only language model adopting the PANGU-α architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PANGU-CODER using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second stage uses a combination of Causal Language Modelling and Masked Language Modelling (MLM) training objectives that focu…
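To make the two-stage objective concrete, the sketch below shows in plain PyTorch how a causal language modelling loss (stage one) could be combined with a masked language modelling loss (stage two) on tokenized description-and-code pairs. It is only an illustration under assumptions: the toy vocabulary size, mask token id, 15% masking rate, and 0.5 loss weighting are placeholders, not PanGu-Coder's published configuration.

import torch
import torch.nn.functional as F

VOCAB_SIZE, MASK_ID, IGNORE = 1000, 3, -100  # toy values for illustration only

def clm_loss(model, tokens):
    # Stage 1 objective (Causal LM): predict each token from the tokens before it.
    logits = model(tokens)                                # (batch, seq, VOCAB_SIZE)
    return F.cross_entropy(logits[:, :-1].reshape(-1, VOCAB_SIZE),
                           tokens[:, 1:].reshape(-1))

def mlm_loss(model, tokens, mask_prob=0.15):
    # Masked LM: corrupt random positions, score the model only where it masked.
    is_masked = torch.rand(tokens.shape) < mask_prob
    corrupted = torch.where(is_masked, torch.full_like(tokens, MASK_ID), tokens)
    targets = torch.where(is_masked, tokens, torch.full_like(tokens, IGNORE))
    logits = model(corrupted)
    return F.cross_entropy(logits.reshape(-1, VOCAB_SIZE),
                           targets.reshape(-1), ignore_index=IGNORE)

def stage2_loss(model, pair_tokens, mlm_weight=0.5):
    # Stage 2 (assumed weighting): mix CLM and MLM losses on a (description, code) pair.
    return clm_loss(model, pair_tokens) + mlm_weight * mlm_loss(model, pair_tokens)

# Usage with a stand-in model: any callable mapping (batch, seq) token ids to
# (batch, seq, VOCAB_SIZE) logits would fit here in place of a real decoder.
toy_model = torch.nn.Sequential(torch.nn.Embedding(VOCAB_SIZE, 64),
                                torch.nn.Linear(64, VOCAB_SIZE))
batch = torch.randint(4, VOCAB_SIZE, (2, 16))
print(stage2_loss(toy_model, batch).item())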

Cited by 7 publications (7 citation statements) | References 24 publications
“…It introduces an input mechanism that employs distinct embeddings for distinct domains, which is coupled with a two-level routing design in the Random Routed Experts (RRE) framework. The pretraining corpus of Pangu-Σ, totalling 329 billion tokens, primarily encompasses diverse data formats of bilingual Chinese-English, content from [24,47,48] and code from [49,50].…”
Section: Pangu-Σ (2023)
mentioning, confidence: 99%
“…Chinese companies adopt an ecosystem-diverse and systematic approach, typically rolling out a series of models to create a holistic technological ecosystem. Examples include Baidu's ERNIE series [164][165][166] of large models and its derivative ERNIEBot, Huawei's PanGu series of models [10,11,49], Alibaba's Tongyi series, and Tencent's HunYuan series. Additionally, Chinese universities actively participate in large model development and research, partnering with tech companies or independently creating multiple large models, with Tsinghua University's CPM [36,37,167] series and GLM [168] being notable instances.…”
Section: Comparative Analysis From The View Of Globalization
mentioning, confidence: 99%
“…Recent research has delved into leveraging pretrained large language models (LLMs) from the natural language processing (NLP) field to automate program synthesis tasks, using vast-scale code corpus data mined from open-source repositories. Notably, there are several prominent examples of such pretrained models, including the encoder-only CodeBERT (Feng et al, 2020), the decoder-only CodeGPT (Lu et al, 2021), CodeGen (Nijkamp et al, 2022), PaLM-Coder (Chowdhery et al, 2022), PanGu-Coder (Christopoulou et al, 2022), CodeGeex (Zheng et al, 2023), and SantaCoder (Allal et al, 2023), as well as encoder-decoder transformer architectures like PLBART (Ahmad et al, 2021) and CodeT5 (Wang et al, 2021). These pretrained probabilistic language (PL) models are already capable of generating code that appears visually impressive and well-structured.…”
Section: Pretrained LLMs For Program Synthesis
mentioning, confidence: 99%
“…AlphaCode, for instance, aspires to address competitive-level programming challenges, while InCoder (Fried et al, 2022) enables code insertion at arbitrary junctures utilizing bidirectional contexts. Other acclaimed models include CodeT5 (Wang et al, 2021), CodeGen (Nijkamp et al, 2022), PaLM-Coder (Chowdhery et al, 2022), PanGu-Coder (Christopoulou et al, 2022), CodeGeex (Zheng et al, 2023), and SantaCoder (Allal et al, 2023). As the size of these LLMs increases, they demonstrate emergent competencies, including human-like programming prowess and debugging aptitude (Saunders et al, 2022).…”
Section: Introduction
mentioning, confidence: 99%
“…F. Xu et al, 2022]. In particular, large language models (LLMs) have been successful in a variety of code generation tasks [Athiwaratkun et al, 2023; Austin et al, 2021; Cassano, Gouwar, et al, 2023; Christopoulou et al, 2022; Izadi et al, 2022; Nijkamp et al, 2023; F. F. Xu et al, 2022].…”
Section: Introduction
mentioning, confidence: 99%