Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021
DOI: 10.1145/3404835.3463037

DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization

Abstract: We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation and apply it to the product title and review summarization problems on E-commerce mobile display. First, we adopt a decoder-only transformer architecture, which fits fine-tuning tasks well by combining the input and output into a single sequence. Second, we demonstrate that utilizing only a small amount of pre-training data in related domains is powerful. Pre-training a language model from a general corpus such as Wikipedia or the Com…
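The abstract's first point, combining the input and the target output into one sequence for a decoder-only model, can be illustrated with a minimal sketch. This is not the authors' code: the GPT-2 checkpoint, the "TL;DR:" separator, and the build_example helper below are illustrative assumptions using the Hugging Face transformers API, with the loss masked so that only the summary tokens are predicted.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Stand-ins for a domain-pretrained decoder-only model (assumption: the paper's
# actual e-commerce checkpoint is not reproduced here).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def build_example(source: str, target: str, sep: str = " TL;DR: "):
    # Concatenate input and output into a single token sequence, as described
    # for decoder-only fine-tuning; mask the loss over the input portion.
    src_ids = tokenizer.encode(source + sep)
    tgt_ids = tokenizer.encode(target) + [tokenizer.eos_token_id]
    input_ids = torch.tensor([src_ids + tgt_ids])
    labels = torch.tensor([[-100] * len(src_ids) + tgt_ids])  # -100 is ignored by the loss
    return input_ids, labels

input_ids, labels = build_example(
    "Long product description with brand, material and size details ...",
    "Short catchy product title",
)
loss = model(input_ids=input_ids, labels=labels).loss  # standard causal LM loss
loss.backward()

At inference time the same model simply continues generating after the separator, with no task-specific decoder head; this is one reading of why the abstract says the decoder-only architecture "fits fine-tuning tasks well."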

Cited by 11 publications (4 citation statements). References 18 publications (17 reference statements).
“…Goodwin et al [58] study how to generate summaries conditioned on different topics or questions. DSGPT [218] proposes to pretrain in e-commerce scenarios and explore the product title and review summarization. Furthermore, PASS [137] aggregates different reviews of one product into a short summary.…”
Section: Text Summarization
Mentioning confidence: 99%
“…(2020), or on domain‐specific corpus Zhang et al. (2021); Zou et al. (2020) with well‐defined pretraining tasks.…”
Section: Introduction
Mentioning confidence: 99%
“…Recently, the pretraining plus fine-tuning paradigm has gained traction and been widely applied in real-world applications. Under this paradigm, models are first pretrained on large-scale corpus Devlin et al (2019a); Brown et al (2020), or on domain-specific corpus Zhang et al (2021); Zou et al (2020) with well-defined pretraining tasks. The resulting pretrained models are then fine-tuned to adapt to different downstream tasks.…”
Section: Introduction
Mentioning confidence: 99%
“…Recently, the pre-training plus fine-tuning paradigm has gained traction and been widely applied in real-world applications. Under this paradigm, models are first pre-trained on large-scale corpus (Devlin et al 2019a;Brown et al 2020), or on domain-specific corpus (Zhang et al 2021) with welldefined pre-training tasks. The resulting pre-trained models are then fine-tuned to adapt to different downstream tasks.…”
Section: Introduction
Mentioning confidence: 99%