“…Another solution could be to combine a data-driven model with another approach that compensates for the model's deficiencies. For example, a generative model (e.g., Sequence-to-Sequence) can be combined with a Memory Network (Madotto et al., 2018; Zhang B. et al., 2020) or with transformers (Vaswani et al., 2017), as in the work of Roller et al. (2020), Generative Pre-trained Transformer (GPT) (Radford et al., 2018, 2019; Brown et al., 2020; Zhang Y. et al., 2020), Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2019; Song et al., 2021), and Poly-encoders (Humeau et al., 2020; Li et al., 2020). Data-driven models can also be combined with graphical models (Zhou et al., 2020; Song et al., 2019; Moon et al., 2019; Shi et al., 2020; Wu B. et al., 2020; Xu et al., 2020), rule-based or slot-filling systems (Tammewar et al., 2018; Zhang Z. et al., 2019), a knowledge base (Ganhotra and Polymenakos, 2018; Ghazvininejad et al., 2018; Luo et al., 2019; Yavuz et al., 2019; Moon et al., 2019; Wu et al., 2019; Lian et al., 2019; Zhang B. et al., 2020; Majumder et al., 2020; Tuan et al., 2021), or automatic extraction of attributes from dialogue (Tigunova et al., 2019, 2020; Wu C.-S. et al., 2020, 2021; Ma et al., 2021) to improve the personalised entity selection in responses.…”