“…To evaluate the effectiveness of TurduckenGen, we conduct experiments on two Turducken-style code datasets, Lyra and Pisces. Lyra (Liang et al. 2021) was collected from GitHub repositories and annotated by human annotators, whereas Pisces was constructed by manually translating the Python code in Lyra into corresponding Java code via crowdsourcing. TurduckenGen is compared to six state-of-the-art baselines, including Transformer (Vaswani et al. 2017), CodeBERT (Feng et al. 2020), GraphCodeBERT (Guo et al. 2021), GPT (Radford et al. 2019), CodeGPT (Lu et al. 2021a), and CodeT5 (Wang et al. 2021b), in terms of six automatic performance metrics (i.e., BLEU (Papineni et al. 2002), Weighted BLEU (Ren et al. 2020), Crystal BLEU (Eghbali and Pradel 2022), Syntax Match (Ren et al. 2020), Syntax Exact Match (Liang et al. 2021), and Code Executable (Liang et al. 2021)). The comparison results show that TurduckenGen outperforms these baselines.…”
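To make the headline metric concrete, the following is a minimal sketch of computing corpus-level BLEU over generated code with NLTK. The paper does not specify its exact tokenizer, n-gram weights, or smoothing, so the whitespace tokenization and smoothing method below are illustrative assumptions, not the authors' implementation.

```python
# Minimal BLEU sketch for generated code (assumptions: whitespace
# tokenization, default 4-gram weights, NLTK method1 smoothing).
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One list of reference tokenizations per sample, and one hypothesis
# tokenization per sample (hypothetical examples, not from the datasets).
references = [["SELECT name FROM users WHERE id = ?".split()]]
hypotheses = ["SELECT name FROM users WHERE id = ?".split()]

smooth = SmoothingFunction().method1
score = corpus_bleu(references, hypotheses, smoothing_function=smooth)
print(f"BLEU: {score:.4f}")  # 1.0 for an exact token-level match
```

The other metrics refine this idea: Weighted BLEU and Crystal BLEU reweight or filter n-grams to better suit code, while Syntax Match, Syntax Exact Match, and Code Executable assess structural and functional correctness rather than surface overlap.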