“…With the successful development of cross-modal visual and linguistic representations [30,42,43,54], especially the omnipotent CLIP [35], many efforts [7,18,23,34,46,49,51] have recently started investigating text-driven image manipulation. However, there are no existing methods specifically for image restoration.…”