The usability of security solutions has long been an area of keen interest for researchers. CAPTCHA is one such security solution: it presents various usability challenges for users, yet it has successfully reduced abuse of Internet resources, such as spam. Following their use on the Internet, audio-based CAPTCHAs have been proposed as a solution to curb voice spam over telephony. Voice spam is encountered on telephony in various forms, such as an automated telemarketing call asking the recipient to call a number to win millions of dollars. A large percentage of voice spam is generated by automated systems, which introduces the classical challenge of distinguishing machines from humans on telephony. We present a large-scale evaluation of audio CAPTCHA from the human perspective over telephony through a field study with 90 participants. We study two primary research questions: how much inconvenience audio CAPTCHA causes to users on telephony, and how different features of the CAPTCHA, e.g., duration and size, influence the usability of audio CAPTCHA on telephony. We found that CAPTCHA could be a viable solution for telephony given improved features, such as better voice and accent. Users' responses were relatively close to the expected correct answers, which suggests that deploying audio CAPTCHA on telephony platforms may be possible in the future. However, we did not find a strong influence of CAPTCHA size and duration on solving accuracy.