Audio conferencing is an essential component of efficient Computer Supported Cooperative Work (CSCW); video and text are add-ons. Specifications for enabling CSCW over the Internet are incomplete if they ignore the actual conduct of participants. A blind face-to-face conference closely mimics a virtual voice-only conference. In this paper, we analyze the results of sessions of face-to-face blind conversations and draw on them to understand participants' behavior. In particular, we focus on the impact of users' behavior on the design of a scalable architecture for virtual voice-only conferencing over VoIP, and we arrive at a meaningful number of floors for such conferences. We also present the features and requirements of the proposed service.