Public Speaking Anxiety (PSA), or discomfort while speaking in public, is a widespread cognitive disorder. Exposure therapy offers the opportunity to treat patients suffering from PSA by exposing them to the phobic stimulus. Planning and organizing an in-vivo exposure takes considerable effort, both in recruiting people for an audience and in orchestrating their behavior. Virtual Reality (VR) makes it possible to generate audiences that an orchestrator can control according to the patient's individual needs. This paper explores a system that enables therapists to interact richly, both verbally and non-verbally, with immersed presenters. For evaluation, we conducted a study with 24 healthy participants in two groups (12 participants each). Our results indicate that direct verbal interaction between an orchestrator outside the VR and an immersed presenter enhances the presenter's experience and increases the efficiency of the therapy process. The non-verbal dimension is realized by letting an orchestrator take over an avatar using a motion-tracking camera, which then controls the avatar's movements. Our results indicate that the transitions between animations and movements do not impact the experience negatively.