Depression is a major societal issue. However, it can be hard to self-diagnose, and people suffering from depression often hesitate to consult professionals. We discuss the design and initial testing of our prototype application, which performs depression detection using multi-modal information such as questionnaires, speech, and facial landmarks. The application has an animated avatar ask questions concerning the users’ well-being. To perform screening, we opt for a two-stage method that first predicts individual HAM-D ratings for better explainability, which may help facilitate referral to medical professionals if required. Initial results show that our system achieves 0.85 macro-F1 on the depression detection task.