Background: The use of smartphones and wearable sensors has sparked broad adoption of data science and machine learning methods to make use of the complex, heterogeneous data they generate. Despite established processes, concepts of user engagement are reportedly underdeveloped, and the pursuit of high accuracy or significance has been shown to lead to low explicability and irreproducibility. To overcome these issues, we aim to analyze principal characteristics of everyday behavior in digital mental health.

Methods: We generated five latent features based on previous research and expert opinions from digital mental health, and informed by data. The features were analyzed with descriptive statistics and data visualization. We carried out two rounds of evaluation with data from 12,400 users of IntelliCare, a mental health platform with 12 apps. In the first round, we focused on proof of concept; in the second, we assessed reproducibility by drawing conclusions from differences in distributions. User data were drawn from both research trials and public deployment on Google Play.

Results: Our algorithms yielded a clearer rationale for the basic usage of apps with different underlying behavioral strategies. Measures of the distribution of users’ allocated attention, users’ circadian behavior, their consecutive commitment to a specific strategy, and their interaction trajectory curve are perceived as transferable to the public data set. Because the distributions in the research trial and public deployment data were similar, the underlying behavioral strategies showed consistent patterns: psychoeducation and goal setting are used as catalysts to overcome users’ primary obstacles, sleep hygiene is addressed most regularly, and regular emotional exposure is avoided. Relaxation and cognitive reframing show increased variance in commitment among public users, indicating the challenging nature of these apps. The relative course of engagement (learning curve) is similar in the research and public data.

Conclusions: The deliberately engineered, a priori features were reproducible across app users from both data sets. These features led to improved results as well as increased interpretability, providing a better understanding of how people engage with multiple mental health apps over time. Because we based the generation of features on generic interaction proxies, these methods are applicable to other cases in artificial intelligence and digital health.
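To make the idea of an engineered engagement feature and a distribution comparison concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes, purely for illustration, that allocated attention can be approximated by the normalized Shannon entropy of a user's app launches across the 12 apps, and that two cohorts can be compared with a two-sample Kolmogorov-Smirnov statistic. The function names, the entropy proxy, the choice of statistic, and the toy launch logs are all hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter


def attention_entropy(app_launches, n_apps=12):
    """Hypothetical proxy for allocated attention.

    Normalized Shannon entropy of one user's launches across apps:
    0.0 means all launches went to a single app, 1.0 means launches
    were spread evenly over all n_apps apps.
    """
    counts = Counter(app_launches)
    if len(counts) <= 1:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_apps)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Largest absolute gap between the two empirical CDFs, evaluated
    at every observed value (where the maximum must occur).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Made-up launch logs (one list of app labels per user), for illustration only.
research_users = [
    ["sleep", "sleep", "goals"],
    ["goals", "relax", "sleep", "goals"],
    ["sleep"],
]
public_users = [
    ["sleep", "goals", "goals"],
    ["relax", "relax", "sleep"],
    ["goals"],
]

research_feature = [attention_entropy(u) for u in research_users]
public_feature = [attention_entropy(u) for u in public_users]

# A small statistic suggests the feature is distributed similarly in both cohorts.
print(ks_statistic(research_feature, public_feature))
```

A statistic near 0 would indicate that the feature is distributed similarly in both cohorts, and a value near 1 would indicate that the distributions barely overlap. With real data, one would typically also report a significance test and inspect the distributions visually.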