Web-based innovation indicators may provide new insights into firm-level innovation activities. However, little is yet known about the accuracy and relevance of web-based information for measuring innovation. In this study, we use data on 4,487 firms from the Mannheim Innovation Panel (MIP) 2019, the German contribution to the European Community Innovation Survey (CIS), to analyze which website characteristics perform well as predictors of innovation activity at the firm level. Website characteristics are measured by several data mining methods and are used as features in different Random Forest classification models that are compared against each other. Our results show that the most relevant website characteristics are textual content, the use of the English language, the number of subpages and the number of characters on a website. In our main analysis, models using all website characteristics jointly yield AUC values of up to 0.75 and increase accuracy scores by up to 18 percentage points compared to a baseline prediction based on the sample mean. Moreover, predictions with website characteristics differ significantly from baseline predictions according to a McNemar test. Results also indicate a better performance for the prediction of product innovators and firms with innovation expenditures than for the prediction of process innovators.
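The evaluation logic described above (a baseline derived from the sample mean, accuracy and AUC for the model, and a McNemar test on the two sets of predictions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the ten labels and scores are made-up toy values standing in for Random Forest output, the function names are hypothetical, the 0.5 threshold is arbitrary, and the exact binomial form of the McNemar test is only one common variant, not necessarily the one used in the study.

```python
from math import comb

def majority_baseline(y_true):
    """Predict, for every firm, the class implied by the sample mean."""
    share = sum(y_true) / len(y_true)
    label = 1 if share >= 0.5 else 0
    return [label] * len(y_true)

def accuracy(y_true, y_pred):
    """Share of firms whose predicted label matches the observed label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """AUC as the share of (innovator, non-innovator) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test based on the discordant pairs only."""
    # b: A correct and B wrong; c: A wrong and B correct
    b = sum(pa == t and pb != t for t, pa, pb in zip(y_true, pred_a, pred_b))
    c = sum(pa != t and pb == t for t, pa, pb in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        return b, c, 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Toy data (hypothetical): 1 = innovator, 0 = non-innovator
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
# Hypothetical predicted probabilities standing in for Random Forest output
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.35, 0.6, 0.7, 0.05]

model_pred = [1 if s >= 0.5 else 0 for s in scores]
baseline_pred = majority_baseline(y_true)

acc_model = accuracy(y_true, model_pred)
acc_baseline = accuracy(y_true, baseline_pred)
auc_model = auc(y_true, scores)
b, c, p_value = mcnemar_exact(y_true, model_pred, baseline_pred)

print(f"accuracy: model={acc_model:.2f}, baseline={acc_baseline:.2f}")
print(f"AUC: {auc_model:.3f}")
print(f"McNemar: b={b}, c={c}, p={p_value:.3f}")
```

With only ten toy observations the test has almost no power, so the sketch shows the mechanics rather than a meaningful result. In practice a library implementation would likely be preferable to these hand-written helpers.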
Knowledge-based capital is a key factor for productivity growth. Over the past 15 years, it has been increasingly recognised that knowledge-based capital comprises much more than technological knowledge and that these other components are essential for understanding the productivity developments and competitiveness of both firms and economies. We develop selected indicators for knowledge-based capital, often denoted as intangible capital, on the basis of publicly available data from online platforms. These indicators, based on data from Facebook and the employer-branding and review platform Kununu, are compared with firm-level survey data from the Mannheim Innovation Panel (MIP) using OLS regressions. All regressions show a positive and significant relationship between survey-based firm-level expenditures on marketing and on-the-job training and the corresponding information from the online platforms. We therefore explore the possibility of predicting brand equity and firm-specific human capital with machine learning methods.
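The kind of comparison described above can be sketched as a univariate OLS fit of a survey-based measure on a platform-based indicator. This is a simplified illustration under stated assumptions: the numbers are made-up toy values, the variable names are hypothetical, and the sketch includes neither control variables nor standard errors, so it says nothing about significance or about the specification actually used.

```python
def ols_fit(x, y):
    """Closed-form OLS for y = intercept + slope * x (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy data (hypothetical): a platform-based indicator and a survey-based
# expenditure measure for five firms, both on arbitrary scales
platform_indicator = [1.0, 2.0, 3.0, 4.0, 5.0]
survey_expenditure = [2.1, 3.9, 6.2, 7.8, 10.0]

intercept, slope = ols_fit(platform_indicator, survey_expenditure)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

A positive slope in such a fit corresponds to the positive relationship reported above. Assessing its significance would additionally require standard errors, which this sketch omits.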