Background: The eHealth initiative of the Conference and Labs of the Evaluation Forum (CLEF) has aimed since 2012 to gather researchers working on health text analytics and to provide them with an annual workshop, shared development challenges/tasks, benchmark datasets, and software for processing and evaluation. The overall purpose of this initiative is to support patients, their next-of-kin, clinical staff, health scientists, and healthcare policy makers in accessing, understanding, using, and authoring health information in a multilingual setting.
Objective: This original research paper reports on the outcomes of the first six editions of CLEF eHealth, from 2012 to 2017. The focus is on measuring and analysing the scholarly influence of the initiative by reviewing CLEF eHealth papers together with relevant citation metrics.
Methods: A review and bibliometric study of the CLEF eHealth proceedings, working notes, and author-declared paper extensions was conducted. Citation data for these publications were collected from Google Scholar. Citation content analysis was used for the publications and their citations.
Results: The large numbers of registrations, submissions, and citations demonstrate the substantial community interest in the tasks and their resources. In total, 718 teams registered their interest in the tasks, and 130 teams submitted to the 15 tasks. A total of 184 papers using CLEF eHealth data generated 1,299 citations, yielding a total scholarly citation influence of almost 963,000 citations for the 741 co-authors, who came from 33 countries across the world. The tasks' evaluation outcomes contribute to the knowledge of the difficulty of the research challenges the tasks address and of the applicability of particular methods in solving these challenges, with typically statistically significant improvements in processing quality.
Conclusions: These outcomes encourage continuing to develop these technologies to address patient needs.
Consequently, data and tools have been opened for future research and development, and the CLEF eHealth initiative continues to run new challenges.
Keywords: Evaluation Studies as Topic; Health Records; Information Extraction; Information Storage and Retrieval; Information Visualization; Patient Education as Topic; Speech Recognition; Systematic Reviews; Test-set Generation; Text Classification
Introduction
Health information refers to all health-related content in all data formats, document types, information systems, publication media, and languages from all organisations, states, and countries. The privacy-sensitive, official part of health information consists of data recorded in healthcare services when describing a given patient's health or healthcare (Figure 1). The accessibility of this data is defined as limited (i.e., private or confidential information) and it is recorded either on paper in health records or electronically in Electronic Health (eHealth) records. Some common synonyms or related terms include eHealth charts, data, documents, information, ...