The population of adults with Alzheimer's disease (AD) varies in needs and outcomes. The heterogeneity of current AD diagnostic subgroups impedes the use of data analytics in clinical trial design and the translation of findings into improved care. The purpose of this project was to define more clinically homogeneous groups of AD patients and to link clinical characteristics with biological markers. We used an innovative big data analysis strategy, the 3C strategy, that incorporates medical knowledge into the data analysis process. A large set of preprocessed AD Neuroimaging Initiative (ADNI) data was analyzed with 3C. The analysis yielded 6 new disease subtypes, which differ from the assigned diagnosis types and present different patterns of clinical measures and potential biomarkers. Two of the subtypes, "Anosognosia dementia" and "Insightful dementia", differentiate between severely affected participants based on clinical characteristics and biomarkers. The "Uncompensated mild cognitive impairment (MCI)" subtype demonstrates clinical, demographic and imaging differences from the "Affective MCI" subtype. Differences were also observed between the "Worried Well" and "Healthy" clusters. The use of data-driven analysis yielded sub-phenotypic clinical clusters that go beyond current diagnoses and are associated with biomarkers. Such homogeneous subgroups can potentially form the basis for enhancement of brain medicine research.
Alzheimer's disease (AD) is a degenerative brain disease and the most common cause of dementia 1. According to the 2018 Alzheimer's Association report 2, an estimated 5.7 million Americans of all ages are living with AD in 2018. The percentage of people with AD increases with age: 3% of people aged 65-74, 17% of people aged 75-84, and 32% of people aged 85 and older have AD 3. Symptoms vary among people with AD, and the differences between typical age-related cognitive changes and early signs of AD can be subtle.
A definitive diagnosis of AD requires histopathological examination and is characterized by the accumulation of β-amyloid (Aβ) plaques and neurofibrillary tangles composed of tau amyloid fibrils, associated with brain cell damage and neurodegeneration 4. In clinical practice, the diagnosis of AD is based on clinical criteria, while laboratory and imaging examinations are used to exclude other diagnoses. Subclassification of AD has been attempted previously, mostly based on a small set of parameters or on a single modality 5,6, and in some studies has relied only on previous knowledge. Current diagnostic subgroupings are informative; however, they are quite crude, as they are based on rough criteria 7,8. This may mislead supervised data mining tools that rely solely on these definitions when trying to predict disease manifestation or associate it with clinical and biological markers. Thus, in the search for new insights, it is essential to use unsupervised processes, which do not rely on the current diagnostic subgroupings. Nevertheless, despite numerous attempts to use unsupervised processes as progn...