“…One of the biggest challenges facing all researchers in the field is to effectively identify/recognize the datasets that are available to explore. We tried several methods to separate the scans by grouping their imaging parameters, including k-means clustering (Lloyd, 1982), association rule learning (Agrawal et al., 1993), and a generalized correspondence analysis method (Beaton et al., 2019), but all of these attempts were unsuccessful. There are two possible reasons for this: (a) any value range could be shared by multiple sequences, e.g., the TR value ranges for 3DT1, PD/T2, and fMRI scans in this dataset are 6.4–2,740, 2,017–16,000, and 3,800–14,000 ms, respectively; and (b) any of the DICOM headers could be missing, e.g., nearly half of the sequences collected by two Brain-CODE study programs lacked the DICOM header (0018,0024), Sequence Name, as mentioned above.…”
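
The two reasons given in the excerpt can be illustrated with a minimal Python sketch. It is not the authors' method. The TR ranges are the ones quoted above; the dictionary layout, the function names, and the plain-dict stand-in for a DICOM header are hypothetical and chosen only for illustration.

```python
# TR ranges in ms, copied from the excerpt. The dict layout is an assumption.
TR_RANGES_MS = {
    "3DT1": (6.4, 2740.0),
    "PD/T2": (2017.0, 16000.0),
    "fMRI": (3800.0, 14000.0),
}


def candidate_sequences(tr_ms, ranges=TR_RANGES_MS):
    """Return every sequence type whose TR range contains tr_ms."""
    return [name for name, (lo, hi) in ranges.items() if lo <= tr_ms <= hi]


def overlapping_pairs(ranges=TR_RANGES_MS):
    """Return (name_a, name_b, (lo, hi)) for each pair of overlapping ranges."""
    names = list(ranges)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            lo = max(ranges[a][0], ranges[b][0])
            hi = min(ranges[a][1], ranges[b][1])
            if lo <= hi:  # the two intervals share at least one value
                pairs.append((a, b, (lo, hi)))
    return pairs


def missing_fields(header, required=("SequenceName", "RepetitionTime")):
    """List required keys that are absent or empty in a header dict.

    `header` is a hypothetical plain dict standing in for parsed DICOM
    metadata; the key names are illustrative, not a library API.
    """
    return [key for key in required if header.get(key) in (None, "")]


if __name__ == "__main__":
    print(candidate_sequences(2500.0))   # a TR inside two ranges at once
    print(overlapping_pairs())
    print(missing_fields({"RepetitionTime": 2500.0}))
```

A TR of 2,500 ms falls inside both the 3DT1 and the PD/T2 ranges, so TR alone cannot separate those sequences, and a scan with no Sequence Name leaves nothing else in this sketch to break the tie.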