We present a statistical model to estimate the accuracy of derivatized heparin and heparan sulfate (HS) glycosaminoglycan (GAG) assignments to tandem mass (MS/MS) spectra made by the first published database search application, GAG-ID. Employing a multivariate expectation-maximization algorithm, this statistical model distinguishes correct from ambiguous and incorrect database search results when computing the probability that heparin/HS GAG assignments to spectra are correct based upon database search scores. Using GAG-ID search results for spectra generated from a defined mixture of 21 synthesized tetrasaccharide sequences as well as seven spectra of longer defined oligosaccharides, we demonstrate that the computed probabilities are accurate and have high power to discriminate between correctly, ambiguously, and incorrectly assigned heparin/HS GAGs.
Heparin and heparan sulfate (HS), members of the glycosaminoglycan (GAG) family, are linear polysaccharides composed of repeating disaccharide building blocks of variously sulfated hexuronic acid (1→4) D-glucosamine units that structurally differ solely by the length of the oligosaccharide and degree of modification, with heparin being more heavily sulfated and having less N-acetylation than HS. Interacting with proteins, heparin/HS play essential roles in a wide variety of biological processes, including anticoagulation (1), cell proliferation (2, 3), and carcinogenesis (4, 5). The specificity of these interactions is driven by the pattern of modification of heparin/HS oligosaccharide sequences. To understand the molecular role of heparin/HS, it is necessary to correlate function with the fine structure of the carbohydrate. However, the non-template-driven biosynthesis of heparin/HS results in extremely diverse structures.
Analyzing heparin/HS is challenging for three reasons: the presence of multiple isomeric sequences in a complex mixture of oligosaccharides, the difficulty of separating the isomers, and the facile loss of sulfates in MS/MS (6). We previously introduced a method for structurally sequencing heparin/HS oligosaccharides that involves chemical derivatizations to replace labile sulfates with stable acetyl groups (7). This derivatization scheme allows for the use of reverse-phase liquid chromatography (LC) for high-resolution separation of isomeric heparin/HS oligosaccharides and MS/MS for sequencing them. However, the data from this derivatization method cannot be easily incorporated into current glycomic software, such as GlycoWorkbench (8), due to the multistep derivatizations and lack of a scoring algorithm that accurately evaluates the matches. We recently reported the development of a software tool for the high-throughput analysis of LC-MS/MS data from these derivatized heparin/HS oligosaccharides, entitled GAG-ID (9), which is the first database-driven software package for this purpose. GAG-ID produces a GAG sequence assignment for each input spectrum; however, some assignments are true matches and some are false. False matches arise from low-qua...