Background
Preoperative prediction of axillary lymph node (ALN) status is critical in the management of breast cancer patients. This study aims to develop a new set of nomograms to accurately predict ALN status.

Methods
We searched the National Cancer Database to identify eligible female breast cancer patients with profiles containing critical information. Patients diagnosed in 2010–2011 and 2012–2013 were designated the training (n = 99,618) and validation (n = 101,834) cohorts, respectively. We used binary logistic regression to investigate risk factors for ALN status and to develop a new set of nomograms to determine the probability of having any positive ALNs and of having N2–3 disease. We used ROC analysis and calibration plots to assess the discriminative ability and accuracy of the nomograms, respectively.

Results
In the training cohort, we identified age, quadrant of the tumor, tumor size, histology, ER, PR, HER2, tumor grade and lymphovascular invasion as significant predictors of ALN status. Nomogram-A was developed to predict the probability of having any positive ALNs (P_any) in the full population, with a C-index of 0.788 and 0.786 in the training and validation cohorts, respectively. In patients with positive ALNs, Nomogram-B was developed to predict the conditional probability of having N2–3 disease (P_con), with a C-index of 0.680 and 0.677 in the training and validation cohorts, respectively. The absolute probability of having N2–3 disease can be estimated as P_any × P_con. Both nomograms were well calibrated.

Conclusions
We developed a set of nomograms to predict the ALN status in breast cancer patients.

Electronic supplementary material
The online version of this article (doi:10.1186/s12885-017-3535-7) contains supplementary material, which is available to authorized users.
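
As an illustration of how the two nomogram outputs combine, the following minimal sketch multiplies P_any (from Nomogram-A) by P_con (from Nomogram-B) to estimate the absolute probability of N2–3 disease. The function name and the example probabilities are hypothetical and are not taken from the study; in practice the inputs would be read off the published nomograms.

```python
def absolute_n2_3_probability(p_any: float, p_con: float) -> float:
    """Estimate the absolute probability of N2-3 disease.

    p_any: probability of having any positive ALN (Nomogram-A output).
    p_con: conditional probability of N2-3 disease given positive ALNs
           (Nomogram-B output).
    Both inputs are assumed to be probabilities in [0, 1].
    """
    for name, p in (("p_any", p_any), ("p_con", p_con)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")
    # Chain rule: P(N2-3) = P(any positive ALN) * P(N2-3 | positive ALN)
    return p_any * p_con


# Hypothetical values for illustration only (not from the paper).
example = absolute_n2_3_probability(p_any=0.40, p_con=0.25)
print(f"Estimated absolute probability of N2-3 disease: {example:.2f}")
```

Because P_con is conditional on nodal positivity, it only makes sense to interpret it through this product when estimating risk for a patient whose nodal status is still unknown.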