Accuracy of infrared (IR) models to measure soil particle-size distribution (PSD) depends on soil preparation, methodology (sedimentation, laser), settling times and relevant soil features. Compositional soil data may require isometric log ratio (ilr) transformation to avoid ...

Infrared spectroscopy (IRS) predicts clay content more accurately than sand and silt contents [10] because the IR spectrum is sensitive to clay mineralogy [10][11][12] and total reflectance decreases as grain size increases [13,14]. Light absorption is also influenced by soil features such as particle roundness [15][16][17][18], soil pH, and the Mehlich-3 soil test for Ca, Mg and Mn [19]. IRS detects Al-OH (2200 nm) and Fe-OH (2290 nm) [20,21], which in turn affect soil structure [22] and IR reflectance [23]. The VIS-NIR spectra are sensitive to soil moisture and C content [24]. Organic matter, multi-nutrient extraction and reconstituted bulk density from scooped soil samples are also common features quantified in routine laboratories.

The sedimentation methods provide percentages of sand, silt, and clay from the log-log relationship between settling time and suspension density. The slope and intercept return proportions of sand, silt and clay at pre-selected settling times that may vary between laboratories [1], thus affecting the accuracy of IRS models. Using the slope and intercept of the log-log relationship directly provides more flexibility and reduces the arbitrariness of settling time selection, which may increase the reliability of IRS models calibrated against Bouyoucos methods.

There are unaddressed sources of error in IRS calibration. There is systematic negative covariance between sand, silt and clay fractions due to resonance within the ternary diagram [25]. Indeed, there are D-1 degrees of freedom in a D-part composition [26].
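
The closure constraint can be illustrated with a minimal Python sketch. The five textures below are hypothetical values chosen only for illustration, and the balance partition ([clay | silt, sand], then [silt | sand]) is an assumption, not necessarily the one used in this study. The sketch shows that each row of the covariance matrix of closed data sums to zero, and that D-1 = 2 ilr coordinates are enough to recover a three-part composition.

```python
import math

# Hypothetical textures (% sand, % silt, % clay); each row closes to 100.
SAMPLES = [
    (60.0, 30.0, 10.0),
    (40.0, 40.0, 20.0),
    (20.0, 50.0, 30.0),
    (70.0, 20.0, 10.0),
    (30.0, 30.0, 40.0),
]


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of the columns of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def ilr(sand, silt, clay):
    """Two orthonormal balances under an assumed partition:
    [clay | silt, sand] and [silt | sand]."""
    b1 = math.sqrt(2.0 / 3.0) * math.log(clay / math.sqrt(silt * sand))
    b2 = math.sqrt(1.0 / 2.0) * math.log(silt / sand)
    return b1, b2


def ilr_inverse(b1, b2, total=100.0):
    """Back-transform two balances to (sand, silt, clay) summing to total."""
    clr = (
        -b1 / math.sqrt(6.0) - b2 / math.sqrt(2.0),  # sand
        -b1 / math.sqrt(6.0) + b2 / math.sqrt(2.0),  # silt
        b1 * math.sqrt(2.0 / 3.0),                   # clay
    )
    parts = [math.exp(v) for v in clr]
    scale = total / sum(parts)
    return tuple(p * scale for p in parts)


cov = covariance_matrix(SAMPLES)
# Because the parts sum to a constant, each row of the covariance matrix
# sums to zero: every variance is offset by net negative covariance.
row_sums = [sum(row) for row in cov]

# Two balances carry all the information of a three-part composition.
b1, b2 = ilr(*SAMPLES[0])
recovered = ilr_inverse(b1, b2)
```

Here `row_sums` is numerically zero for every fraction, and `recovered` reproduces the first sample while summing to 100. A different sequential binary partition would give different balance values but the same back-transformed proportions.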
If closure to 100% is ignored in statistical analysis, confidence intervals about mean proportions may take values outside the compositional space, i.e. < 0 or > 100% [27], and the measures of distance and dissimilarity are non-Euclidean [28]. To return unbiased statistical results, orthonormal balances among subsets of components can be computed as D-1 isometric log ratios (ilr) [29]. The back-transformed ilr values recover proportions of sand, silt, and clay that total exactly 100% and lie within the limits of the ternary diagram.

For both the sedimentation and laser techniques, several soil pre-treatments (peroxide, sodium hypochlorite, sodium hexametaphosphate, sonication intensity), soil features, calibration techniques, options (NIR 2X, NIR 4X, settling times or suspension density function for sedimentation; pump and stirrer spin, refractive index of the medium, real or imaginary refractive index, density for laser) and expressions (percentages, ratios) influence the results of particle-size distribution. Machine learning (ML) is an emerging data mining technique of ...