Purpose
There is a strong clinical need to evaluate different multi‐criteria optimization (MCO) algorithms, including inverse optimization sampling algorithms and machine learning‐based predictions. This study aims to develop and compare several interpolated Pareto surface similarity metrics.
Materials and methods
The first metric is the root‐mean‐square error (RMSE) evaluated between vertices on the interpolated surfaces, augmented by intra‐simplex sampling of the barycentric coordinates of the surfaces’ simplicial complexes. The second metric is the average projected distance (APD), which evaluates the displacements between the vertices and computes their projections along the mean displacement. The third metric is the average nearest‐point distance (ANPD), which numerically integrates point‐to‐simplex distances over the sampled simplices of the interpolated surfaces. These metrics were compared by their convergence rates, the times required to achieve convergence, and their representation of the underlying surface interpolations. For analysis, several interpolated Pareto surface pairs were constructed abstractly, and one further pair was taken from a nasopharyngeal treatment planning case using MCO.
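The three metrics can be illustrated with a minimal sketch for the simplest setting, two objectives, where each interpolated surface is a polyline made of 1‐simplices. This is not the implementation used in the study. It assumes that both surfaces have the same number of vertices in corresponding order, that samples are spaced evenly in the barycentric coordinate, and that the ANPD integral is approximated by an unweighted mean over the samples. All function names are hypothetical.

```python
import math

def lerp(p, q, t):
    """Point with barycentric coordinates (1 - t, t) on the segment p-q."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def sample_polyline(verts, n):
    """Take n >= 2 evenly spaced barycentric samples on every segment.
    Interior vertices appear twice, once per adjacent segment."""
    pts = []
    for p, q in zip(verts, verts[1:]):
        pts.extend(lerp(p, q, k / (n - 1)) for k in range(n))
    return pts

def rmse(verts_a, verts_b, n=50):
    """RMSE between points sharing the same barycentric coordinates (assumed pairing)."""
    pa, pb = sample_polyline(verts_a, n), sample_polyline(verts_b, n)
    sq = [math.dist(x, y) ** 2 for x, y in zip(pa, pb)]
    return math.sqrt(sum(sq) / len(sq))

def apd(verts_a, verts_b):
    """Average projection of the vertex displacements onto the mean displacement."""
    disp = [tuple(b - a for a, b in zip(p, q)) for p, q in zip(verts_a, verts_b)]
    dim = len(disp[0])
    mean = [sum(d[i] for d in disp) / len(disp) for i in range(dim)]
    norm = math.hypot(*mean)
    if norm == 0.0:
        return 0.0  # no preferred direction when the displacements cancel
    proj = [sum(d[i] * mean[i] for i in range(dim)) / norm for d in disp]
    return sum(proj) / len(proj)

def point_segment_distance(x, p, q):
    """Distance from point x to the closest point on the segment p-q."""
    pq = [b - a for a, b in zip(p, q)]
    px = [b - a for a, b in zip(p, x)]
    denom = sum(c * c for c in pq)
    t = 0.0 if denom == 0.0 else sum(a * b for a, b in zip(px, pq)) / denom
    return math.dist(x, lerp(p, q, max(0.0, min(1.0, t))))

def anpd(verts_a, verts_b, n=80):
    """Mean nearest-point distance from samples on surface A to surface B
    (one-directional and unweighted; both are simplifying assumptions)."""
    segs = list(zip(verts_b, verts_b[1:]))
    dists = [min(point_segment_distance(x, p, q) for p, q in segs)
             for x in sample_polyline(verts_a, n)]
    return sum(dists) / len(dists)

# Illustrative data only: a straight two-segment front and two shifted copies.
surface_a = [(0.0, 2.0), (1.0, 1.0), (2.0, 0.0)]
shifted_normal = [(x + 0.5, y + 0.5) for x, y in surface_a]   # moved off the front
shifted_tangent = [(x + 0.5, y - 0.5) for x, y in surface_a]  # slid along the front
```

In this sketch, a shift perpendicular to the front gives the same value for all three metrics. A shift along the front leaves the RMSE unchanged, because the paired sampling points are still displaced, while the ANPD becomes much smaller, because most of one curve still lies on the other.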
Results
Convergence within 1% is typically achieved at approximately 50 and 80 samples per barycentric dimension for the RMSE and the ANPD, respectively. In two dimensions, reaching convergence takes approximately 1 ms for the RMSE and 10 ms for the ANPD, while the APD always requires < 1 ms. These time costs increase substantially in higher dimensions, but only for the RMSE and the ANPD. The APD values more closely approximated the ANPD limits than the RMSE limits.
Conclusion
The ANPD’s formulation and generality likely make it more meaningful than the RMSE and the APD for representing the similarity between the underlying interpolated surfaces, rather than between the sampling points on those surfaces. However, in situations requiring high‐speed evaluations, the APD may be preferable because of its speed, its independence from a subjectively chosen sampling rate, and its similarity to the ANPD limits.