With a growing demand for quasi-instantaneous communication services such as real-time video streaming, cloud gaming, and Industry 4.0 applications, multi-constraint Traffic Engineering (TE) is becoming increasingly important. While legacy TE management planes like MPLS have proven laborious to deploy, Segment Routing (SR) drastically eases the deployment of TE paths and is thus increasingly adopted by Internet Service Providers (ISPs). There is now a clear need for computing and deploying Delay-Constrained Least-Cost (DCLC) paths with SR for real-time interactive services that require routes with both low delay and high bandwidth. However, most current DCLC solutions are not tailored for SR. They also often lack efficiency (particularly exact schemes) or guarantees (because they rely on unbounded heuristics). As with approximation schemes, we argue that the actual challenge is to design an algorithm that provides both performance and strong guarantees. Unlike most of these schemes, however, we also consider operational constraints to provide a practical, high-performance implementation. In this work, we leverage inherent limitations in the accuracy of delay measurements and account for the operational constraint added by SR to design a new algorithm, BEST2COP, that provides guarantees and performance in all cases. Our proposal efficiently deals with the complexity of DCLC in SR domains (DCLC-SR) thanks to simple but efficient data structures and amortized procedures specifically tailored to the three metrics involved (delay, IGP cost, and the number of segments). We show that BEST2COP outperforms a state-of-the-art algorithm on both random and real networks of up to 1000 nodes. Relying on commodity hardware with a single thread, our algorithm retrieves all non-superfluous 3-dimensional routes in only 250 ms and 100 ms, respectively. This execution time is further reduced using multiple threads, as the design of BEST2COP enables a speedup almost linear in the number of cores. 
The computing load is uniformly balanced across cores. Finally, we extend BEST2COP to deal with massive-scale ISPs by leveraging the multi-area partitioning of these deployments. Thanks to our new topology generator, specifically designed to model realistic patterns in such massive IP networks, we show that BEST2COP can solve DCLC-SR in approximately 1 second, even for ISPs with more than 100 000 routers.
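
To make the notion of non-superfluous 3-dimensional routes concrete, the following minimal sketch assumes that a route towards a destination is summarised by a triple (delay, IGP cost, number of segments) and that a route is superfluous when another route is at least as good on all three metrics. The names `Route`, `dominates`, `non_dominated`, and `dclc_sr` are hypothetical and chosen for illustration; this brute-force filter is not the BEST2COP algorithm, it only shows the kind of output it is expected to retrieve and how a DCLC-SR answer can be read from it.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: (delay, igp_cost, segments), lower is better on each.
Route = Tuple[int, int, int]


def dominates(a: Route, b: Route) -> bool:
    """True if a is no worse than b on every metric and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def non_dominated(routes: List[Route]) -> List[Route]:
    """Keep only routes that no other route dominates (quadratic filter)."""
    unique = sorted(set(routes))
    return [r for r in unique if not any(dominates(o, r) for o in unique)]


def dclc_sr(routes: List[Route], max_delay: int,
            max_segments: int) -> Optional[Route]:
    """Least-cost route meeting the delay and segment bounds, if any."""
    feasible = [r for r in non_dominated(routes)
                if r[0] <= max_delay and r[2] <= max_segments]
    return min(feasible, key=lambda r: r[1], default=None)


# Made-up candidate routes towards a single destination.
candidates = [(10, 5, 1), (8, 7, 2), (12, 5, 1), (8, 7, 3), (5, 20, 4)]
front = non_dominated(candidates)
best = dclc_sr(candidates, max_delay=9, max_segments=3)
```

With these made-up values, `(12, 5, 1)` and `(8, 7, 3)` are discarded because `(10, 5, 1)` and `(8, 7, 2)` are at least as good on every metric, and the query with a delay bound of 9 and at most 3 segments selects `(8, 7, 2)`.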