Microservice architectures can enhance software development by allowing
multiple programming languages and deployment infrastructures, isolating
failures within individual services, and accelerating the debugging and
fixing of issues in independent services. However, locating performance
degradation becomes challenging because of the numerous service
instances and their complex interactions, compounded by parallelism.
Although end-to-end tracing can follow execution paths across services
and measure their latencies, it is limited to high-level information.
Indeed, end-to-end tracing cannot pinpoint the root causes of
performance degradation between the processes. Moreover, many existing
performance analysis tools lack a comparison feature to give developers
a comprehensive view of the performance differences between two groups
of requests. This paper introduces DTraComp (Distributed Trace
Compare), an open-source framework that is compatible with various
microservice trace standards and integrated with Eclipse Trace
Compass™. Our framework offers a robust visual comparison of two groups
of executions within distributed systems, including nested spans
executed in parallel. Furthermore, it provides system
kernel details for each thread involved in the execution of each span,
allowing it to pinpoint the reasons for performance degradation across
distributed systems. We used our proposed framework to analyze five
practical use cases. Our efficiency evaluation shows that the overall
running time scales linearly, O(n), with the trace size, indicating
that the framework is suitable for deployment in production
environments. It is currently used at Ericsson for performance
evaluation.