Multi-core based systems are ubiquitous in data centers. Efficient exploitation of hardware parallelism supported by such systems is imperative on multiple fronts: minimizing latency and power consumption and maximizing throughput. This in turn calls for advanced program analysis and optimization. Call graphs have been long used to this end. Although several static call graph extraction techniques have been proposed in the past, these techniques cannot be applied to analyze programs already running in production. Likewise, the existing dynamic call graph extraction tools have limited use in production owing to, say (but not limited to), lack of support for capturing wall clock time spent in functions of a given program and lack of means to analyze the call graph information captured at run time. In this paper, we present a Pin-based dynamic call graph extraction framework called Trin-Trin. The framework enables extraction of complete, precise and dynamic call graphs. Additionally, the framework can be used seamlessly with already running applications. Furthermore, an analytics engine is provided to facilitate advanced program analysis, e.g., different multithreading context(s) of any function can be extracted in a demand-driven fashion. We evaluate the overhead of Trin-Trin using several Unix utilities, applications from the industry-standard SPEC CINT2006, CFP2006 benchmark suite and Yahoo! properties. Additionally, we present a case study to illustrate how Trin-Trin can be used to analyze performance bottlenecks and performance regressions.