Given a user-specified minimum degree threshold γ, a γ-quasi-clique is a subgraph in which each vertex is adjacent to at least a γ fraction of the other vertices. Quasi-cliques are a natural definition of dense structures, useful for finding communities in social networks and for discovering significant biomolecule structures and pathways. However, mining maximal quasi-cliques is notoriously expensive, and the state-of-the-art algorithm scales only to small graphs. In this paper, we design parallel algorithms for mining maximal quasi-cliques on G-thinker, a distributed graph mining framework, to scale to big graphs. Our algorithms follow the divide-and-conquer idea, partitioning the problem of mining a big graph into tasks that mine smaller subgraphs. However, a direct adaptation to G-thinker cannot fully utilize the available CPU cores for mining, making a system reforge essential. We observe that even though our algorithms make better use of pruning rules to reduce the search space than prior algorithms, the resulting tasks have drastically different mining workloads, leading to the straggler problem. Even worse, the unpredictable effect of the pruning rules makes it impossible to effectively estimate the running time of a task from its subgraph. We address these challenges by redesigning G-thinker's execution engine to prioritize long-running tasks for mining, and by using a novel time-delayed divide-and-conquer strategy to decompose the workloads of long-running tasks and thereby improve load balancing. Extensive experiments verify that our parallel solution scales perfectly with the number of CPU cores, achieving over 371× speedup when mining a graph with over 1M vertices on a small 16-node cluster (32 threads each, 512 in total).
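
To make the degree condition in the definition concrete, the following is a minimal sketch of a check that a given vertex set satisfies it. It is only an illustration of the definition as stated above, not the mining algorithm of this paper. The function name `is_quasi_clique`, the adjacency-dictionary representation, and the toy graph are all assumptions introduced for the example.

```python
def is_quasi_clique(adj, vertices, gamma):
    """Check the degree condition of a gamma-quasi-clique.

    adj      -- hypothetical representation: dict mapping each vertex
                to the set of its neighbors (undirected graph)
    vertices -- candidate vertex set S
    gamma    -- minimum degree threshold, 0 <= gamma <= 1

    Returns True if every vertex in S is adjacent to at least
    gamma * (|S| - 1) of the other vertices in S.
    """
    s = set(vertices)
    others = len(s) - 1  # number of "other vertices" seen by each member
    for v in s:
        # Degree of v inside the induced subgraph on S (self excluded).
        inner_degree = len((adj[v] & s) - {v})
        if inner_degree < gamma * others:
            return False
    return True


# Toy graph (an assumption for illustration): a 4-cycle a-b-c-d
# with one chord a-c.
toy_adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

half = is_quasi_clique(toy_adj, {"a", "b", "c", "d"}, 0.5)
three_quarters = is_quasi_clique(toy_adj, {"a", "b", "c", "d"}, 0.75)
triangle = is_quasi_clique(toy_adj, {"a", "b", "c"}, 1.0)
```

With γ = 0.5 each of the four vertices needs at least 1.5 neighbors among the other three, which all of them have. With γ = 0.75 the requirement rises to 2.25, which `b` and `d` (two neighbors each) do not meet. With γ = 1 the condition reduces to that of a clique, which the triangle `a`, `b`, `c` satisfies.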