The quality of service (QoS) and lifetime of wireless sensor networks (WSNs) are severely degraded by coverage holes caused by random deployment or battery exhaustion of sensors. This work first introduces a novel confident information coverage (CIC) model that substantially reduces the required density of sensor nodes and accurately detects confident information coverage holes (CICHs). We then formulate the problem of repairing confident information coverage holes (RCICH) for big data collection in a large-scale heterogeneous WSN (LS-HWSN), which spreads over a wide geographic area and comprises thousands of stationary and mobile sensor nodes. The RCICH problem is to repair CICHs effectively given that sensor nodes transmit data at different rates, and we prove that it is NP-complete. The goal is to select a subset of the mobile sensor nodes for dispatch that, depending on the objective, either minimizes the total lost throughputs (LTs) of the dispatched mobile sensor nodes or maximizes their total repairing transmission times (RTTs). Finally, based on the CIC model and a data-centric perspective, we propose two heuristic schemes, a centralized dispatch scheme and a distributed dispatch scheme, to solve the RCICH problem. Simulation results show that, under the topology control of a fan-shaped clustering (FSC) protocol, the proposed schemes effectively repair CICHs while improving the QoS and lifetime of the LS-HWSN.

INDEX TERMS Large-scale heterogeneous wireless sensor networks, big data collection, confident information coverage, repairing confident information coverage holes.