Estimating the I/O time of applications is critical for computing systems research and development, such as performance tuning and job scheduling. Parallel I/O systems on large-scale HPC systems typically use several I/O servers, each attached to a number of hard disk drives, to read and write data concurrently. As a result, the response time of individual I/O servers affects the overall I/O performance, and modeling the response-time distribution is key to estimating I/O time. Existing studies have generally assumed that the response time follows a Uniform or a Normal distribution. However, none of these studies examined supercomputing environments that are actively shared by many users to verify that a Uniform or Normal distribution actually holds. In this study, we collected ≈2,500,000 measurements on two peta-scale supercomputers, Hopper and Edison at the National Energy Research Scientific Computing Center (NERSC), which are actively used by ≈5,000 users and typically support hundreds of concurrent jobs. Our performance measurements include the overheads introduced by the entire parallel I/O stack (I/O library, network, parallel file system software, cache, and hardware). Our study shows that the response time of a parallel I/O system is heavy-tailed, contrary to the widely accepted Normal or Uniform distributions. In exploring new models, we find that a mixture of Power Law and Normal distributions is a good fit for the response time of parallel I/O systems that are actively used by hundreds of concurrent jobs.
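To illustrate the kind of model the abstract describes, the sketch below generates synthetic response times from a Normal "body" plus a heavy Pareto (power-law) "tail", then estimates the tail exponent with the standard maximum-likelihood estimator for a power law above a cutoff. All numeric values (means, exponents, the 90th-percentile cutoff) are illustrative assumptions, not the paper's measured parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic per-request response times (ms): a Normal body for typical
# requests plus Pareto-distributed stragglers. Values are hypothetical.
body = rng.normal(loc=10.0, scale=2.0, size=9000)
tail = 10.0 * (1.0 + rng.pareto(a=2.5, size=1000))  # heavy-tailed stragglers
samples = np.clip(np.concatenate([body, tail]), 0.1, None)

def fit_tail_exponent(x, xmin):
    """MLE for the power-law exponent alpha of the tail x >= xmin:
    alpha = 1 + n / sum(ln(x_i / xmin))."""
    t = x[x >= xmin]
    return 1.0 + len(t) / np.sum(np.log(t / xmin))

# Treat the top 10% of samples as the power-law tail (assumed cutoff),
# and fit the remaining body with a Normal distribution.
xmin = np.quantile(samples, 0.90)
alpha = fit_tail_exponent(samples, xmin)
mu = samples[samples < xmin].mean()
sigma = samples[samples < xmin].std()
print(f"body ~ Normal(mu={mu:.1f}, sigma={sigma:.1f}); tail alpha = {alpha:.2f}")
```

A fitted model of this shape lets a scheduler or simulator reason about stragglers explicitly: the Normal component captures typical server response times, while the power-law component captures the rare but slow requests that dominate the completion time of a parallel I/O operation.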