We consider the problem of minimizing the total processing time of tardy jobs on a single machine. This is a classical scheduling problem, first considered by [Lawler and Moore 1969], that also generalizes the Subset Sum problem. Recently, it was shown that this problem can be solved efficiently by computing (max, min)-skewed-convolutions. The running time of the resulting algorithm is equivalent, up to logarithmic factors, to the time it takes to compute a (max, min)-skewed-convolution of two vectors of integers whose sum is O(P), where P is the sum of the jobs' processing times. We further improve the running time of the minimum tardy processing time computation by introducing a job "bundling" technique and achieve an Õ(P^{2-1/α}) running time, where Õ(P^α) is the running time of a (max, min)-skewed-convolution of vectors of size P. This results in an Õ(P^{7/5}) time algorithm for tardy processing time minimization, an improvement over the previously known Õ(P^{5/3}) time algorithm.
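
To see where the 7/5 exponent comes from, the general bound can be instantiated as follows. This is a sketch that assumes α = 5/3, the value implied by the previously known Õ(P^{5/3}) algorithm, whose running time matches that of the skewed convolution up to logarithmic factors.

```latex
\[
  \alpha = \tfrac{5}{3}
  \quad\Longrightarrow\quad
  2 - \frac{1}{\alpha} \;=\; 2 - \frac{3}{5} \;=\; \frac{7}{5} \;<\; \frac{5}{3}.
\]
```

More generally, 2 - 1/α < α is equivalent to (α - 1)^2 > 0, so the bundled bound improves on the plain convolution-based bound for any α > 1.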