This paper considers, for the first time, end-to-end response-time analysis for DAG-based real-time task systems implemented on heterogeneous multicore platforms. The specific analysis problem considered was motivated by an industrial collaboration involving wireless cellular base stations. The DAG-based systems considered herein allow intra-task parallelism: while each invocation of a task (i.e., DAG node) is sequential, successive invocations of a task may execute in parallel. In the proposed analysis, this characteristic is exploited to reduce response-time bounds. Additionally, there is some leeway in choosing how to set tasks' relative deadlines. It is shown that by resolving such choices holistically via linear programming, response-time bounds can be further reduced. Finally, in the considered use case, DAGs are defined based upon just a few templates and individually often have quite low utilizations. It is shown that, by combining many such DAGs into one of higher utilization, response-time bounds can often be drastically lowered. The effectiveness of these techniques is demonstrated via both case-study and schedulability experiments.