High-performance streaming applications require hardware platforms featuring complex, multi-level interconnects. These applications often resemble a task farm, in which many identical tasks listen to the same input channel. Conventional embedded-system design tools are poorly suited to capturing such applications. In particular, the non-uniform memory access (NUMA) nature of the platforms induces latencies that must be carefully examined. This paper proposes a multi-level modeling methodology and extends the supporting tools (TTool, SoCLib) to model the characteristics of streaming applications (multiple tasks, nondeterministic behavior, I/O devices) in UML/SysML and to automatically generate a virtual prototype that can be simulated with high precision. A typical streaming application is used to show how latencies can be estimated and fed back into the diagrams.