Node-Level Performance Engineering: Part 1
Presenters
Event Type
Tutorial
Requirements, Performance, and Benchmarks
Performance/Productivity Measurement and Evaluation
Software Engineering
Time: Monday, 9 November 2020, 10am - 2pm EDT
Location: Track 4
Description: The gap between peak and application performance continues to widen. Paradoxically, poor node-level performance makes code scale well, but at the price of increased time-to-solution. As a result, valuable resources are wasted at massive scale. If we care about resource efficiency on any scale, optimal node-level performance is crucial. We convey the architectural features of current processor chips, multiprocessor nodes and accelerators, as far as they are relevant for the practitioner. Peculiarities like SIMD, shared caches, bandwidth bottlenecks and ccNUMA are introduced, and the influence of system topology and affinity on the performance of parallel programming constructs is demonstrated. Performance engineering and performance patterns are suggested as powerful tools that help the user understand the bottlenecks at hand and assess the impact of possible code optimizations. A cornerstone of these concepts is the roofline model, which is described in detail, including useful case studies and limits of its applicability.
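As a rough illustration of the roofline model named in the description, the sketch below computes the basic performance bound: the smaller of the peak compute rate and the product of memory bandwidth and computational intensity. The function name, the machine numbers, and the kernel intensity are all hypothetical and chosen only for illustration; they are not taken from the tutorial material.

```python
def roofline_gflops(peak_gflops, mem_bw_gbytes, intensity_flop_per_byte):
    """Upper performance bound in GFlop/s under the basic roofline model.

    The bound is the smaller of the peak compute rate and the rate at
    which memory bandwidth can feed the kernel (intensity * bandwidth).
    """
    return min(peak_gflops, intensity_flop_per_byte * mem_bw_gbytes)


# Hypothetical machine (assumed values): 1000 GFlop/s peak, 100 GB/s bandwidth.
PEAK = 1000.0
BANDWIDTH = 100.0

# Hypothetical streaming kernel: 2 flops per 24 bytes of memory traffic.
streaming_bound = roofline_gflops(PEAK, BANDWIDTH, 2.0 / 24.0)

# Hypothetical compute-heavy kernel: 50 flops per byte of memory traffic.
compute_bound = roofline_gflops(PEAK, BANDWIDTH, 50.0)

print(f"streaming kernel bound: {streaming_bound:.2f} GFlop/s")
print(f"compute-heavy kernel bound: {compute_bound:.2f} GFlop/s")
```

Under these assumed numbers the streaming kernel is limited by memory bandwidth, well below peak, while the compute-heavy kernel is limited by the peak compute rate.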