BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160556Z
LOCATION:Track 10
DTSTART;TZID=America/New_York:20201112T143300
DTEND;TZID=America/New_York:20201112T151000
UID:submissions.supercomputing.org_SC20_sess218_pec356@linklings.com
SUMMARY:PAW-ATM – Keynote: Performance Portability in the Age of Extreme H
 eterogeneity
DESCRIPTION:Workshop\n\nPAW-ATM – Keynote: Performance Portability in the 
 Age of Extreme Heterogeneity\n\nShalf\n\nMoore’s Law is a techno-economic 
 model that has enabled the IT industry to double the performance and funct
 ionality of digital electronics roughly every 2 years within a fixed cost,
  power and area. This expectation has led to a relatively stable ecosystem
  (e.g. electronic design automation tools, compilers, simulators and emula
 tors) built around general-purpose processor technologies, such as the x86
 , ARM and Power instruction set architectures. However, the historical imp
 rovements in performance offered by successive generations of lithography 
 are waning while costs for new chip generations are growing rapidly.\n\nIn
  the near term, the most practical path to continued performance growth wi
 ll be architectural specialization in the form of many different kinds of 
 accelerators. New software implementations, and in many cases new mathemat
 ical models and algorithmic approaches, are necessary to advance the scien
 ce that can be done with these specialized architectures. This trend wi
 ll n
 ot only continue but also intensify as the transition from multi-core syst
 ems to hybrid systems has already caused many teams to re-factor and redes
 ign their implementations. But the next step to systems that exploit not j
 ust one type of accelerator but a full range of heterogeneous architecture
 s will require more fundamental and disruptive changes in algorithm and so
 ftware approaches. This applies to the broad range of algorithms used in s
 imulation, data analysis and learning. New programming models or low-level
  software constructs that hide the details of the architecture from the im
 plementation can make future programming less time-consuming, but they wil
 l not eliminate nor in many cases even mitigate the need to redesign algor
 ithms. Future software development will not be tractable if a completely d
 ifferent code base is required for each different variant of a specialized
  system.\n\nThe aspirational desire for “minimizing the number of lines of
  code that must be changed to migrate to different systems with different 
 arrangements of specialization” is encapsulated in the loaded phrase “Perf
 ormance Portability.” However, performance portability is likely not an ac
 hievable goal if we attempt to do it using imperative languages like Fortr
 an and C/C++. There is simply not enough flexibility built into the speci
 fication of the algorithm for a compiler to do anything other than what th
 e algorithm designer explicitly stated in their code. To make this future 
 of diverse accelerators usable and accessible will requ
 ire the co-design of new compiler technology and domain-specific language
 s (DSLs) designed around the requirements of the target computational moti
 fs (the 13 motifs that extended Phil Colella’s original Dwarfs of algorith
 mic methods). The higher levels of abstraction and declarative semantics o
 ffered by DSLs enable more degrees of freedom to optimally map the algorit
 hms onto diverse hardware than traditional imperative languages that over-
 prescribe the solution. Because this will drastically increase the complex
 ity of the mapping problem, new mathematics for optimization will be devel
 oped, along with better performance introspection (both hardware and softw
 are mechanisms for online performance introspection) through extensions to
  the roofline model. Use of ML/AI technologies will be essential to enable
  analysis and automation of dynamic optimizations.\n\nRegistration Categor
 y: Workshop Reg Pass
END:VEVENT
END:VCALENDAR

