BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160547Z
LOCATION:Track 7
DTSTART;TZID=America/New_York:20201109T100000
DTEND;TZID=America/New_York:20201109T140000
UID:submissions.supercomputing.org_SC20_sess243_tut117@linklings.com
SUMMARY:Fault-tolerance for High Performance and Big Data Applications: Th
 eory and Practice: Part 1
DESCRIPTION:Tutorial\n\nFault-tolerance for High Performance and Big Data 
 Applications: Theory and Practice: Part 1\n\nBosilca, Bouteiller, Herault,
  Robert\n\nResilience is a critical issue for large-scale platforms. This 
 tutorial provides a comprehensive survey of fault-tolerant techniques for 
 high-performance and big-data applications, with a fair balance between th
 eory and practice.\n\nOutline: Overview of failure types and typical pro
 bability distributions; general-purpose techniques: checkpoint and rollbac
 k recovery protocols, replication, prediction, silent error detection; app
 lication-specific techniques: user-level in-memory checkpointing, data re
 plication (map-reduce) or fixed-point convergence for iterative applicatio
 ns (back-propagation); practical deployment of fault tolerance techniques 
 with User Level Fault Mitigation (a proposed MPI standard extension).\n\n
 Examples: Monte-Carlo methods; SPMD stencil; map-reduce; back-propagation 
 in neural networks.\n\nA step-by-step approach shows how to protect these 
 routines in a hands-on session. The tutorial is open to all SC20 attendees
  who are interested in the current status and expected promise of fault-to
 lerant approaches for scientific and big data applications. There are no a
 udience prerequisites: background will be provided for all protocols and p
 robabilistic models.\n\nTag: Correctness, Fault Tolerance, MPI, Reliabilit
 y and Resiliency, Reproducibility and Transparency\n\nRegistration Categor
 y: Tutorial Reg Pass
END:VEVENT
END:VCALENDAR

