FTXS: Workshop on Fault-Tolerance for HPC at Extreme Scale
Session Chairs
Event Type: Workshop
Extreme Scale Computing
Fault Tolerance
Reliability and Resiliency
Time: Wednesday, 11 November 2020, 10am - 1:30pm EDT
Location: Track 11
Presentations
10:00am - 10:05am EDT | FTXS – Introduction: Workshop on Fault-Tolerance for HPC at Extreme Scale
10:05am - 10:35am EDT | Improving Scalability of Silent-Error Resilience for Message-Passing Solvers via Local Recovery and Asynchrony
10:35am - 11:05am EDT | Towards Distributed Software Resilience in Asynchronous Many-Task Programming Models
11:05am - 11:35am EDT | Models for Resilience Design Patterns
11:35am - 11:55am EDT | FTXS – Break
11:55am - 12:25pm EDT | From Tasks Graphs to Asynchronous Distributed Checkpointing with Local Restart
12:25pm - 12:55pm EDT | A Generic Strategy for Node-Failure Resilience for Certain Iterative Linear Algebra Methods
12:55pm - 1:25pm EDT | Checkpointing OpenSHMEM Programs Using Compiler Analysis
1:25pm - 1:30pm EDT | FTXS – Closing Remarks