Resilience and Power Management
Event Type: Paper
Accelerators, FPGA, and GPUs
Fault Tolerance
Power
Reliability and Resiliency
TP
Time: Wednesday, 18 November 2020, 3pm - 4:30pm EDT
Location: Track 5
Presentations
3:00pm - 3:30pm EDT | Runtime-Guided ECC Protection using Online Estimation of Memory Vulnerability
3:30pm - 4:00pm EDT | CRAC: Checkpoint-Restart Architecture for CUDA with Streams and UVM
4:00pm - 4:30pm EDT | ANT-Man: Towards Agile Power Management in the Microservice Era