BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160053Z
LOCATION:Track 3
DTSTART;TZID=America/New_York:20201117T160000
DTEND;TZID=America/New_York:20201117T163000
UID:submissions.supercomputing.org_SC20_sess155_pap548@linklings.com
SUMMARY:HPC I/O Throughput Bottleneck Analysis with Explainable Local Mode
 ls
DESCRIPTION:Paper\n\nHPC I/O Throughput Bottleneck Analysis with
  Explainable Local Models\n\nIsakov\, del Rosario\, Madireddy\,
  Balaprakash\, Carns...\n\nWith the growing complexity of
  high-performance computing (HPC) systems\, achieving high performance
  can be difficult because of I/O bottlenecks. We analyze multiple
  years' worth of Darshan logs from the Argonne Leadership Computing
  Facility's Theta supercomputer in order to understand causes of
  poor I/O throughput. We present Gauge: a data-driven diagnostic tool
  for exploring the latent space of supercomputing job features\,
  understanding behaviors of clusters of jobs and interpreting I/O
  bottlenecks. By finding groups of jobs that at first sight are highly
  heterogeneous but share certain behaviors\, and analyzing these groups
  instead of individual jobs\, we reduce the workload of domain experts
  and automate I/O performance analysis. We conduct a case study where a
  system owner using Gauge was able to arrive at several clusters that
  do not conform to conventional I/O behaviors\, as well as find several
  potential improvements\, both on the application level and the system
  level.\n\nTag: File Systems and I/O\, Machine Learning\, Deep Learning
  and Artificial Intelligence\, Performance/Productivity Measurement and
  Evaluation\, Resource Management and Scheduling\n\nRegistration
  Category: Tech Program Reg Pass
END:VEVENT
END:VCALENDAR

