BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160049Z
LOCATION:Track 4
DTSTART;TZID=America/New_York:20201117T133000
DTEND;TZID=America/New_York:20201117T140000
UID:submissions.supercomputing.org_SC20_sess176_pap379@linklings.com
SUMMARY:ZeRO: Memory Optimizations Toward Training Trillion Parameter Mode
 ls
DESCRIPTION:Paper\n\nZeRO: Memory Optimizations Toward Training Trillion
  Parameter Models\n\nRajbhandari\, Rasley\, Ruwase\, He\n\nLarge deep
  learning models offer significant accuracy gains\, but training
  billions of parameters is challenging. Existing solutions exhibit
  fundamental limitations in fitting these models into limited device
  memory while remaining efficient. Our solution uses the Zero
  Redundancy Optimizer (ZeRO) to optimize memory\, vastly improving
  throughput while increasing model size. ZeRO eliminates memory
  redundancies\, allowing us to scale the model size in proportion to
  the number of devices with sustained high efficiency. ZeRO can scale
  beyond 1 trillion parameters using today’s hardware.\n\nOur
  implementation of ZeRO can train models of over 100B parameters on
  400 GPUs with super-linear speedup\, achieving 15 petaflops. This
  represents an 8x increase in model size and a 10x increase in
  achievable performance. ZeRO can train large models of up to 13B
  parameters without requiring model parallelism (which is harder for
  scientists to apply). Researchers have used ZeRO to create the
  world’s largest language model (17B parameters) with
  record-breaking accuracy.\n\nTag: Machine Learning\, Deep Learning
  and Artificial Intelligence\, Memory Optimization\, Scalable
  Computing\n\nRegistration Category: Tech Program Reg Pass
END:VEVENT
END:VCALENDAR

