BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160544Z
LOCATION:Poster Module
DTSTART;TZID=America/New_York:20201119T083000
DTEND;TZID=America/New_York:20201119T170000
UID:submissions.supercomputing.org_SC20_sess337_rpost111@linklings.com
SUMMARY:Distributed BERT Pre-Training And Fine-Tuning with Intel-Optimized
  TensorFlow On Intel Xeon Scalable Processors
DESCRIPTION:Posters, Research Posters\n\nDistributed BERT Pre-Training And
  Fine-Tuning with Intel-Optimized TensorFlow On Intel Xeon Scalable Proces
 sors\n\nOzturk, Wang, Szankin, Shao\n\nDistributed computing has become a 
 key component in the field of data science, allowing for faster prototypin
 g and accelerated time to market of numerous workloads. This work examines
  the distributed training performance of BERT, a state-of-the-art language
  model for natural language processing (NLP), in the tasks of
  pre-training and fine-tuning on general-purpose Intel CPUs. The effects
  of using Intel-optimized TensorFlow on Intel Architecture with both FP32
  and BFLOAT16 floating-point formats are included in the analysis.
  Results show that the distributed TensorFlow BERT model with the LAMB
  optimizer can maintain high accuracy while achieving good speedups from
  scaling to a larger number of Intel Xeon CPUs.\n\nRegistration Category:
  Tech Program Reg Pass, Exhibits Reg Pass
END:VEVENT
END:VCALENDAR

