Distributed BERT Pre-Training and Fine-Tuning with Intel-Optimized TensorFlow on Intel Xeon Scalable Processors
Event Type: Posters (Research Posters)
Registration Categories: TP, XO
Time: Thursday, 19 November 2020, 8:30am - 5pm EDT
Location: Poster Module
Description: Distributed computing has become a key component of data science, allowing faster prototyping and accelerated time to market for numerous workloads. This work examines the distributed training performance of BERT, a state-of-the-art language model for natural language processing (NLP), in pre-training and fine-tuning tasks on general-purpose Intel CPUs. The analysis includes the effects of using Intel-optimized TensorFlow on Intel Architecture with both the FP32 and BFLOAT16 floating-point formats. Results show that the distributed TensorFlow BERT model with the LAMB optimizer maintains high accuracy while achieving good performance speedups when scaling to a larger number of Intel Xeon CPUs.
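The abstract names the main ingredients (Intel-optimized TensorFlow, BFLOAT16, the LAMB optimizer, multi-CPU scaling) without showing how they fit together. The sketch below is one plausible wiring using Keras mixed precision, TensorFlow Addons' LAMB, and Horovod for data parallelism; Horovod, TensorFlow Addons, build_bert_model(), and build_dataset() are assumptions of this illustration, not details from the poster.

```python
# A minimal sketch of the configuration described in the abstract.
# Horovod, build_bert_model(), and build_dataset() are illustrative
# assumptions, not details confirmed by the poster.
import tensorflow as tf
import tensorflow_addons as tfa
import horovod.tensorflow.keras as hvd

hvd.init()  # one rank per Xeon socket or node

# BFLOAT16 compute with FP32 variables; Intel-optimized TensorFlow's
# oneDNN kernels accelerate this path on BF16-capable Xeon CPUs.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

model = build_bert_model()  # hypothetical: any Keras BERT implementation
dataset = build_dataset()   # hypothetical: pre-training or fine-tuning data

# LAMB preserves accuracy at the large global batch sizes that arise when
# scaling out; the base learning rate is scaled by the worker count.
opt = tfa.optimizers.LAMB(learning_rate=1e-4 * hvd.size())
opt = hvd.DistributedOptimizer(opt)  # allreduce gradients across ranks

model.compile(
    optimizer=opt,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

model.fit(
    dataset.shard(hvd.size(), hvd.rank()),  # each rank trains on its own shard
    epochs=3,
    # broadcast rank 0's initial weights so all workers start in sync
    callbacks=[hvd.callbacks.BroadcastGlobalVariablesCallback(0)],
)
```

Launched with, for example, `horovodrun -np 8 python train.py`, a script along these lines scales data-parallel training across Xeon sockets or nodes in the manner the results describe.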