BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160105Z
LOCATION:Track 4
DTSTART;TZID=America/New_York:20201119T140000
DTEND;TZID=America/New_York:20201119T143000
UID:submissions.supercomputing.org_SC20_sess178_pap199@linklings.com
SUMMARY:Term Quantization: Furthering Quantization at Run Time
DESCRIPTION:Paper\n\nTerm Quantization: Furthering Quantization at Run Tim
 e\n\nKung\, McDanel\, Zhang\n\nWe present a novel technique\, called Ter
 m Quantization (TQ)\, for furthering quantization at run time for impro
 ved computational efficiency of deep neural networks (DNNs) already qua
 ntized with conventional quantization methods. TQ operates on power-of-
 two terms in expressions of values. In computing a dot-product\, TQ dyn
 amically selects a fixed number of largest terms to use from values of t
 he two vectors. By exploiting weight and data distributions typically pr
 esent in DNNs\, TQ has a minimal impact on DNN model performance (e.
 g.\, accuracy or perplexity). We use TQ to facilitate tightly synchroni
 zed processor arrays\, such as systolic arrays\, for efficient paralle
 l processing. We evaluate TQ on an MLP for MNIST\, multiple CNNs for Im
 ageNet and an LSTM for Wikitext-2. We demonstrate significant reductio
 ns in inference computation costs (between 3x and 10x) compared to conv
 entional uniform quantization for the same level of model performance.
 \n\nTag: Data Analytics\, Compression\, and Management\, Linear Algebr
 a\, Machine Learning\, Deep Learning and Artificial Intelligence\n\nReg
 istration Category: Tech Program Reg Pass
END:VEVENT
END:VCALENDAR

