BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160050Z
LOCATION:Track 2
DTSTART;TZID=America/New_York:20201117T140000
DTEND;TZID=America/New_York:20201117T143000
UID:submissions.supercomputing.org_SC20_sess159_pap505@linklings.com
SUMMARY:Distributed-Memory DMRG via Sparse and Dense Parallel Tensor
  Contractions
DESCRIPTION:Paper\n\nDistributed-Memory DMRG via Sparse and Dense Parallel
  Tensor Contractions\n\nLevy, Solomonik, Clark\n\nThe density matrix
  renormalization group (DMRG) algorithm is a powerful tool for solving
  eigenvalue problems to model quantum systems. DMRG relies on tensor
  contractions and dense linear algebra to compute properties of
  condensed matter physics systems. However, its efficient parallel
  implementation is challenging due to limited concurrency, large memory
  footprint and tensor sparsity. We mitigate these problems by
  implementing two new parallel approaches that handle block sparsity
  arising in DMRG, via Cyclops, a distributed memory tensor contraction
  library. We benchmark their performance on two physical systems using
  the Blue Waters and Stampede2 supercomputers. Our DMRG performance is
  improved by up to 5.9x in runtime and 99x in processing rate over
  ITensor, at roughly comparable computational resource use. This
  enables higher accuracy calculations via larger tensors for quantum
  state approximation. We demonstrate that despite having limited
  concurrency, DMRG is weakly scalable with the use of efficient
  parallel tensor contraction mechanisms.\n\nTag: Algorithms,
  Applications, Sparse Computation\n\nRegistration Category: Tech Program
  Reg Pass
END:VEVENT
END:VCALENDAR