BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210402T160554Z
LOCATION:Track 2
DTSTART;TZID=America/New_York:20201112T143000
DTEND;TZID=America/New_York:20201112T150000
UID:submissions.supercomputing.org_SC20_sess208_ws_pmbsf110@linklings.com
SUMMARY:Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimiz
 ation Pragmas Using Bayesian Optimization
DESCRIPTION:Workshop\n\nAutotuning PolyBench Benchmarks with LLVM Clang/Po
 lly Loop Optimization Pragmas Using Bayesian Optimization\n\nWu, Kruse, Ba
 laprakash, Finkel, Taylor...\n\nAutotuning is an approach that explores
  a search space of possible implementations/configurations of a kernel
  or an application by selecting and evaluating a subset of
  implementations/configurations on a target platform and/or using models
  to identify a high-performance implementation/configuration. In this
  paper, we develop an autotuning framework that leverages Bayesian
  optimization to search the parameter space. We select six of the most
  complex benchmarks from the app
 lication domains of the PolyBench benchmarks (syr2k, 3mm, heat-3d, lu, cov
 ariance, and Floyd-Warshall) and apply the newly developed LLVM
  Clang/Polly loop optimization pragmas to the benchmarks to optimize
  them. We then us
 e the autotuning framework to optimize the pragma parameters to improve th
 eir performance. The experimental results show that our autotuning
  approach outperforms the other compilation methods, providing the
  smallest execution time for the benchmarks syr2k, 3mm, heat-3d, lu, and
  covariance with two large datasets, using 200 code evaluations to
  effectively search parameter spaces with up to 170,368 different
  configurations. We compare diffe
 rent supervised learning methods within Bayesian optimization and evaluate
  their effectiveness. We find that the Floyd-Warshall benchmark did not be
 nefit from autotuning because the heuristics Polly uses to optimize the
  benchmark make it run much slower. To cope with this issue, we provide
  compiler options that improve performance.\n\nRegistration Categ
 ory: Workshop Reg Pass
END:VEVENT
END:VCALENDAR

