BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163228Z
LOCATION:301-302-303
DTSTART;TZID=America/Denver:20191120T160000
DTEND;TZID=America/Denver:20191120T163000
UID:submissions.supercomputing.org_SC19_sess154_pap176@linklings.com
SUMMARY:Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication
DESCRIPTION:Paper\n\nNear-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication\n\nFujiki\, Chatterjee\, Lee\, O'Connor\n\nEfficient manipulation of sparse matrices is critical to a wide range of HPC applications. We study one common operation\, Sparse Matrix Multi-Vector Multiplication (SpMM)\, and evaluate the impact of the sparsity\, distribution of non-zero elements\, and tile-traversal strategies on GPU implementations. Using these insights\, we determine that operating on these sparse matrices in tiled-DCSR is well-suited to the parallel warp-synchronous execution model of GPUs.\n\nPreprocessing or storing the sparse matrix in the tiled-DCSR format\, however\, often requires significantly more memory storage than conventional CSR or CSC formats. Given that SpMM kernels are often bottlenecked on DRAM bandwidth\, the increase in DRAM traffic can result in a slowdown for many matrices.\n\nThis work enhances a GPU's last-level cache/memory controller unit to act as a dynamic translator between the compute-optimized representation of data (tiled-DCSR) and its corresponding storage/bandwidth-optimized format (CSC).\nOur approach achieves 2.26x better performance on average compared to cuSPARSE.\n\nTag: Tech Program Reg Pass\, Data Management\, GPUs\, Memory\, Networks\, Performance\, Software-defined networking\, Sparse Computation\n\nRegistration Category: Tech Program Reg Pass\, Data Management\, GPUs\, Memory\, Networks\, Performance\, Software-defined networking\, Sparse Computation
URL:https://sc19.supercomputing.org/presentation/?id=pap176&sess=sess154
END:VEVENT
END:VCALENDAR