This research proposes an entropy-coded sparse matrix format that compresses matrix indices with asymmetric numeral systems (ANS) coding and decompresses them on the GPU, aiming to increase effective memory bandwidth while maintaining computational throughput.
Key findings
Proposes a novel entropy-coded sparse matrix format for AMD and Intel GPU architectures.
Addresses the platform-specific characteristics of AMD's ROCm and Intel's oneAPI ecosystems.
Includes detailed format specification, compression/decompression algorithms, and kernel implementations.
Aims for significant memory bandwidth savings while maintaining computational throughput.
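The core idea, entropy coding a sparse matrix's index stream with ANS, can be illustrated with a minimal CPU-side sketch. Everything here is an assumption made for illustration, not the proposal's actual format: the function names are hypothetical, column indices are delta-coded per row before coding, the frequency table is built from the data itself, and the rANS state is an unbounded Python integer with no renormalization. A GPU kernel would instead use a fixed-width state and stream bits out, which this sketch does not attempt.

```python
from collections import Counter


def delta_encode_row(cols):
    """Turn one row's sorted column indices into gaps (first gap = first index)."""
    prev = 0
    gaps = []
    for c in cols:
        gaps.append(c - prev)
        prev = c
    return gaps


def build_table(symbols):
    """Return per-symbol frequency, cumulative frequency, and the total count."""
    freq = Counter(symbols)
    cum = {}
    total = 0
    for s in sorted(freq):
        cum[s] = total
        total += freq[s]
    return freq, cum, total


def rans_encode(symbols, freq, cum, total):
    """rANS encode into one big integer (illustrative: no renormalization)."""
    x = 1
    # Encode in reverse so the decoder emits symbols in forward order.
    for s in reversed(symbols):
        x = (x // freq[s]) * total + cum[s] + (x % freq[s])
    return x


def rans_decode(x, n, freq, cum, total):
    """Decode n symbols from the big-integer state produced by rans_encode."""
    out = []
    for _ in range(n):
        slot = x % total
        # Linear search for the symbol whose cumulative range contains the slot.
        s = next(t for t in cum if cum[t] <= slot < cum[t] + freq[t])
        x = freq[s] * (x // total) + slot - cum[s]
        out.append(s)
    return out


# Toy banded matrix: each row lists its sorted nonzero column indices.
row_cols = [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6]]
deltas = [g for row in row_cols for g in delta_encode_row(row)]

freq, cum, total = build_table(deltas)
state = rans_encode(deltas, freq, cum, total)
decoded = rans_decode(state, len(deltas), freq, cum, total)

print("round trip ok:", decoded == deltas)
print("encoded bits:", state.bit_length(), "vs raw 32-bit indices:", 32 * len(deltas))
```

The gaps in a banded or otherwise regular sparsity pattern are dominated by a few small values, which is what makes the index stream compressible. The frequency table would also need to be stored or agreed on by the decoder; that cost is ignored here.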
Limitations & open questions
Challenges include decompression overhead, thread divergence, and cross-platform portability.
The work is still at the proposal stage; experimental validation is pending.