Publication detail
Accelerating Hybrid Local Domain Decomposition for the k-Wave Toolbox on Multi-GPU Systems
k-Wave, HPC, Hybrid decomposition, Local decomposition, CUDA, Multi-GPU
The k-Wave toolbox is designed for high-fidelity acoustic wave simulations using Fourier collocation for spatial derivatives, but its performance is constrained by communication overhead on multi-CPU and multi-GPU systems. We present a hybrid local domain decomposition approach that partitions the simulation domain into subdomains, each assigned to a GPU with configurable resolution. Using CUDA and cuFFT for Fourier transforms and NVLink for halo exchanges, our method minimizes inter-subdomain communication and accelerates multi-GPU performance. Testing on the Karolina supercomputer shows strong scalability and accuracy, especially with uniform-resolution subdomains, and proves effective even for large-scale simulations beyond single-GPU memory limits.
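For readers unfamiliar with the Fourier collocation approach mentioned in the abstract, the idea can be sketched in one dimension: transform the periodic samples, multiply each coefficient by i·k, and transform back. The snippet below is only an illustrative sketch, not the authors' implementation. The function names `dft` and `spectral_derivative` are hypothetical, the naive O(N²) transform stands in for an FFT library such as cuFFT, and the domain decomposition and halo exchange described in the abstract are not shown.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * j / n)
        # The 1/n normalisation is applied on the inverse transform only.
        out.append(acc / n if inverse else acc)
    return out


def spectral_derivative(samples, length):
    """Differentiate periodic samples on [0, length) via multiplication by i*k."""
    n = len(samples)
    spectrum = dft(samples)
    scaled = []
    for idx, coeff in enumerate(spectrum):
        # Map the DFT index to a signed integer mode number.
        m = idx if idx < (n + 1) // 2 else idx - n
        # For even n, zero the Nyquist mode so the derivative stays real.
        if n % 2 == 0 and idx == n // 2:
            m = 0
        k = 2 * math.pi * m / length  # physical wavenumber
        scaled.append(1j * k * coeff)
    return [z.real for z in dft(scaled, inverse=True)]


# Usage: differentiate sin(x) on [0, 2*pi) and compare with cos(x).
n = 16
length = 2 * math.pi
xs = [length * j / n for j in range(n)]
deriv = spectral_derivative([math.sin(x) for x in xs], length)
max_err = max(abs(d - math.cos(x)) for d, x in zip(deriv, xs))
print(max_err)  # expected to be at round-off level for this smooth periodic input
```

Because the transform couples every grid point with every other, a global Fourier method needs all-to-all communication when the grid is distributed. This is the overhead that a local decomposition with halo exchanges is meant to reduce.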
@misc{BUT193367,
author="Oliver {Kuník} and Jiří {Jaroš}",
title="Accelerating Hybrid Local Domain Decomposition for the k-Wave Toolbox on Multi-GPU Systems",
year="2024",
pages="1",
address="Ostrava",
url="https://www.fit.vut.cz/research/publication/13293/",
note="presentation, poster"
}