Publication Details
Performance and Accuracy Analysis of Nonlinear k-Wave Simulations Using Local Domain Decomposition with an 8-GPU Server
k-Wave, Local domain decomposition, Fourier Basis, pseudospectral methods
Large-scale nonlinear ultrasound simulations using the open-source k-Wave toolbox are now routinely performed using the MPI version of k-Wave running on traditional CPU-based clusters. However, the all-to-all communications required by the 3D fast Fourier transform (FFT) severely impact performance when scaling to large numbers of compute cores. This can be overcome by using a domain decomposition strategy based on a local Fourier basis. In this work, we analyse the performance and accuracy of using local domain decomposition for running a high-intensity focused ultrasound (HIFU) simulation in the kidney on a single server containing eight NVIDIA P40 graphical processing units (GPUs). Different decompositions and overlap sizes are investigated and compared to a global MPI simulation running on a CPU-based supercomputer using 1280 cores. For a grid size of 960 × 960 × 1280 grid points and an overlap size of 4 grid points, the error in the simulation using local domain decomposition is on the order of 0.1% compared to the global simulation, which is sufficient for most applications. The financial cost for running the simulation is also reduced by more than an order of magnitude.
@article{BUT155074,
author="Bradley {Treeby} and Filip {Vaverka} and Jiří {Jaroš}",
title="Performance and Accuracy Analysis of Nonlinear k-Wave Simulations Using Local Domain Decomposition with an 8-GPU Server",
journal="Proceedings of Meetings on Acoustics",
year="2018",
volume="34",
number="1",
pages="1--5",
doi="10.1121/2.0000883",
issn="1939-800X",
url="https://asa.scitation.org/doi/10.1121/2.0000883"
}