May 3, 2021 by kittycool only

Analyzing Memory and Threading Correctness for GPU-Offloaded Code

Modern workloads are diverse, and so are architectures. No single architecture is best for every workload. Maximizing performance takes a mix of scalar, vector, matrix, and spatial architectures deployed in CPUs, GPUs, FPGAs, and future accelerators. This heterogeneity adds complexity that can be difficult to debug. This article introduces the new features of Intel® Inspector that support the analysis of code that is offloaded to accelerators.

For more information: Analyzing Memory and Threading Correctness for GPU-Offloaded Code
