Configuring and verifying DPC++ (oneAPI) on NVIDIA GPUs

This article describes how to configure DPC++ (oneAPI) to run on NVIDIA GPUs and how to verify that the setup works. Corrections are welcome if anything is wrong or incomplete.

Prerequisites

1. Install the Intel® oneAPI Toolkits
https://software.intel.com/content/www/us/en/develop/documentation/installation-guide-for-intel-oneapi-toolkits-linux/top.html
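Download and install the Base Toolkit. Note which version you install, because the plugin chosen later must match it; prefer a recent release.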

2. Install the GPU driver and CUDA
https://developer.nvidia.com/cuda-downloads
CUDA 11.8 or later is recommended.
After installation, nvidia-smi should report a CUDA version.
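
The version check above can be scripted. The following is a minimal sketch, assuming the nvidia-smi header contains a field of the form "CUDA Version: X.Y"; the helper names cuda_version_from_smi and version_ge are hypothetical and not part of any NVIDIA or Intel tool.

```shell
#!/bin/sh
# Extract the "CUDA Version: X.Y" field from nvidia-smi output read on stdin.
# Assumption: the header line contains the literal text "CUDA Version: ".
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed if version $1 >= version $2, comparing major and minor numerically.
version_ge() {
  maj1=${1%%.*}; min1=${1#*.}
  maj2=${2%%.*}; min2=${2#*.}
  [ "$maj1" -gt "$maj2" ] || { [ "$maj1" -eq "$maj2" ] && [ "$min1" -ge "$min2" ]; }
}

# Usage: runs only when nvidia-smi is on PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
  ver=$(nvidia-smi | cuda_version_from_smi)
  if [ -n "$ver" ] && version_ge "$ver" 11.8; then
    echo "CUDA $ver: OK"
  else
    echo "CUDA version '$ver' is missing or older than 11.8"
  fi
fi
```

Note that the value printed by nvidia-smi is the highest CUDA version the installed driver supports; the toolkit version itself is reported by nvcc --version.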

Installing the plugin

1. Dependencies

Ubuntu

sudo apt update
sudo apt -y install cmake pkg-config build-essential

Red Hat and Fedora

sudo yum update
sudo yum -y install cmake pkgconfig
sudo yum groupinstall "Development Tools"

SUSE

sudo zypper update
sudo zypper --non-interactive install cmake pkg-config
sudo zypper --non-interactive install -t pattern devel_C_C++

Verify that the tools are installed:

which cmake pkg-config make gcc g++

Expected output:
/usr/bin/cmake
/usr/bin/pkg-config
/usr/bin/make
/usr/bin/gcc
/usr/bin/g++
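
As an alternative to reading the which output by eye, the check can report only what is missing. This is a small sketch using the POSIX command -v builtin; missing_tools is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Print every tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Check the same five tools as the which command above.
missing=$(missing_tools cmake pkg-config make gcc g++)
if [ -z "$missing" ]; then
  echo "All build dependencies found."
else
  echo "Missing:" $missing
fi
```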

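2. Download

https://developer.codeplay.com/products/oneapi/nvidia/download/
Choose the installer that matches your oneAPI and CUDA versions; if there is no exact match, pick a slightly lower version.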

Install

chmod +x oneapi-for-nvidia-gpus-2023.1.0-cuda-12.0-linux.sh
sh oneapi-for-nvidia-gpus-2023.1.0-cuda-12.0-linux.sh

Run setvars.sh from the location where oneAPI was installed earlier:

. /opt/intel/oneapi/setvars.sh --include-intel-llvm
or
. ~/intel/oneapi/setvars.sh --include-intel-llvm

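Add the CUDA paths to .bashrc (substitute your own CUDA path; on many Linux installations the libraries are in lib64 rather than lib)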

export PATH=/PATH_TO_CUDA_ROOT/bin:$PATH
export LD_LIBRARY_PATH=/PATH_TO_CUDA_ROOT/lib:$LD_LIBRARY_PATH
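
Because .bashrc is sourced by every new shell, the two export lines above can add the same directory repeatedly in nested shells, and an empty LD_LIBRARY_PATH leaves a trailing colon. The sketch below is one possible way to avoid both; prepend_once is a hypothetical helper and CUDA_ROOT is an assumed variable standing in for /PATH_TO_CUDA_ROOT.

```shell
#!/bin/sh
# Prepend directory $2 to the colon-separated list $1 unless already present.
# An empty list yields just the directory, with no trailing colon.
prepend_once() {
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;
    *)        printf '%s\n' "$2${1:+:$1}" ;;
  esac
}

# CUDA_ROOT defaults to the placeholder from the text; set it to your real path.
CUDA_ROOT=${CUDA_ROOT:-/PATH_TO_CUDA_ROOT}
PATH=$(prepend_once "$PATH" "$CUDA_ROOT/bin")
LD_LIBRARY_PATH=$(prepend_once "${LD_LIBRARY_PATH:-}" "$CUDA_ROOT/lib")
export PATH LD_LIBRARY_PATH
```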

List the available devices

sycl-ls

The local GPU should appear in the list, for example:
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, TITAN RTX 0.0 [CUDA 11.0]
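
The listing can also be checked from a script. The sketch below assumes that a CUDA GPU entry contains the substring "cuda:gpu", as in the example line above; has_cuda_device is a hypothetical helper name, and the exact sycl-ls format may differ between releases.

```shell
#!/bin/sh
# Succeed if the sycl-ls listing read on stdin contains a CUDA GPU entry.
# Assumption: such entries contain the substring "cuda:gpu".
has_cuda_device() {
  grep -q 'cuda:gpu'
}

# Usage: runs only when sycl-ls is on PATH (that is, after sourcing setvars.sh).
if command -v sycl-ls >/dev/null 2>&1; then
  if sycl-ls | has_cuda_device; then
    echo "CUDA backend is visible to SYCL."
  else
    echo "No CUDA device listed; re-check setvars.sh and the CUDA paths."
  fi
fi
```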

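Verification

Save the following program as simple-sycl-app.cpp, the file name used in the compile step: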

#include <iostream>
#include <sycl/sycl.hpp>

int main() {
  // Creating buffer of 4 ints to be used inside the kernel code
  sycl::buffer<sycl::cl_int, 1> Buffer(4);

  // Creating SYCL queue
  sycl::queue Queue;

  // Size of index space for kernel
  sycl::range<1> NumOfWorkItems{Buffer.size()};

  // Submitting command group(work) to queue
  Queue.submit([&](sycl::handler &cgh) {
    // Getting write only access to the buffer on a device
    auto Accessor = Buffer.get_access<sycl::access::mode::write>(cgh);
    // Executing kernel
    cgh.parallel_for<class FillBuffer>(
        NumOfWorkItems, [=](sycl::id<1> WIid) {
          // Fill buffer with indexes
          Accessor[WIid] = (sycl::cl_int)WIid.get(0);
        });
  });

  // Getting read only access to the buffer on the host.
  // Implicit barrier waiting for queue to complete the work.
  const auto HostAccessor = Buffer.get_access<sycl::access::mode::read>();

  // Check the results
  bool MismatchFound = false;
  for (size_t I = 0; I < Buffer.size(); ++I) {
    if (HostAccessor[I] != I) {
      std::cout << "The result is incorrect for element: " << I
                << " , expected: " << I << " , got: " << HostAccessor[I]
                << std::endl;
      MismatchFound = true;
    }
  }

  if (!MismatchFound) {
    std::cout << "The results are correct!" << std::endl;
  }

  return MismatchFound;
}

Compile

icpx -fsycl -fsycl-targets=nvptx64-nvidia-cuda simple-sycl-app.cpp -o simple-sycl-app

The following warning can be ignored:

icpx: warning: CUDA version is newer than the latest supported version 11.8 [-Wunknown-cuda-version]

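Run, restricting device selection to the CUDA backend and enabling plugin tracing (newer oneAPI releases deprecate SYCL_DEVICE_FILTER in favor of ONEAPI_DEVICE_SELECTOR):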

SYCL_DEVICE_FILTER=cuda SYCL_PI_TRACE=1 ./simple-sycl-app

Output

SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 11.15.1 ]
SYCL_PI_TRACE[all]: Selected device: -> final score = 1500
SYCL_PI_TRACE[all]:   platform: NVIDIA CUDA BACKEND
SYCL_PI_TRACE[all]:   device: NVIDIA GeForce RTX 2060
The results are correct!
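
To confirm the result without reading the trace by hand, the output can be checked for the two lines that matter. This is a minimal sketch, assuming the trace format shown above; ran_on_cuda is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed if the program output read on stdin shows both the CUDA platform
# being selected and the program's own success message.
ran_on_cuda() {
  out=$(cat)
  printf '%s\n' "$out" | grep -q 'platform: NVIDIA CUDA BACKEND' &&
    printf '%s\n' "$out" | grep -q 'The results are correct!'
}

# Usage: runs only if the compiled binary exists in the current directory.
if [ -x ./simple-sycl-app ]; then
  if SYCL_DEVICE_FILTER=cuda SYCL_PI_TRACE=1 ./simple-sycl-app 2>&1 | ran_on_cuda; then
    echo "Verified: ran on the CUDA backend."
  else
    echo "Verification failed."
  fi
fi
```

Checking the platform line as well as the success message matters because the program also prints "The results are correct!" when it runs on a CPU device.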