C++ CUDA and Classes
Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA terms, link to the original address, and attribute it to the original authors (not me): StackOverflow.
Original address: http://stackoverflow.com/questions/6978643/
CUDA and Classes
Asked by secshunayt
I've searched all over for some insight on how exactly to use classes with CUDA, and while there is a general consensus that it can be done and apparently is being done by people, I've had a hard time finding out how to actually do it.
I have a class which implements a basic bitset with operator overloading and the like. I need to be able to instantiate objects of this class on both the host and the device, copy between the two, etc. Do I define this class in a .cu file? If so, how do I use it in my host-side C++ code? The functions of the class do not need to access special CUDA variables like threadId; they just need to be usable on both the host and the device side.
Thanks for any help, and if I'm approaching this in completely the wrong way, I'd love to hear alternatives.
Answered by harrism
Define the class in a header that you #include, just like in C++.
Any method that must be called from device code should be defined with both __device__ and __host__ declspecs, including the constructor and destructor if you plan to use new/delete on the device (note that new/delete require CUDA 4.0 and a compute capability 2.0 or higher GPU).
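For example, a minimal sketch of such a class (the class name and members are invented purely for illustration), together with a kernel that exercises device-side new/delete:

// Illustrative class: constructor, destructor and methods carry both
// qualifiers so they can run on the host and on the device.
class DeviceCounter {
public:
    __host__ __device__ DeviceCounter() : value(0) {}
    __host__ __device__ ~DeviceCounter() {}
    __host__ __device__ void increment() { ++value; }
private:
    int value;
};

// Device-side new/delete relies on the __device__ constructor and
// destructor above (and needs CUDA 4.0 with a compute capability 2.0+ GPU).
__global__ void useCounter()
{
    DeviceCounter* c = new DeviceCounter();
    c->increment();
    delete c;
}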
You probably want to define a macro like
#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif
Then use this macro on your member functions
class Foo {
public:
CUDA_CALLABLE_MEMBER Foo() {}
CUDA_CALLABLE_MEMBER ~Foo() {}
CUDA_CALLABLE_MEMBER void aMethod() {}
};
The reason for this is that only the CUDA compiler knows __device__ and __host__ -- your host C++ compiler will raise an error.
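To cover the host/device copy part of the question: as long as the class holds only plain data members (no virtual functions, no pointers into host-only memory), an instance can be copied back and forth with cudaMemcpy like any ordinary struct. A rough sketch in a .cu file, using the Foo class above; the kernel and main function are invented for illustration:

#include <cuda_runtime.h>

__global__ void useFoo(Foo* f)
{
    f->aMethod();   // runs the __device__ compilation of the member
}

int main()
{
    Foo host_foo;                 // constructed on the host
    host_foo.aMethod();           // the same member is usable host-side

    Foo* dev_foo = 0;
    cudaMalloc((void**)&dev_foo, sizeof(Foo));

    // A byte-wise copy works here because Foo holds no pointers or vtable.
    cudaMemcpy(dev_foo, &host_foo, sizeof(Foo), cudaMemcpyHostToDevice);

    useFoo<<<1, 1>>>(dev_foo);
    cudaDeviceSynchronize();

    // Copy the (possibly modified) object back to the host.
    cudaMemcpy(&host_foo, dev_foo, sizeof(Foo), cudaMemcpyDeviceToHost);
    cudaFree(dev_foo);
    return 0;
}

If the class owns device pointers or other resources, a plain byte copy is no longer enough and you have to perform a deep copy yourself.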
Edit:
Note that __CUDACC__ is defined by NVCC when it is compiling CUDA files. This can be either when compiling a .cu file with NVCC or when compiling any file with the command line option -x cu.
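To make this concrete, a brief sketch (the file names are invented for illustration), assuming the macro and class Foo above live in foo.h:

// kernels.cu -- built with "nvcc -c kernels.cu" (or, for another file
// extension, "nvcc -x cu -c file.cpp"). __CUDACC__ is defined here, so the
// members of Foo are compiled as __host__ __device__ and usable in kernels.
#include "foo.h"

__global__ void touchFoo(Foo* f)
{
    f->aMethod();
}

// main.cpp, by contrast, is built with the plain host compiler
// (e.g. "g++ -c main.cpp"); __CUDACC__ is not defined there, the macro
// expands to nothing, and Foo is consumed as an ordinary C++ class.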
Answered by t. fochtman
Another good resource for this question is the set of code examples that come with the CUDA toolkit. Within these code samples you can find examples of just about anything you could imagine. One that is pertinent to your question is the quadtree.cu file. Best of luck.