C++ CUDA and Classes
Disclaimer: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA terms, link to the original address, and attribute it to the original authors (not me): StackOverflow.
Original address: http://stackoverflow.com/questions/6978643/
CUDA and Classes
Asked by secshunayt
I've searched all over for some insight on how exactly to use classes with CUDA, and while there is a general consensus that it can be done and apparently is being done by people, I've had a hard time finding out how to actually do it.
I have a class which implements a basic bitset with operator overloading and the like. I need to be able to instantiate objects of this class on both the host and the device, copy between the two, etc. Do I define this class in a .cu file? If so, how do I use it in my host-side C++ code? The functions of the class do not need to access special CUDA variables like threadId; they just need to be usable on both the host and the device side.
Thanks for any help, and if I'm approaching this in completely the wrong way, I'd love to hear alternatives.
Answered by harrism
Define the class in a header that you #include, just like in C++.
Any method that must be called from device code should be defined with both __device__ and __host__ declspecs, including the constructor and destructor if you plan to use new/delete on the device (note that new/delete require CUDA 4.0 and a compute capability 2.0 or higher GPU).
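For example, a minimal sketch of such a class (the class name and members are invented purely for illustration), together with a kernel that exercises device-side new/delete:

// Illustrative class: constructor, destructor and methods carry both
// qualifiers so they can run on the host and on the device.
class DeviceCounter {
public:
    __host__ __device__ DeviceCounter() : value(0) {}
    __host__ __device__ ~DeviceCounter() {}
    __host__ __device__ void increment() { ++value; }
private:
    int value;
};

// Device-side new/delete relies on the __device__ constructor and
// destructor above (and needs CUDA 4.0 with a compute capability 2.0+ GPU).
__global__ void useCounter()
{
    DeviceCounter* c = new DeviceCounter();
    c->increment();
    delete c;
}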
You probably want to define a macro like
#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif
Then use this macro on your member functions
class Foo {
public:
CUDA_CALLABLE_MEMBER Foo() {}
CUDA_CALLABLE_MEMBER ~Foo() {}
CUDA_CALLABLE_MEMBER void aMethod() {}
};
The reason for this is that only the CUDA compiler knows __device__ and __host__ -- your host C++ compiler will raise an error.
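To cover the host/device copy part of the question: as long as the class holds only plain data members (no virtual functions, no pointers into host-only memory), an instance can be copied back and forth with cudaMemcpy like any ordinary struct. A rough sketch in a .cu file, using the Foo class above; the kernel and main function are invented for illustration:

#include <cuda_runtime.h>

__global__ void useFoo(Foo* f)
{
    f->aMethod();   // runs the __device__ compilation of the member
}

int main()
{
    Foo host_foo;                 // constructed on the host
    host_foo.aMethod();           // the same member is usable host-side

    Foo* dev_foo = 0;
    cudaMalloc((void**)&dev_foo, sizeof(Foo));

    // A byte-wise copy works here because Foo holds no pointers or vtable.
    cudaMemcpy(dev_foo, &host_foo, sizeof(Foo), cudaMemcpyHostToDevice);

    useFoo<<<1, 1>>>(dev_foo);
    cudaDeviceSynchronize();

    // Copy the (possibly modified) object back to the host.
    cudaMemcpy(&host_foo, dev_foo, sizeof(Foo), cudaMemcpyDeviceToHost);
    cudaFree(dev_foo);
    return 0;
}

If the class owns device pointers or other resources, a plain byte copy is no longer enough and you have to perform a deep copy yourself.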
Edit:
Note that __CUDACC__ is defined by NVCC when it is compiling CUDA files. This can be either when compiling a .cu file with NVCC or when compiling any file with the command line option -x cu.
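To make this concrete, a brief sketch (the file names are invented for illustration), assuming the macro and class Foo above live in foo.h:

// kernels.cu -- built with "nvcc -c kernels.cu" (or, for another file
// extension, "nvcc -x cu -c file.cpp"). __CUDACC__ is defined here, so the
// members of Foo are compiled as __host__ __device__ and usable in kernels.
#include "foo.h"

__global__ void touchFoo(Foo* f)
{
    f->aMethod();
}

// main.cpp, by contrast, is built with the plain host compiler
// (e.g. "g++ -c main.cpp"); __CUDACC__ is not defined there, the macro
// expands to nothing, and Foo is consumed as an ordinary C++ class.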
Answered by t. fochtman
Another good resource for this question is the set of code examples that come with the CUDA toolkit. Within these code samples you can find examples of just about anything you could imagine. One that is pertinent to your question is the quadtree.cu file. Best of luck.