Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original source, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/2619296/
How to return a single variable from a CUDA kernel function?
Asked by Pouya BCD
I have a CUDA search function that calculates a single variable. How can I return it to the host?
__global__
void G_SearchByNameID(node* Node, long nodeCount, long start,char* dest, long answer){
answer = 2;
}
cudaMemcpy(h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
cudaFree(d_answer);
For both of these lines I get this error: error: argument of type "long" is incompatible with parameter of type "const void *"
Answered by wich
I've been using __device__ variables for this purpose. That way you don't have to bother with cudaMalloc and cudaFree, and you don't have to pass a pointer as a kernel argument, which saves a register in your kernel to boot.
#include <stdio.h>

__device__ long d_answer;

__global__ void G_SearchByNameID() {
    d_answer = 2;
}

int main() {
    G_SearchByNameID<<<1,1>>>();
    long answer;
    // Pass the symbol itself; CUDA 5.0 and later no longer accept its name as a string.
    cudaMemcpyFromSymbol(&answer, d_answer, sizeof(answer), 0, cudaMemcpyDeviceToHost);
    printf("answer: %ld\n", answer);
    return 0;
}
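The same __device__ variable idea can be sketched with return-code checks and a reset of the variable before the launch, so that a stale value is not mistaken for a fresh result. This is only an illustration under assumptions: the names d_result, findAnswer and readResult are hypothetical and do not come from the answer, and the sentinel value -1 is an arbitrary choice.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical device variable that holds the kernel's single result.
__device__ long d_result;

// Hypothetical kernel: stores a placeholder result, as in the answer above.
__global__ void findAnswer() {
    d_result = 2;
}

// Hypothetical helper: returns 0 on success and stores the result in *out.
int readResult(long* out) {
    long sentinel = -1;  // assumed "no result yet" marker
    // Reset the device variable from the host before launching.
    if (cudaMemcpyToSymbol(d_result, &sentinel, sizeof(sentinel)) != cudaSuccess) return 1;
    findAnswer<<<1, 1>>>();
    // Catch launch errors such as an invalid configuration.
    if (cudaGetLastError() != cudaSuccess) return 1;
    // Copy back from the symbol itself, not from its name as a string.
    if (cudaMemcpyFromSymbol(out, d_result, sizeof(*out)) != cudaSuccess) return 1;
    return 0;
}

int main() {
    long answer;
    if (readResult(&answer) != 0) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    printf("answer: %ld\n", answer);
    return 0;
}
```

Wrapping the launch and the copy in one host function keeps the device variable an implementation detail, so callers only see a plain long.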
Answered by fabrizioM
To get a single result you have to cudaMemcpy it back to the host, i.e.:
#include <assert.h>

__global__ void g_singleAnswer(long* answer) { *answer = 2; }

int main() {
    long h_answer;
    long* d_answer;
    cudaMalloc(&d_answer, sizeof(long));
    g_singleAnswer<<<1,1>>>(d_answer);
    cudaMemcpy(&h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
    cudaFree(d_answer);
    assert(h_answer == 2);
    return 0;
}
I guess the error comes from passing a long value instead of a pointer to a long value.
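Applied to the kernel from the question, that fix might look like the sketch below. The parameter list is kept as in the question except that answer becomes a long*. The definition of node is not shown in the question, so the struct here is a hypothetical stand-in, and searchOnDevice is a hypothetical host helper that passes empty inputs purely for illustration.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical stand-in: the question does not show the real definition of node.
struct node { long id; };

// Same parameters as in the question, except answer now points to device memory.
__global__ void G_SearchByNameID(node* Node, long nodeCount, long start, char* dest, long* answer) {
    *answer = 2;  // placeholder result, as in the question
}

// Hypothetical helper: launches the kernel and returns the result, or -1 on a CUDA error.
long searchOnDevice() {
    long h_answer = -1;
    long* d_answer = NULL;
    if (cudaMalloc(&d_answer, sizeof(long)) != cudaSuccess) return -1;
    // Empty inputs: this placeholder kernel never reads them.
    G_SearchByNameID<<<1, 1>>>(NULL, 0, 0, NULL, d_answer);
    // The copy is issued on the default stream, so it is ordered after the kernel.
    cudaError_t err = cudaMemcpy(&h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
    cudaFree(d_answer);
    return err == cudaSuccess ? h_answer : -1;
}

int main() {
    printf("answer: %ld\n", searchOnDevice());
    return 0;
}
```

Both cudaMemcpy and cudaFree take pointers, which is why passing a plain long to them produces the "incompatible with parameter of type const void *" error quoted in the question.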

