What are the pros and cons of GPGPU (general-purpose GPU) development?

Time: 2020-03-06 14:37:51  Source: igfitidea

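I would like to know what GPGPU development makes possible and, of course, which of its constraints are hard to live with.

What comes to my mind:

Key advantage: the raw power of these things
Key constraint: the memory model

What is your view?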

Solution

I found this article interesting: it argues that, as CPUs and their cores keep getting faster, GPUs will no longer be necessary.

http://arstechnica.com/articles/paedia/gpu-sweeney-interview.ars

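GPUs attracted interest in the past because of their parallel architecture and their excess silicon (which often sits idle), both of which could be put to use on the side for general-purpose programming tasks:

see http://en.wikipedia.org/wiki/CUDA

But in light of Lou's answer above, that may be less relevant now.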

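We have to be careful with Tim Sweeney's statements in the Ars interview. He is saying that having two separate platforms (the CPU and the GPU), one suited to single-threaded performance and one suited to throughput-oriented computing, will soon be a thing of the past, as our applications and hardware grow towards one another.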

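The GPU grew out of technological limitations of the CPU, which made arguably more natural algorithms such as ray tracing and photon mapping all but infeasible at reasonable resolutions and frame rates. In came the GPU, with a wildly different and restrictive programming model, but perhaps 2 or 3 orders of magnitude more throughput for applications painstakingly coded to that model. The two machine models had (and still have) essentially different coding styles, languages (OpenGL, DirectX and shader languages versus traditional desktop languages), and workflows. This makes code reuse, and even reuse of algorithms and programming skills, extremely difficult, and it hems any developer who wants to use a dense parallel compute substrate into this restrictive programming model.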
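To make the "different coding style" point a bit more concrete, here is a minimal sketch in plain Python (not real GPU code). It writes the same SAXPY computation twice: once as an ordinary loop, and once as a per-index "kernel" of the kind a GPU model forces on you. The names `saxpy_loop`, `saxpy_kernel` and `launch` are made up for illustration, and `launch` is only a sequential stand-in for a real device launch.

```python
def saxpy_loop(a, x, y):
    """CPU style: one thread of control walks the whole array."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def saxpy_kernel(i, a, x, y, out):
    """GPU style: the work for ONE index; it writes only its own out[i]."""
    if i < len(out):  # guard, since more threads than elements may be launched
        out[i] = a * x[i] + y[i]


def launch(kernel, n_threads, *args):
    """Hypothetical stand-in for a GPU launch; runs the 'threads' one by one."""
    for i in range(n_threads):
        kernel(i, *args)


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * len(x)
launch(saxpy_kernel, 4, 2.0, x, y, out)  # 4 "threads" for 3 elements
print(out == saxpy_loop(2.0, x, y))  # prints True
```

The kernel version has no loop and no return value, and no invocation may depend on another's result. Restructuring an algorithm into that shape is the "painstaking" part; the throughput comes from real hardware running those invocations side by side.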

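Finally, we are reaching a point where this dense compute substrate can be programmed much like a CPU. There is still a sharp performance gap between one "core" of these massively parallel accelerators and a modern x86 desktop core (and the threads of execution within an SM on the G80, for example, are not exactly cores in the traditional sense), but two factors are driving the convergence of the two platforms: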

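Intel and AMD are moving towards more, simpler cores on x86 chips, converging the hardware with the GPU, whose units are becoming more coarse-grained and more programmable over time.
This and other forces are spawning many new applications that can exploit data-level or thread-level parallelism (DLP / TLP) and so make effective use of this kind of substrate.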
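For readers unfamiliar with the DLP / TLP distinction, here is a rough sketch using only the Python standard library. The task functions (`brighten`, `update_ai`, `mix_sound`) are hypothetical placeholders, and Python threads are used here only to show the structure of the two kinds of parallelism, not to demonstrate a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(pixel):
    """One small operation, applied identically to every element."""
    return min(pixel + 40, 255)


def update_ai(positions):
    """Hypothetical game-AI task: move every agent one step to the right."""
    return [p + 1 for p in positions]


def mix_sound(samples):
    """Hypothetical audio task: average the samples into one level."""
    return sum(samples) / len(samples)


def data_level(pixels, workers=4):
    # DLP: the same function over many independent data elements.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten, pixels))


def thread_level(positions, samples):
    # TLP: different, unrelated tasks running side by side.
    with ThreadPoolExecutor(max_workers=2) as pool:
        ai = pool.submit(update_ai, positions)
        sound = pool.submit(mix_sound, samples)
        return ai.result(), sound.result()
```

GPUs have traditionally been built for the first pattern and multi-core CPUs for the second. The convergence described above amounts to one chip handling both reasonably well.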

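So what Tim is saying is that the two distinct platforms will converge, to an even greater extent than OpenCL, for instance, allows. A salient quote from the interview: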

TS: No, I see exactly where you're
  heading. In the next console
  generation you could have consoles
  consist of a single non-commodity
  chip. It could be a general processor,
  whether it evolved from a past CPU
  architecture or GPU architecture, and
  it could potentially run
  everything—the graphics, the AI,
  sound, and all these systems in an
  entirely homogeneous manner. That's a
  very interesting prospect, because it
  could dramatically simplify the
  toolset and the processes for creating
  software.
  
  Right now, in the course of shipping
  Unreal 3, we have to use multiple
  programming languages. We use one
  programming language for writing pixel
  shaders, another for writing gameplay
  code, and then on PlayStation 3 we use
  yet another compiler to write code to
  run on the Cell processor. So the
  PlayStation 3 ends up being a
  particular challenge, because there
  you have three completely different
  processors from different vendors with
  different instruction sets and
  different compilers and different
  performance techniques. So, a lot of
  the complexity is unnecessary and
  makes load-balancing more difficult.
  
  When you have, for example, three
  different chips with different
  programming capabilities, you often
  have two of those chips sitting idle
  for much of the time, while the other
  is maxed out. But if the architecture
  is completely uniform, then you can
  run any task on any part of the chip
  at any time, and get the best
  performance tradeoff that way.

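The key advantage is sheer raw power, on the gigaflop scale. The drawbacks include a limited, non-orthogonal instruction set and programming model.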

Here is a survey paper:
http://graphics.idav.ucdavis.edu/publications/print_pub?pub_id=907

The Wikipedia article is a good place to start.

Lou Franco pointed to an interview with Tim Sweeney. Here are the slides from a talk of his, which go into more detail:
http://www.scribd.com/doc/5687/The-Next-Mainstream-Programming-Language-A-Game-Developers-Perspective-by-Tim-Sweeney

You might also have a look around here:
http://gpgpu.org
