Python: Using Keras and Tensorflow on an AMD GPU
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me) at StackOverFlow, citing the original URL.
Original URL: http://stackoverflow.com/questions/37892784/
Using Keras & Tensorflow with AMD GPU
Asked by Nyxynyx
I'm starting to learn Keras, which I believe is a layer on top of Tensorflow and Theano. However, I only have access to AMD GPUs such as the AMD R9 280X.
How can I set up my Python environment so that I can use my AMD GPUs through Keras/Tensorflow support for OpenCL?
I'm running on OSX.
Answered by Hugh Perkins
I'm writing an OpenCL 1.2 backend for Tensorflow at https://github.com/hughperkins/tensorflow-cl
This fork of tensorflow for OpenCL has the following characteristics:
- it targets any/all OpenCL 1.2 devices. It doesn't need OpenCL 2.0, SPIR-V, or SPIR. It doesn't need Shared Virtual Memory. And so on ...
- it's based on an underlying library called 'cuda-on-cl', https://github.com/hughperkins/cuda-on-cl
- cuda-on-cl aims to take any NVIDIA® CUDA™ source code and compile it for OpenCL 1.2 devices. It's a very general goal, and a very general compiler
- for now, the following functionalities are implemented:
  - per-element operations, using Eigen over OpenCL (more info at https://bitbucket.org/hughperkins/eigen/src/eigen-cl/unsupported/test/cuda-on-cl/?at=eigen-cl)
  - blas / matrix-multiplication, using Cedric Nugteren's CLBlast https://github.com/cnugteren/CLBlast
  - reductions, argmin, argmax, again using Eigen, as per earlier info and links
  - learning, trainers, gradients. At least, the StochasticGradientDescent trainer is working, and the others are committed, but not yet tested
- it is developed on Ubuntu 16.04 (using Intel HD5500, and NVIDIA GPUs) and Mac Sierra (using Intel HD 530, and Radeon Pro 450)
This is not the only OpenCL fork of Tensorflow available. There is also a fork being developed by Codeplay (https://www.codeplay.com), using ComputeCpp (https://www.codeplay.com/products/computesuite/computecpp). As far as I know, their fork has stronger requirements than my own in terms of which specific GPU devices it works on. You would need to check the Platform Support Notes (at the bottom of the ComputeCpp page) to determine whether your device is supported. The Codeplay fork is actually an official Google fork, which is here: https://github.com/benoitsteiner/tensorflow-opencl
Answered by Thornhale
The original question on this post was: How to get Keras and Tensorflow to run with an AMD GPU.
The answer to this question is as follows:
1.) Keras will work if you can make Tensorflow work correctly (optionally within your virtual/conda environment).
2.) To get Tensorflow to work on an AMD GPU, as others have stated, one option is to compile Tensorflow with OpenCL support. To do so, read the link below. For brevity, I will summarize the required steps here:
You will need AMD's proprietary drivers. These are currently only available on Ubuntu 14.04 (the version before Ubuntu decided to change the way the UI is rendered). At the time of writing, support for Ubuntu 16.04 is limited to a few GPUs through AMDProDrivers. Readers who want to do deep learning on AMD GPUs should be aware of this!
Compiling Tensorflow with OpenCL support also requires you to obtain and install the following prerequisites: OpenCL headers, ComputeCpp.
After the prerequisites are fulfilled, configure your build. Note that there are 3 options for compiling Tensorflow: standard Tensorflow (stable), Benoit Steiner's Tensorflow-opencl (developmental), and Luke Iwanski's Tensorflow-opencl (highly experimental), all of which you can pull from GitHub. Also note that if you build from either of the opencl versions, the configure script will not ask whether to use OpenCL, because it assumes that you are. Conversely, if you configure from standard Tensorflow, you will need to answer "Yes" when the configure script asks whether to use OpenCL and "No" for CUDA.
Then run tests like so:
$ bazel test --config=sycl -k --test_timeout 1600 -- //tensorflow/... -//tensorflow/contrib/... -//tensorflow/java/... -//tensorflow/compiler/...
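The exclusion syntax in that command is easy to mistype, so here is a small sketch that assembles the same invocation as an argument list. The helper name `bazel_test_argv` is hypothetical; the flags and target patterns are the ones quoted above. In Bazel, a target pattern prefixed with `-` is subtracted from the set being tested, and the bare `--` keeps those patterns from being parsed as options.

```python
import shlex

# Target patterns quoted in the answer: test everything under //tensorflow,
# minus the contrib, java and compiler subtrees.
INCLUDED = ["//tensorflow/..."]
EXCLUDED = [
    "//tensorflow/contrib/...",
    "//tensorflow/java/...",
    "//tensorflow/compiler/...",
]


def bazel_test_argv(timeout_s=1600):
    """Build the argv for the SYCL test run (hypothetical helper)."""
    argv = ["bazel", "test", "--config=sycl", "-k",
            "--test_timeout", str(timeout_s), "--"]
    argv += INCLUDED
    # A leading "-" marks a negative pattern, so it must come after "--".
    argv += ["-" + pattern for pattern in EXCLUDED]
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable command line; pass the list to subprocess.run to execute it.
    print(" ".join(shlex.quote(arg) for arg in bazel_test_argv()))
```

Building the list in code makes it harder for a stray space to split a pattern such as `-//tensorflow/compiler/...` into two arguments.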
Update: Doing this on my setup takes exceedingly long. The part that takes long is running all the tests. I am not sure what this means, but a lot of my tests are timing out at 1600 seconds. The duration can probably be shortened at the expense of more tests timing out. Alternatively, you can just build Tensorflow without tests. At the time of this writing, running the tests has already taken 2 days.
Or just build the pip package like so:
bazel build --local_resources 2048,.5,1.0 -c opt --config=sycl //tensorflow/tools/pip_package:build_pip_package
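The `--local_resources` value is terse. As far as I know, the Bazel releases of that period read it as three comma-separated numbers: available RAM in MB, CPU cores, and an I/O factor where 1.0 is an average workstation. A minimal sketch that decodes the value used above, under that assumption (the function name is hypothetical):

```python
def parse_local_resources(value):
    """Split a Bazel --local_resources value into its three parts.

    Assumed layout (older Bazel releases): RAM in MB, CPU cores, I/O factor.
    """
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError("expected 'ram_mb,cpu_cores,io', got %r" % value)
    ram_mb, cpu_cores, io = (float(p) for p in parts)
    return {"ram_mb": ram_mb, "cpu_cores": cpu_cores, "io": io}


limits = parse_local_resources("2048,.5,1.0")
print(limits)
```

Read this way, the build above is capped at about 2 GB of RAM and half a core, which trades build time for lower memory pressure on a small machine.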
Please actually read the blog post over at Codeplay: Luke Iwanski posted a comprehensive tutorial on how to get Tensorflow to work with OpenCL on March 30th, 2017, so it is very recent. It also covers some details which I did not write about here.
As the many posts above indicate, small bits of information are spread throughout the interwebs. The value of Luke's post is that it puts all the information in one place, which should make setting up Tensorflow and OpenCL a bit less daunting. I will only provide a link here:
https://www.codeplay.com/portal/03-30-17-setting-up-tensorflow-with-opencl-using-sycl
A slightly more complete walk-through has been posted here:
http://deep-beta.co.uk/setting-up-tensorflow-with-opencl-using-sycl/
It differs mainly by explicitly telling the user that he/she needs to:
- create symlinks to a subfolder
- and then actually install tensorflow via the "python setup.py develop" command.
Note an alternative approach was mentioned above using tensorflow-cl:
https://github.com/hughperkins/tensorflow-cl
I am unable to discern which approach is better at this time, though tensorflow-cl appears to be less active: fewer issues are posted, and fewer conversations to resolve those issues are happening. There was a major push last year. Activity has ebbed since November 2016, although Hugh seems to have pushed some updates a few days before this post was written. (Update: According to the project README, this version of Tensorflow now relies only on community support, as the main developer is busy with life.)
UPDATE (2017-04-25): I have some notes based on testing tensorflow-opencl below.
- Future users of this package should note that using opencl shifts all the heavy lifting in terms of computing to the GPU. I mention this because I had assumed that the compute workload would be shared between my CPU and iGPU. This means that the power of your GPU is very important (specifically, bandwidth and available VRAM).
Following are some numbers for calculating 1 epoch using the CIFAR10 data set for MY SETUP (A10-7850 with iGPU). Your mileage will almost certainly vary!
- Tensorflow (via pip install): ~ 1700 s/epoch
- Tensorflow (w/ SSE + AVX): ~ 1100 s/epoch
- Tensorflow (w/ opencl & iGPU): ~ 5800 s/epoch
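To put those figures side by side, this short calculation expresses each configuration relative to the stock pip build. The numbers are the ones reported above for a single machine; the variable names are only for illustration.

```python
# Seconds per CIFAR10 epoch, as reported above for the A10-7850 setup.
seconds_per_epoch = {
    "pip install": 1700,
    "SSE + AVX build": 1100,
    "opencl + iGPU": 5800,
}

baseline = seconds_per_epoch["pip install"]

# Speed relative to the baseline: > 1 is faster, < 1 is slower.
relative_speed = {
    name: baseline / seconds for name, seconds in seconds_per_epoch.items()
}

for name, ratio in relative_speed.items():
    print("%-16s %.2fx the speed of the pip build" % (name, ratio))
```

On these numbers the SSE/AVX build is about 1.5 times faster than the pip build, while the OpenCL run on the iGPU is roughly 3.4 times slower.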
You can see that in this particular case performance is worse. I attribute this to the following factors:
- The iGPU only has 1GB. This leads to a lot of copying back and forth between CPU and GPU. (OpenCL 1.2 does not yet have the ability to pass data via pointers; instead, data has to be copied back and forth.)
- The iGPU only has 512 stream processors (and 32 Gb/s memory bandwidth), which in this case is slower than 4 CPUs using SSE4 + AVX instruction sets.
- The development of tensorflow-opencl is in its beginning stages, and a lot of optimizations in SYCL etc. have not been done yet.
If you are using an AMD GPU with more VRAM and more stream processors, you are certain to get much better performance numbers. I would be interested to read what numbers people are achieving to know what's possible.
I will continue to maintain this answer if/when updates get pushed.
3.) An alternative route currently being hinted at is AMD's ROCm initiative and the MIOpen library (a cuDNN equivalent). These are/will be open-source libraries that enable deep learning. The caveat is that ROCm support currently only exists for Linux, and that MIOpen has not been released to the wild yet, but Raja (AMD GPU head) has said in an AMA that, using the above, it should be possible to do deep learning on AMD GPUs. In fact, support is planned not only for Tensorflow, but also for Caffe2, Caffe, Torch7 and MXNet.
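Whichever route is taken, point 1 above means the first thing to verify is that Tensorflow itself can see the device. A minimal sketch, assuming a Tensorflow build that exposes `tensorflow.python.client.device_lib` (the 1.x builds discussed here do); the function name is hypothetical, and how an OpenCL/SYCL device is labelled depends on the build:

```python
def list_tensorflow_devices():
    """Return (name, device_type) pairs for every device Tensorflow reports.

    Returns an empty list if Tensorflow is not importable in this environment.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [(d.name, d.device_type) for d in device_lib.list_local_devices()]


if __name__ == "__main__":
    devices = list_tensorflow_devices()
    if not devices:
        print("Tensorflow is not installed, or it reported no devices")
    for name, device_type in devices:
        print(device_type, name)
```

If only a CPU entry shows up, Keras will still run, but everything will be computed on the CPU.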
Answered by Talha Junaid
One can use an AMD GPU via the PlaidML Keras backend.
Fastest: PlaidML is often 10x faster (or more) than popular platforms (like TensorFlow CPU) because it supports all GPUs, independent of make and model. PlaidML accelerates deep learning on AMD, Intel, NVIDIA, ARM, and embedded GPUs.
Easiest: PlaidML is simple to install and supports multiple frontends (Keras and ONNX currently)
Free: PlaidML is completely open source and doesn't rely on any vendor libraries with proprietary and restrictive licenses.
For most platforms, getting started with accelerated deep learning is as easy as running a few commands (assuming you have Python (v2 or v3) installed):
virtualenv plaidml
source plaidml/bin/activate
pip install plaidml-keras plaidbench
Choose which accelerator you'd like to use (many computers, especially laptops, have multiple):
plaidml-setup
Next, try benchmarking MobileNet inference performance:
plaidbench keras mobilenet
Or, try training MobileNet:
plaidbench --batch-size 16 keras --train mobilenet
To use it with Keras, set the backend before importing keras:
import os
os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
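Multi-backend Keras reads `KERAS_BACKEND` when the `keras` package is first imported, so the assignment has to run before that import. A small sketch of a guard that makes the ordering explicit (the helper `select_keras_backend` is hypothetical, not part of Keras or PlaidML):

```python
import os
import sys


def select_keras_backend(name="plaidml.keras.backend"):
    """Set KERAS_BACKEND, refusing if keras was already imported."""
    if "keras" in sys.modules:
        # Too late: keras picked its backend at import time.
        raise RuntimeError("set the backend before importing keras")
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend()
# import keras  # would now load the PlaidML backend, if plaidml-keras is installed
```

Calling the helper at the very top of a script or notebook avoids the silent failure mode where Keras has already loaded its default backend.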
For more information, see:
https://github.com/plaidml/plaidml
https://github.com/rstudio/keras/issues/205#issuecomment-348336284
Answered by Selly
This is an old question, but since I spent the last few weeks trying to figure it out on my own:
- OpenCL support for Theano is hit and miss. They added a libgpuarray back-end which appears to still be buggy (i.e., the process runs on the GPU but the answer is wrong--like 8% accuracy on MNIST for a DL model that gets ~95+% accuracy on CPU or nVidia CUDA). Also, because ~50-80% of the performance boost on the nVidia stack now comes from the CUDNN libraries, OpenCL will just be left in the dust. (SEE BELOW!) :)
- ROCM appears to be very cool, but the documentation (and even a clear declaration of what ROCM is/what it does) is hard to understand. They're doing their best, but they're 4+ years behind. It does NOT NOT NOT work on an RX550 (as of this writing). So don't waste your time (this is where 1 of the weeks went :) ). At first, it appears ROCM is a new addition to the driver set (replacing AMDGPU-Pro, or augmenting it), but it is in fact a kernel module and set of libraries that essentially replace AMDGPU-Pro. (Think of this as roughly the equivalent of the Nvidia-381 driver + CUDA + some libraries.) https://rocm.github.io/dl.html (Honestly, I still haven't tested the performance or tried to get it to work with more recent Mesa drivers yet. I will do that sometime.)
- Add MiOpen to ROCM, and that is essentially CUDNN. They also have some pretty clear guides for migrating. But better yet:
- They created "HIP", which is an automagical translator from CUDA/CUDNN to MiOpen. It seems to work pretty well since they lined the APIs up directly to be translatable. There are concepts that aren't perfect maps, but in general it looks good.
Now, finally, after 3-4 weeks of trying to figure out OpenCL, etc, I found this tutorial to help you get started quickly. It is a step-by-step for getting hipCaffe up and running. Unlike nVidia though, please ensure you have supported hardware!!!! https://rocm.github.io/hardware.html. Think you can get it working without their supported hardware? Good luck. You've been warned. Once you have ROCM up and running (AND RUN THE VERIFICATION TESTS), here is the hipCaffe tutorial--if you got ROCM up you'll be doing an MNIST validation test within 10 minutes--sweet! https://rocm.github.io/ROCmHipCaffeQuickstart.html
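Before following the tutorial, it can help to see which GPU architecture ROCm actually detects. The sketch below shells out to `rocminfo`, the agent-listing tool that ships with ROCm, and pulls out the agent names. The output layout it parses (lines of the form `Name: gfx803`) and the `/opt/rocm/bin` fallback path are assumptions based on typical installs; both helper names are hypothetical.

```python
import shutil
import subprocess


def parse_agent_names(rocminfo_output):
    """Extract agent names from rocminfo text (assumed 'Name: <value>' lines)."""
    names = []
    for line in rocminfo_output.splitlines():
        stripped = line.strip()
        # Skips 'Marketing Name:' and 'Vendor Name:' lines, which start differently.
        if stripped.startswith("Name:"):
            names.append(stripped.split(":", 1)[1].strip())
    return names


def detect_rocm_agents(rocminfo_path=None):
    """Run rocminfo and return agent names, or None if it cannot be run."""
    exe = rocminfo_path or shutil.which("rocminfo") or "/opt/rocm/bin/rocminfo"
    try:
        result = subprocess.run([exe], stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, check=True,
                                universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_agent_names(result.stdout)


if __name__ == "__main__":
    agents = detect_rocm_agents()
    print("rocminfo unavailable" if agents is None else agents)
```

The reported names can then be compared by hand against the supported hardware page linked above.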
Answered by nemo
Theano does have support for OpenCL, but it is still in its early stages. Theano itself is not interested in OpenCL and relies on community support.
Most of the operations are already implemented, and it is mostly a matter of tuning and optimizing the given operations.
To use the OpenCL backend you have to build libgpuarray yourself.
From personal experience I can tell you that you will get CPU-level performance if you are lucky. The memory allocation seems to be very naively implemented (so computation will be slow), and it will crash when it runs out of memory. But I encourage you to try it, and maybe even optimize the code or help by reporting bugs.
Answered by Kruft Industries
If you have access to other AMD GPUs, please see here: https://github.com/ROCmSoftwarePlatform/hiptensorflow/tree/hip/rocm_docs
This should get you going in the right direction for Tensorflow on the ROCm platform, but the hardware list Selly's post links to, https://rocm.github.io/hardware.html, is the catch with this route. That page is not an exhaustive list: I found out on my own that the Xeon E5 v2 Ivy Bridge works fine with ROCm even though they list v3 or newer. Graphics cards, however, are a bit more picky: gfx8 or newer with a few small exceptions, Polaris, and maybe others as time goes on.
UPDATE - It looks like hiptensorflow has an option for opencl support during configure. I would say investigate the link even if you don't have a gfx8+ or Polaris GPU, in case the opencl implementation works. It is a long-winded process, but an hour or three (depending on hardware) following well-written instructions isn't too much to lose to find out.
Answered by user1917768
Tensorflow 1.3 is now supported on the AMD ROCm stack:
A pre-built docker image has also been posted publicly: