Requirements
- ComputeCpp v0.5.1 or greater (http://developer.codeplay.com)
- Python 2.7
- Ubuntu 16.04 LTS
- AMD R9 Nano / AMD FirePro GPU (AMDGPU-PRO driver)
Intro
This is a follow-up to the previous post, which described the setup on Ubuntu 14.04. The format is the same: “copy-paste” and “get-it-working”.
Dependencies
The assumption is that you have a vanilla 64-bit Ubuntu 16.04.3 LTS installed.
Misc
In some cases you will need to install curl:
$ sudo apt-get install curl linux-generic
Update 17Jan2018
Note: This might be relevant only to FirePro W8100 users.
It seems that drivers older than AMDGPU-PRO 17.50.511655 are not able to compile the DKMS module for the kernel installed by default on Ubuntu 16.04.3 LTS (4.13.0-26-generic).
To work around this, the kernel needs to be downgraded to 4.10.0-28-generic:
$ sudo apt-get remove linux-image-4.13.0-26-generic linux-headers-4.13.0-26
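Before removing anything, it may be worth checking which kernel is actually running. The snippet below is a minimal sketch: the helper name is_affected_kernel is made up for illustration, and it assumes the whole 4.13 series is affected, while the note above only names 4.13.0-26-generic.

```shell
# Returns 0 (true) when the given kernel release belongs to the 4.13 series.
# Assumption: the whole 4.13 series is affected; only 4.13.0-26-generic
# is named in the note above.
is_affected_kernel() {
    case "$1" in
        4.13.*) return 0 ;;
        *)      return 1 ;;
    esac
}

running="$(uname -r)"
if is_affected_kernel "$running"; then
    echo "Kernel $running: older AMDGPU-PRO drivers may fail to build the DKMS module"
else
    echo "Kernel $running: not in the 4.13 series"
fi
```

If the first message is printed, the downgrade above is probably needed for the older drivers.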
Java & Bazel
$ sudo apt-get install openjdk-8-jdk
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
$ sudo apt-get upgrade bazel
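A quick way to confirm that the tools landed on the PATH is to look them up with command -v. This is only a sketch: missing_tools is a hypothetical helper, and the assumption that the packages above provide the java and bazel commands is mine.

```shell
# Prints the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: the packages installed above provide "java" and "bazel".
missing="$(missing_tools java bazel curl)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "java, bazel and curl are all available"
fi
```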
OpenCL & SYCL
Note: To install only the OpenCL parts of the AMDGPU-PRO driver, pass --compute to ./amdgpu-pro-install.
$ sudo apt-get install ocl-icd-opencl-dev opencl-headers
$ wget --referer=http://support.amd.com https://www2.ati.com/drivers/linux/ubuntu/amdgpu-pro-17.30-465504.tar.xz
$ tar -xvf amdgpu-pro-17.30-465504.tar.xz
$ cd amdgpu-pro-17.30-465504
$ sudo ./amdgpu-pro-install #or sudo ./amdgpu-pro-install --compute
# The ComputeCpp archive comes from developer.codeplay.com (see Requirements)
$ tar -xvzf Ubuntu-16.04-64bit.tar.gz
$ sudo mkdir /usr/local/computecpp
# -r copies the bin/, lib/ and include/ directories; sudo is needed to write to /usr/local
$ cd Ubuntu-16.04-64bit && sudo cp -r * /usr/local/computecpp
$ sudo reboot
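After the reboot, a small check can confirm that the ComputeCpp files ended up where the later steps expect them. This is a sketch under assumptions: check_computecpp_layout is a name I made up, and it only looks for bin/computecpp_info, the one binary used later in this post.

```shell
# Returns 0 when DIR looks like an unpacked ComputeCpp toolkit.
# Only bin/computecpp_info is checked, since that is the binary
# used in the Verification section.
check_computecpp_layout() {
    dir="$1"
    if [ -x "$dir/bin/computecpp_info" ]; then
        echo "ok: $dir/bin/computecpp_info found"
        return 0
    fi
    echo "missing: $dir/bin/computecpp_info" >&2
    return 1
}

# Falls back to the install location used in this post.
check_computecpp_layout "${COMPUTECPP_TOOLKIT_PATH:-/usr/local/computecpp}" || true
```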
Update 13Jan2018
In some cases (e.g. after updating Ubuntu 16.04) you might need to use the latest AMDGPU-PRO (17.50.511655):
$ wget --referer=http://support.amd.com https://www2.ati.com/drivers/linux/ubuntu/amdgpu-pro-17.50-511655.tar.xz
$ tar -xvf amdgpu-pro-17.50-511655.tar.xz
$ cd amdgpu-pro-17.50-511655
$ ./amdgpu-pro-install --headless --opencl=legacy
Update 17Jan2018
The AMDGPU-PRO 17.50.511655 driver seems either to expose a bug in the TensorFlow SYCL implementation or to have a bug of its own: a graph whose nodes are allocated to different devices (GPU and CPU) fails to synchronize data from a GPU-placed node to the CPU.
Update 11Jul2018
AMDGPU-PRO 17.40-501128 is the latest driver that seems to be working with SYCL 0.9
Python
$ sudo apt-get install python-numpy python-dev python-wheel python-mock python-psutil python-pip
$ sudo pip install --upgrade pip
$ sudo pip install py-cpuinfo portpicker numpy
$ sudo pip install --upgrade scipy
Verification
Note: In some cases the user needs to be added to the video group:
$ sudo adduser $(whoami) video
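To see whether that step is needed at all, the current group membership can be inspected first. A minimal sketch, with has_group being a hypothetical helper name; keep in mind that a new group membership usually only shows up after logging in again.

```shell
# Returns 0 when GROUP appears in the space-separated GROUPS_LIST.
has_group() {
    group="$1"
    groups_list="$2"
    for g in $groups_list; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# id -nG prints the names of all groups of the current user.
if has_group video "$(id -nG)"; then
    echo "$(id -un) is already in the video group"
else
    echo "$(id -un) is not in the video group yet"
fi
```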
$ /opt/amdgpu-pro/bin/clinfo
Should return something similar to:
Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.0 AMD-APP (2442.7)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: AMD Radeon FirePro W8100
  ...
  Version: OpenCL 1.2 AMD-APP (2442.7)
  Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
There is cl_khr_spir – good sign!
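The same check can be scripted instead of scanning the extension list by eye. This is a sketch only: has_spir is a name I picked, and it simply searches whatever clinfo-style text it receives on standard input for the cl_khr_spir extension.

```shell
# Reads clinfo-style output on stdin; returns 0 when the
# cl_khr_spir extension is listed as a whole word.
has_spir() {
    grep -qw 'cl_khr_spir'
}

# Only runs when clinfo is present at the path used in this post.
if [ -x /opt/amdgpu-pro/bin/clinfo ]; then
    if /opt/amdgpu-pro/bin/clinfo | has_spir; then
        echo "cl_khr_spir is reported"
    else
        echo "cl_khr_spir is NOT reported"
    fi
fi
```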
Now:
1 | $ /usr/local/computecpp/bin/computecpp_info |
For ComputeCpp CE 0.5.0 you should see:
********************************************************************************

ComputeCpp Info (CE 0.5.0)

********************************************************************************

Toolchain information:

GLIBC version: 2.23
GLIBCXX: 20160609
This version of libstdc++ is supported.

********************************************************************************

Device Info:

Discovered 1 devices matching:
  platform :
  device type :

--------------------------------------------------------------------------------
Device 0:

  Device is supported : UNTESTED - Vendor not tested on this OS
  CL_DEVICE_NAME : Hawaii
  CL_DEVICE_VENDOR : Advanced Micro Devices, Inc.
  CL_DRIVER_VERSION : 2442.7
  CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU

If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v0.3.1/platform-support-notes
Set Up
The changesets are being upstreamed; for now, however, I recommend using my fork of TensorFlow.
$ export TF_NEED_OPENCL=1
$ export HOST_CXX_COMPILER=/usr/bin/g++
$ export HOST_C_COMPILER=/usr/bin/gcc
$ export COMPUTECPP_TOOLKIT_PATH=/usr/local/computecpp
$ git clone https://github.com/lukeiwanski/tensorflow.git
$ cd tensorflow
$ git checkout dev/eigen_mehdi
$ ./configure
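If ./configure does not seem to pick up the SYCL settings, it may help to confirm that the four variables above are set in the current shell. A minimal sketch; missing_build_vars is a hypothetical helper, not part of TensorFlow or ComputeCpp.

```shell
# Prints the name of each required build variable that is unset or empty.
# The variable names are the ones exported in the Set Up step.
missing_build_vars() {
    for name in TF_NEED_OPENCL HOST_CXX_COMPILER HOST_C_COMPILER COMPUTECPP_TOOLKIT_PATH; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing="$(missing_build_vars)"
if [ -n "$missing" ]; then
    echo "Set these before running ./configure:"
    echo "$missing"
else
    echo "All build variables are set"
fi
```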
At this point, press Enter through the configuration questions.
In order to run tests:
$ bazel test --test_timeout 300,450,1200,3600 -c opt --config=sycl --test_tag_filters=requires-gpu,-no_gpu,-no_oss,-oss_serial,-benchmark-test -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/contrib/distributions/... -//tensorflow/contrib/session_bundle/... -//tensorflow/go/... -//tensorflow/stream_executor/... -//tensorflow/core/distributed_runtime/... -//tensorflow/contrib/verbs/... -//tensorflow/contrib/xla_tf_graph/... -//tensorflow/java/... -//tensorflow/core/kernels/hexagon/...
Note that at this point some tests still fail; we are working on resolving them.
Also worth mentioning is performance, where improvements are still a work in progress.
$ bazel build -c opt --config=sycl tensorflow/core/kernels:matmul_op_test
$ ./bazel-bin/tensorflow/core/kernels/matmul_op_test --benchmarks=all
On an AMD Radeon FirePro W8100 this gives:
Further optimisation improvements are in the pipeline.