TensorFlow 1.x On Ubuntu 16.04 LTS


  • ComputeCpp v0.5.1 or greater (http://developer.codeplay.com)
  • Python 2.7
  • Ubuntu 16.04 LTS
  • AMD R9 Nano / AMD FirePro GPU (AMDGPU-PRO driver)


This is a follow-up to the previous post, which described the setup on Ubuntu 14.04. The format is going to be the same “copy-paste” and “get-it-working” style.


The assumption is that you have a vanilla 64-bit Ubuntu 16.04.3 LTS installed.


In some cases you will need to install curl (the command below also installs linux-generic):

$ sudo apt-get install curl linux-generic

Update 17Jan2018
Note: This might be relevant only to FirePro W8100 users.
It seems that drivers older than AMDGPU-PRO 17.50.511655 are not able to compile the DKMS module for the kernel installed by default on Ubuntu 16.04.3 LTS (4.13.0-26-generic).

To work around this, the kernel needs to be downgraded to 4.10.0-28-generic:

$ sudo apt-get remove linux-image-4.13.0-26-generic linux-headers-4.13.0-26

Java & Bazel

$ sudo apt-get install openjdk-8-jdk
$ echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ sudo apt-get update && sudo apt-get install bazel
$ sudo apt-get upgrade bazel


Note: To install only the OpenCL parts of the AMDGPU-PRO driver, pass --compute to ./amdgpu-pro-install

$ sudo apt-get install ocl-icd-opencl-dev opencl-headers
$ wget --referer=http://support.amd.com https://www2.ati.com/drivers/linux/ubuntu/amdgpu-pro-17.30-465504.tar.xz
$ tar -xvf amdgpu-pro-17.30-465504.tar.xz
$ cd amdgpu-pro-17.30-465504
$ sudo ./amdgpu-pro-install #or sudo ./amdgpu-pro-install --compute
$ tar -xvzf Ubuntu-16.04-64bit.tar.gz
$ sudo mkdir /usr/local/computecpp
$ cd Ubuntu-16.04-64bit && sudo cp -r * /usr/local/computecpp
$ sudo reboot

Update: 13Jan2018
In some cases (e.g. after an update of Ubuntu 16.04) you might need to use the latest AMDGPU-PRO (17.50.511655):

$ wget --referer=http://support.amd.com https://www2.ati.com/drivers/linux/ubuntu/amdgpu-pro-17.50-511655.tar.xz
$ tar -xvf amdgpu-pro-17.50-511655.tar.xz
$ cd amdgpu-pro-17.50-511655
$ ./amdgpu-pro-install --headless --opencl=legacy

Update 17Jan2018
The AMDGPU-PRO 17.50.511655 drivers seem to either expose a bug in the TensorFlow SYCL implementation or have a bug of their own: a graph whose nodes are allocated to different devices (GPU and CPU) fails to synchronize data from a GPU-placed node to the CPU.

Update 11Jul2018
AMDGPU-PRO 17.40-501128 is the latest driver that seems to be working with SYCL 0.9


$ sudo apt-get install python-numpy python-dev python-wheel python-mock python-psutil python-pip
$ sudo pip install --upgrade pip
$ sudo pip install py-cpuinfo portpicker numpy
$ sudo pip install --upgrade scipy


Note: In some cases the user needs to be added to the video group via:

$ sudo adduser $(whoami) video
$ /opt/amdgpu-pro/bin/clinfo

The clinfo call should return something similar to:

Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.0 AMD-APP (2442.7)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 

  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 AMD Radeon FirePro W8100
 Version:					 OpenCL 1.2 AMD-APP (2442.7)
  Extensions:					 cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event 

There is cl_khr_spir – good sign!
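
The same check can be scripted. What follows is a minimal Python sketch, assuming the clinfo output has already been captured as text. The helper name has_extension is made up for illustration, and the parsing relies only on the "Extensions:" lines shown in the output above.

```python
def has_extension(clinfo_text, extension="cl_khr_spir"):
    """Return True if any 'Extensions:' line of clinfo output lists the extension."""
    for line in clinfo_text.splitlines():
        label, sep, value = line.partition(":")
        # Matches both "Platform Extensions:" and device-level "Extensions:" lines.
        if sep and label.strip().endswith("Extensions"):
            if extension in value.split():
                return True
    return False


# Shortened sample based on the output above (assumption: real output has the same shape).
sample = """\
  Platform Extensions:    cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
  Board name:             AMD Radeon FirePro W8100
  Extensions:             cl_khr_fp64 cl_khr_spir cl_khr_gl_event
"""
print(has_extension(sample))
```

To run it against the live driver, the text could come from running /opt/amdgpu-pro/bin/clinfo through the subprocess module and decoding the result.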


$ /usr/local/computecpp/bin/computecpp_info

For ComputeCpp CE 0.5.0 you should see:


ComputeCpp Info (CE 0.5.0)


Toolchain information:

GLIBC version: 2.23
GLIBCXX: 20160609
This version of libstdc++ is supported.


Device Info:

Discovered 1 devices matching:
  platform    : 
  device type : 

Device 0:

  Device is supported                     : UNTESTED - Vendor not tested on this OS
  CL_DEVICE_NAME                          : Hawaii
  CL_DEVICE_VENDOR                        : Advanced Micro Devices, Inc.
  CL_DRIVER_VERSION                       : 2442.7
  CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_GPU 

If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:

Set Up

The changesets are being upstreamed; however, for now I would recommend using my fork of TensorFlow.

$ export TF_NEED_OPENCL=1
$ export HOST_CXX_COMPILER=/usr/bin/g++
$ export HOST_C_COMPILER=/usr/bin/gcc
$ export COMPUTECPP_TOOLKIT_PATH=/usr/local/computecpp
$ git clone https://github.com/lukeiwanski/tensorflow.git
$ cd tensorflow
$ git checkout dev/eigen_mehdi
$ ./configure

At this point, press Enter through the configuration questions.

In order to run tests:

$ bazel test --test_timeout 300,450,1200,3600  -c opt --config=sycl --test_tag_filters=requires-gpu,-no_gpu,-no_oss,-oss_serial,-benchmark-test  -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/contrib/distributions/... -//tensorflow/contrib/session_bundle/... -//tensorflow/go/... -//tensorflow/stream_executor/... -//tensorflow/core/distributed_runtime/... -//tensorflow/contrib/verbs/... -//tensorflow/contrib/xla_tf_graph/... -//tensorflow/java/... -//tensorflow/core/kernels/hexagon/...

It is worth noting that at this point there are some failures that we are working on resolving.

Also worth mentioning is the performance improvement, which is still a “work-in-progress”.

$ bazel build -c opt --config=sycl tensorflow/core/kernels:matmul_op_test
$ ./bazel-bin/tensorflow/core/kernels/matmul_op_test --benchmarks=all

On an AMD Radeon FirePro W8100 this gives:

Further optimisation improvements are in the pipeline.

3rd Bounce

Screenshot from 2013-07-16 22:58:57

Finally I got some spare time to improve this baby.
It is a bit of an improvement since last time – a 3rd bounce of rays was hacked in. Now the reflection looks far more realistic.

Screenshot from 2013-07-16 23:03:58

And here is the reflection of the reflection.. some artefacts here and there – but I can live with them for now.

Oh, and I need to fix the frame timer – that is definitely not 0.31ms per frame – it takes about a second per frame. Still not that bad for a vanilla implementation.
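
A minimal sketch of wall-clock frame timing, assuming the renderer can be called as a blocking function. Both time_frame_ms and render_frame are hypothetical stand-ins, not the actual ray tracer. If the real renderer launches its work asynchronously, the timer has to stop only after the frame has really finished, which is one common reason for implausibly small timings.

```python
import time


def time_frame_ms(render_frame):
    """Return the wall-clock time of one blocking render_frame() call, in milliseconds."""
    start = time.perf_counter()
    render_frame()  # must not return before the frame is complete
    return (time.perf_counter() - start) * 1000.0


def render_frame():
    # Hypothetical stand-in for the ray tracer: pretend a frame takes about 50 ms.
    time.sleep(0.05)


print("frame time: %.2f ms" % time_frame_ms(render_frame))
```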

[Edit: late night]

Screenshot from 2013-07-17 01:38:58

Managed to add semi-transparent spheres! Rendering of the frame became extremely slow – had to reduce the number of spheres in the scene.

Is it the right time to introduce optimisations?


One of the questions that I have been asked lately

Imagine a situation where we have the numbers from 1 to 100 in increasing order, so we have 1, 2, 3, …, 99, 100. Now we take one of them out of this pool at random, and then we randomize the order of the rest.

Question: what is the best way of finding out which number has been taken, and what is the complexity of your algorithm?
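
One possible answer, offered as a sketch rather than the expected one: the sum of 1..n is n(n+1)/2, so subtracting the sum of the remaining numbers leaves the one that was taken. That is a single pass over the data, so O(n) time and O(1) extra space, and the shuffled order does not matter. The function name find_missing is made up for illustration.

```python
import random


def find_missing(remaining, n=100):
    """Return the one number from 1..n that is absent from `remaining`."""
    expected_total = n * (n + 1) // 2  # closed-form sum of 1..n
    return expected_total - sum(remaining)


# Build the puzzle: take one number out at random, then shuffle the rest.
pool = list(range(1, 101))
taken = pool.pop(random.randrange(len(pool)))
random.shuffle(pool)

assert find_missing(pool) == taken
```

In a language with fixed-width integers, XOR-ing all values from 1..n with all remaining values gives the same answer without any risk of overflow.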


Last Weekend before Christmas Issue

Was really lazy.. cup of tea, book.. a couple of episodes of Friends – honestly I need to get internet at my place.. I am so out of good movies..

Amazon and other online based shops keep spamming great deals for useless stuff for Christmas.

Next week gift hunting will start.. I’ll track it, I’ll hunt it down..
The prospect of a new Xbox 360 is great but I think it needs to wait until the new year.. :S well.. I’ll get it at some point.

The book.. I am reading a great one: Game Coding Complete, 3rd Edition.

Well, I am not about to write a proper review of the book, but “Mr Mike” knows how to explain things.

This is a good title to have on the shelf.

I have started creating a C++/DirectX game framework, but to be honest I can’t decide which DirectX I should use. I am using 9 because Napier teaches that one, but in my personal opinion.. why are they teaching old stuff? The new DirectX API has changed.. so in fact this is pointless.. well, not really pointless, but.. if I learn 9 I will have to learn the new one as well.. I think that’s just a waste of time.

I will share the simple structure of the “Engine” when I reach the first milestone.. some object in 3D – yeah, a box.. so what! I bet it will be Tuesday or Wednesday.. fingers crossed.. and yes, it will be DirectX9, but I’ll prepare an alternative version for DirectX11 by the end of the year.. I hope.

Day 2: multi texture, grid, interface

Daily update, what have I achieved last night:

  • adding more than one texture to the height map
  • grid on future water level
  • 2d text over 3d scene – interface/debug

My 1st milestone is almost reached – only two things left: a smooth transition between textures and animated water with reflection. Once I have that I will publish the application and sample code.

That should be done by the end of the week; however, the Software Development 3 coursework deadline is 3rd Dec and I haven’t done anything yet.

Fingers crossed,

Full time student – Full time employee

Software Development – Java
Software Engineering – UML
Mobile apps – Android

Job – Web Developer

Things have been sorted already,
small modification to my programming project – a portable strategy game – similar to Majesty – I still need to learn OpenGL – but hopefully I’ll get some sort of help from Android’s development kit