Build TensorFlow from Source on CentOS 7

2019-06-02
by parrondo

I had to build TensorFlow from source on CentOS 7 after getting the weird message “Illegal instruction (core dumped)” when running “import tensorflow” in my Python code.

Introduction

With TensorFlow, Google has created a framework that is too low-level to be comfortable for rapid prototyping, yet too high-level to be comfortable for cutting-edge research or for production environments with limited resources. Still, for production I love TensorFlow Serving, so I have decided to keep using TensorFlow in my research as a trader. The serious problem I keep fighting is that the official binaries do not fit my hardware or my operating system.

So now what I have to do is build it myself.

In the TensorFlow GitHub account, you can find instructions for building a TensorFlow pip package from source and installing it on Ubuntu Linux and macOS. While the instructions might work for other systems, they are only tested and supported for Ubuntu and macOS. I will build it for CentOS 7, my OS of choice.

Motivation

The easiest way to install TensorFlow is to work in a virtual Python environment. In my case, I prefer to use Conda. Once the virtual environment is set up, simply install the official TensorFlow package with pip or use one of the official wheels. However, there is a big problem with this approach: the binaries are precompiled for the hardware configuration chosen by the TensorFlow team. This is not a problem for the GPU, since the CUDA libraries take care of the differences between one graphics card and another. But the official binaries cause several problems when the calculations run on the CPU.

The main disadvantages are:

Old CPUs that do not have AVX can only use the official TensorFlow binaries up to version 1.5, which is unacceptable given the rapid development of the technology.
CPU performance is left on the table. Different processors have different capabilities; for example, the vectorization extensions differ from one processor to another (SSE, AVX, AVX2, AVX-512F, FMA, …), and a generic binary cannot use all of them.
The Linux binaries are built for Ubuntu. In my case I use CentOS 7.
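To see which of these extensions a given machine offers, one option is to read the flags line of /proc/cpuinfo. The following is a minimal sketch, assuming a Linux x86 system; the helper names `cpu_flags` and `report` are made up for this example, and the flag names are the ones the Linux kernel reports.

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        # Every core repeats the same "flags : ..." line; the first is enough.
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def report(flags, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma", "avx512f")):
    """Map each extension of interest to True/False for this CPU."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as handle:
            for name, present in report(cpu_flags(handle.read())).items():
                print("{:8s} {}".format(name, "yes" if present else "no"))
    else:
        print("No /proc/cpuinfo on this system")
```

If `avx` shows up as "no", the official binaries from version 1.6 onwards are the likely cause of the “Illegal instruction” crash.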

So, if you care about CPU performance or have an old CPU, you should install TensorFlow directly from source. This lets you compile the TensorFlow sources with the option “-march=native”, which enables all the hardware capabilities of the machine on which you are compiling the library.
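One way to hand this option to the compiler is Bazel's `--copt` flag. The sketch below only assembles and prints a command line; it does not run the build. The helper name `bazel_build_command` is hypothetical, the target is the pip-package target used in the build step of this post, and whether the extra flag is needed at all depends on the answers you give to the configuration script.

```python
def bazel_build_command(use_cuda=False, march="native"):
    """Assemble (but do not run) a Bazel command for the TensorFlow pip package."""
    command = ["bazel", "build", "-c", "opt", "--copt=-march=" + march]
    if use_cuda:
        # Same switch as the GPU build shown later in this post.
        command.append("--config=cuda")
    command.append("//tensorflow/tools/pip_package:build_pip_package")
    return command


if __name__ == "__main__":
    print(" ".join(bazel_build_command()))
    print(" ".join(bazel_build_command(use_cuda=True)))
```

Keep in mind that a binary built with “-march=native” may not run on a machine with an older or different CPU than the one it was compiled on.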

Depending on your problem, this can give you a good speedup. My CPU does not have AVX, so I had to compile the latest version of TensorFlow myself. With that build, a small recurrent neural network ran around 25% faster. On a bigger problem, and depending on your processor, you may achieve better gains. If you are training on the CPU, this can make a big difference in total time.

Installing TensorFlow this way is a bit cumbersome. You may also have to compile Bazel from source and, depending on your processor, the build may take a long time to finish. However, I have now successfully compiled TensorFlow from source on several machines without too many problems. Just pay close attention to the options you set when configuring TensorFlow, for example the CUDA configuration if you want GPU support.

Setup for CentOS 7

I figured out how to build TensorFlow from source on CentOS 7. The build itself does not require root access, although installing Bazel with yum as described below does.

What to prepare:

Java 8
Bazel
TensorFlow
cuDNN and CUDA toolkit (assumed to be already installed)

Installation of Bazel

Bazel requires Java 8, so download and install it first and check that your JAVA_HOME points to it. This tutorial will not cover the Java installation.
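A quick way to check what is currently installed is to look at JAVA_HOME and at the banner printed by `java -version`. This is a minimal sketch; the function names `parse_java_major` and `installed_java_major` are made up for this example, and the parsing assumes the usual `version "..."` banner format.

```python
import os
import re
import shutil
import subprocess


def parse_java_major(banner):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and older report themselves as 1.x, for example "1.8.0_212".
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Return the major version of the `java` found on PATH, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its banner to stderr, so merge it into stdout.
    result = subprocess.run(
        [java, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_java_major(result.stdout)


if __name__ == "__main__":
    print("JAVA_HOME =", os.environ.get("JAVA_HOME", "(not set)"))
    print("java major version =", installed_java_major())
```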

Download the corresponding “.repo” file from Fedora COPR and copy it to “/etc/yum.repos.d/”.

$ yum install bazel

Installation of TensorFlow

Download the source

You can download the TensorFlow source from GitHub, as described at https://www.tensorflow.org/install/source

$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow

If you need CUDA, you may have to hack the code. Go to the file tensorflow/third_party/gpus/crosstool/CROSSTOOL (path relative to the directory where you cloned the repository) and update cxx_builtin_include_directory with

cxx_builtin_include_directory : "/usr/local/cuda/targets/x86_64-linux/include"

Run the configuration script

$ ./configure

If you want to use TensorFlow on a GPU with a compute capability lower than 3.5, run the configuration script like this instead:

$ TF_UNOFFICIAL_SETTING=1 ./configure

Build with Bazel

# Without GPU support
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

# To build with GPU support:
$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

# Build .whl file
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install TensorFlow

$ pip install /tmp/tensorflow_pkg/tensorflow-version-tags.whl

And you are done!
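The exact wheel file name depends on the TensorFlow version and on your Python and platform tags, so it can be handy to look it up instead of typing it. A minimal sketch, assuming the output directory /tmp/tensorflow_pkg from the build step; the helper name `newest_wheel` is made up for this example.

```python
import glob
import os


def newest_wheel(pkg_dir="/tmp/tensorflow_pkg"):
    """Return the most recently modified TensorFlow wheel in pkg_dir, or None."""
    wheels = glob.glob(os.path.join(pkg_dir, "tensorflow*.whl"))
    if not wheels:
        return None
    return max(wheels, key=os.path.getmtime)


if __name__ == "__main__":
    wheel = newest_wheel()
    if wheel is None:
        print("No wheel found; did build_pip_package finish?")
    else:
        # Print the command instead of running it, so you can review it first.
        print("pip install " + wheel)
```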

Test your TensorFlow installation

Open a Python terminal and enter the following lines of code (they use the TensorFlow 1.x session API):

>>> import tensorflow as tf
>>> hello = tf.constant("hello TensorFlow!")
>>> sess = tf.Session()

Then, to verify your installation, just type:

>>> print(sess.run(hello))

If the installation is right, you’ll see the following output (Python 3 shows it as a bytes literal, b'hello TensorFlow!'):

hello TensorFlow!
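Since the original symptom was a crash at import time, it can also help to run the import in a child process, so that a crash does not take your interactive session down with it. This is a minimal sketch; the function name `check_import` is made up for this example, and it relies on a POSIX system reporting a signal-terminated child through a negative return code.

```python
import signal
import subprocess
import sys


def check_import(module_name="tensorflow"):
    """Import a module in a child process and classify the outcome."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode == 0:
        return "ok"
    # On POSIX, a child killed by signal N has return code -N.
    if result.returncode == -signal.SIGILL:
        return "illegal instruction"
    return "import failed"


if __name__ == "__main__":
    print(check_import("tensorflow"))
```

A result of "illegal instruction" suggests that the installed binary uses CPU instructions your processor does not have, which is the situation that motivated this build.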

Final Notes

There’s a strong incentive to build TensorFlow from source, especially on CPU-only systems, because not everyone has an expensive GPU.

If the build does not work at first, do not worry: the problems that usually arise during compilation are not difficult to solve. First, Google the error; it is very likely that someone else has had the same issue. If you still don’t find the answer, ask the TensorFlow maintainers or Stack Overflow. The reward of having the best TensorFlow is worth it.