Managing Machine Learning Software Integration with Conan
In recent weeks, we have been working on a project to support machine learning development in the automotive domain with an integrated development environment. In addition to setting up an IDE based on Eclipse (including many interesting features), we also investigated how to support the work of the development team in managing the dependencies of the embedded projects (in terms of C/C++ libraries).
We evaluated conan.io, which we had learned about in another automotive customer setting.
In the integration use case that we investigated, one integrator receives code and libraries from other (in-house) parties and builds one integrated system from them, mainly for prototyping and concept validation. Some of the deliverables require linking, re-compilation etc., and many of them depend on different versions of the same library (e.g. different OpenCV versions). Those dependencies should be manageable locally, i.e. without installing all of these libraries system-wide and then running into dependency conflicts.
We designed a small project to show how Conan can manage the dependencies of C projects automatically. It reads a .png image and feeds it to a neural network for image classification. For that, it has two basic dependencies:
- OpenCV for image processing
- TensorFlow C-API for neural network processing.
It is a multi-platform project (Windows with Visual Studio and Linux with gcc) and is built with CMake on both platforms. The features of the showcase with respect to Conan are summarized below:
- Main project: designed to show the management of dependencies with Conan.
- OpenCV: designed to show the consumption of two different platform packages for the same general dependency.
  - Windows: a publicly available binary package that downloads pre-compiled libraries for use with Visual Studio.
  - Linux: a publicly available package that downloads the OpenCV sources from git and builds them automatically during the Conan configuration of the main project.
- TensorFlow C-API: designed to show how to provide custom packaging of a third-party non-Conan library on a custom Conan server.
  - Windows: a custom-defined Conan package that can be used on Windows.
  - Linux: a custom-defined Conan package that can be used on Linux.
Conan.io is a C/C++ package and dependency manager that automatically configures your project's dependencies on other projects. It offers both simple, accessible features and advanced ones for configuration. For example, if our demo project only needed to be built on Windows, we could use a configuration file (conanfile.txt) like this:
    [imports]
    bin, *.dll -> ./bin # Copies all dll files from the package's bin folder to my "bin" folder
    lib, *.dylib* -> ./bin # Copies all dylib files from the package's lib folder to my "bin" folder
Conan would then
- Analyze all required dependencies (transitively) and look up its known repositories to find packages for the dependencies (in our case OpenCV and TensorFlow)
- Download the packages and install the necessary files to a local project and version-specific location (so that you can easily keep multiple versions of dependencies)
- Depending on the package, if the binary for your platform is not available, it would download the required sources, compile the dependency and install it locally
- Configure your build environment (CMake et al.) so that the include paths, link paths and link setup for your build automatically point to the dependencies in the correct locations. No manual configuration should be required.
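To make the "version-specific location" concrete, here is a minimal sketch of how multiple versions can coexist. It assumes the default local cache layout of Conan 1.x (`~/.conan/data/<name>/<version>/<user>/<channel>/package`); the helper `cache_package_dir` and the package references are hypothetical and only serve as illustration.

```python
from pathlib import PurePosixPath


def cache_package_dir(name, version, user, channel, home="~"):
    """Folder under which Conan 1.x (default layout, assumed) keeps the
    binaries of one package reference."""
    return (PurePosixPath(home) / ".conan" / "data"
            / name / version / user / channel / "package")


# Placeholder references: two versions of the same library live side by side.
old = cache_package_dir("OpenCV", "2.4.13", "someuser", "stable")
new = cache_package_dir("OpenCV", "3.4.0", "someuser", "stable")
print(old)
print(new)
```

Because the version is part of the path, a project that needs one OpenCV version does not interfere with a project that needs another.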
So we set up a demo project to evaluate that functionality. One focus was to investigate the effort required to package projects that are not yet available as Conan packages.
Configuration of the main project
The main project has OpenCV as a dependency. However, there seems to be no publicly available Conan package for OpenCV that can be consumed on both Windows and Linux; instead, there are different packages for each platform. That means we need different dependencies on the two platforms.
In the simple case, dependencies are specified in a conanfile.txt. However, this mechanism does not support any logic for the dependencies. The alternative is to use a conanfile.py as the definition, which supports more complex logic. The main feature is the requirements() method, which declares different OpenCV dependencies depending on the platform.
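A minimal sketch of such a conanfile.py, assuming the Conan 1.x API (`from conans import ConanFile`). The package references, the class name and the helper `opencv_reference` are placeholders, since the actual references used in the project are not given here.

```python
try:
    from conans import ConanFile  # Conan 1.x import path
except ImportError:  # lets the sketch run where Conan 1.x is not installed
    ConanFile = object

# Placeholder references: the real package names, versions and channels
# are not given in the text.
OPENCV_BY_OS = {
    "Windows": "opencv/3.4.0@placeholder-user/windows-binaries",
    "Linux": "opencv/3.4.0@placeholder-user/linux-source",
}


def opencv_reference(os_name):
    """Pick the OpenCV package reference for a given value of settings.os."""
    try:
        return OPENCV_BY_OS[os_name]
    except KeyError:
        raise ValueError("No OpenCV package configured for %s" % os_name)


class ImageClassifierDemo(ConanFile):  # hypothetical class name
    settings = "os", "compiler", "build_type", "arch"
    generators = "cmake"

    def requirements(self):
        # Conan calls this while resolving dependencies; the settings
        # of the current build are already known at this point.
        self.requires(opencv_reference(str(self.settings.os)))
```

Keeping the lookup in a plain function makes the platform logic easy to check without running Conan itself.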
OpenCV is consumed from public repositories (some configuration was required because there are different packages for the different platforms). The other task was to create a custom Conan package for the TensorFlow C-API. One of the nice features of Conan is that a package definition can be shipped separately from the artifacts, so you can easily provide a package definition for some other software without having to modify that software.
For the TensorFlow C-API, binary downloads are provided by Google. However, there are different downloads for Windows and Linux. The information about a package is provided by so-called Conan "recipes", which are essentially Python files: each one implements a subclass of the provided class "ConanFile".
In our case, the file looks like this:
    from conans import ConanFile, tools

    class TensorFlowConan(ConanFile):  # class name assumed; not shown in the excerpt
        name = "TensorFlow"
        version = "1.4.0"
        settings = "os", "compiler", "build_type", "arch"
        # An export of our own .lib for TensorFlow is required here so that it gets
        # copied (declaration not shown in this excerpt). The .lib was created by us
        # so that we can use the link library easily.

        def build(self):  # method placement assumed; settings are available here
            if self.settings.os == "Windows":  # and self.settings.compiler == "Visual Studio"
                url = "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-windows-x86_64-1.4.0.zip"
                # On Windows, we need the .lib file for linking. However, we want to be able to
                # provide different versions, so the .lib is renamed here (rename step not shown).
                # Note that right now, on Windows, we only support TF 1.4 for x64. Other TF
                # versions etc. would require additional .lib files and extensions to this script.
            elif self.settings.os == "Linux":
                url = "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-1.4.0.tar.gz"
            else:
                raise Exception("Binary does not exist for these settings")
            tools.get(url)  # assumed download step: fetch and unpack the archive

        def package(self):
            # Take the package as-is; specific files could also be copied or rearranged.
            self.copy("*.dll", dst="lib")
            self.copy("*.so*", dst="lib", src="lib")
With this definition, Conan can download the required files, create a local location, set up the files correctly, and wire everything up so that the project builds with our CMake configuration. We do not have to specify the location of the include files and libraries manually. For CMake (Conan supports other build systems, too), it generates a CMake file to be included that does much of the wiring.
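Which library a consumer has to link against is declared by the recipe itself. A minimal sketch, assuming the Conan 1.x API and that the recipe contains a package_info() method (not shown in the excerpt above); the class name and the library name "tensorflow" are assumptions.

```python
try:
    from conans import ConanFile  # Conan 1.x import path
except ImportError:  # lets the sketch run where Conan 1.x is not installed
    ConanFile = object


class TensorFlowPackageInfo(ConanFile):  # hypothetical name; in practice this
    # method would be part of the TensorFlow recipe class itself
    def package_info(self):
        # Assumption: the library is called "tensorflow" on both platforms.
        # Conan passes this name on to the generated build-system files,
        # so the consuming project does not list it manually.
        self.cpp_info.libs = ["tensorflow"]
```

With the cmake generator, the consuming CMake project then picks up the library name together with the include and library directories of the package.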
The TensorFlow C-API download for Windows does contain a .dll, but it does not contain the .lib files which are required for development. So we created the .lib files ourselves and packaged them with the recipe. We actually have some kind of a hybrid then: Some parts are copied from our package and the rest is downloaded when the dependencies are resolved.
Conan supports local servers. For demonstration, we simply used the command line to start a simple server and published our TensorFlow package to that local server to verify the consumption through the normal build.
Conan seems to address the use cases we actually see in our integration setting. It is rather straightforward to write custom packages. However, providing a single package for different platforms that is also able to build itself from source takes significantly more effort. Conan supports both simple text-file configuration and complex Python-based configuration.
With that configuration, we could build our C-based image classification implementation on two different platforms with automated configuration.