You only need to build from source if you have determined that none of the pre-built configurations are appropriate for your needs, or if you want to modify, test, or debug the code. If none of the above applies, see here for detailed information on how to get started quickly.
Each release of the ExpressionMatrix2 software includes pre-built versions for the following configurations on Ubuntu 16 and CentOS 7:
Platform | Configuration name | Python version | HDF5 support |
---|---|---|---|
Ubuntu 16 | Release-ubuntu16-python2 | 2 | Yes |
Ubuntu 16 | Release-ubuntu16-python3 | 3 | Yes |
Ubuntu 16 | Release-ubuntu16-nohdf5-python2 | 2 | No |
Ubuntu 16 | Release-ubuntu16-nohdf5-python3 | 3 | No |
CentOS 7 | Release-centos7-python2 | 2 | Yes |
CentOS 7 | Release-centos7-nohdf5-python2 | 2 | No |
You can choose the version that is appropriate for you. HDF5 support is only necessary if you intend to use functions ExpressionMatrix.addCellsFromHdf5 or ExpressionMatrix.addCellsFromBioHub2. If you don't need to use these functions, choose one of the versions without HDF5 support as they have fewer dependencies on installed packages.
The Release-ubuntu16-nohdf5-python3 version can also be used on Windows 10 via Windows Subsystem for Linux and the Windows Ubuntu application. See here for more information.
The Release-ubuntu16-nohdf5-python3 version, because of its minimal set of dependencies, may also run on Linux platforms other than Ubuntu 16 that run a similar Linux kernel.
To build the ExpressionMatrix2
software from source,
you can use Eclipse or CMake.
There are a couple of ways to get the source code from GitHub:
- Use the Clone or Download button on the GitHub repository page.
- Run git clone https://github.com/chanzuckerberg/ExpressionMatrix2. This requires git to be installed.
The repository includes at its top level the Eclipse files .project and .cproject, which can be used to load the ExpressionMatrix2 project into Eclipse and build it.
The build settings used in Eclipse are the ones for the
Release-ubuntu16-python3
configuration.
If you need to build with different settings,
you can modify the settings in Eclipse, or use cmake as described below.
A debug version of the same configuration is also provided in Eclipse,
and can be used for debugging.
CMake
The ExpressionMatrix2/src directory contains the file CMakeLists.txt, which can be used for building with CMake. This provides some amount of customization that can be used to build on other Linux platforms.
To build in this way:
1. Create an empty directory and cd to it. This directory can be anywhere; it does not have to be in the ExpressionMatrix2 source tree.
2. Run cmake .../ExpressionMatrix2/src, making sure to enter the correct local path for the ExpressionMatrix2/src directory. Also add the CMake -D options appropriate for the configuration you want to build, as described below. This will create a Makefile.
3. Run make. This will create ExpressionMatrix2.so.
If your build machine has multiple processors and sufficient memory,
you can use the -j
option of the make
command
to speed up the build. See the man
page for
the make
command for more information.
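The build steps above can be collected into a small shell function. This is a minimal sketch, not part of the ExpressionMatrix2 distribution: the function name build_expressionmatrix2, the BUILD_DIR variable with its default of build, and the use of nproc to choose the number of parallel jobs are all assumptions made for illustration. The first argument is your local path to the ExpressionMatrix2/src directory; any remaining arguments are passed to cmake unchanged.

```shell
# Hypothetical helper: out-of-source CMake build of ExpressionMatrix2.
# Usage: build_expressionmatrix2 SRC_DIR [cmake -D options...]
build_expressionmatrix2() {
    if [ "$#" -lt 1 ]; then
        echo "usage: build_expressionmatrix2 SRC_DIR [cmake options]" >&2
        return 1
    fi
    src_dir="$1"
    shift
    build_dir="${BUILD_DIR:-build}"   # assumed name; any empty directory works

    # The source directory must contain CMakeLists.txt.
    if [ ! -f "$src_dir/CMakeLists.txt" ]; then
        echo "CMakeLists.txt not found in $src_dir" >&2
        return 1
    fi

    # Resolve to an absolute path before changing directory.
    src_dir="$(cd "$src_dir" && pwd)" || return 1

    mkdir -p "$build_dir" || return 1
    (
        cd "$build_dir" || exit 1
        cmake "$src_dir" "$@" || exit 1   # creates the Makefile
        make -j "$(nproc)"                # creates ExpressionMatrix2.so
    )
}
```

For example, build_expressionmatrix2 .../ExpressionMatrix2/src -DBUILD_WITH_HDF5=OFF would configure and build without HDF5 support in ./build. If memory is limited, replacing "$(nproc)" with a smaller number reduces the number of parallel compile jobs.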
You can use CMake option -D
to override
default values for one or more of the following
configuration variables:
Variable | Default value | Description |
---|---|---|
PYTHON_INCLUDE_PATH | /usr/include/python3.5m | The include path for the Python version you want your shared library to work with. |
PYBIND11_INCLUDE_PATH | /usr/local/include/python3.5 | The directory where pybind11 include files are installed. This must be a version of pybind11 consistent with the Python version you want to build for. |
BUILD_WITH_HDF5 | ON | If set to OFF, disables HDF5 functionality (functions ExpressionMatrix.addCellsFromHdf5 or ExpressionMatrix.addCellsFromBioHub2) and eliminates the dependency on HDF5 include files and libraries. |
HDF5_INCLUDE_PATH | /usr/include/hdf5/serial | The directory where HDF5 include files are located. Only used if BUILD_WITH_HDF5 is ON. |
HDF5_LIBRARIES | hdf5_cpp hdf5_serial | The names of the HDF5 libraries to link with. Only used if BUILD_WITH_HDF5 is ON. If invoking cmake from the bash shell, enter these names separated by spaces and enclosed in single quotes, not double quotes, like this: -DHDF5_LIBRARIES='hdf5_cpp hdf5_serial'. Using double quotes will not work. |
For convenience, the table below lists the CMake -D
options used for each of the pre-built configurations,
as well as for configurations that are not part of the release package.
Configuration | CMake -D options |
---|---|
Release-ubuntu16-python2 | -DPYTHON_INCLUDE_PATH=/usr/include/python2.7 -DPYBIND11_INCLUDE_PATH=/usr/local/include/python2.7 |
Release-ubuntu16-python3 | (None) |
Release-ubuntu16-nohdf5-python2 | -DPYTHON_INCLUDE_PATH=/usr/include/python2.7 -DPYBIND11_INCLUDE_PATH=/usr/local/include/python2.7 -DBUILD_WITH_HDF5=OFF |
Release-ubuntu16-nohdf5-python3 | -DBUILD_WITH_HDF5=OFF |
Release-centos7-python2 | -DPYTHON_INCLUDE_PATH=/usr/include/python2.7 -DPYBIND11_INCLUDE_PATH=/usr/lib/python2.7/site-packages -DHDF5_INCLUDE_PATH=/usr/include -DHDF5_LIBRARIES='hdf5_cpp hdf5' |
Release-centos7-nohdf5-python2 | -DPYTHON_INCLUDE_PATH=/usr/include/python2.7 -DPYBIND11_INCLUDE_PATH=/usr/lib/python2.7/site-packages -DBUILD_WITH_HDF5=OFF |
Arch Linux (not included in release package - see here for more information) | -DPYTHON_INCLUDE_PATH=/usr/include/python3.6m -DPYBIND11_INCLUDE_PATH=/usr/include/python3.6m/pybind11 -DHDF5_INCLUDE_PATH=/usr/include -DHDF5_LIBRARIES='hdf5_cpp hdf5' |
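When scripting builds for several configurations, the options listed above can be looked up by configuration name. The following is a minimal sketch covering only the six pre-built configurations; the function name cmake_options_for is hypothetical, and the option values are copied from the listing above. It prints one option per line, so that the space-separated HDF5 library names stay together as a single value.

```shell
# Hypothetical helper: print the CMake -D options for a pre-built
# configuration, one option per line. Returns non-zero for unknown names.
cmake_options_for() {
    py2_ubuntu="-DPYTHON_INCLUDE_PATH=/usr/include/python2.7
-DPYBIND11_INCLUDE_PATH=/usr/local/include/python2.7"
    py2_centos="-DPYTHON_INCLUDE_PATH=/usr/include/python2.7
-DPYBIND11_INCLUDE_PATH=/usr/lib/python2.7/site-packages"

    case "$1" in
        Release-ubuntu16-python2)
            printf '%s\n' "$py2_ubuntu" ;;
        Release-ubuntu16-python3)
            : ;;  # built with the default values, no options needed
        Release-ubuntu16-nohdf5-python2)
            printf '%s\n' "$py2_ubuntu" "-DBUILD_WITH_HDF5=OFF" ;;
        Release-ubuntu16-nohdf5-python3)
            printf '%s\n' "-DBUILD_WITH_HDF5=OFF" ;;
        Release-centos7-python2)
            printf '%s\n' "$py2_centos" \
                "-DHDF5_INCLUDE_PATH=/usr/include" \
                "-DHDF5_LIBRARIES=hdf5_cpp hdf5" ;;
        Release-centos7-nohdf5-python2)
            printf '%s\n' "$py2_centos" "-DBUILD_WITH_HDF5=OFF" ;;
        *)
            echo "unknown configuration: $1" >&2
            return 1 ;;
    esac
}
```

In bash 4 or later, the output can be read into an array with mapfile -t opts < <(cmake_options_for Release-centos7-python2) and passed to cmake as "${opts[@]}", which keeps each line as exactly one argument without any further quoting.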
Disregarding standard system libraries that are available on all Linux systems, the only prerequisite for running one of the configurations with HDF5 functionality turned off is Graphviz.
To run one of the configurations with HDF5 functionality you also need HDF5 libraries:
- libhdf5-10 (or libhdf5-10-dev) and libhdf5-cpp-11.
- hdf5.
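A small number of prerequisites are necessary to build the ExpressionMatrix2 code:
- Boost libraries.
- Python 2 or Python 3.
- Pybind11.
- HDF5 libraries, unless building with -DBUILD_WITH_HDF5=OFF.
Standard development tools such as the gcc compiler, which are available on most Linux distributions, are also required.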
For convenience, package names for the above for Ubuntu 16 and CentOS 7
are listed here:
Package | Ubuntu 16 | CentOS 7 |
---|---|---|
Boost libraries | libboost-all-dev | boost-devel |
Python 2 | python-all-dev | python-devel |
Python 3 | python3-all-dev | |
Pybind11 | pybind11 | pybind11 |
HDF5 libraries | libhdf5-dev, libhdf5-cpp-11 | hdf5-devel |
The pybind11
package is installed using (as root)
pip install pybind11
for Python 2
and pip3 install pybind11
for Python 3.
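Before starting a build, it can help to confirm that the tools and the pybind11 package are actually available. The following is a minimal sketch of such a check; the helper names missing_commands and has_pybind11 are hypothetical, and the check only tests that commands can be found and that pybind11 can be imported, not that the versions are suitable.

```shell
# Hypothetical helper: print each named command that cannot be found.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Hypothetical helper: succeed if the given Python interpreter
# (default: python3) can import pybind11.
has_pybind11() {
    "${1:-python3}" -c 'import pybind11' >/dev/null 2>&1
}
```

For example, missing_commands gcc cmake make prints nothing when all three are installed, and has_pybind11 python3 || echo "pybind11 not found for python3" reports a missing pybind11 installation for Python 3.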
As an alternative to building with cmake as described above, for Arch Linux you can also use the following commands to build from source an Arch Linux package that can be installed in the standard way using pacman:
curl -o PKGBUILD 'https://aur.archlinux.org/cgit/aur.git/plain/PKGBUILD?h=expressionmatrix2-git'
makepkg
The resulting package can be installed using the following command (must be run as root):
pacman -U expressionmatrix2-git-*.tar.xz
If you do this, you don't need to set the PYTHONPATH environment variable to be able to import the ExpressionMatrix2 module in Python.
The makepkg
command is supposed to take care of all dependencies,
but in case you run into problems you can manually install the prerequisites
using the following commands as root:
pacman -S cmake
pacman -S boost
pacman -S hdf5
pacman -S python-pipenv
pip install pybind11
Reference documentation for the Python API is here.
To generate Doxygen documentation for the C++ code, make sure you have Doxygen installed (Ubuntu package name is doxygen), cd to ExpressionMatrix2/doc/doxygen, then issue the command doxygen without arguments. Documentation will be created in directory ExpressionMatrix2/doc/doxygen/html. The top level file of the documentation will be ExpressionMatrix2/doc/doxygen/html/index.html. Point your browser to that to see the documentation.
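The Doxygen steps above can be wrapped in a short shell function. This is a minimal sketch under stated assumptions: the function name generate_doxygen_docs is hypothetical, and the argument is assumed to be the local path of your ExpressionMatrix2 checkout (defaulting to ExpressionMatrix2 in the current directory). Doxygen's progress output is discarded so that only the path of the top level file is printed.

```shell
# Hypothetical helper: run doxygen in doc/doxygen of the given checkout
# and print the path of the top level file of the generated documentation.
generate_doxygen_docs() {
    repo_dir="${1:-ExpressionMatrix2}"   # assumed checkout location
    (
        cd "$repo_dir/doc/doxygen" || exit 1
        doxygen >/dev/null || exit 1     # invoked without arguments
        echo "$(pwd)/html/index.html"
    )
}
```

The printed path can then be opened in a browser, for example with xdg-open "$(generate_doxygen_docs)" on desktops where xdg-open is available.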
For readability, the C++ code does not contain
any Doxygen directives, so the generated documentation
will only contain what Doxygen can do with EXTRACT_ALL=YES
and without relying on documentation directives in the code.
Even with this limitation, this can be useful to explore the C++ code.