We benchmark openTSNE against popular, open-source t-SNE libraries across three programming languages: Python, R, and Julia.
The prerequisite for running the benchmarks is conda, which must be installed beforehand. conda enables us to create reproducible, isolated environments. You can run the full benchmark suite using
bash run.sh -l
Alternatively, since the full benchmark suite can take days or even weeks to complete, you may instead wish to run the smaller benchmark suite using
bash run.sh -s
Note, however, that the strength of openTSNE over other implementations is its ability to quickly create embeddings of massive data sets. As such, the smaller benchmark suite will not show the full extent of openTSNE's advantage over other implementations.
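The prerequisite check and the choice between the two suites can be wrapped in a small launcher. The following is only a sketch: the `-l` and `-s` flags and `run.sh` come from the commands above, while the function names and the `"full"`/`"small"` labels are hypothetical and not part of the repository.

```python
import shutil
import subprocess

# Map a readable suite name to the flag accepted by run.sh (flags taken from this README).
SUITE_FLAGS = {"full": "-l", "small": "-s"}


def benchmark_command(suite):
    """Build the command line for the requested suite ("full" or "small")."""
    return ["bash", "run.sh", SUITE_FLAGS[suite]]


def run_benchmarks(suite="small"):
    """Check that conda is on PATH, then launch the chosen benchmark suite."""
    if shutil.which("conda") is None:
        raise RuntimeError("conda was not found on PATH; install it before running the benchmarks")
    # check=True raises CalledProcessError if run.sh exits with a non-zero status.
    subprocess.run(benchmark_command(suite), check=True)
```

Calling `run_benchmarks("small")` from the directory containing `run.sh` would then be equivalent to `bash run.sh -s`, with an early error if conda is missing.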
The benchmark output will be saved to the logs/ directory. We also include the exact conda environment used to produce the benchmarks in the manuscript. It can also be found in the logs/ directory and can be reproduced exactly using
conda env create -f logs/00--conda_env.yml
WARNING: Please note that Julia is not available via conda on OSX systems. To replicate the environment on an OSX system, please modify logs/00--conda_env.yml and remove the julia package from the environment section. The Julia benchmarks will not be run, but the Python and R benchmarks will be unaffected.
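Instead of editing the file by hand, the julia entry can be filtered out with a short script. This is a minimal sketch that assumes the environment file lists packages as `- name=version` lines, as in a standard conda export. The function names and the output file name `conda_env_osx.yml` are hypothetical.

```python
import re
from pathlib import Path


def strip_package(env_text, package="julia"):
    """Return env_text without list entries for exactly the given package."""
    # Match lines such as "  - julia=1.0.3" but not "  - julialang-foo=1".
    pattern = re.compile(r"^\s*-\s*" + re.escape(package) + r"(?=[=<>!\s]|$)")
    kept = [line for line in env_text.splitlines(keepends=True) if not pattern.match(line)]
    return "".join(kept)


def write_stripped_env(src, dst, package="julia"):
    """Read the environment file at src and write a copy without the package to dst."""
    Path(dst).write_text(strip_package(Path(src).read_text(), package))


# Example (not executed here):
# write_stripped_env("logs/00--conda_env.yml", "conda_env_osx.yml")
```

The resulting file could then be passed to `conda env create -f conda_env_osx.yml`. Writing to a separate file leaves the original environment description in logs/ untouched.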
Because running the benchmarks can take a long time, we provide the output of our own benchmarks in the logs/ directory. These benchmarks were run on an Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz processor with 128GB of memory, and we include the output of these benchmarks in the intel_xeon_e5_1650 folder. We also include the exact conda environment and installed package versions.
To generate the benchmark figure, install the requirements listed in requirements-figures.txt
pip install -r requirements-figures.txt
and run
python generate_figures.py -i logs/