Getting Started with our Toolbox

We provide a toolbox for working with the different benchmarks. It lets you download all resources linked to the benchmarks (datasets, model checkpoints, samples, etc.), run the benchmarks on your own submissions, and upload the results to our website for inclusion in our leaderboards. The toolbox can be found on our GitHub. This page offers basic documentation and instructions on how to install and use it.

Installation

The zerospeech benchmark toolbox is a Python package, so you need Python installed on your system before you can start.

You can use miniconda, a lightweight version of anaconda, or any other way of installing Python you prefer.
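As a quick pre-flight check before installing, the sketch below looks for a Python 3 interpreter on your PATH. The interpreter name `python3` and the variable `py_status` are assumptions made for illustration. Your system may expose Python under a different name, and this page does not state a minimum version.

```shell
# Look for a Python 3 interpreter on PATH and report its version.
# `python3` is an assumed interpreter name; adjust it for your system.
if command -v python3 >/dev/null 2>&1; then
    py_status="found"
    python3 --version
else
    py_status="missing"
    echo "python3 not found: install Python first (for example via miniconda)" >&2
fi
```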

Once Python is installed, you can install the package using:

If you are a conda user, you can use our conda environment:

Or you can install it directly from source on GitHub:

To verify that the toolbox is installed correctly, run zrc version, which should print the version information. If it does not, you can open an issue describing your errors directly on our GitHub.
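The check described above can be sketched as follows. Only `zrc version` comes from this page. The PATH lookup and the `zrc_status` variable are illustrative additions that help tell "not installed" apart from "installed but failing".

```shell
# Confirm that the zrc entry point is on PATH, then print its version.
if command -v zrc >/dev/null 2>&1; then
    zrc_status="installed"
    zrc version
else
    zrc_status="missing"
    echo "zrc not found on PATH: check that the install ran in the active environment" >&2
fi
```

If the command is missing, a common cause is that the package was installed into a different Python or conda environment than the one currently active.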

Toolbox Usage

Downloads

Download benchmark datasets

Start by listing the available datasets with the command zrc datasets, then download the dataset you want with zrc datasets:pull <dataset-name>.

When listing datasets, the Installed column specifies whether the dataset has been downloaded.

Datasets are installed in the $APP_DIR/datasets folder. To delete a dataset, use the command zrc datasets:rm <dataset-name>.
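A typical dataset session might look like the sketch below. `DATASET_NAME` is a placeholder variable introduced here, not a real dataset name. The final step assumes that APP_DIR is set as an environment variable in your shell, which this page implies but does not state.

```shell
# DATASET_NAME is a placeholder: replace it with a name printed by `zrc datasets`.
DATASET_NAME="<dataset-name>"

if command -v zrc >/dev/null 2>&1; then
    zrc datasets                         # list datasets and their Installed status
    zrc datasets:pull "$DATASET_NAME"    # download the chosen dataset
    # Assumption: APP_DIR is exported in this shell; skip the listing otherwise.
    if [ -n "${APP_DIR:-}" ]; then
        ls "$APP_DIR/datasets"
    fi
else
    echo "zrc not found on PATH" >&2
fi
```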

Download model checkpoints

The command zrc checkpoints allows you to list available checkpoints.

You can then download each set by typing zrc checkpoints:pull <name>.

Checkpoints are installed in the $APP_DIR/checkpoints folder.

To delete a set of checkpoints, use the command zrc checkpoints:rm <name>.

Download samples

The command zrc samples allows you to list the available samples.

You can then download each sample by typing zrc samples:pull <name>.

Samples are installed in the $APP_DIR/samples folder.

To delete a sample from your system you can use zrc samples:rm <name> or just delete the relevant folder manually.
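Since datasets, checkpoints, and samples each live in their own folder under $APP_DIR, you can inspect what is on disk before deleting anything by hand. The sketch below assumes APP_DIR is set in your shell. `list_resources` is a hypothetical helper defined here, not part of the toolbox.

```shell
# list_resources is a hypothetical helper: it prints the contents of the
# three resource folders found under the directory given as its argument.
list_resources() {
    base="$1"
    for kind in datasets checkpoints samples; do
        if [ -d "$base/$kind" ]; then
            echo "$kind:"
            ls "$base/$kind"
        else
            echo "$kind: (folder not found)"
        fi
    done
}

# Assumption: APP_DIR is exported in this shell.
if [ -n "${APP_DIR:-}" ]; then
    list_resources "$APP_DIR"
else
    echo "APP_DIR is not set" >&2
fi
```

Listing first makes it easier to match a folder to the name expected by the corresponding rm command.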

Benchmarks

You can list available benchmarks by typing the zrc benchmarks command.

To create a submission, follow the instructions on each of our task pages: Task1, Task2, Task3, Task4.

Once the submission has been created, you can run the benchmark on it with the following command:

Some benchmarks are split into sub-tasks. You can run a subset of them using the following syntax:

With this syntax we run the sLM21 benchmark on our submission, but only for the lexical and syntactic tasks; the semantic task is omitted.

In the same way, we can run only on the dev set (or the test set):

This runs the same tasks as before, but only on the dev set of the benchmark.

Submission Format

Each benchmark has a specific format that a submission has to follow. You can initialize a submission directory with the following syntax: zrc submission:init <name> /path/to/submission. This creates a set of folders laid out according to the selected benchmark name. For more detailed information on each benchmark, see the corresponding Task page.

Once all submission files have been created, you can validate your submission to check that everything is in order. To do so, use the following syntax: zrc submission:verify <name> /path/to/submission. This verifies that all files are set up correctly, or shows informative errors if not.

Note: During benchmark evaluation, the default behavior is to run validation on your submission. You can deactivate this by adding the option --skip-verification.
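The init and verify steps above can be chained as in the sketch below. `BENCHMARK` and `SUBMISSION_DIR` are placeholder variables introduced for illustration. Replace them with a benchmark name listed by `zrc benchmarks` and a real path.

```shell
# Placeholders for illustration: substitute a real benchmark name and path.
BENCHMARK="<name>"
SUBMISSION_DIR="/path/to/submission"

if command -v zrc >/dev/null 2>&1; then
    # Create the folder layout expected by the chosen benchmark.
    zrc submission:init "$BENCHMARK" "$SUBMISSION_DIR"
    # Add your submission files here, following the relevant Task page.
    # Then check the files before running the benchmark.
    zrc submission:verify "$BENCHMARK" "$SUBMISSION_DIR"
else
    echo "zrc not found on PATH" >&2
fi
```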

Upload BETA

The upload functionality lets you upload scores or raw submissions (preferably both) to our platform. This helps us keep track of new models and publications that use the benchmarks, and we compile all the scores into our leaderboards.