How to participate

Choosing a train dataset

You can train on any of the standard ZeroSpeech Task 1 train sets listed on the Benchmarks and Datasets page, separately or combined. You can also train on external datasets, as long as they are publicly available. During the submission process, you will be asked to specify which dataset was used to train your system, providing a link (or publication reference) if it is an external dataset.

The provided datasets can be downloaded using our toolkit or directly using the URLs provided in our repository.

Using our toolkit

We recommend installing and using our toolkit to manage, evaluate & upload your submissions. The toolkit is a Python package containing evaluation scripts, scripts to download datasets & other relevant files, as well as scripts to facilitate uploading results to the leaderboards. You can find instructions on how to download and use our toolkit here.

Submission Preparation

Each benchmark requires a specific set of files to be prepared.

To facilitate this, you can use the zrc submission:init <name> <location> command from the toolkit to create an empty submission template folder, where <name> is the name of the benchmark (abx15, abx17, abxLS) and <location> is the path where the directory will be created.

meta.yaml

This file contains meta information about the author and how this submission was created.

Example:

  model_id: null
  gpu_budget: 60
  system_description: "CPC-big (trained on librispeech 960), kmeans (trained on librispeech 100), LSTM. See for more details."
  train_set: "librispeech 960, librispeech 100"
  author_label: "Nguyen et al."
  authors: "Nguyen, T., Seyssel, M., Rozé, P., Rivière, M., Kharitonov, E., Baevski, A., Dunbar, E. & Dupoux, E."
  paper_title: "The zero resource speech benchmark 2021: Metrics and baselines for unsupervised spoken language modeling."
  paper_url: ""
  publication_year: 2021
  institution: "EHESS, ENS, PSL Research University, CNRS and Inria"
  team: "CoML Team"
  code_url: ""
  open_source: true

To Note

While most of the information in meta.yaml is optional, we would appreciate it if you took the time to fill it in, as it allows us to verify submissions and keep track of all the systems that use our benchmarks.

We would also appreciate it if you made your code open source and provided a link to it, although we understand that this is not always possible.

params.yaml

This file contains various parameters that can override the defaults of each benchmark.

cuda: <bool> Specifies whether GPU acceleration is enabled.
distance_mode: <str> The metric to use for the ABX distance; must be one of the following values:
 'euclidean', 'cosine', 'kl' or 'kl_symmetric'. The default is 'cosine'.
 **WARNING**: the 'cosine' metric here refers to an angular distance, as in the usual ABX evaluation.
feature_size: <float> Shift (in s) between two feature frames.
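Putting these together, a minimal params.yaml might look like the fragment below (the values are illustrative, not recommendations; 'cosine' is the default metric in any case):

```yaml
cuda: true               # use GPU acceleration if available
distance_mode: "cosine"  # one of: euclidean, cosine, kl, kl_symmetric
feature_size: 0.01       # frame shift in seconds (here, 10 ms)
```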

Model outputs

For the ABX benchmarks, each file in the dataset must have an associated 2D array.

  • These need to be organized in a similar fashion as in the original dataset.
  • All numbers in the array are encoded as floats.
  • The number of columns (the feature dimension) must be constant across all the files.
  • The number of rows depends on the speech sample duration.
  • The frame shift (the shift between two successive frames) must be given in params.yaml along with the metric used for evaluation of those features.
  • Each array must contain at least 2 frames (i.e. at least two rows).
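The requirements above can be sketched with a short NumPy snippet. The file name, shapes, and output directory below are purely illustrative; the point is the shape, dtype, and minimum-frame constraints:

```python
import os
import tempfile

import numpy as np

# One 2-D float array per audio file in the dataset:
#   rows = frames (depends on the sample duration, at least 2)
#   cols = feature dimension (must be identical across all files)
n_frames, feat_dim = 50, 32  # illustrative values
features = np.random.rand(n_frames, feat_dim).astype(np.float32)

assert features.ndim == 2         # a 2-D array
assert features.shape[0] >= 2     # at least two frames (rows)
assert features.dtype == np.float32  # numbers encoded as floats

# Save one .npy file per dataset file (hypothetical name and location)
out_dir = tempfile.mkdtemp()
np.save(os.path.join(out_dir, "utterance_0001.npy"), features)
```

Remember that the frame shift used to produce these arrays must be declared as feature_size in params.yaml, along with the distance metric.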
Structure of files for each ABX benchmark:
  • abxLS
dev-clean/[*.npy, *.txt]
dev-other/[*.npy, *.txt]
test-clean/[*.npy, *.txt]
test-other/[*.npy, *.txt]
  • abx17
  1s/[*.npy, *.txt]
  10s/[*.npy, *.txt]
  120s/[*.npy, *.txt]
  • abx15
scores/[*.npy, *.txt]
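As a sketch of the abxLS layout listed above: the toolkit's zrc submission:init command creates this skeleton for you, but if you assemble a submission by hand it could look like the following (the root location is hypothetical, and meta.yaml / params.yaml are left empty here):

```python
import os
import tempfile

# Hypothetical submission root; in practice this is the <location>
# you would pass to `zrc submission:init abxLS <location>`.
root = os.path.join(tempfile.mkdtemp(), "my_submission")

# One sub-directory per abxLS subset, to be filled with *.npy feature files
for subset in ("dev-clean", "dev-other", "test-clean", "test-other"):
    os.makedirs(os.path.join(root, subset))

# Placeholder metadata and parameter files (contents omitted in this sketch)
for meta_file in ("meta.yaml", "params.yaml"):
    open(os.path.join(root, meta_file), "w").close()
```

The abx17 and abx15 layouts follow the same pattern with their own sub-directory names (1s/10s/120s and scores, respectively).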

Running the evaluation

Once the submission has been successfully created, you can run the evaluation. Depending on your benchmark choice, use one of the following commands:

  • zrc benchmarks:run abx15 </path/to/submission> -o scores_dir
  • zrc benchmarks:run abx17 </path/to/submission> -o scores_dir
  • zrc benchmarks:run abxLS </path/to/submission> -o scores_dir

Your results are written to the scores_dir directory.


  • A validation step runs before each evaluation; to skip it, use the option --skip-validation.
  • If the dataset has subsets, you can run the evaluation on a selected subset only, e.g. --sets dev.
  • If the benchmark has multiple sub-tasks, you can run your benchmark on a selected sub-task using --task clean.

You can see the list of the available benchmarks by using zrc benchmarks.

Uploading Results (BETA)

We appreciate it if you upload your results so that we can compile them into our leaderboards; this helps us in a couple of ways:

  • It allows us to follow new systems that are evaluated on our benchmarks and compare them.
  • It also helps us with creating a central place where all systems trying to solve unsupervised speech processing can be indexed.
  • It shows that interest in our benchmarks is still active and motivates us to create more.

To submit your results, you need to create an account on our website (if you do not already have one).

Using the toolkit, create a local session with zrc user:login and provide your username & password.

Once this is done, you can upload using the following command: zrc upload:scores <score_dir> <submission_dir>

Multiple Submissions

If your system can be used for multiple tasks (for example, Task 1 and Task 3, or Task 1 and Task 4), you are strongly encouraged to make a submission to all the tasks you can. To link the submissions of a single system, use the same model_id in your meta.yaml; it is auto-generated after your first submission.