Make sure you have the following installed:
- Node.js (includes npm)
- Python 3.8+
- pip (Python package manager)
- TypeScript compiler
Install the TypeScript compiler globally with:

```shell
npm install -g typescript
```

And the required Python packages with:

```shell
pip install -r requirements.txt
```
torch is included in requirements.txt, but you may want to install torch separately from pytorch.org with the right CUDA version for your system.
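For example, pytorch.org publishes per-CUDA-version package indexes; at the time of writing, a CUDA 11.8 build can typically be installed like this (check pytorch.org for the exact command matching your CUDA version):

```shell
# Illustrative only — the index URL depends on your CUDA version.
pip install torch --index-url https://download.pytorch.org/whl/cu118
```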
The pipeline runs in several stages:

- Create labels using the labeller script (`data/preprocessed/labels/<diff>`)
- Run the spectrogram pipeline:
  - Build 3 log-mel spectrograms for each frame, one per window size
  - Create labels for each frame using the labels from step 1
  - Extract windows from the spectrograms based on the labels
  - Export the dataset in batches to `data/preprocessed/exports/<my_data>`
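As a rough illustration of the multi-window spectrogram step, here is a numpy-only sketch. The actual pipeline builds log-mel spectrograms (see `data/src/spectrogram_utils.py`); the window sizes, hop length, and `stft_mag` helper below are illustrative assumptions, not the repo's code:

```python
import numpy as np

def stft_mag(y, n_fft, hop):
    # Slide a Hann window over the signal and take the magnitude
    # of the real FFT of each frame.
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(y[s:s + n_fft] * window))
              for s in range(0, len(y) - n_fft + 1, hop)]
    return np.array(frames)

sr = 22050
y = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1-second test tone

# One spectrogram per window size, sharing a hop length so the
# frames line up across resolutions; log-compress the magnitudes.
hop = 256
specs = [np.log1p(stft_mag(y, n_fft, hop)) for n_fft in (512, 1024, 2048)]
```

Short windows give better time resolution for sharp onsets, long windows better frequency resolution, which is why one frame gets all three views.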
A more detailed explanation can be found here (WIP).
- Various track sets can be found at TJA Portal.
- Track folders can be nested, but make sure that any folder containing a `.tja` file also has an audio file.
- Most audio types should be supported. See `data/src/spectrogram_utils.py` for supported audio types.
Supported flags:

| Flag | Type | Description |
|---|---|---|
| `-d` | Required | Course difficulty. See supported difficulties. |
| `-f` | Required | Output directory name under `<data>/preprocessed/exports/` |
| `-n` | Required | Note types (comma-separated, e.g. `don,ka`). See `data/src/spectrogram_utils.py` for supported note types. |
| `-b` | Optional | Batch size; songs per dataset file (default: 50) |
| `-c` | Optional | Clears the labels directory for the specified difficulty. |
| `-r` | Optional | Fraction of total samples that are background, as a decimal (default: 0.5, i.e. 50%). |
| `-H` | Optional | Hard-negative radius in frames. Negatives are sampled within this many frames of a note event (default: 60, ~0.7 s). Set to -1 to disable. |
| `-W` | Optional | Onset-weight radius in frames. Background frames within this radius of a note onset get linearly reduced loss weight (weight = dist / radius). Positive frames always get weight 1.0 (default: 4). Set to 0 to disable. |
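The `-W` weighting described above can be sketched as follows. This is an illustrative reimplementation, not the pipeline's actual code, and `onset_weights` is a hypothetical helper name:

```python
import numpy as np

def onset_weights(labels, radius=4):
    # labels: 1 for frames containing a note onset, 0 for background.
    # Positive frames always get weight 1.0; background frames within
    # `radius` frames of an onset get weight dist / radius.
    labels = np.asarray(labels)
    weights = np.ones(len(labels), dtype=float)
    onsets = np.flatnonzero(labels)
    if radius <= 0 or onsets.size == 0:
        return weights
    idx = np.arange(len(labels))
    # Distance from every frame to its nearest onset.
    dist = np.abs(idx[:, None] - onsets[None, :]).min(axis=1)
    near_bg = (labels == 0) & (dist < radius)
    weights[near_bg] = dist[near_bg] / radius
    return weights
```

The effect is that background frames right next to an onset, which are acoustically almost identical to the onset itself, contribute little to the loss instead of confusing the model.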
Example:

```shell
./data/src/build_dataset.sh -d easy -f my_dataset -n don,ka -b 50 -r 0.33
```

Example:
```python
import numpy as np

data = np.load("../preprocessed/exports/my_dataset/batch_1.npz")
X, y, W = data["X"], data["y"], data["W"]
print(X.shape)  # Spectrogram windows
print(y.shape)  # Spectrogram window labels
print(W.shape)  # Onset weights
```

Note that there are multiple batch files per dataset. Load them in individually while training.
Trains a CNN on the preprocessed .npz batch files produced by the data pipeline.
```shell
python model/training.py \
    --data_dir data/preprocessed/exports/my_dataset \
    --out models/my_model.pth
```

| Argument | Required | Default | Description |
|---|---|---|---|
| `--data_dir` | Yes | — | Directory containing `batch_*.npz` files and `metadata.json` |
| `--out` | Yes | — | Path to save the trained model `.pth` file |
| `--epochs` | No | 100 | Number of training epochs |
| `--lr` | No | 0.001 | Learning rate |
| `--batch_size` | No | 256 | Mini-batch size |
| `--split_prop` | No | 0.1 | Fraction of data held out for validation |
| `--dropout` | No | 0.5 | Dropout rate on fully connected layers |
| `--seed` | No | 1 | Random seed |
| `--patience` | No | 10 | Early stopping patience in epochs |
| `--class_weights` | No | off | Weight cross-entropy loss by inverse class frequency to counter class imbalance |
| `--onset_weights` | No | off | Use per-sample onset weights from the dataset during training |
Runs a trained model on an audio file and outputs a playable .tja chart.
```shell
python model/inference.py \
    --audio path/to/song.mp3 \
    --bpm 140 \
    --model models/my_model.pth \
    --out path/to/output.tja
```

| Argument | Required | Default | Description |
|---|---|---|---|
| `--audio` | Yes | — | Path to input audio file |
| `--bpm` | Yes | — | BPM of the song. Songs with mid-song BPM changes will produce inaccurate charts. |
| `--model` | Yes | — | Path to trained model `.pth` file |
| `--out` | Yes | — | Path to write the output `.tja` file |
| `--title` | No | `"Untitled"` | Song title written into the TJA header |
| `--offset` | No | 0.0 | Seconds of silence before the music starts in the audio file |
| `--threshold` | No | 0.5 | Minimum model confidence to count as a note (0–1). Increase to reduce false positives; decrease to catch more notes. |