GTDB-Tk v2.6.1: r214 hash mismatch + r226 download results in invalid archive #694

@Lunatic5202

Description

Environment

  • Installed via pip (include the output of pip list)
  • Using a conda environment (include the output of conda list && conda list --revisions)
  • Using a Docker container (include the IMAGE ID of the container)

GTDB-Tk version:

gtdbtk --version
gtdbtk: version 2.6.1

Server information

CPU: AMD Ryzen 5 8000 Series
RAM: 16 GB
OS: Arch Linux

Debugging information

  • gtdbtk.log has been included (not generated due to failure during setup)
  • Genomes have been included (not applicable yet)

Additional comments

r214 integrity check output

[2026-04-05 00:56:16] INFO: GTDB-Tk v2.6.1
[2026-04-05 00:56:16] INFO: gtdbtk check_install
[2026-04-05 00:56:16] INFO: Using GTDB-Tk reference data version r214: /home/lunatic/gtdb_db/release214
[2026-04-05 00:56:16] WARNING: You are not using the reference data intended for this release: r226
[2026-04-05 00:56:16] INFO: Running install verification
[2026-04-05 00:56:16] INFO: Checking that all third-party software are on the system path:
[2026-04-05 00:56:16] INFO:          |-- FastTree         OK
[2026-04-05 00:56:16] INFO:          |-- FastTreeMP       OK
[2026-04-05 00:56:16] INFO:          |-- guppy            OK
[2026-04-05 00:56:16] INFO:          |-- hmmalign         OK
[2026-04-05 00:56:16] INFO:          |-- hmmsearch        OK
[2026-04-05 00:56:16] INFO:          |-- pplacer          OK
[2026-04-05 00:56:16] INFO:          |-- prodigal         OK
[2026-04-05 00:56:16] INFO:          |-- skani            OK
[2026-04-05 00:56:16] INFO: Checking integrity of reference package: /home/lunatic/gtdb_db/release214

[2026-04-05 00:56:17] INFO:          |-- pplacer          HASH MISMATCH 6786e9fc16b31db7d6eaaa9f8cfa87a8a4974434
[2026-04-05 00:56:17] INFO:          |-- masks            HASH MISMATCH 8d5a2139feabbb70789c62155f3761d2aeed1601
[2026-04-05 00:56:17] INFO:          |-- markers          OK
[2026-04-05 00:56:17] INFO:          |-- radii            HASH MISMATCH 4753acc920001a1400788ee89cb4632900449055
[2026-04-05 00:56:19] INFO:          |-- msa              HASH MISMATCH 75df495678a121497e14346b453caf42f4b03922
[2026-04-05 00:56:19] INFO:          |-- metadata         HASH MISMATCH a089cc36bf79a40c7506019accc5f93e940d9fed
[2026-04-05 00:56:19] INFO:          |-- taxonomy         HASH MISMATCH 89b12cf8106f326887599dcb30ef94ebba142035
[2026-04-05 00:56:19] INFO:          |-- skani            HASH MISMATCH da39a3ee5e6b4b0d3255bfef95601890afd80709
[2026-04-05 00:56:19] INFO:          |-- mrca_red         HASH MISMATCH c24a2f48bb0c1df38f92a8f526aa846f596c94c6

[2026-04-05 00:56:19] ERROR: Unexpected files were seen, or the reference package is corrupt.
[2026-04-05 00:56:19] ERROR: Controlled exit resulting from an unrecoverable error or warning.
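
One detail in the log above may help triage. The 40-hex-digit values are the length of SHA-1 digests (that GTDB-Tk uses SHA-1 here is an assumption based on length alone), and the value reported for skani, da39a3ee5e6b4b0d3255bfef95601890afd80709, is the SHA-1 of empty input. That could mean the skani directory in release214 is empty or missing rather than corrupted, though the log does not confirm this. The sketch below, with hypothetical helper names, parses the reference-package part of a check_install log and flags any component whose reported digest equals the empty-input SHA-1.

```python
import hashlib
import re

# SHA-1 of zero bytes; a component reporting this digest hashed no data at all.
EMPTY_SHA1 = hashlib.sha1(b"").hexdigest()

# Matches lines such as "|-- msa   HASH MISMATCH <40 hex digits>" or "|-- markers   OK".
LINE_RE = re.compile(r"\|--\s+(\S+)\s+(OK|HASH MISMATCH)(?:\s+([0-9a-f]{40}))?")


def summarize_reference_check(log_text):
    """Return {component: (status, digest, hashed_nothing)} for the
    reference-package section of a check_install log.

    Lines before "Checking integrity of reference package" are skipped,
    because the third-party software section reuses names such as
    pplacer and skani.
    """
    results = {}
    in_reference_section = False
    for line in log_text.splitlines():
        if "Checking integrity of reference package" in line:
            in_reference_section = True
            continue
        if not in_reference_section:
            continue
        match = LINE_RE.search(line)
        if match:
            name, status, digest = match.groups()
            results[name] = (status, digest, digest == EMPTY_SHA1)
    return results
```

Passing the log text above to summarize_reference_check would mark only skani as having hashed nothing; the other mismatches carry ordinary digests and so point at content that differs from what v2.6.1 expects.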

Current directory state

~/gtdb_db contains:
- gtdbtk_r214_data.tar.gz
- gtdbtk_r226_data.tar.gz
- gtdbtk_r226_data.tar.gz.aria2__temp
- release214/

Additional issues

  • r214 database appears corrupted despite successful download
  • r226 download results in an invalid archive (the file command reports "data" rather than gzip compressed data)
  • GTDB-Tk v2.6.1 requires r220/r226, making r214 unusable without downgrading
  • Difficulty obtaining a valid r226 dataset
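
The archive symptoms listed above can be checked without extracting anything. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and treating a leftover file whose name starts with the archive name plus ".aria2" as a sign of an unfinished download is a guess, not documented aria2 behaviour. It tests for the gzip magic bytes (0x1f 0x8b), reads the compressed stream to the end to catch truncation or corruption, and lists any such leftover files next to the archive.

```python
import gzip
import zlib
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def looks_like_gzip(path):
    """True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def gzip_stream_is_complete(path, chunk_size=1 << 20):
    """Decompress the whole file in chunks and discard the output.

    Returns False if the stream is not gzip, is truncated, or fails
    its checksum. On a reference archive of this size, expect this to
    take a long time.
    """
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def leftover_download_files(archive):
    """Names of sibling files that start with '<archive name>.aria2'."""
    archive = Path(archive)
    prefix = archive.name + ".aria2"
    return sorted(
        p.name for p in archive.parent.iterdir() if p.name.startswith(prefix)
    )
```

For example, calling looks_like_gzip on ~/gtdb_db/gtdbtk_r226_data.tar.gz would be the quick first check, with gzip_stream_is_complete reserved for archives that pass it.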

Main blocker

Unable to obtain a valid, extractable GTDB reference database (either r214 or r226), preventing successful gtdbtk check_install.

(gtdbtk214) [lunatic@archlinux gtdb_db]$ gtdbtk --version
gtdbtk: version 2.6.1 Copyright 2017 Pierre-Alain Chaumeil, Aaron Mussig and Donovan Parks
(gtdbtk214) [lunatic@archlinux gtdb_db]$ ls ~/gtdb_db
gtdbtk_r214_data.tar.gz gtdbtk_r226_data.tar.gz.aria2__temp
gtdbtk_r226_data.tar.gz release214
(gtdbtk214) [lunatic@archlinux gtdb_db]tar -xvzf gtdbtk_r226_data.tar.gz
bash: syntax error near unexpected token `[lunatic@archlinux'
bash: gtdbtk:: command not found
bash: gtdbtk_r214_data.tar.gz: command not found
bash: gtdbtk_r226_data.tar.gz: command not found

Labels: error (Help required for a GTDB-Tk error.)
