rf(schema): Make participants/sessions.tsv content available, add glob() and zip() functions by effigies · Pull Request #2359 · bids-standard/bids-specification

effigies · 2026-03-06T03:41:35Z

BEP36 proposes more extensive checks comparing the contents of /participants.tsv, /sessions.tsv and /phenotype/*.tsv to one another and to the session directories. This PR introduces schema changes to facilitate this.

We require access to at least sessions.tsv contents, and we (@rwblair and I) propose dataset.sessions_tsv as an object mapping column headers onto column contents.
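To illustrate the shape being proposed, here is a minimal Python sketch of a table held as a mapping from column header to column contents. The `load_tsv_columns` helper and the sample rows are hypothetical and only for illustration; this is not the validator's implementation.

```python
import csv
import io


def load_tsv_columns(text: str) -> dict[str, list[str]]:
    """Parse TSV text into a mapping of column header -> list of cell values."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    columns: dict[str, list[str]] = {name: [] for name in header}
    for row in reader:
        for name, value in zip(header, row):
            columns[name].append(value)
    return columns


# Hypothetical sessions.tsv content, made up for this example.
SESSIONS_TSV = (
    "participant_id\tsession_id\n"
    "sub-01\tses-1\n"
    "sub-01\tses-2\n"
    "sub-02\tses-1\n"
)

# Stand-in for what dataset.sessions_tsv could look like in the context.
sessions_tsv = load_tsv_columns(SESSIONS_TSV)
print(sessions_tsv["session_id"])
```

With this layout, a whole column is one attribute lookup away, which is what an expression like dataset.sessions_tsv.session_id relies on.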

For consistency, and because the dataset.subjects context object is an awkward way of accessing the various lists of subject IDs, we propose to make dataset.participants_tsv along the same lines, with dataset.participants_tsv.participant_id rendering obsolete dataset.subjects.participant_id.

We also need to collate rows across columns. The natural operation in Python is zip(), so we propose that here. It is not yet used in any checks.
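As a sketch of what collating rows buys us, here is the Python builtin zip() applied to a column-oriented table. The table contents and the uniqueness check are hypothetical examples; the schema's zip() may differ in detail from Python's.

```python
# Hypothetical column-oriented table, shaped like the proposed dataset.sessions_tsv.
sessions_tsv = {
    "participant_id": ["sub-01", "sub-01", "sub-02"],
    "session_id": ["ses-1", "ses-2", "ses-1"],
}

# zip() turns parallel columns into one tuple per row.
rows = list(zip(sessions_tsv["participant_id"], sessions_tsv["session_id"]))

# Example row-level check: each (participant, session) pair appears only once.
pairs_are_unique = len(rows) == len(set(rows))
print(rows, pairs_are_unique)
```

A check like this cannot be written against either column alone, since ses-1 legitimately repeats across participants.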

With dataset.subjects.sub_dirs being the only remaining content of dataset.subjects, a more general way of achieving the same goal was desired. A glob() function seems to fit the bill, which can be used to create lists of files in selectors/checks. This PR demonstrates the use of glob() and dataset.participants_tsv to obsolete dataset.subjects altogether.
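A rough Python analogue of replacing dataset.subjects.sub_dirs with a glob, and comparing the result against participants_tsv.participant_id, might look like the following. `Path.glob` is only a stand-in here: the signature and return type of the schema's glob() are not spelled out in this description, and the directory layout and table contents are made up.

```python
import tempfile
from pathlib import Path

# Hypothetical participants.tsv column, shaped like dataset.participants_tsv.
participants_tsv = {"participant_id": ["sub-01", "sub-02", "sub-03"]}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a tiny dataset with only two of the three listed subjects.
    for name in ("sub-01", "sub-02"):
        (root / name).mkdir()

    # Stand-in for glob(): list subject directories at the dataset root.
    sub_dirs = sorted(p.name for p in root.glob("sub-*") if p.is_dir())

listed = set(participants_tsv["participant_id"])
missing_dirs = sorted(listed - set(sub_dirs))    # listed, but no directory
unlisted_dirs = sorted(set(sub_dirs) - listed)   # directory, but not listed
print(sub_dirs, missing_dirs, unlisted_dirs)
```

The two set differences correspond to the two directions of the participants.tsv consistency checks that previously needed dataset.subjects.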

Would be interested in people's thoughts on these changes. Open to any alternatives.

Additional thoughts on refactoring the validation context

I think the validation context is due for a bit of a rethink after a couple of years, and getting rid of dataset.subjects is a good idea regardless of what happens to BEP36, but I'm not married to this approach. I think we should aim for consistency and get rid of subject.sessions (and maybe subject), but that hasn't been thought through yet.

Other things in the context that are as-yet unused:

- dataset.tree - The exists() and here-introduced glob() functions do what we might imagine this object should do. Unclear how it could be used without functions or at least comprehensions and predicates.
- dataset.ignored - There may be checks to write using this, but until we try, it's unclear if this is the right interface.

Things that might be reconsidered:

Both the TypeScript and the nascent Python context implementations have a file object that represents the file. Perhaps it would make sense to collate the file path and similar general filesystem-level attributes into file.path and so on.
- If so, would it be a good idea to move other fields underneath file? file.nifti_header or file.columns make sense. That said, I'm inclined to leave file to just contain the fields that do not need more than a stat to establish. The principle could be: dataset is populated at validation start, file is populated when reading the filesystem metadata, and nifti_header and so on require additional opens or computations to populate.
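
To make the stat-versus-open split concrete, here is one possible Python sketch. The class names (`FileInfo`, `FileContext`), the `size` field, and the `opens` counter are all hypothetical; neither existing context implementation is claimed to look like this.

```python
import tempfile
from dataclasses import dataclass
from functools import cached_property
from pathlib import Path


@dataclass(frozen=True)
class FileInfo:
    """Fields that need nothing more than a stat() to establish."""
    path: str
    size: int


class FileContext:
    def __init__(self, root: Path, relpath: str) -> None:
        self._abspath = root / relpath
        # Populated eagerly, from filesystem metadata only.
        self.file = FileInfo(path="/" + relpath, size=self._abspath.stat().st_size)
        self.opens = 0  # counts how often the file contents are actually read

    @cached_property
    def columns(self) -> dict[str, list[str]]:
        # Populated lazily: requires opening and parsing the file.
        self.opens += 1
        lines = self._abspath.read_text().splitlines()
        header = lines[0].split("\t")
        cells = [line.split("\t") for line in lines[1:]]
        return {name: [row[i] for row in cells] for i, name in enumerate(header)}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "participants.tsv").write_text("participant_id\tage\nsub-01\t34\n")
    ctx = FileContext(root, "participants.tsv")
    opens_before = ctx.opens   # nothing has been read yet
    first = ctx.columns        # triggers the single read
    second = ctx.columns       # served from the cache
    opens_after = ctx.opens
```

Here file is cheap to build for every file in the tree, while columns (and, by analogy, nifti_header) only costs an open when a rule actually touches it.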

cc @ericearl @surchs

ericearl · 2026-03-10T15:24:23Z

I am always impressed by @effigies's and @rwblair's attention to detail. I read through your comment above and all of the edits, and while I may be biased I think the edits all look great and should enable the things we want to do! Nice work!

rwblair · 2026-03-18T18:03:22Z

Related validator PR bids-standard/bids-validator#366

ericearl mentioned this pull request Mar 9, 2026

[ENH] BEP036 - Phenotypic Data Guidelines #2123

Open

effigies added 6 commits March 23, 2026 14:24

feat(schema): Add glob and zip functions to expression language

9dc6637

feat(schema): Add dataset.{participants,sessions}_tsv to context

37df2e4

rf(schema): Rewrite rules to use new context, glob()

878bc4a

rf(schema): Remove unused dataset.subjects from context

0708e30

test(schema): Add expression tests

4b3fdef

test(bst): Drop Subjects class from context types

46f2777

effigies force-pushed the rf/tables branch from 3a9fe0c to 46f2777 Compare March 23, 2026 18:25

effigies mentioned this pull request Mar 23, 2026

BEP36 rules surchs/bids-specification#6

Merged

effigies wants to merge 6 commits into bids-standard:master from effigies:rf/tables