Fix channel dimension mismatch in DiscriminatorSTFT #98

Open
Mr-Neutr0n wants to merge 1 commit into facebookresearch:main from Mr-Neutr0n:fix-msstftd-channel-mismatch

@Mr-Neutr0n

Summary

  • Fix input channel dimension for the second convolution layer in DiscriminatorSTFT

Problem

When filters_scale > 1, there is a dimension mismatch between convolution layers:

  1. The first convolution outputs self.filters channels
  2. The second convolution expected min(filters_scale * self.filters, max_filters) input channels

For example, with filters=64 and filters_scale=2:

  • First conv outputs: 64 channels
  • Second conv expects: 128 channels (mismatch!)

This causes a runtime dimension error.
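
The mismatch can be checked on paper, without PyTorch. The sketch below tabulates the (input, output) channel counts of each convolution using the pre-fix expression for in_chs. It assumes that each later layer outputs min(filters_scale ** (i + 1) * filters, max_filters) channels. The values of max_filters, spec_channels, and n_scaled_layers are hypothetical placeholders that this PR does not state, and both function names are illustrative only.

```python
def original_channel_plan(filters, filters_scale, max_filters=1024,
                          spec_channels=2, n_scaled_layers=3):
    """Return (in_chs, out_chs) per conv, using the pre-fix in_chs expression."""
    # First convolution: outputs `filters` channels.
    plan = [(spec_channels, filters)]
    # Pre-fix expression for the second convolution's input channels.
    in_chs = min(filters_scale * filters, max_filters)
    for i in range(n_scaled_layers):
        # Assumed growth rule for the output channels of later layers.
        out_chs = min(filters_scale ** (i + 1) * filters, max_filters)
        plan.append((in_chs, out_chs))
        in_chs = out_chs
    return plan


def first_mismatch(plan):
    """Index of the first layer whose in_chs differs from the previous out_chs."""
    for idx in range(1, len(plan)):
        if plan[idx - 1][1] != plan[idx][0]:
            return idx
    return None


plan = original_channel_plan(filters=64, filters_scale=2)
print(plan)                  # [(2, 64), (128, 128), (128, 256), (256, 512)]
print(first_mismatch(plan))  # 1: the second conv expects 128 but receives 64
```

With filters_scale=1 the same check returns None, which is consistent with the problem only appearing when filters_scale > 1.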

Solution

Changed line 70 from:

in_chs = min(filters_scale * self.filters, max_filters)

To:

in_chs = self.filters

Now the second convolution correctly expects self.filters input channels, matching the first convolution's output.
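
As a quick sanity check on the change, the following sketch evaluates both expressions for a few values of filters_scale. The helper name second_conv_in_chs and the value max_filters=1024 are assumptions made for illustration; only the two in_chs expressions come from this PR.

```python
def second_conv_in_chs(filters, filters_scale, max_filters=1024):
    """Input channels the second conv expects, before and after the change."""
    return {
        "before": min(filters_scale * filters, max_filters),  # old expression
        "after": filters,                                     # new expression
    }


filters = 64  # the first convolution outputs this many channels
for scale in (1, 2, 4):
    chs = second_conv_in_chs(filters, scale)
    before_ok = chs["before"] == filters
    after_ok = chs["after"] == filters
    print(f"filters_scale={scale}: before={chs['before']} (match={before_ok}), "
          f"after={chs['after']} (match={after_ok})")
```

The two expressions agree at filters_scale=1 (as long as filters does not exceed max_filters), which would explain why a configuration with a scale of 1 never hits the mismatch.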

Test plan

  • Construct DiscriminatorSTFT(filters=64, filters_scale=2) without dimension error
  • Run forward pass to verify channel dimensions are correct

Fixes #93

The input channels for the second convolution layer should match
the output channels of the first convolution (self.filters), not
the scaled value.

The first convolution outputs self.filters channels, so the second
convolution should expect that many input channels, not
filters_scale * self.filters which caused dimension mismatch when
filters_scale > 1.

Fixes facebookresearch#93
@meta-cla (bot) added the "CLA Signed" label on Feb 6, 2026
@Mr-Neutr0n
Author

Hi there, following up on this. The channel dimension mismatch in DiscriminatorSTFT breaks any configuration with filters_scale > 1, so it would be great to get a review when you get a chance.

@Mr-Neutr0n
Author

Friendly bump! Let me know if there's anything I should update or improve to help move this forward.

Development

Successfully merging this pull request may close these issues.

Channel dimension mismatch in encodec/msstftd.py
