The latest MLX adds a `mode` parameter to its quantization operations. In `mlx-lm` we record that mode in the model config so we know which type of quantization was used; see e.g. https://huggingface.co/mlx-community/InternVL3_5-1B-4bit/blob/main/config.json#L173. That extra field causes models to break with the examples here, since older loading code doesn't expect it.
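As a stopgap, one can strip the unexpected key before handing the config to older loading code. Below is a minimal sketch, assuming the settings live under a `quantization` key with the new `mode` field (as in the linked `config.json`); the function name `load_quantization_config` and the `"affine"` default are illustrative assumptions, not part of `mlx-lm`'s API:

```python
import json

def load_quantization_config(path):
    """Load a model config and drop the newer `mode` field so that
    code written before it existed does not choke on the extra key."""
    with open(path) as f:
        config = json.load(f)
    quant = config.get("quantization")
    if quant is not None:
        # Newer mlx-lm records which quantization scheme was used here.
        mode = quant.pop("mode", None)
        if mode not in (None, "affine"):
            # A non-default scheme needs real support, not just key
            # stripping, so fail loudly rather than mis-dequantize.
            raise ValueError(f"unsupported quantization mode: {mode!r}")
    return config

# Usage: config = load_quantization_config("config.json")
```

A longer-term fix is to pass the mode through rather than discard it, but that requires the examples here to understand each quantization scheme.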