
Conversation

adityavipradas commented May 31, 2025

Resolving issue #92

  • Training: do not build KV cache
  • Validation: build KV cache
  • Inference: build KV cache

Added a self.training argument in language_model.py to populate the KV cache during inference and validation, and to set it to [None] during training.

  • This keeps the code cleaner and easier to understand than adding additional arguments to the call functions (a sketch follows below).
  • However, using the self.training argument is not a perfect solution, as it will also populate the KV cache during validation; additional modifications would be needed to handle this.
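
A minimal sketch of how a self.training-gated cache could look inside a decoder block. The module and attention layer here are illustrative stand-ins, not the actual code in language_model.py:

```python
import torch
import torch.nn as nn


class DecoderBlock(nn.Module):
    """Toy block showing a KV cache gated on self.training (sketch only)."""

    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Cached context; only populated outside of training.
        self.kv_cache = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Training: no cache, attend over the full sequence every step.
            self.kv_cache = None
            out, _ = self.attn(x, x, x)
            return out

        # Eval / inference: extend the cached context with the new tokens and
        # attend against it. (For brevity the raw inputs are cached here rather
        # than projected keys/values.)
        context = x if self.kv_cache is None else torch.cat([self.kv_cache, x], dim=1)
        self.kv_cache = context.detach()
        out, _ = self.attn(x, context, context)
        return out
```

With this gating, model.train() clears the cache on the next forward pass, while model.eval() plus repeated incremental calls keeps appending to it.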

added self.training argument to build KV cache only during inference and evaluation
replaced self.training with torch.is_grad_enabled()
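
The second commit amounts to swapping the gate condition, roughly along these lines (illustrative, not the actual diff):

```python
import torch


def should_build_kv_cache_v1(module: torch.nn.Module) -> bool:
    # First commit: build the cache whenever the module is not in training mode.
    return not module.training


def should_build_kv_cache_v2() -> bool:
    # Second commit: build the cache whenever autograd is disabled
    # (e.g. inside a torch.no_grad() block).
    return not torch.is_grad_enabled()
```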
kashif (Collaborator) commented May 31, 2025

If self.training is true during evaluation, then perhaps we are doing something wrong by missing a call to model.eval()?

adityavipradas (Author)

model.eval() is right where it needs to be.

What we need is:

  • Training: do not build KV cache
  • Validation: do not build KV cache
  • Inference: build KV cache

If building the KV cache during validation is fine, we can use self.training or torch.is_grad_enabled(). Please let me know and I will make the modifications accordingly. Thank you.
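
A quick check of why neither flag can separate validation from inference on its own, assuming the usual model.eval() + torch.no_grad() validation loop; the use_cache argument in the trailing comment is hypothetical, not part of this PR:

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the language model

# Training: self.training is True and autograd is on -> no KV cache.
model.train()
assert model.training and torch.is_grad_enabled()

# Validation: a typical loop calls model.eval() and runs under torch.no_grad(),
# so both checks report exactly what they report during generation.
model.eval()
with torch.no_grad():
    assert not model.training and not torch.is_grad_enabled()

# Inference / generation: identical flags to validation, so neither check can
# tell the two apart. Separating them would need an explicit signal, e.g. a
# hypothetical `use_cache=True/False` argument threaded through forward().
```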

adityavipradas changed the title from "Feat: adding torch.is_grad_enabled() argument to implement KV cache only during inference" to "Feat: adding self.training argument to implement KV cache during validation and inference" on May 31, 2025
Both torch.is_grad_enabled() and self.training lead to the same KV cache building outcome.
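
This equivalence holds as long as evaluation runs inside torch.no_grad(); a small illustration of the one case where the two checks would diverge (evaluation without a no_grad context):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the language model
model.eval()

# Evaluation without torch.no_grad(): the module flag says "eval",
# but autograd is still on, so the two checks disagree here.
print(model.training)           # False
print(torch.is_grad_enabled())  # True

# Under torch.no_grad() (the usual validation / generation setup),
# both checks agree and gate the KV cache identically.
with torch.no_grad():
    print(model.training)           # False
    print(torch.is_grad_enabled())  # False
```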
