Add chronos2 encoder as another encoder option in OpenTSLMFlamingo#41
Add chronos2 encoder as another encoder option in OpenTSLMFlamingo#41liu-jc wants to merge 8 commits into
Conversation
|
Hi team @RealLast @max-rosenblattl @masquare , |
|
Thanks @liu-jc for adding this! Code looks great, just have two minor comments we may want to think about @ThomasKaar @masquare:
|
Thanks @RealLast for the comments. As we discussed, I will focus more on modularize wandb with a helper class and avoid making curriculum_learning.py even longer. For modularizing and adding encoders, we maybe delay to another PR later. For this one, we just simply add chronos2encoder. I think the current structure is not too bad. We have the class - "encoder/Chronos2Encoder.py |
ThomasKaar
left a comment
There was a problem hiding this comment.
@liu-jc thanks a lot for the amazing work here!!
given also the comment by @RealLast i would propose a slight different approach that keeps the "original" functionality of the paper intact:
let's create 2 new classes.
CurriculumTrainierPretrainedEncoder: inherits fromCurriculumTrainerand just extends the constructorOpenTSLMFlamingoPretrainedEncoder: inherits fromOpenTSLMFlamingoand extends the constructor (potentially also just by hardcoding chronos for now but here we could change the encoder pretty dynamically)
and add a new padding functionality to pad_and_apply_batch so we can just naively can call pad_and_apply_batch(..., padding = "left") and _initialize_model` add a new OpenTSLMFlamingoPretrainedEncoder-logic.
that way we might be able to keep additional effort for this PR really low but still make sure it is fully backwards compatible and extends the current framework.
| # print(type(model.vision_encoder), model.vision_encoder) | ||
| # print(dir(model.vision_encoder)) |
There was a problem hiding this comment.
| # print(type(model.vision_encoder), model.vision_encoder) | |
| # print(dir(model.vision_encoder)) |
| cross_attn_every_n_layers: int = 1, | ||
| decoder_layers_attr_name: str = None, | ||
| freeze_lm_embeddings: bool = False, | ||
| encoder_type: str = "chronos2", # "cnn" or "chronos2" |
There was a problem hiding this comment.
iiuc this will now change the behavior to chronos by default. this is not ideal since we will need change the behavior of curicullum_training.py and kind of hides the fact that we just chronos.
i would propose another solution: see comment above.
There was a problem hiding this comment.
Hi @ThomasKaar,
Thanks for the comment! I agree that we may not want to change the default behavior of curicullum_training.py. I can change this back.
But for CurriculumTrainierPretrainedEncoder, I feel that would be somewhat similar to the current implementation just another layer.
For adding a new padding functionality, I agree that it would be great to add padding arg.
Will work on this after ICML ddl.
| "stage5_ecg_cot", | ||
| ] | ||
|
|
||
| load_dotenv() |
There was a problem hiding this comment.
Can we add an .env.example file to the repo, so users know which environment variables the script expects? and then please add .env to the .gitignore.
| print('AMLT_BLOB_ROOT_DIR is not set, likely we are using development mode, using current directory') | ||
| self.base_dir = os.getcwd() | ||
| else: | ||
| self.base_dir = os.path.join(self.base_dir, "juncheng","OpenTSLM") |
There was a problem hiding this comment.
please remove personal paths
| "einops>=0.6", | ||
| "wfdb>=4.0", | ||
| "open-flamingo>=0.0.2", | ||
| "chronos-forecasting>=2.2.0" |
There was a problem hiding this comment.
if you update the dependencies, can you please also update the lockfile using uv lock or uv sync
|
Hi @masquare @ThomasKaar @RealLast, I updated the code. Basically, removed the personal path, example for .env to support wandb, and added the padding_side. |
Add chronos2 encoder in OpenTSLMFlamingo
♻️ Current situation & Problem
The previous encoder only use CNNTokenizer, which I believe that it's patchTST style encoder and training from scratch. This PR mainly tried to add chronos2 as the encoder and we can loaded the pretrained chronos2 to utilize their ability to better understand time series.
⚙️ Release Notes
📚 Documentation
I use in-line comments for describing the implementation, which should be clear.
✅ Testing
Test file in test/test_chronos2_encoder.py
Code of Conduct & Contributing Guidelines
By creating and submitting this pull request, you agree to follow our Code of Conduct and Contributing Guidelines: