Attention and control #4
|
can you resolve merge conflicts? |
I noticed that Mehran's attention code was merged into main (that's where the conflict came from), so I guess I can keep this attention implementation just in case we need it. I coded it because I wasn't able to solve a problem in the other attention implementation. |
|
@AI-ELka Can you help us determine whether or not that problem persists in the implementation that's currently on the main branch? I've forgotten the details of the issue. |
The main problem we had was that the loss remained constant (in a case where it should decrease), but after testing the code again now, this problem seems to be gone. So that problem seems to have been solved, but we now have an issue with an assertion error when using uniform initialization. |
No description provided.