@johnnyL7 will provide more details. When we collect metrics from a multi-agent training run, the results look different from what we see back in AnyLogic (AL). One theory is that, for multi-agent, we aren't adjusting for skipped agents. If there are 10 agents and 9 are skipped, we still divide by 10 to calculate the metric, rather than dividing by 1 (the one agent that actually performed an action).
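To make the 10-agent example concrete, here is a minimal sketch of the two ways of averaging. The function names and the `acted` flag are hypothetical and not taken from the Pathmind code, and it assumes a skipped agent simply contributes nothing to the sum.

```python
def mean_over_all_agents(values, acted):
    """Sum the metric for agents that acted, but divide by the total agent count."""
    total = sum(v for v, a in zip(values, acted) if a)
    return total / len(values)


def mean_over_active_agents(values, acted):
    """Sum the metric for agents that acted and divide by how many acted."""
    active = [v for v, a in zip(values, acted) if a]
    # Assumption: report 0.0 when every agent was skipped.
    return sum(active) / len(active) if active else 0.0


# 10 agents, 9 skipped; the one active agent reports a metric of 5.0.
values = [5.0] + [0.0] * 9
acted = [True] + [False] * 9

print(mean_over_all_agents(values, acted))     # divides by 10
print(mean_over_active_agents(values, acted))  # divides by 1
```

With these inputs the first version reports a tenth of what the second one does, which is the kind of gap the theory would predict.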
Replies: 2 comments
Yeah, that's right. It was done that way to keep things simple. Also keep in mind that we only take the final metric value at the end of each episode, whereas AnyLogic uses the metric at every step. That might contribute to the discrepancy as well. So the skipping part (which is a per-step thing) isn't accounted for in Pathmind at all.
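A rough sketch of why the end-of-episode value and a per-step view can disagree, even before skipping comes into it. The names are hypothetical, and averaging the per-step values is only an assumption about how the AnyLogic side aggregates them.

```python
def final_value(per_step_metric):
    """End-of-episode view: keep only the last recorded metric value."""
    return per_step_metric[-1]


def per_step_mean(per_step_metric):
    """Per-step view (assumed aggregation): average the metric over every step."""
    return sum(per_step_metric) / len(per_step_metric)


# A metric that grows over a 4-step episode.
episode = [2.0, 4.0, 6.0, 8.0]

print(final_value(episode))    # last step only
print(per_step_mean(episode))  # all steps
```

Any metric that drifts during an episode will give different numbers under the two views, so the two tools would only be expected to match for a metric that stays constant.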