For example, the following code fails with Flink:
var v = DataBag()
val r = v
v = DataBag()
r
The problem is that the TempResultsManager garbage collects the temp result of the first line after it executes the third line, but the fourth line reads r, which still refers to that result, and so looks for the deleted file.
(A real-life example of similar code is the inner loop of KMeans, where the last line is similar to the second line here. If the solution = ... line used centroids as a TempSource rather than from the closure, the problem would occur there.)
A solution would be to translate the val r = v line into a TempSource and an immediate TempSink.
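To make the failure and the proposed fix concrete, here is a toy model in plain Scala. It is only a sketch: ToyTempStore, sink, source, collect and AliasDemo are hypothetical names made up for illustration, not Emma's actual TempResultsManager, TempSource or TempSink, and the collection step is triggered by hand, on the assumption that reassigning v is what makes the old result collectable.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a temp-result store; NOT Emma's real TempResultsManager.
final class ToyTempStore {
  private val files = mutable.Map.empty[Int, Seq[Int]]
  private var nextId = 0

  // Write a result to a fresh "file" and return its handle (plays the role of a sink).
  def sink(data: Seq[Int]): Int = {
    val id = nextId
    nextId += 1
    files(id) = data
    id
  }

  // Read a result back (plays the role of a source); None if the file was collected.
  def source(id: Int): Option[Seq[Int]] = files.get(id)

  // Delete the file behind a handle, as assumed to happen when its variable is reassigned.
  def collect(id: Int): Unit = files -= id
}

object AliasDemo {
  // The alias shares the handle, so reading r after v is reassigned finds nothing.
  def sharedHandle(): Option[Seq[Int]] = {
    val store = new ToyTempStore
    var v = store.sink(Seq(1, 2, 3)) // var v = DataBag(), with sample data
    val r = v                        // val r = v: same handle, no new file
    val old = v
    v = store.sink(Seq.empty)        // v = DataBag()
    store.collect(old)               // the first result is collected
    store.source(r)                  // r: the file is gone
  }

  // The alias is a source read followed by an immediate sink, so r owns its file.
  def copiedHandle(): Option[Seq[Int]] = {
    val store = new ToyTempStore
    var v = store.sink(Seq(1, 2, 3))
    val r = store.sink(store.source(v).get) // val r = v as source + immediate sink
    val old = v
    v = store.sink(Seq.empty)
    store.collect(old)
    store.source(r)                  // r: still readable
  }
}
```

In this model sharedHandle() returns None, while copiedHandle() still returns the original data, at the cost of one extra materialization per alias.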
I guess we don't want to fix this for the old backend, but we will close this issue once the backend for the new IR is done and the problem no longer occurs there.