Skip to content

Conversation

@chusloj
Copy link
Contributor

@chusloj chusloj commented Jul 16, 2020

This PR updates the solver used in the stage2.py solver for the PUF weights from PuLP to CVXOPT. Total running time for the solver has changed from about 2 hours (PuLP) to about 20 minutes (CVXOPT). Also, the README is updated to reflect the fact that the PUF stage 2 solver no longer takes several hours to run.

@andersonfrailey

@chusloj chusloj changed the title Update solver to cvxopt for PUF stage2.py [Review] Update solver for PUF stage 2 from PuLP to CVXOPT [Review] Jul 16, 2020
@andersonfrailey
Copy link
Collaborator

Thanks for the PR, @chusloj. I'll review it later today/this weekend.

@andersonfrailey
Copy link
Collaborator

Thanks again for the PR @chusloj. Left a few comments and noticed a couple more things. It looks like there has been a change to cps_weights.csv.gz, but none of the inputs that create that file changed. Do you know why this is? And I see puf_weights.csv.gz and puf_ratios.csv also changed. This is expected given that we're switching over to a new LP model, but just so we have a record could you comment on if/how this changed taxcalc projections?

@chusloj
Copy link
Contributor Author

chusloj commented Jul 20, 2020

It looks like there has been a change to cps_weights.csv.gz, but none of the inputs that create that file changed. Do you know why this is? And I see puf_weights.csv.gz and puf_ratios.csv also changed.

For the CPS:

  • cps_weights.csv.gz changed when I ran cps_stage2/stage2.py by itself, but when I ran make all, cps_weights.csv.gz didn't change.

For the PUF:

  • The PUF weights changed somewhat - 6% of the weights changed, but only about 0.3% of the weights changed by more than 10 units.
  • puf_ratios.csv had negligible value changes of <0.01

if/how this changed taxcalc projections?

After changing puf_weights.csv.gz and puf_ratios.csv within taxcalc, results from the basic recipe in the documentation show that no change in any projected variable's value is > 0.01.

@chusloj chusloj mentioned this pull request Jul 21, 2020
@andersonfrailey
Copy link
Collaborator

After changing puf_weights.csv.gz and puf_ratios.csv within taxcalc, results from the basic recipe in the documentation show that no change in any projected variable's value is > 0.01.

Perfect. Glad we're not seeing a big change.

cps_weights.csv.gz changed when I ran cps_stage2/stage2.py by itself, but when I ran make all, cps_weights.csv.gz didn't change.

This is interesting. Could you run both one more time and see if the weights change?

@chusloj
Copy link
Contributor Author

chusloj commented Jul 22, 2020

cps_weights.csv.gz changed when I ran cps_stage2/stage2.py by itself, but when I ran make all, cps_weights.csv.gz didn't change.

This is interesting. Could you run both one more time and see if the weights change?

After running a second time, cps_stage2/stage2.py and make all create the exact same weights file, and approximately 0.014% of the weights from this new weight file are different from the old weight file. Almost all of these new weights differ from the old weights by 5 units in absolute terms.

@andersonfrailey
Copy link
Collaborator

Well that's good. Can you get the md5 checksum for the weights you're creating? I checked out this branch and want to make sure we're getting the same one. Mine is MD5 (cps_weights.csv.gz) = 958488dfba94df3976ea213eeeac1ffb. You can find it by running md5 cps_weights.csv.gz in your terminal window.

Once we confirm they're the same, I'm ok with merging this. Doesn't seem likely we'll ever track down the reason for the slight discrepancy and I've run out of ideas for where to look. But if we run into this same issue again down the road we'll need to do some deeper digging.

@chusloj chusloj changed the title Update solver for PUF stage 2 from PuLP to CVXOPT [Review] Update solver for PUF stage 2 from PuLP to CVXOPT Jul 28, 2020
chusloj added 2 commits August 6, 2020 21:41
erge remote-tracking branch 'upstream/master' into solver_update
@chusloj
Copy link
Contributor Author

chusloj commented Aug 21, 2020

With the closing of #351 , this update has been made null and void. Closing.

@chusloj chusloj closed this Aug 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants