@andersonfrailey and @Amy-Xu, now that taxdata pull requests #178 (fix TANF values), #185 (use Medicare and Medicaid actuarial values) and #278 (ignore veterans benefits in the distribution of other benefits to filing units) have been merged over the past month or so, I've spent some time looking at filing units whose benefits seem to me to be extremely large.
I found the CPS records to examine using a two-step process. First, the Python script below finds the RECID values of filing units that have large benefits in the cps.csv.gz file. Second, the csv_show.sh bash script, which is part of the Tax-Calculator repository, lists the non-zero variables in each of those records.
One filing unit found in this way is shown in my recent comment on taxdata pull request #135. That filing unit has an imputed TANF benefit of about $136,000 even though the taxpayer and spouse have combined earnings of over one million dollars.
Looking at the filing units with large tanf_ben and vet_ben values raises a question about how the CPS filing units are constructed. Among those with extremely large tanf_ben and vet_ben values are two different groups of fifteen records. Within each group, all of the records appear to have exactly the same demographics and earnings (but different unearned incomes) and exactly the same large benefit. What's going on here? Why do the CPS data include these nearly identical records? Why are there fifteen near replicates? Where are the fifteen near replicates created in the code?
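One way to check for near replicates like these is to flag records that share the same values on a chosen set of columns. The sketch below is only illustrative: it uses an invented toy table rather than cps.csv.gz, and the choice of key columns (age_head, e00200, tanf_ben) is my assumption about what "same demographics, earnings, and benefit" would mean in practice.

```python
import pandas as pd

# Toy stand-in for cps.csv.gz; every value here is invented for illustration.
toy = pd.DataFrame({
    'RECID':    [1, 2, 3, 4, 5],
    'age_head': [40, 40, 40, 62, 35],
    'e00200':   [50000, 50000, 50000, 0, 20000],
    'tanf_ben': [9000, 9000, 9000, 0, 1200],
    'e00300':   [10, 250, 0, 400, 0],  # unearned income may differ within a group
})

# Assumed key: columns that must match for records to count as near replicates.
key_cols = ['age_head', 'e00200', 'tanf_ben']

# keep=False marks every member of a duplicated group, not just the later ones.
replicates = toy[toy.duplicated(subset=key_cols, keep=False)]
print(replicates['RECID'].tolist())
```

Running the same check against the real file would only require replacing the toy table with the result of pd.read_csv('cps.csv.gz') and picking the key columns with care.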
But quite apart from the groups of fifteen nearly identical filing units, I don't understand how people with high incomes can be thought to be getting TANF benefits. Is that imputation being done in C-TAM code or in taxdata code?
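To gauge how common this combination is, one could filter for filing units with both positive tanf_ben and high earnings. This is a rough sketch on an invented toy table, not on cps.csv.gz. The EARNINGS_CUTOFF value is a hypothetical threshold, and using e00200 alone as the earnings measure is an assumption.

```python
import pandas as pd

# Toy stand-in for cps.csv.gz; all values are invented for illustration.
toy = pd.DataFrame({
    'RECID':    [1, 2, 3, 4],
    'e00200':   [0, 18000, 250000, 1200000],
    'tanf_ben': [4800, 0, 0, 136000],
})

EARNINGS_CUTOFF = 200000  # hypothetical threshold for "high earnings"

# Units that receive a TANF benefit despite earnings at or above the cutoff.
suspect = toy[(toy['tanf_ben'] > 0) & (toy['e00200'] >= EARNINGS_CUTOFF)]
print(suspect[['RECID', 'e00200', 'tanf_ben']])
```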
The one filing unit (represented by fifteen near replicates) with a very large vet_ben value could plausibly be a retired three-star general with somewhere around 35 years of service as @feenberg suggested in C-TAM issue 73. The taxpayer is 57 years old and has a vet_ben value of $169,920. That amount includes our estimate of the actuarial value of access to the VA hospital system, which is about $9,890. So, the amount of what seems to be a pension for military service is roughly $160,000 per year.
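The arithmetic behind that rough figure, using only the two amounts quoted above (the variable names are just labels for this sketch):

```python
vet_ben = 169920            # total vet_ben value on the record
va_actuarial_value = 9890   # approximate actuarial value of VA hospital access

# What remains after removing the imputed health-care value.
implied_pension = vet_ben - va_actuarial_value
print(implied_pension)  # 160030
```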
But it seems to me that including military retirement pensions in vet_ben is incorrect because as taxable income they should be added to the e01700 variable.
And the fact that vet_ben seems to consist largely of military (retirement or disability) pensions and retiree medical benefits raises another question in my mind. Given that vet_ben is largely deferred compensation for those who served in the military, why would this kind of income ever be considered for repeal as part of a UBI reform? If these benefits are thought to be "welfare" (rather than deferred compensation), why didn't the C-TAM project include the pension benefits and health insurance benefits accruing to retired federal (or state and local) government employees? If retired government employees were not a focus in the C-TAM work because they are getting not "welfare" but deferred compensation, then why was the deferred compensation of those with military service included in the scope of the C-TAM project?
Now the details. First, the Python script called bentab.py:
from __future__ import print_function
import numpy as np
import pandas as pd

# Read the CPS filing-unit file (one row per filing unit).
data = pd.read_csv('cps.csv.gz')
print('num_filing_units:', data.shape[0])


def big_recids(big):
    """Print the RECID of every filing unit in the big DataFrame."""
    print(' RECIDs:')
    for rid in big['RECID'].tolist():
        print(' ', rid)


# Filing units with fourteen or more people.
big = data[data['XTOT'] >= 14]
print('num_with_XTOT>=14:', big.shape[0])
big_recids(big)

# Filing units with SSI benefits of at least $48,000.
big = data[data['ssi_ben'] >= 48000]
print('num_with_ssi>=$48K:', big.shape[0])
big_recids(big)

# Filing units with TANF benefits of at least $120,000.
big = data[data['tanf_ben'] >= 120000]
print('num_with_tanf>=$120K:', big.shape[0])
big_recids(big)

# Filing units with veterans benefits of at least $156,000.
big = data[data['vet_ben'] >= 156000]
print('num_with_vet>=$156K:', big.shape[0])
big_recids(big)
And now the output from that Python script:
taxdata/cps_data$ python bentab.py
num_filing_units: 456465
num_with_XTOT>=14: 3
RECIDs:
83778
422382
434766
num_with_ssi>=$48K: 2
RECIDs:
280113
403676
num_with_tanf>=$120K: 20
RECIDs:
76509
76510
76511
76512
76513
76514
76515
76516
76517
76518
76519
76520
76521
76522
76523
135454
191624
311549
312738
315578
num_with_vet>=$156K: 15
RECIDs:
119232
119233
119234
119235
119236
119237
119238
119239
119240
119241
119242
119243
119244
119245
119246
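The two groups of fifteen stand out in this output as runs of consecutive RECID values. A small sketch that finds such runs is shown below. The RECID lists are copied from the output above, and the helper name consecutive_runs is my own, not something in taxdata or Tax-Calculator.

```python
def consecutive_runs(recids):
    """Split a list of integers into runs of consecutive values."""
    runs = []
    for rid in sorted(recids):
        if runs and rid == runs[-1][-1] + 1:
            runs[-1].append(rid)   # extends the current run
        else:
            runs.append([rid])     # starts a new run
    return runs


# RECIDs copied from the bentab.py output above.
tanf_recids = list(range(76509, 76524)) + [135454, 191624, 311549, 312738, 315578]
vet_recids = list(range(119232, 119247))

for label, recids in (('tanf_ben', tanf_recids), ('vet_ben', vet_recids)):
    for run in consecutive_runs(recids):
        if len(run) > 1:
            print(label, run[0], 'to', run[-1], 'length', len(run))
```

Consecutive RECID values by themselves do not prove the records are replicates, so each run still needs to be inspected with csv_show.sh.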