Econ B2000, MA Econometrics
Kevin R Foster, CCNY
1. What are the names of the people in your study group?
The next few questions will use CEX data on consumer expenditure.
- (this question is from a past exam) I used the CEX data to look at the fraction of spending going to health insurance. You don’t need to load the data until the final part. For a particular subset, I get the following table, grouped by education of the reference person:
| less than 10% | 467 | 1385 | 1191 | 615 | 1181 | 521 |
| 11% - 20% | 82 | 231 | 157 | 71 | 122 | 58 |
| 21% - 30% | 21 | 65 | 27 | 10 | 32 | 7 |
| more than 30% | 8 | 18 | 14 | 1 | 3 | 2 |
- Conditional on the reference person having a college degree (Associate’s, Bachelor’s or Advanced), what fraction devotes more than 20% of spending to health insurance? Answer: 2.10% (55 of 2,623)
- Conditional on the reference person having less than a college degree, what fraction spends more than 20% on health insurance? Answer: 4.17% (153 of 3,666)
- Is this difference statistically significant? Answer: yes, the t-statistic for the difference in proportions is about 4.8
- What is the overall share (in this sample) of people with any college degree? What share of people spending more than 20% is made up of people with any college degree? Answer: 41.7% (2,623 of 6,289) vs 26.4% (55 of 208)
- Are those break points (above or below 20% of spending; any degree or none) reasonable? Can you suggest better ones? Explain.
- What problems might there be with the classification and analysis here? Can you do better with the CEX data?
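
The arithmetic behind the conditional fractions and the significance test can be checked with a short Python sketch. It assumes that the last three columns of the table are the degree holders (Associate’s, Bachelor’s, Advanced) and the first three are the groups without a degree; the table as given does not label its columns, so that ordering is an assumption. The test used here is a plain two-sample difference in proportions with an unpooled standard error, which is one reasonable choice rather than the only one.

```python
from math import sqrt

# Counts copied from the table: each row is a health-insurance spending band,
# each list holds the six education groups in the order they appear.
counts = {
    "less than 10%": [467, 1385, 1191, 615, 1181, 521],
    "11% - 20%": [82, 231, 157, 71, 122, 58],
    "21% - 30%": [21, 65, 27, 10, 32, 7],
    "more than 30%": [8, 18, 14, 1, 3, 2],
}

# ASSUMPTION: columns 3-5 (zero-based) are Associate's, Bachelor's, Advanced.
DEGREE_COLS = [3, 4, 5]
NO_DEGREE_COLS = [0, 1, 2]
HIGH_ROWS = ["21% - 30%", "more than 30%"]  # more than 20% of spending


def group_totals(cols):
    """Return (group size, number spending more than 20%) for the given columns."""
    n = sum(counts[row][c] for row in counts for c in cols)
    high = sum(counts[row][c] for row in HIGH_ROWS for c in cols)
    return n, high


n_deg, high_deg = group_totals(DEGREE_COLS)
n_non, high_non = group_totals(NO_DEGREE_COLS)

p_deg = high_deg / n_deg  # P(share > 20% | degree)
p_non = high_non / n_non  # P(share > 20% | no degree)

# Unpooled standard error of the difference between two sample proportions.
se_diff = sqrt(p_deg * (1 - p_deg) / n_deg + p_non * (1 - p_non) / n_non)
t_stat = (p_non - p_deg) / se_diff

# Share of degree holders overall versus among the high spenders.
share_degree_overall = n_deg / (n_deg + n_non)
share_degree_among_high = high_deg / (high_deg + high_non)

print(f"degree: {p_deg:.2%}  no degree: {p_non:.2%}  t = {t_stat:.2f}")
print(f"degree share overall: {share_degree_overall:.1%}, "
      f"among high spenders: {share_degree_among_high:.1%}")
```

Changing `HIGH_ROWS` or the two column lists is a quick way to see how sensitive the comparison is to the choice of break points.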
- Use the CEX data that I provided and consider the fraction spent on entertainment, ENTERTPQ/TOTEXPPQ.
- Find some descriptive statistics about this fraction, for some subgroups. Tell me something interesting about this data. Are there sub-categories that explain some of the variation?
- Create a histogram and/or density plot. What do these reveal?
- Estimate a linear regression and discuss what this shows.
- Estimate a k-nn classifier to predict which households are in the lowest 25% in terms of entertainment spending. Discuss which variables are important in classifying.
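
As a rough illustration of the last part, the following Python sketch builds the entertainment share ENTERTPQ/TOTEXPPQ, flags the lowest quarter, and runs a small k-nn classifier with leave-one-out checking. The rows are made-up toy numbers, not CEX values, and the two predictors (age and family size) are hypothetical placeholders; only ENTERTPQ and TOTEXPPQ come from the assignment. The classifier is written out by hand to show the mechanics and is not meant as the expected solution.

```python
from statistics import mean, pstdev, quantiles

# TOY DATA, NOT CEX VALUES: (ENTERTPQ, TOTEXPPQ, age, family size).
# Age and family size are hypothetical predictors chosen for illustration.
rows = [
    (120.0, 9500.0, 34, 3), (15.0, 6100.0, 72, 1), (310.0, 14200.0, 45, 4),
    (40.0, 7800.0, 68, 2), (260.0, 11000.0, 29, 2), (95.0, 8300.0, 51, 3),
    (20.0, 5900.0, 77, 1), (410.0, 16500.0, 38, 5), (150.0, 10100.0, 42, 4),
    (60.0, 7200.0, 60, 2), (0.0, 0.0, 55, 1), (205.0, 12400.0, 31, 3),
    (75.0, 6900.0, 64, 2),
]

# Drop households with zero total spending so the ratio is defined.
usable = [r for r in rows if r[1] > 0]
shares = [entert / totexp for entert, totexp, _, _ in usable]

# First quartile of the share; households at or below it are "low".
cutoff = quantiles(shares, n=4)[0]
low = [s <= cutoff for s in shares]


def standardize(column):
    """Rescale to mean 0 and standard deviation 1 so no predictor dominates distance."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]


features = list(zip(standardize([r[2] for r in usable]),
                    standardize([r[3] for r in usable])))


def knn_predict(train_x, train_y, point, k=3):
    """Majority vote among the k training points closest to `point`."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], point))
    nearest = sorted(range(len(train_x)), key=sq_dist)[:k]
    votes = sum(train_y[i] for i in nearest)
    return votes * 2 > k


# Leave-one-out check: predict each household from all the others.
hits = 0
for i in range(len(features)):
    train_x = features[:i] + features[i + 1:]
    train_y = low[:i] + low[i + 1:]
    hits += knn_predict(train_x, train_y, features[i]) == low[i]
accuracy = hits / len(features)

print(f"cutoff share: {cutoff:.4f}, low households: {sum(low)}, "
      f"leave-one-out accuracy: {accuracy:.2f}")
```

Re-running with one predictor dropped, or with a different `k`, gives a crude sense of which variables matter for the classification.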