Due 8am EST Wednesday Oct 4, 2017

Econ B2000, MA Econometrics

Kevin R Foster, CCNY

Each student should submit a separate assignment, even if your file is identical to those of the rest of your study group. When submitting assignments, please include your name and the assignment number in the filename. Please write the names of your study group members at the beginning of your homework.

  1. What are the names of the people in your study group?

The next few questions will use CEX data on consumer expenditure.

  2. (This question is from a past exam.) I used the CEX data to look at the fraction of spending going to health insurance. You don’t need to load the data until the final part. For a particular subset, I get the following table, grouped by education of the reference person:
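| % Insurance   | No HS | HS diploma | Some college | Assoc degree | Bach degree | Adv degree |
|---------------|-------|------------|--------------|--------------|-------------|------------|
| less than 10% | 467   | 1385       | 1191         | 615          | 1181        | 521        |
| 11% - 20%     | 82    | 231        | 157          | 71           | 122         | 58         |
| 21% - 30%     | 21    | 65         | 27           | 10           | 32          | 7          |
| more than 30% | 8     | 18         | 14           | 1            | 3           | 2          |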
     1. Conditional on the reference person having a college degree (Associate’s, Bachelor’s or Advanced), what fraction devote more than 20% of spending to health insurance?
     2. Conditional on the reference person having less than a college degree, what fraction spend more than 20% on health insurance?
     3. Is this difference statistically significant?
     4. What is the overall share (in this sample) of people with any college degree? What share of the people spending more than 20% is made up of people with any college degree?
     5. Are those break points (above or below 20%; any college degree) reasonable? Can you suggest better ones? Explain.
     6. What problems might there be with the classification and analysis here? Can you do better with the CEX data?
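For the table-based parts of this question, one way to organize the arithmetic is sketched below in Python. It is only a sketch: the counts are copied from the table above, while the grouping into "degree" and "no degree" columns, the helper name `group_totals`, and the choice of a pooled two-sample z-test for proportions are assumptions made for illustration, not the only acceptable approach.

```python
import math

# Counts copied from the table: rows are insurance-share bins,
# columns are education of the reference person.
education = ["No HS", "HS diploma", "Some college",
             "Assoc degree", "Bach degree", "Adv degree"]
counts = {
    "less than 10%": [467, 1385, 1191, 615, 1181, 521],
    "11% - 20%":     [82, 231, 157, 71, 122, 58],
    "21% - 30%":     [21, 65, 27, 10, 32, 7],
    "more than 30%": [8, 18, 14, 1, 3, 2],
}
high_bins = ["21% - 30%", "more than 30%"]               # more than 20%
degree_cols = {"Assoc degree", "Bach degree", "Adv degree"}

def group_totals(in_group):
    """Hypothetical helper: (number above 20%, group size) for chosen columns."""
    idx = [i for i, e in enumerate(education) if in_group(e)]
    n = sum(row[i] for row in counts.values() for i in idx)
    high = sum(counts[b][i] for b in high_bins for i in idx)
    return high, n

high_deg, n_deg = group_totals(lambda e: e in degree_cols)
high_non, n_non = group_totals(lambda e: e not in degree_cols)

# Conditional fractions: P(more than 20% | degree) and P(more than 20% | no degree).
p_deg, p_non = high_deg / n_deg, high_non / n_non

# Pooled two-sample z-test for equal proportions (an assumed choice of test).
p_pool = (high_deg + high_non) / (n_deg + n_non)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_deg + 1 / n_non))
z = (p_non - p_deg) / se
p_value = math.erfc(abs(z) / math.sqrt(2))               # two-sided normal p-value

# Shares in the other direction: degree holders overall and among high spenders.
share_deg = n_deg / (n_deg + n_non)
share_deg_among_high = high_deg / (high_deg + high_non)

print(f"degree:    {high_deg}/{n_deg} = {p_deg:.4f}")
print(f"no degree: {high_non}/{n_non} = {p_non:.4f}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2g}")
print(f"degree share overall = {share_deg:.3f}, among high spenders = {share_deg_among_high:.3f}")
```

The same structure makes it easy to try other break points: change `high_bins` or `degree_cols` and rerun.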
  3. Use the CEX data that I provided and consider the fraction spent on entertainment, ENTERTPQ/TOTEXPPQ.
     1. Find some descriptive statistics about this fraction, for some subgroups. Tell me something interesting about this data. Are there sub-categories that explain some of the variation?
     2. Create a histogram and/or density plot. What do these reveal?
     3. Estimate a linear regression and discuss what this shows.
     4. Estimate a k-nn classification to predict which households are in the lowest 25% in terms of entertainment spending. Discuss what variables are important in classifying.
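The mechanics of the last part (build the share, flag the lowest quartile, classify with k-nn) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the rows are synthetic stand-ins rather than the CEX data, `AGE` and `FAM_SIZE` are hypothetical predictors, `k=5` and the 150/50 split are arbitrary, and `standardize` and `knn_predict` are illustrative helpers rather than any library's API. Only `ENTERTPQ` and `TOTEXPPQ` come from the assignment.

```python
import math
import random
import statistics

# Synthetic stand-in rows, NOT the real CEX extract. ENTERTPQ and TOTEXPPQ
# are the variable names from the assignment; AGE and FAM_SIZE are
# hypothetical predictors used only to make the sketch runnable.
random.seed(0)
rows = []
for _ in range(200):
    age = random.randint(20, 80)
    fam_size = random.randint(1, 6)
    total = random.uniform(5000, 30000)
    share = 0.02 + 0.001 * (80 - age) + random.uniform(0.0, 0.03)
    rows.append({"AGE": age, "FAM_SIZE": fam_size,
                 "TOTEXPPQ": total, "ENTERTPQ": share * total})

# Entertainment share. With real data, rows with zero or missing total
# spending would need to be dropped before dividing.
shares = [r["ENTERTPQ"] / r["TOTEXPPQ"] for r in rows]

# Flag households at or below the first quartile of the share.
cutoff = statistics.quantiles(shares, n=4)[0]
labels = [s <= cutoff for s in shares]

def standardize(values):
    """Rescale to mean 0, sd 1 so no single variable dominates the distance."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sd for v in values]

# For simplicity the scaling uses all rows, including the hold-out rows.
features = list(zip(standardize([r["AGE"] for r in rows]),
                    standardize([r["FAM_SIZE"] for r in rows])))

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    nearest = sorted(zip((math.dist(x, xi) for xi in train_X), train_y))[:k]
    return sum(y for _, y in nearest) * 2 > k

# Hold out the last 50 rows to check predictions on unseen households.
train_X, test_X = features[:150], features[150:]
train_y, test_y = labels[:150], labels[150:]
predictions = [knn_predict(train_X, train_y, x) for x in test_X]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)

print(f"cutoff share = {cutoff:.4f}, hold-out accuracy = {accuracy:.2f}")
```

Rerunning the classifier with different sets of predictors, and comparing hold-out accuracy, is one simple way to judge which variables matter for the classification.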