Due 8am EST Wednesday Sept 26, 2018

Econ B2000, MA Econometrics

Kevin R Foster, CCNY

Remember, there is a Stats Diagnostic Test on Hawkes due by midnight Monday Sept 24.

Each student should submit a separate assignment, even if the computer file is identical to those submitted by the rest of your study group. When submitting assignments, please include your name and the assignment number as part of the filename. Please write the names of your study group members at the beginning of your homework.

  1. What are the names of the people in your study group?
  2. Find some additional data for a Benford’s Law analysis (hint: for many economic and financial variables, take the change rather than the level) and graph it. Consider how likely the data are to follow Benford’s Law. Can you suggest some measure? (There is an example in the lecture notes.)
  3. Using the PUMS data, what proportion of people* with less than a college degree earn less than $70,000 income? Above? What proportion of people with a college degree make more or less than that? (example of code in lecture notes) Look at these by borough or neighborhood.
  4. What is the richest neighborhood in each borough? The poorest? How might you measure income inequality by neighborhood? (Various measures include std dev, IQR, 90-10 range, even 90-50 and 50-10.) What is the most unequal neighborhood in each borough? How much do rankings of inequality change for slightly different measures? Is there a mean/risk tradeoff? Extra: if you have time, consider using different measures of income; for instance, researchers sometimes subtract housing expenditure. You could also use measures of capability (so income adjusted by education). How does inequality of income correlate with other factors? What patterns do you see?
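
The spread measures named above (std dev, IQR, 90-10, 90-50, 50-10) can be sketched in a few lines. This is only an illustration in Python using the standard library, not the code from the lecture notes; the function names `inequality_measures` and `rank_groups` and the two made-up neighborhoods are assumptions for the example, not PUMS data. The percentile rule (`method="inclusive"`) is also a choice; other rules give slightly different cut points.

```python
import statistics

def inequality_measures(incomes):
    """Return several spread measures for one group's incomes."""
    # quantiles(..., n=100) returns the 99 percentile cut points;
    # index k-1 holds the k-th percentile.
    pct = statistics.quantiles(incomes, n=100, method="inclusive")
    p10, p25, p50, p75, p90 = pct[9], pct[24], pct[49], pct[74], pct[89]
    return {
        "std_dev": statistics.stdev(incomes),  # sample standard deviation
        "iqr": p75 - p25,                      # interquartile range
        "p90_p10": p90 - p10,                  # 90-10 range
        "p90_p50": p90 - p50,                  # upper-half spread
        "p50_p10": p50 - p10,                  # lower-half spread
    }

def rank_groups(incomes_by_group, measure):
    """Order group names from most to least unequal by one measure."""
    scores = {
        group: inequality_measures(values)[measure]
        for group, values in incomes_by_group.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Made-up incomes for two hypothetical neighborhoods (NOT PUMS data).
example = {
    "Neighborhood A": [20000, 30000, 40000, 50000, 60000],
    "Neighborhood B": [10000, 15000, 40000, 90000, 200000],
}

for name in ("std_dev", "iqr", "p90_p10", "p90_p50", "p50_p10"):
    print(name, rank_groups(example, name))
```

Comparing the ranking lists across measures is one simple way to see how much the ordering of neighborhoods depends on the measure chosen.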

  5. Extra: some of you may have already completed the Diagnostic Test, so you have time to push on. Using a subsample of the taxi data, I find that on weekends there were 193750 rides paid with credit cards and 187694 rides paid with cash.
     1. Find a 90% confidence interval for the fraction of rides paid in cash.
     2. On weekdays there were 582335 rides paid with a card and 509798 paid in cash. What is a 90% confidence interval for the fraction paid in cash now?
     3. Are these proportions statistically significantly different? Explain, and calculate the t-stat and p-value.
     4. What are some possible explanations? What data would you want to consider additionally? I’m not (yet) asking for data, just an explanation of your thought process.
     5. Using the taxi data that I provided on the class website, consider which fares are likely to tip more than 10%, more than 15%, or higher still. Can you find interesting differences?
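
For the confidence-interval and significance questions in the taxi problem, here is a minimal sketch of the arithmetic in Python (standard library only). It makes several assumptions: the normal approximation to the binomial (with samples this large the t and normal distributions are practically the same), an unpooled standard error for the difference, and a two-sided test. The function names and the counts in the example are made up for illustration; substitute the ride counts given above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level=0.90):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # Critical value: e.g. about 1.645 for a 90% two-sided interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def two_proportion_test(x1, n1, x2, n2):
    """Test statistic and two-sided p-value for H0: p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error of the difference (an assumption;
    # a pooled version under H0 is also common).
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    stat = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return stat, p_value

# Made-up counts for illustration only: 480 of 1000 vs 450 of 1000.
low, high = proportion_ci(480, 1000, level=0.90)
stat, p_value = two_proportion_test(480, 1000, 450, 1000)
print(f"90% CI: ({low:.4f}, {high:.4f})")
print(f"test statistic: {stat:.3f}, p-value: {p_value:.3f}")
```

Remember that `n` is the total number of rides in each group (card plus cash), not just the cash count.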