Practice for Exam 1
Econ 20150
Kevin R Foster, CCNY
Fall 2012
Not all of these questions are strictly relevant; some might require
a bit of knowledge that we haven't covered this year, but they're a generally
good guide.
1.
Standard Normal calculations.
a.
For a Standard Normal distribution, what is the area to the right of 0.2? A. 0.5000 B. 0.8415 C. 0.4207 D. 0.4602
b.
For a Standard Normal distribution, what is the area to the right of 0.9? A. 0.3264 B. 0.1841 C. 0.8159 D. 0.3681
c.
For a Standard Normal distribution, what is the area to the right of 2.0? A. 0.0228 B. 0.8749 C. 0.1587 D. 0.0510
d.
For a Standard Normal distribution, what is the area to the left of 0.5? A. 0.0107 B. 0.0472 C. 0.6915 D. 0.3130
e.
For a Standard Normal distribution, what is the area to the left of -1.6? A. 0.1093 B. 0.0548 C. 0.1769 D. 0.9479
f.
For a Standard Normal distribution, what is the area in both tails farther from the mean than -2.2? A. 0.0916 B. 0.9868 C. 0.6005 D. 0.0278
g.
For a Standard Normal distribution, what is the area in both tails farther from the mean than 1.1? A. 0.0714 B. 0.2713 C. 0.4979 D. 0.1354
h.
For a Standard Normal distribution, what is the area in both tails farther from the mean than 0.8? A. 0.4237 B. 0.3364 C. 0.8474 D. 0.0546
i.
For a Standard Normal distribution, what is the area closer to the mean than 2.4? A. 0.9836 B. 0.0318 C. 0.9254 D. 0.1151
j.
For a Standard Normal distribution, what values leave probability 0.778 in both tails? A. ±0.2819 B. ±1.0705 C. ±0.8950 D. ±0.1745
k.
For a Standard Normal distribution, what values leave probability 0.281 in both tails? A. ±0.9100 B. ±1.6564 C. ±1.0781 D. ±2.1560
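Answers to question 1 can be checked on a computer rather than with a table. The sketch below is one possible way, assuming Python 3.8 or later and its standard-library NormalDist class; the helper names (area_right, area_both_tails, and so on) are made up for this illustration and are not part of the course materials.

```python
from statistics import NormalDist

Z = NormalDist()  # Standard Normal: mean 0, standard deviation 1


def area_left(z):
    """Area under the Standard Normal pdf to the left of z."""
    return Z.cdf(z)


def area_right(z):
    """Area to the right of z."""
    return 1 - Z.cdf(z)


def area_both_tails(z):
    """Area in both tails farther from the mean than z (the sign of z is ignored)."""
    return 2 * (1 - Z.cdf(abs(z)))


def area_within(z):
    """Area closer to the mean than z."""
    return 1 - area_both_tails(z)


def cutoff_for_two_tails(p):
    """Positive value c such that the tails beyond -c and +c together hold probability p."""
    return Z.inv_cdf(1 - p / 2)


print(round(area_right(2.0), 4))
print(round(cutoff_for_two_tails(0.05), 4))
```

Sketching the pdf first is still worthwhile, since it shows at a glance whether an answer should be above or below 0.5.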
2.
Normal Distribution calculations.
a.
For a Normal Distribution with mean 12 and standard deviation 1.2, what is the area to the right of 14.9? A. 0.7069 B. 0.0082 C. 0.0164 D. 0.1151
b.
For a Normal Distribution with mean -5 and standard deviation 1.8, what is the area to the left of -7.7? A. 0.0027 B. 0.9973 C. 0.7734 D. 0.0668
c.
For a Normal Distribution with mean -5 and standard deviation 3.0, what is the area to the left of -11.6? A. 0.0278 B. 0.9861 C. 0.1587 D. 0.0139
d.
For a Normal Distribution with mean 4 and standard deviation 0.9, what is the area in both tails farther from the mean than 2.1? A. 0.1587 B. 0.0357 C. 0.9821 D. 0.7772
e.
For a Normal Distribution with mean 7 and standard deviation 7.1, what is the area in both tails farther from the mean than 16.9? A. 0.3230 B. 0.1615 C. 0.1721 D. 0.1587
f.
For a Normal Distribution with mean 3 and standard deviation 3.4, what values leave probability 0.05 in both tails? A. (-4.6208, 10.6208), B. (-3.6639, 9.6639), C. (-2.5925, 8.5925), D. (-1.6449, 1.6449)
g.
For a Normal Distribution with mean 8 and standard deviation 8.7, what values leave probability 0.351 in both tails? A. (3.943, 12.057), B. (12.6124, 3.3876), C. (-4.5574, 12.4426), D. (-0.1141, 16.1141)
h.
For a Normal Distribution with mean -11 and standard deviation 0.2, what values leave probability 0.291 in both tails? A. (-10.9586, -11.0414), B. (-5.7112, -5.2888), C. (-56.0559, -53.9441), D. (-11.2112, -10.7888)
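For a Normal distribution that is not standard, the usual step is to standardize first, z = (x - mean) / (standard deviation), and then proceed as with the Standard Normal. The sketch below illustrates that step under the assumption that Python's standard-library NormalDist is available; the function names are hypothetical. Printed tables usually round z to one or two decimals, so a table answer may differ slightly from the computed one.

```python
from statistics import NormalDist

Z = NormalDist()  # Standard Normal


def standardize(x, mean, sd):
    """Convert x to a z-score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


def two_tail_area(x, mean, sd):
    """Area in both tails farther from the mean than x."""
    z = abs(standardize(x, mean, sd))
    return 2 * (1 - Z.cdf(z))


def two_tail_bounds(p, mean, sd):
    """Lower and upper values that leave total probability p in the two tails."""
    c = Z.inv_cdf(1 - p / 2)  # cutoff on the Standard Normal scale
    return mean - c * sd, mean + c * sd


low, high = two_tail_bounds(0.05, mean=3, sd=3.4)
print(round(low, 4), round(high, 4))
```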
3.
Bloomberg News
reported on a US audit of mortgages, showing that the share of
"seriously delinquent" loans among those originated by Countrywide (now owned by
Bank of America) was 6.76%, versus a general average of 3.59% for loans made by all other
originators. There were 4050 loans made
by Countrywide. (H. Son, D. Kopecki, D. Griffin, Oct. 5, 2011, "BofA May Face Fraud Claims for Soured Loans,"
Bloomberg News.)
a.
Test the null
hypothesis that Countrywide's delinquent loan rate was actually not different
from 3.59%; how likely is it that Countrywide was just unlucky? (What is the
p-value?)
b.
Create a 95%
confidence interval for Countrywide's delinquency rate. What is the 90% confidence interval? The 99% interval?
c.
What reasons,
other than fraud, could explain why Countrywide's loans might have a higher
delinquency rate?
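The arithmetic for parts a and b can be sketched as follows. This is only one reasonable setup, not necessarily the one used in class: it treats each loan as a Bernoulli draw, uses the Normal approximation, computes the test's standard error under the null value of 3.59%, and computes the confidence intervals with the standard error at the observed rate. The function names are invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def proportion_test(p_hat, p_null, n):
    """z-statistic and two-sided p-value for H0: true proportion equals p_null."""
    se_null = sqrt(p_null * (1 - p_null) / n)  # standard error if H0 were true
    z = (p_hat - p_null) / se_null
    p_value = 2 * (1 - Z.cdf(abs(z)))
    return z, p_value


def proportion_ci(p_hat, n, level=0.95):
    """Confidence interval for a proportion (Normal approximation)."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    c = Z.inv_cdf(1 - (1 - level) / 2)
    return p_hat - c * se, p_hat + c * se


# Figures from the question: 6.76% of 4050 Countrywide loans versus 3.59% elsewhere.
z, p_value = proportion_test(0.0676, 0.0359, 4050)
print(z, p_value)
for level in (0.90, 0.95, 0.99):
    print(level, proportion_ci(0.0676, 4050, level))
```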
4.
Using the PUMS
data on people in New York City, consider educational attainment.
a.
What fraction of
people 25-55 have a college degree (associate, bachelor, or advanced)?
b.
Sampling people
aged 25-55, what is the probability of finding someone with some college
degree? With a bachelor or higher
degree?
c.
What is the
conditional probability that a person, age 25-55 and having some college
degree, has a family income below the poverty line? Below 150% of the poverty line?
d.
Form a hypothesis
test for whether the chance of being in poverty (under 100% of poverty line) is
independent of degree. What is the
p-value? What is a 95% confidence
interval for the difference?
e.
Why do you think
that we would find these results?
Explain (perhaps with some further empirical results from the same data
set).
5.
A random variable
is distributed as a standard normal.
(You are encouraged to sketch the PDF in each case.)
a.
What is the
probability that we could observe a value as far or farther than 1.3?
b.
What is the
probability that we could observe a value nearer than 1.8?
c.
What value would
leave 10% of the probability in the right-hand tail?
d.
What value would
leave 25% in both the tails (together)?
6.
You are given the
following data on the number of people in the PUMS sample who live in each of
the five boroughs of NYC and who commute in each specified manner (where
'other' includes walking, working from home, taking a taxi or ferry or rail).
       | Bronx | Manhattan | Staten Is | Brooklyn | Queens
car    | 5788  | 2692      | 5526      | 10990    | 16905
bus    | 3132  | 2789      | 1871      | 4731     | 4636
subway | 6481  | 13260     | 279       | 18951    | 14025
other  | 2748  | 10327     | 900       | 6587     | 4877
a.
Find the Joint
Probability for drawing, from this sample, a person from Queens who commutes by
bus. Find the Joint Probability of a
person from the Bronx who commutes by subway.
b.
Find the Marginal
Probability of drawing, from among the people who commute by subway, someone
who lives in Brooklyn. Find the Marginal
Probability, of people who commute by bus, someone who lives in the Bronx.
c.
Find the Marginal
Probability of drawing, from among the people who live in Staten Island,
someone who drives a car to work. Find
the Marginal Probability, of people in Brooklyn, who commute by subway.
d.
Are these two
choices (which borough to live in, how to commute) independent? Explain using the definition of statistical
independence.
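The calculations in this question all come from dividing a cell count by the right total. The sketch below shows one possible way to organize that, with the counts copied from the table above; the function names are hypothetical. Parts b and c restrict attention to one row or one column, so they divide by that row or column total rather than by the grand total.

```python
# Counts copied from the commuting table (rows: commute mode, columns: borough).
boroughs = ["Bronx", "Manhattan", "Staten Is", "Brooklyn", "Queens"]
counts = {
    "car":    [5788, 2692, 5526, 10990, 16905],
    "bus":    [3132, 2789, 1871, 4731, 4636],
    "subway": [6481, 13260, 279, 18951, 14025],
    "other":  [2748, 10327, 900, 6587, 4877],
}
total = sum(sum(row) for row in counts.values())


def cell(mode, borough):
    return counts[mode][boroughs.index(borough)]


def joint(mode, borough):
    """P(mode and borough): cell count over the grand total."""
    return cell(mode, borough) / total


def share_of_mode(mode):
    """P(mode): row total over the grand total."""
    return sum(counts[mode]) / total


def share_of_borough(borough):
    """P(borough): column total over the grand total."""
    i = boroughs.index(borough)
    return sum(row[i] for row in counts.values()) / total


def borough_given_mode(borough, mode):
    """P(borough | mode): cell count over the row total."""
    return cell(mode, borough) / sum(counts[mode])


def mode_given_borough(mode, borough):
    """P(mode | borough): cell count over the column total."""
    return joint(mode, borough) / share_of_borough(borough)


# Independence would require joint == share_of_mode * share_of_borough in every cell.
gap = joint("subway", "Staten Is") - share_of_mode("subway") * share_of_borough("Staten Is")
print(round(gap, 4))
```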
7.
Download the PUMS
data on people living in the 5 boroughs.
Run a regression that models the variable, "GRPIP,"
"Gross Rent as Percent of Income," which tells how burdensome are
housing costs for different people.
a.
What are the mean,
median, 25th, and 75th percentiles for Rent as a fraction
of income? Does this seem
reasonable?
b.
What is the
fraction spent on rent by households in Brooklyn? In Queens?
Is the difference statistically significant? Between Brooklyn and the Bronx?
8.
Using the NHANES
2007-09 data (not necessary to actually download), reporting a variety of
socioeconomic variables as well as behavior choices such as the number of
sexual partners reported (number_partners), we want
to see if richer people have more sex than poor people. The following table is constructed, showing
three categories of family income and 5 categories of number of sex partners:
number of sex partners
family income | zero   | 1      | 2 - 5  | 6 - 25 | >25    | Marginal:
< 20,000      | 11     | 63     | 236    | 255    | 92     | ______
20 - 45,000   | 7      | 117    | 323    | 308    | 117    | ______
> 45,000      | 3      | 234    | 517    | 607    | 218    | ______
Marginal:     | ______ | ______ | ______ | ______ | ______ |
a.
Where is the
median, for number of sex partners, for poorer people? For middle-income people? For richer people?
b.
Conditional on a
person being poorer, what is the likelihood that they report fewer than 6
partners? Conditional on being
middle-income? Richer?
c.
Conditional on
reporting 2-5 sex partners, what is the likelihood that a person is
poorer? Middle-income? Richer?
d.
Explain why the
average number of sex partners might not be as useful a measure as, for
example, the data ranges above or the median or the 95%-trimmed mean.
9.
A random variable
is distributed as a standard normal.
(You are encouraged to sketch the PDF in each case.)
a.
What is the
probability that we could observe a value as far or farther than -0.9?
b.
What is the
probability that we could observe a value nearer than 1.4?
c.
What value would
leave 5% of the probability in the right-hand tail?
d.
What value would
leave 5% in both the tails (together)?
10.
You are in charge
of polling for a political campaign. You
have commissioned a poll of 300 likely voters.
Since voters are divided into three distinct geographical groups, the
poll is subdivided into three groups with 100 people each. The poll results are as follows:
                             | total  | A      | B      | C
number in favor of candidate | 170    | 58     | 57     | 55
number total                 | 300    | 100    | 100    | 100
std. dev. of poll            | 0.4956 | 0.4936 | 0.4951 | 0.4975
Note that the standard deviation of the sample (not the standard error of the average) is given.
a.
Calculate a
t-statistic, p-value, and a confidence interval for the main poll (with all of
the people) and for each of the sub-groups.
b.
In simple language
(less than 150 words), explain what the poll means and how much confidence the
campaign can put in the numbers.
c.
Again in simple
language (less than 150 words), answer the opposing candidate's complaint,
"The biased media confidently says that I'll lose even though they admit
that they can't be sure about any of the subgroups! That's neither fair nor accurate!"
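For part a, the key step is converting the sample standard deviation into a standard error by dividing by the square root of the group size. The sketch below shows that step under two assumptions that are not stated in the question: the null hypothesis is an even split (50% support), and the Normal distribution is used in place of the t distribution (the two are close for samples of 100 or more). The function name poll_summary is invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def poll_summary(in_favor, n, sd, null=0.5, level=0.95):
    """Return (t-statistic, two-sided p-value, confidence interval) for one group."""
    share = in_favor / n
    se = sd / sqrt(n)  # standard error of the average
    t = (share - null) / se
    p_value = 2 * (1 - Z.cdf(abs(t)))  # Normal approximation to the t distribution
    c = Z.inv_cdf(1 - (1 - level) / 2)
    return t, p_value, (share - c * se, share + c * se)


# Figures from the poll table: (number in favor, number total, std. dev.)
groups = {
    "total": (170, 300, 0.4956),
    "A": (58, 100, 0.4936),
    "B": (57, 100, 0.4951),
    "C": (55, 100, 0.4975),
}
for name, (k, n, sd) in groups.items():
    print(name, poll_summary(k, n, sd))
```

Comparing the standard error for the full sample with the standard errors for the sub-groups is a useful starting point for parts b and c.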
11.
Calculate the
probability in the following areas under the Normal pdf
with mean and standard deviation as given.
You might usefully draw pictures as well as making the
calculations. For the calculations you
can use either a computer or a table.
a.
What is the probability,
if the true distribution has mean -15 and standard deviation of 9.7, of seeing
a deviation as large (in absolute value) as -1?
b.
What is the
probability, if the true distribution has mean 0.35 and standard deviation of
0.16, of seeing a deviation as large (in absolute value) as 0.51?
c.
What is the
probability, if the true distribution has mean -0.1 and standard deviation of
0.04, of seeing a deviation as large (in absolute value) as -0.16?
12.
Using data from
the NHIS, we find the fraction of children who are female, who are Hispanic,
and who are African-American, for two separate groups: those with and those
without health insurance. Compute tests
of whether the differences in the means are significant; explain what the tests
tell us. (Note that the numbers in
parentheses are the standard deviations.)
                 | with health insurance   | without health insurance
female           | 0.4905 (0.49994) N=7865 | 0.4811 (0.49990) N=950
Hispanic         | 0.2587 (0.43797) N=7865 | 0.5411 (0.49857) N=950
African American | 0.1785 (0.38297) N=7865 | 0.1516 (0.35880) N=950
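A test for a difference in means combines the two standard errors. The sketch below is one common version, assuming independent samples, unequal variances, and a Normal approximation (reasonable here given the sample sizes); the function name diff_in_means is hypothetical. It takes the mean, standard deviation, and N for each group, exactly as reported in the table.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def diff_in_means(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Return (difference, z-statistic, two-sided p-value, confidence interval)."""
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z = diff / se
    p_value = 2 * (1 - Z.cdf(abs(z)))
    c = Z.inv_cdf(1 - (1 - level) / 2)
    return diff, z, p_value, (diff - c * se, diff + c * se)


# Rows of the table: with health insurance versus without.
print(diff_in_means(0.4905, 0.49994, 7865, 0.4811, 0.49990, 950))  # female
print(diff_in_means(0.2587, 0.43797, 7865, 0.5411, 0.49857, 950))  # Hispanic
print(diff_in_means(0.1785, 0.38297, 7865, 0.1516, 0.35880, 950))  # African American
```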
9.
Using the BRFSS
2009 data, the following table compares the reported health status of the
respondent with whether or not they smoked (defined as having smoked at least 100
cigarettes):
               | SMOKED AT LEAST 100 CIGARETTES
GENERAL HEALTH | Yes   | No    | Marginal
Excellent      | 27775 | 49199 | ____
Very good      | 58629 | 77357 | ____
Good           | 64237 | 67489 | ____
Fair           | 31979 | 26069 | ____
Poor           | 15680 | 9191  | ____
Marginal       | ____  | ____  |
a.
(10 points) What is the median health status
for those who smoked? For non-smokers?
b.
(10 points) Fill
in the marginal probabilities – make sure they are probabilities.
c.
(5 points) Explain
what you might conclude from this data.
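For ordered categories such as health status, the median is the category in which the running total of counts first reaches half of the group. The sketch below illustrates this with the counts from the table; the variable and function names are made up for the example. It also shows the marginal probabilities as row or column totals divided by the grand total, so that they sum to one.

```python
from itertools import accumulate

levels = ["Excellent", "Very good", "Good", "Fair", "Poor"]  # ordered best to worst
smoked = [27775, 58629, 64237, 31979, 15680]      # "Yes" column
not_smoked = [49199, 77357, 67489, 26069, 9191]   # "No" column


def median_category(labels, counts):
    """First category at which the cumulative count reaches half of the total."""
    half = sum(counts) / 2
    for label, running in zip(labels, accumulate(counts)):
        if running >= half:
            return label


grand_total = sum(smoked) + sum(not_smoked)

# Marginal probabilities: totals divided by the grand total (not raw counts).
marginal_smoking = {
    "Yes": sum(smoked) / grand_total,
    "No": sum(not_smoked) / grand_total,
}
marginal_health = {
    label: (yes + no) / grand_total
    for label, yes, no in zip(levels, smoked, not_smoked)
}

print(median_category(levels, smoked), median_category(levels, not_smoked))
```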
10.
For a Standard
Normal distribution (you are encouraged to sketch the PDF in each case),
a.
what is the area
to the left of -1.5?
b.
what is the area
to the right of 0.2?
c.
what is the area
to the right of -1.6?
d.
what is the area
to the left of -2.2?
e.
what is the area
in both tails farther than 1.7?
f.
what is the area
in both tails farther than -1.4?
g.
what distance from
the mean (in absolute value) leaves 0.17 in both tails?
h.
what distance from
the mean (in absolute value) leaves 0.29 in both tails?
11.
For a Normal
distribution (you are encouraged to sketch the PDF in each case),
a.
with mean 12 and
standard deviation of 4, what is the area to the left of 20.4?
b.
with mean 7 and
standard deviation of 4, what is the area to the right of -0.2?
c.
with mean -12 and
standard deviation of 5, what is the area in both tails farther from the mean
(in absolute value) than -3.5?
d.
with mean 13 and
standard deviation of 2, what is the area in both tails farther from the mean
(in absolute value) than 11.6?
e.
with mean -13 and
standard deviation of 9 what values leave 0.09 in both tails?
f.
with mean -12 and standard deviation of 9 what
values leave 0.97 in both tails?
12.
With the ATUS
dataset, people 20-50 years old with positive earnings were selected and then
grouped into "low-earning" (people in families with earnings below
the 25th percentile) and "high-earning" (people in families with
earnings above the 75th percentile). The
following statistics, the sample average and sample standard deviation, were
calculated by SPSS:
hours shopping per day
              | N    | Average | Std Dev
low earnings  | 9372 | 44.70   | 77.97
high earnings | 9503 | 46.08   | 78.14
a.
What is the
difference in average time spent shopping?
For the null hypothesis of zero difference, form a hypothesis test and
explain the result.
b.
What is a
confidence interval for the difference?
c.
What is the
p-value of the difference?
13.
With the ATUS
dataset, people 20-50 years old with positive earnings were selected and then
grouped into "low-earning" (people in families with earnings below
the 25th percentile) and "high-earning" (people in families with
earnings above the 75th percentile). The
following statistics, the sample average and sample standard deviation, were
calculated by SPSS:
hours watching TV per day
              | N    | Average | Std Dev
low earnings  | 9372 | 2.31    | 2.40
high earnings | 9503 | 1.90    | 2.01
a.
What is the
difference in average time spent watching TV?
For the null hypothesis of zero difference, form a hypothesis test and
explain the result.
b.
What is a
confidence interval for the difference?
c.
What is the
p-value of the difference?
14.
SPSS produces the following output from the
CPS data, a crosstab of income category with kids in the household. "Low family income" means that the
person is in a family with income in the lowest quartile; middle means income
in the next two quartiles; high is in the top quartile. Each household is classified with either no
children, children under 6, or children under 18 but not under 6. (At 6 years old, children must start school.)
Count: family income categories (rows) by children in hh categories (columns)
                                               | no kids | kids under 6 | kids older than 6 but less than 18 | Total
low family income (less than 25th percentile)  | 33782   | 10417        | 9712                               | 53911
mid family income (25th - 75th percentile)     | 41349   | 28450        | 33409                              | 103208
high family income (more than 75th percentile) | 16964   | 13988        | 21731                              | 52683
Total                                          | 92095   | 52855        | 64852                              | 209802
a.
What is the
marginal probability for a household without children to have a low family
income? What is the marginal probability
for a household without children to have a high family income?
b.
What is the marginal
probability for a household with a high
family income to have children under 6 years old? What is the marginal probability for a
household with low family income to have children under 6?
15.
Using the ATUS
dataset that we've been using in class (download from Blackboard), form a
comparison of the mean amount of time spent on religious activity by two groups
of people (you can define your own groups, based on any of race, ethnicity,
gender, age, education, income, or other of your choice).
a.
What are the means
for each group?
b.
What is the
standard deviation of each mean? What is
the standard error of each mean?
c.
What is a 95%
confidence interval for each mean?
d.
Is the difference
statistically significant? Explain
carefully. What can be concluded from
this?
16.
Calculate the
probability in the following areas under the Standard Normal pdf with mean of zero and standard deviation of one. You might usefully draw pictures as well as
making the calculations. For the
calculations you can use either a computer or a table.
a.
What is the probability, if the true distribution is a Standard
Normal, of seeing a deviation from zero as large (in absolute value) as 1.9?
b.
What is the probability, if the true distribution is a Standard
Normal, of seeing a deviation from zero as large (in absolute value) as -1.5?
c.
What is the probability, if the true distribution is a Standard
Normal, of seeing a deviation as large (in absolute value) as 1.2?
17.
Calculate the
probability in the following areas under the Normal pdf
with mean and standard deviation as given.
You might usefully draw pictures as well as making the
calculations. For the calculations you
can use either a computer or a table.
a.
What is the probability, if the true distribution has mean -1 and
standard deviation of 1.5, of seeing a deviation as large (in absolute value)
as 2?
b.
What is the probability, if the true distribution has mean 50 and
standard deviation of 30, of seeing a deviation as large (in absolute value) as
95?
c.
What is the probability, if the true distribution has mean 0.5 and
standard deviation of 0.3, of seeing a deviation as large (in absolute value)
as zero?
18.
A paper by Chiappori, Levitt, and Groseclose
(2002) looked at the strategies of penalty kickers and goalies in soccer. Because of the speed of the play, the kicker
and goalie must make their decisions simultaneously (a Nash equilibrium in
mixed strategies). For example, if the
goalie moves to the left when the kick also goes to the left, the kick scores
63.2% of the time; if the goalie goes left while the kick goes right, then the
kick scores 89.5% of the time. In the
sample there were 117 occurrences when both players went to the left and 95
when the goalie went left while the kick went right. What is the p-value for a test that the
probability of scoring is different?
What advice, if any, would you give to kickers, based on these
results? Why or why not?
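When only two scoring rates and two sample sizes are given, each kick can be treated as a Bernoulli draw, whose variance is p(1 - p). The sketch below is one way to set up the test, assuming independent samples, the Normal approximation, and an unpooled standard error (a pooled version is also common and gives a similar answer here). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()


def two_proportion_test(p1, n1, p2, n2):
    """z-statistic and two-sided p-value for H0: the two proportions are equal."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled standard error
    z = (p1 - p2) / se
    p_value = 2 * (1 - Z.cdf(abs(z)))
    return z, p_value


# Figures from the question: 63.2% of 117 kicks versus 89.5% of 95 kicks.
z, p_value = two_proportion_test(0.632, 117, 0.895, 95)
print(z, p_value)
```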
19.
A paper by Claudia Goldin
and Cecilia Rouse (1997) discusses the fraction of men and women who are hired
by major orchestras after auditions. Some
orchestras had applicants perform from behind a screen (so that the gender of
the applicant was unknown) while other orchestras did not use a screen and so
were able to see the gender of the applicant.
Their data show that, of 445 women who auditioned from behind a screen,
a fraction 0.027 were "hired".
Of the 599 women who auditioned without a screen, 0.017 were hired. Assume that these are Bernoulli random variables. Is there a statistically significant
difference between the two samples? What
is the p-value? Explain the possible
significance of this study.
20.
Another paper, by
Kristin Butcher and Anne Piehl (1998), compared the
rates of institutionalization (in jail, prison, or mental hospitals) among
immigrants and natives. In 1990, 7.54%
of the institutionalized population (or 20,933 in the sample) were
immigrants. The standard error of the
fraction of institutionalized immigrants is 0.18. What is a 95% confidence interval for the
fraction of the entire population who are immigrants? If you know that 10.63% of the general
population at the time are immigrants, what conclusions can be made? Explain.
21.
Calculate the
probability in the following areas under the Standard Normal pdf with mean of zero and standard deviation of one. You might usefully draw pictures as well as
making the calculations. For the
calculations you can use either a computer or a table.
a.
What is the
probability, if the true distribution is a Standard Normal, of seeing a value
as large as 1.75?
b.
What is the
probability, if the true distribution is a Standard Normal, of seeing a value
as large as 2?
c.
If you observe a
value of 1.3, what is the probability of observing such an extreme value, if
the true distribution were Standard Normal?
d.
If you observe a
value of 2.1, what is the probability of observing such an extreme value, if
the true distribution were Standard Normal?
e.
What are the
bounds within which 80% of the probability mass of the Standard Normal lies?
f.
What are the
bounds within which 90% of the probability mass of the Standard Normal lies?
g.
What are the
bounds within which 95% of the probability mass of the Standard Normal lies?
22.
Consider a
standard normal pdf with mean of zero and standard
deviation of one.
a.
Find the area
under the standard normal pdf between -1.75 and 0.
b.
Find the area
under the standard normal pdf between 0 and 1.75.
c.
What is the
probability of finding a value as large (in absolute value) as 1.75 or larger,
if it truly has a standard normal distribution?
d.
What values form a
symmetric 90% confidence interval for the standard normal (where symmetric
means that the two tails have equal probability)? A 95% confidence interval?
23.
Now consider a
normal pdf with mean of 3 and standard deviation of
4.
a.
Find the area
under the normal pdf between 3 and 7.
b.
Find the area under
the normal pdf between 7 and 11.
c.
What is the
probability of finding a value as far away from the mean as 7 if it truly has a
normal distribution?
24.
If a random
variable is distributed normally with mean 2 and standard deviation of 3, what
is the probability of finding a value as far from the mean as 6.5?
25.
If a random
variable is distributed normally with mean -2 and standard deviation of 4, what
is the probability of finding a value as far from the mean as 0?
26.
If a random
variable is distributed normally with mean 2 and standard deviation of 3, what
values form a symmetric 90% confidence interval?
27.
If a random
variable is distributed normally with mean 2 and standard deviation of 2, what
is a symmetric 95% confidence interval?
What is a symmetric 99% confidence interval?
28.
A random variable
is distributed as a standard normal.
(You are encouraged to sketch the PDF in each case.)
a.
What is the
probability that we could observe a value as far or farther than 1.7?
b.
What is the
probability that we could observe a value nearer than 0.7?
c.
What is the
probability that we could observe a value as far or farther than 1.6?
d.
What is the
probability that we could observe a value nearer than 1.2?
e.
What value would
leave 15% of the probability in the left tail?
f.
What value would
leave 10% of the probability in the left tail?
29.
A random variable
is distributed normally with mean of 8 and standard deviation of 4. (You are encouraged to sketch the PDF in each
case.)
a.
What is the
probability that we could observe a value lower than 6?
b.
What is the probability
that we could observe a value higher than 12?
c.
What is the
probability that we'd observe a value between 6.5 and 7.5?
d.
What is the
probability that we'd observe a value between 5.5 and 6.5?
e.
What is the
probability that the standardized value lies between 0.5 and -0.5?
30.
You know that a
random variable has a normal distribution with standard deviation of 16. After 10 draws, the average is -12.
a.
What is the
standard error of the average estimate?
b.
If the true mean
were -11, what is the probability that we could observe a value between -10.5
and -11.5?
31.
You know that a
random variable has a normal distribution with standard deviation of 25. After 10 draws, the average is -10.
a.
What is the
standard error of the average estimate?
b.
If the true mean
were -10, what is the probability that we could observe a value between -10.5
and -9.5?
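Questions 30 and 31 rest on the fact that an average of n draws has standard error equal to the standard deviation divided by the square root of n. The sketch below assumes that part b asks about the sample average (not a single draw) and that the average is Normally distributed around the true mean; both are interpretations, and the function names are invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sd, n):
    """Standard error of the average of n independent draws."""
    return sd / sqrt(n)


def prob_average_between(true_mean, sd, n, low, high):
    """Probability that the sample average falls between low and high."""
    dist = NormalDist(true_mean, standard_error(sd, n))
    return dist.cdf(high) - dist.cdf(low)


# Question 30: standard deviation 16, 10 draws, hypothesized true mean -11.
print(standard_error(16, 10))
print(prob_average_between(-11, 16, 10, -11.5, -10.5))
```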
32.
You are consulting
for a polling organization. They want to
know how many people they need to sample, when predicting the results of the
gubernatorial election.
a.
If there were 100
people polled, and the candidates each had 50% of the vote, what is the
standard error of the poll?
b.
If there were 200
people polled?
c.
If there were 400
people polled?
d.
If one candidate
were ahead with 60% of the vote, what is the standard error of the poll?
e.
They want the poll
to be 95% accurate within plus or minus 3 percentage points. How many people do they need to sample?
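The standard error of a poll share p with n respondents is the square root of p(1 - p)/n, and the sample-size question inverts that formula. The sketch below is one way to carry out both steps, assuming simple random sampling, the Normal approximation, and the worst case p = 0.5 when sizing the poll; the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def poll_se(p, n):
    """Standard error of a poll share p based on n respondents."""
    return sqrt(p * (1 - p) / n)


def required_n(margin, p=0.5, level=0.95):
    """Smallest n whose confidence interval half-width is at most `margin`."""
    c = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((c / margin) ** 2 * p * (1 - p))


for n in (100, 200, 400):
    print(n, poll_se(0.5, n))
print(required_n(0.03))
```

Note that quadrupling the sample only halves the standard error, which is why tightening a poll's margin gets expensive quickly.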