Refresher on Research Methods
Kevin R Foster, CCNY Fall 2012
Topics include:
1. Basics
2. Plagiarism and rules for proper citations
3. Internet searches and databases (EconLit, JStor, LexisNexis)
4. Sources of data for economics
5. Basics of computers: MS Word & Excel (including equation editor and Analysis ToolPak)
6. Math Review (percentage growth, exponents, e^x and ln(x))
7. Crash lesson on derivatives ("double the coefficient rule")
Below are some (random) thoughts on each one. Note that this online version is an ongoing draft; check periodically to get the latest version. Because this is a draft, please be generous in sending suggestions, corrections, or errata to kfoster@ccny.cuny.edu.
Please don't just print out the whole pdf file.
1. Basics
Welcome to the CCNY MA Economics program! I have a few items here to give advice to new students. Please excuse me if they are too basic, but I want to make sure that everyone knows about these.
2. Plagiarism
Read CUNY's policy on academic integrity (on the course syllabus and CCNY web page).
See also http://www.dartmouth.edu/~sources/, http://www.princeton.edu/pr/pub/integrity/index.html, and http://www.economics.harvard.edu/files/WritingEconomics.pdf
RULES:
example of cite:
"The full time earnings of black and Hispanic women averaged less than 60 percent of white male earnings" (Ehrenberg and Smith, p. 417).
example of long quote with cite:
Word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word word (BlabBlab, p. 10000)
example:
In 1997, Hispanic females earned 51% of what White males earned (Ehrenberg and Smith, p. 417).
examples:
Ehrenberg, Ronald G., and Robert S. Smith, (2000). Modern Labor Economics: Theory and Policy, seventh edition.
Heckman, James, (1974). "Shadow Prices, Market Wages, and Labor Supply," Econometrica, 42(4).
Bureau of Labor Statistics, (2000). Data on "1998 Internet and Computer Use," downloaded from http://www.bls.census.gov/cps/computer/1998/sdata.htm
Vames, Steven, (2000). "Income Gains in March Outpaced Spending," New York Times Online, April 28, 2000. http://www.nytimes.com/yr/mo/day/news/financial/28tsc-economy.html
Failure to follow these rules is a violation of Academic Integrity. This is a severe violation of the basic principles of the academic community. You may be brought up on academic charges before a Disciplinary Committee of the College, where you are subject to a range of consequences up to expulsion.
3. Internet Resources
An internet search on a given topic will return a wide variety of hits. The most difficult task is to differentiate the junk (most of it) from the few bits of useful information. Since you are a student, just beginning to learn the field, it is only to be expected that you will have a more difficult time distinguishing the good from the bad. You must be wise, dutiful in checking out sources, and should ask questions.
You might usefully create a web page of your own. This takes about 30 minutes to learn, and removes any mystery. Some students still think that publishing online is difficult, so only very high-quality material should be online: FALSE. Any imbecile can put any damn thing online, and we do! You've got to be careful. Blogs and wikis have lowered the bar even further. A reader needs to be careful and critical of every source.
There are certain sources that have filtered out much of the worst junk. You can limit your search to only articles published in refereed journals by searching online databases (from the CCNY Library), such as EconLit and JStor. Of course not everything that is published is correct; you must still be diligent in finding recent sources, making your own evaluation of the plausibility of the claims, and arriving at your own judgments.
Both of these links are easily found from the CCNY Library's page, where you can pick them from the list. You need to access them from a CCNY computer, or else you will need a login. The library can provide one, but this takes time, so don't wait until the night before the paper is due!
EconLit collects citations, most with a detailed abstract, and a large fraction have full text available. "Full text" means that you don't have to find the physical journal; you never touch paper. Just save the .pdf file that it produces. Sometimes there is no linked fulltext but a Google search of author/title will hit.
A hint: one of my favorite journals to recommend to students is the Journal of Economic Perspectives (JEP). This gives excellent overviews of particular topics in economics, meant to be accessible to a non-specialist, written by some of the most prominent people in those fields. It is published by the American Economic Association (AEA) and is available through both EconLit and JStor. The Journal of Economic Literature (JEL) is also from the AEA and it also has occasional articles that summarize a topic. The library has both journals on the shelves, so you can browse through them, just skimming to find interesting articles. It's a great way to spend a few hours!
JStor has the full text of articles published in the foremost journals of various disciplines (including Economics, Finance, and Statistics). Generally these articles are at least 3-5 years old, but it gives access to every article in the most important journals for the past several decades.
For news stories, you can search LexisNexis. This collects the full text of major newspapers, including the New York Times and the Economist. Again, you need to access it from a CCNY computer, or get a login to work remotely.
There are other outlets, such as prominent and well-regarded thinktanks and policy institutions. In economics, the National Bureau of Economic Research (NBER, at www.nber.org) is highly regarded, as is the Brookings Institution (www.brookings.edu).
Of course all of these sources give "the establishment view," not the ideas and opinions of extremists. This is true by definition: formerly extreme views become mainstream once "the establishment" has published them. I do not want to discourage you from research on the fringes; however, many classes at this College will require that you demonstrate a knowledge of the mainstream. (Marx and Keynes began their radical writings by first demonstrating their knowledge of what had been written previously, to show where it had holes.)
Finally, the CUNY library catalog is CUNY+. I often find it useful to click on the tab for "Select Individual CUNY Libraries" to search only CCNY's library. Then, if this one doesn't have it, you can search the other CUNY schools.
4. Sources of data for economics
There are many sources of economics data online. Although the principal sources change depending on the field, here are some of the basics:
http://rfe.org RFE is Resources for Economists, sponsored by the American Economic Association. It is a great site, a clearinghouse for all of the important econ sites.
http://www.federalreserve.gov The Federal Reserve
http://www.ny.frb.org our own
http://www.census.gov/ the
[or http://www.census.gov/main/www/subjects.html is a list of topics on which the Census Bureau has data.]
http://stats.bls.gov/ is the Bureau of Labor Statistics
http://www.oswego.edu/~economic/data.htm is a good portmanteau of links from SUNY Oswego
http://www.worldbank.org/lsms/ World Bank
CPI Chained http://www.bls.gov/cpi/super_cpi.pdf
Overview of BLS data www.bls.gov/bls/inflation.htm
GDP Deflator (from BEA) www.bea.gov/bea/dn/home/gdp.htm
Overview of BEA data http://www.bea.gov/
GDP www.bea.gov/bea/dn/home/gdp.htm
NIPA tables www.bea.gov/bea/dn/home/gdp.htm
Unemployment www.bls.gov/cps/home.htm
Wages www.bls.gov/bls/employment.htm
BLS details on coverage www.bls.gov/opub/hom/homch1_a.htm
5. Basics of computers: MS Word & Excel (including equation editor and Analysis ToolPak)
Scary basic...
You will be using a few computer programs a lot while you're in college. The most common programs that an econ student would use are:
- an internet browser, like Microsoft's Internet Explorer or Mozilla's Firefox or Google Chrome or Apple Safari or Opera or ....
- a word processing program, probably Microsoft Word although OpenOffice gives a free version
- a spreadsheet, probably Microsoft Excel although OpenOffice gives a free version
- a presentation program, such as Microsoft PowerPoint or OpenOffice version
- data analysis programs such as SAS or SPSS (for econometrics)
- mathematical programs such as Matlab or Gauss (for advanced work)
I recommend OpenOffice to you, especially if you have a little netbook. It does most everything that you would pay Microsoft for but it's free.
The internet browser is one of the most basic, which you've used if you've ever read a web page. In the Econ computer lab, just double click (click the left-hand button on the mouse twice in quick succession) when the arrow (moved by the mouse) is on the picture above the words "Internet Explorer." From there you can click to find the world!
The word processor is what you'll use to write papers. You basically just type the text that you want. It has a few extras, for instance you can click on "Insert" (on the bar on top) and choose to insert pictures (if you have them on disk) or charts from a stats program. The other common thing to insert is an equation. Getting to the equation editor requires going to "Insert" then "Object…" then find "Microsoft Equation" under the "Create New" tab. Equation Editor is useful to produce text such as:
or whatever ferocious-looking stuff you want to create. Read the Help section to figure out all of the details of how to create equations with subscripts, superscripts, sum and integral signs with the proper notations, etc.
The more practice you have with MS Word, the better you'll get, using things like bold, italics, sub- or super- scripts, new fonts, tables, pictures, graphs, etc. You can get tips from other students, who were once as new to computing as you are now! There's no shame in not knowing something; the shame is to choose to remain ignorant.
If you go to "File" then "New" you can find templates for common document types: they have some decent resume templates, for instance. Of course if you're a professional graphic designer you can do better, but if you're a novice you can do worse!
The next common MS program is Excel, for spreadsheets. This is just a fancy calculator that can do a great deal of work for you. You might use it in many classes. Suppose you have data on some statistic that x = 100, 110, 121, 133. You type those four values into the spreadsheet like this:
100 |
110 |
121 |
133 |
Suppose you want to find the percent growth. The formula for percent growth from \(x_0\) to \(x_1\) is \(\frac{x_1 - x_0}{x_0}\). So in the next column we want to type a formula to find this growth rate. The first row we leave blank, but in the next row we type the equals sign, "=", which tells the program that you're entering a formula.
100 |
110 | =
121 |
133 |
Then after typing the equal sign (but before hitting return or some other key) we can just click to reference any other value. We want to find the difference so we click on the cell where we typed "110" then the minus key, "-", then click on the cell where we have 100. Note that Excel will fill in its version of the formula: if you typed the numbers in column A and the formula in column B, then the formula in the second row of column B (which is B2) will read "=A2-A1" which tells Excel to subtract the value in A1 from the value in A2. It will look like:
100 |
110 | =A2-A1
121 |
133 |
Now hit enter and Excel will calculate the value. Next you can then copy the formula (click on the cell with the formula you want then "Edit/Copy") and paste it in to the other cells, getting:
100 |
110 | 10
121 | 11
133 | 12
For only four numbers that's not a big time-saving, but you can imagine the savings if you have, say, quarterly data on GDP back to 1948. To find the percent change we next take the difference and divide by the initial value, so in the second row of column C we type "=" again, then click on the cell with 10, then the divide key, "/", then click on the cell with 100. Now Excel creates the formula "=B2/A1", which evaluates to 0.1. Copy this down to get:
100 | |
110 | 10 | 0.1
121 | 11 | 0.1
133 | 12 | 0.099174
If you forget how to convert decimals to percents you just highlight the cells in column C that you want to format, click on the "%" button, and Excel does it for you.
Of course you could have done the calculation in one step by entering the formula (in D2) as "=(A2-A1)/A1".
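The same steps can be sketched outside a spreadsheet. What follows is only an illustration in Python (an assumption; these notes otherwise use Excel), and the variable names x, differences, and growth are made up for the example. The numbers are the four values typed into column A.

```python
# Levels of the series, as typed into column A of the spreadsheet.
x = [100, 110, 121, 133]

# Column B: the change from one row to the next (no entry for the first row).
differences = [x[i] - x[i - 1] for i in range(1, len(x))]

# Column C: the change divided by the previous level, (x1 - x0) / x0.
growth = [(x[i] - x[i - 1]) / x[i - 1] for i in range(1, len(x))]

for level, diff, g in zip(x[1:], differences, growth):
    print(f"{level:5d} {diff:4d} {g:8.4%}")
```

Each list entry looks one row up, which is what the copied-down Excel formula does.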
You will quickly
figure out that you can write much more complicated formulas, which can use
Excel's embedded formulas: from simple ones like "=sqrt(A2)" to
ln(A3) to particular Excel formulas (check the Help menu or the button marked to see a list) for stats like Average,
financial formulas for annuities or time value of money problems, even some
that you probably haven't heard of.
Excel will plot figures ("Insert" then "Chart" to see the many types) and do basic data analysis ("Tools" then "Data Analysis" for regressions, histograms, lots of the basics). You can use it to present and analyze small datasets. (Where "small" is hard-coded into older versions of the program to mean less than about 65,000 observations.) Macro data is often short enough to fit (even monthly data on, say, industrial production, is available for maybe 70 years so that's just 840 obs!) while many other datasets are far larger.
But note that you must have the "Analysis ToolPak" installed if you're using Excel at home. For Excel 2003 you can just find "Add-Ins" and follow those instructions; for other versions you should just search under the Help menu for Analysis ToolPak (note the strange spelling: another reason to condemn Bill Gates).
6. Math Review (percentage growth, exponents, e^x and ln(x))
Percent:
Remember that a percent is just a convenient way of writing a decimal. So 15% is really the number 0.15, 99% is 0.99, and 150% is 1.50. I assume you know this, but sometimes it can get confusing. For instance, we might want to find 10% of 10%. We CANNOT just multiply 10*10, get 100, and leave that as the answer! Rather we first convert them to decimals and then multiply: so 0.10 * 0.10 = 0.01 = 1%.
Examples (the answers are worked out further below):
30%*7% + 70%*5% = ?
50%*10% + 50%*3% = ?
Percentage Change (Growth Rate):
Consider GDP in the Great Depression. The data (billions of $) are:
Year | GDP
1929 | 103.8
1930 | 91.1
1931 | 76.4
1932 | 58.6
1933 | 56.2
We want to figure out the percentage change in GDP during the Depression. We can see that, from 1929 to 1930, US GDP fell by $(103.8 - 91.1) bn = $12.7 bn. But how big is this? From the first to the second quarter of 1997, US GDP grew by $108.4 bn (from 7955 to 8063.4). The economy has gotten much larger in the intervening span of time. The percentage change in GDP tells us how large the fall in GDP was, relative to the total amount of $103.8bn available to be lost.
So we calculate the percentage change in GDP as (91.1 - 103.8)/103.8, which is about -0.1224. To make this decimal into a percentage, we multiply by 100 (move the decimal point 2 spots to the right), so this is a growth rate of -12.24% (or a fall of 12.24%; sometimes we use a word instead of the negative sign). In general, if we have values for some series, x, and write \(x_1, x_2, x_3, \ldots\) (where the subscript number refers to the time period), the percent change from time 1 to time 2 is \(100 \cdot \frac{x_2 - x_1}{x_1}\). We also sometimes write this as the change in x (\(\Delta x\)) divided by the level of x, or \(100 \cdot \frac{\Delta x}{x}\). Finally, we can also calculate the percentage change as \(100 \cdot \left(\frac{x_2}{x_1} - 1\right)\), which is the same thing (as a bit of algebra should be able to convince you).
Examples:
What are the percent changes for US GDP in the years of the Great Depression?
Calculus and Percentages:
Calculus (for those of you who have taken the course) can simplify our use of percentages and percentage changes. Remember that the derivative gives us the change in the value, so \(dx \approx \Delta x\). Then remember that the derivative of the natural logarithm is \(d\ln(x) = \frac{dx}{x}\), which is the formula for the percent change. Thus the derivative of the natural log is the percentage change (another reason that logarithms are your friend!).
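A quick numerical check makes the connection concrete. In this Python sketch (an illustration only) the values 100 and 101 are hypothetical, chosen to give a 1% rise; the second comparison uses the 1929 and 1930 GDP figures from the table above. The function names are made up for the example.

```python
import math

def percent_change(old, new):
    """Exact percent change, as a decimal: (new - old) / old."""
    return (new - old) / old

def log_difference(old, new):
    """Change in the natural log, ln(new) - ln(old)."""
    return math.log(new) - math.log(old)

# Hypothetical small change: 100 to 101, a rise of 1%.
small_exact = percent_change(100.0, 101.0)
small_log = log_difference(100.0, 101.0)

# Large change: GDP from 1929 to 1930 (billions of $).
big_exact = percent_change(103.8, 91.1)
big_log = log_difference(103.8, 91.1)

print(f"small: exact {small_exact:.5f}, log difference {small_log:.5f}")
print(f"large: exact {big_exact:.5f}, log difference {big_log:.5f}")
```

The difference of logs tracks the percent change closely when the change is small and drifts away from it when the change is large, so the shortcut is best kept for modest growth rates.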
Percent:
30%*7% + 70%*5% = 0.3*0.07 + 0.7*0.05 = 0.021 + 0.035 = 0.056 = 5.6%
50%*10% + 50%*3% = 0.5*0.1 + 0.5*0.03 = 0.05 + 0.015 = 0.065 = 6.5%
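These answers can be double-checked with a few lines of Python. This is just a sketch; the helper name pct is made up for the illustration, and all it does is convert a percent to a decimal before anything is multiplied.

```python
def pct(value):
    """Convert a percent such as 30 into the decimal 0.30."""
    return value / 100

first = pct(30) * pct(7) + pct(70) * pct(5)
second = pct(50) * pct(10) + pct(50) * pct(3)
ten_of_ten = pct(10) * pct(10)  # 10% of 10%

# The .1% format multiplies by 100 and appends the percent sign.
print(f"{first:.1%}  {second:.1%}  {ten_of_ten:.1%}")
```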
Percent Change in GDP:
1929-30 | -0.12235
1930-31 | -0.16136
1931-32 | -0.23298
1932-33 | -0.04096
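These percent changes can be reproduced from the GDP figures with a short Python sketch. The dictionary name gdp and the label format (such as "1929-30") are choices made for this illustration only.

```python
# US GDP in billions of $, from the Great Depression table.
gdp = {1929: 103.8, 1930: 91.1, 1931: 76.4, 1932: 58.6, 1933: 56.2}

years = sorted(gdp)
changes = {}
for prev, curr in zip(years, years[1:]):
    label = f"{prev}-{str(curr)[2:]}"  # e.g. "1929-30"
    changes[label] = (gdp[curr] - gdp[prev]) / gdp[prev]

for label, change in changes.items():
    print(f"{label}  {change:.5f}")
```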
Review of Math: exponents, the mysterious e, and natural logarithms
You might not have used exponents recently so you may have forgotten the details. To review, recall that exponents are simply a compact way of writing out multiplication (just as multiplication just means adding many times). So \(10^2\) is 10 multiplied by itself twice, 10*10=100. Then \(10^3 = 1000\), \(3^3 = 27\), and \(2^5 = 32\). (Note that some computer programs, where superscripts are difficult, use the symbol ^ so they would write 10^3, 3^3, and 2^5. Both these are often used when the exponent is a complicated function, so that the poor reader doesn't strain her eyes reading tiny superscript type.)
That is pretty simple, but then the mathematicians asked, what if these exponents are not nice counting numbers but zero, or negative, or fractions? First, any (nonzero) number raised to a zero exponent is defined as being equal to 1. Negative exponents are defined as raising the reciprocal to that power, so \(10^{-2}\) is 1/100 = 0.01, \(3^{-3} = 1/27\), and \(2^{-5} = 1/32\).
Note that for numbers in decimal format, exponents of 10 mean moving the decimal point left or right: \(3.1415 \cdot 10^{2} = 314.15\); \(3.1415 \cdot 10^{-2} = 0.031415\).
Fractional exponents are defined analogously to their whole-number relatives: if \(10^2\) is 10*10=100 then \(100^{1/2}\) "un-does" this, taking the square root and so getting back 10. Again, \(3^3 = 27\) and \(2^5 = 32\), so this means that \(27^{1/3} = 3\) and \(32^{1/5} = 2\).
From these examples we can figure out some of the basic rules for what happens when two numbers, with a common base, are multiplied together or when a base-exponent pair is itself raised to a power.
First, what is \(2^3 \cdot 2^2\)? By the definition of exponents, \(2^3 = 2 \cdot 2 \cdot 2\) and \(2^2 = 2 \cdot 2\), so \(2^3 \cdot 2^2 = (2 \cdot 2 \cdot 2) \cdot (2 \cdot 2) = 2^5\). This rule generalizes (not all examples do!) to the form that, given a base x, exponent A and exponent B,
\(x^A \cdot x^B = x^{A+B}\).
This holds as long as A and B are real numbers (positive, negative, fractions, zero, whatever) and the base x is positive.
Next, what if we had, instead, \((2^3)^2\)? Again we just write out the simple meanings of those exponents. Working from outside in, any number that is squared is simply multiplied by itself, so \((2^3)^2 = 2^3 \cdot 2^3 = 2^6\). The general rule, for a base, y, and exponents C and D, is that:
\((y^C)^D = y^{C \cdot D}\).
Again, this holds for all real exponents when the base is positive. (I'm ignoring imaginary numbers only because economists don't often use those, contrary to many criticisms!)
You can see that the rules for exponents are, in some sense, taking the operations down a rank: multiplying two numbers with exponents means adding exponents; taking exponents of a number with exponents means multiplying the exponents.
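The two rules are easy to test on a computer. The Python sketch below is only an illustration (Python writes exponents with ** rather than ^), and the particular base and exponents are arbitrary choices, not anything special.

```python
import math

x, A, B = 2.0, 3.0, 2.0  # arbitrary base and exponents for the check

# Multiplying two powers of the same base adds the exponents.
product_lhs = x**A * x**B
product_rhs = x**(A + B)

# Raising a power to a power multiplies the exponents.
power_lhs = (x**A)**B
power_rhs = x**(A * B)

# Negative and fractional exponents follow the definitions above.
negative_example = 10.0**-2    # reciprocal of 10 squared
fractional_example = 100.0**0.5  # square root of 100

print(product_lhs, product_rhs, power_lhs, power_rhs)
print(negative_example, fractional_example)
```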
Let me point out a couple of common applications of these rules (and a couple of others):
Next is logarithms. These arise quite naturally if we want to un-do the operations of exponents. From knowing that \(10^3 = 1000\), suppose we asked, 10 to what power equals 1000? We could write \(10^x = 1000\) or that \(\log_{10}(1000) = x\). The difference is only notational.
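As a small illustration of logarithms un-doing exponents, here is a Python sketch using the standard math module. The second example (base 2) is an extra case added for the illustration.

```python
import math

# 10 to what power equals 1000?
power_of_ten = math.log10(1000)

# 2 to what power equals 32? math.log takes the base as a second argument.
power_of_two = math.log(32, 2)

# Going back the other way recovers the original numbers.
back_to_1000 = 10 ** power_of_ten
back_to_32 = 2 ** power_of_two

print(power_of_ten, power_of_two)
```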
The Mysterious "e"
Mathematicians have come to depend on one particular special number when working with exponents, which is notated e. It's like pi (π), 3.1415… because it's a number (called a transcendental) that goes on forever and seems quite mysterious until you've worked with it a while and gotten used to it.
For now, take this information on faith; don't worry too much about where this number comes from. (If you want to prove that mathematics has logical holes, that might possibly all come crashing down, you're too late: Gödel did it decades ago!) The number, e, comes from the area under a hyperbola (the equation y=1/x) rightwards from 1: that area equals exactly 1 when it is measured out to x = e. Where π is a basic measure for a circle (it is the area of a circle with radius of 1), e is a basic measure for calculus, because \(e^x\) is a function of x with derivative \(e^x\).
The mysterious e can be understood in the context of finding interest rates. Suppose you had $100 to invest at a 10% interest rate. After a year, it would clearly be worth $110. But suppose that another bank offered the same 10% rate, but with a bit of a difference: after six months they calculate the interest up to then (the 5%) and then again after six months they do the same. So you get the effect of compounding: after one year, you'd have $110.25. What if another bank offered to "compound" three times? Four? More? In the limit, compounding daily or hourly or minutely, we approach "continuous compounding," which would get you a final balance of \(100 \cdot e^{0.1} = 110.5171\) (you could see this with a simple Excel sheet).
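The compounding story can also be sketched in Python instead of a spreadsheet. This is only an illustration: the function name and the list of compounding frequencies are made up for the example, while the $100 principal and 10% rate come from the text.

```python
import math

def balance_after_one_year(principal, rate, periods):
    """Balance after one year when interest is compounded `periods` times."""
    return principal * (1 + rate / periods) ** periods

# Once, twice, quarterly, monthly, daily, hourly.
for periods in (1, 2, 4, 12, 365, 8760):
    balance = balance_after_one_year(100, 0.10, periods)
    print(f"{periods:5d} times per year: {balance:.4f}")

# The limit of ever more frequent compounding.
continuous = 100 * math.exp(0.10)
print(f"continuous compounding: {continuous:.4f}")
```

As the number of compounding periods rises, the balance creeps up toward the continuous-compounding figure but never passes it.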
Now, in the real world, banks don't generally offer these continuously compounded rates, but it's nice for our theories. It also comes in handy for growth and decay processes and solving differential equations. The mathematicians have learned to love it, and even have figured out that \(e^{i\pi}\) (using i, the square root of negative one, and pi) is -1, which brings together some of the strangest bits of math.
The inverse function, to un-do what \(e^x\) does, is the natural logarithm, denoted ln(x). This is the same as \(\log_e\), log with base e. The notation, ln(x)=y, is asking: taking e to what power (y) will give me an answer of x? The "natural" logarithm is natural because it uses the same number, e, that we figured out above as having desirable features for growth. The main problem to watch for (particularly if you're making transformations of one variable to another) is that e cannot be raised to any power to make it equal
to zero or less (negative). So ln(0) or
ln(-1) return nonsense answers (either
Outside finance, the macroeconomists probably make the greatest use of \(e^x\) and ln(x), because of the connection with growth. Suppose that an economy's GDP grows at 5% per year, every year, for 50 years (assuming continuous compounding), so that by the end of the 50 years it has grown to just over 12 times its original size. A graph of raw GDP would look like this (starting at 1):
shooting up to ever-larger values, quickly becoming difficult to work with. (For instance, can you tell, just from the graph, how big the economy is after 25 years? It's somewhere between 2 and 4 times as big as it started, but that's about all that you could tell.) So taking the natural log of GDP would return a nice series like this:
It's transformed from a difficult graph to a simple line. This is equivalent to plotting the previous graph on a log scale.
If the "log" referred to were to a base 10, then the graph would essentially be plotting how many zeros were in the GDP: from 1 to 10 is adding a zero, from 10 to 100 is adding another, etc. When the "log" referred to is base e, then the interpretation as "zeros before the decimal" is no longer quite correct, but it's similar.
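To see why the log graph is a straight line, here is a Python sketch of the same hypothetical economy: GDP starts at 1 and grows at a continuously compounded 5% per year for 50 years. The variable names are made up for the illustration.

```python
import math

growth_rate = 0.05
years = range(0, 51)

# Raw GDP, starting at 1 and growing at 5% with continuous compounding.
gdp = [math.exp(growth_rate * t) for t in years]

# The natural log of GDP, which is what the second graph plots.
log_gdp = [math.log(level) for level in gdp]

# Year-to-year steps in log GDP: every one equals the growth rate.
steps = [log_gdp[t] - log_gdp[t - 1] for t in range(1, len(log_gdp))]

print(f"GDP after 25 years: {gdp[25]:.2f}")
print(f"GDP after 50 years: {gdp[50]:.2f}")
print(f"smallest and largest log step: {min(steps):.4f}, {max(steps):.4f}")
```

Because every step in the log series is the same size, the plotted points lie on a line whose slope is the growth rate.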
GDP values often
have these growth paths, along with financial values (so the value of the Dow
Jones average of the
Politicians often play games with these sorts of numbers (knowingly or unknowingly) to confuse innumerate people. Suppose an economy has been growing at 5% per year for many years, attaining last year's lucky GDP of 8888 per person. In the last year, a new political party took power and now GDP per person is 9188. The new political party might brag that they "grew" GDP by 300, and show a graph like this:
The slow-down in growth is tough to see. But the growth rate did actually slow, which a log-chart would show, so that:
Where you can see that the line sags noticeably in the last year charted. A 5% growth rate would have gotten the economy to 9332 not 9188, so the new policies could be criticized for stopping a twenty-year record of steady growth.
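The arithmetic behind this example takes only a few lines. The Python sketch below is an illustration, using the GDP-per-person figures from the text; the variable names are made up.

```python
import math

last_year, this_year = 8888, 9188
trend_growth = 0.05

# What the new party brags about: the change in levels.
change_in_level = this_year - last_year

# What actually happened to the growth rate.
actual_growth = (this_year - last_year) / last_year

# Where the economy would be if the 5% trend had continued.
on_trend = last_year * (1 + trend_growth)

# On a log chart the last step is shorter than the earlier ones.
actual_log_step = math.log(this_year) - math.log(last_year)
trend_log_step = math.log(1 + trend_growth)

print(f"change in level: {change_in_level}")
print(f"actual growth:   {actual_growth:.2%}")
print(f"on-trend GDP:    {on_trend:.0f}")
```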
7. Crash lesson on derivatives ("double the coefficient rule")
The derivative gives us the slope of a function at a particular point.
A line always has the same slope, so if the equation for a line is:
\(Y = mX + b\)
then the slope is \(m\) at every point. (When X rises by 1, Y rises by m.)
But other curves have different slopes at different places. Consider the function, \(y = x^2\). This is a parabola, a smooth curve that, near zero, has a very low slope, but then gets steeper and steeper. Can we find a neat equation that tells us exactly how much steeper? Yes: the derivative. The derivative of \(y = x^2\) is \(2x\): we take the coefficient on the x-squared term (which is 1) and double it.
If there were some other coefficient, say:
\(y = ax^2\),
then the derivative would be
\(2ax\).
If there were both linear and squared terms added together, then we would proceed item-by-item:
\(y = ax^2 + cx\) has derivative \(2ax + c\).
[Note: this only works for items added together.]
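The doubling rule can be checked numerically by measuring the slope over a very small step. The Python sketch below is only an illustration: the coefficients 3 and 5 are hypothetical, and numerical_slope is a made-up helper that approximates the slope rather than computing it exactly.

```python
def numerical_slope(f, x, h=1e-6):
    """Approximate slope of f at x, from points a tiny step h on each side."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    """A hypothetical function with a squared term and a linear term."""
    return 3 * x**2 + 5 * x

def slope_by_rule(x):
    """Double the coefficient on the squared term, keep the linear coefficient."""
    return 2 * 3 * x + 5

for x in (0.0, 1.0, 2.5):
    print(f"x = {x}: measured {numerical_slope(f, x):.4f}, rule {slope_by_rule(x):.4f}")
```

The measured slope and the rule agree at each point, and both get larger as x grows, which is the "steeper and steeper" behavior of a parabola.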
Want to know more? Take calculus: it's fun and healthy!