Answer all questions in a format similar to that of the answers to your tutorial questions. That is, when you use Stata (R) to conduct empirical analysis, show your Stata (R) commands and outputs (e.g., screenshots of commands, tables, and figures). When you are asked to discuss or interpret, keep your response brief and compact. To facilitate grading, please label all your answers clearly.
The marking system will check for similarity, and UQ’s student integrity and misconduct policy on plagiarism applies.
Question 1: OLS, TSLS, and Panel Data Regression
Consider the following linear panel data model
yit = β0 + β1 xit,1 + β2 xit,2 + β3 xit,3 + uit,    (1)
where (xit,1; xit,2; xit,3) are explanatory variables, uit is unobservable error, and (β0, β1, β2, β3) are unknown parameters of interest. As usual, i = 1, …, N refers to individuals (id, cross-sectional units) and t = 1, …, T refers to time periods. Use the data file Q1data.dta (provided) to answer the following questions. Unless otherwise specified, use 5% as the significance level for all the tests below.
(f) (10 points) To capture potential time effects, consider the following model
yit = β0 + β1 xit,1 + β2 xit,2 + β3 xit,3 + γ2 d2,t + … + γT dT,t + vit,    (2)
where ds,t are time dummies (ds,t = 1 if s = t, and 0 otherwise). Note that the sample includes data from t = 1 to t = T, but (2) includes only dummies for t = 2 to t = T. Why?
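The role of the omitted t = 1 dummy can be illustrated with a minimal sketch, written here in Python with NumPy rather than Stata (R) and using a small made-up panel (N = 4 and T = 3 are assumptions, not the dimensions of Q1data.dta). It compares the rank of the regressor matrix when all T dummies are included alongside the intercept with the rank when the first dummy is dropped.

```python
import numpy as np

N, T = 4, 3  # hypothetical panel dimensions, chosen only for illustration
t = np.tile(np.arange(1, T + 1), N)  # period index for each (i, t) row

# One dummy per period: column s-1 equals 1 when t == s, and 0 otherwise.
all_dummies = (t[:, None] == np.arange(1, T + 1)).astype(float)
const = np.ones((N * T, 1))

X_full = np.hstack([const, all_dummies])         # intercept + dummies for t = 1..T
X_drop = np.hstack([const, all_dummies[:, 1:]])  # intercept + dummies for t = 2..T

rank_full = np.linalg.matrix_rank(X_full)
rank_drop = np.linalg.matrix_rank(X_drop)

print(X_full.shape[1], rank_full)  # 4 columns, rank 3: the T dummies sum to the intercept
print(X_drop.shape[1], rank_drop)  # 3 columns, rank 3: full column rank
```

With all T dummies the columns are perfectly collinear with the intercept, so the coefficients are not separately identified; dropping one period makes it the baseline against which the remaining γ coefficients are measured.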
Estimate (2) using TSLS with zit,1 and zit,2 as instrumental variables (IVs) and test whether the time effects are significant, i.e., whether at least one γt is nonzero. With time effects controlled for, do you think xit,1 is still an endogenous regressor? [Hint: Use OLS and TSLS to estimate (2) and compare their estimates.]
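The mechanics of this exercise can be sketched as follows, in Python with NumPy on simulated data. Everything in the block is an assumption made for illustration: the data-generating process, the coefficient values, T = 4, and the variable names do not come from Q1data.dta, and the variance formula is the homoskedastic one with a degrees-of-freedom correction. The sketch estimates the model by OLS and by TSLS and forms a Wald statistic for the joint null that all time-dummy coefficients are zero.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 4                 # hypothetical panel dimensions
n = N * T
t = np.tile(np.arange(1, T + 1), N)

# Simulated data (NOT Q1data.dta); every coefficient below is made up.
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x2, x3 = rng.normal(size=n), rng.normal(size=n)
v = rng.normal(size=n)
x1 = 0.8 * z1 + 0.5 * z2 + 0.7 * v + rng.normal(size=n)  # endogenous: depends on v
D = (t[:, None] == np.arange(2, T + 1)).astype(float)     # dummies for t = 2..T
gamma = np.array([0.5, 1.0, 1.5])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + D @ gamma + v

const = np.ones(n)
X = np.column_stack([const, x1, x2, x3, D])      # regressors of the model
Z = np.column_stack([const, z1, z2, x2, x3, D])  # excluded IVs + exogenous regressors


def ols(X, y):
    """OLS coefficients."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def tsls(X, Z, y):
    """TSLS coefficients and homoskedastic covariance matrix."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # first-stage fitted values
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    resid = y - X @ beta                            # residuals use the original X
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    V = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, V


b_ols = ols(X, y)
b_tsls, V = tsls(X, Z, y)

# Wald test of H0: all time-dummy coefficients are zero (chi-squared, T - 1 = 3 df).
g = b_tsls[4:]
W = g @ np.linalg.solve(V[4:, 4:], g)
CRIT_CHI2_3 = 7.815  # 5% critical value of chi-squared with 3 degrees of freedom

print("x1 coefficient, OLS :", b_ols[1])
print("x1 coefficient, TSLS:", b_tsls[1])
print("Wald statistic:", W, "reject H0:", W > CRIT_CHI2_3)
```

In this simulation xit,1 is endogenous by construction, so the OLS and TSLS estimates of its coefficient differ noticeably; a small gap between the two estimates would instead be consistent with little remaining endogeneity.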
(g) (10 points) Suppose that vit = αi + eit with . Re-write (2) as
yit = β0 + β1 xit,1 + β2 xit,2 + β3 xit,3 + γ2 d2,t + … + γT dT,t + αi + eit.    (3)
Treat αi as fixed effects (FE). Use an FE estimator to estimate (3). (To report the estimation results, you only need to post the FE regression table returned by Stata.) Explain why the FE estimator cannot estimate all slope coefficients. Compare the FE estimates with the TSLS estimates obtained in (f), and comment on your findings.
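A minimal sketch of the within (FE) transformation, in Python with NumPy on simulated data, shows what happens to a regressor that does not vary over time. The choice of x3 as the time-invariant regressor, the coefficient values, and the panel dimensions are all hypothetical and are not taken from Q1data.dta; time dummies are omitted for brevity and would be demeaned in the same way.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 300, 4                           # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)        # individual index for each (i, t) row

alpha = rng.normal(size=N)                      # individual effects
x1 = rng.normal(size=N * T) + alpha[ids]        # time-varying, correlated with alpha
x3 = np.repeat(rng.normal(size=N), T)           # time-invariant (an assumption)
y = 1.0 + 2.0 * x1 + 0.5 * x3 + alpha[ids] + rng.normal(size=N * T)


def within(a, ids, N):
    """Subtract each individual's time average from a 1-D array."""
    means = np.bincount(ids, weights=a, minlength=N) / np.bincount(ids, minlength=N)
    return a - means[ids]


x1_w, x3_w, y_w = (within(a, ids, N) for a in (x1, x3, y))

# The demeaned time-invariant regressor is zero up to rounding error,
# so its coefficient cannot be estimated after the within transformation.
max_abs_x3_w = np.max(np.abs(x3_w))

# FE estimate of the x1 slope: OLS on the demeaned data.
beta1_fe = (x1_w @ y_w) / (x1_w @ x1_w)

print("largest |demeaned x3|:", max_abs_x3_w)
print("FE estimate of the x1 slope:", beta1_fe)
```

The same demeaning removes αi, which is why the FE estimate of the x1 slope stays close to the value used in the simulation even though x1 is correlated with αi.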
Question 2: Binary Response Model
In April 2008, the unemployment rate in the United States stood at 5%. By April 2009, it had increased to 9%, and it had increased further, to 10%, by October 2009. Were some groups of workers more likely to lose their jobs than others during the Great Recession? For example, were young workers more likely to lose their jobs than middle-aged workers? What about workers with a college degree versus those without a degree or women versus men? The data file employment 08-09.dta (provided) contains a random sample of 5440 workers who were surveyed in April 2008 and reported that they were employed full-time. A detailed description is given in employment 08 09 description.pdf (provided). These workers were surveyed one year later, in April 2009, and asked about their employment status (employed, unemployed, or out of the labor force). The data set also includes various demographic measures for each individual. Use these data to answer the following questions.
(f) (10 points) The data set includes variables measuring the workers’ educational attainment, sex, race, marital status, region of the country, and weekly earnings in April 2008. Repeat (a)-(c) using these factors as additional regressors and construct a table like Table 11.2 in SW (pp. 410-411) to investigate whether the conclusions on the effect of age on employment from (a)-(c) are affected by omitted variable bias. Use the regressions in your table to discuss the characteristics of workers who were hurt most by the Great Recession. [Hint: You will need to generate dummies for race groups and use the logarithm of weekly earnings.]
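The omitted-variable comparison behind such a table can be sketched in Python with NumPy on simulated data. The block is an illustration under stated assumptions only: the variables age and logearn are standardized stand-ins, the coefficients are made up, nothing is read from the employment data file, and the linear probability model and logit are assumed model choices because parts (a)-(c) are not reproduced here. Each model is fitted with and without the extra control, and the age coefficients are compared.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5440

# Simulated stand-ins (NOT the employment data); all numbers are made up.
age = rng.normal(size=n)                          # standardized age
logearn = 0.6 * age + 0.8 * rng.normal(size=n)    # log earnings, correlated with age
index = 2.0 + 0.3 * age + 0.8 * logearn
employed = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-index))).astype(float)

const = np.ones(n)
X_short = np.column_stack([const, age])           # omits logearn
X_long = np.column_stack([const, age, logearn])   # includes logearn


def lpm_fit(X, y):
    """Linear probability model: OLS of the binary outcome on X."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def logit_fit(X, y, iters=25):
    """Logit coefficients by Newton-Raphson on the log-likelihood."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        grad = X.T @ (y - p)                       # score vector
        H = (X * (p * (1.0 - p))[:, None]).T @ X   # negative Hessian
        b = b + np.linalg.solve(H, grad)
    return b


lpm_short, lpm_long = lpm_fit(X_short, employed), lpm_fit(X_long, employed)
logit_short, logit_long = logit_fit(X_short, employed), logit_fit(X_long, employed)

print("LPM   age coefficient, short vs long:", lpm_short[1], lpm_long[1])
print("Logit age coefficient, short vs long:", logit_short[1], logit_long[1])
```

Because the omitted control is positively correlated with age and with employment in this simulation, the age coefficient is larger when the control is left out; a coefficient that moves little across columns of the table would instead suggest that omitted variable bias from these factors is limited.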