ECON30130 Econometrics

R Project – Deadline April 10
Dr Benjamin Elsner
Rules & Guidelines
Ground rules

This assignment counts for 30% of your final grade. You have to work through a set of tasks using R and write
up your answers using Word, LaTeX, or R Markdown. The rules are as follows:
- Below you will find a set of tasks. Please answer all questions and work through all tasks. There is no
  word or page limit, but please be concise.
- The deadline is April 10, 2022, at 11:59:59pm.
- For late submissions, UCD’s Late Submission of Coursework policy applies.
- Papers are to be submitted on Brightspace → Assessment → Assignments.
- Submissions should be in one PDF and should include: 1) the write-up of the assignment, 2) the R code.
- Students are allowed to work in groups of up to five. If students work in a group, only one group
  member should submit the paper on Brightspace. The first page of the paper should clearly state
  that this was a group project and give the names and student numbers of the group members.
- UCD’s Student Plagiarism Policy will apply. I reserve the right to run plagiarism checks on Brightspace.
- Questions should be posted on Brightspace.
- A solution will not be provided after the deadline.

Grading

Students will receive a letter grade for this assignment. Grading is based on the following criteria:
- Correctness of the analysis and interpretations
- Writing (clear and concise)
- Exposition: are graphs and tables done well? They don’t need to look fancy, but it has to be clear
  what is shown. For regression tables, please use stargazer or alternative packages that give you nicely
  formatted regression tables.
- Bonus: a higher grade (1 notch, e.g. from B+ to A-) is given if all of the following are done:
  - project written with R Markdown (can be done via RStudio); please indicate on the first page if
    you do so; for an introduction, see here
  - all graphs and tables have been programmed with R, i.e. no copy & paste anywhere
  - all graphs done with ggplot (but not with the default grey background)
  - tidyverse functions (especially the pipe operator) are frequently used
Some tips
The aim of this assignment is to get students to “figure things out.” In the tutorials, clear instructions and
coding examples were given along with a clean data set. However, this is far removed from the work data analysts
actually do. Their projects typically have a clear goal, but the data are often messy and it is unclear how to
reach the goal of the analysis. Simply put, the analyst has to “figure things out”: how to best clean the
data set, how to best visualise data, how to bring the data into a format that is suitable for visualisation
and regression analysis, etc. If you’re working in a company, you neither refuse to do a project because “we
haven’t learned about a certain procedure in class”, nor can you run to your manager with every little error
message you encounter. Ultimately, data analysts are paid for solving problems themselves or collaboratively
with team members. The sooner you get into that mindset, the better. This assignment resembles the kind of
project one would encounter in a data analytics job.
How to figure things out?
- Google is your friend. Get a strange error message? Type it into Google; chances are someone else
  had the problem before. You can also search StackOverflow, the forum for all things programming (R,
  Python, C++, etc.).
- If one solution doesn’t work, try another one. Solving problems is often frustrating; it takes time and
  a decent bit of grit. So if you encounter a problem, solve it or find a way around. There is always a
  solution!
Preparation
For some of the tasks below, you will need to know how to incorporate binary variables into a regression.
Once you know how regression works, this is pretty straightforward. Here are some sources you may want to
consult:
- When a regressor is a dummy: Chapter 5.3 in Stock & Watson; here is a good video
- When the dependent variable is a dummy (also called linear probability model): Chapter 11.1 in Stock
  & Watson. See also this video, this video and this video. The latter video is based on Stock & Watson’s
  materials.
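
As a minimal sketch of the first case, the simulated example below regresses an outcome on a single dummy. All names and numbers (`d`, `y`, the coefficients) are made up for illustration and have nothing to do with the assignment data. With one dummy regressor, the OLS intercept equals the mean of the group with dummy = 0 and the slope equals the difference in group means.

```r
# Simulated data: `d` is a hypothetical dummy, `y` a hypothetical outcome
set.seed(1)
n <- 200
d <- rbinom(n, size = 1, prob = 0.5)
y <- 100 - 5 * d + rnorm(n, sd = 10)

fit <- lm(y ~ d)

# Group means computed directly from the data
mean0 <- mean(y[d == 0])
mean1 <- mean(y[d == 1])

# Intercept = mean of group 0; slope = difference in means
coefs <- coef(fit)
print(coefs)
print(c(mean0 = mean0, diff = mean1 - mean0))
```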
A. Theory Tasks
Suppose you want to quantify the extent of discrimination in an online market. You have data on all the
sales of a given product (say a smartphone) that took place on an online auction site in the U.S. in 2015.
You observe whether a product was sold, at what price, and whether the seller is a member of an ethnic
minority. On the auction site, consumers don’t directly observe minority status, but they can infer it from
the first names of the sellers.
1. Suppose you want to estimate the effect of minority status (i.e. a dummy that equals one if a person
   belongs to a minority and zero if the person is white) on the sale price. Write down a regression equation
   that would allow you to estimate this effect.
2. Explain which parameter you are interested in estimating and provide an interpretation of this parameter.
3. Discuss the random sampling assumption and the conditional independence assumption (in the lecture
   it was E(u|X) = 0). Are these assumptions fulfilled in this case (explain why or why not)? Explain
   intuitively the likely consequences of these assumptions (not) being fulfilled for estimating the effect of
   interest.
4. If you could run an experiment (regardless of ethical considerations) to estimate the effect of interest,
   what would this experiment look like and why? (N.B.: the ideal experiment asked for here is different
   from the experiment described further below.)
B. Empirical Analysis
Introduction
Jennifer Doleac and Luke Stein ran an experiment on eBay small ads, a platform that lists classified ads in
local markets in the U.S. (similar to adverts.ie in Ireland). In their experiment, they put up ads for new iPod
Nanos. Their goal was to study whether buyers discriminate between black and white sellers, i.e. whether
they are less likely to contact a black seller, make lower offers and are less friendly in correspondence. To
experimentally vary the race of the seller, they showed the ad listings with a photo in which the same iPod is
held by a white hand, a black hand, or a white hand with a wrist tattoo (which buyers may see as a sign
of lower social status). In addition, they experimentally varied the quality of the ad text, whether the
iPod was held in the right or left hand (such that not all ads look the same), and the asking price (between
three price points). Each ad was online for 12 hours. The authors collected information on the number of
responses, the number of offers, the friendliness of the responses, and the amount offered, among other outcomes.
Paper and data
You can find the paper here and on Brightspace:
- Doleac, J.L. and Stein, L.C. (2013), The Visible Hand: Race and Online Market Outcomes. Econ J,
  123: F469-F492. https://doi.org/10.1111/ecoj....
Along with the assignment on Brightspace, you will find the dataset data_doleacstein.dta, which is in Stata
.dta format. We will use this dataset for the analysis to follow. Each observation is one email that was sent.
The main variables for our analysis are shown in Table 1.
Tasks
1. Load the dataset into R and produce a table of summary statistics (number of observations, mean,
   sd, median, min, max, number of missing observations) for all variables listed in Table 1
   except ad and texttype. Interpret the mean of responses, offers, white, black, tattoo, and
   polite.
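
One possible way to build such a table in base R is sketched below. The data frame `ads` is simulated so that the snippet runs on its own; with the real data one would first read the file, for example with `haven::read_dta("data_doleacstein.dta")` (this assumes the haven package is installed). The helper name `summarise_var` is made up for illustration.

```r
# Simulated stand-in for the real data (all values are invented)
set.seed(42)
ads <- data.frame(
  responses = rpois(50, lambda = 2),
  price     = sample(c(90, 110, 130), 50, replace = TRUE),
  bestoffer = c(rnorm(40, mean = 80, sd = 10), rep(NA, 10))
)

# Hypothetical helper: summary statistics for one numeric vector
summarise_var <- function(x) {
  c(n       = sum(!is.na(x)),
    mean    = mean(x, na.rm = TRUE),
    sd      = sd(x, na.rm = TRUE),
    median  = median(x, na.rm = TRUE),
    min     = min(x, na.rm = TRUE),
    max     = max(x, na.rm = TRUE),
    missing = sum(is.na(x)))
}

# One row per variable, one column per statistic
sumstats <- t(sapply(ads, summarise_var))
print(round(sumstats, 2))
```

The resulting matrix can be passed on to a table package of your choice for formatting.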
2. Generate a new dummy variable anyresponse that equals 1 if an ad received at least one response.
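
A dummy of this kind can be created from a logical comparison. The vector below is invented for illustration; in the real data the comparison would be applied to the responses column.

```r
# Hypothetical response counts for five ads
responses <- c(0, 3, 1, 0, 7)

# Logical TRUE/FALSE converted to 1/0
anyresponse <- as.integer(responses >= 1)
print(anyresponse)
```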
Table 1: Main Variables
Variable name   Content
ad              ad ID (for authors’ use only)
responses       number of responses received
price           asking price
offers          number of offers received
bestoffer       best offer received for iPod
meanoffer       mean offer if there were multiple offers
name            dummy: 1 if buyer signed response with name
polite          dummy: 1 if response was polite
texttype        indicator for quality of text; 0=high quality, 1=medium quality, 2=low quality
black           dummy: 1 if seller is black
tattoo          dummy: 1 if seller is white and has a wrist tattoo
white           dummy: 1 if seller is white (without a wrist tattoo)
3. Produce a frequency table for the number of ads that were put up for each seller type (black, white,
   tattoo). The table should include the number of ads per seller type (absolute numbers and shares,
   i.e. the share of ads that were assigned to a particular seller type).
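
A small sketch of a one-way frequency table in base R, using `table()` for counts and `prop.table()` for shares. The `seller` vector is invented; the real data store seller type in three separate dummies, so a single seller-type variable would have to be built first.

```r
# Hypothetical seller types for ten ads
seller <- c("black", "white", "tattoo", "white", "black",
            "white", "tattoo", "black", "white", "white")

counts <- table(seller)        # absolute numbers
shares <- prop.table(counts)   # shares of all ads

freq <- data.frame(
  seller = names(counts),
  n      = as.vector(counts),
  share  = as.vector(shares)
)
print(freq)
```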
4. Produce a frequency table with seller types on the horizontal and asking prices (90, 110 and 130 USD)
   on the vertical axis. Each cell should show the share of all ads that were put up by a given seller type
   for a given asking price (hint: search for cross tabulation). Do not show the absolute numbers, only the
   shares. What does the result tell you about the quality of the randomisation in the experiment?
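
A cross tabulation of shares can be sketched as follows on simulated data (the vectors `seller` and `price` are invented, not the assignment data). Calling `prop.table()` without a margin divides every cell by the grand total; `prop.table(counts, margin = 2)` would instead give shares within each seller type.

```r
# Simulated assignment of seller type and asking price (invented data)
set.seed(7)
seller <- sample(c("black", "tattoo", "white"), 300, replace = TRUE)
price  <- sample(c(90, 110, 130), 300, replace = TRUE)

# Rows = asking price, columns = seller type
counts <- table(price, seller)

# Divide every cell by the total number of ads
shares <- prop.table(counts)
print(round(shares, 3))
```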
5. Run t-tests comparing the difference in means between white and black sellers for the following variables:
   anyresponse, bestoffer, meanoffer, polite. The results of the t-tests should be presented in a table
   that shows the following: each row is a variable; columns: mean of the variable for Whites, mean of the
   variable for Blacks, difference in means between Whites and Blacks, p-value of t-test. Interpret your
   findings regarding magnitude and statistical significance. (Hint: you can use t.test, which will save the
   results of each t-test in an object that you can see under “Environment”. You can then combine these
   objects into a table.)
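
The hint can be sketched as below: run `t.test()` once per variable, pull out the group means and the p-value, and stack the rows. The data are simulated, and the `seller` column and the helper `ttest_row` are made-up names for illustration. Note that `t.test()` uses the Welch (unequal variances) version by default.

```r
# Simulated data with three seller types (all values are invented)
set.seed(123)
n <- 600
dat <- data.frame(seller = sample(c("white", "black", "tattoo"), n, replace = TRUE))
dat$anyresponse <- rbinom(n, 1, ifelse(dat$seller == "white", 0.5, 0.4))
dat$bestoffer   <- 80 - 3 * (dat$seller == "black") + rnorm(n, sd = 15)

# Hypothetical helper: one row of the results table for one variable
ttest_row <- function(var, data) {
  x  <- data[[var]][data$seller == "white"]
  y  <- data[[var]][data$seller == "black"]
  tt <- t.test(x, y)   # Welch two-sample t-test (R default)
  data.frame(variable   = var,
             mean_white = mean(x),
             mean_black = mean(y),
             difference = mean(x) - mean(y),
             p_value    = tt$p.value)
}

vars    <- c("anyresponse", "bestoffer")
results <- do.call(rbind, lapply(vars, ttest_row, data = dat))
print(results)
```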
6. Regress the dummy anyresponse on the dummies black and tattoo. Interpret the coefficients of the
   slopes and intercept, comment on statistical significance, and compare your results to those in the
   t-test table from the previous task.
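
A minimal sketch of such a linear probability model on simulated data follows. The variable `type` and all probabilities are invented. Because white sellers are the omitted category, the intercept reproduces their response share, and the coefficient on black is the difference in shares between black and white sellers.

```r
# Simulated seller types and a binary outcome (invented values)
set.seed(2022)
n <- 600
type   <- sample(c("white", "black", "tattoo"), n, replace = TRUE)
black  <- as.integer(type == "black")
tattoo <- as.integer(type == "tattoo")
anyresponse <- rbinom(n, 1, 0.5 - 0.1 * black - 0.05 * tattoo)

# Linear probability model; white sellers are the omitted category
lpm <- lm(anyresponse ~ black + tattoo)
print(summary(lpm)$coefficients)

# Shares computed directly, for comparison with the coefficients
share_white <- mean(anyresponse[type == "white"])
share_black <- mean(anyresponse[type == "black"])
print(c(share_white = share_white, diff_black = share_black - share_white))
```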
7. Another way of analysing the results of an experiment like this is through bar charts with error bars.
   You plot the means for the treatment and control group and attach to each bar a so-called error bar
   (y ± sd(y)). The error bars give an indication of the variation in each seller group. Produce such a
   chart (separate bars for black sellers, white sellers, and sellers with a tattoo) for the following outcomes:
   bestoffer, meanoffer.
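
The mechanics of such a chart can be sketched with base graphics on simulated data (the `dat` data frame is invented). This is only an illustration of computing group means and standard deviations and drawing mean ± sd; with ggplot2, `geom_col()` and `geom_errorbar()` play the same roles.

```r
# Simulated offers by seller type (invented values)
set.seed(99)
n <- 300
dat <- data.frame(
  seller    = sample(c("black", "tattoo", "white"), n, replace = TRUE),
  bestoffer = rnorm(n, mean = 80, sd = 12)
)

# Mean and standard deviation of the outcome by seller type
means <- tapply(dat$bestoffer, dat$seller, mean)
sds   <- tapply(dat$bestoffer, dat$seller, sd)

pdf(NULL)  # null device, so the sketch also runs non-interactively
mids <- barplot(means, ylim = c(0, max(means + sds) * 1.1),
                ylab = "Best offer (USD)", main = "Best offer by seller type")
# Error bars: mean +/- one standard deviation
arrows(x0 = mids, y0 = means - sds, x1 = mids, y1 = means + sds,
       angle = 90, code = 3, length = 0.05)
invisible(dev.off())
```

Remove the `pdf(NULL)` and `dev.off()` lines to see the plot in an interactive session.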
8. Not only did the researchers randomise whether the seller is black, but they also randomised the quality
   of the ad text. Create dummies highquality (1 if text of high quality), and mediumquality (1 if text
   of medium quality). Run a regression of black on highquality and mediumquality and interpret
   your result. Comment on the meaning of this result for the experimental design.
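
The coding step can be sketched as follows on simulated data, using the texttype coding from Table 1 (0 = high, 1 = medium, 2 = low). In this simulation `black` is drawn independently of `texttype` by construction, so the output says nothing about the real experiment.

```r
# Simulated text quality and seller race (invented values)
set.seed(5)
n <- 600
texttype <- sample(0:2, n, replace = TRUE)   # 0 = high, 1 = medium, 2 = low
black    <- rbinom(n, 1, 1/3)                # drawn independently of texttype

# Dummies for high and medium quality; low quality is the omitted category
highquality   <- as.integer(texttype == 0)
mediumquality <- as.integer(texttype == 1)

balance <- lm(black ~ highquality + mediumquality)
print(summary(balance)$coefficients)
```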
