The importance of good evaluation design, and how the design stage can be used to prevent problems that are much harder to fix using statistical methods at the analysis stage;
Key problems with data collection that can arise when conducting an impact analysis, in particular low survey response rates and social desirability bias;
Internal and external validity and the extent to which these are threatened by problems with data collection; AND
Statistical power: the consequences of having insufficient power, and the importance of estimating statistical power of evaluation design at the outset.
CASE
New York City’s Teen ACTION Program: an Evaluation Gone Awry
ASSIGNMENT: BASED ON WHAT YOU HAVE LEARNED OVER THE COURSE OF THE TERM,
PART 1: Provide a detailed and critical policy assessment of this evaluation gone awry and what can be learned from it to become better analysts. Be specific about how you would conduct a better analysis. Make sure to draw on readings from previous units.
After you have completed PART 1, go to the web page for Teen Action, and …
New York City’s Teen ACTION Program: an Evaluation Gone Awry: The Teen ACTION Evaluation – Part 1
New York City’s Teen ACTION Program: an Evaluation Gone Awry: The Teen ACTION Evaluation – Part 2
PART 2
… based on the available information, write a policy memo detailing the type of analysis you would recommend to the mayor and why.
The evaluation of New York City’s Teen ACTION Program serves as a cautionary tale for policymakers and analysts. Here’s a breakdown of the problems with the study and how a better analysis could be conducted:
Flaws in Evaluation Design:
Selection bias: The evaluation compared teens who participated in the program to those who did not, but it is unclear how participants were selected. If teens who were already more likely to improve were also more likely to participate, the comparison would yield misleadingly positive results. A better approach would be a randomized controlled trial (RCT), in which participants are randomly assigned to either the program or a control group.
No baseline data: The evaluation lacked data on participants before they entered the program, making it impossible to determine whether any positive changes were actually caused by the program. Collecting baseline data on factors like grades, attendance, and risky behaviors would allow a more accurate assessment of program impact (a minimal sketch of both fixes follows this list).
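To make these two fixes concrete, here is a minimal Python sketch of random assignment with baseline data collection. The roster size, seed, and measure names are all placeholders; a real evaluation would draw the roster from program intake records.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # fixed seed keeps the assignment reproducible and auditable

# Hypothetical applicant roster; in practice this comes from program intake records.
applicants = pd.DataFrame({"teen_id": range(1, 801)})

# Randomly assign half the applicants to the program and half to control,
# so the groups differ only by chance rather than by self-selection.
applicants["treated"] = rng.permutation([1, 0] * (len(applicants) // 2))

# Record baseline measures for everyone BEFORE the program starts, so later
# changes can be benchmarked against where each teen began. (Placeholders here;
# real values would come from school records and intake surveys.)
for measure in ["gpa", "attendance_rate", "risk_behavior_index"]:
    applicants[measure] = np.nan

print(applicants.groupby("treated").size())  # exactly 400 teens per group
```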
Data Collection Issues:
Low survey response rates: The evaluation relied heavily on self-reported survey data. Low response rates can lead to non-response bias, where those who respond differ systematically from those who do not. Strategies like offering incentives, sending reminders, and providing alternative response modes (phone or online) could improve response rates; a simple check for non-response bias is sketched after these points.
Social desirability bias: Teens might be more likely to report positive behaviors, even if they didn’t occur. This social desirability bias can inflate the program’s perceived effectiveness. Using anonymous surveys and incorporating objective measures (e.g., school records) can mitigate this bias.
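As a concrete illustration of the non-response check mentioned above, the sketch below compares respondents and non-respondents on an administrative measure available for both groups. The file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical survey frame merged with school records, which exist for
# respondents and non-respondents alike.
frame = pd.read_csv("survey_frame.csv")  # columns: teen_id, responded (0/1), attendance_rate

print(f"Response rate: {frame['responded'].mean():.1%}")

# Compare the two groups on an observable characteristic; a large gap is
# evidence that survey-based estimates are skewed by non-response.
print(frame.groupby("responded")["attendance_rate"].mean())
```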
Threats to Validity:
Internal validity: Internal validity refers to whether the program caused the observed changes. Selection bias and lack of baseline data threaten internal validity in this case.
External validity: External validity refers to whether the findings can be generalized to a broader population. The evaluation looked only at teens in one NYC program, limiting how far the findings generalize to other cities or settings.
Improving the Analysis:
To conduct a better analysis, I would recommend the following:
Randomized design: Randomly assign eligible teens to the program or a control group to eliminate selection bias.
Baseline measurement: Collect data on grades, attendance, and risky behaviors before the program begins.
Power analysis: Estimate statistical power at the design stage to ensure the sample is large enough to detect plausible effects.
Triangulated data collection: Combine anonymous surveys with objective sources such as school records to reduce social desirability bias.
Response-rate strategies: Use incentives, reminders, and multiple response modes to limit non-response bias.
Learning as Analysts:
This case highlights the importance of meticulous planning and design in evaluation research. By anticipating potential threats to validity and incorporating strategies to mitigate bias, analysts can ensure their findings are reliable and informative for policy decisions.
To: Mayor of New York City
From: [Your Name], Policy Analyst
Date: 2024-05-01
Subject: Recommendation for Re-evaluating the Teen ACTION Program
The previous evaluation of the Teen ACTION Program suffered from methodological flaws that limit its usefulness in determining the program’s effectiveness. These flaws include selection bias, lack of baseline data, and potential biases in data collection.
To gain a more accurate understanding of the program’s impact, I recommend conducting a new evaluation using a randomized controlled trial (RCT) design. This would involve randomly assigning teens either to the program or to a control group that does not participate. By comparing outcomes between these groups, we can isolate the program’s impact and eliminate selection bias.
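Before fielding the RCT, we should estimate how many teens must be enrolled for the study to have adequate statistical power. A minimal sketch, assuming a small standardized effect (d = 0.2), 80% power, and a 5% significance level; these are placeholder values to be revisited with program staff:

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the sample size per group needed to detect a standardized effect
# of 0.2 with 80% power at a two-sided 5% significance level.
n_per_group = TTestIndPower().solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 394 per group
```

If the program cannot enroll that many teens, the evaluation risks being underpowered, and a null result would be uninformative.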
The new evaluation should also collect baseline data on key outcome variables, such as grades, attendance, and risky behaviors, before teens enter the program. These baseline measures serve as the benchmark against which program-induced changes are assessed.
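At the analysis stage, one standard way to use those baseline measures is regression adjustment: estimate the treatment effect on the follow-up outcome while controlling for each teen’s starting point. A sketch, with a hypothetical dataset and column names:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per teen, with the random-assignment flag,
# a baseline measure, and the follow-up measure of the same outcome.
df = pd.read_csv("teen_action_rct.csv")  # columns: treated (0/1), baseline_attendance, followup_attendance

# ANCOVA-style regression: the coefficient on `treated` estimates the program
# effect, with precision improved by adjusting for the baseline measure.
model = smf.ols("followup_attendance ~ treated + baseline_attendance", data=df).fit()
print(model.params["treated"], model.bse["treated"])  # effect estimate and its standard error
```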
Furthermore, the evaluation should draw on a mix of data collection methods, including surveys, interviews, and administrative school records. Triangulating these sources helps mitigate biases in self-reported data.
By implementing these recommendations, we can ensure a more rigorous evaluation that provides reliable evidence on the effectiveness of the Teen ACTION Program. This will allow you to make informed decisions about future funding and resource allocation for the program.