Today’s class begins with the following research question:
Are after-school programs effective in improving academic performance?
If we simply compare the scores of students who participated in after-school classes with those who did not, and then draw conclusions based on that, the study’s internal validity will be threatened.
This is because students’ grades are influenced by many factors besides the after-school program, such as their original performance, household income, and parental involvement.
This is called a selection problem.
It occurs because there are systematic differences between the two groups.
So how can we eliminate this?
1. Random assignment

Random assignment means assigning research subjects to groups purely by chance, without the researcher’s judgment or the subjects’ preferences playing any role.
It is like flipping a coin to decide whether each student will participate in the after-school program or not.
When groups are assigned this way, their characteristics become very similar on average as the number of subjects grows.
- Eliminates selection bias between groups
- Ensures an even distribution of potential confounding variables
- Strengthens the internal validity of causal inference
If, after random assignment, the experimental results show a 40-point difference in scores between participants and non-participants in the after-school program, we can conclude that this difference is due to the effect of the after-school class.
This is because random assignment has balanced the two groups on everything else, so pre-existing differences can no longer explain the gap.
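To make this concrete, here is a minimal sketch of coin-flip assignment in Python; the roster size, the seed, and the student IDs are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical roster of 200 students (names are illustrative)
students = [f"student_{i:03d}" for i in range(200)]

# Coin flip for each student: 1 = after-school program, 0 = no program
assignment = rng.integers(0, 2, size=len(students))

treated = [s for s, a in zip(students, assignment) if a == 1]
control = [s for s, a in zip(students, assignment) if a == 0]

# In expectation the groups are the same size and, more importantly,
# balanced on prior achievement, household income, parental involvement, etc.
print(len(treated), len(control))
```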
2. Random assignment and self-selection

In educational research, participation in an activity such as an after-school program usually cannot be forced through random assignment.
Generally, such activities are decided by students and parents themselves, according to their grades, free time, and other circumstances.
Because of this, it is difficult to secure homogeneity between the two groups.
Therefore, when random assignment is not possible, researchers need alternative methods, namely quasi-experimental designs.
3. Research designs that allow self-selection
1) Post-hoc statistical adjustment through regression and matching
(1) Regression analysis
This approach identifies and measures the key differences between students who participated in after-school classes and those who did not, and then statistically adjusts for those differences.
If participants and non-participants differ greatly in prior achievement, household income, or parental involvement, we measure these variables and use them to statistically equate the groups’ conditions.
Methods include pooled OLS and analysis of covariance (ANCOVA).
| Category | Pooled OLS | ANCOVA |
|---|---|---|
| What it is | Run a single regression on all the data pooled together | Compare group differences while adjusting for covariates |
| When to use | Panel data when individual/time structure is not modeled separately | Comparing groups in experimental/educational research while controlling for covariates |
| Core purpose | Estimate the effect of x on y | Compare group differences fairly after adjusting for covariates |
| Strengths | Simplicity | Enables fair comparison (corrects for pre-existing differences) |
| Weaknesses | May be biased by ignoring individual/time differences | Assumes the covariates are independent of the treatment |
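As a rough illustration of the ANCOVA-style adjustment, the sketch below simulates self-selected data and compares the naive group difference with the covariate-adjusted coefficient. All variable names, effect sizes, and numbers are invented, and statsmodels is just one convenient way to fit such a model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300

# Simulated covariates (illustrative values, not real data)
prior_score = rng.normal(70, 10, n)
parental_involvement = rng.normal(0, 1, n)

# Self-selection: stronger students with more involved parents
# are more likely to join the after-school class
p_join = 1 / (1 + np.exp(-(0.05 * (prior_score - 70) + 0.8 * parental_involvement)))
participated = (rng.random(n) < p_join).astype(int)

# The true program effect is set to 5 points
post_score = prior_score + 5 * participated + 3 * parental_involvement + rng.normal(0, 5, n)

df = pd.DataFrame({
    "post_score": post_score,
    "participated": participated,
    "prior_score": prior_score,
    "parental_involvement": parental_involvement,
})

# Naive comparison: inflated by self-selection
print(df.groupby("participated")["post_score"].mean())

# ANCOVA-style adjustment: the coefficient on `participated` is the
# group difference after holding the covariates constant
model = smf.ols(
    "post_score ~ participated + prior_score + parental_involvement", data=df
).fit()
print(model.params["participated"])
```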
(2) Matching
Matching is a method of pairing units in the treated and untreated groups that have similar characteristics and comparing them.
If there are systematic score differences between after-school participants and non-participants depending on parental involvement, matching gathers students with similar levels of parental involvement and compares them.
Within each matched pair, differences in parental involvement are much smaller, so the remaining score gap is more plausibly due to participation in the after-school class itself.
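The sketch below does the simplest possible version of this: for each participant it finds the non-participant with the closest level of parental involvement and compares their scores. Real matching usually uses several covariates or a propensity score; the single covariate, the data, and the numbers here are purely illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200

# Simulated data: parental involvement drives both participation and scores,
# so the naive comparison is confounded (all numbers are illustrative)
involvement = rng.normal(0, 1, n)
participated = (rng.random(n) < 1 / (1 + np.exp(-involvement))).astype(int)
score = 70 + 4 * involvement + 5 * participated + rng.normal(0, 5, n)

df = pd.DataFrame({"involvement": involvement,
                   "participated": participated,
                   "score": score})
treated = df[df["participated"] == 1]
control = df[df["participated"] == 0]

# Nearest-neighbor matching (with replacement) on parental involvement:
# pair each participant with the most similar non-participant
diffs = []
for _, t in treated.iterrows():
    j = (control["involvement"] - t["involvement"]).abs().idxmin()
    diffs.append(t["score"] - control.loc[j, "score"])

print("naive difference:  ", treated["score"].mean() - control["score"].mean())
print("matched difference:", float(np.mean(diffs)))
```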

2) Difference score analysis (difference-in-differences)
This method measures a pre-test before the program and a post-test afterward, and analyzes the gain (post-test minus pre-test).
However, simply comparing pre- and post-test scores with this method also threatens internal validity.
This is because learner growth (maturation) occurs during the period.
This threat can be addressed by including non-participant students in the study.
If maturation is occurring, it should appear in both groups.
So any gain in the participant group beyond the gain in the non-participant group cannot be explained by maturation alone, and can be attributed to the after-school class.
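A minimal difference-in-differences sketch, again with simulated numbers: both groups gain a few points from maturation, and participants gain an extra amount on top. The interaction coefficient recovers that extra amount; the effect sizes and column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 200

# Simulated pre/post scores for participants and non-participants
participated = rng.integers(0, 2, n)
pre = rng.normal(70, 10, n) + 3 * participated      # groups may start at different levels
maturation = 4                                      # growth that happens to everyone
effect = 6                                          # extra gain from the after-school class
post = pre + maturation + effect * participated + rng.normal(0, 3, n)

# Long format: one row per student per time point
df = pd.DataFrame({
    "score": np.concatenate([pre, post]),
    "post_period": np.repeat([0, 1], n),
    "participated": np.tile(participated, 2),
})

# The interaction term is the difference-in-differences estimate:
# (gain for participants) minus (gain for non-participants)
model = smf.ols("score ~ participated * post_period", data=df).fit()
print(model.params["participated:post_period"])
```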
4. One More Thing?
Several of the analysis methods here were new to me, so I organized them as follows.
| Research Design | Core Idea | Strengths | Weaknesses | Suitable Cases |
|---|---|---|---|---|
| Regression Discontinuity Design (RDD) | Compare units just above and below a cutoff | High internal validity | Hard to generalize; requires a cutoff | Score- or threshold-based policies |
| Instrumental Variables (IV) | Use exogenous instruments | Can remove unobserved confounding | Valid instruments are hard to find; weak-instrument problems | Empirical studies in economics and education |
| Interrupted Time Series (ITS, with comparison group) | Compare trends before and after an intervention | Captures change over time; can include a comparison group | Hard to control external factors | Effects of policy or institutional changes |
- RDD: people right around the cutoff are almost identical, so compare them.
- IV: use a neutral lever (an instrument) linking cause and effect to infer causality indirectly.
- ITS: look at whether the trend over time changes before and after the intervention.
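For concreteness, here is a toy regression-discontinuity sketch under an assumed rule that students scoring below 60 on a placement test enter the program; the cutoff, window width, and effect size are all invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1000

# Hypothetical sharp cutoff: placement score below 60 -> after-school program
running = rng.uniform(30, 90, n)
treated = (running < 60).astype(int)
outcome = 50 + 0.5 * running + 7 * treated + rng.normal(0, 5, n)

df = pd.DataFrame({"running": running, "treated": treated, "outcome": outcome})

# Local linear comparison in a narrow window around the cutoff:
# students just below and just above 60 should be nearly identical
window = df[(df["running"] > 55) & (df["running"] < 65)].copy()
window["centered"] = window["running"] - 60
model = smf.ols("outcome ~ treated + centered + treated:centered", data=window).fit()
print(model.params["treated"])
```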
5. Afterthoughts
When I was supervising a science research project in the past, a student who later went on to Seoul National University recruited participants and ran an experiment.
They also conducted pre- and post-tests, and when I saw them using ANOVA at the time I thought, “So there’s a method like this,” and now I am the one learning it.
This makes me realize how helpful it is to actually write a paper when supervising research projects.
In the past I focused only on the process of drawing conclusions from experiments, but from now on I think I will be able to ask sharper questions about the validity of those experimental conclusions.
It was a rewarding day of learning.