Enanpad 2024
17-09-2024
It is very common these days to hear someone say “correlation does not mean causality.”
In essence, that is true.
The killer struck during daylight. Had the sun not been out that day, the victim would have been safe.
There is a correlation, but it is clear there is no causation.
Sometimes, there is causality even when we do not observe correlation.
The sailor is adjusting the rudder on a windy day to align the boat with the wind, but the boat is not changing direction. (Source: The Mixtape)
Note
In this example, the sailor is endogenously adjusting the course to balance the unobserved wind.
I will avoid the word “endogeneity” as much as possible.
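A minimal simulation of the sailor example, assuming (for illustration only) that the sailor offsets the wind exactly and that the true effect of the rudder on direction is 1. All variable names and parameter values are hypothetical.

```python
import random
from statistics import fmean

random.seed(0)
n = 10_000

def correlation(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

wind = [random.gauss(0, 1) for _ in range(n)]      # unobserved by the researcher
rudder = [-w for w in wind]                        # assumed: sailor offsets the wind exactly
noise = [random.gauss(0, 0.1) for _ in range(n)]
# By construction, the causal effect of the rudder on direction is 1.
direction = [r + w + e for r, w, e in zip(rudder, wind, noise)]

corr_rudder_direction = correlation(rudder, direction)
print(round(corr_rudder_direction, 3))
```

The rudder has a causal effect on direction, yet the printed correlation should be close to zero because the rudder movements cancel the wind.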
Imagine that you want to investigate the effect of Governance on Q:
\(Q_{i} = \alpha + \beta × Gov_{i} + Controls + error\)
All the issues in the next slides make it impossible to infer that changing \(Gov\) will CAUSE a change in \(Q\).
That is, we cannot infer causality.
Perhaps it is \(Q\) that causes \(Gov\).
OLS-based methods cannot tell the difference between these two betas:
\(Q_{i} = \alpha + \beta × Gov_{i} + Controls + \epsilon\)
\(Gov_{i} = \alpha + \beta × Q_{i} + Controls + \epsilon\)
If one Beta is significant, the other will most likely be significant too.
You need a sound theory (playing with lags may help, but it might not be enough)!
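A small sketch of the point, assuming a data-generating process in which \(Q\) causes \(Gov\) and never the reverse. Controls are left out for brevity, and the coefficient 0.5 is arbitrary.

```python
import random
from statistics import fmean

random.seed(1)
n = 10_000

def slope_and_t(y, x):
    """OLS slope of y on x (with intercept) and its t-statistic."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx
    ssr = sum(((yi - my) - b * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    se = (ssr / (len(x) - 2) / sxx) ** 0.5
    return b, b / se

# Assumed DGP: Q is exogenous and drives Gov.
q = [random.gauss(0, 1) for _ in range(n)]
gov = [0.5 * qi + random.gauss(0, 1) for qi in q]

beta_q_on_gov, t_q_on_gov = slope_and_t(q, gov)   # the "wrong" direction
beta_gov_on_q, t_gov_on_q = slope_and_t(gov, q)   # the true direction
print(beta_q_on_gov, t_q_on_gov)
print(beta_gov_on_q, t_gov_on_q)
```

In this two-variable case the two t-statistics coincide, so significance alone says nothing about the direction of causality. With controls they need not be identical, but they usually remain close.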
Perhaps \(Gov\) and \(Q\) are determined simultaneously.
That is, there is a third variable causing both.
An OLS regression will provide a biased estimate of the effect.
Also, the sign might be wrong.
Imagine that you do not include an important “true” predictor of \(Q\)
Let’s say the long regression is: \(Q_{i} = \alpha_{long} + \beta_{long} × Gov_{i} + \delta × omitted_{i} + error\)
But you estimate the short one: \(Q_{i} = \alpha_{short} + \beta_{short} × Gov_{i} + error\)
\(\beta_{short}\) will be:
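\(\beta_{short} = \beta_{long} + bias\)
\(\beta_{short} = \beta_{long}\) + relationship between the omitted variable and the included one, \(Gov\) (\(\phi\)) × effect of the omitted variable in the long regression (\(\delta\))
Thus, the OVB is: \(\beta_{short} - \beta_{long} = \phi × \delta\), where \(\phi\) is the slope from regressing the omitted variable on \(Gov\).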
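The OVB formula can be checked numerically. The sketch below assumes a hypothetical data-generating process with \(\beta_{long} = 1\) and \(\delta = -3\), and takes \(\phi\) to be the slope from regressing the omitted variable on \(Gov\). The two-regressor coefficients are obtained by partialling out (Frisch-Waugh-Lovell), an implementation choice made here to keep the example in the standard library.

```python
import random
from statistics import fmean

random.seed(2)
n = 10_000

def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def residuals(y, x):
    """Residuals from a simple OLS regression of y on x."""
    b, mx, my = ols_slope(y, x), fmean(x), fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def partial_slope(y, x, control):
    """Coefficient on x in y ~ x + control, via Frisch-Waugh-Lovell."""
    return ols_slope(residuals(y, control), residuals(x, control))

# Assumed DGP: Gov is positively related to the omitted variable.
omitted = [random.gauss(0, 1) for _ in range(n)]
gov = [o + random.gauss(0, 1) for o in omitted]
q = [1.0 * g - 3.0 * o + random.gauss(0, 1) for g, o in zip(gov, omitted)]

beta_short = ols_slope(q, gov)                # short regression
beta_long = partial_slope(q, gov, omitted)    # long regression, coefficient on Gov
delta = partial_slope(q, omitted, gov)        # long regression, coefficient on omitted
phi = ols_slope(omitted, gov)                 # omitted regressed on Gov

print(beta_long, beta_short, beta_long + phi * delta)
```

Under these assumptions the short coefficient is negative although the long one is positive, and \(\beta_{long} + \phi × \delta\) reproduces \(\beta_{short}\) up to floating-point error.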
Bad controls are variables that are themselves outcomes of the treatment (i.e., \(Gov\)) being studied.
A bad control could just as well be a dependent variable in a regression on \(Gov\).
Good controls are variables that you can think of as being fixed at the time of the treatment.
Suppose you also have a variable that is a consequence of good governance (e.g., a Novo Mercado dummy). Should you include it in the model?
No. In this case, the coefficient of interest no longer has a causal interpretation.
Warning
It is not hard to come up with stories of why a control is a bad control.
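A hedged illustration of a bad control, assuming that \(Gov\) affects \(Q\) only through a hypothetical variable `listing`. It is continuous here for simplicity and stands in for something like a Novo Mercado dummy. All coefficients are made up.

```python
import random
from statistics import fmean

random.seed(3)
n = 10_000

def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def residuals(y, x):
    """Residuals from a simple OLS regression of y on x."""
    b, mx, my = ols_slope(y, x), fmean(x), fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def partial_slope(y, x, control):
    """Coefficient on x in y ~ x + control, via Frisch-Waugh-Lovell."""
    return ols_slope(residuals(y, control), residuals(x, control))

# Assumed DGP: Gov -> listing -> Q, so the total effect of Gov on Q is 2.
gov = [random.gauss(0, 1) for _ in range(n)]
listing = [g + random.gauss(0, 1) for g in gov]        # outcome of Gov
q = [2.0 * l + random.gauss(0, 1) for l in listing]

beta_total = ols_slope(q, gov)                         # close to 2
beta_with_bad_control = partial_slope(q, gov, listing) # close to 0
print(round(beta_total, 2), round(beta_with_bad_control, 2))
```

Controlling for the outcome of the treatment absorbs the very channel through which \(Gov\) works, so the coefficient on \(Gov\) no longer measures its effect on \(Q\).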
Collider bias occurs when an independent variable and the outcome each influence a third variable (the collider) and that variable is included in the regression.
In the regression below, including, for instance, CEO Reputation as a control (assuming that both \(Q\) and \(Gov\) influence CEO Reputation) creates a spurious correlation between \(Gov\) and \(Q\):
\(Q_{i} = \alpha + \beta × Gov_{i} + Controls + \epsilon\)
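A minimal sketch of collider bias, assuming that \(Gov\) and \(Q\) are independent by construction and that a hypothetical `reputation` variable is driven by both.

```python
import random
from statistics import fmean

random.seed(4)
n = 10_000

def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def residuals(y, x):
    """Residuals from a simple OLS regression of y on x."""
    b, mx, my = ols_slope(y, x), fmean(x), fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def partial_slope(y, x, control):
    """Coefficient on x in y ~ x + control, via Frisch-Waugh-Lovell."""
    return ols_slope(residuals(y, control), residuals(x, control))

# Assumed DGP: no causal link between Gov and Q; both feed into reputation.
gov = [random.gauss(0, 1) for _ in range(n)]
q = [random.gauss(0, 1) for _ in range(n)]
reputation = [g + qi + random.gauss(0, 1) for g, qi in zip(gov, q)]

beta_without_collider = ols_slope(q, gov)               # close to 0
beta_with_collider = partial_slope(q, gov, reputation)  # clearly negative
print(round(beta_without_collider, 2), round(beta_with_collider, 2))
```

With these unit-variance assumptions, adding the collider pushes the coefficient on \(Gov\) to roughly -0.5 even though the true effect is zero.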
Even if we could perfectly measure \(Gov\) and all relevant covariates, we would not know for sure the functional form through which each influences \(Q\).
Misspecifying the functional form of the x’s has consequences similar to OVB.
Perhaps some individuals are signaling the existence of an X without truly having it.
This is similar to the OVB because you cannot observe the full story.
Some constructs (e.g. \(Gov\)) are complex and sometimes have conflicting mechanisms.
We usually don’t know for sure what “good” governance is, for instance.
It is common to use imperfect proxies that may fit the underlying concept poorly.
“Classical” random measurement error in the x’s will bias the coefficient toward zero.
“Classical” random measurement error in the Y will inflate standard errors but will not lead to biased coefficients.
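Both statements about classical measurement error can be illustrated with a short simulation. The sketch assumes a true slope of 1 and noise variances chosen purely for illustration.

```python
import random
from statistics import fmean

random.seed(5)
n = 10_000

def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

# Assumed DGP: true slope of 1 on a correctly measured x.
x_true = [random.gauss(0, 1) for _ in range(n)]
y_true = [1.0 * x + random.gauss(0, 1) for x in x_true]

# Classical error in x with the same variance as x: expected slope is about 0.5.
x_noisy = [x + random.gauss(0, 1) for x in x_true]
# Classical error in y: slope stays near 1, only the residual variance grows.
y_noisy = [y + random.gauss(0, 2) for y in y_true]

beta_clean = ols_slope(y_true, x_true)
beta_noisy_x = ols_slope(y_true, x_noisy)
beta_noisy_y = ols_slope(y_noisy, x_true)
print(round(beta_clean, 2), round(beta_noisy_x, 2), round(beta_noisy_y, 2))
```

The attenuation factor in the noisy-x case is the ratio of the variance of the true x to the variance of the observed x, which is 1/2 under these assumptions.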
Maybe the causal effect of \(Gov\) on \(Q\) depends on observed and unobserved firm characteristics.
In such a case, we may find a positive or negative relationship.
Neither is the true causal relationship.
This is analogous to the Hawthorne effect, in which observed subjects behave differently because they are observed.
Firms that change \(Gov\) may behave differently because their managers or employees think the change in \(Gov\) matters, when in fact it has no direct effect.
If you run a regression comparing two types of companies without any matching method, these companies are likely not comparable.
Thus, the estimated beta will contain selection bias, which can be either positive or negative.
Self-selection is a type of selection bias. Usually, firms decide which level of governance they adopt.
It is like they “self-select” into the treatment.
There are reasons why firms adopt high governance.
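A sketch of self-selection, assuming that an unobserved firm `quality` drives both the decision to adopt high governance and \(Q\), while governance itself has no causal effect. Names and numbers are hypothetical.

```python
import random
from statistics import fmean

random.seed(6)
n = 10_000

# Assumed DGP: better firms are more likely to self-select into high governance.
quality = [random.gauss(0, 1) for _ in range(n)]
high_gov = [1 if ql + random.gauss(0, 1) > 0 else 0 for ql in quality]

true_effect = 0.0  # governance does nothing to Q in this sketch
q = [true_effect * h + ql + random.gauss(0, 1) for h, ql in zip(high_gov, quality)]

q_high = [qi for qi, h in zip(q, high_gov) if h == 1]
q_low = [qi for qi, h in zip(q, high_gov) if h == 0]

# Naive comparison of means, equivalent to regressing Q on the high_gov dummy.
naive_gap = fmean(q_high) - fmean(q_low)
print(round(naive_gap, 2))
```

The naive gap is clearly positive even though the true effect is zero. It reflects the difference in quality between the firms that chose each governance level.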
Important
More data is not necessarily a solution; you need a sound empirical design.
QUESTIONS?
Henrique C. Martins
[Henrique C. Martins] [henrique.martins@fgv.br][Do not use without permission]