COURSE PHILOSOPHY
Most medical research is shallow.
Not because clinicians are unintelligent —
but because the system rewards speed over rigor and volume over depth.
Overview of current pitfalls:
- weak assumptions
- soft models
- p-hacked results
- copy-paste methods
- superficial inference
- research as résumé filler
Instead, this course builds intellectual leverage through rigor, design, and truth-seeking.
CORE IDENTITY PRINCIPLE
The goal is to move from a consumer of research to a producer of methods and models.
You are not here to:
- run someone else’s code
- echo literature conclusions
- decorate CVs with low-impact papers
You are here to: build, test, model, verify, challenge.
MODULE 1
RED FLAGS IN CLINICAL RESEARCH
Objective
Ensure rigorous statistics in every study you read or write.
Content
Red flags to master:
- Underspecified models
- No causal framework
- Unjustified adjustments
- P-hacking behaviors
- Selective outcome reporting
- Forking-path analyses
- Overconfident conclusions from weak data
- Regression worship without clinical sense
Underspecified models
Meaning:
The authors ran a model without including important variables or structure — they used a “thin” model for a complex problem.
Example:
A paper models:
steroid use → reduced mortality
But does NOT include:
- disease severity
- oxygen requirement
- ICU admission status
- comorbidities
They adjusted for:
- age
- sex
…and stopped.
That’s underspecified.
No causal framework
Meaning:
There’s no articulated explanation for how X is supposed to cause Y. They’re just calculating associations.
Example:
“High CRP predicts mortality. Therefore CRP should be targeted.”
But:
- Is CRP a marker?
- A mediator?
- A confounder?
- A consequence?
No causal thinking.
Translation:
They found correlation and wrote destiny.
Unjustified adjustments
Meaning:
They adjusted for variables that should not be adjusted for, which actually biases results.
Example:
Studying:
effect of ventilation on survival
They adjust for:
oxygen saturation after intubation
That’s a mediator — not a confounder.
Result = distorted estimate.
Translation:
They “controlled” for the very thing they were trying to measure.
The Three Core Variable Types
1. Confounder ✅ Adjust for this
Definition:
A confounder is a variable that:
- affects the exposure
- affects the outcome
- is NOT caused by the exposure
It creates a fake relationship if you ignore it.
Example:
Studying:
Steroids → Mortality in ICU
Confounder:
Disease severity
Why?
- Sicker patients are more likely to get steroids
- Sicker patients are more likely to die
So if you don’t adjust:
It looks like steroids cause death
because disease severity is mixed into the estimate.
Rule:
✅ Adjust for confounders.
How to recognize one:
Ask:
“Does this cause both my exposure and my outcome?”
If yes → likely a confounder.
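The steroid example above can be checked with a quick simulation (a toy sketch; every probability below is invented for illustration): severity drives both steroid use and death, steroids themselves do nothing, and the crude comparison still makes steroids look harmful until you stratify on severity.

```python
import random

random.seed(0)
rows = []  # (severe, steroid, died)
for _ in range(20000):
    severe = random.random() < 0.5                        # confounder
    steroid = random.random() < (0.7 if severe else 0.2)  # sicker patients get more steroids
    died = random.random() < (0.4 if severe else 0.05)    # severity alone drives mortality
    rows.append((severe, steroid, died))

def risk(steroid_val, severe_val=None):
    """Mortality risk among patients with the given steroid status,
    optionally restricted to one severity stratum."""
    sel = [(s, t, d) for s, t, d in rows
           if t == steroid_val and (severe_val is None or s == severe_val)]
    return sum(d for _, _, d in sel) / len(sel)

crude = risk(True) - risk(False)                             # confounded: looks harmful
adjusted = 0.5 * (risk(True, True) - risk(False, True)) \
         + 0.5 * (risk(True, False) - risk(False, False))    # stratum-weighted: near zero
print(f"crude risk difference:    {crude:+.3f}")
print(f"adjusted risk difference: {adjusted:+.3f}")
```

The crude difference is large and positive purely because severity is mixed into the estimate; the severity-stratified difference collapses toward the true null.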
2. Mediator ❌ Do NOT adjust
Definition:
A mediator is:
- caused by the exposure
- in the causal path to the outcome
It explains how the exposure works.
Example:
Studying:
Ventilator strategy → Survival
Mediator:
Oxygenation after ventilation
Ventilator → oxygenation → survival
If you adjust for oxygenation…
You erase the very effect you’re trying to measure.
Rule:
❌ Do not adjust for mediators if you want the total effect.
How to recognize one:
Ask:
“Is this a result of the exposure?”
If yes → it’s a mediator.
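A companion sketch for the ventilator example (numbers invented again): the entire survival benefit flows through oxygenation, so stratifying on oxygenation makes the effect vanish.

```python
import random

random.seed(1)
rows = []  # (vent, good_oxy, survived)
for _ in range(20000):
    vent = random.random() < 0.5
    good_oxy = random.random() < (0.8 if vent else 0.3)      # mediator
    survived = random.random() < (0.9 if good_oxy else 0.5)  # effect runs only via oxygenation
    rows.append((vent, good_oxy, survived))

def surv(vent_val, oxy_val=None):
    """Survival among patients with the given ventilation status,
    optionally restricted to one oxygenation stratum."""
    sel = [(v, o, s) for v, o, s in rows
           if v == vent_val and (oxy_val is None or o == oxy_val)]
    return sum(s for _, _, s in sel) / len(sel)

total_effect = surv(True) - surv(False)               # clearly positive
within_good = surv(True, True) - surv(False, True)    # near zero
within_poor = surv(True, False) - surv(False, False)  # near zero
print(f"total effect: {total_effect:+.3f}")
print(f"'adjusted' within oxygenation strata: {within_good:+.3f}, {within_poor:+.3f}")
```

Adjusting for the mediator answers a different question ("does ventilation help other than through oxygenation?"), which here, by construction, has answer no.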
3. Collider ❌❌ Never adjust
Definition:
A collider is:
- caused by both the exposure and the outcome
Adjusting for it creates bias.
Example:
Studying:
Smoking → Lung cancer
Collider:
Hospital admission
Smoking increases admissions
Lung cancer increases admissions
Conditioning on hospital admission introduces fake inverse relationships.
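Collider bias is easy to demonstrate. In the toy model below, smoking and cancer are generated independently (unlike reality, deliberately, so any association is pure bias), yet both raise the chance of admission. Conditioning on admission then manufactures a strong inverse association out of nothing. All probabilities are invented.

```python
import random

random.seed(2)
pop = []  # (smoker, cancer, admitted)
for _ in range(50000):
    smoker = random.random() < 0.3
    cancer = random.random() < 0.1  # independent of smoking in this toy model
    admitted = random.random() < (0.05 + 0.4 * smoker + 0.4 * cancer)  # collider
    pop.append((smoker, cancer, admitted))

def cancer_rate(rows, smoker_val):
    sel = [c for s, c, _ in rows if s == smoker_val]
    return sum(sel) / len(sel)

whole_pop = cancer_rate(pop, True) - cancer_rate(pop, False)  # ~0: independent
admitted_only = [r for r in pop if r[2]]
conditioned = (cancer_rate(admitted_only, True)
               - cancer_rate(admitted_only, False))           # strongly negative
print(f"whole population:       {whole_pop:+.3f}")
print(f"admitted patients only: {conditioned:+.3f}")
```

Among admitted non-smokers, something had to explain the admission, so cancer is overrepresented; among admitted smokers, smoking already explains it. That is the mechanism behind hospital-based "protective" artifacts.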
Forking-path analyses
Meaning:
They made many analytic choices without accounting for uncertainty created by those choices.
Example:
- Changed inclusion age cutoffs
- Tried different covariate sets
- Used different model forms
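A small simulation of one forking path (the "five candidate endpoints" framing and all numbers are illustrative): the data are pure noise, but an analyst who tries several analyses and reports whichever comes out "significant" finds something far more often than the nominal 5%.

```python
import math
import random
import statistics

random.seed(3)

def z_pvalue(a, b):
    """Two-sided p-value from a two-sample z-test on means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

trials, hits = 500, 0
for _ in range(trials):
    found = False
    for _ in range(5):  # five candidate endpoints, all truly null
        x = [random.gauss(0, 1) for _ in range(100)]
        y = [random.gauss(0, 1) for _ in range(100)]
        if z_pvalue(x, y) < 0.05:
            found = True
    hits += found
false_positive_rate = hits / trials  # near 1 - 0.95**5 ≈ 0.23, not 0.05
print(f"chance of at least one 'significant' result: {false_positive_rate:.2f}")
```

With five independent null tests, the chance of at least one false positive is roughly 23%; real forking paths (correlated analysis variants) inflate error rates less predictably, which is exactly why the choices must be reported.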
Assignment
Take 5 papers relevant to your interest area and classify:
- What model is used?
- What causal assumption is implied?
- What bias is unaddressed?
- What claim is unjustified?
MODULE 2
FROM “DATA” TO CAUSE
Objective
Move from association to explanation.
Content
Topics:
- Directed acyclic graphs (DAGs)
- Confounding vs mediation
- Collider bias
- Identification strategies
- Exchangeability
- Positivity
- Target trial emulation
Assignment
Build DAGs for:
- Vasopressors → mortality
- Ventilator strategy → lung injury
- Diuretics → renal outcomes
- Steroids → survival
Label:
- Confounders
- Mediators
- Colliders
- Unmeasured bias
MODULE 3
ASSOCIATION MODELS ≠ ANSWERS
Objective
Use standard models honestly, not lazily.
Content
Tools:
- Logistic regression
- Cox models
- KM curves
- Propensity scores
- Splines
But learn:
- When each model lies
- When it misleads
- When hazard ratios distort reality
- When matching worsens bias
- When the proportional-hazards assumption fails
Assignment
Re-analyze one “classic” study and show:
- What assumptions were violated
- How conclusions change under different models
MODULE 4
CAUSAL INFERENCE THAT ACTUALLY WORKS
Objective
Use methods clinicians never learn — and therefore misuse.
Content
Methods:
- IPW / IPTW
- Marginal structural models
- G-methods
- Instrumental variables
- Sensitivity analyses
- Negative controls
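A minimal IPW sketch under strong simplifying assumptions (one binary confounder, invented probabilities): severity drives both treatment and death, treatment truly cuts mortality by 10 percentage points, and weighting each patient by the inverse probability of the treatment they actually received recovers that effect, while the crude comparison gets the sign wrong.

```python
import random

random.seed(4)
rows = []  # (severe, treated, died)
for _ in range(20000):
    severe = random.random() < 0.5
    treated = random.random() < (0.85 if severe else 0.10)       # confounding by indication
    base = 0.40 if severe else 0.15
    died = random.random() < (base - 0.10 if treated else base)  # true effect: -0.10
    rows.append((severe, treated, died))

# propensity score estimated within the single confounder stratum
def propensity(severe_val):
    sel = [t for s, t, _ in rows if s == severe_val]
    return sum(sel) / len(sel)

ps = {True: propensity(True), False: propensity(False)}

num_t = den_t = num_c = den_c = 0.0
for severe, treated, died in rows:
    e = ps[severe]
    w = 1 / e if treated else 1 / (1 - e)  # inverse probability of received treatment
    if treated:
        num_t += w * died; den_t += w
    else:
        num_c += w * died; den_c += w

crude = (sum(d for _, t, d in rows if t) / sum(1 for _, t, _ in rows if t)
         - sum(d for _, t, d in rows if not t) / sum(1 for _, t, _ in rows if not t))
ipw = num_t / den_t - num_c / den_c
print(f"crude risk difference: {crude:+.3f}")  # positive: treatment looks harmful
print(f"IPW risk difference:   {ipw:+.3f}")    # near the true -0.10
```

The weights build a pseudo-population in which severity no longer predicts treatment. Note this only works because positivity holds (both strata receive both treatments) and the one confounder is measured; in real data those are the assumptions to defend.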
Assignment
Pick one clinical question and:
- Emulate a target trial
- Define exposure, outcome, eligibility
- Identify time zero clearly
- Justify design choices
MODULE 5
BAYESIAN THINKING FOR CLINICIANS
Objective
Think in probability, not superstition.
Content
Topics:
- Bayesian interpretation
- Prior formation
- Posterior updating
- Credible intervals vs confidence intervals
- Hierarchical models
- Bayesian decision theory
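Posterior updating in its simplest conjugate form (a sketch; the prior and the cohort numbers are made up): a Beta prior on a mortality rate, updated by binomial data, yields a Beta posterior in closed form, so the shrinkage of the raw rate toward the prior is visible directly.

```python
from math import sqrt

# prior belief about a mortality rate: Beta(4, 16), mean 0.20, fairly weak
a_prior, b_prior = 4, 16

# hypothetical new cohort: 30 deaths in 100 patients (raw rate 0.30)
deaths, patients = 30, 100

# conjugate update: posterior is Beta(a + deaths, b + survivors)
a_post = a_prior + deaths
b_post = b_prior + (patients - deaths)

post_mean = a_post / (a_post + b_post)
post_sd = sqrt(a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1)))
print(f"posterior mean: {post_mean:.3f} (prior 0.200, raw data rate 0.300)")
print(f"posterior sd:   {post_sd:.3f}")
```

The posterior mean lands between prior and data, weighted by their effective sample sizes (20 pseudo-observations vs 100 real ones), which is the "rewrite a frequentist paper as prior, likelihood, posterior" exercise in miniature.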
Assignment
Rewrite a frequentist paper as:
- a prior
- a likelihood
- a posterior belief statement
MODULE 6
INTERPRETABLE MACHINE LEARNING
Objective
Do ML that’s defensible, not embarrassing.
Content
Tools:
- SHAP
- Calibration curves
- Decision curves
- Feature dependence plots
- Nested cross-validation
Concepts:
- Overfitting
- Data leakage
- Bias amplification
- Model stability
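Calibration can be checked with no library at all: bin the predicted probabilities and compare the mean prediction with the observed event rate in each bin. In this sketch the synthetic outcomes are generated from the predictions themselves, so the curve should hug the diagonal; a miscalibrated model would drift away from it.

```python
import random

random.seed(6)
preds = [random.random() for _ in range(5000)]
outcomes = [random.random() < p for p in preds]  # perfectly calibrated by construction

def calibration_bins(preds, outcomes, n_bins=5):
    """Return (mean predicted probability, observed event rate) per probability bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b))
        for b in bins if b
    ]

for mean_pred, obs_rate in calibration_bins(preds, outcomes):
    print(f"predicted {mean_pred:.2f}  observed {obs_rate:.2f}")
```

Discrimination metrics like AUC are silent about this; a model can rank patients well and still report probabilities that are systematically wrong, which is what decision-making actually depends on.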
Assignment
Train a model and:
- explain it
- validate it
- test calibration
- demonstrate generalization
MODULE 7
SIMULATION & BOOTSTRAP THINKING
Objective
Stop relying on asymptotics and wishful thinking.
Content
- Monte Carlo simulation
- Bootstrap confidence intervals
- Power simulation
- Resampling logic
- Uncertainty propagation
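A percentile bootstrap for a statistic with no friendly closed-form standard error: the median of skewed, length-of-stay-like data (simulated here; the exponential shape and all parameters are illustrative).

```python
import random
import statistics

random.seed(5)
# skewed synthetic data, e.g. ICU length of stay in days
data = [random.expovariate(1 / 5) for _ in range(200)]

def bootstrap_ci(data, stat=statistics.median, reps=2000, alpha=0.05):
    """Percentile bootstrap CI: resample with replacement, recompute the
    statistic, and read off the empirical quantiles."""
    n = len(data)
    estimates = sorted(
        stat([random.choice(data) for _ in range(n)]) for _ in range(reps)
    )
    lo = estimates[int(reps * alpha / 2)]
    hi = estimates[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi

lo, hi = bootstrap_ci(data)
print(f"sample median: {statistics.median(data):.2f}")
print(f"95% bootstrap CI: ({lo:.2f}, {hi:.2f})")
```

No normality, no asymptotic formula: the interval comes straight from the resampling distribution, which is the whole point of the module.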
Assignment
Simulate:
- a biased dataset
- a confounded dataset
- a mis-specified model
and show failure modes.
MODULE 8
REPRODUCIBILITY OR NOTHING
Objective
Build research that survives close inspection.
Content
Practices:
- Version control
- Data pipeline discipline
- Code hygiene
- Pre-registration
- Notebooks
- Reproducible reporting
Assignment
Take one old project and:
- rebuild it clean
- document assumptions
- make it rerunnable
MODULE 9
CHOOSING QUESTIONS THAT MATTER
Objective
Research should matter — or it dies quietly.
Content
Good research questions:
- change decisions
- reduce uncertainty
- expose bias
- alter treatment
- predict reality
Bad ones:
- are convenient
- chase trends
- report hollow associations
- pad CVs
Assignment
Kill one project idea that isn’t worth doing.
Design one that is.
CAPSTONE
BUILD SOMETHING REAL
Your final project must:
✅ Answer a real clinical question
✅ Include causal reasoning
✅ Use appropriate models
✅ Produce interpretable results
✅ Be reproducible
✅ Teach you something new