This blog documents my learning journey as I study elementary statistics and R. To be exact, I want to learn data science more than statistics.
By way of background, I am a physician practicing clinical medicine (as opposed to various flavors of research). Like most clinicians, my prior education in stats is exceptionally poor, consisting mostly of heroic last-minute attempts at memorizing scraps of frequentist stuff needed to pass various exams.
My goals are to organize my thoughts and notes about distributions, frequentist two-sample and ANOVA tests, power calculations, simple and multivariable regression, and survival analysis.
Medium-term, I am very interested in moving past frequentist reasoning to Bayesian and information-theoretic approaches:
(Image credit: John D Cook).
I like this pithy summary of the frequentist-Bayesian debate, by Darren L Dahly:
Frequentist and Bayesian views of probability are both useful fictions.
Both require statistical and subject matter expertise.
Both suffer when “defaults” are used.
Both use prior knowledge.
Both are subjective.
The resources I am relying on include:
- Thom Baguley’s book Serious Stats, his blog, and book companion. Thom Baguley is a former editor of the British Journal of Mathematical and Statistical Psychology;
- online forums like CrossValidated and StackOverflow;
- Frank Harrell’s blog Statistical Thinking and his book Regression Modeling Strategies. Frank Harrell is Professor and Founding Chair of Biostatistics at Vanderbilt University;
- Burnham and Anderson’s “Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach”.
- An Introduction to Statistical Learning: with Applications in R by James, Witten, Hastie and Tibshirani from Stanford. Jeff Blume’s