A Deep Dive on The Drunkard’s Walk
I continue my series on the mathematics of Markov chains with a deep dive on The Drunkard’s Walk. This is a set-up for more on Wald’s Sequential Analysis (a near relative of A/B Tests). A great thing...
The Biased Drunkard’s Walk
Our “Markov Chains leading up to A/B tests” series continues with The Biased Drunkard’s Walk. In this note we use the theory of Toeplitz matrices to analyze a variant of the drunkard’s walk that I am...
Conditioning on the Future
In both A Slightly Unfair Game and The Drunkard’s Walk In Detail we showed a fair random walk that moved up or down with 50/50 probability. Some of these walks stopped when they were absorbed at zero,...
What You Should Know About Linear Markov Chains
I want to collect some “great things to know about linear Markov chains.” For this note we are working with a Markov chain on states that are the integers 0 through k (k > 0). A Markov chain is an...
Tools for Jupyter in (and near) Production
I am sharing a tutorial video showing “run Jupyter in production” tools (including the ability to remove the Jupyter dependency). The point is: how to let the analyst work in Jupyter and without great...
Use Jupyter Notebooks Inside For-Loops
In my opinion, a number of “moving data science to production” problems would be solved if one could just use a Jupyter notebook inside a for-loop. The wvpy package supplies the tools put...
What Good is Analysis of Variance?
I’d like to demonstrate what “analysis of variance” (often abbreviated as “anova” or “aov”) does for you as a data scientist or analyst. After reading this note you should be able to...
Illustrating the F-test in Action
I have a new note showing how the F-test works here. The F-test is a good way to quantify model effectiveness. I think it pairs nicely with my earlier ANOVA article. Please check it out.
How Data Quantity Drives Model Quality
I’d like to share a video introduction to a new article on training set size. I am trying to explain some of the subtleties of evaluating “in sample” (on data used during the model inference procedure)...
The m = n Machine Learning Anomaly
In our note “How Data Quantity Drives Model Quality” we worked on how the training data size controls model quality in linear regression. At that time, to avoid some true horror, we deliberately...