Description
Course Content
Important note: Time constraints may prevent some topics from being covered.
- Preliminaries
- The nature of statistical science
- Levels of measurement and implications for estimation and inference
- Discrete and continuous probability models
- Concept of sampling distributions of sample statistics
- Frequentist and Bayesian statistical paradigms
- Data Wrangling
- Getting data into and out of R – reading and writing ASCII, XLSX and CSV files
- Creating data frames
- Manipulating data – subsetting, reshaping, grouping, reorganizing
- Using the tidyr and dplyr packages
- Descriptive Statistics and Graphing
- Exploratory data analysis using R
- Numerical, tabular and graphical summaries
- Histograms – with density smoothers
- Q-Q plots
- Boxplots – individual and multiple with grouping
- Bivariate scatterplots – with smoothing and regression lines
- Rug plots
- Matrix plots
- Cross-tabulation tools for categorical data and summarising quantitative data by classifying factors
- Handling missing values
- Elegant graphics using the ggplot2 package
- Philosophy of ggplot (the grammar of graphics)
- qplot() basics
- plot geoms
- faceting
- building plots by layer
- Inferential Statistics – Preliminaries
- Basic probabilistic concepts
- Using R’s intrinsic functions to compute probability density/mass functions and quantiles, and to generate random data
- Sampling distribution of a sample statistic
- The central limit theorem
- Working with the normal distribution
- Inferential Statistics – Estimation
- Estimating an unknown parameter
- Properties of estimators – point and interval estimates
- Distinction between confidence, tolerance, and prediction intervals
- Small and large sample interval estimation
- Inferential Statistics – Hypothesis testing
- Key concepts: level of significance; P-value; Type I and II errors; statistical power
- Inference about a single population mean
- Multiple-comparison techniques (Dunnett’s test)
- Extension to two samples and testing equality of variances
- Extension to more than two samples – one-way analysis of variance (ANOVA)
- Basic logic of the one-way ANOVA model
- Critical assumptions
- Test of homogeneity of variances
- ANOVA – more complex designs
- Multi-factor designs, blocking, interactions
- Using R’s lm() function
- Interpreting the output
- Diagnostic checking
- Power and Sample-size calculations
- Key concepts – understanding what a power analysis can and can’t do
- How to compute power – writing an R function
- The non-central t-distribution
- Constructing power curves
- Using R’s intrinsic functions to compute power
- Regression Models
- Simple linear regression models and ordinary least squares
- Using R’s lm() function
- Assessing the model fit
- Residuals versus fitted values plot
- Q-Q plot for checking normality
- Scale-location plot
- Cook’s distance
- Residuals versus leverage plot
- Cook’s distance versus leverage plot
- Inference about the fitted model
- Inference about the model parameters
- Inference concerning predicted values
- Statistical calibration
- Logistic Regression
- Parameter estimation