## Description

**Early-bird discount available until July 26.**

# Course Content

**Important note**: Time constraints may prevent some topics from being covered.

**Preliminaries**

- The nature of statistical science
- Levels of measurement and implications for estimation and inference
- Discrete and continuous probability models
- Concept of sampling distributions of sample statistics
- Frequentist and Bayesian statistical paradigms

**Data Wrangling**

- Getting data into and out of R – reading and writing ASCII, CSV and XLSX files
- Creating data frames
- Manipulating data – subsetting, reshaping, grouping, reorganizing
- Using the tidyr and dplyr packages
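
As a taste of this module, here is a minimal sketch of a read/write round trip followed by subsetting, grouping and reshaping. It uses base R only so that it runs without extra packages; the dplyr and tidyr packages named above offer equivalent verbs such as `filter()`, `group_by()`, `summarise()` and `pivot_wider()`. The data frame and its column names (`site`, `season`, `conc`) are hypothetical and purely illustrative.

```r
# Hypothetical example data; names and values are illustrative only.
plots <- data.frame(
  site   = c("A", "A", "B", "B", "C", "C"),
  season = c("wet", "dry", "wet", "dry", "wet", "dry"),
  conc   = c(4.1, 5.3, 3.8, 6.0, 4.4, NA)
)

# Write to a temporary CSV file and read it back in.
csv_path <- tempfile(fileext = ".csv")
write.csv(plots, csv_path, row.names = FALSE)
plots_in <- read.csv(csv_path, stringsAsFactors = FALSE)

# Subset: dry-season rows with a recorded concentration.
dry <- subset(plots_in, season == "dry" & !is.na(conc))

# Group: mean concentration per season (rows with NA are dropped by default).
season_means <- aggregate(conc ~ season, data = plots_in, FUN = mean)

# Reshape: long to wide, one row per site and one column per season.
wide <- reshape(plots_in, idvar = "site", timevar = "season",
                direction = "wide")

print(dry)
print(season_means)
print(wide)
```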

**Descriptive Statistics and Graphing**

- Exploratory data analysis using R
- Numerical, tabular and graphical summaries
  - Histograms – with density smoothers
  - Q-Q plots
  - Boxplots – individual and multiple with grouping
  - Bi-plots – with smoothing and regression lines
  - Rug plots
  - Matrix plots
  - Cross-tabulation tools for categorical data and summarising quantitative data by classifying factors

- Handling missing values
- Elegant graphics using the ggplot2 package
  - Philosophy of ggplot (the grammar of graphics)
  - qplot() basics
  - Plot geoms
  - Faceting
  - Building plots by layer

**Inferential Statistics – Preliminaries**

- Basic probabilistic concepts
  - Using R’s intrinsic functions to compute probability densities and distribution functions, find quantiles, and generate random data

- Sampling distribution of a sample statistic
  - The central limit theorem
  - Working with the normal distribution
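
The topics above can be previewed with a short sketch. R names its distribution functions with the prefixes `d` (density), `p` (cumulative probability), `q` (quantile) and `r` (random generation). The simulation below illustrates the central limit theorem for means of a skewed exponential distribution. The sample size, number of replicates and seed are arbitrary choices made for illustration.

```r
set.seed(1)  # arbitrary seed so the simulation is reproducible

# Normal distribution: density, cumulative probability, quantile, random draws.
dens_at_0 <- dnorm(0)                     # density of N(0, 1) at zero
p_below   <- pnorm(1.96)                  # P(Z <= 1.96)
q_975     <- qnorm(0.975)                 # 97.5th percentile of N(0, 1)
draws     <- rnorm(5, mean = 10, sd = 2)  # five random values from N(10, 2^2)

# Central limit theorem by simulation:
# means of n = 30 draws from Exp(1), which has mean 1 and SD 1.
n <- 30
sample_means <- replicate(5000, mean(rexp(n, rate = 1)))

# The simulated SD of the sample mean should be close to 1 / sqrt(n).
print(c(mean      = mean(sample_means),
        sd        = sd(sample_means),
        theory_sd = 1 / sqrt(n)))

# A histogram of the sample means looks roughly normal despite the skewed parent.
hist(sample_means, breaks = 40, main = "Means of 30 Exp(1) draws")
```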

**Inferential Statistics – Estimation**

- Estimating an unknown parameter
- Properties of estimators – point and interval estimates
- Distinction between confidence, tolerance, and prediction intervals
- Small and large sample interval estimation

**Inferential Statistics – Hypothesis testing**

- Key concepts: level of significance; P-value; Type I and II errors; statistical power
- Inference about a single population mean
- Multiple-comparison techniques (Dunnett’s test)
- Extension to two samples and testing equality of variances
- Extension to more than two samples – one-way analysis of variance (ANOVA)
  - Basic logic of the one-way ANOVA model
  - Critical assumptions
  - Test of homogeneity of variances

- ANOVA – more complex designs
  - Multi-factor designs, blocking, interactions
  - Using R’s lm() function
  - Interpreting the output
  - Diagnostic checking
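
To give a flavour of the ANOVA topics, the sketch below fits a one-way and a two-factor model with `lm()`. It assumes the `PlantGrowth` and `ToothGrowth` data sets that ship with R; the course itself may use different data. Bartlett's test is shown as one possible test of homogeneity of variances, not necessarily the one taught.

```r
# One-way ANOVA: plant weight under a control and two treatments.
fit_one <- lm(weight ~ group, data = PlantGrowth)
tab_one <- anova(fit_one)   # F test for equality of the three group means
print(tab_one)
print(summary(fit_one))     # coefficients are contrasts against the first level

# One possible check of the equal-variance assumption.
print(bartlett.test(weight ~ group, data = PlantGrowth))

# Two-factor design with interaction: supplement type crossed with dose.
fit_two <- lm(len ~ supp * factor(dose), data = ToothGrowth)
tab_two <- anova(fit_two)   # sequential F tests: main effects, then interaction
print(tab_two)

# Diagnostic checking: residuals vs fitted values and a normal Q-Q plot.
op <- par(mfrow = c(1, 2))
plot(fit_one, which = 1:2)
par(op)
```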

**Power and Sample-size calculations**

- Key concepts – understanding what a power analysis can and can’t do
- How to compute power – writing an R function
- The non-central t-distribution
- Constructing power curves
- Using R’s intrinsic functions to compute power
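
The following is a minimal sketch of the kind of function this module builds: power for a two-sided, one-sample t-test computed from the non-central t-distribution, then checked against R's own `power.t.test()`. The function name `t_power`, the effect size of 0.5 and the range of sample sizes are assumptions chosen for illustration.

```r
# Power of a two-sided one-sample t-test via the non-central t-distribution.
# delta: assumed true shift in the mean; sd: assumed population SD.
t_power <- function(n, delta, sd = 1, alpha = 0.05) {
  df   <- n - 1
  ncp  <- sqrt(n) * delta / sd     # non-centrality parameter
  crit <- qt(1 - alpha / 2, df)    # two-sided critical value
  # Probability of landing in either rejection region under the alternative.
  pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
}

# Power curve over a grid of sample sizes (the function is vectorised in n).
n_grid      <- 5:60
power_curve <- t_power(n_grid, delta = 0.5)
plot(n_grid, power_curve, type = "l",
     xlab = "Sample size n", ylab = "Power")
abline(h = 0.8, lty = 2)           # a commonly used target power

# Cross-check against R's built-in calculation.
# strict = TRUE counts both tails, which matches the function above.
builtin <- power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05,
                        type = "one.sample", alternative = "two.sided",
                        strict = TRUE)$power
print(c(hand_rolled = t_power(30, delta = 0.5), builtin = builtin))
```

With `delta = 0` the function returns the significance level, which is a quick sanity check on the logic.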

**Regression Models**

- Simple linear regression models and ordinary least squares
- Using R’s lm() function
- Assessing the model fit
  - Residuals versus fitted values plots
  - Q-Q plot for checking normality
  - Scale-location plot
  - Cook’s distance
  - Residual versus leverage plot
  - Cook’s distance versus leverage plot

- Inference about the fitted model
  - Inference about the model parameters
  - Inference concerning predicted values

- Statistical calibration
- Logistic regression
  - Parameter estimation
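
The regression topics above can be sketched in a few lines. The example assumes the `cars` and `mtcars` data sets that ship with R, which may differ from the data used in the course. The six diagnostic plots listed above correspond to `which = 1:6` in R's `plot()` method for `lm` objects.

```r
# Simple linear regression by ordinary least squares: stopping distance on speed.
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))    # estimates, standard errors, t tests, R-squared
print(confint(fit))    # 95% confidence intervals for the model parameters

# The six standard diagnostic plots, in the order listed in the outline.
op <- par(mfrow = c(2, 3))
plot(fit, which = 1:6)
par(op)

# Inference concerning predicted values at two new speeds.
new_speed <- data.frame(speed = c(10, 20))
ci  <- predict(fit, new_speed, interval = "confidence")  # for the mean response
pri <- predict(fit, new_speed, interval = "prediction")  # for a new observation
print(ci)
print(pri)

# Logistic regression: parameters estimated by maximum likelihood via glm().
# Response am is 0 (automatic) or 1 (manual); wt is car weight.
logit_fit <- glm(am ~ wt, family = binomial, data = mtcars)
print(coef(summary(logit_fit)))
```

Prediction intervals are wider than confidence intervals at the same speed because they also account for the scatter of individual observations around the fitted line.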