Statistical Ecotoxicology using R – Module 1: Introduction to Statistical Modelling using R

July 1 – 3, 2019 (3 days)

$1,050.00 Excl. GST

Charles Darwin University – Waterfront Campus

21 Kitchener Dr, Darwin City NT

 

This is a predominantly ‘hands-on’ course that provides the background information and tools needed to get started with analysing data in the R computing environment.

On successful completion, participants will be able to:

  • Understand the basics of the R environment (data frames, objects, classes, factors, lists, workspace, installing and loading packages, getting help)
  • Read in data from various sources (ASCII files, Excel spreadsheets, CSV files, etc.) and use built-in data sets.
  • Perform basic data manipulations and re-organise raw data into standard formats with the help of add-on packages from the tidyverse, such as tidyr and dplyr.
  • Create various R objects such as data frames and lists and know how to manipulate these using conditional selection, grouping, sorting, and list functions such as lapply and mapply.
  • Use intrinsic functions to transform data and perform basic arithmetic.
  • Use R’s intrinsic graphics functions to produce high-quality graphics.
  • Use the ggplot2 package to create more advanced data visualisations.
  • Write simple R programs to perform more complex tasks.
  • Undertake basic statistical analyses using R functions and other packages.

 

Course Content

Important note: Time constraints may prevent some topics from being covered.

  1. Preliminaries
  • The nature of statistical science
  • Levels of measurement and implications for estimation and inference
  • Discrete and continuous probability models
  • Concept of sampling distributions of sample statistics
  • Frequentist and Bayesian statistical paradigms

 

  2. Data Wrangling
  • Getting data into and out of R – reading/writing ASCII files, XLSX files, CSV files
  • Creating data frames
  • Manipulating data – subsetting, reshaping, grouping, reorganising
  • Using the tidyr and dplyr packages

 

  3. Descriptive Statistics and Graphing
  • Exploratory data analysis using R
  • Numerical, tabular and graphical summaries
    • Histograms – with density smoothers
    • Q-Q plots
    • Boxplots – individual and multiple with grouping
    • Bivariate scatter plots – with smoothing and regression lines
    • Rug plots
    • Matrix plots
    • Cross-tabulation tools for categorical data and for summarising quantitative data by classifying factors
  • Handling missing values
  • Elegant graphics using the ggplot2 package
    • Philosophy of ggplot (the grammar of graphics)
    • qplot() basics
    • plot geoms
    • faceting
    • building plots by layer

 

  4. Inferential Statistics – Preliminaries
  • Basic probabilistic concepts
    • Using R’s intrinsic functions to compute probability density and distribution functions and quantiles, and to generate random data
  • Sampling distribution of a sample statistic
  • The central limit theorem
  • Working with the normal distribution

 

  5. Inferential Statistics – Estimation
  • Estimating an unknown parameter
  • Properties of estimators – point and interval estimates
  • Distinction between confidence, tolerance, and prediction intervals
  • Small and large sample interval estimation

 

  6. Inferential Statistics – Hypothesis testing
  • Key concepts: level of significance; P-value; Type I and II errors; statistical power
  • Inference about a single population mean
  • Multiple-comparison techniques (Dunnett’s test)
  • Extension to two samples and testing equality of variances
  • Extension to more than two samples – one-way Analysis of Variance (ANOVA)
    • Basic logic of one-way ANOVA model
    • Critical assumptions
    • Test of homogeneity of variances
  • ANOVA – more complex designs
    • Multi-factor designs, blocking, interactions
    • Using R’s lm() function
    • Interpreting the output
    • Diagnostic checking

 

  7. Power and Sample-size calculations
  • Key concepts – understanding what a power analysis can and can’t do
  • How to compute power – writing an R function
  • The non-central t-distribution
  • Constructing power curves
  • Using R’s intrinsic functions to compute power

 

  8. Regression Models
  • Simple linear regression models and ordinary least squares
  • Using R’s lm() function
  • Assessing the model fit
    • Residuals versus fitted values plots
    • Q-Q plot for checking normality
    • Scale-location plot
    • Cook’s distance
    • Residual versus leverage plot
    • Cook’s distance versus leverage plot
  • Inference about the fitted model
    • Inference about the model parameters
    • Inference concerning predicted values
  • Statistical calibration
  • Logistic Regression
    • Parameter estimation