Course Outline
I. Introduction and Preliminaries
1. Overview
- Making R more accessible: R and available GUIs
- RStudio
- Related software and documentation
- R and statistics
- Using R interactively
- An introductory session
- Getting help with functions and features
- R commands, case sensitivity, etc.
- Recalling and correcting previous commands
- Executing commands from or redirecting output to a file
- Data persistence and removing objects
- Good programming practices: Self-contained scripts, readability (e.g., structured scripts, documentation, markdown)
- Installing packages: CRAN and Bioconductor
2. Reading Data
- Text files (read.delim)
- CSV files
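As a rough illustration of the two file types above, the sketch below writes a tiny table to temporary files and reads it back. The data frame `scores` and the temporary paths are made up for the example; only base R functions are used.

```r
# Hypothetical example data; not part of the course material
scores <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Write the same table as a CSV file and as a tab-delimited text file
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
write.csv(scores, csv_path, row.names = FALSE)
write.table(scores, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv() expects comma-separated values, read.delim() tab-separated values
from_csv <- read.csv(csv_path)
from_tsv <- read.delim(tsv_path)

str(from_csv)
```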
3. Simple Manipulations: Numbers, Vectors, and Arrays
- Vectors and assignment
- Vector arithmetic
- Generating regular sequences
- Logical vectors
- Missing values
- Character vectors
- Index vectors: selecting and modifying subsets of a data set
- Arrays
- Array indexing: Subsections of an array
- Index matrices
- The array() function and simple operations on arrays (e.g., multiplication, transposition)
- Other types of objects
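The topics in this part can be previewed with a few lines of base R. This is only a minimal sketch with invented numbers, not the course's own exercise.

```r
# Vectors and assignment
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# Vector arithmetic is element-wise
y <- 2 * x + 1

# Generating a regular sequence from 0 to 1 in steps of 0.25
s <- seq(from = 0, to = 1, by = 0.25)

# Logical vectors and missing values
z <- c(1, NA, 3)
is_missing <- is.na(z)             # FALSE TRUE FALSE
mean_z <- mean(z, na.rm = TRUE)    # NA is dropped before averaging

# Index vectors: select a subset, then modify one in place
big <- x[x > 6]                    # keeps 10.4, 6.4 and 21.7
x[x > 20] <- 20                    # cap values above 20

# Arrays: a 2 x 3 array filled column by column, then transposed
a <- array(1:6, dim = c(2, 3))
t_a <- t(a)
```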
4. Lists and Data Frames
- Lists
- Constructing and modifying lists
- Concatenating lists
- Data Frames
- Creating data frames
- Working with data frames
- Attaching arbitrary lists
- Managing the search path
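A short sketch of lists and data frames, using invented values. It uses `with()` rather than `attach()` to reach the columns, which is a substitution made here because `with()` leaves the search path unchanged.

```r
# A list can hold components of different types and lengths
person <- list(name = "Ada", scores = c(90, 85))
person$age <- 36                               # add a component
combined <- c(person, list(city = "London"))   # concatenate two lists

# A data frame is a list of equal-length columns
df <- data.frame(id = 1:3, height = c(1.62, 1.75, 1.80))
df$tall <- df$height > 1.70                    # add a derived column

# with() evaluates an expression using the columns as variables
mean_height <- with(df, mean(height))

# search() lists the environments R consults when looking up a name
search_path <- search()
```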
5. Data Manipulation
- Selecting, subsetting observations and variables
- Filtering and grouping
- Recoding and transformations
- Aggregation and combining data sets
- Forming partitioned matrices: cbind() and rbind()
- The concatenation function, c(), with arrays
- Character manipulation using the stringr package
- Introduction to grep and regexpr
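The operations listed above might look roughly like this. The sketch sticks to base R (so `grepl()` stands in for the stringr functions named in the outline), and the data frames `df` and `labels` are hypothetical.

```r
df <- data.frame(
  id    = 1:4,
  group = c("a", "b", "a", "b"),
  value = c(10, 20, 30, 40)
)

# Selecting observations and variables
high <- subset(df, value > 15, select = c(id, value))

# Recoding into a new variable
df$level <- ifelse(df$value >= 30, "high", "low")

# Aggregation: mean of value within each group
means <- aggregate(value ~ group, data = df, FUN = mean)

# Combining data sets on a shared key
labels <- data.frame(group = c("a", "b"), label = c("control", "treated"))
merged <- merge(df, labels, by = "group")

# Partitioned matrices: bind columns first, then add a row
m <- rbind(cbind(1:2, 3:4), c(5, 6))

# Pattern matching: which labels start with "tr"?
hits <- grepl("^tr", merged$label)
```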
6. Advanced Data Reading
- XLS and XLSX files
- readr and readxl packages
- SPSS, SAS, Stata, and other data formats
- Exporting data to TXT, CSV, and other formats
7. Grouping, Loops, and Conditional Execution
- Grouped expressions
- Control statements
- Conditional execution: if statements
- Repetitive execution: for loops, repeat, and while
- Introduction to apply, lapply, sapply, and tapply
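For orientation, the control structures and the apply family could be exercised as follows. This is a minimal sketch with arbitrary numbers.

```r
# Conditional execution
v <- -3
sign_label <- if (v < 0) "negative" else "non-negative"

# for loop: running total of 1..5
total <- 0
for (i in 1:5) {
  total <- total + i
}

# while loop: count how many halvings bring 100 below 1
n <- 100
steps <- 0
while (n >= 1) {
  n <- n / 2
  steps <- steps + 1
}

# sapply() simplifies its result to a vector, lapply() always returns a list
squares <- sapply(1:4, function(i) i^2)
lens <- lapply(list(a = 1:3, b = 1:5), length)

# tapply() applies a function within the groups defined by a second vector
group_means <- tapply(c(1, 3, 10, 20), c("a", "a", "b", "b"), mean)
```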
8. Functions
- Creating functions
- Optional arguments and default values
- Variable number of arguments
- Scope and its consequences
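A small sketch of the function features listed above. The function names `power`, `sum_all` and `bump_local` are invented for illustration.

```r
# Optional argument with a default value
power <- function(x, exponent = 2) {
  x^exponent
}

# Variable number of arguments via ...
sum_all <- function(...) {
  args <- c(...)
  sum(args)
}

# Scope: assignment inside a function creates a local variable
counter <- 0
bump_local <- function() {
  counter <- counter + 1   # local copy; the global counter is untouched
  counter
}
result <- bump_local()
```

After the call, `result` holds the local value while the global `counter` keeps its original value, which is the usual first surprise when learning R's scoping rules.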
9. Basic Graphics in R
- Creating a graph
- Density plots
- Dot plots
- Bar plots
- Line charts
- Pie charts
- Boxplots
- Scatter plots
- Combining plots
II. Statistical Analysis in R
1. Probability Distributions
- R as a set of statistical tables
- Examining the distribution of a data set
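"R as a set of statistical tables" refers to the d/p/q/r function families. A brief sketch, with a simulated sample standing in for a real data set:

```r
# Cumulative probability and quantile of the standard normal
p <- pnorm(1.96)                       # P(Z <= 1.96)
q <- qnorm(0.975)                      # 97.5% quantile

# Probability mass of a binomial: P(X = 2) for 4 fair coin tosses
d <- dbinom(2, size = 4, prob = 0.5)

# Examining the distribution of a (simulated) data set
set.seed(1)
sample_x <- rnorm(100)
five_num <- fivenum(sample_x)          # min, lower hinge, median, upper hinge, max
emp_cdf <- ecdf(sample_x)              # empirical distribution function
share_below_zero <- emp_cdf(0)
```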
2. Hypothesis Testing
- Tests concerning a Population Mean
- Likelihood Ratio Test
- One- and two-sample tests
- Chi-Square Goodness-of-Fit Test
- Kolmogorov-Smirnov One-Sample Statistic
- Wilcoxon Signed-Rank Test
- Two-Sample Test
- Wilcoxon Rank Sum Test
- Mann-Whitney Test
- Kolmogorov-Smirnov Test
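Several of the tests above are available in base R's stats package. The sketch below runs them on small invented samples. The numbers carry no meaning beyond showing the calls.

```r
# Hypothetical measurements for two groups
x <- c(5.12, 4.87, 5.63, 5.81, 6.04, 5.35, 5.52, 5.26)
y <- c(4.61, 4.95, 5.02, 4.73, 5.18, 4.44, 4.89, 5.07)

# Test concerning a population mean (H0: mu = 5)
t_one <- t.test(x, mu = 5)

# One-sample tests
gof <- chisq.test(c(18, 22, 20), p = rep(1 / 3, 3))       # chi-square goodness of fit
ks_one <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x)) # Kolmogorov-Smirnov
signed <- wilcox.test(x, mu = 5)                          # Wilcoxon signed-rank

# Two-sample tests
t_two <- t.test(x, y)
rank_sum <- wilcox.test(x, y)   # Wilcoxon rank sum, equivalent to Mann-Whitney
ks_two <- ks.test(x, y)

p_values <- c(t_one$p.value, gof$p.value, ks_one$p.value, signed$p.value,
              t_two$p.value, rank_sum$p.value, ks_two$p.value)
```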
3. Multiple Hypothesis Testing
- Type I Error and False Discovery Rate (FDR)
- ROC curves and AUC
- Multiple Testing Procedures (BH, Bonferroni, etc.)
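The difference between the Bonferroni and Benjamini-Hochberg (BH) procedures can be seen on a handful of made-up p-values, using `p.adjust()` from the stats package:

```r
# Hypothetical raw p-values from five tests
p <- c(0.001, 0.008, 0.012, 0.041, 0.20)

# Bonferroni controls the family-wise error rate: p * n, capped at 1
bonf <- p.adjust(p, method = "bonferroni")

# BH controls the false discovery rate and is less conservative
bh <- p.adjust(p, method = "BH")

# Number of hypotheses rejected at the 0.05 level under each procedure
rejected_bonf <- sum(bonf <= 0.05)
rejected_bh <- sum(bh <= 0.05)
```

With these inputs BH rejects one more hypothesis than Bonferroni, which reflects the trade-off between controlling the FDR and controlling the chance of any Type I error.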
4. Linear Regression Models
- Generic functions for extracting model information
- Updating fitted models
- Generalized Linear Models (GLM)
- Families
- The glm() function
- Classification
- Logistic Regression
- Linear Discriminant Analysis
- Unsupervised Learning
- Principal Components Analysis
- Clustering Methods (k-means, hierarchical clustering, k-medoids)
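A minimal sketch of fitting, inspecting and updating models, using the `cars` and `mtcars` data sets that ship with R. The choice of data sets and predictors is an assumption made for illustration only.

```r
# Linear regression: stopping distance as a function of speed
fit <- lm(dist ~ speed, data = cars)

# Generic functions for extracting model information
slope <- coef(fit)[["speed"]]
res <- residuals(fit)

# Updating a fitted model: add a quadratic term
fit2 <- update(fit, . ~ . + I(speed^2))

# Logistic regression as a GLM with the binomial family:
# probability of a manual transmission given car weight
logit <- glm(am ~ wt, data = mtcars, family = binomial)
prob <- predict(logit, newdata = data.frame(wt = 3), type = "response")
```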
5. Survival Analysis (survival package)
- Survival objects in R
- Kaplan-Meier estimate, log-rank test, parametric regression
- Confidence bands
- Censored (interval-censored) data analysis
- Cox PH models with constant covariates
- Cox PH models with time-dependent covariates
- Simulation: Model comparison (comparing regression models)
6. Analysis of Variance (ANOVA)
- One-Way ANOVA
- Two-Way ANOVA (two-way classification)
- MANOVA
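One-way and two-way ANOVA can be sketched with `aov()` on the built-in `PlantGrowth` and `ToothGrowth` data sets. These data sets are chosen here purely as examples.

```r
# One-way ANOVA: plant weight across three treatment groups
one_way <- aov(weight ~ group, data = PlantGrowth)
one_way_table <- summary(one_way)

# Two-way ANOVA with interaction: tooth length by supplement type and dose
# (dose is numeric in the data set, so it is converted to a factor)
two_way <- aov(len ~ supp * factor(dose), data = ToothGrowth)
two_way_table <- summary(two_way)

# Pairwise group comparisons after the one-way fit
pairs <- TukeyHSD(one_way)
```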
III. Worked Problems in Bioinformatics
- Short introduction to the limma package
- Microarray data analysis workflow
- Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
- Data processing (Quality Control, normalization, differential expression)
- Volcano plot
- Clustering examples and heatmaps
28 Hours