Course invitation – “Introduction to applied Statistics and R for Master and PhD students”

The guest lecturer's course will cover topics that give participants hands-on examples of how the R language is applied in data processing and statistical modelling.

Total N° Hours: 24 (3 ECTS)

General Description: The course is aimed at Master and PhD students. Its content is designed to bridge the gap between basic R coding and basic/advanced statistical modelling. The course consists of a series of modules that build the skills required to perform a family of analyses frequently encountered in the environmental sciences.

Course Module Contents:

1 – Introduction to R and RStudio: Introduction to some fundamental statistical concepts and coding practices using R and RStudio. Data manipulation & visualization using available packages such as ggplot2.

2 – Classic Linear Models: Univariate regressions, diagnostics & plotting fits; adding continuous predictors (multiple regression); quantile regression; scaling & collinearity; adding factorial (categorical) predictors & incorporating interactions (ANCOVA).

3 – Advanced Linear Models: Generalised Linear Models (binomial and count data); mixed effects models; model selection & simplification (likelihood ratio tests, AIC, glmulti).

4 – Geo-statistical linear models: An overview of model-based geostatistics; Classical parameter estimation; Empirical and Theoretical Semivariograms; geostatistical model estimation; kriging.

5 – Analysis of multivariate datasets: Cluster analysis, PCA, RDA, PCoA, nMDS, variance partitioning techniques.

Course Details: Each module will include practical exercises and supplementary challenges to help attendees build their understanding of the concepts. All course materials (including copies of presentations, practical exercises, data files, and example scripts) will be provided electronically to participants. The course will begin by quickly reviewing some fundamental statistical concepts and coding practices.

Software Used: R/RStudio

R is a (free) open source programming language and software environment for statistical computing and graphics.

Equipment and software requirements: A laptop/personal computer with a working version of R and RStudio installed. R and RStudio run on both PC and Mac and can be downloaded for free by following these links:

- http://cran.r-project.org

- http://www.rstudio.com

Assumed quantitative knowledge: A basic understanding of statistical concepts.

Suggested Readings

1) Field, A., Miles, J., Field, Z. 2012. Discovering Statistics using R. Sage. 992 p.

2) Crawley, M.J. 2012. The R Book, 2nd edition. Wiley.

3) Borcard, D., Gillet, F., Legendre, P. 2011. Numerical Ecology with R. Springer.

4) Diggle, P.J., Ribeiro Jr, P.J. 2007. Model-based Geostatistics. Series: Springer Series in Statistics. X, 230 p. (http://www.leg.ufpr.br/mbgbook/)

LECTURE SCHEDULE:

Online Lectures (on Teams – the link to the virtual classroom will be provided by the teacher; 1 academic hour = 45 minutes):

23 May – from 14:00 to 18:00 (4 academic hours) – Lecture 1

25 May – from 9:00 to 13:00 (4 academic hours) – Lecture 2

26 May – from 9:00 to 13:00 (4 academic hours) – Lecture 3

In-Person Lectures (Sofia University):

30 May – from 9:00 to 13:00 (4 academic hours) – Lecture 4

1 June – from 9:00 to 13:00 (4 academic hours) – Lecture 5

2 June – from 9:00 to 13:00 (4 academic hours) – Lecture 6