Course ID: 014342
In this course, we will cover classification and prediction in health research, with a view to applications in the screening and diagnosis of disease. This will lead to methods for evaluating the performance of various types of statistical models and learning algorithms. Cross-validation and bootstrap approaches will be introduced for evaluating model performance, and we will discuss discrimination and calibration as distinct components of predictive performance. We will cover variable selection techniques, including those for high-dimensional data, with an emphasis on regularization methods such as the LASSO and its variants. Model validation, both internal and external, and model updating will be covered, and we will also discuss post-selection inference (inference carried out after model selection). An important focus will be the evaluation of biomarkers for a given disease, potentially in connection with therapy, which will lead into precision/personalized medicine. Finally, we will address the importance of reproducible and replicable research. Examples from different areas of health research, including genetics, will be presented, and software (e.g., R or SAS) will be used throughout the course.