Overview and objective
Office hours
Lecture schedule (and downloads)
Links and online tutorials
Bibliography
How does automated character recognition work? What is an artificial neural network? How can we find out which lifestyle factors influence the chance of having a heart attack? How does a genetic algorithm optimize a mathematical model? Can we predict stock market prices?
Learning from data is essential to every area of science. It has applications in many walks of life, including number plate recognition, automated manufacturing, astrophysical survey projects, weather or climate prediction and diagnosis of diseases. Over the years, numerous algorithms and techniques have been invented (or reinvented) to address these kinds of problems, and they go by a wide variety of names such as "machine learning", "pattern recognition", "statistical learning", "statistical data modelling" and so forth. The objective of this course is to provide a broad overview of the various statistical and mathematical methods which are used for analysing data, for inferring underlying behaviour, for understanding phenomena and for making predictions.
We shall learn the fundamental principles of modelling, see how these are implemented in various techniques and examine the similarities and differences, advantages and disadvantages of various methods. While basic mathematical concepts will be covered, formal or abstract definitions and derivations will be avoided. The emphasis will rather be on the practical use of the techniques and for this purpose numerous example applications will be covered. The course will make use of the (freely available) statistical software package R and some instruction in its use will be provided. Participants are encouraged to install this package and to use it for trying out several of the machine learning methods covered in the course.
Techniques which will be covered include (provisional list):
This is an introductory course, so prior knowledge of, or experience using, machine learning methods is not required. Basic prerequisites for the course are first-year mathematics, in particular calculus, linear algebra and statistics. The lectures will be in English. The course is suitable for mid-term or advanced undergraduates, graduates and postdocs, or anybody interested in learning about machine learning methods and how to use them. By the end of the course the participants should have the knowledge, confidence and tools to apply machine learning methods to their own data sets.
Course topics and related issues can be discussed either individually or in groups, in English or in German. To make an appointment, please send me an email, indicating the issues you would like to discuss.
PDF and ODP files of the viewgraphs, as well as copies of the R scripts used, will be provided after each lecture. Note that these do not constitute a full set of lecture notes (that's what the books and the lectures themselves are for!).
Date | Topic | Viewgraphs | R scripts | Notes |
17 April | Introduction and basic concepts | [ODP] [PDF] | R scripts | |
24 April | Data exploration | [ODP] [PDF] | R scripts | |
1 May | No lecture (public holiday) | |||
8 May | Linear methods (part 1) | [ODP] [PDF] | R scripts | |
15 May | Linear methods (part 2) | [ODP] [PDF] | R scripts | challenger data |
22 May | No lecture | |||
29 May | Basis expansions | [ODP] [PDF] | R scripts | |
5 June | No lecture | |||
12 June | Additive models and kernel functions | [ODP] [PDF] | R scripts | |
19 June | Neural networks, search and optimization | [ODP] [PDF] |
26 June | More nonlinear stuff | [ODP] [PDF] | R scripts | |
3 July | Support vector machines | [ODP] [PDF] | R scripts | |
10 July | Model selection and combination | [ODP] [PDF] | R scripts | |
17 July | Unsupervised learning and clustering | [ODP] [PDF] | R scripts | |
24 July | The final lecture | [ODP] [PDF] |