ST444      Half Unit
Statistical Computing

This information is for the 2019/20 session.

Teacher responsible

Dr Yining Chen

Availability

This course is available on the MSc in Data Science, MSc in Operations Research & Analytics, MSc in Statistics, MSc in Statistics (Financial Statistics), MSc in Statistics (Financial Statistics) (LSE and Fudan), MSc in Statistics (Financial Statistics) (Research), MSc in Statistics (Research), MSc in Statistics (Social Statistics) and MSc in Statistics (Social Statistics) (Research). This course is available with permission as an outside option to students on other programmes where regulations permit.

Course content

An introduction to the use of numerical linear algebra, optimisation, numerical integration and simulation in statistical computation, with their applications in statistical methods, including least squares, maximum likelihood, principal component analysis, LASSO, etc. If time permits, more advanced topics such as kernel methods and graphical LASSO will also be covered. Throughout the course, students will gain practical experience of implementing these computational methods in a programming language. Learning support will be provided for at least one programming language, such as R, Python or C++, but the choice of language supported may vary between years, depending on judged benefits to students, whether in terms of pedagogy or resulting skills. This year, the default choice is Python.
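
As an illustration of the kind of computation described above, the following is a minimal sketch of least squares solved through a Cholesky decomposition of the normal equations. It is not taken from the course materials: the function names (`cholesky`, `solve_spd`, `least_squares`) are hypothetical, and the use of plain Python lists rather than a numerical library is an assumption made to keep the example self-contained.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def solve_spd(a, b):
    """Solve a x = b for symmetric positive-definite a via Cholesky."""
    L = cholesky(a)
    n = len(b)
    # Forward substitution: solve L z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: solve L^T x = z.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def least_squares(X, y):
    """Least-squares coefficients from the normal equations X^T X beta = X^T y."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    return solve_spd(xtx, xty)
```

For example, `least_squares([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]], [1.0, 3.0, 5.0, 7.0])` fits an intercept and a slope to four points lying on a straight line. Forming the normal equations squares the condition number of the design matrix, so a QR-based solver is usually preferred when the columns are close to collinear.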

Teaching

20 hours of lectures and 10 hours of computer workshops in the MT.

Lectures will cover:

(1) Introduction to Tools in Numerical Analysis: linear algebra (Gaussian elimination, Cholesky decomposition, matrix inversion and condition numbers); numerical optimisation (bisection, steepest descent, Newton’s method, quasi-Newton methods, stochastic search); convex optimisation (coordinate descent, ADMM); numerical integration.

(2) Introduction to Tools in Numerical Simulation: random number generation (inverse CDF, rejection, Box-Muller, etc.); introduction to Monte Carlo methods.

(3) Applications in Statistics: linear regression and least squares; generalised linear models; principal component analysis (PCA); PageRank; LASSO.

(4) Other advanced topics if time allows: bootstrapping; kernel density estimation; graphical models and graphical LASSO.
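
To give a flavour of the simulation tools listed under (2), here is a minimal sketch of inverse-CDF sampling, the Box-Muller transform and a plain Monte Carlo average. It is an illustration only and not part of the course materials: the function names and the choice of the exponential distribution as the inverse-CDF example are assumptions.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse-CDF sampling: if U ~ Uniform[0, 1), then -log(1 - U) / rate ~ Exp(rate)."""
    u = rng.random()
    return -math.log(1.0 - u) / rate


def sample_standard_normal_pair(rng):
    """Box-Muller transform: two independent N(0, 1) draws from two uniforms."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def monte_carlo_mean(sampler, n):
    """Monte Carlo estimate of an expectation as the average of n independent draws."""
    return sum(sampler() for _ in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are reproducible
    # Estimate E[X] for X ~ Exp(2); the exact value is 0.5.
    print(monte_carlo_mean(lambda: sample_exponential(2.0, rng), 20000))
    # Estimate E[Z^2] for Z ~ N(0, 1); the exact value is 1.
    print(monte_carlo_mean(lambda: sample_standard_normal_pair(rng)[0] ** 2, 20000))
```

The standard error of such an estimate shrinks at rate 1/sqrt(n), so halving the error requires roughly four times as many draws.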

Formative coursework

Students will be expected to produce 4 problem sets in the MT.

Fortnightly exercises, involving computer programming and some theory.

Indicative reading

Computational Statistics by Givens and Hoeting

Statistical Computing in C++ and R by Eubank and Kupresanin

The Art of R Programming: A Tour of Statistical Software Design by Matloff

Think Python: How to Think Like a Computer Scientist by Downey

Assessment

Exam (70%, duration: 2 hours) in the summer exam period.
Coursework (30%) in the MT.

Student performance results

(2015/16 - 2017/18 combined)

Classification % of students
Distinction 25.3
Merit 44.6
Pass 25.3
Fail 4.8

Key facts

Department: Statistics

Total students 2018/19: 19

Average class size 2018/19: 19

Controlled access 2018/19: No

Value: Half Unit

Personal development skills

  • Self-management
  • Team working
  • Problem solving
  • Application of information skills
  • Communication
  • Application of numeracy skills