NUIM CS 401: Machine Learning: 2010

2010-11-16

HMM Cheat Sheet!

2010-10-12

Assignment One: Multiple Choices

This assignment involves multiple choices.

R or whatever other language you want: your choice.
Work alone or in pairs, choose a partner: your choice.

And what should you do? One of the following:

Use k-means clustering on the digits in the MNIST dataset, label the clusters according to their most common member, and measure classification performance on the training and testing sets. (Also show the cluster centres, when not using the kernel trick.)
Train an SVM to classify the training set (ten SVMs, one for each digit class, each outputting +1 for its own digit and -1 otherwise, e.g., +1 for "is a 3" and -1 for "not a 3"), and measure performance on the training and testing sets.
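
To make the k-means option concrete, here is a minimal sketch in Python of just the "label the clusters by their most common member, then measure performance" step. It assumes the clustering has already been run and has produced one cluster id per image. The function names and the toy data are made up for illustration and are not part of the assignment.

```python
from collections import Counter

def label_clusters(assignments, labels):
    """Map each cluster id to the most common true label among its members."""
    members = {}
    for cluster, label in zip(assignments, labels):
        members.setdefault(cluster, []).append(label)
    return {c: Counter(ls).most_common(1)[0][0] for c, ls in members.items()}

def accuracy(assignments, labels, cluster_to_label):
    """Fraction of points whose cluster's label matches their true label."""
    correct = sum(cluster_to_label.get(c) == y
                  for c, y in zip(assignments, labels))
    return correct / len(labels)

# Toy data (hypothetical): cluster ids from some k-means run, plus true digits.
train_clusters = [0, 0, 0, 1, 1, 2, 2, 2]
train_digits = [3, 3, 8, 1, 1, 7, 7, 1]

# The cluster-to-digit mapping is fixed using the training set only.
mapping = label_clusters(train_clusters, train_digits)
train_acc = accuracy(train_clusters, train_digits, mapping)

# Test images are assigned to the nearest existing cluster, then scored.
test_clusters = [0, 1, 2, 2]
test_digits = [3, 1, 7, 1]
test_acc = accuracy(test_clusters, test_digits, mapping)

print(mapping, train_acc, test_acc)
```

Note that the mapping is built from the training labels alone and then reused unchanged on the test set, so the test figure is an honest estimate.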

Your choice!

For both the SVM and the k-means above, you should do it twice: once with no kernel trick, and once with the kernel trick. When using the kernel trick, what kernel(s) should you try? Your choice!
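
As a reminder of what the kernel trick buys you, here is a small Python sketch. It checks numerically that a degree-2 polynomial kernel on 2-d inputs gives the same number as an ordinary dot product after an explicit feature map. It also defines a Gaussian (RBF) kernel as another candidate. The function names, the test points, and the value of gamma are arbitrary choices for illustration, not recommendations.

```python
import math

def dot(x, y):
    """Ordinary dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return dot(x, y) ** 2

def phi(x):
    """Explicit feature map for 2-d inputs that matches poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is a free parameter, set arbitrarily here."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = (1.0, 2.0), (3.0, -1.0)
k_value = poly2_kernel(x, y)      # computed in the original 2-d space
explicit = dot(phi(x), phi(y))    # computed in the 3-d feature space

print(k_value, explicit, rbf_kernel(x, y))
```

The two values agree up to rounding, which is the whole point: the algorithm only ever needs kernel values, never the feature vectors themselves. This is also why cluster centres are only asked for in the non-kernel case, since with a kernel the centres live in the feature space rather than in pixel space.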

Each team should turn in their report either via email to barak+cs401@cs.nuim.ie or on paper: your choice.

Due: before class, Mon 1-Nov-2010.

R Reference Materials

This post contains pointers to R reference materials, and will be updated.

Google search for R Tutorial
An Introduction to R (pdf), from the central R web site, something between a tutorial and a manual
Jo Hardin's R Tutorial (pdf), shorter
Notes from Thomas Lumley's two-day short course in R
Video R Tutorial
Short html R Tutorial with implausibly poor colour choice
Short well-organised html R Tutorial with that National Geographic look
Yet another html R Tutorial, one big page with nice graphics and examples

Lecture Notes

More lecture notes, thanks to Paul Murray (!)
notes3 (w/ annotations)

2010-10-10

Learn R Because it is Hot

R is so hot right now.

Functional programming Schemers like myself have spent years whining about how just because something like Cobol or Fortran or C or C++ or Java or perl is "hot" or "standard" or "used by highly profitable companies" or "easy to get a job if you know" does not mean it is actually good or worth learning. Well now the shoe's on the other foot, suckers: R is Hot!
(Update: machine learning is so hot.)

2010-10-08

Max vs Min

Oops! In the lecture of 5-Oct-2010 on finding the maximum margin hyperplane, I wrote max ||w||² where I should have written min ||w||². (Thanks to Thomas Whelan for spotting it.)
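
For reference, here is the corrected problem in its usual textbook hard-margin form. The factor of 1/2, the bias b, and the symbols x_i and y_i (training points and their ±1 labels) are standard conventions assumed here; they may differ from the notation used on the board.

```latex
\min_{w,\,b} \; \tfrac{1}{2}\,\|w\|^2
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 \quad \text{for all } i
% The two supporting hyperplanes w \cdot x + b = \pm 1 are a distance
% 2 / \|w\| apart, so maximising the margin means minimising \|w\|^2.
```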

2010-10-05

Support Vector Machines

Wikipedia has a reasonably good entry on Support Vector Machines. The original paper proposing the technique is also a good resource, quite readable and well motivated: Corinna Cortes and Vladimir N. Vapnik (1995), Support-Vector Networks, Machine Learning 20.

2010-10-04

Lecture Notes

Here are some lecture notes, thanks to Paul Murray.
notes1 (w/ annotations)
notes2 (w/ annotations)

Lecture Notes

If you volunteer to take good notes in class and send them to me for posting on this blog, you will be rewarded with excellent karma and be bathed in warm feelings of good fellowship from the tips of your toes to the top of your head.

(And also extra credit.)

2010-10-01

Machine Learning Competition: the Hearst Challenge

New Machine Learning competition with a prize of $25,000 for the system best able to predict magazine sales: the Hearst Challenge. (If you win, I'll also give you extra credit for this course. In fact, if you are part of a team that puts together a serious entry, I'll give you extra credit for this course.)

2010-09-29

Machine Learning Textbooks: Excellent and Online

This is a list of textbooks about machine learning which are (a) really good, and (b) free on the web.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (deep coverage of an important subset of the field)
Information Theory, Inference, and Learning Algorithms, by David MacKay (very mathematical approach, excellent but hard core)
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto (dated, but the bible of reinforcement learning)
Bayesian Reasoning and Machine Learning, by David Barber (incomplete coverage of the field, but solid and accessible and includes Octave/Matlab code)
Machine Learning by Simon Rogers (I will be using some of the chapters as lecture notes)
Introduction to Machine Learning: Draft of Incomplete Notes by Nils J. Nilsson (excellent writer; his earlier book on AI saved my bacon; slightly dated)

If you know of others (criteria: of general Machine Learning interest, not something highly specific like Gaussian Processes), post a comment and I'll add them above.

(Updated 13-Oct-2010)

2010-09-28

Digits Dataset

R

Welcome to NUIM CS 401: The Blog!