I am sharing a reply I posted on the Kaggle forum here. If you would like to see REAL machine learning code, try starting from this solution collection of mine.

Here are my suggestions for people who are not very mathematical (i.e. you don’t breathe difficult mathematics day-to-day) and come more from a software engineering background. If you are a very mathematical person (who can read without any code and start doing magic), this might not be the best approach for learning machine learning.


If you have absolutely no background, I highly recommend beginning with Andrew Ng’s “Machine Learning” course on Coursera, which is geared towards an audience with no machine learning background.


Recently I found that the User Guide for scikit-learn, the Python machine learning library, is very informative and comes with *runnable code*, so you can immediately see the effects of what you are doing. I recommend it highly if you prefer to learn by coding (I need code to learn). The library is also very easy to use and very powerful, and it is the one I primarily use now. However, approach it by picking out code examples rather than aiming to understand everything in one reading. Dive deeper into specific topics when you need to.
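To give a feel for what "runnable code" means here, this is a minimal sketch in the spirit of the User Guide examples, not something copied from it. The choice of the iris dataset, logistic regression, and a 25% hold-out split are my own assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small dataset bundled with scikit-learn (150 flowers, 4 features).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the model is checked on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a simple classifier and score it on the held-out rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Swapping in another estimator is usually a one-line change, since most scikit-learn models share the same `fit` and `score` methods. That makes it easy to experiment before digging into the theory.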


Next, for Kaggle competitions specifically, Zygmunt’s (Foxtrot) blog at FastML contains a lot of examples (though they might not run now due to the new submission rules) which will bring you up to speed. You can also find a lot of code that other players have shared in the various Kaggle competition forums, which is also a good way to get up to speed.


By now, you should be past the stage of not even being able to submit (I was there). I would suggest that you also start building up and fortifying your theoretical and mathematical basics with Machine Learning by Tom Mitchell, which is a rather readable book (compared to other books, see below).

I know that the posters above recommended books like “The Elements of Statistical Learning”, and I also have books like “Bayesian Reasoning and Machine Learning” and “Pattern Recognition and Machine Learning”. I feel these are more mathematical, and unless you are a real natural at it, I believe you will find them less readable than Machine Learning by Tom Mitchell. Do note that Mitchell’s book is a classic and has not been updated for a while (an update is planned), so some more recently adopted approaches like Random Forests are not included.

Machine Learning – Tom Mitchell


If you prefer a programmer’s introduction to machine learning, I believe you can try Programming Collective Intelligence, which is geared towards software developers and has less mathematics.

Programming Collective Intelligence


Like any other discipline, such as web application development, this needs practice and deep understanding, and I think you will need datasets to start with for different examples. scikit-learn has some sample datasets included, but you may also find more at the UCI Machine Learning Repository, if you feel adventurous.

UCI Machine Learning Repository
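As a quick illustration of the sample datasets that ship with scikit-learn, here is a small sketch that loads two of them and summarizes their shape. Picking the wine and digits datasets is my own choice for the example, and the `summary` dictionary is just a hypothetical helper for printing.

```python
from sklearn.datasets import load_digits, load_wine

# Each loader returns a dict-like Bunch with .data (features) and .target (labels).
datasets = {"wine": load_wine(), "digits": load_digits()}

# Collect (rows, features, classes) for each dataset.
summary = {}
for name, ds in datasets.items():
    n_rows, n_features = ds.data.shape
    n_classes = len(set(ds.target))
    summary[name] = (n_rows, n_features, n_classes)
    print(f"{name}: {n_rows} rows, {n_features} features, {n_classes} classes")
```

These bundled datasets are small and clean, which makes them good for trying out an algorithm quickly. Data from elsewhere typically needs more loading and cleaning work first.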


I have been in this position of having no help, and honestly the introduction is not easy. I really owe Zygmunt (Foxtrot) for the solution code to start with, and I strongly suggest you go check it out alongside taking Andrew Ng’s course. Thank you again, Zygmunt (if you happen to read this). =]