Computing Reviews
Fundamentals of machine learning for predictive data analytics : algorithms, worked examples, and case studies
Kelleher J., Mac Namee B., D’Arcy A., The MIT Press, Cambridge, MA, 2015. 624 pp. Type: Book (978-0-262-02944-5)
Date Reviewed: Sep 28 2016

This is an interesting book that presents the fundamentals of machine learning for predictive analytics in an intuitive and highly educational way. It is organized into 11 chapters and three appendices; each chapter ends with a useful summary, references for further reading, and exercises.

Chapter 1, “Machine Learning for Predictive Data Analytics,” introduces the concept of machine learning, motivating the use of predictive models and the key concept of inductive bias; in addition, it describes the cross-industry standard process for data mining (CRISP-DM). The following chapters can be seen as organized into three parts, corresponding to phases of the CRISP-DM process.

The first part corresponds to the business understanding, data understanding, and data preparation phases of CRISP-DM. In particular, chapter 2, “Data to Insights to Decisions,” explains how to convert business problems into data analytics solutions, introducing concepts such as the analytics base table (ABT), domain concepts, descriptive features, derived features, proxy features, and target features; it also covers legal issues with the use of data. Chapter 3, “Data Exploration,” presents techniques to get to know the available data (for example, data distributions and visualization techniques such as scatter plots and small multiples), data quality issues that may arise (for example, outliers, missing values, and irregular cardinality) along with potential solutions to those issues, and other data preparation tasks (normalization, binning, sampling); it treats the data quality report as the main tool.
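To make the data preparation steps concrete, the following minimal Python sketch (not taken from the book, which deliberately avoids specific tools) shows range normalization and equal-width binning applied to a hypothetical numeric feature; the function names and sample values are invented for illustration.

    # Illustrative sketch only; the book presents these preparation steps
    # conceptually and does not tie them to any particular tool or library.
    import numpy as np

    def range_normalize(values, low=0.0, high=1.0):
        """Rescale a numeric feature into the interval [low, high]."""
        values = np.asarray(values, dtype=float)
        v_min, v_max = values.min(), values.max()
        return low + (values - v_min) * (high - low) / (v_max - v_min)

    def equal_width_bins(values, n_bins=4):
        """Assign each value to one of n_bins equally wide intervals."""
        values = np.asarray(values, dtype=float)
        edges = np.linspace(values.min(), values.max(), n_bins + 1)
        # Digitizing against the interior edges gives a bin index in 0..n_bins-1.
        return np.digitize(values, edges[1:-1])

    ages = [18, 22, 25, 31, 40, 52, 67]        # hypothetical feature values
    print(range_normalize(ages))                # rescaled to [0, 1]
    print(equal_width_bins(ages, n_bins=3))     # bin index (0, 1, or 2) per value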

The second part concerns the modeling phase of CRISP-DM, and each chapter in it focuses on a different family of machine learning algorithms. Chapter 4, “Information-based Learning,” tackles techniques that use concepts from information theory to build prediction models; it specifically covers decision trees as the fundamental data structure, and it also presents model ensembles (boosting and bagging). Chapter 5, “Similarity-based Learning,” focuses on approaches that identify similar previous cases in order to predict new data; in particular, it covers the nearest neighbor algorithm and its extensions and variations (for example, approaches to handle noise in the data and imbalanced datasets, and feature selection techniques), and it presents different distance metrics and similarity measures, as well as the k-d tree for efficient searching. Chapter 6, “Probability-based Learning,” deals with techniques based on probability theory, which exploit probabilities to determine the most likely predictions; it covers the naive Bayes model (and its variants) and Bayesian networks. Finally, chapter 7, “Error-based Learning,” describes approaches that try to minimize prediction error; it focuses on multivariable linear regression with gradient descent, as well as extensions and related approaches (for example, logistic regression and support vector machines). The summary section of each of these chapters includes an analysis of the strengths and weaknesses of the different techniques.
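As an illustration of the error-based approach of chapter 7, the following Python sketch fits a multivariable linear regression with batch gradient descent; the synthetic data, learning rate, and iteration count are hypothetical choices, not material from the book.

    # Illustrative sketch of multivariable linear regression fitted with batch
    # gradient descent; the data and settings below are invented for the example.
    import numpy as np

    def fit_linear_regression(X, y, learning_rate=0.1, n_iters=10000):
        """Minimize the mean squared error by batch gradient descent."""
        X = np.column_stack([np.ones(len(X)), X])   # prepend a bias (intercept) column
        w = np.zeros(X.shape[1])
        m = len(y)
        for _ in range(n_iters):
            error = X @ w - y                       # prediction error for each example
            gradient = X.T @ error / m              # partial derivative of the error w.r.t. each weight
            w -= learning_rate * gradient           # step downhill on the error surface
        return w

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(100, 2))            # two descriptive features
    y = 1 + 2 * X[:, 0] + 3 * X[:, 1] + rng.normal(0, 0.05, size=100)
    print(fit_linear_regression(X, y))              # weights close to [1, 2, 3]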

The third part of the book considers the evaluation and deployment phases of CRISP-DM. Chapter 8, “Evaluation,” describes different evaluation techniques and a range of performance metrics, including the misclassification rate, confusion matrix-based metrics, average class accuracy, profit and loss, the receiver operating characteristic (ROC) index and ROC curve, gain and lift, and the R² coefficient, among others. Chapters 9, “Case Study: Customer Churn,” and 10, “Case Study: Galaxy Classification,” describe two interesting case studies that illustrate the concepts explained throughout the book.
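For example, the misclassification rate and average class accuracy that chapter 8 discusses can be computed from a confusion matrix, as in the following Python sketch; the actual and predicted labels below are invented for illustration and are not taken from the case studies.

    # Illustrative sketch of a confusion matrix and two of the evaluation measures
    # chapter 8 covers; the labels and predictions are made up for this example.
    import numpy as np

    def confusion_matrix(actual, predicted, labels):
        """Rows are actual classes, columns are predicted classes."""
        index = {label: i for i, label in enumerate(labels)}
        matrix = np.zeros((len(labels), len(labels)), dtype=int)
        for a, p in zip(actual, predicted):
            matrix[index[a], index[p]] += 1
        return matrix

    actual    = ["churn", "churn", "stay", "stay", "stay", "churn", "stay", "stay"]
    predicted = ["churn", "stay",  "stay", "stay", "churn", "churn", "stay", "stay"]
    cm = confusion_matrix(actual, predicted, labels=["churn", "stay"])

    misclassification_rate = 1 - np.trace(cm) / cm.sum()
    # Average class accuracy: the mean of the per-class recalls, which is more
    # informative than raw accuracy when the classes are imbalanced.
    average_class_accuracy = np.mean(np.diag(cm) / cm.sum(axis=1))

    print(cm)                       # [[2 1], [1 4]]
    print(misclassification_rate)   # 2 errors out of 8 examples -> 0.25
    print(average_class_accuracy)   # mean of 2/3 and 4/5 -> about 0.73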

Chapter 11, “The Art of Machine Learning for Predictive Data Analytics,” presents alternative taxonomies of machine learning techniques (for example, parametric versus nonparametric models and generative versus discriminative models) and provides guidelines for choosing an appropriate machine learning technique to solve a given problem.

Finally, three appendices provide background concepts that are useful for understanding the book. Appendix A, “Descriptive Statistics and Data Visualization for Machine Learning,” introduces basic statistical measures and visualization techniques; Appendix B, “Introduction to Probability for Machine Learning,” presents the basic concepts of probability theory; and Appendix C, “Differentiation Techniques for Machine Learning,” covers derivatives of continuous functions, the chain rule, and partial derivatives.

The book could be particularly interesting in a teaching context, as it includes very clear and useful explanations that will help students understand the concepts and techniques covered. One of its strongest points is the inclusion of examples that clearly illustrate the explanations. Some concepts are explained or introduced in a very original way compared with other books that follow a more traditional and formal approach. The book focuses only on supervised machine learning, and techniques such as artificial neural networks and deep learning are not described. Although practical in orientation, the book concentrates on the theoretical concepts and techniques of machine learning for predictive analytics, so the use of specific tools is not considered. The book is supported by a website (www.machinelearningbook.com) that offers additional material, such as slides, datasets, sample chapters, and solutions to some exercises; moreover, the website announces an upcoming release of code for the examples contained in the book. Teachers can also request the complete set of solutions to the exercises in the book.


Reviewer: Sergio Ilarri. Review #: CR144789 (1612-0877)
Categories: Learning (I.2.6); Data Mining (H.2.8); Content Analysis And Indexing (H.3.1); Pattern Recognition (I.5)
Other reviews under "Learning":
Learning in parallel networks: simulating learning in a probabilistic system
Hinton G. (ed) BYTE 10(4): 265-273, 1985. Type: Article
Nov 1 1985
Macro-operators: a weak method for learning
Korf R. Artificial Intelligence 26(1): 35-77, 1985. Type: Article
Feb 1 1986
Inferring (mal) rules from pupils’ protocols
Sleeman D. Progress in artificial intelligence, Orsay, France, 1985. Type: Proceedings
Dec 1 1985
