Applied Predictive Modeling

1st ed. 2013, Corr. 2nd printing 2018 Edition

Price
$48.96
Format
Hardcover
Pages
613
Publisher
Springer
Publication Date
ISBN-13
978-1461468486
Dimensions
6.14 x 1.31 x 9.21 inches
Weight
22.9 pounds

Description

"There are a wide variety of books available on predictive analytics and data modeling around the web. We've carefully selected the following 10 books, based on relevance, popularity, online ratings, and their ability to add value to your business. 1. Applied Predictive Modeling." (Timothy King, Business Intelligence Solutions Review, solutions-review.com, June, 2015)

"I used this as a supplement in teaching a data science course that I use a range of different resources because I need to cover working with data, model evaluation, and machine learning methods. The next time I teach this course, I will use only this book because it covers all of these aspects of the field." (Louis Luangkesorn, lugerpitt.blogspot.com, June, 2015)

"This is such a good book it has taken me a while to work through the book. All the while finding examples of why people should read the book. Well thought out examples with the R packages and example code. Take your time and work through this book." (Mary Anne, Cats and Dogs with Data, maryannedata.com, February, 2015)

"This monograph presents a very friendly, practical course on prediction techniques for regression and classification models. The authors are recognized experts in modeling and forecasting, as well as developers of R packages and statistical methodologies. It is a well-written book very useful to students and practitioners who need an immediate and helpful way to apply complex statistical techniques." (Stan Lipovetsky, Technometrics, Vol. 56 (3), August, 2014)

"There are hundreds of books that have something worthwhile to say about predictive modeling. However, in my judgment, Applied Predictive Modeling by Max Kuhn and Kjell Johnson (Springer 2013) ought to be at the very top of the reading list. They come across like coaches who really, really want you to be able to do this stuff. They write simply and with great clarity. Applied Predictive Modeling is a remarkable text. It is the succinct distillation of years of experience of two expert modelers." (Joseph Rickert, blog.revolutionanalytics.com, June, 2014)

"This strong, technical, hands-on treatment clearly spells out the concepts, and illustrates its themes tangibly with the language R, the most popular open source analytics solution." (Eric Siegel, Ph.D., Founder, Predictive Analytics World; Author, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die)

This text is intended for a broad audience as both an introduction to predictive models as well as a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics.

Max Kuhn, Ph.D., is a Senior Director in Research & Development at Pfizer in Groton, CT. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years. Dr. Kuhn has made many contributions to statistical computing. He is the author of eight R packages for techniques in machine learning (notably caret) and reproducible research and is an Associate Editor for the Journal of Statistical Software.

Kjell Johnson, Ph.D., has over 15 years of predictive modeling and statistical consulting experience in pharmaceutical research and development and other industries. He is a former Director of Statistics at Pfizer R&D, and is a co-founder of Arbor Analytics, a firm that specializes in predictive modeling and statistical consulting and currently serves the pharmaceutical, medical devices, finance, and insurance industries. His scholarly work centers on the application and development of statistical methodology and learning algorithms.

Drs. Kuhn and Johnson have taught numerous short-courses on predictive modeling for organizations such as useR!, Predictive Analytics World, Eastern North American Region, American Chemical Society, Society for Biomolecular Screening, Deming Conference, and individual corporations.

Features & Highlights

  • Winner of the 2014 Technometrics Ziegel Prize for Outstanding Book
  • Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance, all of which are problems that occur frequently in practice.
  • The text illustrates all parts of the modeling process through many hands-on, real-life examples. And every chapter contains extensive R code for each step of the process. The data sets and corresponding code are available in the book's companion AppliedPredictiveModeling R package, which is freely available on the CRAN archive.
  • This multi-purpose text can be used as an introduction to predictive models and the overall modeling process, a practitioner's reference handbook, or as a text for advanced undergraduate or graduate level predictive modeling courses. To that end, each chapter contains problem sets to help solidify the covered concepts and uses data available in the book's R package. Readers and students interested in implementing the methods should have some basic knowledge of R. And a handful of the more advanced topics require some mathematical knowledge.

Customer Reviews

Rating Breakdown

★★★★★ 60% (185)
★★★★ 25% (77)
★★★ 15% (46)
★★ 7% (22)

Most Helpful Reviews

✓ Verified Purchase

Solid

I read "Applied predictive modeling" (which I will shorten to APM) shortly after I read "Introduction to statistical learning" (ISL) by James, Witten, Hastie and Tibshirani, and find that book both closest to APM, and helpful in highlighting APM's strengths.

The two books cover the same broad subject. If you google "kuhn caret", you will find Max Kuhn's (very informative) presentation of his "caret" R package, and its first slide will tell you that he uses "predictive modeling" as a synonym of "machine learning" - what Hastie and Tibshirani call "statistical learning". Adopting H&T's terminology choice, I will say that both books combine theory of "statistical learning" with hands-on illustrations and exercises implemented in R; the get-your-hands-dirty, try-it-out element is, in fact, ISL's key difference from the earlier, venerable "Elements of statistical learning".

Both books, inevitably, go over a catalog of statistical-learning techniques. The shorter ISL, in my opinion, is superior at explaining the concepts and communicating the principles, while APM takes the more straightforward approach of "beefing up" the catalog, by spending more pages on each item and including more items. While ISL is by design very accessible, APM can be more technical - the detail will surely be appreciated by any practitioner - and, as it talks about the various methods, it can and does discuss recent extensions, offering an extensive and "fresh" bibliography. R-wise, APM's advantage is not decisive (if you look at content, not line count) but big; the book naturally favors "caret" - which has a useful role, "wrapping" a plethora of third-party R packages, and providing a common interface, plus helpful utilities - but both references and uses the specialist packages as well.

If you are wondering why I am not giving APM five stars, it's because the book jumped into the catalog mode a bit too briskly, and delivered on the "applied" promise mostly by defining "applied" as "illustrated with R examples". I wish there were more chapters like Chapter 16, which talks about the very common problem of effective classification in highly unbalanced samples. Nonetheless, I am impressed by "Applied predictive modeling" and recommend it as a sensible follow-up, or maybe even alternative, to "Introduction to statistical learning".
148 people found this helpful
✓ Verified Purchase

Best Hands On Guide By Far

There are many fine math-oriented predictive modeling books, such as Hastie (The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)). Kuhn et al. consider them "sister texts" and begin immediately to differentiate-- their approach is hands-on and practical, for the express purpose of demonstrating HOW to sort, structure and predict via R, for the purpose of accuracy and understanding of the DATA and trends, NOT learning the underlying math.

For a couple of pharmaceutical guys, (who BTW use R extensively, I've been an analyst in that industry), you'd think the examples would be new chemical or biological entities. Not so! The cases are fun and exciting, ranging from the nontrivial compression strength of concrete (want that bridge to hold when you cross?) to fuel economy, credit scoring, success in grant applications (boy their colleagues will love that one!), and cognitive impairment. I evaluate technology for patents at payroy dot com, and we have a log likelihood model using Bayesian and Monte Carlo that their grant section helped translate seamlessly to R! We're NOT talking pie in the sky pseudo code here, but real life, real results recipes.

The authors talk about the "scholarly veil" -- meaning we general workers and researchers don't always "deserve" to see the underlying process, software and data (and, other than open source, often can't afford it). Wow, do they pop that myth! These authors are relentless in giving every detail, from design and binning to sorting and stacking to ANOVA, regressions, trees, error methods-- the whole ball of wax with live data and live R coding-- all on a shoestring budget! I guarantee you can start with basic stats and run a very well designed predictive model with the methods they detail, without having to pop for SAP/ IBM or SPSS.

One caveat-- even though they don't assume advanced partial differential equations or even probability theory, the R code and methods are at a fast clip. I'd say they are assuming you either have, or will fill in, with R basics and practice or experience. This is NOT a "how to use R" manual, even though it is in a sense-- it is a "how to apply R correctly and robustly in a way that will pass a juried look at your methods and conclusions." Again, REAL WORLD. For comparison, I'd put the math at advanced undergrad and the R at grad level/ professional practice levels. This will make the title excellent both for learning and professional reference. At this writing, the book is hard to find, and being marked up by resellers-- a tribute to its value and demand right out of the gate.

Springer is never cheap, but also never shabby-- the book is typically gorgeous, well edited, combed for errors (the code ran fine on my antique R download-- even though it's free, I'm hesitant to have to learn a new version!), and pedagogically awesome if you're considering this for a class. We recommend books for our library purchasers and of the 25 actively screened in this category (including a focus on prediction, not just data mining), this is in the top three with Hastie above! Highly recommended for research, augmentation, reference, as well as deep study. Lots of insights, too, about where big data, ML, mining and prediction are now and where they are going-- predicting prediction's future.

Library Picks reviews only for the benefit of Amazon shoppers and has nothing to do with Amazon, the authors, manufacturers or publishers of the items we review. We always buy the items we review for the sake of objectivity, and although we search for gems, are not shy about trashing an item if it's a waste of time or money for Amazon shoppers. If the reviewer identifies herself, her job or her field, it is only as a point of reference to help you gauge the background and any biases.
109 people found this helpful
✓ Verified Purchase

Wish I'd read this years ago

I wish I'd had this book 10 years ago, and the discipline to have sat down and read it thoroughly. It is well written, has beautiful plots that are worthy of a book on visualization all by themselves, has great coverage of topics, and is easy to understand.

There is a natural comparison to be made to The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics). I found this book much, much better. Where ESLII was fractured and seemed to jump from point to point with no explanation, APM proceeded in a well thought-out manner. ESLII used some non-standard notation and assumptions, where APM used notation familiar to anyone with a background in statistics and linear algebra. To be fair, it may be that I'll return to ESL after having read APM and be able to bridge the leaps the authors made with material I've learned from this book.

The pros:
- Gives a solid introduction to the problem prediction is trying to solve
- Provides a framework for evaluating prediction results, using a consistent data set across all problems.
- Has citations and references for further reading
- Does a good job of contrasting machine learning black-box models and classical statistics' interpretability (see Breiman's "Statistical Modeling: The Two Cultures" paper for some great insights into this phenomenon)

The cons:
- A bit light on theory, especially proofs and details behind the models. I feel this is a bit of a pro, though, since the citations for the work are provided, and the theorems and proofs are there if you are interested in them.
28 people found this helpful
✓ Verified Purchase

The Future is Now (it is with an accurate-enough prediction!)

After completing Introduction to Statistical Learning with applications in R, this takes the study of predictive modeling to a new level using the caret package in R. It is so much fun to read and experiment with that I carry it in my backpack, and I read it everywhere (including before going to sleep at night!).
20 people found this helpful
✓ Verified Purchase

First Rate Case Studies, Modeling Techniques, & R Code

My name is Matt. I’m an educator that focuses on data science in business applications. My background is business and mechanical engineering, not computer science. I don’t have a PhD. I’m an ordinary person that fell in love with Data Science. I’ve since started an education business aimed at bringing applied data science courses to help business-minded people solve real world problems.

I purchased Applied Predictive Modeling after visiting a high performance hedge fund that employs a number of brilliant minds. This book appeared in most of the work spaces so I decided to pick up a copy and read it for myself.

I read the first half of APM on vacation and honestly I couldn’t put it down. The book goes into detail on a wide range of models, many of which I’d never heard of before. Beyond this, APM provides the R code showing exactly how to implement the models. For me, this application focus is valuable.

The book weaves in many case studies from pharmaceuticals, to business, to even using machine learning to find the optimal concrete formula.

I will say that this book is not for complete beginners, but as soon as you get through the basics this is a great book from two of the best minds in modeling. For beginners I recommend R For Data Science.

Hope this helps.

-Matt
17 people found this helpful
✓ Verified Purchase

An excellent book on modeling, marrying both depth and clarity

This was the best textbook in my coursework in the University of Texas' Business Analytics program. Kuhn doesn't presuppose too much knowledge of math, and the R examples make this book a 2 for 1--a great introduction to predictive modeling and a way to sharpen your R skills. I wish every modeling book was written as clearly as this one.

This is really the only book I've found that remains clear and understandable while going quite deeply into the theoretical underpinnings of popular predictive modeling techniques. It seems just about everything else out there is highly superficial and skips over the dirty guts of modeling, or is far too complicated and assumes you already have a PhD-level understanding of either stats, math, or computer science. In some ways, Kuhn has done the impossible with this book. Highly recommended.
12 people found this helpful
✓ Verified Purchase

A concise, useful book for applied predictive modeling

As a data scientist, I have read many books in the fields of statistics, machine learning and deep learning. But I would say this book is unparalleled in terms of conciseness and mathematical rigor. Some classics, like the holy bibles of ML, "Elements of statistical learning" and "Pattern recognition and machine learning", illustrate concepts clearly but rarely shed light on real-world application. Some more applied books, like "Data science for business", give an intuitive way of understanding the concepts behind the scenes and are great for people who have no previous experience in the data science area. For a person with a master's degree in statistics looking for a book that is neither theoretically heavy like ESL nor elementary like the Foster & Tom classic, this book is what you need. Besides the clear explanation of a variety of statistical methods in each chapter, the authors walk us through a case study to make sense of every single line of code compiled in the book.

I've learned quite a lot about data resampling and model tuning just after finishing Chapter 5. I am pretty sure I will benefit more as I delve into the subsequent chapters. I recommend you spend time reading through this great book, and may you have the same feeling as I do.
9 people found this helpful
✓ Verified Purchase

Excellent treatment of predictive modeling techniques and pitfalls

I don't ordinarily write reviews because I don't feel as eloquent as most reviewers, but I have to say this is an excellent and accessible treatment of predictive modeling (aka machine learning) techniques. Unlike the classic, and also excellent, "The Elements of Statistical Learning" by Hastie, Tibshirani, and Friedman, this book takes a practical "how to" approach instead of the more traditional "mathematics background first" approach. Unlike the books for mathophobics, though, "Applied Predictive Modeling" does not dodge or avoid critical topics like feature selection or dimensionality reduction to avoid collinearity. I have seen machine learning books that, for example, never discuss in detail concepts like measuring the effectiveness of predictive algorithms with metrics like RMSE. This book discusses both the techniques themselves and the performance measures, along with caveats or precautions for each technique. It's almost as if the authors sat down and said, "what are the steps we take when faced with a new predictive modeling problem, from initial exploratory data analysis through final production algorithm design," and then used that as a framework for writing the book. I don't hesitate to recommend this book to beginning data scientists or more experienced practitioners - everyone can benefit from the authors' detailed treatments of each step in the process and the different approaches to regression and classification problems.

Each chapter concludes with a "Computing" section, wherein the authors provide R code to accomplish or at least illustrate each step discussed in the chapter. They use public data sets for all their work, so the reader can easily reproduce the exact illustrations used in the chapters. The code chunks are small enough, however, that the readers can easily use them in their own analytics problems. Also, they're not just code listings; there is considerable discussion of the techniques so people less fluent in R can keep up and learn. I found myself sometimes wishing that the R code was interleaved into the chapters, but this minor nit is my only critique of the book (if you could call it a critique), and it may not even be relevant because there are advantages to the "Computing" sections as well, such as ease of using them as reference while working through real-world analyses.

Sometimes I buy a machine learning book and am so overwhelmed by the formulas and formalistic approaches that I wonder if the expense of these kinds of books was worth it. I have no such questions with "Applied Predictive Modeling" - it is clearly worth the cover price. In fact, I'll be recommending it to my employees as a "must have" learning tool and reference book.
9 people found this helpful
✓ Verified Purchase

Great writing and the code is a godsend

If you want to learn how to do predictive modeling and you are not a PhD in math, get this. (If you are a PhD, get it anyway so you can enjoy reading the clear writing.) The prose is some of the best I have ever seen in a technical book and the code is exceptionally useful.

This is not mathematically rigorous, but what it lacks in proofs and derivations it more than makes up for in superb explanations of why different methods are used. If you run into machine learning algorithms at work or school and you want to get a feel for them, make this your first purchase. The code that supports this book is a godsend. Between the book and the website (which has the code) you will get clean explanations and you will be able to implement what you are taught using R.
9 people found this helpful