Applied Data Mining for Forecasting Using SAS. Tim Rey
Creating Time Series Data Using Accumulation Methods
Creating Data Hierarchies Using Aggregation Methods
10.2 Statistical Forecast Reconciliation
10.3 Intermittent Demand
10.4 High-Frequency Data and Mixed-Frequency Forecasting
High-Frequency Data
Mixed-Interval Forecasting
10.5 Holdout Samples and Forecast Model Selection in Time Series
Introduction
10.6 Planning Versus Forecasting and Manual Overrides
10.7 Scenario-Based Forecasting
10.8 New Product Forecasting
Chapter 11 Model Building: Alternative Modeling Approaches
11.1 Nonlinear Forecasting Models
11.1.1 Nonlinear Modeling Features
11.1.2 Forecasting Models Based on Neural Networks
11.1.3 Forecasting Models Based on Support Vector Machines
11.1.4 Forecasting Models Based on Evolutionary Computation
11.2 More Modeling Alternatives
11.2.1 Multivariate Models
11.2.2 Unobserved Component Models (UCM)
Chapter 12 An Example of Data Mining for Forecasting
12.1 The Business Problem
12.2 The Charter
12.3 The Mind Map
12.4 Data Sources
12.5 Data Prep
12.6 Exploratory Analysis and Data Preprocessing
12.7 X Variable Imputation
12.8 Variable Reduction and Selection
12.9 Modeling
12.10 Summary
Preface
It is utterly impossible that a mathematical formula should make the future known to us, and those who think it can would once have believed in witchcraft.
Jacob Bernoulli, in Ars Conjectandi, 1713
Curiosity about “what will happen next” is part of human nature, and thus the first attempts at forecasting are rooted deep in history. In ancient and medieval times, prophets like the Oracle of Delphi or Nostradamus had the status of demigods. The situation is significantly different in the 21st century, though, when predicting the future is no longer divine magic but a necessity in contemporary business. Thousands of professionals build forecasts in almost all areas of human activity. Since the global recession of 2008–2009, the need for reliable forecasting has been much more widely understood.
The increased demand for forecasting triggered the development of new methods in addition to the “classical” time series statistical approaches, such as exponential smoothing and the Box-Jenkins AutoRegressive Integrated Moving-Average (ARIMA) models. One fruitful direction of development is nonlinear time series modeling, based on various computational intelligence methods, such as neural networks, support vector machines, and genetic programming. Other developments, of special importance to industrial applications, are efforts to improve time series forecasts by selecting the best potential drivers using various data mining methods. Such methods include, among others, similarity analysis, sequential pattern matching, Principal Component Analysis (PCA), decision trees, co-integration analysis, variable cluster analysis, stepwise regression, and genetic programming.
Unfortunately, the available literature on integrating data mining methods into forecasting is very limited. The existing books on the market focus either on forecasting methods or on data mining approaches. In addition, very few references discuss the numerous practical issues of applying forecasting in a business setting. The practitioner needs a book that addresses the issues of applied industrial forecasting, gives a framework for integrating data mining and time series forecasting, and gives a methodology for large-scale multivariate industrial forecasting.
Applied Data Mining for Forecasting Using SAS is one of the first books on the market that fills this need.
Purpose of the Book
The purpose of the book is to give the reader an industrial perspective on applying data mining for forecasting different business activities, using some of the most popular software: SAS Institute's range of SAS products, including Base SAS, SAS Enterprise Guide, SAS Enterprise Miner, and SAS Forecast Server. The key topics of the book are as follows:
1. What a practitioner needs to know to successfully apply data mining for forecasting – The first main topic of the book focuses on the ambitious task of giving guidelines to practitioners about building the necessary framework for effective forecasting in a business setting. It covers the issues of justifying the need for industrial forecasting, offering a work process within the popular Six Sigma platform, and discussing the necessary infrastructure and application issues.
2. How data mining improves forecasting – The second key topic of the book clarifies the important question of using data mining for forecasting. Its main focus is on presenting the key data mining methods for variable reduction and selection and their implementation in SAS.
3. How to apply data mining for forecasting in practice – The third key topic of the book covers the central point of interest: the application strategy for business forecasting. It includes a short survey of the key contemporary forecasting methods based on time series and illustrates them with appropriate examples from business practices.
Who This Book Is For
The targeted audience is much broader than the existing scientific communities in forecasting and data mining. The readers who can benefit from this book are described below:
Industrial practitioners – This group includes forecasters in a number of different traditional company departments, such as strategy, sales, marketing, finance, supply-chain, purchasing, and so on. They will benefit from the book by understanding the impact of data mining on forecasting and by using the discussed forecasting methods and application methodology to broaden and improve the performance of their forecasts.
Data miners and modelers – This group consists of the large professional community of users of data mining technologies in different industries. This book will introduce them to contemporary forecasting methods and will demonstrate how they can leverage their data mining skills in the area of industrial forecasting.
Econometricians – This group includes the key community driving the demand for development and application of time series statistical methods, which form the basis of industrial forecasting. The book will give them substantial information about data mining methods related to time series forecasting, as well as important feedback from industry