Applied Data Mining for Forecasting Using SAS. Tim Rey
Figure 2.5: Key Blocks of Data Mining in Six Sigma
Figure 2.5 is from Alex Kalos and Tim Rey's paper, “Data mining in the chemical industry” (2005). The details of using this data mining process within the Six Sigma framework are also given in this paper.
Data mining in forecasting within DMAIC
The other option, integrating the proposed work process for data mining in forecasting within Six Sigma, is illustrated in Figure 2.6, which shows the corresponding links between the key blocks of both methodologies. The project definition steps, including system identification, are part of the define phase of DMAIC. The data preparation steps belong to the measure phase, and both the variable selection and reduction steps and the forecasting model development steps are included in the analyze phase. The forecasting model deployment steps are part of the improve phase, and the last part of the forecasting work process, the forecasting model maintenance steps, is linked to the control phase.
Figure 2.6: Correspondence of the Data Mining in Forecasting Work Process with DMAIC
Because of the clear link between the proposed work process (based on the requirements for developing high-performance forecasting) and a work process such as Six Sigma (that is almost universally adopted in industry), you can integrate the two processes with minimal effort and cultural change. As a result, you have greater opportunities to introduce the proposed methodology and can more efficiently manage projects and develop forecast systems.
Appendix: Project Charter
Opportunity Statement:
The current forecast is judgmental, with an average mean absolute percentage error (MAPE) of 16.5% for four quarterly forecasts.
The opportunity is to improve the forecast by using statistical methods.
The key hypothesis is that more accurate forecasts will lead to proactive business decisions that will consistently increase profit by at least 10%.
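The MAPE figure in the opportunity statement can be reproduced with a short calculation. The following Python sketch is illustrative only: the function name `mape` and all price values are hypothetical and do not come from the project, and it assumes that MAPE is the mean of the absolute percentage errors across four quarterly forecasts.

```python
def mape(actuals, forecasts):
    """Return the mean absolute percentage error, expressed in percent."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    if any(a == 0 for a in actuals):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical quarterly prices and judgmental forecasts (not project data).
actual_prices = [100.0, 110.0, 120.0, 125.0]
judgmental_forecasts = [80.0, 99.0, 144.0, 150.0]

judgmental_mape = mape(actual_prices, judgmental_forecasts)
print(f"Judgmental MAPE: {judgmental_mape:.1f}%")  # prints "Judgmental MAPE: 17.5%"
```

Because each error is divided by the actual value, the measure is undefined for a zero actual, which is why the sketch rejects that case explicitly.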
Project Goal and Objective:
The technical objective of the project is to develop, deploy, and support, for at least three years, a quarterly forecasting model that projects the price of Product A for a two-year time horizon and that outperforms the accepted statistical benchmark (naïve forecasting in this case) by 20% based on the average of four consecutive quarterly forecasts.
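One way to check a model against the naïve benchmark is sketched below in Python. The charter does not define how "outperforms by 20%" is measured, so the sketch assumes a 20% relative reduction in MAPE. It also simplifies the setting to one-quarter-ahead forecasts, where the naïve forecast for each quarter is the previous quarter's actual price. All function names and numbers are hypothetical.

```python
def mape(actuals, forecasts):
    """Return the mean absolute percentage error, expressed in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


def naive_forecasts(last_known, actuals):
    """One-step-ahead naive forecast: each quarter repeats the previous actual."""
    return [last_known] + list(actuals[:-1])


def relative_improvement(model_mape, benchmark_mape):
    """Percent reduction in MAPE relative to the benchmark (assumed metric)."""
    return 100.0 * (benchmark_mape - model_mape) / benchmark_mape


# Hypothetical data: last observed price, then four consecutive quarters.
last_known_price = 100.0
actual_prices = [104.0, 100.0, 110.0, 121.0]
model_forecasts = [102.0, 102.0, 105.0, 115.0]

benchmark = naive_forecasts(last_known_price, actual_prices)
improvement = relative_improvement(
    mape(actual_prices, model_forecasts),
    mape(actual_prices, benchmark),
)
meets_target = improvement >= 20.0  # assumed reading of the 20% objective
print(f"Improvement over naive benchmark: {improvement:.1f}%")
print(f"Meets 20% target: {meets_target}")
```

For the two-year horizon named in the objective, the same comparison would be repeated for each forecast origin and the results averaged over the four consecutive quarterly forecasts.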
Project Scope and Boundaries:
The project will focus on Product A price in Germany.
Deliverables:
A forecasting model with a user interface in Excel
A decision scheme with proactive action items to increase profits
Timeline:
Estimated duration of the key steps of the project:
Project definition: 40 hours
Data preparation: 80 hours
Model development: 60 hours
Model deployment: 20 hours
Model maintenance: 10 hours per year
Team Composition:
The ideal team includes:
Management sponsor
Project owner
Project leader
Technical subject matter experts
Model developers
End users
1 A good starting point for developing mind-maps is Tony Buzan's The Mind-map Book (2003).
2 The mind-maps in this book are based on the Mindjet product MindManager 8, available from http://www.mindjet.com/.
3 A classic book about data preparation is Dorian Pyle's Data Preparation for Data Mining (1999).
4 Evans, C., Liu, C. and Pham-Kanter, G. “The 2001 recession and the Chicago Fed National Activity Index: Identifying business cycle turning points,” Economic Perspectives 26, no. 3 (2002): 26–43.
5 The FVA method is described in Michael Gilliland's book, The Business Forecasting Deal (2010).
6 A book with many examples of using different SAS solutions for data preparation is Gerhard Svolba's Data Preparation for Analytics Using SAS (2006).
7 A good explanation of X11 and X12 is given by Spyros G. Makridakis et al. in Forecasting: Methods and Applications (1997).
8 Friedman, J. H. “Greedy function approximation: A gradient boosting machine,” Annals of Statistics 29 (2001): 1189–1232.
9 A useful classification of the SAS/ETS functions is given in Table 1.1 in the book SAS for Forecasting Time Series (2003) by John Brocklebank and David Dickey.
10 A detailed description of S&OP is given in Charles Chase's Demand-Driven Forecasting: A Structured Approach to Forecasting (2009).
11 The reader can find more information about Six Sigma in Implementing Six Sigma: Smarter Solutions Using Statistical Methods (2003) by Forrest Breyfogle III.
12 January/February 2007 Issue at http://www.isixsigma-magazine.com/
Chapter 3: Data Mining for Forecasting Infrastructure
3.2.1 Personal Computers Network Infrastructure
3.2.2 Client/Server Infrastructure
3.2.3 Cloud Computing Infrastructure
3.3.1 Data Collection Software
3.3.2 Data Preparation Software
3.3.5 Software Selection Criteria