Applied Data Mining for Forecasting Using SAS. Tim Rey
2.1 Introduction
This chapter describes a generic work process for implementing data mining in real-world forecasting applications. By work process the authors mean a sequence of steps that lead to effective project management. Defining and optimizing work processes is a must in industrial applications. Adopting such a systematic approach is critical in order to solve complex problems and introduce new methods. Using work processes increases productivity and leverages experience in a consistent and effective way. One common mistake practitioners make is jumping into real-world forecasting applications while focusing only on technical knowledge and ignoring the organizational and people-related issues. It is the authors' opinion that applying forecasting in a business setting without a properly defined work process is a clear recipe for failure.
The work process presented here includes a broader set of steps than the specific steps related to data mining and forecasting. It includes all necessary action items to define, develop, deploy, and support forecasting models. First, a generic flowchart and a description of the key steps are given in the next section, followed by a specific illustration of the work process sequence when using different SAS tools. The last section is devoted to the integration of the proposed work process into one of the most popular business processes widely accepted in industry: Six Sigma.
2.2 Work Process Description
The objective of this section is to give the reader a condensed description of the necessary steps to run forecasting projects in the real world. We begin with a high-level overview of the whole sequence as a generic flowchart. Each key step in the work process is described briefly with its corresponding substeps and specific deliverables.
2.2.1 Generic Flowchart
The generic flowchart of the work process for developing, deploying, and maintaining a forecasting project based on data mining is shown in Figure 2.1. The proposed sequence of action items includes all of the steps necessary for successful real-world applications, from defining the business objectives to organizing a reliable maintenance program and tracking the performance of the applied forecasting models.
Figure 2.1: A generic flowchart of the proposed work process
The forecasting project begins with a project definition phase. It gives a well-defined framework for approving the forecasting effort based on well-described business needs, allocated resources, and approved funding. As most practitioners already know, the next block, data preparation, often takes most of the time and the lion's share of the cost. It usually requires extracting data from internal and external sources and applying a range of techniques to transform the initial disarray in the data into a time series database acceptable for modeling and forecasting. The appropriate techniques are discussed in detail in Chapters 5 and 6.
The block for variable reduction and selection captures the corresponding activities, such as various data mining and modeling methods, that narrow the initial broad range of potential inputs (Xs) driving the targeted forecasting variables (outputs, Ys) to a short list of the most statistically significant factors. The next block includes the various forecasting techniques that generate the models for use. Usually, it takes several iterations through these blocks until the appropriate forecasting models are selected, reliably validated, and presented to the final user. This last step, presenting the models to the final user, requires an effective consensus-building process with all stakeholders. This loop is called the model development cycle.
The last three blocks in the generic flowchart in Figure 2.1 represent the key activities when the selected forecasting models are transferred from a development environment to a production mode. This requires automating some steps in the model development sequence, including the monitoring of data quality and forecasting performance. Of critical importance are tracking the business performance metric, as defined by its key performance indicators (KPIs), and tracking the model performance metric, as defined by forecasting accuracy criteria. This loop is called the model deployment cycle, in which the fate of the model depends on the rate of model performance degradation. In the worst-case scenario of consistent performance degradation, the whole model development sequence, including project definition, might be revised and executed again.
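As a rough illustration of narrowing many candidate inputs to a short list, the following Python sketch ranks hypothetical Xs by the absolute value of their Pearson correlation with a target Y and keeps the strongest ones. This is a deliberately simplified stand-in, not the method used in this book or in any SAS tool: it ignores lags, autocorrelation, and formal significance testing, and the function names and data are assumptions made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def screen_inputs(candidates, target, keep=2):
    """Return the names of the `keep` inputs most correlated (in absolute value) with the target."""
    scores = {name: abs(pearson(series, target)) for name, series in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Hypothetical target (Y) and candidate drivers (Xs), invented for illustration.
y = [10, 12, 15, 19, 24, 30]
xs = {
    "price_index": [1, 2, 3, 4, 5, 6],
    "rainfall": [5, 1, 4, 2, 6, 3],
    "housing_starts": [30, 27, 25, 20, 14, 9],
}

shortlist = screen_inputs(xs, y, keep=2)
print(shortlist)
```

Ranking by absolute correlation keeps strongly negative drivers as well as strongly positive ones, which is why the hypothetical `housing_starts` series survives the screen while `rainfall` does not.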
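One way to picture the tracking of model performance in the deployment cycle is a small check on forecast accuracy over time. The Python sketch below is an illustration under stated assumptions, not a prescribed implementation: it uses mean absolute percentage error (MAPE) as the accuracy criterion, and the threshold, the rule of three consecutive breaches, and all names and numbers are hypothetical choices made for the example.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error in percent; periods with a zero actual are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def needs_redevelopment(mape_history, threshold, consecutive=3):
    """True when the most recent `consecutive` MAPE values all exceed the threshold."""
    recent = mape_history[-consecutive:]
    return len(recent) == consecutive and all(m > threshold for m in recent)


# Hypothetical actuals and forecasts for one reporting period.
period_error = mape([100, 200, 400], [90, 220, 400])

# Hypothetical history of monthly MAPE values (in percent).
monthly_mape = [4.0, 5.5, 11.0, 12.5, 13.0]
flag = needs_redevelopment(monthly_mape, threshold=10.0)
print(round(period_error, 2), flag)
```

Requiring several consecutive breaches, rather than reacting to a single bad period, reflects the idea that only consistent degradation should send the project back into the model development cycle.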
2.2.2 Key Steps
Each block of the work process is described by defining the related activities and detailed substeps. In addition, the expected deliverables are discussed and illustrated with examples when appropriate.
Project definition steps
The first key step in the work process, project definition, builds the basis for forecasting applications. It is the least formalized step in the sequence and requires proactive communication skills, effective teamwork, and accurate documentation. The key objectives are to define the business motivation for starting the forecasting project and to set up as much structure as possible in the problem by effective knowledge acquisition. This is to be done well before beginning the technical work. The corresponding substeps to accomplish this goal, as well as the expected deliverables from the project definition phase, are described below.
Project objectives definition
This is one of the most important and most often mishandled substeps in the work process. A key challenge is defining the economic impact from the improved forecasts through KPIs such as reduced cost, increased productivity, increased market share, and so on. In the case of demand-driven forecasting, it is all about getting the right product to the right customer at the right time for the right price. Thus, the value benefits can be defined as any of the following (Chase 2009):
a reduction in the instances when retailers run out of stock
a significant reduction in customer back orders
a reduction in the finished goods inventory carrying costs
consistently high levels of customer service across all products and services
It is strongly recommended to quantify each of these benefits (for example, “a 15% reduction in customer back orders on an annual basis relative to the accepted benchmark”).
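A quantified benefit of this kind can be checked with simple arithmetic. The Python sketch below computes the percentage reduction relative to a benchmark and compares it with a target; the back-order counts are hypothetical values invented for illustration, and the function names are assumptions rather than anything defined in this book.

```python
def percent_reduction(benchmark, observed):
    """Reduction of `observed` relative to `benchmark`, in percent."""
    if benchmark <= 0:
        raise ValueError("benchmark must be positive")
    return 100.0 * (benchmark - observed) / benchmark


def meets_target(benchmark, observed, target_pct):
    """True when the achieved reduction is at least the target percentage."""
    return percent_reduction(benchmark, observed) >= target_pct


# Hypothetical annual back-order counts: accepted benchmark vs. the year under review.
benchmark_back_orders = 1200
observed_back_orders = 1000

reduction = percent_reduction(benchmark_back_orders, observed_back_orders)
achieved = meets_target(benchmark_back_orders, observed_back_orders, target_pct=15.0)
print(round(reduction, 1), achieved)
```

Stating the benchmark explicitly matters: the same drop of 200 back orders is a 16.7% reduction against a benchmark of 1,200 but would fall short of a 15% target against a benchmark of 1,400.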
An example of an appropriate business objective for a forecasting project follows:
More accurate forecasts will lead to proactive business decisions that will consistently increase