Applied Data Mining for Forecasting Using SAS. Tim Rey

at least 10% for the next three years.

      Another challenge is finding a forecasting performance metric that is measurable, can be tracked, and is appropriate for defining success. An example of an appropriate quantitative objective that satisfies these conditions is the following definition:

      The technical objective of the project is to develop, deploy, and support, for at least three years, a quarterly forecasting model that projects the price of Product A over a two-year horizon and that outperforms the accepted statistical benchmark (naïve forecasting in this case) by 20%, based on the average of the last four consecutive quarterly forecasts.

      The key challenge, however, is ensuring that the defined technical objective (improved forecasting) will lead to accomplishing the business goal (increased profitability).
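
      A quantitative objective of this kind can be checked mechanically. The sketch below is only an illustration under stated assumptions: the text does not name the error metric, so mean absolute percentage error is assumed; the naïve benchmark is taken to be "the forecast equals the previous quarter's actual price"; only one-step-ahead forecasts are scored rather than the full two-year horizon; and all prices are made-up numbers.

```python
from statistics import mean


def ape(actual, forecast):
    """Absolute percentage error of a single forecast (assumed metric)."""
    return abs(actual - forecast) / abs(actual)


def improvement_over_naive(actuals, model_forecasts, last_n=4):
    """Relative error reduction of the model versus the naive benchmark.

    actuals: observed quarterly prices, oldest first.
    model_forecasts: the model's forecast for each quarter in `actuals`.
    Only the last `last_n` quarters are scored; the naive forecast for a
    quarter is the actual price of the quarter before it.
    """
    if len(actuals) != len(model_forecasts):
        raise ValueError("actuals and model_forecasts must have equal length")
    if len(actuals) < last_n + 1:
        raise ValueError("need one extra quarter of history for the naive forecast")
    scored = range(len(actuals) - last_n, len(actuals))
    model_err = mean(ape(actuals[i], model_forecasts[i]) for i in scored)
    naive_err = mean(ape(actuals[i], actuals[i - 1]) for i in scored)
    if naive_err == 0:
        raise ValueError("naive benchmark has zero error; ratio is undefined")
    return 1.0 - model_err / naive_err


# Hypothetical quarterly prices and model forecasts (illustrative only).
actuals = [100.0, 104.0, 110.0, 108.0, 115.0]
model_forecasts = [99.0, 103.0, 108.0, 109.0, 113.0]

improvement = improvement_over_naive(actuals, model_forecasts)
meets_objective = improvement >= 0.20  # the 20% target from the objective
print(f"improvement over naive: {improvement:.1%}, target met: {meets_objective}")
```

      Expressing the target as a ratio against the benchmark, rather than as an absolute error level, keeps the objective meaningful even when price volatility changes from year to year.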

      Project scope definition

      The forecasting project scope also needs to be defined as specifically as possible. It usually includes the business geography boundaries, business envelope, market segments covered, data history limits, forecasting frequency, and work process requirements. For example, the project scope might include boundaries such as the following: the developed forecasting model will predict the prices of Product A in Germany based on internal sales records. The internal historical data starts in January 2001 and is quarterly. The project must be implemented as a Six Sigma project, following the standard requirements for model deployment, with the support of the Information Technologies department.

      Project roles definition

      Identifying the appropriate stakeholders is another important substep in ensuring the success of forecasting projects. For a typical large-scale business forecasting project, the following stakeholders are recommended as members of the project team:

       the management sponsor who provides the project funding

       the project owner who has the authority to allow changes in the existing business process

       the project leader who coordinates all project activities

       the model developers who develop, deploy, and maintain the models

       the subject matter experts (SMEs) who know the business process and the data

       the users who use the forecasting models on a regular basis

      System structure and data identification

      The purpose of this substep is to capture and document the available knowledge about the system under consideration. This step provides a meaningful context for the necessary data and the data mining and forecasting steps. Knowledge acquisition usually takes several brainstorming sessions facilitated by model developers and attended by selected subject matter experts. The documentation may include process descriptions, market structure studies, system diagrams and process maps, relationship maps, etc. The authors' favorite technique for system structure and data identification is mind-mapping, which is a very convenient way of capturing knowledge and representing the system structure during the brainstorming sessions.

      Mind-mapping (or concept mapping) involves writing down a central idea and thinking up new and related ideas that radiate out from the center.¹ By focusing on key topics written down in the SMEs' own words, and then defining branches and connections between the topics, the knowledge of the SMEs can be mapped in a way that aids understanding and documents the details needed for future data and modeling activities. An example of a mind-map² for system structure and data identification in the case of a forecasting project for Product A is shown in Figure 2.2.

      The system structure, shown in the mind-map in Figure 2.2, includes three levels. The first level represents the key topics related to the project by radial branches from the central block named “Product A Price Forecasting.” In this case, according to the subject matter experts, the central topics are: Data, Competitors, Potential drivers, Business structure, Current price decision-making process, and Potential users. Each key topic can be structured in as many levels of detail as necessary. However, beyond three levels down, the overall system structure visualization becomes cumbersome and difficult to understand. An example of an expanded structure of the key topic Data down to the third level of detail is shown in Figure 2.2. The second level includes the two key types of data – internal and external. The third level of detail in the mind-map captures the necessary topics related to the internal and external data. All other key topics are represented in a similar way (not shown in Figure 2.2). The different levels of detail are selected by collapsing or expanding the corresponding blocks or the whole mind-map.
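
      The collapse-and-expand behavior described above can be mimicked with a nested dictionary. This is a minimal sketch, not the mind-mapping tool used by the authors: the topic names come from the text, the third-level entries under Data are left empty because the text does not list them, and the `render` function and the `[+]` marker for a collapsed block are illustrative inventions.

```python
# Each key is a topic; its value holds the sub-topics one level down.
mind_map = {
    "Product A Price Forecasting": {
        "Data": {
            "Internal": {},  # third-level topics not listed in the text
            "External": {},
        },
        "Competitors": {},
        "Potential drivers": {},
        "Business structure": {},
        "Current price decision-making process": {},
        "Potential users": {},
    }
}


def render(tree, max_depth, depth=0):
    """Return indented lines for the map, expanded down to `max_depth`.

    The central block is depth 0. A block that has sub-topics hidden by the
    depth limit is shown with a trailing "[+]" to signal it is collapsed.
    """
    lines = []
    for topic, children in tree.items():
        collapsed = bool(children) and depth >= max_depth
        lines.append("  " * depth + topic + (" [+]" if collapsed else ""))
        if not collapsed:
            lines.extend(render(children, max_depth, depth + 1))
    return lines


# First level only: the six key topics, with Data shown as collapsed.
print("\n".join(render(mind_map, max_depth=1)))
```

      Calling `render(mind_map, max_depth=2)` expands the Data block to show the internal and external branches while leaving every other key topic as it is.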

      Figure 2.2: Mind-map for system structure and data identification in the Product A forecasting project

      Project definition deliverables

      The deliverables in this step are: (1) project charter, (2) team composition, and (3) approved funding. The most important deliverable in project definition is the charter. It is a critical document that in many cases decides the fate of the project. Writing a good charter is an iterative process that gradually reduces the uncertainty related to objectives, deliverables, and available data. The common rule of thumb is this: the less fuzzy the objectives and the more specific the language, the higher the probability of success. An example of the structure of this document in the case of the Product A forecasting project is given in the Appendix at the end of this chapter.

      The ideal team composition is shown in the corresponding charter section in the Appendix. In the case of some specific work processes, such as Six Sigma, the roles and responsibilities are well defined in generic categories like green belts, black belts, master black belts, and so on.

      The most important practical deliverable in the project definition step is committed financial support for the project, since this is when the real project work begins. No funding, no forecasting. It is as simple as that.

      Data preparation steps

      Data preparation includes all the procedures needed to explore, clean, and preprocess the previously extracted data so that model development begins with the maximum possible information content in the data.³ In reality, data preparation is time consuming, nontrivial, and difficult to automate. Very often it is also the most expensive phase of applied forecasting in terms of time, effort, and cost. External data might need to be purchased, which can be a significant part of the project cost. The key data preparation substeps and deliverables are discussed briefly below. The detailed description of this step is given in Chapters 5 and 6.

      Data collection

      The initial data collection is commonly driven by the data structure recommended by the subject matter experts in the system structure and data identification step. Data collection includes identifying the internal and external data sources, downloading the data, and then harmonizing the data into a consistent time series database format.
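
      The harmonization step can be illustrated with a small sketch. It assumes, purely for illustration, that the internal series is already quarterly, that an external series arrives monthly, that a quarterly value is the average of its three months, and that incomplete quarters are dropped. The function names and all values are hypothetical and do not describe the authors' actual procedure.

```python
from collections import defaultdict
from statistics import mean


def quarter_of(year, month):
    """Map a calendar month to its (year, quarter) key."""
    return (year, (month - 1) // 3 + 1)


def monthly_to_quarterly(monthly):
    """Average a {(year, month): value} series into {(year, quarter): value}.

    Quarters with fewer than three monthly observations are dropped so that
    partial quarters do not enter the harmonized table.
    """
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        buckets[quarter_of(year, month)].append(value)
    return {q: mean(values) for q, values in buckets.items() if len(values) == 3}


def harmonize(internal_quarterly, external_quarterly):
    """Join two quarterly series on the quarters they share, sorted by time."""
    common = sorted(set(internal_quarterly) & set(external_quarterly))
    return [(q, internal_quarterly[q], external_quarterly[q]) for q in common]


# Hypothetical data: internal quarterly price and an external monthly index.
internal_price = {(2001, 1): 100.0, (2001, 2): 104.0}
external_index_monthly = {
    (2001, 1): 50.0, (2001, 2): 52.0, (2001, 3): 54.0,
    (2001, 4): 55.0, (2001, 5): 57.0,  # second quarter is incomplete
}

table = harmonize(internal_price, monthly_to_quarterly(external_index_monthly))
for quarter, price, index in table:
    print(quarter, price, index)
```

      Keeping only the quarters present in both series gives every row of the harmonized table a value for each variable, at the cost of discarding the most recent partial period.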

      In the case of the example for Product A price forecasting, data collection includes the following specific actions:

       identifying the data mart that stores the internal data

       identifying the specific services and tags of the external time series available in Global Insights (GI), Chemical Market Associates, Inc. (CMAI), Bloomberg, and so on

       collecting the internal data, which is generally done by the business data SMEs

       collecting the external data, which is done with the help of local GI or CMAI service experts

       harmonizing the collected internal and external data as a consistent time series database of the prescribed

