Demand Estimation and Forecasting


The first question that arises is: what is the difference between demand estimation and demand forecasting? Estimation attempts to quantify the links between the level of demand and the variables that determine it. Forecasting, on the other hand, attempts to predict the overall level of future demand rather than looking at specific linkages. For this reason the set of techniques used may differ, although there will be some overlap between the two.

In general, an estimation technique can be used to forecast demand, but a forecasting technique cannot be used to estimate demand. A manager who wishes to know how high demand is likely to be in two years’ time might use a forecasting technique. A manager who wishes to know how the firm’s pricing policy could be used to generate a given increase in demand would use an estimation technique.

The firm needs information about likely future demand in order to pursue an optimal pricing strategy. It can only charge a price that the market will bear if it is to sell the product. On the one hand, over-optimistic estimates of demand may lead to an excessively high price and lost sales. On the other hand, over-pessimistic estimates of demand may lead to a price that is set too low, resulting in lost profits. The more accurate the information the firm has, the less likely it is to take a decision that will have a negative impact on its operations and profitability.

The level of demand for a product will also influence the decisions the firm takes regarding the non-price factors that form part of its overall competitive strategy. For example, the level of advertising it carries out will be determined by the perceived need to stimulate demand for the product. As advertising expenditure represents an additional cost to the firm, unnecessary spending in this area needs to be avoided. If the firm’s expectations about demand are too low, it may try to compensate by spending large sums on advertising, money which in this instance may be at least partly wasted. Alternatively, it may decide to redesign the product in response, thus incurring unnecessary additional costs in the form of research and development expenditure.

In the previous unit, demand analysis was introduced as a tool for managerial decision-making. For example, it was shown that knowledge of price and cross elasticities can assist managers in pricing and that income elasticities provide useful insights into how demand for a product will respond to different macroeconomic conditions. We assumed that these elasticities were known or that the data were already available to allow them to be easily computed. Unfortunately, this is not usually the case. For many business applications, the manager who desires information about elasticities must develop a data set and use statistical methods to estimate a demand equation from which the elasticities can then be calculated. This estimated equation can then also be used to predict demand for the product, based on assumptions about prices, income, and other factors. In this unit the basic techniques of demand estimation and forecasting are introduced.

ESTIMATING DEMAND USING REGRESSION ANALYSIS

The basic regression tools discussed in Block 1 can also be used to estimate demand relationships. Consider a small restaurant chain specializing in Chinese dinners. The business has collected information on prices and the average number of meals served per day for a random sample of eight restaurants in the chain. These data are shown below. Use regression analysis to estimate the coefficients of the demand function $Q_d = a + bP$. Based on the estimated equation, calculate the point price elasticity of demand at the mean values of the variables.

Solution: The mean values of the variables are $\bar{Q} = 100$ and $\bar{P} = 160$. The other data needed to calculate the coefficients of the demand equation are shown below.

As shown, the sum of the $(P_i - \bar{P})^2$ is 4000 and the sum of the $(P_i - \bar{P})(Q_i - \bar{Q})$ is -1750. Thus, using the equations for calculating $\hat{b}$ and $\hat{a}$, $\hat{b} = -1750/4000 = -0.4375$ and $\hat{a} = 100 - (-0.4375)(160) = 170$. Hence, the estimated demand equation is $Q_d = 170 - 0.4375P$. Recall from the previous unit that the formula for point price elasticity of demand is $E_p = (dQ/dP)(P/Q)$. Based on the estimated demand function, $dQ/dP = -0.4375$. Thus, using the mean values for the price and quantity variables, $E_p = (-0.4375)(160/100) = -0.7$.
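The arithmetic in this solution can be checked with a short script. The sketch below is illustrative only: the function names are hypothetical, and because the restaurant data table is not reproduced here, it works from the summary figures quoted in the text (the two sums and the two means) rather than from the eight raw observations.

```python
def estimate_linear_demand(sum_sq_price_dev, sum_cross_dev, mean_price, mean_quantity):
    """Least-squares intercept and slope of Qd = a + bP from summary sums.

    sum_sq_price_dev: sum of (P_i - mean P)^2
    sum_cross_dev:    sum of (P_i - mean P)(Q_i - mean Q)
    """
    slope = sum_cross_dev / sum_sq_price_dev
    # The fitted line passes through the point of means.
    intercept = mean_quantity - slope * mean_price
    return intercept, slope


def point_price_elasticity(slope, price, quantity):
    """Point elasticity Ep = (dQ/dP)(P/Q) for a linear demand curve."""
    return slope * price / quantity


# Summary figures quoted in the text for the restaurant example.
a_hat, b_hat = estimate_linear_demand(4000, -1750, mean_price=160, mean_quantity=100)
elasticity = point_price_elasticity(b_hat, price=160, quantity=100)

print(f"Qd = {a_hat:g} + ({b_hat:g})P, elasticity at the means = {elasticity:g}")
```

Because the absolute value of the elasticity is below 1, demand is inelastic at the mean price and quantity.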

EVALUATING THE ACCURACY OF THE REGRESSION EQUATION - REGRESSION STATISTICS

Once the parameters have been estimated, the strength of the relationship between the dependent variable and the independent variables can be measured in two ways. The first uses a measure called the coefficient of determination, denoted $R^2$, to measure how well the overall equation explains changes in the dependent variable. The second uses the t-statistic to test the strength of the relationship between an independent variable and the dependent variable.

Testing Overall Explanatory Power

Define the squared deviation of any $Y_i$ from the mean of Y [i.e., $(Y_i - \bar{Y})^2$] as the variation in Y. The total variation is found by summing these deviations for all values of the dependent variable:

total variation = $\sum (Y_i - \bar{Y})^2$

Total variation can be separated into two components: explained variation and unexplained variation. These concepts are explained below. For each $X_i$ value, compute the predicted value of $Y_i$ (denoted $\hat{Y}_i$) by substituting $X_i$ into the estimated regression equation:

$\hat{Y}_i = \hat{a} + \hat{b}X_i$

The squared difference between the predicted value $\hat{Y}_i$ and the mean value $\bar{Y}$ [i.e., $(\hat{Y}_i - \bar{Y})^2$] is defined as explained variation. The word explained means that the deviation of Y from its average value $\bar{Y}$ is the result of (i.e., is explained by) changes in X. For example, in the data on total output and cost used previously, one important reason the cost values are higher or lower than $\bar{Y}$ is that output rates ($X_i$) are higher or lower than the average output rate. Total explained variation is found by summing these squared deviations, that is,

total explained variation = $\sum (\hat{Y}_i - \bar{Y})^2$

Unexplained variation is the difference between $Y_i$ and $\hat{Y}_i$. That is, part of the deviation of $Y_i$ from the average value ($\bar{Y}$) is "explained" by the independent variable, X. The remaining deviation, $Y_i - \hat{Y}_i$, is said to be unexplained. Summing the squares of these differences yields

total unexplained variation = $\sum (Y_i - \hat{Y}_i)^2$

The three sources of variation are shown in Figure 6.1.
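To make the decomposition concrete, the following sketch fits a least-squares line and computes the three sums. The output and cost figures are hypothetical, chosen only for illustration (they are not the data behind Figure 6.1 or Table 6.1), and the function names are likewise assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat


def variation_components(xs, ys):
    """Return (total, explained, unexplained) variation for the fitted line."""
    a_hat, b_hat = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [a_hat + b_hat * x for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)
    explained = sum((f - y_bar) ** 2 for f in fitted)
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return total, explained, unexplained


# Hypothetical output rates (X) and total costs (Y), for illustration only.
output = [1, 2, 3, 4, 5]
cost = [12, 19, 29, 37, 45]

total, explained, unexplained = variation_components(output, cost)
print(f"total={total:.2f} explained={explained:.2f} unexplained={unexplained:.2f}")
```

For a least-squares line with an intercept, the explained and unexplained components add up to the total variation, apart from floating-point rounding.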

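The coefficient of determination ($R^2$) measures the proportion of total variation in the dependent variable that is "explained" by the regression equation. That is,

$R^2 = \dfrac{\text{total explained variation}}{\text{total variation}} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$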

The value of $R^2$ ranges from zero to 1. If the regression equation explains none of the variation in Y (i.e., there is no relationship between the independent variables and the dependent variable), $R^2$ will be zero. If the equation explains all the variation (i.e., total explained variation = total variation), the coefficient of determination will be 1. In general, the higher the value of $R^2$, the "better" the regression equation. The term fit is often used to describe the explanatory power of the estimated equation. When $R^2$ is high, the equation is said to fit the data well. A low $R^2$ would be indicative of a rather poor fit.
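The two limiting cases can be illustrated with a small sketch. The function name and the numbers are hypothetical; the calculation follows the definition given above, explained variation divided by total variation.

```python
def r_squared(actual, fitted):
    """Coefficient of determination: explained variation / total variation."""
    y_bar = sum(actual) / len(actual)
    total = sum((y - y_bar) ** 2 for y in actual)
    explained = sum((f - y_bar) ** 2 for f in fitted)
    return explained / total


# Hypothetical observations of the dependent variable.
actual = [10.0, 20.0, 30.0, 40.0]

# Case 1: the fitted values reproduce every observation exactly.
perfect_fit = list(actual)

# Case 2: the equation explains nothing, so every fitted value is the mean of Y.
mean_only = [sum(actual) / len(actual)] * len(actual)

print(r_squared(actual, perfect_fit))  # upper limit of the range
print(r_squared(actual, mean_only))    # lower limit of the range
```

Real regressions fall between these two extremes, and how high a value counts as a good fit depends on the kind of data being analyzed.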

How high must the coefficient of determination be in order that a regression equation be said to fit well? There is no precise answer to this question. For some relationships, such as that between consumption and income over time, one might expect $R^2$ to be at least 0.95. In other cases, such as estimating the relationship between output and average cost for fifty different producers during one production period, an $R^2$ of 0.40 or 0.50 might be regarded as quite good.

Based on the estimated regression equation for total cost and output, that is, $\hat{Y}_i = 87.08 + 12.21X_i$, the coefficient of determination can be computed using the data on sources of variation shown in Table 6.1.

The value of $R^2$ is 0.954, which means that more than 95 percent of the variation in total cost is explained by changes in output levels. Thus the equation would appear to fit the data quite well.

Evaluating the Explanatory Power of Individual Independent Variables

The t-test is used to determine whether there is a significant relationship between the dependent variable and each independent variable. This test requires that the standard deviation (or standard error) of the estimated regression coefficient be computed. The estimated relationship between a dependent variable and an independent variable is not fixed, because the estimate of b will vary for different data samples. The standard error of $\hat{b}$ from one of these regression equations provides an estimate of the amount of variability in $\hat{b}$. For a regression with one independent variable, the equation for this standard error is

$S_{\hat{b}} = \sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{(n - 2)\sum (X_i - \bar{X})^2}}$

where n is the number of observations. For the production-cost example used in this section, n = 7 and the standard error of $\hat{b}$ is 1.19, the value used in the confidence interval calculation that follows.
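As a rough illustration of how a standard error of this kind is obtained, the sketch below applies the usual textbook formula for a regression with a single independent variable: the square root of the total unexplained variation divided by (n - 2) times the sum of squared deviations of X from its mean. Both the use of that formula here and the data are assumptions made for illustration; the numbers are hypothetical and are not the seven production-cost observations.

```python
import math


def slope_standard_error(xs, ys):
    """Standard error of the least-squares slope in a one-variable regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    # Total unexplained variation: squared gaps between actual and fitted Y.
    unexplained = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
    # n - 2 degrees of freedom: two parameters (a and b) were estimated.
    return math.sqrt(unexplained / ((n - 2) * sxx))


# Hypothetical output rates (X) and total costs (Y), for illustration only.
output = [1, 2, 3, 4, 5]
cost = [12, 19, 29, 37, 45]

se_b = slope_standard_error(output, cost)
print(f"standard error of the slope = {se_b:.4f}")
```

A smaller standard error means the slope estimate would vary less from sample to sample, which tightens the confidence interval built around it.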

The least-squares estimate $\hat{b}$ is said to be an estimate of the parameter b. But $\hat{b}$ is subject to error and thus will differ from the true value of the parameter b. That is why $\hat{b}$ is called an estimate. Because of the variability in $\hat{b}$, it is sometimes useful to determine a range or interval for the estimate of the true parameter b. Using principles of statistics, a 95 percent confidence interval estimate for b is given by

$\hat{b} \pm t_{n-k-1} S_{\hat{b}}$

where $t_{n-k-1}$ represents the value of a particular probability distribution known as Student’s t distribution. The subscript (n - k - 1) refers to the number of degrees of freedom, where n is the number of observations or data points and k is the number of independent variables in the equation. An abbreviated list of t-values for use in estimating 95 percent confidence intervals is shown in Table 6.4. In the example discussed here, n = 7 and k = 1, so there are five (i.e., 7 - 1 - 1) degrees of freedom, and the value of t in the table is 2.571. Thus, in repeated estimations of the output-cost relationship, it is expected that about 95 percent of the time the true value of parameter b will lie in the interval defined by the estimated value of b plus or minus 2.571 times the standard error of $\hat{b}$. For the output-cost data, the 95 percent confidence interval estimate would be 12.21 ± 2.571(1.19), or from 9.15 to 15.27. Loosely speaking, one can be 95 percent confident that the true marginal relationship between cost and output (i.e., the value of b) lies within this range.

If there is no relationship between the dependent variable and an independent variable, the parameter b would be zero. A standard statistical test for the strength of the relationship between Y and X is to check whether the 95 percent confidence interval includes the value zero. If it does not, the relationship between X and Y as measured by $\hat{b}$ is said to be statistically significant. If that interval does include zero, then $\hat{b}$ is said to be nonsignificant, meaning that there does not appear to be a strong relationship between the two variables. The confidence interval for $\hat{b}$ in the output-cost example did not include zero, and thus it is said that $\hat{b}$, an estimate of marginal cost, is statistically significant or that there is a strong relationship between cost and rate of output.

Another way to make the same test is to divide the estimated coefficient ($\hat{b}$) by its standard error. The probability distribution of this ratio is the same as Student’s t distribution; thus this ratio is called a t-value. If the absolute value of this ratio is equal to or greater than the tabled value of t for n - k - 1 degrees of freedom, $\hat{b}$ is said to be statistically significant. Using the output-cost data, the t-value is computed to be

$t = \hat{b} / S_{\hat{b}} = 12.21 / 1.19 \approx 10.26$

Because the ratio is greater than 2.571, the value of the t-statistic from Table 6.2, it is concluded that there is a statistically significant relationship between cost and output. In general, if the absolute value of the ratio $\hat{b} / S_{\hat{b}}$ is greater than the value from the table for n - k - 1 degrees of freedom, the coefficient $\hat{b}$ is said to be statistically significant.
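Both versions of the significance test can be reproduced from the figures quoted in the text: the estimated coefficient 12.21, its standard error 1.19, and the tabled t-value 2.571 for five degrees of freedom. The sketch below is illustrative only, and its function names are hypothetical.

```python
def confidence_interval(b_hat, std_error, t_critical):
    """Interval b_hat +/- t_critical * std_error."""
    margin = t_critical * std_error
    return b_hat - margin, b_hat + margin


def t_ratio(b_hat, std_error):
    """Estimated coefficient divided by its standard error."""
    return b_hat / std_error


# Figures quoted in the text for the output-cost example.
b_hat, std_error = 12.21, 1.19
n, k = 7, 1
degrees_of_freedom = n - k - 1   # 5
t_critical = 2.571               # tabled 95 percent value for 5 degrees of freedom

low, high = confidence_interval(b_hat, std_error, t_critical)
t_value = t_ratio(b_hat, std_error)

# Test 1: the coefficient is significant if the interval excludes zero.
significant_by_interval = not (low <= 0.0 <= high)
# Test 2: the coefficient is significant if |t| meets or exceeds the tabled value.
significant_by_ratio = abs(t_value) >= t_critical

print(f"95% interval: {low:.2f} to {high:.2f}")
print(f"t-value: {t_value:.2f}")
print(significant_by_interval, significant_by_ratio)
```

The two tests always agree, because the interval excludes zero exactly when the absolute value of the coefficient exceeds the tabled t-value times the standard error.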
