Wednesday, March 29, 2017

71. PICKUP MODELS

OBJECTIVE

Forecast the demand for the next periods.


DESCRIPTION

Pickup models are advanced forecasting models used in revenue management, and they apply to businesses whose services are booked in advance (airlines, hotels, theaters, etc.). The forecast takes the current bookings for a given future period and adds an estimate of the incremental bookings (the pickup) expected between now and that period (e.g. the departure date or check-in date).

The pickup is estimated from past data. It can be the average pickup for a specific anticipation (x days before), which is added to the current bookings, or an average pickup ratio (total bookings / bookings x days before), which is multiplied by the current bookings:

BDB0 = BDBX + PUDB(X,0)
BDB0 = BDBX × PURDB(X,0)

In the first formula (additive pickup), the bookings on anticipation day 0 (BDB0) are equal to the current bookings (BDBX) plus the average pickup bookings between anticipation day x and anticipation day 0 (PUDB(X,0)). In the second formula (multiplicative pickup), the forecast is calculated by multiplying the current bookings by the average pickup ratio (PURDB(X,0)).
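As a minimal sketch of the two formulas, the Python snippet below computes both forecasts from a small set of past booking curves. The booking figures, the dictionary layout, and the function names are hypothetical and chosen only for illustration.

```python
from statistics import mean

def additive_pickup(history, x):
    """Average bookings gained between x days before and day 0: PUDB(X,0)."""
    return mean(curve[0] - curve[x] for curve in history)

def pickup_ratio(history, x):
    """Average of final bookings / bookings x days before: PURDB(X,0)."""
    return mean(curve[0] / curve[x] for curve in history)

# Hypothetical booking curves of three comparable past departures.
# Keys are days before departure, values are bookings on hand that day.
history = [
    {7: 50, 0: 100},
    {7: 80, 0: 120},
    {7: 40, 0: 100},
]

x = 7
current_bookings = 60  # BDBX for a future departure that is 7 days away

forecast_additive = current_bookings + additive_pickup(history, x)
forecast_multiplicative = current_bookings * pickup_ratio(history, x)

print(forecast_additive, forecast_multiplicative)
```

With these made-up figures the two methods disagree (110 versus 120 bookings), which illustrates why the choice of method and of comparable past periods matters.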

To obtain good forecasts, it is very important to calculate either the average pickup or the pickup ratio carefully. Take an airline as an example. If we are using the average pickup, we should take into consideration seasonality at different levels: time of the day, day of the week, month, holidays, and so on. The number of incremental bookings depends heavily on the demand for a specific departure time, so we should calculate the average using similar days. If we are using the average pickup ratio, we do not have this problem, but we may discover that the booking pace differs between departure periods; for example, people may book further in advance for summer departures. In this case it is important to calculate the average pickup ratio from similar departure periods.

We can also go a step further and allow for the booking pace changing over time, either because customer behavior has changed or because we have widened the booking horizon. For example, we can calculate a trend in the pickup ratio and adjust it for the prediction of future periods.
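One possible way to adjust the pickup ratio for a trend is to fit a straight line to the ratios observed in past periods and extrapolate it one period ahead. The sketch below assumes a linear trend and uses hypothetical ratios; other trend models may fit real data better.

```python
def linear_trend(values):
    """Least-squares intercept and slope for values observed at t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    intercept = v_mean - slope * t_mean
    return intercept, slope

# Hypothetical average pickup ratios (7 days before) for the same
# departure period in four consecutive past years.
ratios = [2.0, 1.9, 1.8, 1.7]

intercept, slope = linear_trend(ratios)
next_ratio = intercept + slope * len(ratios)  # extrapolate one period ahead

current_bookings = 60
forecast = current_bookings * next_ratio
print(round(next_ratio, 3), round(forecast, 1))
```

Here the falling ratio would mean that customers are booking earlier each year, so a plain average of past ratios would overestimate the remaining pickup.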



TEMPLATE



Thursday, March 16, 2017

35. DESCRIPTIVE STATISTICS

OBJECTIVE

Analyze the distribution of one or several variables in a data set.


DESCRIPTION

In statistical analysis the first step is to explore the available data. This step also helps to check for outliers and to verify the assumptions, such as normality, that a particular statistical model or test requires (see 36. INTRODUCTION TO REGRESSIONS). Since the analysis of these assumptions is included in the chapter introducing regressions, here I will focus on the descriptive statistics that are useful for describing numeric variables:


Statistic | Description
Mean | Arithmetic mean of the data
Standard Error | Standard deviation divided by the square root of the number of values; it estimates how far the sample mean is likely to be from the population mean
Median | Central value (the value that divides the data in two; with an even number of values, the median is the mean of the two central values)
Mode | Most frequent value
Standard Deviation | A measure of how values are spread out. Mathematically, it is the square root of the variance
Sample Variance | Sum of the squared differences between each value and the mean, divided by the number of values minus one (it is also a measure of how values are spread out)
Kurtosis | A measure of the “peakedness” and flatness of the distribution*. “0” means that the shape is that of a normal distribution, a flatter distribution has negative kurtosis, and a more peaked distribution has positive kurtosis
Skewness | A measure of the symmetry of the distribution. “0” means that the distribution is symmetrical. If the value is negative, the distribution has a long tail on the left, and if it is positive, it has a long tail on the right. As a rule of thumb, a distribution is considered to be symmetrical if the skewness is between -1 and 1
Range | The difference between the largest and the smallest value
Minimum | The smallest value
Maximum | The largest value
Sum | The sum of values
Count | The number of values
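For readers working outside Excel, the statistics in the table can be sketched with Python's standard library. The age values are hypothetical, and the skewness and kurtosis lines use the sample-adjusted formulas that I assume correspond to Excel's SKEW and KURT functions.

```python
import statistics as st
from math import sqrt

# Hypothetical sample of ages, used only for illustration
ages = [18, 30, 30, 40, 50, 52, 60]

n = len(ages)                        # Count
total = sum(ages)                    # Sum
mean = st.mean(ages)                 # Mean
median = st.median(ages)             # Median
mode = st.mode(ages)                 # Mode (most frequent value)
variance = st.variance(ages)         # Sample variance (divides by n - 1)
std_dev = st.stdev(ages)             # Standard deviation
std_error = std_dev / sqrt(n)        # Standard error of the mean
value_range = max(ages) - min(ages)  # Range

# Standardized deviations, reused by skewness and kurtosis
z = [(a - mean) / std_dev for a in ages]

# Sample-adjusted skewness and excess kurtosis
# (assumed to match Excel's SKEW and KURT)
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(mean, median, mode, round(std_dev, 2), round(std_error, 2))
print(round(skewness, 3), round(kurtosis, 3))
```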

As shown in the template, these statistics can be calculated using either the Excel add-in “Data Analysis” or Excel functions. The same is valid for creating a histogram, with which we can analyze the frequency of values and gain an idea of the type of distribution. In Figure 30 a sample of age data is represented in a histogram. On the right a box plot provides more information by dividing our data into quartiles (4 groups, each containing 25% of the values). The plot shows that 50% of people are aged approximately between 33 and 46 years, while the rest are spread across a wider range of ages (25% from 46 to 64 and 25% from 18 to 33).


Figure 30. Histogram and Box Plot

In the template we can see how the two graphs have been created. For the histogram we need to decide which age groups we want to use and fill a table with them. Then we can enter the formula “=FREQUENCY” as an array formula by selecting all the cells to the right of the age groups and pressing “SHIFT + CONTROL + ENTER,” and the formula will provide the frequencies. For the box plot, if we have a version older than Excel 2016, we have to make some calculations and perform some tricks using a normal column chart. The template and several tutorials can be consulted on the Internet.
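The same two calculations can be sketched in Python. The function below counts values per age group, treating each bin value as an inclusive upper limit and adding one final count for values above the last bin, which is how I understand Excel's FREQUENCY to behave. The quartiles use the “inclusive” method of the standard library. The ages and bins are hypothetical.

```python
from bisect import bisect_left
import statistics as st

def frequency(values, bins):
    """Count values per bin; bins must be sorted upper limits (inclusive).

    Returns len(bins) + 1 counts; the last one holds values above the top bin.
    """
    counts = [0] * (len(bins) + 1)
    for v in values:
        counts[bisect_left(bins, v)] += 1
    return counts

# Hypothetical ages and age groups (upper limit of each group)
ages = [18, 22, 29, 33, 35, 38, 40, 42, 46, 51, 58, 64]
bins = [29, 39, 49, 59]

counts = frequency(ages, bins)
print(counts)  # -> [3, 3, 3, 2, 1]

# Quartile cut points for a box plot (min and max complete the five numbers)
q1, q2, q3 = st.quantiles(ages, n=4, method="inclusive")
print(min(ages), q1, q2, q3, max(ages))
```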

Finally, we may have to identify which theoretical distribution our data approximate most closely (for example to conduct a Monte Carlo simulation). There is no specific method, but we can start by using a histogram and comparing the shape of our data with the shapes of theoretical distributions. The following URL provides 22 Excel templates with graphs and data of different distributions: http://www.quantitativeskills.com/sisa/rojo/distribs.htm.

If our variables are categorical, we can analyze them using a frequency table (count and percentage frequencies). We can also analyze the distribution of frequencies. If our variables are ordinal, we should use the same method as for categorical variables (for example, if the categories are the answers to a satisfaction question with ordinal answers like “very bad,” “bad,” etc.). However, in some cases we may want to analyze ordinal variables with the statistics used for numerical ones (for example, if we are analyzing answers to a question about the quality of services on a scale from 1 to 10, it can be interesting to calculate the average score, range, etc.).
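A frequency table for a categorical or ordinal variable can be sketched as follows. The answers and the category labels are hypothetical; the categories are listed in their ordinal order so that the table keeps it.

```python
from collections import Counter

# Hypothetical answers to a satisfaction question
answers = ["good", "bad", "good", "very good",
           "good", "very bad", "bad", "good"]

# Ordinal order of the categories, used to sort the table
scale = ["very bad", "bad", "good", "very good"]

counts = Counter(answers)
total = len(answers)

# One row per category: (category, count, percentage of all answers)
table = [(c, counts[c], 100 * counts[c] / total) for c in scale]

for category, count, pct in table:
    print(f"{category:<10}{count:>3}{pct:>7.1f}%")
```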



TEMPLATE




* Even though kurtosis has traditionally been explained in terms of peakedness/flatness, this has been shown to be incorrect, since it is the tails that mostly account for it, not the central peak. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4321753/



Thursday, March 9, 2017

16. SWOT ANALYSIS

OBJECTIVE

Concerning a company, identify the main strengths to maintain, opportunities to exploit, threats to reduce, and weaknesses to manage.


DESCRIPTION

The SWOT analysis is the consolidation of internal and external analyses, and it is used for strategy definition, usually in its first stage, to establish the bases of several strategic actions:

  • External analysis: analysis of the opportunities and threats resulting from the external environment (PEST, PESTEL); for example, the increasing use of mobile devices can be an opportunity to exploit, or the increasing cost of energy can be a potential threat.
  • Internal analysis: internal strengths and weaknesses resulting from the internal analysis (MOST, resource audit, etc.) and from competitive analysis (competitive map, importance-performance matrix). For example, a well-known brand is a strength and a poorly defined company strategy is a weakness.


SWOT Analysis


From this matrix the analyst can define strategic actions that exploit existing opportunities, using the company’s strengths as critical success factors and reducing the risks posed by potential threats and the company’s weaknesses. Since this is a consolidation of several other analyses, the sources of data depend on the previous analyses and on several gathering techniques, such as brainstorming, the Delphi method, and surveys. The consolidation can be performed directly by the analyst, but it is usually good practice to consolidate the results together with other company members or to submit the analysis to them, for example through workshops or think tanks.



TEMPLATE


Wednesday, March 1, 2017

42. BAYESIAN APPROACH TO HYPOTHESIS TESTING

OBJECTIVE

Identify the probability of a hypothesis using the probability of related events.


DESCRIPTION

This probabilistic approach is often used in logic tests, which may ask you to solve a problem such as the following:

0.5% of the population suffers from a certain disease, and people with this disease who take a clinical test are diagnosed correctly in 90% of cases. You also know that on average 10% of the tests on people who do not have the disease are false positives. A person has been diagnosed as positive; what is the probability that he actually has the disease?

This problem is solved by finding the percentage of true positives (the test is positive and the person has the disease) in the total number of positive tests, and it can be approached by drawing the following matrix and calculating the missing percentages:


Matrix of the Bayesian Inference Method


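The true positives are 0.5% × 90% = 0.45% of the population and the false positives are 99.5% × 10% = 9.95%, so positive tests add up to 10.40%. The answer is given by dividing 0.45% by 10.40%, giving a 4.33% probability of that person being sick. More formally, the problem is resolved by the following equation: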

P(A|B) = P(B|A) × P(A) / P(B)
where P is the probability, A is having the disease, and B is when the test is positive; therefore, P(A|B) is the probability of having the disease when the test is positive and P(B|A) is the probability of obtaining a positive test when the patient has the disease. The template shows how the values have been calculated starting from the proposed problem.
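The calculation can be sketched in a few lines of Python. The variable names are mine, and the sketch reads the 10% false positive rate as conditional on not having the disease, that is, as P(B|not A).

```python
prior = 0.005               # P(A): share of the population with the disease
sensitivity = 0.90          # P(B|A): positive test given the disease
false_positive_rate = 0.10  # P(B|not A): positive test without the disease

# Cells of the probability matrix, as shares of the whole population
true_positive = prior * sensitivity                 # disease and positive test
false_positive = (1 - prior) * false_positive_rate  # no disease, positive test

p_positive = true_positive + false_positive         # P(B): all positive tests
posterior = true_positive / p_positive              # P(A|B), Bayes' theorem

print(f"P(B) = {p_positive:.2%}, P(A|B) = {posterior:.2%}")
```

Even with a test that is right 90% of the time, the low prevalence of the disease means that most positive results come from healthy people.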



TEMPLATE