Friday, March 20, 2015

Testing of Hypothesis in Business Analytics: An Analogy from Everyday Life


One advanced technique you need to learn in business analytics is how to test your hypotheses. Learning how to test a hypothesis is important for analysts because they will use the process in many situations, such as when testing correlation, testing regression coefficients, testing parameter estimates in time-series analysis, testing the goodness of fit in logistic regression, and so on.
Let’s use a simple real-life example to conduct a test. Say you want to buy a 50-pound cake for a big party. You walk into a cake shop and ask for one. The store manager says it’s ready, and she shows it to you. You might be suspicious about its taste and quality. Fifty pounds is a giant cake, and obviously you don’t want to take any risks, even if the store manager assures you that it’s the best quality. In fact, you may want to test the cake. In other words, you would like to test the statement made by the store manager that the cake is of good quality. Obviously, you can’t eat the whole cake and claim you are just testing. So, you ask the manager to cut a small piece out of the cake and give it to you for testing. You might want this test sample to be cut from a random part of the cake. The following are the possibilities that might result from your test:

·         The test piece is awesome and tastes like the best cake you have ever had. It may be an instant buy decision.
·         The test piece falls far short of your expectations. You will definitely not buy it in that case.
·         The quality is not the best, but it is still satisfactory. You may want to buy it if nothing better is available.

You had an assumption to begin with, you then took a sample to test it, and you drew a conclusion based on a simple test. In statistical terms, you made an inference about the whole population based on testing a random sample. This process is the essence of the testing of hypothesis, in other words, the science of confirmatory data analysis.

Let’s consider one more example. A giant e-commerce company claims that half of its customers are male and the other half are female. To test this statement, you take a random sample of 100 customers and count how many of them are male. Again, the following three scenarios may arise:

·         Exactly 50 percent are males, and the other 50 percent are females.
·         One gender dominates. For example, almost 90 percent are males, and only 10 percent are females.
·         One gender is near 50 percent. For example, 52 percent are males in the sample.

In the first scenario, you agree with the statement made by the e-commerce company that the count of male and female customers is the same. In the second scenario, you simply reject the company’s claim, because a sample this lopsided would be extremely unlikely if the claim were true. In the third scenario, you may tend to agree with the claim, because a small deviation from 50 percent can easily arise from random sampling alone. Once again, you are making an inference about the whole population based on the sample measures.
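
The three scenarios can be put on a numeric footing. The following is a minimal Python sketch, not taken from the book, that computes an exact two-sided binomial p-value for 50, 52, and 90 males in a sample of 100. The function name binomial_two_sided_p and the 5 percent cutoff are assumptions chosen for illustration.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value for observing k successes in n trials.

    Adds up the probability of every outcome that is no more likely
    than the observed count k when the null hypothesis (proportion p)
    is true.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # A small relative tolerance guards against floating-point ties.
    cutoff = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(x for x in pmf if x <= cutoff))


# The three scenarios from the text: 50, 52, and 90 males out of 100.
for males in (50, 52, 90):
    p_value = binomial_two_sided_p(males, 100)
    # Assumed decision rule: reject the claim when the p-value is below 0.05.
    verdict = "reject" if p_value < 0.05 else "do not reject"
    print(f"{males} males of 100: p-value = {p_value:.4f} -> {verdict} the 50/50 claim")
```

Under these assumptions, 50 males gives a p-value of 1, 52 males gives a p-value of about 0.76, and 90 males gives a p-value that is practically zero, which matches the intuitive decisions described above.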

These are reasonably good examples of the process of testing a hypothesis. It is summarized as follows:
1.      You start with an assumption.
a.      The whole cake is good in the first example.
b.      Overall, the gender ratio is 50 percent in the second example.

2.      You take a sample that represents the population.
a.      You try a piece of cake in the first example.
b.      You look at 100 customers in the second example.

3.      You do some kind of test on the sample gathered in step 2.
a.      You test the piece of cake by putting it in your mouth.
b.      You actually count the number of male and female customers in the sample.

4.      You make a final interpretation and inference based on the testing of the random sample.
a.      You make a decision about whether the cake is good or bad.
b.      You make an inference about whether the gender ratio is really 50 percent or not.

What Is the Process of Testing a Hypothesis?

Testing of hypothesis is a process similar to the examples discussed in the previous section. Using this process, you make inferences about the overall population by conducting statistical tests on a sample. More precisely, you make statistical inferences about a population parameter using the value of a test statistic calculated from the sample.
In inferential statistics, you make an assumption about the population. That assumption is called the hypothesis (the null hypothesis to be precise). You take a sample and calculate a test statistic, and you expect this test statistic to fall within certain limits if the null hypothesis is true.
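
To make the phrase "fall within certain limits" concrete, here is a hedged Python sketch that reuses the earlier gender-ratio example. It assumes a normal approximation to the binomial distribution and a two-sided 5 percent significance level (critical value 1.96); neither choice comes from the article, and the function name acceptance_limits is hypothetical.

```python
from math import sqrt


def acceptance_limits(n: int, p0: float, z_crit: float = 1.96) -> tuple[float, float]:
    """Range of sample counts consistent with the null hypothesis.

    Uses the normal approximation: the expected count n * p0 plus or
    minus z_crit standard deviations of a binomial count.
    """
    expected = n * p0
    std_dev = sqrt(n * p0 * (1 - p0))
    return expected - z_crit * std_dev, expected + z_crit * std_dev


# Null hypothesis: 50 percent of customers are male; sample size 100.
low, high = acceptance_limits(n=100, p0=0.5)
for males in (52, 90):
    inside = low <= males <= high
    print(f"{males} males: limits ({low:.1f}, {high:.1f}) -> "
          f"{'consistent with' if inside else 'contradicts'} the null hypothesis")
```

With these assumptions the limits are 40.2 and 59.8, so a count of 52 males falls inside them and a count of 90 falls far outside.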
Table 1-1 contains a few more examples involving the process of testing a hypothesis.

Table 1-1. Examples of Testing a Hypothesis
Scenario: Bank customers' salary
Null hypothesis: The average income is $35,000.
Sample: You take a simple random sample of 300 customers.
Sample statistic: The average salary of the 300 sampled customers.
Inference: Accept the null hypothesis if the average salary of the sample falls near $35,000; otherwise, reject it.

Scenario: Drug testing
Null hypothesis: The drug contains 1.5 percent alcohol.
Sample: You take a random sample of 100 ml.
Sample statistic: The measured alcohol percentage in the sample.
Inference: Accept the null hypothesis if the sample test value is near 1.5 percent; otherwise, reject it.

Scenario: Product feedback
Null hypothesis: Our product customer satisfaction is 80 percent.
Sample: You take a simple random or stratified sample of users across various segments.
Sample statistic: You conduct a survey and take the sample C-SAT score (formal customer satisfaction score).
Inference: Accept the null hypothesis if the sample C-SAT falls near 80 percent; otherwise, reject it.

Scenario: Student training
Null hypothesis: The training has no significant effect on students.
Sample: You take a sample of students who took the training.
Sample statistic: The change in marks between a test taken before the training and a test taken after the training.
Inference: If there is a significant increase in the marks, then reject the null hypothesis; otherwise, accept it.

Scenario: Smoking causes cancer
Null hypothesis: Smoking does not cause cancer (smoking and cancer are independent).
Sample: You take a random sample from the population (contains smokers and nonsmokers).
Sample statistic: The proportion of cancer in smokers and in nonsmokers.
Inference: If the proportion of cancer is not significantly different in smokers than in nonsmokers, then accept the null hypothesis; otherwise, reject it.
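
As an illustration of the bank salary scenario, the sketch below runs a one-sample z-test of the null hypothesis that the average income is $35,000. Everything about the data is an assumption made for this example: the 300 salaries are simulated, not real bank records, and the simulation mean of 36,000 and spread of 8,000 are arbitrary. The function name one_sample_z_test and the 5 percent cutoff are also hypothetical choices, and the normal approximation is used only because the sample is large.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, mu0):
    """Return (z statistic, two-sided p-value) for H0: population mean == mu0.

    Uses the sample standard deviation and the normal approximation,
    which is reasonable for large samples such as n = 300.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 300 simulated salaries, NOT real bank records.
rng = random.Random(42)
salaries = [rng.gauss(36_000, 8_000) for _ in range(300)]

z, p_value = one_sample_z_test(salaries, mu0=35_000)
# Assumed decision rule: reject the null hypothesis when p-value < 0.05.
decision = "reject" if p_value < 0.05 else "do not reject"
print(f"sample mean = {mean(salaries):,.0f}, z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Decision at the 5% level: {decision} the null hypothesis")
```

The same pattern applies to the other rows of the table: state the null hypothesis, compute a statistic from the sample, and check whether that statistic is plausible if the null hypothesis were true.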

This article was taken from the following book by Venkat Reddy and Shailendra Kadre:

Practical Business Analytics Using SAS: A Hands-on Guide
ISBN-10: 1484200446
ISBN-13: 978-1484200445