## The Project

This project is part of the Udacity Data Analysis Nanodegree. In this section of the course we perform our own data analysis to determine whether a website should change its page design from an old page to a new page, based on the results of an A/B test on a subset of users.

The project brings together several concepts taught over the duration of the course and applies them to the data set. Using various statistical methods, we analyse the data and estimate the probability of a user converting, depending on whether they saw the old page or the new page.

The PDF report written to communicate my project and findings can also be found here

## What We Learned

• Using proportions to find probability.
• How to write hypothesis statements and use them to test against.
• Writing out hypotheses and observations in accurate terminology
• Using statsmodels to simulate 10000 examples from a sample dataset, and finding differences from the mean
• Plotting differences from the mean in a `plt.hist` histogram, and adding a reference line for the actual observed difference
• Using Logistic Regression to determine probabilities for one of two possible outcomes
• Creating Dummy variables for making categorical variables usable in regression
• Creating interaction variables to better represent attributes in combination for use in regression
• Interpreting regression summary() results and drawing accurate conclusions and observations from them
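
The simulation step in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses Python's standard `random` module instead of the libraries used in the course, and the function names (`simulate_null_diffs`, `p_value`) and every number in the usage comment are hypothetical.

```python
import random


def conversion_rate(outcomes):
    """Proportion of 1s (conversions) in a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)


def simulate_null_diffs(n_old, n_new, p_null, n_sims=10_000, seed=42):
    """Simulate differences in conversion rate (new - old) under the null
    hypothesis that both pages convert at the same rate p_null."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        # Each user converts independently with probability p_null.
        old = sum(rng.random() < p_null for _ in range(n_old)) / n_old
        new = sum(rng.random() < p_null for _ in range(n_new)) / n_new
        diffs.append(new - old)
    return diffs


def p_value(diffs, observed_diff):
    """Share of simulated differences at least as large as the observed one
    (one-sided: is the new page better than the old page?)."""
    return sum(d >= observed_diff for d in diffs) / len(diffs)


# Hypothetical usage (made-up group sizes, rate and observed difference):
# diffs = simulate_null_diffs(n_old=1000, n_new=1000, p_null=0.12)
# print(p_value(diffs, observed_diff=-0.0015))
```

With matplotlib available, `plt.hist(diffs)` followed by `plt.axvline(observed_diff)` would draw the histogram and the reference line for the observed difference.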

## A Simple Explanation of Bayes Theorem and Bayesian Inference

While studying the excellent Udacity Data Analysis Nanodegree, I found myself struggling to answer the quiz questions on Bayes’ Theorem. To help myself understand it, I studied a fair number of other resources and concluded that writing an article myself would (a) reinforce my own grasp of this difficult subject and (b) help others.

In this article I will explain Bayes’ Theorem in a simple manner and walk through some examples.

## What Is Bayes Theorem?

Bayes’ Theorem is a widely used result in statistics and probability, which makes it very important in data science and data analysis. It underpins Bayesian inference, an approach to statistical inference in which we determine the probability of a hypothesis and adjust it as more data or information becomes available.

## What Are its Applications?

Bayes’ Theorem can be used to determine the likelihood that a financial transaction is fraudulent, the accuracy of a medical test, or the chances of a particular return on stocks, along with hundreds of other examples in every industry imaginable, from finance to sport, medicine to engineering, video games to music.

## What Does it Do?

As mentioned above, Bayesian inference gives us the probability of an event, given certain evidence or test results.

We must keep a few things in the back of our minds first:

• The test for fraud is separate from whether the transaction actually is fraud.
• Tests are not perfect, so they give us false positives (the test says the transaction is fraud when in reality it isn’t) and false negatives (the test misses fraud that does exist).
• Bayes’ Theorem turns the results from your tests into the actual probability of the event.
• We start with a prior probability and combine it with our evidence, which results in our posterior probability.

## How Does it Work? An Example

Consider the scenario of a test for cancer, where we want to ascertain the probability of a patient having cancer given a particular test result.

• The chance that a patient has this type of cancer (C) is 1%, written as P(C) = 1%. This is the prior probability.
• The test is Positive 90% of the time if you have C, written as P(Pos | C) = 90%. This is the sensitivity. The remaining 100% - 90% = 10% are Negative results where C is present but the test misses it: the false negatives.
• The test is Negative 90% of the time if you do not have C, written as P(Neg | ¬C) = 90%. This is the specificity. The remaining 100% - 90% = 10% are Positive results where there is no C but the test misdiagnoses it: the false positives.

Let’s put this in a table so it’s a bit more readable.

| | Cancer (1%) | Not Cancer (99%) |
|---|---|---|
| Positive Test | 90% | 10% |
| Negative Test | 10% | 90% |

• Our posterior probability is what we’re trying to predict: the chance of cancer actually being present given a Positive test, written as P(C | Pos). To get it, we must take account of the chances of false positives and false negatives.
• The joint probability of having cancer and testing Positive is P(Pos | C) x P(C) = 0.9 x 0.01 = 0.009
• The joint probability of not having cancer and testing Positive is P(Pos | ¬C) x P(¬C) = 0.1 x 0.99 = 0.099

Let’s put this in our table.

| | Cancer (1%) | Not Cancer (99%) |
|---|---|---|
| Positive Test | True Pos: 90% x 1% = 0.009 | False Pos: 10% x 99% = 0.099 |
| Negative Test | False Neg: 10% x 1% = 0.001 | True Neg: 90% x 99% = 0.891 |

But of course that’s not the complete story. We need to weigh the true positives against all the ways a positive result could happen.

The chance of getting a true positive result is 0.009. The chance of getting any type of positive result is the chance of a true positive plus the chance of a false positive (0.009 + 0.099 = 0.108).

So, our actual posterior probability of cancer given a positive test is 0.009 / 0.108 = 0.0833, or about 8.3%.
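
The calculation above can be sketched in a few lines of Python. This is a minimal illustration; the function name `posterior_given_positive` and its parameter names are my own, chosen for this example.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """Probability of the condition given a positive test result."""
    # Joint probability of having the condition and testing positive.
    true_pos = sensitivity * prior
    # Joint probability of not having the condition and testing positive.
    false_pos = (1 - specificity) * (1 - prior)
    # Normalise by the chance of any positive result.
    return true_pos / (true_pos + false_pos)


p = posterior_given_positive(prior=0.01, sensitivity=0.90, specificity=0.90)
print(f"{p:.4f}")  # 0.0833
```

Raising the prior to 50% while keeping the same test gives a posterior of 90%, which shows how strongly the result depends on how rare the condition is.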

In Bayes’ Theorem terms, the quantities are written as follows, where c is the event that a patient has cancer and x is a positive test result:

• P(c|x) = Chance of having cancer (c) given a positive test (x). This is what we want to know: How likely is it to have cancer with a positive result? In our case it was 8.3%.
• P(x|c) = Chance of a positive test (x) given that you had cancer (c). This is the chance of a true positive, 90% in our case.
• P(c) = Chance of having cancer (1%).
• P(¬ c) = Chance of not having cancer (99%).
• P(x|¬ c) = Chance of a positive test (x) given that you didn’t have cancer (¬ c). This is the false positive rate, 10% in our case.
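
Putting these terms together gives the usual statement of Bayes’ Theorem for this example, shown here with the numbers from the worked example (taking P(x|¬c) = 10%) substituted in:

$$
P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x \mid c)\,P(c) + P(x \mid \neg c)\,P(\neg c)}
= \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.1 \times 0.99}
= \frac{0.009}{0.108} \approx 0.083
$$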

## What is the Pareto Principle

The Pareto Principle (also commonly known as the 80/20 principle) is an observation which states that 80 percent of outputs come from 20 percent of the inputs. It is named after the Italian economist Vilfredo Pareto, who observed that around 80% of Italy’s land was owned by 20% of its population. He found that this principle held roughly true in other countries and situations as well.

The Pareto Principle is a neat guide for describing distributions in real-life scenarios, and it holds roughly true in a vast array of situations. That is, the inputs in a scenario contribute unequally to its outputs.

For example:

• A common adage in computer science is that 20% of features contribute 80% of usage
• Microsoft noted that 20% of bugs contribute 80% of crashes, and also found that 20% of effort contributed 80% of features
• 20% of customers contribute 80% of income
• 20% of workers contribute 80% of the work
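
One way to check whether a data set roughly follows this pattern is to rank the inputs and measure the share of the total produced by the top 20%. The sketch below is only an illustration: the function name `share_from_top` is my own, and the crash counts are made-up numbers, not real data.

```python
def share_from_top(contributions, top_fraction=0.2):
    """Share of the total produced by the largest `top_fraction` of inputs."""
    ranked = sorted(contributions, reverse=True)
    # Number of inputs in the top group (at least one).
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Hypothetical crash counts attributed to 10 bugs (made-up numbers).
crashes_per_bug = [410, 390, 60, 45, 30, 25, 15, 12, 8, 5]
print(f"{share_from_top(crashes_per_bug):.0%}")  # 80%
```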

It’s not simply a case of investing the same amount of input and getting an equal value out.

## Why Use the Pareto Principle

I want to propose that this observation is valuable in project management, and that treating it as a principle, beyond simply understanding the underlying statistic, can yield a massive return on investment, whether in your own life or in your work.

If we accept that 20% of the effort produces 80% of the results in a project or product, it follows that the remaining 80% of the effort produces only 20% of the results. In investment terms, that is a massive outlay of resources for a steadily diminishing return (the law of diminishing returns). You wouldn’t want your investment banker running those odds, so why accept them in life or in project management?

Instead of investing so much more effort and resource to ‘complete’ a project or product, we could focus primarily on the efforts that produce the majority of the results and forget the rest, or at least use this to make an informed decision to prioritise investments in other projects before coming back to ‘complete’ the project.

Considering this, with the 80% of resources saved, we can invest in further projects and products and get an 80% return on each of them: huge returns for the same inputs!
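
As a rough worked example, assume the 80/20 split holds exactly and that five projects are of equal size (both are simplifying assumptions, not claims drawn from any data). Spending 20% of a full project’s effort on each of the five uses the same total effort as completing one project, but returns four times the result:

$$
\underbrace{5 \times 0.2}_{\text{total effort}} = 1.0
\qquad
\underbrace{5 \times 0.8}_{\text{total result}} = 4.0
$$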

## Conclusion

As project managers, it’s our responsibility to find the most efficient way to get projects completed. A small set of tasks generates a disproportionate share of the results.

With this in mind, I want us to make a conscious decision about how we allocate resources, rather than always aiming for the perfect final product. You may very well want the perfect product, but the key is that we have a choice.

For example:

• Create 5 wire-frame prototypes instead of 1 detailed one
• Build 5 features with 80% of the functionality rather than 1 perfect one
• Find solutions to 5 bugs that each solve the issue for 80% of users rather than 1 solution that resolves it for everyone

That said, if we still need the final product 100% completed, it is about making an informed decision now that will optimise our investments: focus first on the 20% of tasks that give the best bang for our buck, re-prioritising as we see fit, before returning to attain 100%.

“The difference between successful people and very successful people is that very successful people say ‘no’ to almost everything.”

— Warren Buffett