BUSINESS STATISTICS

Random Variable

A random variable is a variable whose value is determined by chance. When a specific outcome is uncertain, it is treated as a random variable. The random variable is a fundamental concept in applying probability theory to decision making. The relationship between the values of a random variable and their probabilities is summarised by a probability distribution.

Discrete Random Variable

A random variable that can assume only a countable number of values is known as a discrete random variable.

Binomial Distribution

A binomial experiment is one in which every trial has only two possible outcomes; the word "binomial" refers to these two outcomes, not to a person's name. A binomial situation can be recognised by the following:

Characteristics of the Binomial Distribution

The following are the basic features of a binomial distribution:

1. The experiment consists of n repeated trials

2. Each trial results in an outcome that can be described as a success or a failure

3. The probability of success is denoted by p and it remains constant from trial to trial

4. The repeated trials are independent

5. The probability of failure is denoted by q

6. The sum of the probability of success and the probability of failure is always equal to one (p + q = 1)

Question: What is Type 1 and Type 2 Error?

Answer: A Type 1 error is wrongly concluding that there is a significant difference when in fact the observed difference is due to chance. It is the rejection of a null hypothesis that is true.

A Type 2 error consists of not rejecting the null hypothesis when it should be rejected. It is to wrongly accept the null hypothesis and conclude that no significant difference exists when in fact one does. In summary, Type 1 error: you reject the null hypothesis when you should have accepted it; Type 2 error: you accept the null hypothesis when you should have rejected it.

Question: Parametric and Non-Parametric Tests

Answer: Hypothesis tests are parametric tests when they assume that the population follows some specific distribution (such as the normal distribution) with a set of parameters.

Non-parametric tests, on the other hand, are employed when such assumptions cannot be made about the population. In summary, parametric tests require that the sample analysed is taken from a population that meets the normality assumption, while non-parametric tests are used when the assumptions required by their parametric counterparts are not met or are questionable. All tests involving ranked data are non-parametric.

Question: Discuss the importance of the Normal Distribution.

Answer:

The normal distribution is important because it relates the probability that a variable x takes a value between x0 and x1 to just two quantities, the mean and the standard deviation. This is required in many statistical and modelling applications.

For example, if the quantities are assumed to be normally distributed, we can use the normal distribution to calculate the probability that today's wind strength is between 10 m/s and 20 m/s, the probability that today's sea wave level is between 10 cm and 20 cm, and so on.

The normal distribution is a pattern for the distribution of a set of data which follows a bell-shaped curve. This distribution is sometimes called the Gaussian distribution in honour of Carl Friedrich Gauss, a famous mathematician.

The bell shaped curve has several properties:

1. The curve is concentrated in the centre and decreases on either side. This means that the data has less of a tendency to produce unusually extreme values, compared to some other distributions.

2. The bell-shaped curve is symmetric. This tells you that the probabilities of deviations from the mean are comparable in either direction.

When you want to describe probability for a continuous variable, you do so by describing a certain area. A large area implies a large probability and a small area implies a small probability. Some people don't like this, because it forces them to remember a bit of geometry (or in more complex situations, calculus). But the relationship between probability and area is also useful, because it provides a visual interpretation for probability.

Here's an example of a bell shaped curve. This represents a normal distribution with a mean of 50 and a standard deviation of 10.
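The link between probability and area can be checked numerically. The short Python sketch below uses the standard library's NormalDist with the mean of 50 and standard deviation of 10 quoted above; the limits 40 and 60 are illustrative choices, not values from these notes.

```python
from statistics import NormalDist

# Normal distribution with the mean and standard deviation quoted above.
curve = NormalDist(mu=50, sigma=10)

# Probability = area under the curve between two values.
# 40 and 60 are illustrative limits (one standard deviation either side).
p_40_to_60 = curve.cdf(60) - curve.cdf(40)

# Symmetry: the area below 40 equals the area above 60.
p_below_40 = curve.cdf(40)
p_above_60 = 1 - curve.cdf(60)

print(f"P(40 <= x <= 60) = {p_40_to_60:.4f}")
print(f"P(x < 40) = {p_below_40:.4f}")
print(f"P(x > 60) = {p_above_60:.4f}")
```

Roughly 68% of the area lies within one standard deviation of the mean, and the two tails are equal, which is the symmetry property described above.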

Question:

A company mass produces electronic calculators. From past experience it knows that 90% of the calculators will be in working order and 10% will be faulty if the production process is working satisfactorily. An inspector randomly selects 5 calculators from the production line every hour and carries out a rigorous check.

Required:

a. What is the probability that a random sample of 5 will contain at least 3 defective calculators?

b. A sample of 5 calculators is found to contain 3 defectives; do you consider the production process to be working satisfactorily?

Answer:

Formula for the binomial distribution of a random variable:

Pr(x) = nCx × p^x × q^(n-x)

= [n! / ((n-x)! x!)] × p^x × q^(n-x)

n = number of trials

C = combination symbol

x = sample point (number of successes)

p = probability of success

q = probability of failure

Note: The probability in a question will be given either as a percentage or as a decimal figure. Identify it first. Your n is always greater than or equal to your x. Once p is given, q = 1 - p.

p = 10% = 0.1

q = 1 - 0.1 = 0.9

n = 5

x ≥ 3, that is x = 3 or 4 or 5

Calculate when x = 3

Pr(3) = [n! / ((n-x)! x!)] × p^x × q^(n-x)

= [5! / ((5-3)! 3!)] × 0.1³ × 0.9²

= 10 × 0.001 × 0.81 = 0.0081

= 0.81%

Calculate when x = 4

Pr(4) = [n! / ((n-x)! x!)] × p^x × q^(n-x)

= [5! / ((5-4)! 4!)] × 0.1⁴ × 0.9¹

= 5 × 0.0001 × 0.9 = 0.00045

= 0.045%

Calculate when x = 5

Pr(5) = [n! / ((n-x)! x!)] × p^x × q^(n-x)

= [5! / ((5-5)! 5!)] × 0.1⁵ × 0.9⁰

= 1 × 0.00001 × 1 = 0.00001

= 0.001%

Pr(x ≥ 3) = Pr(3) + Pr(4) + Pr(5) = 0.81% + 0.045% + 0.001% = 0.856%
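The three terms and their total can be checked with a few lines of Python. This is only a sketch of the same formula; the function name binomial_probability is made up for illustration.

```python
from math import comb

def binomial_probability(n, x, p):
    """Pr(x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Calculator question: n = 5 sampled, p = 0.1 chance that one is defective.
n, p = 5, 0.1

# "At least 3" means x = 3, 4 or 5, so the three terms are added.
terms = {x: binomial_probability(n, x, p) for x in (3, 4, 5)}
at_least_3 = sum(terms.values())

for x, pr in terms.items():
    print(f"Pr({x}) = {pr:.5f}")
print(f"Pr(x >= 3) = {at_least_3:.5f}  ({at_least_3 * 100:.3f}%)")
```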

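(b)

Pr(3) = 5C3 × p^3 × q^(5-3)

= 5C3 × 0.1³ × 0.9²

= [5! / ((5-3)! 3!)] × 0.1³ × 0.9²

= 10 × 0.001 × 0.81 = 0.0081

= 0.81%

Therefore the production process is probably not working satisfactorily: if it were, a sample of 5 containing 3 defectives would occur only about 0.81% of the time (less than 1 sample in 100), which is too unlikely to put down to chance.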

Question: A manufacturer sets the following sampling scheme for accepting or rejecting large crates of identical items. He takes a random sample of 20 items from the crate. If he finds more than 2 defectives in the sample, he rejects the entire crate; otherwise he accepts it. It is known that approximately 5% of the items of this type received are defective.

Required:

(a) Calculate the proportion of crates that will be rejected

(b) Calculate the mean, variance, and standard deviation of the number of defectives in the sample of 20

Answer:

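BPD (Binomial Probability Distribution):

Pr(x) = nCx × p^x × q^(n-x)

Illustration with n = 5, p = 0.5, q = 0.5 and x = 3 (the values in the surgical-operation question later in these notes, not the crate data):

Pr(3) = 5C3 × (0.5)³ × (0.5)^(5-3)

= [5! / ((5-3)! 3!)] × (0.5)³ × (0.5)²

= 10 × 0.125 × 0.25 = 0.3125

= 0.3125 × 100% = 31.25%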

To find the proportion of crates that will be rejected, first compute the probability that a crate is accepted (0, 1 or 2 defectives in the sample), then subtract the result from 1.

Computing when x = 0

n = 20, p = 0.05, q = 1 - p = 1 - 0.05 = 0.95, x = 0

Pr(x=0) = 20C0 × (0.05)⁰ × (0.95)²⁰

= [20! / ((20-0)! 0!)] × (0.05)⁰ × (0.95)²⁰

= 1 × 1 × 0.358 = 0.358

Then compute Pr(x=1) and Pr(x=2) in the same way. Add the answers for x = 0, 1 and 2, subtract the total from 1, and multiply by 100 to get the percentage of crates that will be rejected:

Proportion rejected = 1 - [Pr(0) + Pr(1) + Pr(2)]
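The remaining steps can be sketched in Python as a check on the hand calculation. The helper name binomial_probability is hypothetical; the values n = 20 and p = 0.05 come from the question.

```python
from math import comb

def binomial_probability(n, x, p):
    """Pr(x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Crate question: sample of 20 items, 5% of items defective.
n, p = 20, 0.05

# A crate is accepted when the sample holds 0, 1 or 2 defectives.
accept_terms = [binomial_probability(n, x, p) for x in (0, 1, 2)]
p_accept = sum(accept_terms)

# It is rejected otherwise, i.e. when more than 2 defectives are found.
p_reject = 1 - p_accept

for x, pr in enumerate(accept_terms):
    print(f"Pr({x}) = {pr:.4f}")
print(f"Proportion accepted = {p_accept:.4f}")
print(f"Proportion rejected = {p_reject:.4f}")
```

On these figures about 7.5% of crates would be rejected.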

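If you are asked to calculate the mean, variance and standard deviation of a binomial distribution:

Mean: µ = n × p

Variance: σ² = n × p × q

Standard deviation: S.D = √σ² (the square root of the variance)

Illustration with n = 5, p = 0.5, q = 0.5:

µ = np = 5 × 0.5 = 2.5

σ² = npq = 5 × 0.5 × 0.5 = 1.25

S.D = √1.25 ≈ 1.118

For the sample of 20 in the crate question (n = 20, p = 0.05, q = 0.95):

µ = 20 × 0.05 = 1

σ² = 20 × 0.05 × 0.95 = 0.95

S.D = √0.95 ≈ 0.975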
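The same three formulas can be written as a small Python function. This is a sketch only; the name binomial_summary is made up, and the second call assumes the crate question's values of n = 20 and p = 0.05.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) of a binomial distribution."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, sqrt(variance)

# Values used in the working above: n = 5, p = 0.5.
mean_5, var_5, sd_5 = binomial_summary(5, 0.5)

# Crate question (assumed from the question text): n = 20, p = 0.05.
mean_20, var_20, sd_20 = binomial_summary(20, 0.05)

print(f"n=5,  p=0.5 : mean={mean_5}, variance={var_5}, S.D={sd_5:.3f}")
print(f"n=20, p=0.05: mean={mean_20:.2f}, variance={var_20:.2f}, S.D={sd_20:.3f}")
```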

This next question has parts (a) and (b).

Question: (a) If there is a 50% chance that a patient survives a given surgical operation in Lagos State Teaching Hospital, and there are 5 trials, what is the probability that 3 trials will be successful?

Answer:

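(a) p = 50% = 0.5

q = 1 - p = 1 - 0.5 = 0.5

n = 5

x = 3

Pr(3) = 5C3 × p³ × q^(5-3)

= 10 × (0.5)³ × (0.5)² = 10 × 0.125 × 0.25 = 0.3125 = 31.25%

(b)

Pr(x) = 5Cx × p^x × q^(5-x)

If the question asks for at least 3 successful trials (x ≥ 3), you will compute this for x = 3, 4 and 5 and add the results.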

POISSON DISTRIBUTION

Formula: P(x) = (µ^x × e^(-µ)) / x!

Where µ = mean of the distribution

x = success point (number of occurrences)

e = base of the natural logarithm, a constant ≈ 2.718

Example 1.

µ = 3.4

e = 2.718

x = 0

P(x=0) = (3.4⁰ × 2.718^(-3.4)) / 0!

= 1 × (1 / 2.718^3.4)

To raise 2.718 to the power -3.4, drop the minus sign and take the reciprocal: 2.718^(-3.4) = 1 / 2.718^3.4.

= 1 / 29.953

= 0.0334

= 0.0334 × 100% = 3.34%

c. Two or more = 1 - (Pr(0) + Pr(1))

Pr(1) = (3.4¹ × 2.718^(-3.4)) / 1! = 3.4 × 0.0334 ≈ 0.1135

= 1 - (0.0334 + 0.1135) ≈ 0.853

d. One or more = 1 - Pr(0)

= 1 - 0.0334

= 0.9666
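As a check on the Poisson working, here is a minimal Python sketch. It uses the standard Poisson formula, which divides by x!, and math.exp for e rather than the rounded 2.718, so the last decimal places may differ slightly from a hand calculation. The function name poisson_probability is made up for illustration.

```python
from math import exp, factorial

def poisson_probability(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)

mu = 3.4  # mean used in Example 1

p0 = poisson_probability(0, mu)
p1 = poisson_probability(1, mu)

# Complement rule: subtract the unwanted outcomes from 1.
one_or_more = 1 - p0
two_or_more = 1 - (p0 + p1)

print(f"P(0) = {p0:.4f}")
print(f"P(1) = {p1:.4f}")
print(f"P(one or more) = {one_or_more:.4f}")
print(f"P(two or more) = {two_or_more:.4f}")
```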

Past Question: Question One Compulsory

A company is building a model in order to forecast total costs based on the level of output. The following data are available for last year:

Month        Output (X)    Cost (Y)
             '000 units    N'000

January      16            170
February     20            240
March        23            260
April        25            300
May          25            280
June         19            230
July         16            200
August       12            160
September    19            240
October      25            290
November     28            350
December     12            200

Required:

a. State two possible reasons for the large variations in output per month

b. Plot a graph of output and costs, and comment on the relationship observed

c. Using the least squares technique, calculate the values of a and b in the equation y = a + bx in order to predict cost given the output, and explain the meaning of the calculated values.
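For part (c), the usual least squares formulas are b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and a = (Σy - bΣx) / n. The Python sketch below applies them to the twelve months of data in the question. It is one way of checking a hand calculation, not a model answer, and the function name least_squares is made up for illustration.

```python
# Monthly data from the question: output in '000 units, cost in N'000.
outputs = [16, 20, 23, 25, 25, 19, 16, 12, 19, 25, 28, 12]
costs = [170, 240, 260, 300, 280, 230, 200, 160, 240, 290, 350, 200]

def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x fitted by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = least_squares(outputs, costs)
print(f"a = {a:.2f}, b = {b:.2f}")
print(f"Predicted cost at an output of 22 ('000 units): {a + b * 22:.2f} (N'000)")
```

On this data the line works out to roughly y = 43.33 + 10x. One reading is that a is the estimated fixed cost (about N43,330 a month) and b is the variable cost (about N10 for each extra unit). The output level of 22 in the last line is an arbitrary illustration.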