Sample Variance & Standard Deviation | Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.7


Consider the drying time data for Exercise 1.1 on page 13. Compute the sample variance and sample standard deviation.


Solution:

We know, based on our answer in Exercise 1.1, that the sample mean is \displaystyle \overline{x}=3.787.

To compute the sample variance, we use the formula

\displaystyle s^2=\sum _{i=1}^n\frac{\left(x_i-\overline{x}\right)^2}{n-1}

The formula requires the sum of the squared deviations \displaystyle \left(x_i-\overline{x}\right)^2, so we tabulate \displaystyle \left(x_i-\overline{x}\right)^2 for every observation.

\displaystyle \begin{array}{ccc}
x_i & x_i-\overline{x} & \left(x_i-\overline{x}\right)^2 \\ \hline
3.4 & 3.4-3.787=-0.387 & \left(-0.387\right)^2=0.150 \\
2.5 & 2.5-3.787=-1.287 & \left(-1.287\right)^2=1.656 \\
4.8 & 4.8-3.787=1.013 & \left(1.013\right)^2=1.026 \\
2.9 & 2.9-3.787=-0.887 & \left(-0.887\right)^2=0.787 \\
3.6 & 3.6-3.787=-0.187 & \left(-0.187\right)^2=0.035 \\
2.8 & 2.8-3.787=-0.987 & \left(-0.987\right)^2=0.974 \\
3.3 & 3.3-3.787=-0.487 & \left(-0.487\right)^2=0.237 \\
5.6 & 5.6-3.787=1.813 & \left(1.813\right)^2=3.287 \\
3.7 & 3.7-3.787=-0.087 & \left(-0.087\right)^2=0.008 \\
2.8 & 2.8-3.787=-0.987 & \left(-0.987\right)^2=0.974 \\
4.4 & 4.4-3.787=0.613 & \left(0.613\right)^2=0.376 \\
4.0 & 4.0-3.787=0.213 & \left(0.213\right)^2=0.045 \\
5.2 & 5.2-3.787=1.413 & \left(1.413\right)^2=1.997 \\
3.0 & 3.0-3.787=-0.787 & \left(-0.787\right)^2=0.619 \\
4.8 & 4.8-3.787=1.013 & \left(1.013\right)^2=1.026 \\ \hline
\text{Sum} & & 13.197
\end{array}

The table above shows that

\displaystyle \sum _{i=1}^{15}\left(x_i-\overline{x}\right)^2=13.197

Therefore, the variance is

\displaystyle s^2=\sum _{i=1}^n\frac{\left(x_i-\overline{x}\right)^2}{n-1}=\frac{13.197}{15-1}=0.9426

The standard deviation is just the square root of the variance. That is

\displaystyle s=\sqrt{s^2}=\sqrt{0.9426}=0.971
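
The same computation can be sketched in a few lines of Python as a check on the hand calculation. The variable names here are illustrative and not part of the original solution. Because the code carries the unrounded mean, its variance may differ from the tabulated result in the fourth decimal place.

```python
import math

# Drying time data, as listed in the table above.
drying_times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
                3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

n = len(drying_times)
mean = sum(drying_times) / n

# Sum the squared deviations from the mean, then divide by n - 1.
squared_deviations = [(x - mean) ** 2 for x in drying_times]
sample_variance = sum(squared_deviations) / (n - 1)

# The sample standard deviation is the square root of the sample variance.
sample_std = math.sqrt(sample_variance)

print(f"mean = {mean:.3f}")
print(f"s^2  = {sample_variance:.4f}")
print(f"s    = {sample_std:.3f}")
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and should agree with these values.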


Mean Computation & Dot Plot | Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.6


The tensile strength of silicone rubber is thought to be a function of curing temperature. A study was carried out in which samples of 12 specimens of the rubber were prepared using curing temperatures of 20°C and 45°C. The data below show the tensile strength values in megapascals.

20°C: 2.07 2.14 2.22 2.03 2.21 2.03 2.05 2.18 2.09 2.14 2.11 2.02
45°C: 2.52 2.15 2.49 2.03 2.37 2.05 1.99 2.42 2.08 2.42 2.29 2.01

(a) Show a dot plot of the data with both low and high temperature tensile strength values.

(b) Compute sample mean tensile strength for both samples.

(c) Does it appear as if curing temperature has an influence on tensile strength based on the plot? Comment further.

(d) Does anything else appear to be influenced by an increase in cure temperature? Explain.


Solution:

Part (a)

A dot plot is shown below

[Figure: dot plot of tensile strength (MPa) for the 20°C and 45°C groups]

In the figure, “×” represents the 20°C group and “◦” represents the 45°C group.
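
As a rough stand-in for the figure, a stacked text dot plot can be sketched in plain Python. The helper name `text_dot_plot`, the axis limits of 1.95 to 2.55 MPa, and the 0.01 column width are assumptions made for this sketch; the ASCII markers `x` and `o` stand in for "×" and "◦".

```python
from collections import Counter

# Tensile strength values (MPa) from the problem statement.
strength_20c = [2.07, 2.14, 2.22, 2.03, 2.21, 2.03,
                2.05, 2.18, 2.09, 2.14, 2.11, 2.02]
strength_45c = [2.52, 2.15, 2.49, 2.03, 2.37, 2.05,
                1.99, 2.42, 2.08, 2.42, 2.29, 2.01]


def text_dot_plot(values, marker, lo, hi, step):
    """Return the rows (top row first) of a stacked text dot plot."""
    width = round((hi - lo) / step) + 1
    # Count how many values fall in each column of the [lo, hi] axis.
    counts = Counter(round((v - lo) / step) for v in values)
    rows = []
    # Repeated values are stacked vertically, one marker per observation.
    for level in range(max(counts.values()), 0, -1):
        rows.append("".join(marker if counts[c] >= level else " "
                            for c in range(width)))
    return rows


# Assumed axis for this sketch: 1.95 to 2.55 MPa in steps of 0.01.
LO, HI, STEP = 1.95, 2.55, 0.01
rows_20c = text_dot_plot(strength_20c, "x", LO, HI, STEP)
rows_45c = text_dot_plot(strength_45c, "o", LO, HI, STEP)

for row in rows_20c + rows_45c:
    print(row)
axis_width = len(rows_20c[0])
print("-" * axis_width)
print(f"{LO:.2f}".ljust(axis_width - 4) + f"{HI:.2f}")
```

Both groups share one axis, so their positions and spreads can be compared directly by eye.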


Part (b)

The mean of the 20°C group is

\displaystyle \overline{x}_{20^{\circ} C}=2.1075

The mean of the 45°C group is

\displaystyle \overline{x}_{45^{\circ} C}=2.2350
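
Both means can be reproduced with a short Python sketch; the variable names are illustrative only.

```python
# Tensile strength values (MPa) from the problem statement.
strength_20c = [2.07, 2.14, 2.22, 2.03, 2.21, 2.03,
                2.05, 2.18, 2.09, 2.14, 2.11, 2.02]
strength_45c = [2.52, 2.15, 2.49, 2.03, 2.37, 2.05,
                1.99, 2.42, 2.08, 2.42, 2.29, 2.01]

# Sample mean: sum of the observations divided by the sample size.
mean_20c = sum(strength_20c) / len(strength_20c)
mean_45c = sum(strength_45c) / len(strength_45c)

print(f"mean at 20 C = {mean_20c:.4f}")
print(f"mean at 45 C = {mean_45c:.4f}")
```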


Part (c)

Based on the plot, the higher curing temperature yields more high tensile strength values, along with a few low ones. Overall, curing temperature does appear to influence tensile strength.


Part (d)

It also seems that the variation of the tensile strength gets larger when the cure temperature is increased.
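
The visual impression of a larger spread can be checked numerically. The sketch below compares the range and the sample standard deviation of the two groups; these two measures are our choice for illustration and are not part of the original solution.

```python
import statistics

# Tensile strength values (MPa) from the problem statement.
strength_20c = [2.07, 2.14, 2.22, 2.03, 2.21, 2.03,
                2.05, 2.18, 2.09, 2.14, 2.11, 2.02]
strength_45c = [2.52, 2.15, 2.49, 2.03, 2.37, 2.05,
                1.99, 2.42, 2.08, 2.42, 2.29, 2.01]

# Range: largest observation minus smallest observation.
range_20c = max(strength_20c) - min(strength_20c)
range_45c = max(strength_45c) - min(strength_45c)

# Sample standard deviation (n - 1 divisor).
std_20c = statistics.stdev(strength_20c)
std_45c = statistics.stdev(strength_45c)

print(f"20 C: range = {range_20c:.2f}, s = {std_20c:.3f}")
print(f"45 C: range = {range_45c:.2f}, s = {std_45c:.3f}")
```

Both measures come out larger for the 45°C group, which agrees with the dot plot.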

Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.5

Twenty adult males between the ages of 30 and 40 were involved in a study to evaluate the effect of a specific health regimen involving diet and exercise on the blood cholesterol. Ten were randomly selected to be a control group and ten others were assigned to take part in the regimen as the treatment group for a period of 6 months. The following data show the reduction in cholesterol experienced for the time period for the 20 subjects:

Control Group: 7 3 -4 14 2 5 22 -7 9 5
Treatment Group: -6 5 9 4 4 12 37 5 3 3

(a) Do a dot plot of the data for both groups on the same graph.

(b) Compute the mean, median, and 10% trimmed means for both groups.

(c) Explain why the difference in the mean suggests one conclusion about the effect of the regimen, while the difference in medians or trimmed means suggests a different conclusion.


Solution:

Part (a)

A dot plot is shown below

[Figure: dot plot of cholesterol reduction for the control and treatment groups]


Part (b)

The mean, median, and 10% trimmed mean of the control group are

\displaystyle \overline{x}_{control}=5.60

\displaystyle \widetilde{x}_{control}=5.00

\displaystyle \overline{x}_{tr\left(10\right);control}=5.13

The mean, median, and 10% trimmed mean of the treatment group are

\displaystyle \overline{x}_{treatment}=7.60

\displaystyle \widetilde{x}_{treatment}=4.50

\displaystyle \overline{x}_{tr\left(10\right);treatment}=5.63
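
These three statistics can be reproduced with a short Python sketch. The helper `trimmed_mean` is a hypothetical name written for this example; it assumes the usual definition of a 10% trimmed mean, in which the smallest 10% and the largest 10% of the sorted observations are dropped before averaging.

```python
import statistics

# Cholesterol reduction data from the problem statement.
control = [7, 3, -4, 14, 2, 5, 22, -7, 9, 5]
treatment = [-6, 5, 9, 4, 4, 12, 37, 5, 3, 3]


def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the observations from each end."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # observations trimmed per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


for name, data in [("control", control), ("treatment", treatment)]:
    mean = sum(data) / len(data)
    median = statistics.median(data)
    trimmed = trimmed_mean(data, 10)
    print(f"{name}: mean = {mean:.2f}, median = {median:.2f}, "
          f"10% trimmed mean = {trimmed:.3f}")
```

With ten observations per group, trimming 10% removes exactly one value from each end.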


Part (c)

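The treatment mean exceeds the control mean by 2.0, which suggests that the regimen reduces cholesterol. The trimmed means differ by only 0.5, and the treatment median (4.50) is actually 0.5 lower than the control median (5.00), which suggests little or no effect. The discrepancy is most likely due to extreme values (outliers) in the samples, especially the value of 37 in the treatment group, which pulls the mean up but has little effect on the median or the trimmed mean.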
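
To see how strongly the value 37 affects each statistic, the treatment group can be summarized with and without it. This is only an illustration of sensitivity to an outlier, not a suggestion that the observation should be discarded; the variable names are ours.

```python
import statistics

# Treatment group data from the problem statement.
treatment = [-6, 5, 9, 4, 4, 12, 37, 5, 3, 3]

# Same data with the extreme value 37 left out, for comparison only.
treatment_without_37 = [x for x in treatment if x != 37]

mean_with = sum(treatment) / len(treatment)
mean_without = sum(treatment_without_37) / len(treatment_without_37)

median_with = statistics.median(treatment)
median_without = statistics.median(treatment_without_37)

print(f"mean:   {mean_with:.2f} with 37, {mean_without:.2f} without")
print(f"median: {median_with:.2f} with 37, {median_without:.2f} without")
```

The mean shifts by several units when the single value is removed, while the median moves by only half a unit.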


Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.4

In a study conducted by the Department of Mechanical Engineering at Virginia Tech, the steel rods supplied by two different companies were compared. Ten sample springs were made out of the steel rods supplied by each company and a measure of flexibility was recorded for each. The data are as follows:

Company A: 9.3 8.8 6.8 8.7 8.5 6.7 8.0 6.5 9.2 7.0
Company B: 11.0 9.8 9.9 10.2 10.1 9.7 11.0 11.1 10.2 9.6

(a) Calculate the sample mean and median for the data for the two companies.

(b) Plot the data for the two companies on the same line and give your impression.

Solution:

Part (a)

The mean and median of Company A are \displaystyle \overline{x}_A=7.950 and \displaystyle \widetilde{x}_A=8.250, respectively.

The mean and median of Company B are \displaystyle \overline{x}_B=10.260 and \displaystyle \widetilde{x}_B=10.150, respectively.
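
A minimal Python sketch of the same calculation, using the standard library's `statistics.median`; the variable names are illustrative.

```python
import statistics

# Flexibility measurements from the problem statement.
company_a = [9.3, 8.8, 6.8, 8.7, 8.5, 6.7, 8.0, 6.5, 9.2, 7.0]
company_b = [11.0, 9.8, 9.9, 10.2, 10.1, 9.7, 11.0, 11.1, 10.2, 9.6]

mean_a = sum(company_a) / len(company_a)
mean_b = sum(company_b) / len(company_b)

# With an even sample size, the median is the average of the two
# middle values of the sorted data.
median_a = statistics.median(company_a)
median_b = statistics.median(company_b)

print(f"Company A: mean = {mean_a:.3f}, median = {median_a:.3f}")
print(f"Company B: mean = {mean_b:.3f}, median = {median_b:.3f}")
```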

Part (b)

A dot plot is shown below

[Figure: dot plot of the flexibility measurements for Company A and Company B]

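In the figure, “×” represents company A and “◦” represents company B. The springs made from company B's steel rods show higher flexibility: every company B measurement (minimum 9.6) exceeds every company A measurement (maximum 9.3).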