Sample Variance & Standard Deviation | Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.7

Consider the drying time data for Exercise 1.1 on page 13. Compute the sample variance and sample standard deviation.


We know, based on our answer in Exercise 1.1, that the sample mean is \displaystyle \overline{x}=3.787.

To compute the sample variance, we use the formula

\displaystyle s^2=\sum _{i=1}^n\frac{\left(x_i-\overline{x}\right)^2}{n-1}

The formula requires the sum of \displaystyle \left(x_i-\overline{x}\right)^2, so we use a table to compute \displaystyle \left(x_i-\overline{x}\right)^2 for each observation.

\displaystyle \begin{array}{ccc}
x_i & x_i-\overline{x} & \left(x_i-\overline{x}\right)^2 \\ \hline
3.4 & 3.4-3.787=-0.387 & \left(-0.387\right)^2=0.150 \\
2.5 & 2.5-3.787=-1.287 & \left(-1.287\right)^2=1.656 \\
4.8 & 4.8-3.787=1.013 & \left(1.013\right)^2=1.026 \\
2.9 & 2.9-3.787=-0.887 & \left(-0.887\right)^2=0.787 \\
3.6 & 3.6-3.787=-0.187 & \left(-0.187\right)^2=0.035 \\
2.8 & 2.8-3.787=-0.987 & \left(-0.987\right)^2=0.974 \\
3.3 & 3.3-3.787=-0.487 & \left(-0.487\right)^2=0.237 \\
5.6 & 5.6-3.787=1.813 & \left(1.813\right)^2=3.287 \\
3.7 & 3.7-3.787=-0.087 & \left(-0.087\right)^2=0.008 \\
2.8 & 2.8-3.787=-0.987 & \left(-0.987\right)^2=0.974 \\
4.4 & 4.4-3.787=0.613 & \left(0.613\right)^2=0.376 \\
4.0 & 4.0-3.787=0.213 & \left(0.213\right)^2=0.045 \\
5.2 & 5.2-3.787=1.413 & \left(1.413\right)^2=1.997 \\
3.0 & 3.0-3.787=-0.787 & \left(-0.787\right)^2=0.619 \\
4.8 & 4.8-3.787=1.013 & \left(1.013\right)^2=1.026 \\ \hline
\text{Sum} & & 13.197
\end{array}

The table above shows that

\displaystyle \sum _{i=1}^{15}\left(x_i-\overline{x}\right)^2=13.197
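The table can be reproduced with a short Python sketch. It assumes the fifteen drying times listed in the first column and, like the table, rounds the mean and each squared deviation to three decimals. The variable names are illustrative only.

```python
# Drying times taken from the first column of the table.
times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
         3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

# Sample mean, rounded to three decimals as in the worked solution.
mean = round(sum(times) / len(times), 3)

# One (deviation, squared deviation) pair per observation, rounded like the table.
rows = [(round(x - mean, 3), round((x - mean) ** 2, 3)) for x in times]

# Sum of the rounded squared deviations (the last row of the table).
total = round(sum(squared for _, squared in rows), 3)

print(mean, total)
```

Rounding each row before summing is what makes the total match the hand calculation; summing unrounded values gives a slightly different figure in the later decimals.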

Therefore, the sample variance is

\displaystyle s^2=\sum _{i=1}^n\frac{\left(x_i-\overline{x}\right)^2}{n-1}=\frac{13.197}{15-1}=0.9426

The sample standard deviation is the square root of the sample variance:

\displaystyle s=\sqrt{s^2}=\sqrt{0.9426}=0.971
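As a cross-check, the same quantities can be computed with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor. This is a minimal sketch that assumes the fifteen drying times from the table. Because it does not round intermediate values, the variance comes out as 0.9427 rather than 0.9426 at four decimals, while the standard deviation still rounds to 0.971.

```python
import statistics

# Drying times from Exercise 1.1, as listed in the table.
times = [3.4, 2.5, 4.8, 2.9, 3.6, 2.8, 3.3, 5.6,
         3.7, 2.8, 4.4, 4.0, 5.2, 3.0, 4.8]

# Sample variance and sample standard deviation (both divide by n - 1).
s2 = statistics.variance(times)
s = statistics.stdev(times)

print(round(s2, 4), round(s, 3))
```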

Mean Computation & Dot Plot | Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.6

The tensile strength of silicone rubber is thought to be a function of curing temperature. A study was carried out in which samples of 12 specimens of the rubber were prepared using curing temperatures of 20°C and 45°C. The data below show the tensile strength values in megapascals.

20°C: 2.07 2.14 2.22 2.03 2.21 2.03 2.05 2.18 2.09 2.14 2.11 2.02
45°C: 2.52 2.15 2.49 2.03 2.37 2.05 1.99 2.42 2.08 2.42 2.29 2.01

(a) Show a dot plot of the data with both low and high temperature tensile strength values.

(b) Compute sample mean tensile strength for both samples.

(c) Does it appear as if curing temperature has an influence on tensile strength based on the plot? Comment further.

(d) Does anything else appear to be influenced by an increase in cure temperature? Explain.


Part (a)

A dot plot is shown below


In the figure, “×” represents the 20°C group and “◦” represents the 45°C group.

Part (b)

The mean tensile strength of the 20°C group is

\displaystyle \overline{x}_{20^{\circ} C}=2.1075\text{ MPa}

The mean tensile strength of the 45°C group is

\displaystyle \overline{x}_{45^{\circ} C}=2.2350\text{ MPa}
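The two means can be checked with a few lines of Python. The sketch below assumes the 24 tensile strength values given in the problem statement; the list names are illustrative.

```python
import statistics

# Tensile strength values (MPa) from the problem statement.
strength_20 = [2.07, 2.14, 2.22, 2.03, 2.21, 2.03,
               2.05, 2.18, 2.09, 2.14, 2.11, 2.02]
strength_45 = [2.52, 2.15, 2.49, 2.03, 2.37, 2.05,
               1.99, 2.42, 2.08, 2.42, 2.29, 2.01]

# Sample mean of each group.
mean_20 = statistics.mean(strength_20)
mean_45 = statistics.mean(strength_45)

print(round(mean_20, 4), round(mean_45, 4))
```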

Part (c)

Based on the plot, curing at 45°C produces more high tensile strength values, together with a few low ones. Overall, curing temperature does appear to influence tensile strength, which agrees with the higher sample mean at 45°C (2.2350 versus 2.1075).

Part (d)

The variation in tensile strength also appears to increase when the cure temperature is raised: the 45°C values are spread much more widely than the 20°C values.
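The visual impression of a larger spread can be quantified with the sample standard deviation of each group. This is a sketch that goes slightly beyond what the exercise asks for; it assumes the data from the problem statement, and rounded to three decimals the values come out to roughly 0.071 MPa at 20°C and 0.203 MPa at 45°C.

```python
import statistics

# Tensile strength values (MPa) from the problem statement.
strength_20 = [2.07, 2.14, 2.22, 2.03, 2.21, 2.03,
               2.05, 2.18, 2.09, 2.14, 2.11, 2.02]
strength_45 = [2.52, 2.15, 2.49, 2.03, 2.37, 2.05,
               1.99, 2.42, 2.08, 2.42, 2.29, 2.01]

# Sample standard deviations (n - 1 divisor) as a measure of spread.
sd_20 = statistics.stdev(strength_20)
sd_45 = statistics.stdev(strength_45)

print(round(sd_20, 3), round(sd_45, 3))
```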

Introduction to Statistics and Data Analysis | Probability & Statistics for Engineers & Scientists | Walpole | Problem 1.5

Twenty adult males between the ages of 30 and 40 were involved in a study to evaluate the effect of a specific health regimen involving diet and exercise on the blood cholesterol. Ten were randomly selected to be a control group and ten others were assigned to take part in the regimen as the treatment group for a period of 6 months. The following data show the reduction in cholesterol experienced for the time period for the 20 subjects:

Control Group: 7 3 -4 14 2 5 22 -7 9 5
Treatment Group: -6 5 9 4 4 12 37 5 3 3

(a) Do a dot plot of the data for both groups on the same graph.

(b) Compute the mean, median, and 10% trimmed means for both groups.

(c) Explain why the difference in the mean suggests one conclusion about the effect of the regimen, while the difference in medians or trimmed means suggests a different conclusion.


Part (a)

A dot plot is shown below


Part (b)

The mean, median, and 10% trimmed mean of the control group are

\displaystyle \overline{x}_{control}=5.60

\displaystyle \widetilde{x}_{control}=5.00

\displaystyle \overline{x}_{tr\left(10\right);control}=5.13

The mean, median, and 10% trimmed mean of the treatment group are

\displaystyle \overline{x}_{treatment}=7.60

\displaystyle \widetilde{x}_{treatment}=4.50

\displaystyle \overline{x}_{tr\left(10\right);treatment}=5.63
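These summary statistics can be reproduced with the sketch below. The `trimmed_mean` helper is a hypothetical function written for this illustration (it is not part of the standard library); it assumes that a 10% trimmed mean drops the smallest 10% and the largest 10% of the observations, which for ten values means one observation from each end.

```python
import statistics

def trimmed_mean(values, percent):
    """Mean after dropping `percent` % of the observations from each end."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # observations to drop per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Cholesterol reductions from the problem statement.
control = [7, 3, -4, 14, 2, 5, 22, -7, 9, 5]
treatment = [-6, 5, 9, 4, 4, 12, 37, 5, 3, 3]

for name, data in (("control", control), ("treatment", treatment)):
    print(name,
          statistics.mean(data),
          statistics.median(data),
          trimmed_mean(data, 10))
```

The trimmed means are 5.125 and 5.625 before rounding, which matches the 5.13 and 5.63 reported above.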

Part (c)

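The treatment mean exceeds the control mean by 2.0, which suggests that the regimen reduces cholesterol. The medians and the 10% trimmed means, however, differ by only 0.5 in magnitude, and the treatment median (4.50) is actually lower than the control median (5.00), which suggests little or no effect. The discrepancy is most likely due to extreme values (outliers) in the samples, especially the value of 37 in the treatment group, which pulls the treatment mean upward.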
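The influence of the value 37 can be illustrated with a small sketch. Dropping that observation is not part of the exercise; it is done here only as an assumption-laden illustration of how sensitive the mean is to a single extreme value.

```python
import statistics

# Cholesterol reductions from the problem statement.
control = [7, 3, -4, 14, 2, 5, 22, -7, 9, 5]
treatment = [-6, 5, 9, 4, 4, 12, 37, 5, 3, 3]

# Treatment minus control, for the mean and for the median.
mean_gap = statistics.mean(treatment) - statistics.mean(control)
median_gap = statistics.median(treatment) - statistics.median(control)

# Illustration only: the treatment mean with the extreme value 37 left out.
without_37 = [x for x in treatment if x != 37]
mean_without_37 = statistics.mean(without_37)

print(round(mean_gap, 2), median_gap, round(mean_without_37, 2))
```

Without the value 37, the treatment mean falls from 7.60 to about 4.33, which is below the control mean of 5.60.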