Statistical models and statistical methods play an important role in modern engineering. Phenomena such as turbulence, vibration, and the strength of fiber bundles have statistical models for some of their underlying theories. Engineers now have available to them batteries of computer programs to assist in the analysis of masses of complex data. Many textbooks are needed to cover fully all these models and methods; many are areas of specialization in themselves. On the other hand, every engineer has a need for easy-to-use, self-contained statistical methods to assist in the analysis of data and the planning of tests and experiments. The sections to follow give methods that can be used in many everyday situations, yet they require a minimum of background to use and need little, if any, calculation to obtain good results.
Statistics And Variability
One of the primary problems in the interpretation of engineering and scientific data is coping with variability. Variability is inherent in every physical process; no two individuals in any population are the same. For example, it makes no real sense to speak of the tensile strength of a synthetic fiber manufactured under certain conditions; some of the fibers will be stronger than others. Many factors, including variations of raw materials, operation of equipment, and test conditions themselves, may account for differences. Some factors may be carefully controlled, but variability will be observed in any data taken from the process. Even tightly designed and controlled laboratory experiments will exhibit variability.
Variability or variation is one of the basic concepts of statistics. Statistical methods are aimed at giving objective, quantitative, and reproducible ways of assessing the effects of variability. In particular, they aim to provide measures of the uncertainty in conclusions drawn from observational data that are inherently variable.
A second important concept is that of a random sample. To make valid inferences or conclusions from a set of observational data, it should be reasonable to treat the data as a random sample. What does this mean? In an operational sense it means that everything we are interested in seeing should have an equal chance of being represented in the observations we obtain. Some examples of what not to do may help. If machine setup is an important contributor to differences, then the observations should not all be taken from one setup. If instrumental variation can be important, then measurements on the same item should not be taken successively; enough time should pass to "forget" the last reading. A random sample of n items in a warehouse is not the first n that you can find. It is the n selected by a procedure guaranteed to give each item of interest an equal chance of selection. One should be guided by generalizations of the fact that the apples on top of a basket may not be representative of all apples in the basket.
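The warehouse example can be illustrated with a minimal sketch in Python. It assumes that every item of interest can be listed and labeled in advance; the inventory of 500 numbered items, the function name draw_random_sample, and the seed value are hypothetical and chosen only for illustration. The sketch contrasts the "first n you can find" with a selection in which each item has an equal chance of being chosen.

```python
import random


def draw_random_sample(item_ids, n, seed=None):
    """Return n distinct items, each with an equal chance of selection.

    item_ids: any iterable of item labels (hypothetical inventory list)
    n:        number of items to select
    seed:     optional seed so the selection can be reproduced and audited
    """
    rng = random.Random(seed)
    # random.Random.sample draws without replacement, so no item repeats
    return rng.sample(list(item_ids), n)


# Hypothetical warehouse inventory: items labeled 1 through 500
inventory = range(1, 501)

# What NOT to do: take the first n items that come to hand
convenience = list(inventory)[:10]

# A random sample: every item has the same chance of being selected
sample = draw_random_sample(inventory, 10, seed=42)

print("first ten found:", convenience)
print("random sample:  ", sorted(sample))
```

Recording the seed is a design choice rather than a requirement: it lets someone else reproduce the same selection later, but any procedure that gives each item an equal chance serves the purpose.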