Or consider the case at the left, below, where there is no pattern: air concentrations with no trend. Again, the laboratory must apply reporting limits, and those limits decrease over time, which is generally a good thing. Substituting one-half the limit produces artificial values (red squares, below right) that head down over time, and the trend eventually appears significant even though there is no actual trend in the air concentrations themselves. The apparent trend was created entirely by the scientist’s unfortunate data practices. How many reported trends have resulted from practices like this?
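The effect is easy to reproduce in a small simulation. The sketch below is only an illustration under assumed numbers, not the data behind the figures: the lognormal parameters, the four reporting limits, the ten-year span, and the function names `ols_slope` and `simulate` are all hypothetical. It draws concentrations with no trend, censors them at limits that step down over time, substitutes one-half the limit, and compares the least-squares slopes. It compares slopes only and does not run a significance test.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate(n=200, years=10.0, seed=1):
    """Trend-free concentrations, censored at decreasing reporting limits."""
    rng = random.Random(seed)
    times = [years * i / (n - 1) for i in range(n)]
    # True concentrations: same lognormal distribution at every time, so no trend.
    true_conc = [rng.lognormvariate(0.0, 0.5) for _ in times]
    # Assumed reporting limits that step down as laboratory methods improve.
    limits = [4.0 if t < 2.5 else 2.0 if t < 5.0 else 1.0 if t < 7.5 else 0.5
              for t in times]
    # The practice under criticism: replace each nondetect with half its limit.
    substituted = [c if c >= lim else lim / 2.0
                   for c, lim in zip(true_conc, limits)]
    return times, true_conc, substituted


times, true_conc, substituted = simulate()
slope_true = ols_slope(times, true_conc)
slope_sub = ols_slope(times, substituted)
print(f"slope of true concentrations:   {slope_true:+.3f} per year")
print(f"slope after half-limit fill-in: {slope_sub:+.3f} per year")
```

Under these assumptions the substituted series slopes downward simply because the fill-in values track the falling limits, while the slope of the underlying concentrations stays near zero.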
There are better ways. Methods exist for what statisticians call ‘censored data’, where an individual value is not known exactly, only that it lies above or below a numerical threshold. These methods use the two types of information available in the data: the known values of detected concentrations, and the proportion of data, both detected and not, below each reporting limit. By mining the information in the proportions, statistics such as the mean and UCL95, regression equations, and hypothesis tests can all be computed, without substituting any fabricated values for nondetects. See the book Statistics for Censored Environmental Data Using Minitab and R (Helsel, 2012) for more detail on data analysis with nondetects.
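One way to see how both kinds of information enter a calculation is maximum likelihood for censored data. The sketch below is a minimal illustration, not the procedure from the book: it assumes the concentrations are lognormal, the function name `censored_lognormal_mle` and the simulated data are hypothetical, and the optimizer is a plain compass search chosen only to keep the example free of third-party libraries. Each detected value contributes its density to the likelihood, and each nondetect contributes the probability of falling below its own reporting limit, so no value is ever substituted.

```python
import math
import random
from statistics import NormalDist


def censored_lognormal_mle(values, censored):
    """Estimate (mu, sigma) of log-concentration from left-censored data.

    values[i] is the measured concentration when censored[i] is False,
    or the reporting limit when censored[i] is True (true value below it).
    Needs at least two distinct detected values.
    """
    logs = [math.log(v) for v in values]
    std_normal = NormalDist()

    def log_likelihood(mu, log_sigma):
        sigma = math.exp(log_sigma)
        total = 0.0
        for y, is_censored in zip(logs, censored):
            z = (y - mu) / sigma
            if is_censored:
                # Nondetect: probability that the value lies below its limit.
                total += math.log(max(std_normal.cdf(z), 1e-300))
            else:
                # Detect: normal log-density of the log value, up to a constant.
                total += -0.5 * z * z - log_sigma
        return total

    detected = [y for y, c in zip(logs, censored) if not c]
    mu = sum(detected) / len(detected)  # rough starting point
    log_sigma = 0.0
    best = log_likelihood(mu, log_sigma)
    step = 1.0
    # Compass search: try a step in each direction, halve the step when stuck.
    while step > 1e-6:
        improved = False
        for d_mu, d_ls in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            candidate = log_likelihood(mu + d_mu, log_sigma + d_ls)
            if candidate > best:
                mu, log_sigma, best = mu + d_mu, log_sigma + d_ls, candidate
                improved = True
        if not improved:
            step /= 2.0
    return mu, math.exp(log_sigma)


# Simulated example with two assumed reporting limits (1.0 and 0.5).
rng = random.Random(7)
values, censored = [], []
for i in range(500):
    conc = rng.lognormvariate(0.0, 1.0)
    limit = 1.0 if i % 2 == 0 else 0.5
    values.append(limit if conc < limit else conc)
    censored.append(conc < limit)

mu_hat, sigma_hat = censored_lognormal_mle(values, censored)
print(f"estimated mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

The simulated data were drawn with mu = 0 and sigma = 1 on the log scale, so the printed estimates can be compared against those values. Other approaches for censored data, such as nonparametric ones, avoid the lognormal assumption made here.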