Top Twelve Tip #7
Maximize the Signal-to-Noise Ratio

Environmental data often come from ‘uncontrolled experiments’. Scientists must collect observational data because they cannot hold every variable but one constant and examine only the effect of changing that one. Climate, weather, and human activities (among others) produce noise that can rarely be avoided or controlled. Whether performing hypothesis tests, regression, or trend analysis, it is important to account for uncontrolled variables that may be affecting the outcome of a statistical test.

Sediment load (mass) is plotted against time in the left panel below. A simple regression of load on time produces a p-value of 0.15, insufficient evidence of a linear relation between the two. Does this prove that there is no trend in load? No! A large p-value means only that the trend could not be distinguished from the noise. Multiple regression removes the effect of streamflow, known by the scientist to be a major contributor to the pattern in load. The up-and-down variation in streamflow that causes some of the variation in load is removed, reducing the noise and making the trend signal easier to detect. This is pictured in the right-hand panel, where the trend signal can now be distinguished, resulting in a p-value <0.001. With environmental data, the scientist can rarely afford to run a simplistic test and stop there.

Figure TTT7: sediment load versus year (left panel) and residuals versus year (right panel).
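
The comparison between the simple regression and the streamflow-adjusted regression can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (the years, the trend of 0.5 per year, the streamflow effect, and the noise level are all invented here and are not the data behind the figure), and the names `simple_ols` and `slope_t_stat` are hypothetical helpers. To stay within the standard library, the sketch obtains the multiple-regression trend coefficient through the partial-regression (Frisch-Waugh-Lovell) identity rather than a matrix solver, and it reports t-statistics instead of p-values; a larger |t| corresponds to a smaller p-value.

```python
import math
import random


def simple_ols(x, y):
    """Least-squares fit of y = a + b*x. Returns (slope, residuals, sxx)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, residuals, sxx


def slope_t_stat(x, y, n_params):
    """Slope of y on x and its t-statistic.

    n_params is the number of coefficients in the full model, so the
    residual degrees of freedom are len(x) - n_params.
    """
    slope, residuals, sxx = simple_ols(x, y)
    sse = sum(r * r for r in residuals)
    std_err = math.sqrt(sse / (len(x) - n_params) / sxx)
    return slope, slope / std_err


# Synthetic data (assumption): a modest upward trend in load that is
# masked by large year-to-year swings driven by streamflow.
random.seed(7)
years = list(range(1980, 2020))
flow = [random.gauss(0.0, 1.0) for _ in years]
load = [
    100.0 + 0.5 * (yr - 1980) + 20.0 * q + random.gauss(0.0, 3.0)
    for yr, q in zip(years, flow)
]

# Simple regression: load versus year only (intercept + slope = 2 parameters).
slope_simple, t_simple = slope_t_stat(years, load, n_params=2)

# Adjusted regression: remove the streamflow effect from both load and year,
# then regress residuals on residuals. The slope equals the year coefficient
# of the multiple regression load ~ year + flow (3 parameters).
_, load_resid, _ = simple_ols(flow, load)
_, year_resid, _ = simple_ols(flow, years)
slope_adjusted, t_adjusted = slope_t_stat(year_resid, load_resid, n_params=3)

print(f"simple:   slope = {slope_simple:.3f}, t = {t_simple:.2f}")
print(f"adjusted: slope = {slope_adjusted:.3f}, t = {t_adjusted:.2f}")
```

In this sketch the trend is the same in both fits; only the noise around it changes. Removing the streamflow-driven variation shrinks the standard error of the trend slope, which is what raises the t-statistic. With real data, a regression routine from a statistics package would report the p-values directly.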