Transforming data requires re-transforming solutions

Transformations have traditionally served three purposes in statistics (they are not applied on a whim):

1. to make data more like a normal distribution

2. to make data relationships more linear

3. to make data more constant in variance

These are requirements of traditional parametric tests. Because environmental data are typically skewed, taking logarithms often meets these objectives better than leaving the data untransformed. Logs have also been popular because of their relatively simple mathematics and their flexibility in fitting a wide range of data shapes. But what does this do to the interpretation of the test result?

Specific capacity, a standardized measure of yields of water from wells, was measured in hundreds of wells across the Appalachian region of the US in a USGS report of the late 1980s. This is a dataset we’ve used for years in our Applied Environmental Statistics course. The data come from four rock types, and there was strong interest in learning if well yields differed between the four rock units. Figure 1 shows boxplots of the original data, while Figure 2 shows boxplots of the logarithms of the same data.

Figure 1. Specific capacities of wells in four rock types

Figure 2. Natural logarithms of specific capacities of wells in four rock types

As can be seen, the boxplots of logarithms are about the same height (similar variability) and resemble a normal distribution: the top and bottom portions of the boxes are about the same size, with few outliers. Boxplots of the specific capacities themselves in Figure 1 have neither characteristic, and these data do not appear to follow normal distributions. Common parametric tests such as analysis of variance (ANOVA) require the data to follow a normal distribution and each group to have the same variance. Otherwise, the tests have low power, that is, a low ability to detect differences that are present.
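The effect of taking logs on skewness and spread can be checked numerically as well as by eye. The following is a minimal sketch in Python (standard library only). The two groups are simulated lognormal values made up for illustration, not the specific capacity data, and the helper name sample_skewness is our own.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Moment-based skewness: average cubed z-score (near 0 for symmetric data)."""
    center = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return statistics.fmean(((v - center) / spread) ** 3 for v in values)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical skewed groups (lognormal); NOT the specific capacity data.
groups = {
    "rock A": [rng.lognormvariate(0.0, 1.0) for _ in range(500)],
    "rock B": [rng.lognormvariate(1.0, 1.0) for _ in range(500)],
}

summary = {}
for name, values in groups.items():
    logs = [math.log(v) for v in values]
    summary[name] = {
        "skew_raw": sample_skewness(values),   # strongly positive for skewed data
        "sd_raw": statistics.stdev(values),    # differs a lot between groups
        "skew_log": sample_skewness(logs),     # close to zero after taking logs
        "sd_log": statistics.stdev(logs),      # similar between groups
    }

for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

For data like these, skewness should drop toward zero and the group standard deviations should become similar once logs are taken, which is the numerical counterpart of what the boxplots show.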

ANOVA tests differences between group means. On the original data of Figure 1 the ANOVA p-value is 0.08, so group means would not be considered different. Is the non-normality causing a loss of power, pushing up the p-value, even with 50 observations in each group? ANOVA on the logarithms of Figure 2 gives a p-value of 0.007. However, this is not a test of differences in mean specific capacity! It tests whether the mean of the logarithms differs between groups. When the mean of the logarithms is retransformed to original units, the result is called the geometric mean.

The geometric mean is one way to estimate the median, not the mean, of the data in original units (it does so well when the logarithms are roughly symmetric, as in Figure 2). By computing the test in log units, we are testing the difference between geometric means; in effect, we are testing the difference in medians of the groups rather than their means. A Kruskal-Wallis (nonparametric) test of group medians has a similar p-value of 0.009, another indication that medians are being tested by the ANOVA on logs.
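A small numerical illustration may help here. The sketch below (Python, standard library only) back-transforms the mean of the logs and compares it with the median and the mean in original units. The sample is simulated lognormal data made up for illustration, not the well data, and geometric_mean_via_logs is our own helper name.

```python
import math
import random
import statistics


def geometric_mean_via_logs(values):
    """Back-transform the mean of the natural logs to original units."""
    mean_log = statistics.fmean(math.log(v) for v in values)
    return math.exp(mean_log)


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed sample (lognormal); NOT the specific capacity data.
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

geo_mean = geometric_mean_via_logs(sample)
median = statistics.median(sample)
mean = statistics.fmean(sample)

print(f"geometric mean: {geo_mean:.2f}")
print(f"median:         {median:.2f}")
print(f"mean:           {mean:.2f}")
```

For skewed data of this kind the geometric mean should land close to the median, while the arithmetic mean is pulled upward by the large values in the right tail.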

If you transform data you must transform your idea of what is being tested with a parametric test. Means are ‘unit-specific’, and whether performing hypothesis tests, regression, or confidence intervals, what is being targeted changes once logarithms are used. Often what we actually want is a test of medians (“is one group different than the others?”). But if we specifically want to test means, a transformation defeats that purpose. There are newer methods than analysis of variance to test differences in means without assuming a normal distribution. These are called permutation tests. A permutation test is valid irrespective of the data’s shape; on the untransformed data its p-value is 0.04, indicating that group means do differ for these data.
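To show the idea behind a permutation test of group means, here is a minimal sketch in Python (standard library only). It is one common construction, not necessarily the exact procedure behind the p-value quoted above: it assumes the one-way ANOVA F statistic as the measure of difference among means and re-computes it after randomly reassigning observations to groups. The function names, the number of permutations, and the simulated data are all our own assumptions; the data are not the specific capacity measurements.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = statistics.fmean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_f_test(groups, n_perm=2000, seed=0):
    """Return (observed F, permutation p-value) for a difference in group means."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign observations to groups
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            exceed += 1
    # Count the observed arrangement too, so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


data_rng = random.Random(7)
# Hypothetical skewed data for four groups of 50; NOT the specific capacity data.
example_groups = [
    [data_rng.lognormvariate(mu, 1.0) for _ in range(50)]
    for mu in (0.0, 0.0, 0.3, 0.8)
]
f_obs, p_value = permutation_f_test(example_groups)
print(f"F = {f_obs:.2f}, permutation p-value = {p_value:.4f}")
```

Because the reference distribution is built from the data themselves rather than from the normal-theory F distribution, the test still targets means in original units without requiring the data to be normal.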

The difference in p-value between the permutation test (0.04) and classical ANOVA (0.08) in original units reflects the loss of power of classical ANOVA. As here, a better test can see something the older tests cannot. If you'd like to learn more about permutation tests, we offer both webinars and in-person courses on how they work and how they can help your data analysis come into the 21st century. See http://practicalstats.com/training for more information.
