Why Log Log

What's the purpose of log-log plots?

They make relationships of the form y = cx^m more obvious, since if Y = log y, X = log x, and C = log c, the relationship is Y = mX + C, which gives a straight line plot (whereas the relationship between x and y is usually far from linear).
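As a minimal sketch of the substitution above, the following takes made-up data that follows y = cx^m exactly, fits a straight line to (log x, log y), and recovers m and c. The values c = 3 and m = 2.5, the sample points, and the helper name fit_line are all assumptions chosen for illustration.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up power law y = c * x**m (illustrative values, not from the page).
c_true, m_true = 3.0, 2.5
xs = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
ys = [c_true * x ** m_true for x in xs]

# Transform: X = log x and Y = log y, so that Y = m*X + C with C = log c.
X = [math.log10(x) for x in xs]
Y = [math.log10(y) for y in ys]

m_fit, C_fit = fit_line(X, Y)
c_fit = 10 ** C_fit  # undo the logarithm to get c back from C

print(f"m = {m_fit:.4f}, C = {C_fit:.4f}, c = {c_fit:.4f}")
```

With noise-free data the fitted slope is m and the fitted intercept is log c, up to rounding. Any base of logarithm works, provided the same base is used to convert C back to c.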

(Similarly, if y = cb^x, substituting Y = log y, B = log b and C = log c gives Y = Bx + C, which is again a linear relationship.)
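The exponential case can be checked in a similar way. In this sketch (the values c = 2 and b = 1.5 are made up), x is left untransformed and only y is logged; equal steps in x then give equal steps in Y, each of size B = log b, which is what a straight line on a semi-log plot means.

```python
import math

# Made-up exponential y = c * b**x (illustrative values).
c, b = 2.0, 1.5
xs = [0, 1, 2, 3, 4, 5]
ys = [c * b ** x for x in xs]

# Only y is transformed: Y = log y = (log b) * x + log c.
Y = [math.log10(y) for y in ys]

# For unit steps in x, each increase in Y should equal B = log b.
steps = [Y[i + 1] - Y[i] for i in range(len(Y) - 1)]
B = math.log10(b)
C = math.log10(c)

print("steps in Y:", [round(s, 6) for s in steps])
print("B = log b :", round(B, 6), " C = log c:", round(C, 6))
```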

Some care should be taken in using logarithmic plots, since seemingly small deviations from a straight line in such plots may correspond to quite large errors in the original values. Also, reading off the Y-axis intercept (the value of C) often requires extrapolating far beyond the range of the data, which gives an unreliable value for C and hence a possibly quite inaccurate value for c when it is calculated from C.
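To put rough numbers on this warning, assuming base-10 logarithms: a vertical offset of d in Y = log y multiplies y by 10^d, so an offset of 0.1 is an error of about 26% and an offset of 0.3 is nearly a factor of two. The second half of the sketch uses a hypothetical situation (data centered on X = 2.5, a slope wrong by 0.05, and a line that pivots about the center of the data) to show how a small slope error grows into a large error in c at the intercept.

```python
def relative_error(delta_log10):
    """Fractional error in y caused by an offset of delta_log10 in Y = log10(y)."""
    return 10 ** delta_log10 - 1

for delta in (0.01, 0.05, 0.1, 0.3):
    print(f"offset {delta:.2f} in log10(y) -> {relative_error(delta):.1%} error in y")

# Hypothetical extrapolation to the intercept at X = 0 (all values assumed).
pivot_X = 2.5       # center of the data on the X axis, i.e. x around 316
slope_error = 0.05  # assumed error in the fitted slope m

# Size of the resulting shift in the intercept C, and the factor by which c is off.
intercept_shift = slope_error * pivot_X
c_factor = 10 ** intercept_shift

print(f"shift in C: {intercept_shift:.3f}, so c is off by a factor of {c_factor:.2f}")
```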

For this reason, think carefully before interpolating or extrapolating on a log-log plot.

Comment

The above immediately makes the mathematical hackles rise for two reasons.

  1. A linear model remains linear when the axes are moved. Therefore, one should position the axes so that the Y-axis intercept is NOT distant from the plotted data points (or not use that intercept as a shortcut to obtaining values for the constants). This has nothing to do with whether the original data values were logarithmically transformed prior to being plotted.

  2. The transformed data is suitable for being entered into one's graphic calculator, which can then calculate the best-fit line by minimizing the sum of the squares of the residuals (the differences between actual values of Y and those given by the model). This immediately gives rise to the question of whether this results in a different model from that which would have been obtained by minimizing the sum of the squares of the residuals of the original y-values. Note that the theoretical basis for this minimization process relies (amongst other things) on the values of the independent variable being regularly spaced. If that was the case for the original data, it won't be true for the corresponding data after the logarithmic function has been applied.
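Both points can be tried out numerically. The sketch below uses made-up measurements (roughly y = 2x^1.5, with the readings at small x deliberately too high) and hypothetical helper names. It first refits the line with the Y-axis moved to the middle of the data, then compares the least-squares fit in log space with a fit that minimizes the squared residuals of the original y-values. The second fit uses a simple grid search over m, which is only one of several ways to do it.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up measurements: about 2 * x**1.5, but too high at small x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
factors = [1.5, 1.3, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0]
ys = [2.0 * x ** 1.5 * f for x, f in zip(xs, factors)]

X = [math.log10(x) for x in xs]
Y = [math.log10(y) for y in ys]

# Point 1: moving the Y-axis to the mean of X leaves the slope unchanged,
# and the intercept becomes the mean of Y, well inside the data.
mean_X = sum(X) / len(X)
m_log, C_at_zero = fit_line(X, Y)
m_centered, C_at_mean = fit_line([v - mean_X for v in X], Y)

# Point 2: minimize the squared residuals of the original y-values instead.
def sse_y(c, m):
    """Sum of squared residuals in y for the model y = c * x**m."""
    return sum((y - c * x ** m) ** 2 for x, y in zip(xs, ys))

def best_c(m):
    """For a fixed m, the least-squares value of c has a closed form."""
    return sum(y * x ** m for x, y in zip(xs, ys)) / sum(x ** (2 * m) for x in xs)

c_log = 10 ** C_at_zero
candidates = [m_log + k * 0.001 for k in range(-1000, 1001)]  # grid around m_log
m_direct = min(candidates, key=lambda m: sse_y(best_c(m), m))
c_direct = best_c(m_direct)

print(f"log-space fit: m = {m_log:.3f}, c = {c_log:.3f}, SSE in y = {sse_y(c_log, m_log):.3f}")
print(f"direct fit:    m = {m_direct:.3f}, c = {c_direct:.3f}, SSE in y = {sse_y(c_direct, m_direct):.3f}")
```

The log-space fit treats equal ratios as equal errors, so the high readings at small x pull its slope down noticeably. The direct fit is dominated by the large y-values and largely ignores those points. Neither is automatically the right one; which is appropriate depends on how the measurement errors behave.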

Can you please back up the statement regarding the theory with an explanation or a reference?

If values of x (and hence X) can be predetermined and corresponding values of y (and hence Y) determined by experiment, it is conventional (subject to certain conditions) to minimize the sum of the squares of the Y residuals. In addition, it is conventional to choose X values that are uniformly spaced (or approximately so) within the applicable range.

Different values for m and C are obtained if Y = mX + C is rearranged (assuming m is not zero) as X = Y/m - C/m, Y is treated as the independent variable, and m and C are chosen so as to minimize the sum of the squares of the X residuals (unless the residuals can all be zero). Note that if both X and Y are experimentally measured, neither procedure can be regarded as best.

These conventions are easily found in school textbooks and examination questions, so I assume they can be theoretically justified, but I am not a statistician and don't have a detailed explanation or a reference to one. The Gauss-Markov theorem (which states that in a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimators of the coefficients are the least-squares estimators) is relevant, but is not a complete explanation.

A further convention is to view the (X,Y) points on a scatter diagram, so as to notice their overall pattern and spot any outliers that might be due to factors such as faulty recording of data or bad experimental procedures. In addition, one can try to determine whether the conditions for the Gauss-Markov theorem apply. If they don't, it might be appropriate to minimize a weighted sum of the squares of the residuals. A reader who is really interested in such matters could look for further information about the Gauss-Markov theorem or about non-linear regression.
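The difference between minimizing the Y residuals and minimizing the X residuals can be seen in a small sketch. The six (X, Y) points are made up for illustration. Both fitted lines pass through the point of means, but their slopes differ, and the ratio of the two slopes equals the squared correlation coefficient.

```python
# Made-up scattered points (illustrative only).
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [1.2, 1.9, 3.4, 3.6, 5.3, 5.6]

n = len(X)
mean_X = sum(X) / n
mean_Y = sum(Y) / n
sxx = sum((x - mean_X) ** 2 for x in X)
syy = sum((y - mean_Y) ** 2 for y in Y)
sxy = sum((x - mean_X) * (y - mean_Y) for x, y in zip(X, Y))

# Regress Y on X: minimize squared Y residuals of Y = m*X + C.
slope_y_on_x = sxy / sxx

# Regress X on Y: minimize squared X residuals of X = a*Y + b, with a = sxy/syy.
# Rearranged into the form Y = m*X + C, that line has slope 1/a = syy/sxy.
slope_x_on_y = syy / sxy

# Squared correlation coefficient.
r_squared = sxy ** 2 / (sxx * syy)

print(f"slope from Y-on-X fit: {slope_y_on_x:.4f}")
print(f"slope from X-on-Y fit: {slope_x_on_y:.4f}")
print(f"ratio of slopes: {slope_y_on_x / slope_x_on_y:.4f}, r^2: {r_squared:.4f}")
```

The two slopes agree only when the points lie exactly on a line, which matches the remark above that the residuals would then all be zero.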


See http://mathforum.org/library/drmath/view/55520.html, http://mathforum.org/library/drmath/sets/high_logs.html


CategoryMath


Last edited August 19, 2006