Talk:Geostatistics

From Wikipedia, the free encyclopedia

Please start new discussions at the bottom.

"Geostatistics is a fundamentally flawed …"

Geostatistics is a fundamentally flawed variant of classical statistics because it violates the requirement of functional independence and ignores the concept of degrees of freedom. It is a scientific fact that each distance-weighted average has its own variance. Changing its name to "kriged estimate" does not alter the fact that each kriged estimate, too, has its own variance. In addition, Fisher's F-test for spatial dependence cannot be applied because kriging variances and covariances of sets of kriged estimates are simply voodoo variances. —Preceding unsigned comment added by 66.183.14.73 (talk) 22:12, December 14, 2003

Could the person who wrote these comments identify himself or herself? Michael Hardy 00:12, 15 Dec 2003 (UTC)

Geostatistics is a fundamentally flawed variant of mathematical statistics because it violates the requirement of functional independence and ignores the concept of degrees of freedom. It is a scientific fact that each distance-weighted average had its own variance before it became a kriged estimate. Its rebirth as an honorific kriged estimate does not make its variance vanish without a trace. In addition, Fisher's F-test for spatial dependence cannot be applied not only because kriging variances and covariances of sets of kriged estimates are pseudo variances and covariances but even more so because sets of kriged estimates give zero degrees of freedom. The preceding is a revision of the first comment above by User:JanWMerks: JanWMerks on 22:23, March 26, 2006 Paul August 17:39, 1 July 2006 (UTC)

I think more needs to be said than what is said above, before it can be taken seriously by anyone except those knowledgeable in the theory of geostatistics. Say enough, for example, so that anyone with a PhD in statistics can understand it. Then I might help rephrase it for a broader audience than that and incorporate it into the article. Michael Hardy 22:27, 29 March 2006 (UTC)

The comment at the beginning of this discussion is flawed, in particular because of its outright assumption that geostatistics ignores the "requirements" of statistical methodology. I would only state that this user's comments in the article do not provide any discussion of views or perspective, and only serve to damage the goal for which they are stated. Geostatistics is not perfect, but remains a viable route for the study of complex phenomena over vast areas.

In BC we use geostatistics to predict the spread of the pine beetle infestation. While results may exhibit a range of predictions, it is understood that a tremendous number of variables are introduced into every equation; I might also state that any failure simply improves our methods of survey and analysis. This field is important because it serves to quantify attributes and behaviour that are extremely difficult to predict without an unimaginable amount of resources dedicated to surveys and other forms of data collection.

Perhaps those who inquire about geostatistics do not want a diatribe, provided by a disgruntled statistician, as to the flaws behind the science. For those individuals who do wish to read about these issues, we should provide a subheading. Finally, I state that geostatistics is a branch of mathematical statistics which will increase our understanding of all sorts of spatially varied phenomena, from weather prediction to urban systems development; so perhaps it is the concept of functional independence that is flawed, and not any minor infraction of that rule.

Bohunk 20:47, 1 April 2006 (UTC)

Geostatistics is a fundamentally flawed variant of mathematical statistics because it violates the requirement of functional independence and ignores the concept of degrees of freedom. It is a scientific fact that each distance-weighted average had its own variance before it became a kriged estimate. Its rebirth as an honorific kriged estimate does not make its variance vanish without a trace. As a result, Fisher's F-test for spatial dependence cannot be applied because the variance of a subset of some infinite set of distance-weighted averages is as invalid a measure for variability, precision and risk as its covariance is for spatial dependence. Not surprisingly, because an infinite set of distance-weighted averages gives exactly zero degrees of freedom.--Iconoclast 20:42, 16 April 2006 (UTC) The preceding is a revision of the first comment above by User:JanWMerks: JanWMerks on 20:43, April 16, 2006 Paul August 18:10, 1 July 2006 (UTC)
Yeah, this is completely wrong. I just completed a Quantitative Methods in Geography course and there definitely was like 2 chapters about degrees of freedom. Sorry. —Preceding unsigned comment added by 130.85.149.221 (talk) 20:50, June 7, 2006
===========================================================

The person who made the rude remarks may not have expressed himself very well, and I agree with all of you about the lack of references. A lack of references always invalidates criticism. There was also a bit of confusion about the difference between empirical measurement of spatial patterns (what most geostatistics is designed for) and statistical tests of the patterns. Geostatistics is perfectly valid for empirical measurement; it is when we want to make statistical inferences about some of these patterns that we run into a few problems.

I have just begun to worry about this myself (I study geographic variation in animals and genes). However I can explain the main problem and give some references to get around them. The critical references (for how to avoid it!) are:

Benjamini, Y. and D. Yekutieli. 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29, 1165-1188.

Efron, B. 2007. Size, power and false discovery rates. Annals of Statistics 35, 1351-1377.

Romano, J.P. and M. Wolf. 2007. Control of generalized error rates in multiple testing. Annals of Statistics 35, 1378-1408.

Li, Y., N. Wang, M. Hong, N.D. Turner, J.R. Lupton and R.J. Carroll. 2007. Nonparametric estimation of correlation functions in longitudinal and spatial data, with application to colon carcinogenesis experiments. Annals of Statistics 35, 1608-1643.

Storey, J.D. 2002. A direct approach to false discovery rates. Journal of the Royal Statistical Society 64, 479-498.

Briefly, unlike other kinds of data, geographical (or other spatial) data are not independent because what happens at one location is likely to happen at adjacent locations. This can happen for a variety of reasons, including causes which operate over a larger spatial scale than a single sampling unit (e.g. operating over several of your sample locations), and effects which spread in space. This results in a high correlation between sample values close together, but a decreasing correlation with distance. There is a large literature on measuring this, under the term 'spatial autocorrelation' as well as under the term 'geostatistics'. Statistically it is a nuisance because most statistical tests assume each sample is independent--manifestly not true for spatial data. The references I gave above get at this problem. I also like the Bayesian approach that one of you suggested, and this may be able to avoid the spatial autocorrelation problem with suitable design. We need to find a good statistician to describe for us the exact way to do this! As I said, I'm just learning this myself, but I thought I would share the references with you anyway.
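To make "high correlation between sample values close together" concrete, here is a minimal sketch of Moran's I, one common measure of spatial autocorrelation. The function name `morans_i`, the binary distance-band weights (two locations count as neighbours if they lie within `max_dist` of each other) and the toy transect data are all assumptions chosen for illustration; they do not come from the references above.

```python
import math

def morans_i(coords, values, max_dist):
    """Moran's I with binary distance-band weights (an illustrative choice).

    coords   : list of (x, y) sample locations
    values   : list of measurements, one per location
    max_dist : two distinct locations are neighbours if their distance <= max_dist
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross = 0.0       # sum of w_ij * dev_i * dev_j over neighbouring pairs
    weight_sum = 0.0  # sum of all w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a location is never its own neighbour
            if math.dist(coords[i], coords[j]) <= max_dist:
                weight_sum += 1.0
                cross += dev[i] * dev[j]

    total = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / total)

# Ten equally spaced locations along a transect (hypothetical data).
coords = [(float(k), 0.0) for k in range(10)]
trend = [float(k) for k in range(10)]             # neighbours have similar values
alternating = [float(k % 2) for k in range(10)]   # neighbours always differ

i_trend = morans_i(coords, trend, max_dist=1.0)
i_alt = morans_i(coords, alternating, max_dist=1.0)
print(i_trend, i_alt)
```

A smooth trend gives a clearly positive value and a strictly alternating pattern gives a negative one, so the statistic separates "neighbours look alike" from "neighbours differ". Testing whether an observed value is significant is a separate step and is where the independence problem described above comes in.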

Oddly enough the first people to attack this seem to be medical researchers because phenomena in one body or tissue location can affect adjacent cells and tissues. So don't be put off by the non-standard applications in the above references! Chlamydera 17:48, 30 November 2007 (UTC)

"To user(s) who have been editing this page"

To user(s) who have been editing this page recently: Please be civil in the editing of wikis. If you don't agree with the principles of geostats that's fine, but wikipedia is not the place for direct attacks. Thank you.

S lyster 02:35, 29 March 2006 (UTC)

"POV"

The first three paragraphs of this article were highly POV, most likely Original research, and I removed them. Geostatistical methods are not only widely accepted by the scientific community, they are required by regulatory agencies, at least in the U.S. We use them every day where I work. Antandrus (talk) 03:57, 1 April 2006 (UTC)

I edited the argument against geostatistics; now the article is accessible without having to read through a thousand words about how shitty it is. Could someone please add some info regarding the history of this discipline, and possibly expand on its socio-political and geoscience roles, as these are increasingly evident in a wide range of applications and scientific disciplines?
Potential areas of discussion:
Weather analysis, urban planning, invasive species analysis, geological surveys, water sustainability applications, economic development and planning, military strategy, psychological and criminal profiling, socio-economic disparities across local/regional/global scales, epidemiology, anthropology, historical applications. There are a lot of applications, so you think of some more.
If confused, remember the golden rule: KISS - Keep It Simple, Stupid.
Bohunk 20:47, 1 April 2006 (UTC)

I also removed the POV paragraphs that were re-inserted. Basic kriging is a simple Bayesian technique, with a Gaussian Process prior (encapsulated in the kernel function) and a Gaussian posterior. See the elementary tutorial at . I don't understand the objections at all --- the estimate of the posterior does have a covariance, see equation (21) of the tutorial.

Now, I don't understand a lot of the elaborations in geostatistics. I don't know what pseudo-kriging is, it may be wrong. But, it seems excessive to dismiss an entire field because one algorithm may be incorrect.

hike395 21:28, 2 April 2006 (UTC)

I want to keep it simple because I like to understand my question. Did or didn't each distance-weighted average have its own variance before it metamorphosed into a kriged estimate or kriged estimator in the 1960s, when geostatistics was hailed as a new science? The rest are details! --Iconoclast 16:14, 3 April 2006 (UTC)

I believe that none of us understand your objection or your question. Let's break it down into little steps for us non-geostatisticians, without the jargon, OK? Let's assume that we're estimating some quantity (say, a mineral concentration), as a function of spatial location on the surface of the Earth. As I understand the steps of kriging (cast into a Bayesian framework):

  1. Start with a Gaussian Process prior. This means that, for any N points on the surface, there is a multivariate normal distribution that is the prior distribution of the mineral concentration. The covariance of this distribution is the kernel function that is used in the kriging. In geostats terms, the kernel function is the semivariogram, although empirical semivariograms are not guaranteed to be positive semi-definite.
  2. Now, we measure the concentration at N-1 of these points. These measurements themselves are assumed to have a normal distribution. The standard kriging setup seems to be homoscedastic (equal variance for each measurement), but that isn't required.
  3. Finally, we can compute the posterior estimate of the concentration at the Nth point on the surface. This point can have arbitrary position. The posterior estimate is in the form of a normal distribution, also. The mean and variance of the posterior at any point can be computed from von Mises' formula --- it's simply an (N-1)x(N-1) inversion of the kernel matrix evaluated at the N-1 measurement points.
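The three steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the squared-exponential kernel, the zero prior mean (which corresponds to the simple kriging case with a known mean), the small `noise_var`, and the names `sq_exp_kernel` and `gp_posterior` are hypothetical choices, not taken from the tutorial or from any geostatistics package.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, sill=1.0):
    """Squared-exponential covariance between two sets of 2-D locations.

    Assumed kernel for illustration; any positive semi-definite
    covariance function could be substituted.
    """
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sill * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=1e-6, **kernel_args):
    """Posterior mean and variance at x_new, assuming a zero prior mean.

    Step 1: the prior covariance comes from the kernel.
    Step 2: measurements y_obs at x_obs carry equal (homoscedastic) noise.
    Step 3: condition the joint normal distribution on the measurements.
    """
    k_oo = sq_exp_kernel(x_obs, x_obs, **kernel_args)
    k_oo = k_oo + noise_var * np.eye(len(x_obs))
    k_no = sq_exp_kernel(x_new, x_obs, **kernel_args)
    k_nn = sq_exp_kernel(x_new, x_new, **kernel_args)

    # Solve the linear system rather than forming the explicit inverse
    # of the kernel matrix at the measurement points.
    weights = np.linalg.solve(k_oo, k_no.T).T
    mean = weights @ y_obs
    cov = k_nn - weights @ k_no.T
    return mean, np.diag(cov)

# Three hypothetical measurement locations and concentrations.
x_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y_obs = np.array([1.0, 2.0, 0.5])

# Predict at an unmeasured point and at one of the measured points.
x_new = np.array([[0.5, 0.5], [0.0, 0.0]])
mean, var = gp_posterior(x_obs, y_obs, x_new)
print(mean, var)
```

The posterior mean at each new point is a weighted combination of the measurements, and each prediction comes with its own posterior variance: close to zero at a measured location, larger away from the data.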

Please explain: which of these steps do you think is incorrect? These are the steps outlined in the Gaussian Process tutorial, corresponding to basic kriging. Or, are you objecting to some other geostatistical method, other than basic kriging?

hike395 04:50, 4 April 2006 (UTC)

I think I finally understand the controversy here --- I should have known, many heated controversies in statistics often boil down to Bayesian vs. frequentist assumptions. I tried to clarify this in a paragraph at kriging --- please check to see if I got this right. -- hike395 16:02, 7 April 2006 (UTC)

"I have temporarily attributed …"

All Wikipedians and all Krige's men cannot put the distance-weighted average and its variance together again.

"I have to say that JanWMerks arguments are extremely convoluted"

From RfC

Page refactor

Just one simple question

Cut from article

Clark and the Kriging Game

I'm Over It

Start over

Turning original research into criticism that observes Wikipedia policies

Geostatistics is not "Statistics in Geography" and not "Spatial Statistics"

Moving text to statistical geography

Remodeling of this page

Remodeling

This page seems unfinished
