Talk:Kriging/Archive 1
This is an archive of past discussions about Kriging. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Proposal for revision (OLD ?)
This article gives a brief overview of what Kriging is and describes it using many links to other (complex) entities. I would like to make this article more self-contained and give some insight into the ideas behind Kriging and what its pros and cons are.
I propose the following sections:
- Idea(s) behind Kriging
- Does each kriged estimate have its own variance?
- Simple Kriging
- Best Linear Unbiased Estimator
- Pros and Cons
- Extensions of Simple Kriging
- Software
-- Scheidtm 19:59, 15 March 2006 (UTC)
Confusing: "lost the correspondingly infinite set of variances"
I marked this article {{confusing}} because of the phrase, "lost the correspondingly infinite set of variances" in the introductory paragraph, which is not well-defined before it is used, nor wiki- or hyper-linked. I suggest that the first three paragraphs need a complete re-write as a better introduction, with less jargon and bias (2nd paragraph, hyperlinked to Geophys. web site, shows bias.) --James S. 19:16, 2 April 2006 (UTC)
I moved the two troubled paragraphs to "History" and added a {{SectPOV}} tag in front of the hyperlink. --James S. 19:20, 2 April 2006 (UTC)
- The two paragraphs seem to be pushing a POV that geostatistics is some sort of hoax. This is unlikely, considering that statisticians (other than geostatisticians) use Gaussian Process Regression, and have shown that it is a Bayesian technique (where the kernel function describes a Gaussian Process Prior over functions).
- I saved the list of methods named after Krige, but deleted the POV. -- hike395 21:16, 2 April 2006 (UTC)
- I think I finally understand Dr. Merks' objection --- in the Bayesian analysis, spatial dependence is an assumption, while Jan is advocating performing statistical tests on the spatial dependence before blindly using kriging. The latter is a frequentist viewpoint (as I understand it). I did some quick research on what statistical tests are commonly used in spatial statistics, found three, and cited them. -- hike395 16:00, 7 April 2006 (UTC)
In mathematical statistics, one-to-one correspondence between central values (the arithmetic mean and various weighted averages) and their variances is sine qua non. In geostatistics, however, one-to-one correspondence between distance-weighted averages-cum-kriged estimates and their variances is null and void. In other words, the infinite set of variances was lost on Krige's watch and the variance of the SINGLE distance-weighted average was replaced with the perfectly smoothed pseudo kriging variance of a SUBSET of some infinite set of kriged estimates! Geostatistics is a scientific fraud because spatial dependence between (temporally or in situ ) ordered sets is assumed! Remember Bre-X. That's all!--Iconoclast 00:53, 8 April 2006 (UTC)
- I believe I addressed your objections in a way that is NPOV and verifiable --- some people assume spatial dependence, other people test for it. Citations for both viewpoints are included in the article. -- hike395 21:29, 8 April 2006 (UTC)
latest revert
Two problems with the article, that I reverted:
- The previous version claimed that Krige knew certain facts. This is very difficult to verify: a high standard is needed. Do we have any citations to show what Krige was thinking of?
- The paragraph about Fisher's F-test. Again, this seems like original research. I can only find material about applying that particular test from Dr. Merks himself (his web site, comments at ai-geostats, comments at amazon.com) and nowhere else. Again, if this is supported in the common literature, I'd be happy to add it to the paragraph that lists common statistical tests applied to spatial data.
-- hike395 21:37, 8 April 2006 (UTC)
My two cents
I'm going to chime in here: while I appreciate Mr. Merks's contributions, I need to emphasise that our core policies include no original research, and in this case that means not including information that cannot be verified by reference to published sources written by someone other than the contributing author. Kriging is accepted both by the scientific community and by policy makers worldwide. Continued insertion of the disputed material is in violation of our POV policy as well as NOR and V. Thanks! Antandrus (talk) 18:27, 10 April 2006 (UTC)
Fact or Fiction
Sir Ronald A Fisher was knighted in 1953 because of his work on analysis of variance, the essence of which is his F-test. It was Snedecor who called it Fisher's F-test. One might suggest that Fisher's F-test does not qualify as "original research" under WP's core policies. I don't know what Krige "knew" but what I do know is he didn't know each and every distance-weighted average had its own variance long before Fisher was knighted. It would be a lot worse if Krige did know about one-to-one correspondence between distance-weighted averages and variances but decided to ignore it. Neither do I know if Matheron and his students knew that its rebirth as an honorific kriged estimate would make its variance vanish without leaving a trace in geostatistical literature. In fact, I know very little because prominent geostatisticians would rather assume, krige, smooth and rig the rules of mathematical statistics than respond to the simple question: Does or doesn't each kriged estimate have its own variance? What a pity that this question violates WP's core NPOV policy! So why not play Clark and the Kriging Game rather than waffle with weasel words? By the way, the ordered set of data in the above figure does not display a significant degree of spatial dependence. Wikipedians ought to check that out! --Iconoclast 16:17, 12 April 2006 (UTC)
- The description of the F-test is not original research, talking about Ronald A Fisher may not be. However, you yourself have said that the application of the F-test to spatial dependency is not generally accepted in geology. I can't find any other references to the use of the F-test applied to spatial dependency, other than your own work. Therefore, the application of the F-test is original research, according to the WP rules.
- Asking questions on Talk pages does not violate NPOV. WP:NPOV talks about the phrasing of the content of an article. If you say "Kriging is clearly invalid, because of blah blah blah", that's a POV phrasing. It's like journalism: you have to use "he said/she said" language. An NPOV phrasing, for example, would be:
- Kriging is a commonly applied technique to model distribution of ore.[1] However, some practitioners question the assumption that spatial dependence follows a stochastic process.[2] Other practitioners recommend using statistical tests to test the assumption of spatial dependency.[3][4][5]
- See what I mean? The article doesn't say that the field is invalid (that's a particular Point of View). Perhaps it should say that kriging is commonly used, but some people question the assumptions and/or use statistical tests to check the assumptions.
- -- hike395 09:43, 13 April 2006 (UTC)
Comments
- Sounds good, but isn't the Best Linear Unbiased Estimator a consequence of the Gauss-Markov theorem? Do you need a whole section to explain it? -- hike395 02:22, 16 March 2006 (UTC)
- Hmmm, I am not that familiar with Gaussian processes. But "Locality" would be a good substitute anyway. -- Scheidtm 21:16, 16 March 2006 (UTC)
Can any Wikipedian tell me whether or not each distance-weighted average had its own variance before it was reborn as variance-deprived but honorific kriged estimate? That’s the crux of the matter! The rest are details! Please be concise and succinct for a change because I've been fed circular logic and opaque dogma by the geostatistical fraternity since the early 1990s.
I know spatial dependence may be assumed because Journel said so in 1992. The original reference behind Journel’s cryptic remark (“a decision rather”) ought to be posted under References where the first three seminal textbooks on geostatistical fiction should be similarly honored. Another work of sublime interest is Armstrong and Champigny's A Study of Kriging Small Blocks, in which the authors caution against oversmoothing. Apparently, the requirement of functional independence can be violated a little but not a lot. What I enjoy more than most people is fuzzy logic. Invoking WP’s vanity policy when authors refer to their own reviewed and published works reflects a subtle sense of humor.--Iconoclast 17:45, 13 April 2006 (UTC)
- We're not invoking the vanity policy, but WP:NOR. You have read it, yes?
- With a bit of reflection, you will see that it is impossible to write a collaborative encyclopedia, one which anyone can edit, without specifically disallowing original research from each contributor. By forcing all editors to provide verifiable sources, attributable to others, not themselves, and to cite them, we have in place a mechanism which avoids endless, frustrating, back-and-forth edit wars.
- Can you provide a source for your assertions, which is not written by yourself? That is the crux of the matter. Antandrus (talk) 00:45, 14 April 2006 (UTC)
A question about the variance of "samples with different weights" was posed on AI-Geostats Open Website on October 7, 2005, and the formula was posted on October 10, 2005. The webmaster didn't post the entire exchange in which several subscribers took part. Plain logic dictates that this variance formula applies not only to area, count, density, length, mass and volume-weighted averages but also to distance-weighted averages aka kriged estimates. I would have been aware if some geostatistical scholar had issued an exclusion edict for kriged estimates. However, tenets tend to change fast when common sense threatens geostatistics. Journel postulated that spatial dependence may be assumed "unless proven otherwise" but was troubled that somebody would apply "Fischerian" [sic!] statistics to prove otherwise. Please let me know if more references are required. --Iconoclast 16:38, 14 April 2006 (UTC)
comments of the author of the figure
Dear all,
I think that the last version of this article has introduced confusion and inexactness. For instance, in the first paragraph, it is claimed that Krige developed Kriging. This is false. Matheron did, in the 60s, using Krige's ideas published in his MSc report.
About the controversy, I would say that it is irrelevant. I do not think that this article should be the place to discuss the validity of modeling by random processes.
The references are irrelevant too. Good references are Matheron's published work, Cressie, Chiles and Delfiner, Wackernagel and Stein.
Finally, I would say that it is an error to think that Kriging can only be used for spatial modeling. There is no theoretical restriction against considering other types of phenomena depending on one, two or more factors.
Belated hello to the Author of the Figure, Please let the readers of this page know whether it makes sense to replace the variance of the single-distance-weighted average with the kriging variance of a set of kriged estimates? Is it possible that this practice violates the requirement of functional independence and ignores the concept of degrees of freedom? Does the data set in your figure display a significant degree of spatial dependence? Thanks for your response! JWM --Iconoclast 22:30, 10 July 2006 (UTC)
The Author of the Figure should peruse Matheron's introduction to Journel and Huijbregts's Mining Geostatistics to find out who coined the term geostatistics and why! It would be useful if the primary data for the Figure were posted to allow the application of a proper test for spatial dependence. JWM. --Iconoclast 18:30, 3 August 2006 (UTC)
--
Maybe we do not agree on what Kriging is exactly. Kriging starts with the hypothesis that the observations (the data) are sample values of a random process with known or unknown mean m(x) and covariance k(x,y). Note that the covariance need not be stationary. Then, Kriging is just a linear predictor. Nothing more. The practical question is: when can we make the assumption that the observations are sample values of a random process? The answer is, in my opinion, that it can always be done. A random process is just a model, and statistics can tell us whether the chosen model is probable or not.
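To make the "just a linear predictor" remark concrete, here is a minimal sketch of the simplest case (simple kriging, i.e. a known constant mean). It is only an illustration under stated assumptions: the covariance `gaussian_cov`, its parameters `sill` and `length`, and the helper `solve` are hypothetical choices made for this example, not something taken from the article or from any particular software package.

```python
import math

def gaussian_cov(x, y, sill=1.0, length=1.0):
    """Hypothetical covariance k(x, y); any valid covariance model could be substituted."""
    return sill * math.exp(-((x - y) / length) ** 2)

def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def simple_kriging(xs, zs, x0, mean=0.0, cov=gaussian_cov):
    """Predict z(x0) as a linear combination of the data, assuming a known mean."""
    K = [[cov(xi, xj) for xj in xs] for xi in xs]  # data-to-data covariances
    k0 = [cov(xi, x0) for xi in xs]                # data-to-target covariances
    w = solve(K, k0)                               # kriging weights
    pred = mean + sum(wi * (zi - mean) for wi, zi in zip(w, zs))
    var = cov(x0, x0) - sum(wi * ki for wi, ki in zip(w, k0))
    return pred, var

# Example data (made up for illustration)
xs = [0.0, 1.0, 2.5, 4.0]
zs = [1.0, 0.5, -0.3, 0.8]
prediction, variance = simple_kriging(xs, zs, 1.7, mean=0.2)
```

With no nugget term, the predictor reproduces the data at the observation points and its variance there is zero; far from all data it falls back to the assumed mean. In practice one would hand the linear system to a library routine such as `numpy.linalg.solve` rather than a hand-written elimination.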
Further revision proposal by Scheidtm
Kriging is a regression technique used in geostatistics to approximate or interpolate data. The theory of Kriging was developed from the seminal work of its inventor, Danie G. Krige, by the French mathematician Georges Matheron in the early sixties. In the statistical community, it is also known as Gaussian process regression. Kriging is also a reproducing kernel method (like splines and support vector machines).
Figure: example of one-dimensional data interpolation by Kriging, with confidence intervals
Idea Behind Kriging
As Kriging was developed in mining, it will be explained in that setting here. It can be, and is, used in other contexts too. Please keep this in mind when reading this article.
Kriging is often used to predict the distribution of some interesting quantity in a geological survey. For example, one wants to determine the gold concentration in a mine field from a limited number of exploratory diggings.
Each of the results could be regarded as a single draw from an unknown random distribution, whose form is determined by the geological processes moving and layering the material in the neighbourhood of the place of mining. But as different places would have different geological neighbourhoods and histories, the random distributions would also (slightly) differ, so that a general prediction of ore content would be difficult, because one does not know the differences between these random distributions.
Kriging escapes from these difficulties by using the prior knowledge that these random distributions differ only slightly. It does this by treating all measurements as one draw from a single probability distribution, which is then called a random process or, better, a random field. The additional assumptions made about this process encode this prior knowledge, and allow one not only to predict the wanted quantity but also to give confidence intervals for the predictions.
Why Kriging?, why geostatistics? (a suggestion for an introduction.) I came to geostatistics from geophysics where Nyquist rules. In other words where it is assumed that you don't know anything about 'stuff' that is going on below the sample rate/distance. This is a useless approach when dealing with mining data which may consist of a scattering of boreholes and your interest is mostly in what is happening below the Nyquist sample interval. Geostatistics and Kriging can be considered as a 'best effort' at characterizing poorly sampled information. Prior information is used, for instance, sedimentologists have Walther's 'law' stating that what happens around a borehole (where there is no information) is reflected up and down the borehole in the sediments (where you do have information). So you can use the statistical properties of the vertical borehole to characterize the surrounding area. Prior knowledge may also come from geological analogs. The Nyquist/Kriging divide is manifest in what attempts to characterize the 'nugget effect' i.e. the likelihood of significant deposits being concentrated locally. Nyquist says - drill every meter to find a meter wide body. Krige says use more statistical smarts to make do with what measurements you have.
Simple Kriging
- Give assumptions of simple kriging, develop formulas for prediction and confidence intervals.
- correlation and standard forms (gaussian, exponential, spherical).
- discontinuity at origin (Nugget Effect) => interpolating or smoothing
- differentiability at origin => roughness.
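As a rough illustration of the points in this outline, the three standard forms and a nugget term could be sketched as follows. The parametrisations (in particular the Gaussian form without a factor of 3, and the names `sill`, `rng` and `with_nugget`) are assumptions chosen for this example; textbooks and software packages differ in their conventions.

```python
import math

def gaussian(h, sill=1.0, rng=1.0):
    """Gaussian model: zero slope at the origin, giving very smooth predictions."""
    return sill * math.exp(-(h / rng) ** 2)

def exponential(h, sill=1.0, rng=1.0):
    """Exponential model: linear behaviour at the origin, giving rougher predictions."""
    return sill * math.exp(-abs(h) / rng)

def spherical(h, sill=1.0, rng=1.0):
    """Spherical model: linear at the origin, exactly zero beyond the range."""
    r = abs(h) / rng
    if r >= 1.0:
        return 0.0
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)

def with_nugget(model, nugget):
    """Add a discontinuity at the origin (nugget effect) to a covariance model."""
    def cov(h, **params):
        return model(h, **params) + (nugget if h == 0 else 0.0)
    return cov

# Covariance as a function of separation distance h (illustrative values)
noisy_exponential = with_nugget(exponential, nugget=0.2)
at_zero = noisy_exponential(0.0)        # sill plus nugget
just_off_zero = noisy_exponential(1e-9) # approximately the sill only
```

The jump between `at_zero` and `just_off_zero` is the nugget: with it, the kriging predictor smooths the data instead of interpolating it exactly. The behaviour of each model near h = 0 is what controls the roughness of the predicted surface.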
Best Linear Unbiased Estimator
- Describe features of Kriging
Pros and Cons
- to be developed
Extensions of Simple Kriging
- Describe how assumptions are relaxed, what is predicted by each of the advanced Kriging methods.
Software implementing Kriging
- Give list (does not strive to be exhaustive).
- The Stanford Geostatistical Modeling Software ( S-GeMS )
I agree with Scheidtm's proposed reorganization of this article. However, I think it is clear that we need a better diagram that more clearly illustrates the application of the technique. Would Emmanuel be interested in producing a revised version of Example_krig.png? Matt 02:49, 22 August 2006 (UTC)
Making this page useful - Give sources or get out
The continued resistance of the one "author" here to provide additional citations to back up his beefs has rendered this entry utterly useless. Quit trying to impose your squatter's rights on the discussion and abide by the request or leave it be. Using Wikipedia to direct people to your site is crappy - this is the ONLY page where I've seen this problem persist through such stubborn dogma. Dogma is opinion, not informed, collaborative dissent and disagreement. You are clearly confusing your role here as an "educator" and instead are an impediment (and frankly a pariah in my eyes) to my understanding, since I can't verify what you're saying because you can't be bothered.
This comment additionally applies to all the other connected concepts that you put under the umbrella of your disagreement with kriging (do you really contest variograms and semi-variograms, or just kriging?). Please... GET ON WITH IT, or over it.
209.116.30.220 18:13, 24 July 2006 (UTC)
I'm attempting to do what needs to be done to ensure that scientific integrity and sound science prevail on Wikipedia. I'll post more references if and when required. Wouldn't it be of interest to verify whether the primary data set for the kriging figure displays a significant degree of spatial dependence? You were talking to the undersigned, weren't you? Anonymity is somewhat confusing! JWM. --Iconoclast 16:00, 25 July 2006 (UTC)
- I do not object to the inclusion of a section, 'Controversy', that questions the validity of the statistical technique, based on referenced sources. However, I don't think this article requires 8 references to your own published works (perhaps your user page would be a more appropriate place?). Furthermore, it is my opinion that the opening paragraph of this article should introduce the topic, Kriging, in a manner that is accessible to the encyclopedia reader. Launching straight into a discussion of "what Krige, Matheron and his following did not know in those days" seems to obfuscate rather than elucidate Matt 01:19, 22 August 2006 (UTC)
Sorry, Matt, but I question the validity of the geostatistical technique of assuming spatial dependence, interpolating by kriging, smoothing pseudo kriging variances, and rigging the rules of mathematical statistics. Why not have somebody explain what kriging is really all about? And what about verifying spatial dependence between the ordered set of measured values in the above Figure 1? JWM. --Iconoclast 18:47, 22 August 2006 (UTC)
- Hi Jan, I didn't mean to imply that your contributions to this article are unimportant. However, in my opinion the Kriging article should primarily be aimed at introducing the topic to readers who are unfamiliar with the technique (and possibly with geostatistics in general). It is first required to explain exactly what kriging is, before its shortcomings can be adequately addressed. A prominent and detailed Controversy section serves the purpose of warning the reader to treat the technique with caution, and not to accept its conclusions at face value. --Matt 12:50, 27 August 2006 (UTC)
Could someone include usage in a sentence? I've found this useful on other WP pages that give it at the top when capitalization is a question. Didn't want to screw it up, so I'll let one of the many debating experts here decide whether to include it.
Make Information, not War
I came to the Kriging page in order to understand what kriging is, since I encountered the term in a software package (in non-geostatistical context -- it had to do with interpolating sampled elevation points). I expected to:
- learn how data are interpolated in the kriging method
- find at least one equation defining the method
- learn how kriging compares to other methods of interpolation: linear, quadratic, spline, etc.
- see a diagram of kriged data, preferably compared with diagrams of data interpolated by other means
- learn the relative strengths and shortcomings of this method of interpolation
But I was disappointed in that respect. On the other hand, I do not give a rat's fart about:
- the wickedness of prof. Krige
- the metaphysical issues of having one's own variance
- historical references
- name-calling among prominent geostatisticians
- correct capitalization of the word “kriging”
The only useful information I found was buried halfway down the page and read: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations, where the weights are the unknown regression parameters. The optimality criterion used to arrive at the Kriging system, as mentioned above, is a minimization of the error variance in the least-squares sense.” However, and very regrettably, the alluded-to set of linear equations was not given anywhere on the page.
Does anyone here have the discipline to adequately explain and illustrate the term in question before launching into controversies and edit wars? The article as it stands now consists of a lot of obscure discussion of abstruse side-issues, with regard to a main topic that is not even decently summarized. I do realize that the editors are all expert geostatisticians, who know kriging like the back of their hand; but most encyclopedia readers have no such prior knowledge, and expect to find it in the article. Respectfully yours, Freederick 15:16, 6 November 2006 (UTC)
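For reference, the system of linear equations alluded to in the quoted passage is usually written as follows for ordinary kriging. This is the standard textbook form, stated here as an aid and not copied from the article; the symbols (data locations x_1, ..., x_n, target location x_0, covariance function C, weights λ_i, Lagrange multiplier μ) are notation assumed for this sketch.

```latex
% Estimate: a weighted linear combination of the data
\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i \, Z(x_i)

% Kriging system: n + 1 linear equations in the unknowns \lambda_1, \dots, \lambda_n, \mu
\sum_{j=1}^{n} \lambda_j \, C(x_i, x_j) + \mu = C(x_i, x_0), \qquad i = 1, \dots, n
\sum_{j=1}^{n} \lambda_j = 1

% Error variance attained at the solution (the kriging variance)
\sigma^2_{OK}(x_0) = C(x_0, x_0) - \sum_{i=1}^{n} \lambda_i \, C(x_i, x_0) - \mu
```

The first n equations come from setting to zero the derivative of the error variance with respect to each weight; the last equation is the unbiasedness constraint, enforced through the multiplier μ.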
Ongoing discussion with Merksmatrix about the NPOV
Dear Merksmatrix
First, I think you do not understand very well what linear prediction is and what Kriging means. In my opinion, you tend to confuse the data and the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be that linear? Anyway, if people want to use Kriging, why do you want to prevent them?
Why do you persist in using Wikipedia to spread your own point of view, against the NPOV?
What I do understand is that assuming continued mineralization between boreholes does not make sense. You can do whatever you like but you ought to study Matheron's seminal work before you assume continuity between measured values in ordered sets, interpolate by kriging, select the least biased and most precise subset of some infinite set of kriged estimates, smooth its pseudo kriging variance to perfection and rig the rules of classical statistics in the process. Please do sign your message!!!--Merksmatrix 19:40, 8 February 2007 (UTC)
- First question: do you acknowledge that you are breaking the NPOV?
In my opinion, you are breaking the NPOV, for the very reason that you are claiming that Kriging is not statistically well-founded (which, in my opinion, is not an interesting point of view).
Whether or not you acknowledge it, I propose that the article be reverted to a neutral form until a solution is settled. Any revert without justification may be considered vandalism. If you want to modify the article, do not break the NPOV. In particular, stop using serpentine tactics such as cluttering the article with lingo that only specialists can understand.
- Second question: do you really think that Matheron's seminal work is important for explaining what Kriging is?
I would like to point out that I have read some of his work. I personally find a lot of his notes quite useless and, besides, very difficult to read (this is my point of view). What is important for someone who wants to know about Kriging is to understand what Kriging is and why it is used.
- Third question: do you really consider yourself a scientist?
In science, if someone finds something unsuited to his purpose, nobody will prevent this person from using something else. If you have something better to propose, publish it! Be a scientist, not a religionist.
Antro5 18:16, 9 February 2007 (UTC)
Answer to first question: I’m assisting the one and only person who is trying to find some sort of missing link between the theory of kriging and the practice of polynomial curve fitting by giving references to the literature.
Answer to second question: The objective of your exercise is to provide a historical perspective of polynomial curve fitting. In your opinion, the theory of kriging plays a role in the practice of polynomial curve fitting. Agterberg, Matheron, Koch, Link, and scores of other scholars do not agree with you. Surely, you would not want to push your own view on those who want to know what kriging is all about, would you? Matheron dabbled in classical statistics before drifting into geostatistics. His work remains relevant because it shows the earliest contortions of the most seminal of geostatistically gifted minds.
Answer to third question: If you really want to know what the united geostatocracy and the krigeologists of the world think about my work, you should visit my website.--Merksmatrix 23:30, 9 February 2007 (UTC)
Your answer shows precisely what is wrong with your posture here on Wikipedia. You want to defend an opinion about Kriging, which is your opinion, by the way. You do not understand that you cannot do that on Wikipedia, because of the NPOV. You want to make a link to your own website. This is not possible. You are not an institution, you do not refer to well-established publicly available information, and your website is not neutral. Please read the NPOV.
Besides, you have reverted the article without justification, and prior to any discussion. Your attitude does not respect fairness and amounts to POV pushing (see http://en.wikipedia.org/wiki/WP:POVPUSH).
I propose to revert, once again, to a neutral form (ie. without your POV). If you do not agree with the content, please, do not revert. Explain precisely the changes you intend to make and give justifications about the NPOV.
You can, if you want, issue a warning (see WP:TD). But please give reasons.
About your answers. I understand you do not agree with the usage of Kriging in Geostatistics. This article should describe what Kriging is. I do not see the point of discussing on Wikipedia whether it is moral or not to use Kriging in Geostatistics. I am not a Geostatistician and I do not want to quarrel with you on this point. There is already a section in the article dealing with this point. What is very problematic, in my opinion, is that you want to clutter the first paragraph of the article with technical assertions, with an unfair purpose. The other problem is the reference to your website.
A solution? Since your POV is not about Kriging itself but about its usage in Geostatistics, maybe you should discuss your view in another article.
Antro5 12:07, 10 February 2007 (UTC)
Where is the meat?
Quoting from the article: “The Kriging estimate is a weighted linear combination of the data. The weights that are assigned to each known datum are determined by solving the Kriging system of linear equations,...”
Quoting from the last (anonymous) edit on the Talk Page: “Kriging is very computationally practical and its implementation is easy, since it consists in solving a system of linear equations.”
Where are the goddamn equations? Are they legendary? IIUC, they should be the main point of the article, which is well-nigh useless without them. Freederick 19:45, 2 December 2006 (UTC)
- Maybe you can read Portuguese?
- No. Freederick 22:45, 18 January 2007 (UTC)
References to Matheronian voodoo statistics ought not to be removed!--Merksmatrix 22:21, 3 February 2007 (UTC)
- Duh? Is that slogan somehow related to my request? What I was asking is that some critical data be added, not removed. Voodoo will do, for lack of better, as long as I can write a program realistically interpolating non-gridded elevation values based on that voodoo. I'm an engineer, not a mathematician; I'm comfortable working with empirical equations. Freederick 13:45, 2 March 2007 (UTC)
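For the use case described above (interpolating non-gridded elevation values), a minimal sketch of ordinary kriging might look like the following. Everything here is an assumption made for illustration: the exponential covariance `exp_cov` and its parameters `sill` and `rng` are picked arbitrarily instead of being fitted to data, and the helper `solve` is a plain Gaussian elimination that a real program would likely replace with a library solver.

```python
import math

def exp_cov(p, q, sill=1.0, rng=10.0):
    """Hypothetical exponential covariance between two 2-D points p and q."""
    return sill * math.exp(-math.hypot(p[0] - q[0], p[1] - q[1]) / rng)

def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def ordinary_kriging(points, values, target, cov=exp_cov):
    """Estimate the value at `target` from scattered (x, y) points with unknown mean."""
    n = len(points)
    # Covariance matrix bordered by a row and column of ones: the extra
    # unknown is a Lagrange multiplier forcing the weights to sum to 1.
    A = [[cov(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [cov(p, target) for p in points] + [1.0]
    sol = solve(A, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = cov(target, target) - sum(w * c for w, c in zip(weights, b[:n])) - mu
    return estimate, variance, weights

# Four made-up elevation samples at the corners of a 10 x 10 square
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elev = [100.0, 110.0, 105.0, 120.0]
est, var, w = ordinary_kriging(pts, elev, (5.0, 5.0))
```

In this symmetric example the target is equidistant from all four samples, so the weights come out equal and the estimate is simply the average of the four elevations. For real data the covariance model and its parameters would have to be chosen or fitted first, which is the step this sketch skips.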
Dear Mr Nick Didlick aka Merksmatrix,
First, I think you do not understand very well what linear prediction is about and what Kriging means. In my opinion, you tend to confuse the data and the probabilistic model. Do you want to prevent people from fitting linear models because the underlying process that generated the data may not be that linear? Anyway, if people want to use Kriging, why do you want to prevent them from doing that?
Why do you persist in using Wikipedia to spread your own point of view, against the NPOV?
If you have business in telling revisionist stories against Kriging, good for you. But not on Wikipedia.
