Talk:Data mining/2012

From Wikipedia, the free encyclopedia

Introduction is inconsistent and confuses data mining with KDD

In the beginning of the intro it is stated: "The goal of data mining is to extract knowledge from a data set in a human-understandable structure[2] and involves database and data management, data preprocessing, model and inference considerations"

Later on: "Neither the data collection, data preparation nor result interpretation and reporting are part of the data mining step, but do belong to the overall KDD process as additional steps." — Preceding unsigned comment added by 129.206.66.132 (talk) 12:11, 7 March 2012 (UTC)

Unfortunately, KDD and data mining (as the analysis step of KDD) are used inconsistently throughout the literature, something Wikipedia will not ultimately resolve. The inconsistency goes all the way up to "CRISP-DM", which describes the KDD process as the "Data Mining process". So to resolve your confusion, just consider "data mining" to be the vague something it is, while "data mining step" and "data mining process" are slightly more precise. I'll try to improve this by using the term "data mining process" in the introduction. --Chire (talk) 17:32, 7 March 2012 (UTC)

Removed tag from Further reading section

Hello all. The Further reading section was tagged as {{Copy edit-section|for=wikification, too much literature, most covering subdomains only. Literature spam|date=September 2011}}. I've removed this in order not to draw general (non-expert) copy editors to it, because I believe only a topic expert can make good decisions about what reading to include in that section and what to exclude. (I don't think the section is too long per se, but that's up to you). If you'd like to re-tag so as to attract an editor able to deal with this issue, perhaps another tag would be more suitable? --Stfg (talk) 17:01, 1 June 2012 (UTC)

Medical Mining

I'm trying to figure out where to put a section on medical data mining. US Supreme Court rulings authorize the data mining of pharmacy records and allow that information to be resold under the First Amendment.

Link: http://articles.latimes.com/2011/jun/24/nation/la-na-court-drugs-20110624

Twillisjr (talk) 03:50, 9 September 2012 (UTC)

Should we add SEMMA to the Process section?

Should we add SEMMA to the Process section? On the one hand, yes, it is a process model, but on the other hand, it's been proposed and adopted as a standard by only one vendor (SAS). What do people think? Karl (talk) 14:04, 14 November 2012 (UTC)

I found several independent sources for SEMMA and a comparison of it with CRISP-DM, so I added it to this page. If other people feel differently about this addition, please edit it. The SEMMA page also received an orphan-page tag today, so linking to it from this page helps resolve that problem. FYI, I added a link to SEMMA on the CRISP-DM page as well. Karl (talk) 16:49, 15 November 2012 (UTC)

I'm wondering how much SEMMA is really about data mining in the scientific sense of the word. From what I can see, it probably fits better into the Business Intelligence field, which loves to label itself with the buzzword "data mining". IMHO an article on SEMMA is not needed; it should instead be merged into some larger article related to SAS Institute Inc. Clearly, SAS Enterprise Miner is a superset of SEMMA, and doesn't have an article yet. This is like discussing the steering wheel without having an article on what a car is. --Chire (talk) 14:16, 16 November 2012 (UTC)

Including material about R's increasing popularity & Satisfaction ratings of tools
