Talk:Deep learning/Archive 1

Article created

I did about two years of work involving Deep Belief Networks. Noticing that there was no article for deep architectures in Wikipedia I made this one and hope to add more soon! Renklauf (talk) 06:29, 20 July 2011 (UTC)

Please do! I greatly enjoy this topic, and I think it's the best way to Strong AI! Danuthaiduc (talk) 10:00, 10 August 2011 (UTC)

Yeah, but let's follow NPOV and refrain from asserting that this stuff really works. You can't use one researcher's claims about his own work as proof that his ideas are valid (see Reproducibility of results). --Uncle Ed (talk) 15:48, 27 May 2012 (UTC)

There is zero technical merit to this article. I suggest merging this with the article on neural networks until "definitions" of "deep" neural networks like this one are seen as laughably pathetic as they are: "A deep neural network (DNN) is defined to be an artificial neural network with at least one hidden layer of units between the input and output layers..." For crying out loud, people, the addition of at least one hidden layer was the whole idea behind neural networks as revived during the 1980s -- and this after the article's introduction says "ANNs fell out of favor in practical machine learning and simpler models such as support vector machines (SVMs) became the popular choice of the field in the 1990s and 2000s." That is to say, Deep Learning Neural Networks (as defined in this article) fell out of favor prior to the Deep Learning Neural Networks advancing the field beyond Deep Learning Neural Networks. Jim Bowery (talk) — Preceding undated comment added 20:09, 9 May 2014 (UTC)

I'm afraid I agree with Jim Bowery -- I'm not sure that deep learning is really a separate field -- it sounds like another term for an existing set of paradigms. I would support a merge. ---- CharlesGillingham (talk) 22:54, 27 September 2014 (UTC)

I also agree. As this article currently stands I don't see the difference between Deep Learning and Neural Networks. If there really IS a difference it should be made clear in the article. If there isn't a difference then this article should be merged with an article on Neural Nets. BTW, I also think it's not good style to have comments like this at the top of the article: "Alternatively, 'deep learning' has been characterized as 'just a buzzword for neural nets'". Is it "just a buzzword for neural nets" or isn't it? That seems to me something the article should take a stand on and not just be weaselly as it is now. --MadScientistX11 (talk) 18:29, 21 October 2014 (UTC)

As far as I can tell, "deep learning" refers commonly to the current neural net revival, which started when researchers figured out how to train nets with more than (say) three hidden layers, ca. 2006. But since then, as you can see in this article, anything and everything neural net-related has been sold as "deep learning", including single-layer networks. QVVERTYVS (hm?) 19:11, 21 October 2014 (UTC)

The difference between deep learning and neural networks is pretty clear in the literature and in this article. A neural network (in machine learning) is a particular kind of architecture or mathematical model for transforming inputs to outputs. Deep learning is the name of a class of algorithms, based on representational methods and unsupervised pre-training of layers, for training models with deep architectures, that is, models with multiple layers of computational units between input and output. Don't mistake a training algorithm for an architecture. There are many different ways to train artificial neural networks: see Artificial neural network#Learning algorithms for some examples. Some of these learning algorithms, such as backpropagation and expectation-maximization, have their own standalone articles.

Deep learning as a training algorithm is more than notable enough to have its own standalone article. Criticism of article content is welcome, but it is far better to improve the content using reliable sources than to derisively declare "It's all crap!" and attempt to delete it. "It's all crap!" is also a non-neutral stance on the topic. It is clear from the literature that there are proponents of deep learning methods and undeniable successes in the competitions, there are detractors of these methods and there are people fed up with all the hype. Summarizing with due weight on all of these is the way to go. --Mark viking (talk) 19:46, 21 October 2014 (UTC)

Mark viking deep learning is not an algorithm at all. It's an umbrella term for various models, trained with novel but still very different algorithms, ranging from the supervised to the unsupervised. I actually wrote a page about the deep belief network and its training algorithm; but convolutional nets are trained in quite different ways. (Backpropagation is involved practically everywhere, though.) QVVERTYVS (hm?) 20:04, 21 October 2014 (UTC)

As I said above, Deep learning is the name of a class of algorithms. Feedforward neural nets, restricted Boltzmann machines, deep neural networks and deep belief networks are all mathematical models. How the models are trained, that is, how model parameters are chosen to optimize some loss function, is another topic entirely. People had looked at deep neural network architectures before deep learning techniques were developed, but it was the deep learning techniques--unsupervised pre-training, layer by layer--that made deep networks more practical at the time. With today's GPU based algorithms, the need for pre-training has lessened. The hype surrounding deep learning has muddied the concept to the point that people like Michael Jordan claim it is just a synonym for neural network. But the backlash from current hype doesn't change the historical importance of deep learning algorithms in generating interest and good results in deep neural network models or representational learning. Schmidhuber's review has a good discussion of these issues. --Mark viking (talk) 21:55, 21 October 2014 (UTC)

Ok, agreed. Let's try to clean up the article before trying to merge it. QVVERTYVS (hm?) 09:14, 22 October 2014 (UTC)

Sounds good. I will chip in the next few days. --Mark viking (talk) 17:26, 23 October 2014 (UTC)

Good job so far guys - it's a devilish mess to unravel! That said, I have to agree with most of the negative comments - this still needs a lot of work before being reference quality. Deep learning is a very young (immature) field at a philosophical level (probably because progress is driven mostly by empirical engineering practices) and we must have faith that it will become more rational as it matures. E.g., the state of taxonomic analysis is pitiful (as substantiated by the various comments here). As a step in the direction of rigorous philosophical treatment, I have added a fairly brief and by no means exhaustive section on *interpretation*. Clearly, the logic set forth in the added 'interpretation' section is not compatible with the subsequent (hand-wavy) 'definitions' section but I haven't got time to clear that section up at the moment. We could also do with a 'Biological Interpretation' section (to round things out) - if there is one. We could also do with more examples of the insights and progress that were specific to each interpretation. Furthermore, I do not support a merger with 'neural networks'. It is premature to conclude that neural networks are inherently 'deep learners' (see the 'DSP interpretation' for argumentation). In order for this field to be taken seriously, we need to step up the rigor fellas. Qazwsxedcqazws (talk) 07:54, 1 October 2015 (UTC)

How is the clean up progressing? IMHO, "Deep Learning" as a term is likely to mislead the not-so-educated audience about the "depth" of learning: the general audience does not understand that the only "deep" thing about "deep learning" is the metric depth of the neural network, and there is nothing else particularly deep about the neural networks or the associated learning methods. An extremely effective way to produce more hype! — Preceding unsigned comment added by 87.92.32.62 (talk) 07:31, 2 March 2019 (UTC)

Deep Learning Artificial Neural Networks

This article was missing a lot of stuff on deep learning neural networks. I tried to improve this a bit, but much remains to be done. (I found the so-called "official site" on deep learning pretty meager, too.) Deeper Learning (talk) 20:35, 7 December 2012 (UTC)

You could have a look at deep belief networks and deep autoencoders. p.r.newman (talk) 13:56, 20 August 2013 (UTC)

It's said in the article that neural networks with more than one hidden layer could be considered as a deep learning architecture. In this case there are at least two hidden layers (and also 3 non-linear successive transformations). Instead isn't it ONE hidden layer for two non-linear transformations? — Preceding unsigned comment added by 163.5.218.118 (talk) 01:19, 9 March 2014 (UTC)

Streamlining

One should probably streamline repetitive statements in the introduction and the section on deep neural networks. Isn't deep learning exclusively about neural networks anyway? As far as I know, there is no other successful form of deep learning. Prof. Oundest (talk) 01:56, 11 August 2013 (UTC)

Proofreading note: "set of algorithms"

The article states that deep learning is a "set of algorithms", but it doesn't clearly identify the specific algorithms included in the set. The Transhumanist 01:40, 25 September 2013 (UTC)

How is deep learning differentiated from the family of feature learning methods it belongs to?

The article states: "Deep learning is part of a broader family of machine learning methods based on learning representations." But it doesn't explain how it differs from the rest of the feature learning family. The Transhumanist 01:40, 25 September 2013 (UTC)

"Deep learning" synonymous with "neural networks"?

The article's lead includes a blatant claim that "deep learning" is synonymous with "neural networks":

Deep learning is just a buzzword for neural nets, and neural nets are just a stack of matrix-vector multiplications, interleaved with some non-linearities. No magic there.

— Ronan Collobert
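
As a rough illustration of the description in the quote above, the following sketch treats a network as nothing more than a list of weight matrices applied in sequence, with a non-linearity after each multiplication. The names `matvec` and `forward`, the choice of `tanh`, and the example weights are assumptions made for illustration; they do not come from Collobert's talk.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    """Apply each weight matrix in turn, followed by an elementwise tanh."""
    for W in layers:
        x = [math.tanh(v) for v in matvec(W, x)]
    return x

# Hypothetical weights: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]],
    [[1.0, -1.0]],
]
output = forward(layers, [1.0, 2.0, 3.0])
print(output)
```

Adding more matrices to `layers` adds more layers; nothing else in the computation changes.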

This is potentially extremely confusing, as it may cause readers to wonder why there is a separate article about deep learning. The article does not justify itself with an explanation. That is, it doesn't explain how deep learning is a type of neural networks, rather than being just another name for neural networks in general.

If it is synonymous (as per the claim included in the lead), then the article violates WP:FORK. And so far, the article doesn't make it clear that its coverage isn't a content fork.

Is there any such thing as non-deep learning neural networks? If so, what are they? And, how do deep learning neural networks differ from them? The Transhumanist 01:40, 25 September 2013 (UTC)

I agree that this is confusing, but the reason for citing Collobert (one of the foremost practitioners in the current neural nets landscape) is to counterbalance the rest of the article. As I understand deep learning, it's neural nets with more hidden layers than you can train using vanilla backpropagation (because of numerical stability, the required computing time, and/or a lack of labeled training data). See e.g. this talk by Hinton, and note the difference between the title assigned to it by UBC's YouTube moderator and the actual title of the talk. QVVERTYVS (hm?) 18:22, 25 September 2013 (UTC)

Yes that quote was weirdly situated. I added some context and moved it to the last paragraph of the lead. Bhny (talk) 19:10, 25 September 2013 (UTC)

I was about to revert your edit, but then I figured that wouldn't solve anything, and I'd just be restoring my POV. Instead, I tagged the whole page as OR. I haven't seen a source that establishes the link between the neocognitron, or for that matter Schmidhuber's work, to the recent trend of "deep learning". I'll admit that, from what I've read, it fits one of the definitions that can be gleaned from it, but as this page stands, there's no WP:COMMON here. QVVERTYVS (hm?) 19:34, 25 September 2013 (UTC)

I clarified a few things concerning the contributions of Fukushima, LeCun, Schmidhuber, Hinton, and others, with references. Schmidhuber's open peer review web page http://www.idsia.ch/~juergen/firstdeeplearner.html is a great resource for references. Its acknowledgments read like a "who is who" of neural networks. Yes deeper (talk) 19:17, 29 November 2013 (UTC)

I agree with the Transhumanist and James Bowery above. Don't make it more difficult for readers to find what they are looking for; don't WP:Fork. Who, exactly, uses the term "deep learning"? Certainly not many of the people whose work is described as "deep learning". I think Hinton would say he was working on neural networks or even connectionism. I don't like people being described as being part of a research project that they never would have heard of. ---- CharlesGillingham (talk) 23:00, 27 September 2014 (UTC)

Amongst others Yoshua Bengio, Geoffrey Hinton and Yann LeCun use the term "deep learning". —Kri (talk) 21:23, 5 February 2016 (UTC)

"Deep learning" may be synonymous with "neural networks" if you limit yourself to the neural networks that are used today, which are really "deep neural networks", i.e. neural networks with many hidden layers. For a long time, however, neural networks were very difficult to train unless they were very shallow (basically just one or at most two hidden layers). It is only recently that we have been able to train deep neural networks efficiently, which has made it possible for neural networks to learn much more sophisticated features and abstract concepts, which is key in perception. Hence the buzzword "deep learning". So not all neural networks can be counted as forms of deep learning. —Kri (talk) 20:37, 5 February 2016 (UTC)

Definition and citation

The very first sentence appears to be a definition quoted from the given paper "Representation Learning: A Review and New Perspectives".

Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using architectures composed of multiple non-linear transformations.

I didn't find the quote, though. I'd suggest to make it clear how to verify the statement. The paper for your reference: http://www.computer.org/csdl/trans/tp/2013/08/ttp2013081798.html — Preceding unsigned comment added by 87.149.191.206 (talk) 14:54, 30 April 2014 (UTC)

It's not a quote. If it were a quote, it would be in quotation marks. QVVERTYVS (hm?) 19:27, 30 April 2014 (UTC)

Errors shrink exponentially?

The article states:

[Recurrent neural networks] are trained by unfolding them into very deep feedforward networks, where a new layer is created for each time step of an input sequence processed by the network. As errors propagate from layer to layer, they shrink exponentially with the number of layers. To overcome this problem [...]

Should this instead say that the errors increase exponentially? Why would the shrinking of errors constitute a "problem"? AxelBoldt (talk) 23:53, 24 May 2014 (UTC)

What I think is meant is that the derivative of the error function with respect to the parameters shrinks as the parameters (weights) are further from the output unit. This is the "vanishing gradient" problem: parts of the net close to the input side receive very small updates, until at some point the derivatives underflow and these parts no longer get any updates. QVVERTYVS (hm?) 15:27, 25 May 2014 (UTC)

Yes, that sounds right. Maybe we can reformulate the error propagation sentence to explain it better? AxelBoldt (talk) 15:04, 26 May 2014 (UTC)
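
The vanishing gradient effect discussed in this thread can be illustrated with a toy chain of sigmoid units, one unit per layer. This is only a sketch under simplifying assumptions (a single shared weight, scalar activations, and the hypothetical function name `gradient_through_chain`); it is not the training procedure described in the article. Because the sigmoid's derivative is at most 0.25, each layer multiplies the gradient by a factor no larger than 0.25 times the weight.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_through_chain(depth, weight=1.0, x=0.5):
    """Return d(output)/d(input) for a chain of `depth` sigmoid units.

    Each unit computes a = sigmoid(weight * a_prev). By the chain rule the
    overall derivative is the product of the per-layer derivatives.
    """
    a = x
    grad = 1.0
    for _ in range(depth):
        a = sigmoid(weight * a)
        # Per-layer factor: weight * sigmoid'(z), where sigmoid'(z) = a * (1 - a).
        grad *= weight * a * (1.0 - a)
    return grad

# The gradient reaching the input shrinks rapidly as the chain gets deeper.
for depth in (1, 5, 10, 20):
    print(depth, gradient_through_chain(depth))
```

With a weight of 1 the printed values fall off at least as fast as 0.25 raised to the depth, which is the exponential shrinkage the article's sentence refers to.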

DNN is just an MLP

A deep neural network (DNN) is defined to be an artificial neural network with at least one hidden layer of units between the input and output layers

Following this definition, almost all neural nets are deep. The source for this sentence actually describes good old (shallow) multilayer perceptrons, and says "we will see later on that there are substantial benefits to using many such hidden layers, i.e. the very premise of deep learning" (bold in the original), so it actually distinguishes single-hidden layer nets from deep ones. QVVERTYVS (hm?) 10:31, 23 June 2014 (UTC)

It seems to me then that the definition of DNNs as given here in the article is wrong. It sounds rather like a requirement (i.e. it constitutes a superset for DNNs) rather than an actual definition. —Kri (talk) 11:14, 4 October 2014 (UTC)

Use in NLP

Daniel HK just added the section "Natural language processing", which suffers from the definition problem that has plagued this article from the beginning. It states that

Neural networks have been used for implementing language models since the early 2000s. Key techniques in this field are negative sampling and word embedding. Deep neural architectures have achieved state-of-the-art results in many tasks in natural language processing.

While word embeddings have had a lot of attention in recent years (more so than when they were first suggested in the 1980s; Elman nets were tried on language tasks from the beginning), they're not typically deep nets. Models such as word2vec are single-layer networks. They handle convolutions, limited recurrence, alright, but why must they be mentioned in an article about deep learning if they're actually shallow? Or is it true after all that "deep learning is just a buzzword for neural nets" (Collobert)?^[1] QVVERTYVS (hm?) 13:11, 31 August 2014 (UTC)

You're right that word embeddings are not necessarily done by deep networks, but I think their use is a key element in the application of deep learning to NLP and is worth mentioning. And you're also right that there is no clear definition for deep learning, but I think any layered or recursive structure of neural networks can be so termed. Single layer networks are not such. Daniel HK (talk) 08:05, 3 September 2014 (UTC)

I do think we should make the link more explicit, then. Maybe add some text to the word embedding page describing how to build a deep net out of them? (I personally have no idea how that's done, but I suppose you can use them for unsupervised pre-training?) QVVERTYVS (hm?) 12:46, 3 September 2014 (UTC)

Schmidhuber's excellent review (reference number 3) uses credit assignment path length as a determiner of whether an architecture is deep or shallow in section 3 of the paper. It is the most sensible and reasonably objective definition of what deep means that I have seen so far. --Mark viking (talk) 16:15, 3 September 2014 (UTC)

Mark viking, would you care to write up a summary of that definition? QVVERTYVS (hm?) 08:07, 24 October 2014 (UTC)

I added a paragraph describing shallow vs deep learning and (very informally) the idea of the CAP. --Mark viking (talk) 01:30, 26 October 2014 (UTC)

Section references

[1]
Ronan Collobert (May 6, 2011). "Deep Learning for Efficient Discriminative Parsing". videolectures.net. Ca. 7:45.

What is a Vector of Pixels?

I noticed @Kri: reverted an edit where an internet user changed "an image can be represented as a vector of pixels" to "vector OR pixels". I can understand why the person made that edit, and that edit made sense to me in the context (distinguishing different ways to represent knowledge), but I don't understand why it was reverted. I'm probably just not understanding something but I want to make sure the text makes sense. My interpretation of what the first editor did was to distinguish between vector graphics (e.g., the way Postscript works) vs. an array of pixels. But apparently that isn't the idea. But what IS the idea? How can you have a vector of pixels? BTW, I tried googling "vector of pixels" and the things that came up all seemed to be contrasting vector graphics OR pixels. --MadScientistX11 (talk) 16:56, 23 October 2014 (UTC)

Take any N by M image of pixels and write it out as an ordered 1-D array of NM elements, forgetting the matrix structure--that is a vector of pixels. I think the point implicitly being made here is that writing an image as a vector of pixels destroys the spatial relationships among the pixels. Close pixels in 2D can be spread far apart in a 1D vector. Those spatial relationships allow for easier discovery of features, like edges, that can be used to form higher level representations of an image. As with much of this article, it could be worded more clearly. --Mark viking (talk) 17:23, 23 October 2014 (UTC)
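
The flattening described above can be sketched in a few lines. The helper names `flatten` and `flat_index` and the 3-by-4 example image are assumptions chosen for illustration, and row-by-row ordering is only one possible convention.

```python
def flatten(image):
    """Write a 2-D image (a list of rows) as a 1-D list, row by row."""
    return [value for row in image for value in row]

def flat_index(row, col, width):
    """Position of pixel (row, col) in the flattened vector."""
    return row * width + col

# A hypothetical 3-by-4 image whose pixel values equal their flat positions.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
vector = flatten(image)
width = len(image[0])

# Horizontal neighbours stay adjacent in the vector,
horizontal_gap = flat_index(1, 2, width) - flat_index(1, 1, width)
# but vertical neighbours end up `width` positions apart.
vertical_gap = flat_index(2, 1, width) - flat_index(1, 1, width)
print(vector, horizontal_gap, vertical_gap)
```

For a realistic image width the vertical gap is in the hundreds or thousands, which is the sense in which close pixels in 2D are spread far apart in the 1D vector.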

Thanks. That makes sense, thanks for taking the time to explain it. --MadScientistX11 (talk) 17:33, 23 October 2014 (UTC)

I rephrased "a vector of pixels" to "a set of pixels"; I hope that will make it somewhat clearer what is actually intended. —Kri (talk) 21:29, 23 October 2014 (UTC)

But what goes into a neural net must be a vector, not a set. A set is unordered and doesn't distinguish between two black pixels and three. QVVERTYVS (hm?) 22:21, 23 October 2014 (UTC)

You're right. I thought of the pixels as distinguishable objects, but it's of course just the pixel values that go into the neural network, not the pixels themselves. —Kri (talk) 00:42, 24 October 2014 (UTC)
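
The distinction between a set and a vector of pixel values can be checked directly. This is a small sketch with made-up pixel values (0 for black, 255 for white); the variable names are hypothetical.

```python
two_black = [0, 0, 255]
three_black = [0, 0, 0, 255]
reordered = [255, 0, 0]

# As sets, duplicate values collapse and order is ignored.
same_as_sets_count = set(two_black) == set(three_black)
same_as_sets_order = set(two_black) == set(reordered)

# As vectors (ordered lists), the two arrangements are distinguishable.
same_as_vectors = two_black == reordered

print(same_as_sets_count, same_as_sets_order, same_as_vectors)
```

The set comparison cannot tell two black pixels from three, or one arrangement from another, while the ordered lists can.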

I've tried to clarify a bit further. Maybe we should expand on this example more; IIRC, image classification on raw pixels, without SIFT/SURF/whatever, was one of the first successes in deep learning and can be used to contrast it with the previous state of the art. QVVERTYVS (hm?) 08:04, 24 October 2014 (UTC)

Limitations of DNNs

The relativity of being recent

Art and AI

stupid sentence

Andrew J.R. Simpson self citations

Why are "deep neural networks" listed as a separate deep learning architecture?

In my opinion, all other of the architectures listed in the article are forms of deep neural networks, as they are all deep architectures and all neural networks. So why is "deep neural networks" listed separately? Isn't "deep neural networks" in fact rather equivalent to "deep learning"? —Kri (talk) 20:54, 5 February 2016 (UTC)

There exist other deep, multi-layer, learning systems based on elements like restricted Boltzmann machines, support vector machines, and kernel systems. The first one is considered a neural network by most (but it is not the classic feedforward design), the second by nearly none and the third, it depends on the kernel. --Mark viking (talk) 21:25, 5 February 2016 (UTC)

Okay, so why are deep neural networks listed separately to all other deep architectures in this article, of which most do seem like deep neural network architectures to me? Shouldn't these rather be listed as subsections to the Deep neural networks section, not as separate sections? —Kri (talk) 19:12, 21 March 2016 (UTC)

So basically, the only deep architecture mentioned in this article that is not a deep neural network is the multilayer kernel machine? —Kri (talk) 19:26, 21 March 2016 (UTC)

While deep neural networks are not synonymous with deep learning, I agree that DNNs are the dominant architecture these days. To reflect due weight, perhaps it would be best to refactor the "Architecture" section into two sections: "Deep neural network architectures" and "Other architectures". Then the multilayer kernel machine would go into the Other architectures section. What do you think? --Mark viking (talk) 20:08, 21 March 2016 (UTC)

Sounds like a plan. —Kri (talk) 18:12, 22 March 2016 (UTC)

Here's a thought: Instead of talking about editor views and opinions and "seem like" statements, why not find a good secondary source that teaches the material, and present it the way that expert does? I don't doubt it may be identical to the way you describe, but even so, it would give confidence to those listening in, that this is an encyclopedia that has rules and follows them, and not the blog of a few AI/deep learning devotees. 73.211.138.148 (talk) 01:24, 14 July 2016 (UTC)

Editing the copyediting

Spam/External Links

Criticism and comment on "Criticism and comment"

Pseudo Citation in Definition

Please consider and compare before reverting

Here is current prose from the lede, which is supposed to be the most accessible part of the article:

"Deep learning (also known as deep structured learning, hierarchical learning or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using a deep graph with multiple processing layers, composed of multiple linear and non-linear transformations… / Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task (e.g., face recognition or facial expression recognition...). One of the promises of deep learning is replacing handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction."

And here is the way another teaching source opens in the same subject [where words like graphs and processing layers do not appear for some time]:

"Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated. / Neural networks help us cluster and classify. You can think of them as a clustering and classification layer on top of data you store and manage. They help to group unlabeled data according to similarities among the example inputs, and they classify data when they have a labeled dataset to train on. (To be more precise, neural networks extract features that are fed to other algorithms for clustering and classification; so you can think of deep neural networks as components of larger machine-learning applications involving algorithms for reinforcement learning, classification and regression.)"^[1]

[1]
http://deeplearning4j.org/neuralnet-overview.html

Please consider a major article overhaul, to make it readable by general audiences. Far too many maths and computing articles are simply near-illegible tomes of WP editors talking to themselves, while interested readers shake their heads and walk away. 73.211.138.148 (talk) 01:18, 14 July 2016 (UTC)

Text removed as being pseudo-circular in its sourcing (back to Wikipedia)

External links modified

Artificial Neural Networks section's history

The Reference to the Mordvintsev/Olah/Tyka "Inceptionism" Post

There has been some concern that a focus in the "Criticism and Comment" section on the connection between deep learning and artistic creativity is misplaced; however, events are continuing to demonstrate that this connection is perhaps both profound and vital: the web page welcome ( https://magenta.tensorflow.org/welcome-to-magenta ) to the Google "Magenta" initiative, for example, specifically mentions the 17 June 2015 Mordvintsev/Olah/Tyka Google Research Blog post on the CNN image generation technique which they originally dubbed "Inceptionism" and which has since become known as "DeepDreaming".

Indeed, that post may prove to be very much like the 25 April 1953 Watson/Crick announcement in "Nature" of a possible structure for DNA: a somewhat informal communication of enduring significance; and as such -- and given also that I am the Wikipedia contributor who added the reference to the "Inceptionism" post to the Deep Learning article in the first place -- I've felt justified in keeping an eye on it.

I was therefore very interested to notice that on 10 December of this past year, an "ArchiveBot" added to that reference a link to the Internet Archive version of the "Inceptionism" post -- and with an indication that the original link was dead. Well, that link is currently very much alive, and so -- in the interest of keeping things tidy -- I've restored the reference to its former state; however, I've also noticed something else of great interest: somewhere along the line, the Research Blog post as maintained by Google has lost all, or almost all, of its original comments. These numbered almost a thousand, and to which point I can attest via a screen capture created shortly after my own such comment -- the 993rd! -- became part of the dialogue:

File:G. W. Smith comment on Inceptionism post.jpg

19 June 2015 screen capture showing the 17 June 2015 Mordvintsev/Olah/Tyka Google Research Blog post on "Inceptionism" immediately after the addition of a comment by G. W. Smith -- the most recent of 933 at that point.

Can anyone shed light on either of these "happenings" -- the apparently temporary loss of the original link, and/or the fate of the original comments? Synchronist (talk) 02:40, 22 January 2017 (UTC)

Greek translation

The introductory paragraph to this article is a f***ing trainwreck.

We can do better than this. Consider the viewpoint of a layperson, reading this article. The first sentence should be clear and concise, instead it's obtuse bullet points. And yet, somehow, astonishingly, the bullet points in the first sentence are also themselves sentences that meaninglessly describe things like the "(unsupervised) learning of multiple levels of features or representations of the data." It's almost hyperbolically bad. It's a caricature. The point of an introductory section is not to present a superficial analysis of the field using as many buzzwords as possible; it's to maximize the amount of information relevant to the topic delivered to the reader who we assume is a layperson. Machine learning is not a "field of learning representations of data" any more than writing fiction is an "act of attributing intention to synthetic agents." Don't burden an inexperienced reader with useless assertions before they're even equipped to understand the words you're using. I don't know how to write this introduction, but I at least know enough to say that this one is absolute shit.

I agree. It's particularly bad, because it is copying what later appears under the Definition section. It doesn't make sense to have all of that information in the intro, and perhaps it would be best to just remove it. Pirx123 (talk) 23:13, 9 June 2017 (UTC)

This IS difficult. I made the copying. I still hold that copying was an improvement, albeit from a low level. Complaints like this do not improve the situation. Better spend the time making a little improvement. --Ettrig (talk) 09:08, 10 June 2017 (UTC)

Really, the first sentence is all that is needed. This is just an emphasis on many layers. Otherwise it is just an artificial neural network. The bulk of the explanations should be in the article about neural networks. The reason the name Deep learning caught on so strongly is that researchers who reported spectacular achievements and increases in capacity used this name. --Ettrig (talk) 15:24, 10 June 2017 (UTC)

Rewrite

This article reads like multiple articles squished together. The content is highly redundant. E.g., historic stuff is repeated in multiple places. It's no wonder, given the ridiculous length of the piece. Nobody can read the whole thing. In sum, it needs a rewrite that moves most of the content into other, more focused pieces (many of which already exist). I am doing a superficial copyedit to familiarize myself with the topic. After that, I will propose a new approach. Feedback encouraged! Lfstevens (talk) 19:04, 10 June 2017 (UTC)

Agreed. As I wrote above: There is no difference between Deep learning and Artificial neural networks, except for the grammatical difference. And except that DL is restricted to ANN with manyish layers. That clarification plus some history of the use of the term should be sufficient. --Ettrig (talk) 19:29, 10 June 2017 (UTC)

OK. I propose to dramatically shorten this article by moving most of the technical details to Artificial neural network. No need to duplicate/maintain the two in sync. Instead, this article should focus on learning and applications, while the other focuses on how it works. Deep reservoir, deep belief, deep etc., will stay here. Lfstevens (talk) 00:39, 19 June 2017 (UTC)

Dear Lfstevens: in reference to your second set of June 11 edits to the "Criticism and comment" section -- and since you have invited feedback -- I must admit that the original text needed to be streamlined somewhat; however, I think that the current level of pruning eliminates an important point regarding neural networks, and one that needs to be emphasized in several contexts, namely, that they are capable of performing seemingly sophisticated discriminations and manipulations which were formerly deemed the exclusive property of top-level, rule-based processing.

This same pruning, moreover, has eliminated the "yes, but" transition from the previous paragraph, which seeks to establish exactly the opposite point, i.e., that there are many high level functions which neural nets are incapable of performing.

I am wondering, therefore, if you would be distraught if I edited the paragraph in question yet again, and by way of accomplishing all of our goals? Synchronist (talk) 04:03, 20 June 2017 (UTC)

Thanks for noticing! You are of course free to edit as you like, but may I suggest we hash out language on this talk page instead? I've really only begun my work on this monster, and appreciate your insights. Lfstevens (talk) 03:36, 21 June 2017 (UTC)

Since you seem to have just invited an even broader perspective, let me step away for the moment from a selfish focus on my own little piece of this article and talk about the big picture -- but in respect to which I will be reaching the same overall conclusion.

Based on my brief acquaintance with your own talk page, you are a much-respected senior editor -- and your idea of moving much of the material in "Deep Learning" to the more fundamental article on "Artificial Neural Networks" has been a sensible initial approach.

However, as someone who has been closely involved with the topic of deep learning and the article itself for some time, let me make two observations: 1) the deep learning family of algorithms has been the result of an especially long and complicated evolution (in computer science terms) in which the basic principle of simulating the behavior of neurons via software and/or hardware was but a starting point; and 2) although the article "Artificial Neural Networks" has a ten-year head start on "Deep Learning", the latter is some 2.5 times the former in size; rivals it in terms of current viewership; and surpasses it in terms of recent (past year) editing activity.

And so what does this tell us? It tells us that -- as with all evolution -- an essentially new thing has come into being, and with which a huge army of researchers and users closely identifies; i.e., although the "Deep Learning" article certainly needs a lot of work, it might be a mistake to think that it can be subordinated to another article.

And here let me provide a perhaps imperfect analogy from science history: yes, at a theoretical level, all of chemistry can be reduced to physics -- but no one would dream of attempting to pack the entire practice of chemistry into a physics suitcase.

And there is also this final point, and which, again, I share with you only because so requested: the family of deep learning/deep neural net algorithms continues to experience a remarkable florescence -- so it is inevitable that an up-to-date Wikipedia article will be rather messy. (But still, agreed, the basic structure needs improvement!) Synchronist (talk) 17:16, 21 June 2017 (UTC)

Adding a few key linkages to "Criticism and Comment" paragraph while preserving overall shortened structure. Synchronist (talk) 14:29, 1 July 2017 (UTC)

My thinking has evolved somewhat. I am now envisioning DL as more application-oriented, while ANN is more tech-oriented. Thus I moved the laundry list. DL is now fairly short, although I see some additional fat to suck out in the definition section. Further, I expect to parcel out much of what is in ANN into the more specific articles that already repeat much of its content. My next move is to move duplicative stuff out of ANN into RNN.

Another point is that a lot of the breathless contest stuff happened years ago, with no updates. I don't know where to look to refresh it, but in a fast-moving field we should be current.
