Wikipedia talk:Large language models


Chatbot to help editors improve articles

After selecting text, the control panel on the right is used to give instructions. The responses by the AI model are presented in the chat panel on the left.

I wrote a user script called WikiChatbot. It works by selecting text in an article and then clicking one of the buttons on the right to enquire about the selected text. It includes many functions. For example, it can summarize and copyedit the selected text, explain it, and provide examples. The chat panel can also be used to ask specific questions about the selected text or the topic in general. The script uses the AI model GPT 3.5. It requires an API key from OpenAI. New OpenAI accounts can use it freely for the first 3 months with certain limitations. For a more detailed description of all these issues and examples of how the script can be used, see the documentation at User:Phlsph7/WikiChatbot.

I was hoping to get some feedback on the script in general and how it may be improved. I tried to follow WP:LLM in writing the documentation of the chatbot. It would be helpful if someone could take a look to ensure that it is understandable and that the limitations and dangers are properly presented. I also added some examples of how to use edit summaries to declare LLM usage. These suggestions should be checked. Feel free to edit the documentation page directly for any minor issues. I'm also not sure how difficult it is to follow the instructions so it would be great if someone could try to set up the script, use it, and explain which steps were confusing. My OpenAI account is already older than 3 months so I was not able to verify the claims about the free period and how severe the limitations are. If someone has a younger account or is willing to open a new account to try it, that would be helpful.

Other feedback on the idea in general, on its problems, or on new features to implement is also welcome. Phlsph7 (talk) 12:45, 12 July 2023 (UTC)

I meant to reply to this sooner. This is awesome and I'm interested in this (and related ideas) related to writing / reading with ML. I'll try to have a play and give you some feedback soon. Talpedia 10:18, 17 July 2023 (UTC)
Related: see also m:ChatGPT plugin. Mathglot (talk) 07:22, 18 July 2023 (UTC)
Whilst I rather like the ability of this nifty little script to do certain things, I do have some criticism. These functions strike me as extremely risky, to the point that they should probably be disabled:
  • "is it true?" - ChatGPT likely uses Wikipedia as a source, and in any case, we want verifiability, not truth. I feel quite strongly, based on several other reasons too, that this function should be disabled and never see the light of day again.
  • "is it biased?" - ChatGPT lacks the ability to truly identify anything more than glaring "the brutal savages attacked the defenceless colonist family" level bias (i.e. something that any reasonably aware human should spot very quickly indeed). Best left to humans.
  • "is this source reliable?" - Same as the first one, this has so much potential to go wrong that it just shouldn't exist. Sure it might tell you that Breitbart or a self-published source isn't reliable, but it may also suggest that a bad source is reliable, or at least not unreliable.
I don't think that any amount of warnings would prevent misuse or abuse of these functions, since there will always be irresponsible and incompetent people who ignore all the warnings and carry on anyway. By not giving them access to these functions, it will limit the damage that these people would cause. Doing so should not be a loss to someone who is using the tool responsibly, as the output generated by these functions would have to be checked so completely that you might as well just do it without asking the bot.
The doc page also needs a big, obvious warning bar at the top, before anything else, making it clear that use of the tool should be with considerable caution.
The doc page also doesn't comment much on the specific suitability of the bot for various tasks, as it is much more likely to stuff up when using certain functions. It should mention this, and also how it may produce incorrect responses for the different tasks. It also doesn't mention that ChatGPT doesn't give wikified responses, so wikilinks and any other formatting (bold, italics, etc.) must be added manually. The "Write new article outline" function also seems to suggest unencyclopaedic styles, with a formal "conclusion", which Wikipedia articles do not have.
Also, you will need to address the issue of WP:ENGVAR, as ChatGPT uses American English, even if the input is in a different variety of English. Mako001 (C)  (T)  🇺🇦 01:14, 23 July 2023 (UTC)
You can ask it to return wikified responses, and it will do so with a reasonably good success rate. -- Zache (talk) 03:03, 23 July 2023 (UTC)
@Mako001 and Zache: Thanks for all the helpful ideas. I removed the buttons. I gave a short explanation at Wikipedia:Village_pump_(miscellaneous)#Feedback_on_user_script_chatbot and I'll focus here on the issues with the documentation. I implemented the warning banner and added a paragraph on the limitations of the different functions. That's a good point about the English variant being American so I mentioned that as well. I also explained that the response text needs to be wikified before it can be used in the article.
Adding a function to wikify the text directly is an interesting idea. I'll experiment a little with that. The problem is just that the script is not aware of the existing wikitext. So if asked to wikify a paragraph that already contains wikilinks then it would ignore those links. This could be confusing to editors who only want to add more links. Phlsph7 (talk) 09:12, 23 July 2023 (UTC)
I made summaries, translations, etc. by giving wikitext as input to ChatGPT instead of plain text. However, the problem here is how to get the wikitext from the page in the first place. -- Zache (talk) 09:48, 23 July 2023 (UTC)
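One possible way to obtain a page's wikitext outside the browser is MediaWiki's `action=raw` view of `index.php`. The following is a minimal sketch, not part of WikiChatbot or any script mentioned in this thread; the function names and the User-Agent string are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base endpoint for English Wikipedia; other wikis use a different host.
INDEX_PHP = "https://en.wikipedia.org/w/index.php"


def raw_wikitext_url(title: str) -> str:
    """Build the action=raw URL for a page title (hypothetical helper)."""
    params = {"title": title.replace(" ", "_"), "action": "raw"}
    return f"{INDEX_PHP}?{urlencode(params)}"


def fetch_wikitext(title: str,
                   user_agent: str = "wikitext-sketch/0.1 (example only)") -> str:
    """Download the current wikitext of a page.

    The User-Agent value is a placeholder; a real tool should identify
    itself properly.
    """
    request = Request(raw_wikitext_url(title),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_wikitext("Harry Frankfurt")` would return the whole page; picking out a single section would need extra handling that is not shown here.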
In principle, you can already do that with the current script. To do so, go to the edit page, select the wikitext in the text area, and click one of the buttons or enter your command in the chat panel of the script. I got it to add wikilinks to an existing wikitext and a translation was also possible. However, it seems to have problems with reference tags and kept removing them, even when I told it explicitly not to. I tried it for the sections Harry_Frankfurt#Personhood and Extended_modal_realism#Background, both with the same issue. Maybe this can be avoided with the right prompt. Phlsph7 (talk) 12:09, 23 July 2023 (UTC)
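Besides prompt changes, one workaround for the dropped reference tags might be to swap each `<ref>` element for an opaque placeholder before sending the text to the model, then restore the originals afterwards. This is only a sketch of that idea and has not been tried with WikiChatbot; the placeholder format `@@REFn@@` and all function names are assumptions made for illustration.

```python
import re

# Matches self-closing <ref ... /> tags and paired <ref ...>...</ref> elements.
# The \b keeps <references /> from being matched.
REF_PATTERN = re.compile(r"<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>",
                         re.DOTALL | re.IGNORECASE)
PLACEHOLDER = re.compile(r"@@REF(\d+)@@")


def mask_refs(wikitext):
    """Replace every ref with @@REFn@@ and return (masked_text, refs)."""
    refs = []

    def _stash(match):
        refs.append(match.group(0))
        return f"@@REF{len(refs) - 1}@@"

    return REF_PATTERN.sub(_stash, wikitext), refs


def unmask_refs(text, refs):
    """Put the stored refs back; unknown placeholders are left untouched."""
    def _restore(match):
        index = int(match.group(1))
        return refs[index] if index < len(refs) else match.group(0)

    return PLACEHOLDER.sub(_restore, text)


def missing_refs(text, refs):
    """List the indexes of placeholders the model dropped from its reply."""
    return [i for i in range(len(refs)) if f"@@REF{i}@@" not in text]
```

The model can still delete a placeholder, so `missing_refs` is meant as a check to run on the reply before anything is saved.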
Thanks for setting this up. I've recently had success drafting new Wikipedia articles by feeding the text of up to 5 RS into GPT4-32k through openrouter.com/playground and simply asking it to draft the article. It does a decent job with the right prompt. You can see an example at Harrison Floyd. I'll leave more details on the talk page of User:Phlsph7/WikiChatbot, but I wanted to post here for other interested parties to join the discussion. Nowa (talk) 00:02, 20 September 2023 (UTC)
Thanks for the information. I've responded to you at Talk:Harrison_Floyd#Initial_content_summarized_from_references_using_GPT4 so that we don't have several separate discussions about the same issue. Phlsph7 (talk) 07:44, 20 September 2023 (UTC)
Ran into a brick wall I thought might be helpful to know about. I've been working on the bios of people associated with Spiritual_warfare#Spiritual_Mapping_&_the_Charismatic_movement. GPT 4 and LLama refused to read the RS claiming that it was "abusive". I can see from their point of view why that is, but nonetheless, RS is RS, so I just read it manually. Between that and the challenges of avoiding copyvios I'm a bit sour on the utility of LLMs for assisting in writing new articles. It's just easier to do it manually. Having said that, the Bing chatbot does have some utility in finding RS relative to Google. Much less crap. Nowa (talk) 00:35, 9 October 2023 (UTC)

If we're going to allow LLM editing, this is a great tool to guide editors to the specific use cases that have community approval (even if those use cases are few to none at this point). I found it to be straightforward and easy to use. –dlthewave 16:06, 23 July 2023 (UTC)

There is no policy or guideline disallowing the use of LLM or other machine learning tools. No need for any approval unless that changes. MarioGom (talk) 17:29, 11 February 2024 (UTC)

What if AI keeps improving?

As AI models keep getting better and better, could this guideline eventually be removed in the future? Electron230 (talk) 22:48, 25 December 2025 (UTC)

I'd have to see a citation for 'AI models keep getting better and better'. There seems to be increasing evidence that in regards to their tendency to hallucinate, the newer versions of LLMs are no better, and may even be getting worse. Beyond that, we can't base policies etc on speculation, and have to deal with current tech, which is clearly unsuited to most of the things it currently gets used for on Wikipedia. AndyTheGrump (talk) 23:09, 25 December 2025 (UTC)
But still, AI model hallucination rates could drastically decrease in the future Electron230 (talk) 23:47, 25 December 2025 (UTC)
Given the way LLMs work, hallucination is an intrinsic property of their models. And that's only one of the problems.
For example, LLMs are trained on material that is not reliable sources according to our policies. Thus the output, even from a perfect machine, is not reliably suitable for inclusion. — rsjaffe 🗣️ 23:51, 25 December 2025 (UTC)
There has been a mathematical proof published which claims to demonstrate that hallucination in LLMs is unavoidable. As I understand it, this has broadly been accepted. Either way, we aren't going to speculate. We deal with what we see now. If you want to deal in hypotheticals, Wikipedia really isn't the place to do it. Not our purpose. AndyTheGrump (talk) 23:53, 25 December 2025 (UTC)
AI models keep getting better and better[citation needed]
But to actually answer the question, Wikipedia:Consensus can change, but let's not waste time on speculation about the future. -- LWG talk (VOPOV) 23:54, 25 December 2025 (UTC)
Yes, but there is no guarantee that the obsolete LLMs will not be used concurrently with the latest and best "AI" models, as is the case now. You can still fire up any number of old LLMs on your computer or use them through various applications. You can also create a really lousy LLM. —Alalch E. 17:52, 9 January 2026 (UTC)
That's a great point. Unfortunately, for the foreseeable future, policy will have to be engineered to account for the least common denominator, so to speak. I do think we need to start being very cognizant of our choice in nomenclature. The truth is that applying the term "AI" to LLMs is a completely idiomatic construction injected into popular usage through what was essentially the manipulative marketing coup of the century by the first companies into this new industry a few years ago. LLMs are not 'AI' in the sense that term was previously used (and still to some degree exclusively is) by prominent researchers in the relevant fields. While it is difficult to go against the grain of an increasingly normalized piece of daily terminology, I think there is value in trying to make that a part of the culture of how we discuss these technologies on this project. Future forms of AI may indeed provide many of the benefits of LLMs with far fewer of the negative trade-offs.
Mind you, I am skeptical of that happening in anything that looks remotely like the immediate future, but then again, despite being somewhat better familiar with the history of AI and generative models than the average person, I can't claim not to have been as blindsided by the LLM explosion as most, so what do I really know? Still, speculatively speaking, I think there's something to be said about not letting our brains become so poisoned by the frustration with LLM slop that we are completely dogmatic in our thinking if/when more robust and less innately problematic models come along. ANNs still have immense untapped potential. SnowRise let's rap 04:15, 11 April 2026 (UTC)

Improving Tutorials on using LLMs for Wikitext Formatting

LLMs are great for formatting Wikitext and can help accessibility tremendously. gemini-cli with Gemini 3.0 & Gemini gems are great tools to take plain text and produce wikitext.

For example, gemini-cli will automatically produce proper wikitext from just plain text and a list of citations. It adds refs, links, templates, formatting. gemini-cli takes care of wikitext boilerplate effort (formatting citation URLs into citation markup, looking up template names & shortcuts, links to relevant wiki pages, applying wikitext formatting). It can read and write using the API.
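To make "formatting citation URLs into citation markup" concrete, here is a hand-written sketch of that kind of boilerplate. It does not show how gemini-cli works internally; the function name `cite_web` and the choice of fields are assumptions for illustration, using parameters of the existing `{{cite web}}` template.

```python
from datetime import date


def _escape(value: str) -> str:
    """Escape pipes so they do not split template parameters."""
    return value.replace("|", "{{!}}")


def cite_web(url, title, website=None, access_date=None):
    """Wrap a URL and title in <ref>{{cite web ...}}</ref> markup.

    Hypothetical helper: only url, title, website and access-date are handled.
    """
    fields = [("url", url), ("title", _escape(title))]
    if website:
        fields.append(("website", _escape(website)))
    if access_date:
        fields.append(("access-date", access_date.isoformat()))
    body = " |".join(f"{key}={value}" for key, value in fields)
    return f"<ref>{{{{cite web |{body}}}}}</ref>"


example = cite_web("https://example.org/a", "A | B",
                   website="Example", access_date=date(2026, 2, 14))
```

Whatever tool fills in these fields, an editor would still need to open the source and confirm that the title and website are correct.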

A lot of investment has been made to improve Wiki Editor accessibility, and LLMs can help overcome those barriers at a much lower cost.

Questions

  • Is there an existing WikiProject or community for utilitarian applications of LLMs?
  • Where would this content (e.g. tutorials, support groups) be most helpful?
  • What other non-controversial or less-controversial applications of LLMs fit into this category?
    • e.g. SimplePedia summaries supervised by an editor?
    • Supervised citation cleanup

Example Input to gemini-cli / Gemini 3.0 Flash

welcome to wikipedia. Here are great tools for new users. Try edits in your sandbox. Check out Village Pump. Submit a draft article for review

Output

Welcome to Wikipedia. Here are some great tools for new users:


Tonymetz 💬 01:43, 14 February 2026 (UTC)

Notice: Planning help pages for AI workflows

Still an essay?

The short description still describes it as an essay and the page was only recently converted to an info page. @Alalch E. did some good work pruning this but I think more needs to be done for it to sufficiently reflect our PAGs to be called an info page.

  • WP:NOLLM simply states, in bold, the use of LLMs to generate or rewrite article content is prohibited, then listing two limited exceptions
  • Meanwhile, this page has a section header called Specific competence is required - with accompanying redirect WP:LLMCIR - which sure looks like permission for someone who considers themselves competent to do whatever they want. And "competence" is not covered in WP:NOLLM at all. Instead, the phrase used (for copyediting) is "caution is required"
  • The third paragraph from the opening section similarly sounds like a holdover from the "ok if human-reviewed" pre-NOLLM days

This is causing confusion "in the field", see WP:LLMS#Usage being referenced by a newer editor in Special:Diff/1347898723 for an example. NicheSports (talk) 00:28, 10 April 2026 (UTC)

Thanks for the ping. I have removed those two items with the following edit summary: “provisionally remove content that can be misunderstood/used-for-wikilawyering; to keep thinking about what the best thing to say about responsible use of LLMs outside of the near-blanket ban (the ban only applies to article space)”. Notably, the ban only applies to article space and that content about not misusing LLMs was intended to address LLM misuse in areas outside the ban. It is possible to violate NOT, BLP, and especially the copyrights policy by misusing an LLM in userspace for example. It was not intended to be understood to weaken the near-blanket articlespace ban whatsoever. —Alalch E. 12:05, 10 April 2026 (UTC)
Thanks Alalch, I think that change will help materially. A few other ideas
NicheSports (talk) 14:26, 10 April 2026 (UTC)
I am broadly supportive of this information page, and think it reflects the best of what has emerged from recent community discussions, with the one caveat that I am still concerned that our carve-outs for accessibility features are not as robust as they could be. However, strong support notwithstanding, I want to make sure that the transition from essay to information page was the result of a formal WP:PROPOSAL discussion; although I followed a number of community discussions on this topic over the last couple of years, I was not as engaged over the last three or four months, when the rubber really seems to have hit the road. Can someone point me towards the authorizing discussions? Because if WP:PROPOSAL was not followed to the letter for this and related pages, we should do that now to eliminate any doubt about the status of this guidance. SnowRise let's rap 04:23, 11 April 2026 (UTC)
Policy does not require information pages to undergo a formal PROPOSAL discussion. (1) WP:PROPOSAL and WP:INFOPAGE do not require or even suggest that information pages be formally adopted as such (the latter only states that consensus is needed to mark a page as a supplement of a particular policy or guideline and link to it in this capacity from the P/G, but this is an information page, not a supplement, and the page does not need to be linked from the guideline to serve its purpose); (2) the vast majority of pages in category:Wikipedia information pages have not gone through the same process.
Where essay pages offer advice or opinions through viewpoints, information pages should supplement or clarify information about Wikipedia impartially. Essay pages and information pages have the same level of vetting and formal acceptance, with the difference being substantive and functional, not formal.
This page was an essay during the time when it presented certain original viewpoints. When the near-blanket ban was adopted, I changed it to make sure it contains no remaining original viewpoints and that it provides information on this topic of editorial practices impartially and in harmony with the guideline. This caused it to materially stop being an essay, so I replaced the essay tag with the information page tag. It could be a bad information page or a good one, but it’s not an essay.
Starting from there, the only thing I oppose is:
  • Marking the page as an essay
Things I support:
  • Not doing anything about its status and letting it be discussed and improved in the usual way
  • Deleting it
  • Merging it into another information page
  • Moving it to my userspace, with the sole reason that I could enforce that it does not have the essay tag (for as long as circumstances permit); it would not have any tag
  • Marking it as historical, without an essay tag
  • Running it through PROPOSAL to mark it as a supplement of WP:LLM, then marking it as a failed proposal, without an essay tag.
  • Running it through PROPOSAL to mark it as an information page then marking it as a failed proposal, without an essay tag. (I oppose this in general, for any page, but, pragmatically, here I support it, to highlight marking the page as an essay as the only thing I oppose)
Alalch E. 10:33, 11 April 2026 (UTC)
I agree with all those ideas / points for improvement (incl. renaming “Usage”)—except adding content about AI images, because LLMs, while they can natively also create images, are one in a range of technologies that are used to create what we would call AI images. When our existing P&G talk about AI images, that does not refer to LLMs in particular. —Alalch E. 10:51, 11 April 2026 (UTC)
Well, let's start here: which formally adopted guidelines would you say this page supplements? Provided those guidelines have themselves passed WP:PROPOSAL and this page is highly in conformity with them, then any concerns about its status are reduced (although there is the usual concern about keeping that fidelity up to date). However, I must say that even in that instance, I would be in favour of some community discussion, perhaps hosted on this talk page but promoted at the village pump, to form some sort of formal community consensus of the infopage status. While IPs may not need to go through all of the formalities of a full WP:PROPOSAL process, in my experience they still tend to be authorized by community consensus--though I'm sure there are a fair few that were promoted less formally. Let me stress again that I think you have done a fine job of summarizing central tenets of consensus that I have seen for the LLM scheme in various discussions (though again, I missed some of the more recent ones), so my preference for a formal vetting is to get additional codification to solidify its guidance as PAG-adjacent. SnowRise let's rap 11:00, 11 April 2026 (UTC)
It does not supplement any in the sense of WP:SUPPLEMENT, as an WP:INFOPAGE. —Alalch E. 11:18, 11 April 2026 (UTC)
Well, I don't necessarily mean supplement in the precise sense referenced in WP:SUPPLEMENT. I mean, which guidelines on LLMs would you say this page summarizes? SnowRise let's rap 11:34, 11 April 2026 (UTC)
It does not purport to be a canonical summary of any single guideline or policy such that it could be seen as a good idea to point to this page as a summary when the guideline or policy could be seen by someone as too long or too technical. —Alalch E. 11:37, 11 April 2026 (UTC)
Ok, work with me, Alalch--because I assure you my objective here is to validate, rather than tear down, this guidance, but so far the responses have only heightened, rather than alleviated, my concern that this page has not been vetted enough to qualify as an info page. IPs are definitely meant to summarize and contextualize policy--maybe not a particular policy in isolation, but definitely they are meant to capture some aspect of codified community consensus. So, whether it's a big or narrow span of pages, can you say from where you feel you were sourcing the principles discussed on this page? That will help me frame a proposal through RfC or a VP listing to promote this to infopage status, which I believe it will easily pass, because of the care you put into it. I could try to reverse engineer which accepted bits of guidelines you were summarizing here, but as you are the primary author, I'd rather take your lead on that. SnowRise let's rap 05:31, 13 April 2026 (UTC)
Promotion to an information page? This is a fundamental error on how WP:Essays function within the community. The difference is that one is for opinion and the other is for supplementing or clarifying. What is this page doing?
  • {{How-to}} – the cleanup/dispute tag for articles written in a "how-to" style.
  • {{Information page}} – the banner template for pages that are more just informational than directly instructional like WP:GOV.
  • {{Wikipedia how-to}} – the banner template for pages that are more directly instructional than just informational like H:EDIT
  • {{Essay|interprets=}} – the banner template for pages that are more opinionated than instructional or technical; there are several parameters and variations.
  • {{Supplement|interprets=}} – to tag a well-established page that adds something to a policy or guideline, to make up for a deficiency, and when it is referenced in the guideline or policy like WP:BRD.
So the real question is: should this page cover the 4 main protocols and use essay Template:Supplement?
Moxy🍁 06:25, 13 April 2026 (UTC)
To my eye, the page has the look of an information page or supplement. I certainly think we can shoot higher than essay, but my read of the relevant policy, and certainly what is the general consensus that I have observed, is that anything that purports to represent some degree of community consensus requires more than just the author unilaterally promoting it merely by swapping out the labeling template, no criticism intended to Alalch. Call me WP:BURO if you like, but I think there is value in being pro forma and that we can afford to take a moment to put it to the community, to assure its status doesn't become a bone of contention at a later date. It's good work and aligns with robust community concerns in this area and the rules that have recently been developed around LLMs; I can't imagine it wouldn't easily pass. SnowRise let's rap 07:14, 13 April 2026 (UTC)
I am one of the primary authors of Wikipedia:Information pages and Wikipedia:Essays alongside the wording of the templates above. It's very concerning that the pages aren't clear that there are only two types of pages: "Protocol" and "non-Protocol" pages. We have vetted pages and non-vetted pages. I tried to make this clear at Template:Supplement#Note as well.
Best not waste the community's time trying to vet an essay-classified page (that by definition includes Info and how-to pages). Wikipedia:The difference between policies, guidelines, and essays
So....what is the concern here? Do you believe it's an opinion piece over informative? This is a lot of talk about a banner rather than about improvement ideas. Moxy🍁 07:40, 13 April 2026 (UTC)
Well, I respect your role in having helped create the classification scheme, but I must tell you that in my non-trivial time on the project, I have always had the firm understanding that information pages are considered an intermediate classification of community consensus between PAGs and essays, and if that was not meant to be the intention way back when, the confusion is likely down to the wording of WP:INFOPAGES itself, which seems to confirm as much. And I'm very confident this is a common interpretation today. Users tend to grant guidance labelled as an info page a significantly higher status than anything labelled merely an essay, which all but the newest users realize are mere opinions held by a discrete number of users, and perhaps as few as one.
As to the issue with this page, it's that I personally feel this should be labelled an info page and benefit from the somewhat elevated status as guidance that such a designation (as far as I have always believed) conveys. But, at the same time, I think that does require some sort of community vetting. Not necessarily ginormous, but certainly an RfC does not seem out of proportion. SnowRise let's rap 08:06, 13 April 2026 (UTC)
I'd add also that insofar as this page, whatever its designation, sits on top of a very centrally relevant namespace in terms of an area of recently particularly active policy development, it should certainly be used as a summary of those guidelines. Which I believe in fact the prose does. But that underscores the need to vet, because others might disagree with me about that (much though I doubt it). SnowRise let's rap 08:15, 13 April 2026 (UTC)
I assure you the majority get it, as it's clearly outlined at WP:PRJCRE. I'm just surprised at your POV that we need an RfC to specify what type of essay this is. No one else sees a problem at this point.
So, to the point. We have 154 editors involved in developing the page, many from Wikipedia:WikiProject AI Cleanup. Your plan is to get it vetted by whom now? And vetted for what, the type of essay banner? Sounds like a metapedian time sink. Again, what is the content problem that would not classify this as an info page? Moxy🍁 08:43, 13 April 2026 (UTC)
Well, I admit I have no recollection of ever reading WP:PRJCRE before, despite it being located not so very far below WP:PROPAGES. I concede the point, and apparently there is no additional community vetting needed here--though if info pages are indeed a 'variety of essay', I do believe WP:INFOPAGES could benefit from some adjustments to make this distinction more express. SnowRise let's rap 09:56, 13 April 2026 (UTC)

Use of AI to find sources?

I occasionally use AI when writing Wikipedia articles to find academic sources about the topic when searching for them manually would be very tedious (obviously I then use my own discretion to determine whether to use them; e.g. some of the Cambridge and ScienceDirect "Handbook of..." books have been extremely useful, whereas I tend to disregard any MDPI or Frontiers articles it recommends). Surely this should be allowed (along with copyediting or translating)? Enoryt nwased lamaj (talk) 22:24, 14 April 2026 (UTC)

As long as you are reading the sources yourself and you aren't using AI to write the text or decide what citations to include where, it doesn't matter what tools you used to find the sources. The rule is using LLMs to generate or rewrite article content is prohibited. So using a chatbot to find sources is fine, since finding sources is not generating or rewriting article content. -- LWG talk (VOPOV) 03:36, 15 April 2026 (UTC)
Thanks! Enoryt nwased lamaj (talk) 03:40, 15 April 2026 (UTC)
