Wikipedia talk:Large language models

Chatbot to help editors improve articles

After selecting text, the control panel on the right is used to give instructions. The responses by the AI model are presented in the chat panel on the left.

I wrote a user script called WikiChatbot. It works by selecting text in an article and then clicking one of the buttons on the right to enquire about the selected text. It includes many functions. For example, it can summarize and copyedit the selected text, explain it, and provide examples. The chat panel can also be used to ask specific questions about the selected text or the topic in general. The script uses the AI model GPT-3.5 and requires an API key from OpenAI. New OpenAI accounts can use the API for free for the first 3 months, with certain limitations. For a more detailed description of all these issues and examples of how the script can be used, see the documentation at User:Phlsph7/WikiChatbot.
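For readers curious about the mechanics: a script like this essentially packages the selected text and an instruction into a call to OpenAI's chat-completions endpoint. The sketch below builds such a request (the endpoint URL, header format, and payload shape follow OpenAI's documented API; the prompt wording and function name are illustrative, not WikiChatbot's actual code):

```python
import json

def build_chat_request(api_key, selected_text, instruction):
    """Build the pieces of an OpenAI chat-completions HTTP request of the
    kind a chatbot script sends. Illustrative sketch only; the real
    script's prompts and structure differ."""
    body = {
        'model': 'gpt-3.5-turbo',
        'messages': [
            # The instruction (e.g. "Summarize this text.") is combined
            # with the text the editor selected in the article.
            {'role': 'user', 'content': f'{instruction}\n\n{selected_text}'},
        ],
    }
    headers = {
        'Authorization': f'Bearer {api_key}',  # the user's OpenAI API key
        'Content-Type': 'application/json',
    }
    url = 'https://api.openai.com/v1/chat/completions'
    return url, headers, json.dumps(body)
```

Sending the request (and handling rate limits and errors) is then a matter of any HTTP client.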

I was hoping to get some feedback on the script in general and how it may be improved. I tried to follow WP:LLM in writing the documentation of the chatbot. It would be helpful if someone could take a look to ensure that it is understandable and that the limitations and dangers are properly presented. I also added some examples of how to use edit summaries to declare LLM usage. These suggestions should be checked. Feel free to edit the documentation page directly for any minor issues. I'm also not sure how difficult it is to follow the instructions so it would be great if someone could try to set up the script, use it, and explain which steps were confusing. My OpenAI account is already older than 3 months so I was not able to verify the claims about the free period and how severe the limitations are. If someone has a younger account or is willing to open a new account to try it, that would be helpful.

Other feedback on the idea in general, on its problems, or on new features to implement is also welcome. Phlsph7 (talk) 12:45, 12 July 2023 (UTC)

I meant to reply to this sooner. This is awesome and I'm interested in this (and related ideas) related to writing / reading with ML. I'll try to have a play and give you some feedback soon. Talpedia 10:18, 17 July 2023 (UTC)
Related: see also m:ChatGPT plugin. Mathglot (talk) 07:22, 18 July 2023 (UTC)
Whilst I rather like the ability of this nifty little script to do certain things, I do have some criticism. These functions strike me as extremely risky, to the point that they should probably be disabled:
  • "is it true?" - ChatGPT likely uses Wikipedia as a source, and in any case, we want verifiability, not truth. I feel quite strongly, based on several other reasons too, that this function should be disabled and never see the light of day again.
  • "is it biased?" - ChatGPT lacks the ability to truly identify anything more than glaring "the brutal savages attacked the defenceless colonist family" level bias (i.e. something that any reasonably aware human should spot very quickly indeed). Best left to humans.
  • "is this source reliable?" - Same as the first one, this has so much potential to go wrong that it just shouldn't exist. Sure it might tell you that Breitbart or a self-published source isn't reliable, but it may also suggest that a bad source is reliable, or at least not unreliable.
I don't think that any amount of warnings would prevent misuse or abuse of these functions, since there will always be irresponsible and incompetent people who ignore all the warnings and carry on anyway. By not giving them access to these functions, it will limit the damage that these people would cause. Doing so should not be a loss to someone who is using the tool responsibly, as the output generated by these functions would have to be checked so completely that you might as well just do it without asking the bot.
The doc page also needs a big, obvious warning bar at the top, before anything else, making it clear that use of the tool should be with considerable caution.
The doc page also doesn't comment much on the suitability of the bot for specific tasks, as it is much more likely to stuff up when using certain functions. It should mention this, and also how it may produce incorrect responses for the different tasks. It also doesn't mention that ChatGPT doesn't give wikified responses, so wikilinks and any other formatting (bold, italics, etc.) must be added manually. The "Write new article outline" function also seems to suggest unencyclopaedic styles, with a formal "conclusion", which Wikipedia articles do not have.
Also, you will need to address the issue of WP:ENGVAR, as ChatGPT uses American English, even if the input is in a different variety of English. Mako001 (C)  (T)  🇺🇦 01:14, 23 July 2023 (UTC)
You can ask it to return wikified responses and it will do so with a reasonably good success rate. -- Zache (talk) 03:03, 23 July 2023 (UTC)
@Mako001 and Zache: Thanks for all the helpful ideas. I removed the buttons. I gave a short explanation at Wikipedia:Village_pump_(miscellaneous)#Feedback_on_user_script_chatbot and I'll focus here on the issues with the documentation. I implemented the warning banner and added a paragraph on the limitations of the different functions. That's a good point about the English variant being American, so I mentioned that as well. I also explained that the response text needs to be wikified before it can be used in the article.
Adding a function to wikify the text directly is an interesting idea. I'll experiment a little with that. The problem is just that the script is not aware of the existing wikitext. So if asked to wikify a paragraph that already contains wikilinks then it would ignore those links. This could be confusing to editors who only want to add more links. Phlsph7 (talk) 09:12, 23 July 2023 (UTC)
I made summaries/translations/etc. by giving wikitext as input to ChatGPT instead of plain text. However, the problem here is how to get the wikitext from the page in the first place. -- Zache (talk) 09:48, 23 July 2023 (UTC)
In principle, you can already do that with the current script. To do so, go to the edit page, select the wikitext in the text area, and click one of the buttons or enter your command in the chat panel of the script. I got it to add wikilinks to an existing wikitext, and a translation was also possible. However, it seems to have problems with reference tags and kept removing them, even when I told it explicitly not to. I tried it on the sections Harry_Frankfurt#Personhood and Extended_modal_realism#Background, both with the same issue. Maybe this can be avoided with the right prompt. Phlsph7 (talk) 12:09, 23 July 2023 (UTC)
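One possible workaround for the dropped reference tags, independent of prompt engineering: stash the <ref>...</ref> spans behind placeholder tokens before sending the wikitext to the model, then restore them afterwards. This is a sketch, not part of the script; the placeholder format and function names are made up for illustration:

```python
import re

# Matches self-closing refs (<ref name="x"/>) and paired <ref>...</ref> spans.
REF_PATTERN = re.compile(r'<ref[^>/]*/>|<ref[^>]*>.*?</ref>', re.DOTALL)

def protect_refs(wikitext):
    """Replace reference tags with placeholder tokens so an LLM
    cannot alter or drop them. Returns the protected text and the
    stashed tags in order."""
    refs = []
    def stash(match):
        refs.append(match.group(0))
        return f'@@REF{len(refs) - 1}@@'
    return REF_PATTERN.sub(stash, wikitext), refs

def restore_refs(text, refs):
    """Put the stashed reference tags back into the model's output."""
    for i, ref in enumerate(refs):
        text = text.replace(f'@@REF{i}@@', ref)
    return text
```

This only helps if the model leaves the placeholder tokens in place, which is a much easier instruction for it to follow than preserving full reference markup.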
Thanks for setting this up. I've recently had success drafting new Wikipedia articles by feeding the text of up to 5 RS into GPT4-32k through openrouter.com/playground and simply asking it to draft the article. It does a decent job with the right prompt. You can see an example at Harrison Floyd. I'll leave more details on the talk page of User:Phlsph7/WikiChatbot, but I wanted to post here for other interested parties to join the discussion. Nowa (talk) 00:02, 20 September 2023 (UTC)
Thanks for the information. I've responded to you at Talk:Harrison_Floyd#Initial_content_summarized_from_references_using_GPT4 so that we don't have several separate discussions about the same issue. Phlsph7 (talk) 07:44, 20 September 2023 (UTC)
Ran into a brick wall I thought might be helpful to know about. I've been working on the bios of people associated with Spiritual_warfare#Spiritual_Mapping_&_the_Charismatic_movement. GPT-4 and LLaMA refused to read the RS, claiming that it was "abusive". I can see from their point of view why that is, but nonetheless, RS is RS, so I just read it manually. Between that and the challenges of avoiding copyvios, I'm a bit sour on the utility of LLMs for assisting in writing new articles. It's just easier to do it manually. Having said that, the Bing chatbot does have some utility in finding RS relative to Google. Much less crap. Nowa (talk) 00:35, 9 October 2023 (UTC)

If we're going to allow LLM editing, this is a great tool to guide editors to the specific use cases that have community approval (even if those use cases are few to none at this point). I found it to be straightforward and easy to use. –dlthewave 16:06, 23 July 2023 (UTC)

There is no policy or guideline disallowing the use of LLM or other machine learning tools. No need for any approval unless that changes. MarioGom (talk) 17:29, 11 February 2024 (UTC)

LLM bank pages

Maybe the wrong location to post this, but I'm looking for a third opinion. A few pages on banks have been created that I believe were written using LLMs. I verified this through online AI checkers, which confirm it, but the user stated they are not LLM-generated and moved them back to main space. Hoping for a third opinion before moving them back to draft, as they are pretty bad (unsourced, promotional, etc.). See Haryana Gramin Bank, Chhattisgarh Gramin Bank, Jammu and Kashmir Grameen Bank, Rajasthan Gramin Bank, and Gujarat Gramin Bank. CNMall41 (talk) 04:50, 15 December 2025 (UTC)

They are highly formulaic, contain boilerplate promotional content, and the editor's only response is to edit-war to retain the problematic content. This is true even if it's not LLM. DMacks (talk) 04:56, 15 December 2025 (UTC)
I invited them to the discussion although I do not have high hopes. I think I may just send them all to G15.--CNMall41 (talk) 04:57, 15 December 2025 (UTC)
On the fence about whether these are AI but in future this sort of thing can be asked at WP:AINB where more people may see it. Gnomingstuff (talk) 06:55, 17 December 2025 (UTC)
That is where I should have filed but wasn't sure such a place existed until now. Thanks for pointing it out. --CNMall41 (talk) 07:27, 17 December 2025 (UTC)

What if AI keeps improving?

As AI models keep getting better and better, could this guideline eventually be removed in the future? Electron230 (talk) 22:48, 25 December 2025 (UTC)

I'd have to see a citation for 'AI models keep getting better and better'. There seems to be increasing evidence that, with regard to their tendency to hallucinate, the newer versions of LLMs are no better, and may even be getting worse. Beyond that, we can't base policies etc. on speculation, and have to deal with current tech, which is clearly unsuited to most of the things it currently gets used for on Wikipedia. AndyTheGrump (talk) 23:09, 25 December 2025 (UTC)
But still, AI model hallucination rates could drastically decrease in the future Electron230 (talk) 23:47, 25 December 2025 (UTC)
Given the way LLMs work, hallucination is an intrinsic property of their models. And that's only one of the problems.
For example, LLMs are trained on material that is not reliable sources according to our policies. Thus the output, even from a perfect machine, is not reliably suitable for inclusion. — rsjaffe 🗣️ 23:51, 25 December 2025 (UTC)
There has been a mathematical proof published which claims to demonstrate that hallucination in LLMs is unavoidable. As I understand it, this has broadly been accepted. Either way, we aren't going to speculate. We deal with what we see now. If you want to deal in hypotheticals, Wikipedia really isn't the place to do it. Not our purpose. AndyTheGrump (talk) 23:53, 25 December 2025 (UTC)
AI models keep getting better and better[citation needed]
But to actually answer the question Wikipedia:Consensus can change, but let's not waste time on speculation about the future. -- LWG talk (VOPOV) 23:54, 25 December 2025 (UTC)
Yes, but there is no guarantee that the obsolete LLMs will not be used concurrently with the latest and best "AI" models, as is the case now. You can still fire up any number of old LLMs on your computer or use them through various applications. You can also create a really lousy LLM. —Alalch E. 17:52, 9 January 2026 (UTC)

Disclose

The Grid, sorry but I'm struggling to make sense of your edit summary, WP:ACCUSATION has nothing to do with writing guidelines etc. At the moment we have Regardless, it is obvious that most editors prefer that users who use LLMs on Wikipedia disclose that use, and as of 2025, many users have been blocked for misusing LLMs and systematically not disclosing—including after being asked or warned about it—which made it impossible to start a constructive dialogue with them., the addition to the first paragraph (False denial of LLM-use when asked is likely to be met with sanctions.) was to state this practice clearly Kowal2701 (talk) 22:22, 16 January 2026 (UTC)

For some reason I read it as "accusing someone of using LLM when false would lead to sanctions" – The Grid (talk) 00:24, 17 January 2026 (UTC)
lol no worries Kowal2701 (talk) 01:02, 17 January 2026 (UTC)

Improving Tutorials on using LLMs for Wikitext Formatting

LLMs are great for formatting wikitext and can help accessibility tremendously. gemini-cli with Gemini 3.0 and Gemini Gems are great tools for turning plain text into wikitext.

For example, gemini-cli will automatically produce proper wikitext from just plain text and a list of citations. It adds refs, links, templates, and formatting, taking care of the wikitext boilerplate effort (formatting citation URLs into citation markup, looking up template names and shortcuts, adding links to relevant wiki pages, applying wikitext formatting). It can read and write using the API.
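To make the "citation URLs into citation markup" step concrete, here is a hand-rolled sketch of the kind of boilerplate such a tool automates. The {{cite web}} template and its |url=, |title=, |website=, and |access-date= parameters are real English Wikipedia conventions; the helper function itself is hypothetical and not part of gemini-cli:

```python
def cite_web(url, title, website, access_date):
    """Format a bare URL and its metadata as a <ref> containing a
    {{cite web}} template. Hypothetical helper for illustration."""
    return (f'<ref>{{{{cite web |url={url} |title={title} '
            f'|website={website} |access-date={access_date}}}}}</ref>')

# Example:
# cite_web('https://example.org/report', 'Annual Report',
#          'Example.org', '14 February 2026')
# -> '<ref>{{cite web |url=https://example.org/report |title=Annual Report |website=Example.org |access-date=14 February 2026}}</ref>'
```

An LLM does the same transformation, plus fetching the title and site name from the page itself, which is where most of the manual effort goes.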

A lot of investment has been made to improve the accessibility of Wikipedia's editing tools, and LLMs can help overcome the remaining wikitext barriers at a much lower cost.

Questions

  • Is there an existing WikiProject or community for utilitarian applications of LLMs?
  • Where would this content (e.g. tutorials, support groups) be most helpful?
  • What other non-controversial or less-controversial applications of LLMs fit into this category?
    • e.g. SimplePedia summaries supervised by an editor?
    • Supervised citation cleanup

Example Input to gemini-cli / Gemini 3.0 Flash

welcome to wikipedia. Here are great tools for new users. Try edits in your sandbox. Check out Village Pump. Submit a draft article for review

Output

Welcome to Wikipedia. Here are some great tools for new users:


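(The wikified markup was lost when this example was copied here. An illustrative reconstruction of the kind of output meant, not the model's verbatim response, would be:)

```
Welcome to Wikipedia. Here are some great tools for new users:
* Try edits in your [[Wikipedia:Sandbox|sandbox]].
* Check out the [[Wikipedia:Village pump|Village Pump]].
* [[Wikipedia:Articles for creation|Submit a draft article]] for review.
```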
Tonymetz 💬 01:43, 14 February 2026 (UTC)

Notice: Planning help pages for AI workflows
