Wikipedia talk:WikiProject AI Tools

From Wikipedia, the free encyclopedia

Looking for participants in a GenAI factuality study

Hi. I'm working with a team from Columbia University, funded by a Wikimedia Foundation rapid grant. We are seeking Wikipedia editors who are willing to participate in a study on GenAI reliability, with a commitment of 10-20 hours from mid-December 2025 to mid-January 2026, and a symbolic stipend to compensate for your time.

The Research Project. Our goal is to find out whether using a Wikipedia-inspired fact-checking process can increase the reliability of chatbots responding to queries related to Wikipedia's content. The study uses open-source language models and frameworks, and our full results will be openly shared, with the aim of finding better methods for addressing AI hallucinations, inspired by the well-established and highly successful practices of Wikimedia projects.

Please note that this project is a '''pure and contained experiment''' for analyzing how close large language models come to editor-level factuality. We don't plan on implementing any live tools at the moment.

The Task. Participants will be asked to fact-check an AI-generated response to a general knowledge question. This will be done by checking whether each claim in a paragraph-long response is supported by the provided sources (each paragraph will be supported by up to 3 citations, and the text of each citation is up to a few paragraphs long).

Each participant will be asked to fact-check about 50 samples, with flexibility to do a bit more or less according to your availability. We recognize that this will be a demanding task, which is why we're offering a stipend to those willing to make the time. The amount of the stipend is based on the number of samples fact-checked.

Privacy & Security. If you choose to participate, we’re open to either crediting your efforts in our paper, or maintaining your full anonymity, whichever you prefer.

We adhere to the Wikimedia Foundation's privacy policy. Participants may be asked to provide basic demographics for research purposes, which will be completely discarded after the research concludes in early 2026.

Participation. All Wikipedia editors are eligible to participate. For methodological purposes, we may prioritize editors with expertise in specific subject matters, more Wikimedia project editing experience, or a focus and interest in fact-checking. If interested, please take a few minutes to submit the form! (Qualtrics external link). If you're not comfortable filling out an external form, you may send the answers to me directly via EmailUser.

Happy to share the research proposal or answer any questions! –Abbad (talk) 00:27, 26 November 2025 (UTC).

@عباد ديرانية Well that seems like a major waste of time. The page you linked to says We'll build an experimental AI assistant for readers that exclusively draws answer from Wikipedia pages, and integrates an explicit and novel fact-checking step into its architecture that's inspired by Wikipedia's own fact-checking process by editors. and This assistant is not intended for public use but only as a time-bound experiment, which will be used for rigorous testing and evaluation of this model's reliability compared to Wikipedia's baseline of reliable information and using open source large language models (LLMs) as fact-checkers that can provide a reliable paraphrasing of Wikipedia's content
  1. it won't be able to differentiate between its training data and the Wikipedia pages it's supposed to use as sources.
  2. current LLM technology can't reliably paraphrase or summarize content
  3. training models requires copyright infringement on a massive scale, or it will be inferior to alternatives, which already have an established install base and a trillion dollars; kinda difficult to compete with.
  4. doesn't it make more sense to actually check the sources and verify if they support the claim made in the article, instead of having yet another chatbot which can do something any chatbot can do, but worse?
  5. 2,300 dollars is not enough to achieve something meaningful.
  6. sample size is tiny
  7. moderate agreement is a very very low bar
  8. We'll consider this a success if more than two thirds of respondents support further experimentation in the future. Makes no sense; of course 100% will support further experimentation. I do too, but not down this dead-end street. Having people support further experimentation does not mean this was a good idea.
  9. It will just be another lossy unreliable vague layer between users and reliable sources, like Wikipedia often is. We need less of that (e.g. by using the |quote= parameter), not more.
  10. This sounds like "I want to use AI, let me invent a use case" not "I have a problem, let me fix it with whatever the best tool for the job is".
  11. It is unclear what the results will be used for. The output will just be some numbers, which are meaningless by themselves.
  12. It is unclear what an explicit and novel fact-checking step into its architecture that's inspired by Wikipedia's own fact-checking process by editors. means. Using MiniCheck isn't novel, and "We'll ask an AI model to check the work of an AI model" leads to diminishing returns. If MiniCheck can do verification, why can't the original model incorporate fact-checking? The root problem is that the base model generates facts, half-truths and nonsense. Instead of trying to sort fact from fiction, the goal should be to create a model that can verify its own output during generation, but that is far outside the scope of the WMF.
  13. A binary metric (true/false) is clearly inadequate when checking if the paraphrasing is any good. A good summary doesn't leave out important facts; yet the proposal only measures pure falsehoods instead of omissions of important stuff, distortions, cherry picking, loss of nuance, synthesis et cetera. Pure hallucinations are a minority of the mistakes an LLM makes, but according to the proposal they're the only ones being measured.
  14. We already had this same discussion, for example over at Wikipedia:Village_pump_(technical)/Archive_221#Simple_summaries:_editor_survey_and_2-week_mobile_study. So when the response was universally negative, and we already know why this can't work, why try again?
  15. Why ask for volunteers and WMF money when Wikipedia doesn't benefit from the results? Why ask Wikipedians, who have a lot of stuff to do, to volunteer to do stuff that doesn't help Wikipedia? It's not like the AI companies will improve their products based on the results, and one can't improve Wikipedia based on the outcome, so who benefits?
  16. The proposal says We'll build an experimental AI assistant and if that was true testing it would make sense. But it also says the plan is to just mash some pre-existing stuff together. If so, why ask volunteers to check how good or bad Llama and MiniCheck are? Shouldn't Meta Platforms employees test Llama? Shouldn't Mistral AI SAS employees test Mixtral? These are commercial companies who can surely hire some people to test their stuff, if they wanted. If there is no plan to add anything new that should improve performance, why bother testing? One datapoint is no datapoint. I already know the outcome: current AI tech is not as good as humans, especially not the nerdy type who edits Wikipedia, and attempts to quantify the difference are pointless because they are just a weighted random number generator one could build a narrative around. In order to make it slightly less meaningless you'd have to keep doing it with each new version and track performance over time, but that would only help AI companies, not Wikipedia.
You can't measure success by comparing this chatbot against commercially available chatbots. The correct baseline is Wikipedia itself, which anyone can access already and read what it actually says.
Showing that this chatbot produces fewer errors than commercial LLMs only proves that it is slightly less bad than commercial LLMs, not that it is a good approach to deliver Wikipedia content to users.
If any hallucinations or distortions are added by the chatbot, then it is worse than just reading Wikipedia yourself.
The interesting variable is how many hallucinations/misrepresentations/distortions are added compared to just reading the Wikipedia articles; how the chatbot compares to commercial LLMs is irrelevant to us.
I may be stupid but I don't get it. Polygnotus (talk) 00:49, 26 November 2025 (UTC)
@DSaroyan (WMF) and FElgueretly-WMF: Please explain why this is a good idea. Which technical experts has the Review team consulted? It would be nice to hear from them as well. It is also unclear to me how a Rapid grant can be awarded to a project that is ineligible: Applications to complete proposed research related to the Wikimedia movement are not eligible. Please review the Wikimedia Research Fund for these funding opportunities. --meta:Grants:Project/Rapid#Eligibility_requirements Thanks, Polygnotus (talk) 01:06, 26 November 2025 (UTC)
This was also posted over at Wikipedia:Village_pump_(miscellaneous)#Looking_for_participants_in_a_GenAI_factuality_study. Doubleposting is generally discouraged because it wastes people's time. Polygnotus (talk) 03:29, 26 November 2025 (UTC)
@Polygnotus I appreciate the thoughtful critique. To what I interpret as your main point - yes, any hallucinations are bad. However, LLMs are already prevalent in industry and academia, as you must know, and from our daily observations, their use almost completely lacks any sense of responsibility towards reliability. Honestly, Wikipedia itself, as a tertiary source, shouldn't even be the ideal baseline for factuality, but we recognize that research is an incremental endeavour, therefore our approach is to start with introducing a methodical way to improve over the status quo of LLM usage. Realistically, we can't even expect LLMs to improve without such experiments. Please note that because Wikipedia is our chatbot's source, it is effectively a baseline for this study as well.
In-line responses:
  1. Points 1-3: We examined the differentiation between retrieval and training data in depth when scoping our research, and we have two considerations: A. From our literature review, we're aware of methods that aim exactly to differentiate when an LLM's answer is grounded in the provided context versus training data. If our resources allow, we do aim to implement the methodology from this paper in drawing this differentiation. However, this is a challenging setup, and our team is 100% volunteer-based (or more like 90%; we had a little budget planned for some team members, but with fiscal sponsorship + paying evaluators + computing, we now expect a surplus of only a couple hundred USD), so even with the humble grant we may not be able to go that far. B. The eventual purpose of this study is to evaluate the factuality of LLMs in practice. Whether they make errors due to their training data, architecture, or Wikipedia-grounded context, it's eventually an error.
  2. Re: Point 4: 100% agreed, and honestly my original idea was to build something exactly like the Source Verification tool using the MiniCheck model, which is open source, very lightweight, and has shown impressive accuracy in dozens of experiments that I did with it. My fellow researchers recommended a RAG approach because it has much more impact on the irresponsible use of chatbots in the industry, which is true. Also, because I discovered now that the Source Verification tool exists, I'm not sure if this approach is any different. I do still hope to run a methodical experiment, once we're done with this project, by: A. Extracting the full text of some citations (e.g. a book), B. Extracting instances where they're cited on Wikipedia pages, C. Running the full text + cited phrases through MiniCheck to see how accurate it is. I believe the results could be impressive.
  3. Re: Points 5 - 6: Indeed! That's why all the researchers are 100% volunteers. We're doing what we can with our budget, but we also understand that the community may not support pouring larger resources into experimental research at this point.
  4. Re: Point 7: This is almost exclusively the annotation baseline from other LLM research we ran across. I'll do more homework on this, but please feel free to advise if you're aware of alternatives.
  5. Re: Point 8: This is a goal to determine the success of the grant itself, so it needed to be experiment-tied, and a user-testing goal seemed appropriate. You're right, though, and I'm open to revising it. I'm hesitant to set a specific goal for factuality improvement because we won't know, obviously, until we conduct the experiment.
  6. Re: Point 9: While I don't disagree, lossy middle layers are not only a reality, but a necessity. As you mention, Wikipedia itself is a mediator of information, simply because most people lack the depth of knowledge and/or the time to digest information directly from secondary sources. LLMs, as far as we know, are here to stay, and this is a debate about that reality rather than about how it can be improved.
  7. Re: Points 10-11: This is clearly a huge use case, which is literally why we opted for it (over, as I mentioned above, what could be personally more interesting to me in terms of a tool to fact-check Wikipedia sources). For example, my company, which is not special in this in any way, pays for what's easily hundreds of millions of LLM queries a month, mainly to power chatbots. As of now, the vast majority of these chatbots on the internet barely make any attempt at truth-seeking that's analogous to what we're proposing. The results from our study have the basic purpose of proving or disproving that the approach we're trying can have an impact on factuality. In case it does, that's an improvement on the status quo that will affect millions of users.
  8. Re: Point 13: Yes, strictly speaking, this is a factuality-centered study. Other aspects would fall under a summarizing task.
  9. Re: Points 14 - 16: This is very intentionally designed as an experiment of how existing tools like MiniCheck work. MiniCheck has already been developed, but how do we know if it's doing its job well? The fact that these LLMs have been developed by labs has little to do with who's using them, which extends to researchers, educators, non-profits, and even Wikimedians. However, the commercial labs obviously don't care that much about how factual their models are in an academic sense, and have done little work in this avenue (otherwise, we would have seen way fewer hallucinations). We're volunteering our time for this because we feel like it's a critical under-researched area, and you're free to think it's worth or not worth your own time. Because this is such a small study, the impact won't be astronomical, but we believe it can be very significant for Wikipedia contributors, because our results will show how effective MiniCheck can be as a fact-checker. This will be evidence of whether or not it's usable for the Source Verification tool, rather than the simple fact that it exists. Did anyone else systematically test whether the fact-checking framework of that tool is consistent and usable?
TBC - there are lots of good points here, I'll come back for the rest as soon as I have the chance! Answered --Abbad (talk) 21:24, 26 November 2025 (UTC).
@عباد ديرانية ReDeEP looks cool but if I were you I would completely ignore Mixtral and stick to LLaMA. I do not think ReDeEP will be able to fix the problem that the model will mix training data and Wikipedia content.
Please correct me if I am wrong, but if I am reading between the lines I think we mostly agree on the facts (although I would recommend using a different tactic).
While LLM factuality is interesting (and annoyingly under researched by the guys with the big bucks), most Wikipedians are always gonna be more interested in using MiniCheck to determine if a claim in a Wikipedia article is supported by the source (or not).
We Wikipedians are a very simple people of humble peasant farmers like myself who just want results; not an academic study.
So while you do your thing, can you please allow others to use MiniCheck as well? You already know exactly how I want to use it.
Adding "MiniCheck was correct" and "MiniCheck was wrong" buttons is not very complicated.
If we can show the masses practical results, it is much much easier to get them to volunteer/contribute/whatever.
That way we have both academic validation and real-world testing, which benefits both.
I do not agree that our results will show how effective MiniCheck can be as a fact-checker because that is not what is being tested (and you wouldn't need such a complex pipeline to test just that).
Testing whether a complex AI pipeline produces fewer (or filters out more) hallucinations than the base model is interesting, but not relevant to Wikipedia.
I think the study needs to benefit Wikipedia, not just use it as a testbed, before you should be able to get WMF money or Wikipedia volunteers. And I don't really see it doing that at the moment. Polygnotus (talk) 07:15, 27 November 2025 (UTC)
~500 responses total need evaluation.
At least 300 of those need ≥3 evaluators.
Let's say the remaining 200 get one evaluation each.
So at least 1100 evaluation tasks.
I don't agree that a simple true/false evaluation will lead to meaningful results (point 13 above), but let's assume it is fine.
Each participant will be asked to fact-check about 50 samples (according to your comment above), so you need about 22 people.
Your comment talks about a commitment of 10 - 20 hours in mid December, so 220-440 hours of volunteer time? Assuming an 8-hour work day, we are talking 1.25-2.5 workdays per person, and between 27.5 and 55 eight-hour days of work sequentially... I am not sure why evaluating 50 samples should take 10-20 hours (12-24 minutes per evaluation for a simple yes/no on a short bit of text??).
The budget talks about 10 evaluators doing 100 responses each in 5 hours, so 3 minutes per evaluation. That is 100 evaluations short and doubles the workload per person. So if the budget allows for 5 hours per evaluator, why ask for 10-20 hours? The budget is $1,000 for 10 people doing 100 responses each, so that is $1 per evaluation.
The form says The rate will be 100 USD / 30 fact-checked samples, with payment prorated according to the completed samples. but there is only $1,000 allocated in the budget, so that does not compute. You can only buy 300 evaluations for that money, but you need 1100 evaluations. That is $3.33 per evaluation.
Did an LLM come up with these numbers? Is the plan to pay people 33% of what was promised to them, or to run out of money after 300 evaluations? What will happen if someone did 50 samples and wants the $166.67 that was promised to them?
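Spelling out the arithmetic (a quick sketch using only the figures quoted above):

  # Figures quoted above: 300 responses needing 3 evaluators each,
  # 200 needing one, a $1,000 evaluator budget, and $100 per 30 samples.
  evaluations_needed = 300 * 3 + 200       # 1100 evaluation tasks
  rate_per_sample = 100 / 30               # about $3.33, per the form
  affordable = 1000 / rate_per_sample      # samples the budget can buy
  print(evaluations_needed, round(rate_per_sample, 2), round(affordable))
  # -> 1100 3.33 300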
I find it extremely difficult to outsource items on my to-do list, both irl and on Wikipedia. Finding 22 Wikipedians who are willing to spend a significant amount of time doing a very boring task that does not benefit Wikipedia is gonna be real hard. I don't think a symbolic stipend is gonna do much to motivate em.
In summary, the study as proposed won't work. But installing MiniCheck somewhere and giving me an API endpoint and credentials is a good idea. Polygnotus (talk) 08:09, 27 November 2025 (UTC)
@Polygnotus The discrepancies in numbers are because we decided to increase the pay for evaluators as much as possible, at the cost of minimizing any share we take (practically none at this point). As you rightfully say, we realized that this is a difficult and boring task, and therefore thought it appropriate to increase the amount to at least 3 USD per sample, thus a total of roughly $100 / 30 samples. Indeed, this will reduce the total number of samples we can analyze, but that's better than unappreciated labor. We will increase the evaluator share to $1,600, and will thus be able to fund about 500 examples. I admit that the numbers got a bit jumbled (my fault; if LLMs were used I may have gotten them more in line!).
I find it a bit confusing that you agree this is a symbolic stipend and barely enough motivation, but also disagree that this hard evaluation will take 12 - 24 minutes. Anyhow, the hourly ranges are very rough, and I stretched them to be extra safe.
Re: MiniCheck, I'm more than happy to collaborate if you want. MiniCheck is available through HuggingFace, and we already have the subscription (a negligible 9 USD / month). It's pretty easy to grant access; if all you need is the API, I'll be in touch. The hard part is actually evaluating the results, and methodically checking if they work as well as we'd like them to.
We're already committed to a chatbot experiment for this round of funding, so we do need to proceed with our current methodology in principle. I'm quite happy, though, to work together on a study dedicated to MiniCheck as a standalone (as I transparently mentioned, it is what I'm personally interested in as well!). If I manage to squeeze in any other funding, I'm also happy to make that a proper study, if it's of interest to you --Abbad (talk) 22:05, 27 November 2025 (UTC).
@عباد ديرانية Yeah it looks like the plan evolved over time, like all good plans do. The good news is that Wikipedians are usually pretty good with intrinsic motivation.
disagree that this hard evaluation will take 12 - 24 minutes. According to the budget it will take only 3 minutes. And the comment near the top of this page says paragraph-long response is supported by the provided sources (each paragraph will be supported by up to 3 citations, and the text of each citation is up to a few paragraphs long). and you only want a true/false. It seems very, very unlikely that it would take me anywhere near 24 minutes on average to read 3x a few paragraphs and decide whether they support a paragraph-long LLM response. 3 minutes on average sounds more realistic, although it may be too short. I think the number will be somewhere in between. I would probably ask Claude to find the relevant text in those sources, which would speed up the human part of the equation.
it is what I'm personally interested in as well!) Exactly, so you understand why I am far more excited about playing around with MiniCheck. One of my, probably many, flaws is that I am unfit for academia. Although I am very curious if and how well the ReDeEP approach actually works (or you know, SEReDeEP, if we wanna stay up to date). Polygnotus (talk) 23:18, 27 November 2025 (UTC)
This WikiProject is just getting started (8 members already) and this page doesn't get that many pageviews. Our conversation may be confusing to potential volunteers so I unhatted (incorrect terminology but whatever) the doublepost. You know where to find me for the MiniCheck stuff. Good luck! Polygnotus (talk) 01:41, 28 November 2025 (UTC)
This study appears to have the goal of encouraging the use of LLMs, based on 'fact-checking' using Wikipedia as a source. Given that Wikipedia makes it entirely clear that it does not consider itself a reliable source, the study is clearly ill-thought-out, or at best engaging in wishful thinking. Furthermore, any encouragement of this misleading LLM use can only make things worse for Wikipedia itself, as it faces a deluge of LLM-generated garbage, produced by a technology which routinely hallucinates (as has been demonstrated to be mathematically inherent in such software), engages in synthesis contrary to Wikipedia policy, and mangles source citations to the extent that, even if they originate from something genuine (and meeting Wikipedia sourcing policy, which LLM citations routinely don't), the amount of effort required to find the actual source is totally disproportionate to their utility. I would advise anyone contemplating engaging with this study to question whether it is in the interest of Wikipedia's contributors, and perhaps more importantly its readers, to do so. AndyTheGrump (talk) Preceding undated comment added 03:57, November 26, 2025 (UTC)

Checking offline sources

Hi @Polygnotus, I've tried your AI Source Verification tool and it works really well for online sources. Of course, in many content areas the majority of sources are paper books, so it'd be nice if the tool supported offline sources too in some way. Have you planned something to make this possible? The simplest approach would be to allow the user to paste the text (I actually built a toy standalone app using this approach) or upload a source. Any other ideas on how it can be tackled? My assumption is that many editors would be able to access offline sources, whether using the Wikipedia Library, Google Books or some other digital library. Alaexis¿question? 11:20, 29 November 2025 (UTC)

@Alaexis Hiya! That is a good idea, and since you forgot to copyright it I will immediately steal it.
It might also partially solve the paywall problem.
I am currently playing around with User:Polygnotus/CitationVerification.
Where can we find this standalone app of yours? Polygnotus (talk) 11:40, 29 November 2025 (UTC)
No worries at all, happy to suggest improvements :) My app is here - I've just added BYOK, hopefully it hasn't caused any issues. It's very much a beta version.
I think that an addon works much better for Wikipedia editors. I had in mind a different target audience - readers rather than editors - hence a standalone app.
One thing I couldn't find a good solution for is multiple references supporting a single claim. As far as I can see your tool also looks at each reference individually which will produce false positives if a source supports only a part of the claim. Alaexis¿question? 12:26, 29 November 2025 (UTC)
@Alaexis Ah that's really cool! I've just added BYOK, hopefully it hasn't caused any issues. It works fine over here. Perhaps you can add it to Wikipedia:WikiProject AI Tools?
It must be possible to deal with 2 refs together supporting 1 claim, but I haven't looked at it yet. Thanks for sharing! Polygnotus (talk) 12:40, 29 November 2025 (UTC)

Repository of prompts?

The thread immediately above led me to inspect User:Polygnotus/Scripts/AI Source Verification.js to see what prompts they were giving to the APIs. Separately, I've been finessing my instructions for a "Wikipedia research assistant" for initial sanity checks, hosted by Kagi. Maybe this project could have a page for sharing or workshopping examples like this. ClaudineChionh (she/her · talk · email · global) 13:23, 29 November 2025 (UTC)

That's a good idea. The prompt I used for my citation checker app can be found here. Alaexis¿question? 14:02, 29 November 2025 (UTC)
Personally I find it helpful to provide examples of request-response pairs, though I'm not sure if it would work for your use case. Alaexis¿question? 14:03, 29 November 2025 (UTC)
Oh yes, my file is more like a default set of instructions as a starting point. I can dig into my chat history for more specific examples of prompts and responses. ClaudineChionh (she/her · talk · email · global) 00:10, 30 November 2025 (UTC)
I've been thinking about having an API platform that could cache AI outputs used to review specific revisions (in case multiple editors send the same AI query, e.g. when patrolling recent changes) and simplify the development workflow for new tools, and that could be a helpful use for it!
Beyond that, we don't have a guide yet, and "prompting tricks" would definitely be an essential part of it: feel free to start it! Chaotic Enby (talk · contribs) 20:35, 5 December 2025 (UTC)
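A minimal sketch of the caching idea above (the call_llm helper and key layout are hypothetical, and a real platform would use a shared store rather than an in-process dict):

  import hashlib

  cache: dict[str, str] = {}

  def cache_key(rev_id: int, model: str, prompt: str) -> str:
      # Same revision + model + prompt => same cached answer.
      digest = hashlib.sha256(prompt.encode()).hexdigest()[:16]
      return f"{rev_id}:{model}:{digest}"

  def review_revision(rev_id: int, model: str, prompt: str) -> str:
      key = cache_key(rev_id, model, prompt)
      if key not in cache:
          cache[key] = call_llm(model, prompt)  # hypothetical LLM call
      return cache[key]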

Generating the infobox from the article text?

Has there been any exploration into using AI tools to generate the infobox from the article's existing text? Whatisbetter (talk) 11:24, 2 December 2025 (UTC)

@Whatisbetter, nothing I'm aware of but it should be pretty straightforward. What kind of articles do you have in mind? Alaexis¿question? 22:35, 2 December 2025 (UTC)
There are absolutely no circumstances where it would be appropriate to use AI to "generate the infobox from the article's existing text". AI (or at least LLMs, which are presumably what is being referred to) cannot be trusted. They synthesize. They 'cite' things that don't remotely support the text they're cited for. They routinely hallucinate. Wikipedia content (including that in infoboxes) needs to be written by contributors who can ensure that it is correct per a valid source, and who are prepared to take responsibility for doing so. If you want LLM-generated content, look elsewhere. AndyTheGrump (talk) 22:42, 2 December 2025 (UTC)
@AndyTheGrump is correct: current AI technology is unable to summarize a text, or to find the interesting bits. Crafting infoboxes will remain a human task for the foreseeable future. Polygnotus (talk) 04:05, 3 December 2025 (UTC)
@AndyTheGrump, @Polygnotus That's just simply not true. A simple prompt of "1. research and understand how wikipedia infoboxes work. 2. read the article "1234" and tell me if there's any information already present in the article that could be used in the infobox. do not pull from any other resources." And then the HUMAN editor reviews the results.
It's really that simple. --skarz (talk) 13:45, 6 March 2026 (UTC)
@Skarz do not pull from any other resources. Commercially available AI models cannot do that, currently. They don't differentiate between user input and the dataset they are trained on. Polygnotus (talk) 14:17, 6 March 2026 (UTC)
@Polygnotus That's funny, ChatGPT did exactly what I asked it to (accurately) right in this conversation.
Where are you even getting your information from???
--skarz (talk) 14:20, 6 March 2026 (UTC)
@Skarz God told me in a dream. jk, check out my userspace. I am not exactly opposed to LLMs, but I think it is important to understand their limitations. Polygnotus (talk) 14:23, 6 March 2026 (UTC)
@AndyTheGrump, you're right about the hallucinations and other issues. I do not suggest or condone any violations of the policy. However, I think that it's possible to use LLMs to generate a draft which would have to be checked by a human editor.
Also, a few approaches have been suggested to control hallucinations. Here's one that might work, though I haven't tried it myself: AI Driven Citation: Controlling Hallucinations With Concrete Sources. It's suggested by Gavin Mendel-Gleason, who is working with Peter Turchin. Alaexis¿question? 06:54, 3 December 2025 (UTC)
@Alaexis Have you tried MiniCheck? Polygnotus (talk) 06:57, 3 December 2025 (UTC)
@Polygnotus, not yet, how do I run it? Alaexis¿question? 15:14, 3 December 2025 (UTC)
@Alaexis
If you want to run MiniCheck on your own computer, then the answer is this, but this is a binary yes/no.
https://www.bespokelabs.ai/bespoke-minicheck gives out free API keys.
The answer to the question depends very much on how nerdy you are.
How familiar are you with Python? Do you want to run it on your own pc?
It is usually easier to just use their free API. I don't know what Operating System you use (*nix, MacOS, Windows) but usually if you Google "run python script" with the name of your operating system it should provide instructions.
The code on that page is pretty outdated btw. Polygnotus (talk) 15:23, 3 December 2025 (UTC)
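For reference, a local run looks roughly like this (a sketch assuming the open-source minicheck package published by the model's authors; the document and claim are made-up examples):

  # pip install minicheck -- package from the MiniCheck authors.
  from minicheck.minicheck import MiniCheck

  doc = "The Eiffel Tower was completed in 1889 for the World's Fair."
  claim = "The Eiffel Tower opened in 1889."

  scorer = MiniCheck(model_name='flan-t5-large', cache_dir='./ckpts')
  labels, probs, _, _ = scorer.score(docs=[doc], claims=[claim])
  print(labels[0], round(probs[0], 3))  # 1 = supported, 0 = not supported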
Thanks @Polygnotus, I reviewed the docs and it seems pretty straightforward, though I'm not sure I'm going to use it right now - they claim that it's only marginally better than Claude Sonnet 3.5. Alaexis¿question? 10:18, 5 December 2025 (UTC)
@Alaexis In my extremely limited testing I can't really tell if it's better or worse than Claude, but Claude is clearly better for our purposes because it can (I am intentionally using a word incorrectly here) explain its thinking. Polygnotus (talk) 10:31, 5 December 2025 (UTC)
I have found that populating an infobox from Wikidata works OK (when there is data available). Major manual editing and checking will be required, but it beats starting from scratch and actually reading the instructions for the Wikidata format in infoboxes. I never tried to use the text as a starting point, but expect it to work too. The modern high-end engines (I mostly use Google Gemini 3) do a quite decent job in very unexpected areas; searching for information in a structured fashion is one of them. I haven't seen an outright hallucination for months. Ask a random question, you might get a random answer. Ask when the term Net load was coined, you will get a wrong answer, but it will point you to quite solid WP:RS, so a human could make the same mistake, too - I wouldn't count this type of error as a hallucination. Викидим (talk) 08:25, 3 December 2025 (UTC)
I take it you are aware that since WikiData isn't WP:RS, you can only use it indirectly, where it actually cites a valid source? Anyway, your 'Major manual editing and checking will be required' comment points to what is likely to be a major issue with AI-assisted infobox generation, given how Wikipedia currently operates in practice - far too many people will simply assume that the AI has got it right, and not check it. AndyTheGrump (talk) 11:27, 3 December 2025 (UTC)
To me a Wikidata item is like an article in a foreign language - a source for translation that should be checked. AI helps to navigate through quite a complex set of infobox templates, each with its own parameter quirks. AI does not do a good job populating these fields, but it is an OK way to get to a starting point. Викидим (talk) 17:19, 3 December 2025 (UTC)
Example of using AI to fix text, which involved adding an infobox: before / after / changes / history. Manual checking and reworks were necessary - but I would never have attempted this repair without AI assistance (it would be too much hassle). Викидим (talk) 21:31, 4 December 2025 (UTC)

Discussion at Wikipedia:Village pump (idea lab) § Scope of AI tool use

 You are invited to join the discussion at Wikipedia:Village pump (idea lab) § Scope of AI tool use, which is within the scope of this WikiProject. Cf. the previous discussion about whether generating an infobox from an article text would be acceptable. Chaotic Enby (talk · contribs) 20:37, 5 December 2025 (UTC)

Invite template

@Chaotic Enby It would be useful to have an invite template that can be posted on user talkpages to invite people. Polygnotus (talk) 08:47, 5 January 2026 (UTC)

Good idea, I'll work on it! Chaotic Enby (talk · contribs) 09:06, 5 January 2026 (UTC)
@Chaotic Enby It may be a good idea to add User:Overandoutnerd/Scripts/articleSummary and invite Overandoutnerd using the template. Polygnotus (talk) 19:13, 4 February 2026 (UTC)
Sent them the invite, great catch! Chaotic Enby (talk · contribs) 20:26, 4 February 2026 (UTC)
Maybe we should just make some insource: links with intitle:.js that search the user namespace for relevant key words like Gemini and Claude et cetera https://en.wikipedia.org/w/index.php?search=insource%3A%22Gemini%22+intitle%3A%22.js%22&title=Special%3ASearch&profile=advanced&fulltext=1&ns2=1 Polygnotus (talk) 20:41, 4 February 2026 (UTC)
Great idea! Chaotic Enby (talk · contribs) 20:55, 4 February 2026 (UTC)

[Research] Preliminary analysis of AI-assisted translation workflows

Note: To keep the conversation organized, I have primarily posted this at the Village Pump. I encourage any questions or discussions to take place directly there.

I’m sharing the results of a recent study conducted by the Open Knowledge Association (OKA), supported by Wikimedia CH, on using Large Language Models (LLMs) for article translation. We analyzed 119 articles across 10 language pairs to see how AI output holds up against Mainspace standards.

Selected findings:

  • LLMs were found to be significantly better than traditional tools at retaining Wikicode and templates, simplifying the "wikification" process.
  • 26% of human edits fixed issues already present in the source article (e.g., dead links), showing that the process improves the original content too.
  • Human editors modified about 27% of the AI-generated text to reach publication quality.
  • We found a ~5.6% critical error rate (distortions or omissions). This confirms that "blind" AI publication is not suitable; human oversight is essential.
  • Claude and ChatGPT led in prose quality, while Gemini showed a risk of omitting text. Grok was the most responsive to structural formatting commands.

Acknowledging limitations: We consider these findings a "first look" rather than a definitive conclusion. The study has several limitations, including:

  • Subjectivity: Error categorization is inherently dependent on individual editor judgment.
  • Non-blind testing: Editors knew which models they were using, which likely influenced their prompting strategies.
  • Sample size: While we processed over 400,000 words, the data for specific model comparisons across all 10 language pairs is insufficient.

Our goal is to provide some data for the community as we collectively figure out the best way to handle these tools. The full report, including the error taxonomy and raw data logs, is available on Meta. 7804j (talk) 21:00, 20 January 2026 (UTC)

LLMs benchmarking

I've done some benchmarking which may be of interest to the builders of AI tools here. I wanted to know how well various LLMs can handle source verification: checking whether a given source supports the claim it's attached to (User:Alaexis/AI_Source_Verification). I tested a few open-source models hosted by PublicAI and Claude Sonnet 4.5 as a SOTA model.

No surprises - Claude was the best, but the difference in performance was not huge. I had 16 "not supported/partially supported" cases in my dataset, and different models found between 7 and 12 of them (still too few for statistically significant comparisons of specificity) while maintaining decent false positive rates (<15% for Claude, <20% for the next best model). The full results can be found here.
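For anyone re-running the numbers, the two headline figures reduce to simple counts; a sketch with hypothetical labels, not the actual dataset:

  # 1 = citation not/partially supported, 0 = supported (made-up data).
  y_true = [1] * 16 + [0] * 84
  y_pred = [1] * 12 + [0] * 4 + [1] * 10 + [0] * 74  # one model's flags

  found = sum(p for t, p in zip(y_true, y_pred) if t == 1)      # 12 of 16
  false_pos = sum(p for t, p in zip(y_true, y_pred) if t == 0)  # 10
  print(found, false_pos / y_true.count(0))  # detections, FPR ~0.12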

Some of the "not supported" cases are quite interesting (citation 32 - try to figure out for yourselves what's wrong with it). This exercise wasn't meant to detect the rate of inaccurate citations, but it does make you think about all those inaccuracies lurking out there. Alaexis¿question? 20:45, 27 January 2026 (UTC)

Notice: Planning help pages for AI workflows

We need about 600,000 archive URLs replaced; see Wikipedia:Archive.today guidance. One of the slow bits is checking whether another archiving website contains the relevant content (or whether, e.g., it just contains a mostly empty page or an error message). Is this something that some AI tools could help with?

See also Wikipedia talk:Archive.today guidance#New stand-alone tool to detect (and remove) Archive.today and related links. WhatamIdoing (talk) 06:01, 23 February 2026 (UTC)

Is this something that some AI tools could help with? Short version: No. Long version: Fuck no. But also, somehow, yes and maybe.
Note that Wikipedia:Archive.today_guidance#How_you_can_help is possibly a bad idea in the context that humans are messy; I would do as much as possible botwise and then use humans for the difficult stuff.
There is a dump of archived urls but it's like six months out of date.
And CitationVerification should of course be used for a lot more than just a project like this, but it could be a start.
See User:Polygnotus/CitationVerification and that one WMF guy on Phabricator who made something similar. T414816
But the homie @DVRTed: is cooking up something magic. Also pinging the homie @Alaexis: who is also doing work in this area. Polygnotus (talk) 06:56, 23 February 2026 (UTC)
A part that AI might be helpful with is assessing the content at the alternative archive. Someone could write a bot to fetch the content from the other archive site and post it to a GPT-5-like model. That would cost some money, but I bet the task could be done with an older, less-expensive model.
I think that assessment is a good role for AI because the consequences for being wrong are pretty low. I’ll read the project page. Dw31415 (talk) 13:57, 23 February 2026 (UTC)
@Dw31415 Problem is, most people don't use |quote=. And 600,000 times a large number of tokens is a large number, allegedly. I bet the task could be done with an older, less-expensive model. Those generally suck at "which string fully supports this claim, if any?", and if you don't know for sure that you found a string that fully supports the claim, you can't tell for sure whether you've successfully replaced the archive if no |quote= is present (because there are quite a few cases where an archive was added a long time after a claim was added to Wikipedia, in which time the source could've changed). The other approach is taking whatever they've archived (except the offending code) and putting it elsewhere, but that also has various problems (like provenance, scraping services not accepting stuff you scraped for them, scraping probably being against a ToS somewhere, and perhaps even being unethical, yada yada). Polygnotus (talk) 14:38, 23 February 2026 (UTC)
@Dw31415, @Polygnotus I think that the best way to check whether the task could be done with an older, less-expensive model is to try. PublicAI hosts a few open-source models, and they allocated enough tokens to run an experiment (e.g., would the AI make the same decisions as the editors who replaced archive.today links, whether using Netha's tool or not). I compared the performance of open-source and SOTA models for citation verification, which is a similar problem, and the open-source models weren't that far behind.
@WhatamIdoing, I'm not entirely sure the AI is necessary here. Netha's tool somehow manages to filter out invalid snapshots (though take this with a grain of salt, my sample size is N=1). It seems like simpler heuristics would suffice. Alaexis¿question? 20:06, 23 February 2026 (UTC)
I'm not sure that it's necessary, either, but having used Netha's tool (twice), the process looks like:
  1. Open the tool and search for something (e.g., an article)
  2. Open the Wikipedia article in a new tab to figure out what the source is supposed to be supporting
  3. Open the suggested archive.org link to figure out whether the link has relevant content
and I was thinking: what if you could see all of that on one screen? WhatamIdoing (talk) 21:18, 23 February 2026 (UTC)
To be honest, that's not how I understood the process. I think it's enough to verify that the archive.org snapshot isn't empty/an error and is identical to the archive.today snapshot. Are you sure that we need to re-check citations?
Source Verifier can help check citation accuracy, but it's a user script for manual verification of individual citations. If someone is going to build a tool where you could see all of that on one screen, I'd be happy to share my experience, code and contacts. Alaexis¿question? 22:11, 23 February 2026 (UTC)
I'm not interested in going to the archive.today websites, so I'm not checking against those, but I think that would work for people who are willing to do that. OTOH, when I've set out to fix a few links, I not infrequently find that the original source was weak, or the article significantly out of date. For example, Netha's tool flagged a source in Tracheal intubation. I fixed the archiving problem very easily thanks to her tool, but the source itself is a random website from 1998, which is nowhere near the MEDRS ideal. WhatamIdoing (talk) 22:29, 23 February 2026 (UTC)
If all we need to do is check whether the main contents of two web pages match, we don't need an LLM for that - you can extract body content with Beautiful Soup (HTML parser) and run a file comparison algorithm. I put a few more details about this concept here: Wikipedia talk:Archive.today guidance#Bot for checking for identical text. Dreamyshade (talk) 01:55, 28 February 2026 (UTC)
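A rough sketch of that concept (assuming the requests and beautifulsoup4 packages; both URLs are placeholders):

  import difflib
  import requests
  from bs4 import BeautifulSoup

  def visible_text(url: str) -> str:
      # Fetch a page and keep only its human-visible body text.
      html = requests.get(url, timeout=30).text
      soup = BeautifulSoup(html, "html.parser")
      for tag in soup(["script", "style", "nav", "header", "footer"]):
          tag.decompose()
      return " ".join(soup.get_text(" ").split())

  a = visible_text("https://web.archive.org/web/2019/http://example.com/x")
  b = visible_text("https://other-archive.example/x")  # placeholder
  print(difflib.SequenceMatcher(None, a, b).ratio())  # ~1.0 = same content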
Doesn't really make sense to mention this here since it has nothing to do with AI (and I don't really plan on adding that), but I was pinged here, so this is what I'm currently working on: User:DVRTed/ArchiveBuster. — DVRTed (Talk) 21:33, 24 February 2026 (UTC)
I'd been planning something similar, so it's good to see this! Would you say your script is ready for others to use? ClaudineChionh (she/her · talk · email · global) 22:19, 24 February 2026 (UTC)
I wouldn't consider it "stable" by my own standard just yet. I'll make a sandbox to cover some template params edge-cases I can think of and finalize the script documentation (oh, the horror!). Bug reports would be very helpful, if there are any. — DVRTed (Talk) 22:31, 24 February 2026 (UTC)
I just replaced 42 archive.today links; this took me a bit more than 2 hours. Even if it was just me who'd replace the same amount of links daily with the same delay, we'd be done in about 3 years, 9 months and 14 days.
Don't try to shove AI into everything. sapphaline (talk) 14:48, 24 February 2026 (UTC)
@Sapphaline How much do you charge per hour? How many hours per day and days per week do you work? Polygnotus (talk) 14:50, 24 February 2026 (UTC)
"Even if it was just me who'd replace the same amount of links daily with the same delay", i.e. if I would replace 42 links every day with the same 2 hour delay. sapphaline (talk) 14:52, 24 February 2026 (UTC)
@Sapphaline 3 years, 9 months, 14 days ≈ 1,384 days = 33,216 hours. At 42 links per ~2.25 hours, that's roughly 42 × (33,216 / 2.25) ≈ 619,000 links
So now we just need to know how much you charge. Our budget is ~250 million USD. Polygnotus (talk) 14:54, 24 February 2026 (UTC)
@Sapphaline, how many of those removals 'stuck'? I've seen one report of IABot replacing them. WhatamIdoing (talk) 18:06, 24 February 2026 (UTC)
What do you mean by "stuck"? sapphaline (talk) 18:27, 24 February 2026 (UTC)
Remain unreverted. WhatamIdoing (talk) 18:32, 24 February 2026 (UTC)
Remaining archive.today links? None. sapphaline (talk) 18:33, 24 February 2026 (UTC)
Nice work @DVRTed! Any sense how long it would take 100 volunteers to use your tool to replace the 600k links?
I think a Bot could still be very useful, potentially as:
A) A complement to ArchiveBuster (and/or the GitHub one). Maybe something like Wikipedia:FRS, a bot that assigns a list of articles to volunteers. It posts a list of links to the volunteers' talk pages.
B) A bot that uses an AI pipeline to evaluate the alternatives and just replace the links.
C) A toolforge hosted Interstitial webpage. A bot that replaces the archive today links with an interstitial page (IP) link. The IP warns the user about archive today. Gives a link on how to volunteer to fix it.
I’m not free to code until Sunday at the earliest. @Poly, your thoughts? Dw31415 (talk) 08:09, 25 February 2026 (UTC)
@Dw31415 Since having a large list of articles and assigning parts of it to volunteers who should then be able to mark em as completed is a recurring need we should probably build something for it.
I was talking with DVRTed about how it would be cool to query the archive.org api for when a particular URL was scraped so that you can then send the user the most likely option. Polygnotus (talk) 08:19, 25 February 2026 (UTC)
Any ideas on marking it done? Is that really needed? As an alternative, “VolunteerAssignmentBot” could just say, “these have been assigned to you for 5 days” and then assign them to someone else if the pages still meet the search criteria. The volunteer could reply “more please” and the bot could assign more. Just brainstorming. Dw31415 (talk) 08:51, 25 February 2026 (UTC)
@Dw31415 Well I would like it to be re-usable for other tasks. And with some tasks you can actually check if someone did the work using software (are all links to archive.is|today gone), but with others you cannot. For example with typofixing it is possible that they did check but that it was a false positive. Polygnotus (talk) 08:54, 25 February 2026 (UTC)
I guess the bot could post a table to a user subpage. The user could edit the table. Or the bot could create a topic on their talk page with one reply per page. The user could reply to each message with “done” Dw31415 (talk) 09:03, 25 February 2026 (UTC)
You'll be able to use Category:CS1 maint: deprecated archival service soon. WhatamIdoing (talk) 21:41, 25 February 2026 (UTC)
I was also thinking of presenting a link to the Wayback Machine with the best timestamp (close to archive-date if possible, otherwise access-date if that's available). Once the date is parsed it should be simple to send a request to the Wayback API with the best timestamp. ClaudineChionh (she/her · talk · email · global) 08:49, 25 February 2026 (UTC)
@ClaudineChionh User:Polygnotus/tmp/Archive.js Polygnotus (talk) 07:28, 26 February 2026 (UTC)
Worth mentioning, that is exactly how the IABot works. You can examine retrieveArchive() in https://github.com/internetarchive/internetarchivebot/blob/master/app/src/Core/APII.php and see that:
Pass 1: Find the Wayback snapshot closest BEFORE the citation's access date
Pass 2: If nothing found, find the closest AFTER the access date
This means that if a citation says |access-date=2019-03-15, IABot asks the Wayback API for the snapshot nearest to March 15, 2019, theoretically preserving the temporal context of the citation.
--skarz (talk) 13:43, 27 February 2026 (UTC)
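For scripts, the same closest-snapshot lookup is exposed by the public Wayback availability API; a minimal sketch (the example URL and date are placeholders):

  import requests

  def closest_snapshot(url: str, timestamp: str) -> str | None:
      # timestamp as YYYYMMDD, e.g. taken from |access-date=2019-03-15.
      resp = requests.get(
          "https://archive.org/wayback/available",
          params={"url": url, "timestamp": timestamp},
          timeout=30,
      )
      closest = resp.json().get("archived_snapshots", {}).get("closest")
      return closest["url"] if closest and closest.get("available") else None

  print(closest_snapshot("example.com", "20190315"))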
Need a proxy server on toolforge to access the archive.org API. Anybody know the correct venue to make a request for this? (cc @Novem Linguae?) — DVRTed (Talk) 11:36, 26 February 2026 (UTC)
@DVRTed Well my script above works fine. It uses my VPS. I can put the proxy on Toolforge if you insist, but it makes no technical difference. Polygnotus (talk) 11:38, 26 February 2026 (UTC)
Yea, that'd be great; primarily because it'd be an open source server. I guess I'll insist. ;) — DVRTed (Talk) 11:42, 26 February 2026 (UTC)
@DVRTed https://en.wikipedia.org/wiki/User:Polygnotus/Data/ArchiveProxy Polygnotus (talk) 11:44, 26 February 2026 (UTC)
Also made a cloudflare worker thing:
Polygnotus (talk) 20:25, 27 February 2026 (UTC)
@WhatamIdoing Is this something worth exploring still? Do you have any examples? Is validating archive dot org links enough to be helpful? Dw31415 (talk) 04:05, 4 March 2026 (UTC)
You may be interested in another tool I am working on - an Electron app where you can enter the name of an article and it allows you to compare cited archived pages on archive.today and its equivalent on the Wayback Machine side-by-side visually. I posted more details and a screenshot at https://en.wikipedia.org/wiki/Wikipedia_talk:Archive.today_guidance#Archival_Comparison_Tool
Let me know what you think! --skarz (talk) 15:25, 4 March 2026 (UTC)

How I use Claude Code for RC patrol

Just wanted to share something I've been experimenting with: how to use Claude Code to patrol recent changes. The short version is that I have a python script that grabs recent changes with various filters applied and stores the diffs as .json files, then Claude analyzes them into priority order for review. Claude does not make any changes; it just contextually analyzes each change. I used up most of my session limit writing and modifying the code; I don't think this is too token-intensive, because most of your time would theoretically be spent actually going and verifying/correcting the suspicious edits.
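The fetching step is plain MediaWiki API work; a rough sketch of the idea (simplified, not my actual script):

  import json
  import os
  import requests

  API = "https://en.wikipedia.org/w/api.php"
  os.makedirs("diffs", exist_ok=True)

  # Grab recent non-bot edits with their revision ids.
  rc = requests.get(API, params={
      "action": "query", "list": "recentchanges", "rctype": "edit",
      "rcshow": "!bot", "rcprop": "title|ids|comment|user",
      "rclimit": 50, "format": "json",
  }, timeout=30).json()["query"]["recentchanges"]

  # Store each diff as a .json file for the model to read later.
  for change in rc:
      diff = requests.get(API, params={
          "action": "compare", "fromrev": change["old_revid"],
          "torev": change["revid"], "format": "json",
      }, timeout=30).json()
      with open(f"diffs/{change['revid']}.json", "w") as f:
          json.dump({"meta": change, "diff": diff}, f)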

I actually had Claude turn the action into its own plugin so I can just input /analyze-diffs and it will run automatically. Here's an example of the output:

Wikipedia Vandalism Review — 2026-03-05

  47 diffs analyzed. 3 confirmed, 11 suspicious, 33 OK.

  ---
  [FLAG] CONFIRMED / REVERT-WORTHY

  ---
  007 — Novoozerne | Editor: ~2026-14297-92 | No summary
  - Changed the settlement name from "Novoozerne" to "Маратий-сити" (Maratiy-siti) in the infobox name and native_name fields.
  - Clear vandalism. A Ukrainian settlement's name was replaced with what appears to be a nonsense/invented name. Temp user, zero explanation.

  ---
  027 — Peter Pan (2003 film) | Editor: ~2026-29295-2 | Summary: "Zack Gauthier"
  - Appended ", Mr. & Mrs. Darling's grandchildren." to every single Lost Boy actor entry (Slightly, Tootles, Curly, Nibs, the Twins).
  - Factually wrong: the Lost Boys are feral children in Neverland, not the Darlings' grandchildren. The Darling children are Wendy, John, and Michael. Summary is meaningless. Temp user.

  ---
  031 — Fives (sport) | Editor: ~2026-14169-17 | No summary
  - Introduced deliberate grammatical corruption into the hatnote: "the French town" → "the Frenchs town", "Fives, Nord" → "Fives, Nords", "the Irish sportsman" → "the Irishs sportsmans(GAA Handball)".
  - Textbook vandalism. Pluralized proper nouns incorrectly, inserted junk text. Temp user, no summary.

  ---
  [FLAG] SUSPICIOUS — Needs Human Review

  ---
  001 — Christianity and other religions | Editor: Augy31 | Summary: "References."
  - Appended a bare external URL from Crisis Magazine (a traditionalist Catholic publication) directly to paragraph text about Buddhist-Christian relations in Sri Lanka.
  - Not cited in proper <ref> format — just tacked onto the end of a sentence. The source appears ideologically motivated (article title: "Buddha's Fist: Persecuting Christians in Sri Lanka"), potentially a POV citation in an article that already notes Christian extremism in South Korea. Registered user, not clear vandalism, but questionable sourcing practice.

  ---
  008 — Al Sherrod Lambert | Editor: ~2026-14244-01 | Summary: section header only
  - Changed "backing vocals" to "vocals" for Kelly Rowland and Beyoncé.
  - Changed wikilink for "Billboard Hot 100" to point to [[Billboard Hot Gospel charts|Billboard Hot 100]] — linking the Hot 100 text to the Gospel charts article. This is a factual error embedded in a wikilink. Temp user.

--skarz (talk) 18:49, 5 March 2026 (UTC)

@Skarz Hiya, sounds interesting, do you have a github/lab somewhere I can look at? Thanks, Polygnotus (talk) 01:05, 6 March 2026 (UTC)
It's just one file: https://gist.github.com/comaeclipse/81917a63c8ae9d7585a2e277d71956d0 If you provide that to any agentic LLM with the explicit instruction to run python diffchecker.py --rc --recent 50 and then analyze the diffs itself (not writing another script to use regex or something), it should perform well. For instance, with Claude Code I instruct it to use its Read and Glob tools to analyze each .json file. skarz (talk) 14:39, 6 March 2026 (UTC)
@skarz Heya, I've actually been working on something similar. Some things I would suggest: don't tell it that temp users are more likely to vandalize (it will just flag every temp user, and I assume you are doing this based on the "temp user" responses), and also make sure it knows to AGF; like, you really need to drill the AGF part into its skull. LuniZunie(talk) 03:00, 6 March 2026 (UTC)
Thanks for the advice! skarz (talk) 14:39, 6 March 2026 (UTC)

AI scripts not working anymore following CSP changes

There was a security incident on March 5, 2026. Following the incident, user scripts were first disabled but are now accessible again. However, it seems that the Content Security Policy has changed; see Wikipedia_talk:User_scripts#CSP_restrictions and Wikipedia:Village_pump_(technical)#Today's_outage_—_user_scripts_are_disabled. As a result, it seems that it's no longer possible to communicate with various LLM providers. For example, trying the js code await fetch("https://api.openai.com/v1/chat/completions") now results in the error message "Refused to connect because it violates the document's Content Security Policy."

For my scripts, this affects WikiChatbot, SpellGrammarSuggestions, and SpellGrammarSuggestionsList. I would be curious to hear whether there are any feasible workarounds and whether anyone knows about planned changes to the new CSP rules to address this problem. @Alaexis and Polygnotus: I would be interested in your thoughts. Phlsph7 (talk) 10:39, 8 March 2026 (UTC)

@Phlsph7 Some thoughts:
  • It was claimed that they were testing how many scripts would be affected by the new CSP. But importing a large number of random scripts isn't how you do that. You just search for scripts that contain the string 'http' (and then perhaps filter out jsdelivr.net etc.).
  • They needed to take decisive action for the optics, but this doesn't really protect us.
  • An attacker would always prefer local scripts and not having to deal with CSP. This wasn't an attacker but just some script someone had laying around.
  • Therefore CSP could only be useful if it prevented data exfiltration, but it does not.
  • The WMF says they now whitelist stuff, but when I asked to whitelist my VPS the response was: we're not going to allowlist IP addresses, for instance, which change hands even more fluidly than domain names, so they are making up the rules as they go along. They do not yet actually have a procedure to whitelist stuff.
  • Me buying a domain for 5 bucks a year does not enhance safety. Nor does me throwing the same code on a Cloudflare Worker.
  • We are not allowed to put proxies on Toolforge, which makes sense (wikitech:Wikitech:Cloud_Services_Terms_of_use#4.5_Using_WMCS_as_a_network_proxy). But we need to be able to contact external services. And there are proxies on Toolforge.
  • Hosting tools on Toolforge is a huge barrier to entry for most script developers. Instead of writing a bit of JS you need to jump through many bureaucratic hoops and know many things unrelated to JS development.
  • Hosting tools on Toolforge may not even be possible in some cases, e.g. with code you don't want to open-source or data you don't necessarily "own". For example, I extracted all possible words from a Hunspell dictionary, but I probably don't own that data in any meaningful sense.
  • I think this was a knee-jerk reaction to a silly mistake that does not help protect us.
  • I may be able to work around it, but many others may not. And lots of people won't even bother. I think this is a huge blow to script development.
  • I was told that what matters is not if you mess up once in a while, because we all do, but how you react to it when you do. I don't think this was a good way to deal with it.
Polygnotus (talk) 10:52, 8 March 2026 (UTC)
As you probably saw, we have allowlisted the requested domains, along with others that users reported were broken. Our goal in the CSP we deployed (which was based on an analysis of CSP reports over the last few months) was to minimize breakage to existing scripts. If you see other breakage of existing scripts, please do open a phab task for us to review.
Also, a couple points I want to make in response:
  • Just searching for links in the text of user scripts is not sufficient. Code can load other code that loads other code, and script URLs can be obfuscated (even benevolently). That is why script execution was being captured -- though that, as we've tried to be clear about, absolutely should not have been done in the environment/account/method in which it was done. It was a bad mistake to make, and we are sorry for the disruption.
  • The user script involved in last week's incident did in fact ping third-party domains, two of which would have been stopped by the CSP we have now. But that was not the driving motivator, as we are not responding just to the behavior in that particular script - there are many other kinds of attacks possible in user scripts (including things involved in past real-world attacks on our users) that are made worse by being able to ping arbitrary third-party domains on the internet. CSP does not stop all malicious script behavior, but it is a significant and effective security control.
I recognize that new sudden restrictions are frustrating, especially when they cause some breakage and happen in the circumstances that they did. We too would certainly have preferred something more orderly and publicly explained before deploying it. But an enforcing CSP like the kind we have deployed (and are updating to avoid breakage) was planned, and is needed as a foundational step toward securing the user script system. EMill-WMF (talk) 01:21, 10 March 2026 (UTC)
@Phlsph7, I very much agree with u:Polygnotus's sentiment.
Practically speaking, adding a comment here would make it more likely that they whitelist api.openai.com sooner rather than later. Alaexis¿question? 11:59, 8 March 2026 (UTC)
Another solution is to use a publicai-hosted open-source model via my proxy, which has been whitelisted (https://github.com/alex-o-748/public-ai-proxy). Assuming that these models work for your use case and the number of requests is reasonable, this should work.
Long-term, the proper solution would be to have a "call LLM of your choice as a Wikipedia user" primitive available as a service somewhere where every developer can use it without jumping through hoops. Alaexis¿question? 12:05, 8 March 2026 (UTC)
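From a tool's side, such a primitive could look roughly like the sketch below, assuming the service speaks the OpenAI-compatible chat-completions protocol; the base URL, credential, and model name are placeholders, not any actual interface:

  from openai import OpenAI

  # Hypothetical endpoint: nothing below reflects a real deployment.
  client = OpenAI(
      base_url="https://llm-gateway.example.org/v1",  # placeholder service URL
      api_key="wikipedia-session-token",              # placeholder credential
  )

  resp = client.chat.completions.create(
      model="some-open-source-model",  # placeholder model name
      messages=[{"role": "user",
                 "content": "Does the quoted source support the following claim? ..."}],
  )
  print(resp.choices[0].message.content)

A script would then delegate LLM access to the service instead of carrying its own API key or fighting the CSP.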
It is often a mistake to focus too much on the actions of one person and forget to look at what institutional changes should be made to prevent stuff like this in the future.
In my world it makes sense to enforce the principle of least privilege. You want to do something that could be dangerous? You have to jump through a hoop. And you can't just do all your day-to-day activities with a privileged account. No sudo bash-ing.
I do not understand why there is no rate limiting on destructive API actions.
They are now talking about requiring you to re-enter your password before being able to edit a site's most important files. That is a good idea. Polygnotus (talk) 12:34, 8 March 2026 (UTC)
Thanks for the feedback and the helpful links; I left a comment at the phabricator discussion. I agree that this is frustrating. While I understand that there are security concerns, this change seems to be an overreaction. LLMs have a lot of potential for helping with Wikipedia, but the difficulties of having each user bring their own API key, and now of not even being able to access the API endpoints, are not particularly innovation-friendly. Phlsph7 (talk) 13:23, 8 March 2026 (UTC)

Just a short update: the CSP rules were adjusted again to allow communication with some of the main LLM providers (api.anthropic.com, api.openai.com, api.publicai.co), see here. Phlsph7 (talk) 10:07, 10 March 2026 (UTC)

AI aid for citations

I just stumbled upon a use case for AI tools: improving citations. I entered the search "isbn Maggiore, Michele (2008). Gravitational Waves: Volume 1: Theory and Experiments." and the AI summary started

  • The ISBN for Michele Maggiore's Gravitational Waves: Volume 1: Theory and Experiments (2007/2008, Oxford University Press) is 978-0198570745 (Hardcover). The book is also associated with the ISBN 978-0191717666 for the eBook version.

With the ISBN, a citation can be created (tooling for that is not 100% but very good). Of course I can do this manually, but integration with a tool like Wikipedia:ProveIt or even a bot would be great. Sorry if this is a duplicate suggestion. Johnjbarton (talk) 16:38, 11 March 2026 (UTC)

@Johnjbarton, improving citations is definitely a potential application. When you say "a citation can be created", do you mean a proper citation template? Adding it to a tool like ProveIt would be more effective than creating a standalone script, I think. It would also be nice to have a bot do it, flagging unclear cases for human review. Alaexis¿question? 22:24, 23 March 2026 (UTC)
Re: "a citation can be created". I added this to the ProveIt doc page:
  • ProveIt can also auto-fill a citation from a DOI, ISBN, or URL. (Under the covers it uses m:Web2Cit, a collaborative database of configurations to fine-tune Citoid, VisualEditor's citation generation service).
This is the "Generate" feature. I use it for >90% of the citations I add.
For the "Citation-improvement" use case, we have partial information which an AI tool could use to find a DOI or ISBN, then "Generate" would use conventional technology, providing one layer of check for the AI result. Human checks on the edit would confirm. After some experience we could loop over the citations in the page within the per/page tool, then with more experience work on bot application.
I could try to do some cases by hand so we could estimate the impact, if you think it has potential. Johnjbarton (talk) 23:58, 23 March 2026 (UTC)
Wouldn't it be possible to get the DOI from CrossRef? Then there are two possible flows (a sketch of the CrossRef lookup follows the lists).
Combined flow
  • AI parses a messy citation and creates a structured query
  • Crossref or similar request
  • Human or AI confirm or select the right source if more than one has been returned
  • Citation created using existing capability
AI-only flow
  • Request to Anthropic API with web search tool enabled (or similar) containing a messy citation
  • ISBN/DOI returned
  • Citation created using existing capability
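The CrossRef step of the combined flow is already easy to prototype. A minimal sketch, assuming CrossRef's public REST works endpoint; the query string stands in for whatever the AI parser produced from a messy citation:

  import json
  import urllib.parse
  import urllib.request

  def crossref_candidates(parsed_citation, rows=5):
      """Query CrossRef with the AI-parsed citation text and return
      candidate (DOI, title) pairs for human or AI confirmation."""
      url = ("https://api.crossref.org/works?rows=" + str(rows)
             + "&query.bibliographic=" + urllib.parse.quote(parsed_citation))
      with urllib.request.urlopen(url) as resp:
          items = json.load(resp)["message"]["items"]
      return [(w.get("DOI"), (w.get("title") or [""])[0]) for w in items]

  for doi, title in crossref_candidates("Darwin 1859 On the Origin of Species"):
      print(doi, "-", title)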
Last time I checked, the API with the web search tool was kinda flaky, so the combined flow would probably work better, in addition to being more transparent. Alaexis¿question? 12:14, 24 March 2026 (UTC)
Hi! Following the request for comment posted at Wikipedia talk:ProveIt#Populating ref fields automatically, I agree a good use of AI would be to get the identifier of a source (DOI/ISBN/etc) out of an "unstructured" reference (e.g. "Darwin, C. (1859) On the Origin of Species") and then let existing capability convert that identifier to a structured reference. I'll be happy to add that feature to Proveit eventually, and I just created a Phabricator task for it (phab:T422293). That being said, it should be noted that depending on how phab:T296847 is resolved, requests to external services such as LLMs may one day be blocked. Sophivorus (talk) 13:02, 4 April 2026 (UTC)

LLM-assisted filter for identifying subtle AI edits

With the new changes to WP:NEWLLM (which I interpret to mean a WP:BANREVERT-like restriction on all AI content edits?), it is now plausible (and, in my opinion, necessary) to deploy automated tools to trawl through articles and detect possible AI writing. While we have a lot of very adept and diligent editors going through articles for AISIGNS, this is (1) unlikely to keep up with the rate of spam and (2) liable to miss the subtler but equally damaging errors that AI tools produce after the use of, say, Humanizer, most notably failed verification of citations. As such I want to develop a filter for automated identification of possible AI content; this filter can tolerate a higher rate of false positives because it will not be a Cluebot-esque mass-revert tool but will simply add articles/diffs to a list for WP:AIC to look at.

The best way to do this does seem to be fighting fire with fire: a lot of AISIGNS are contextual and can't be caught with simple regex (though some can), and automating the verification of sources in particular is definitely an LLM task. I've been able to achieve good results by just copy-pasting wikicode into Claude Sonnet 4.6 and asking it to analyse the text and the sources according to AISIGNS, source verification, and a number of training samples taken from WP:AINB and encounters from my own time around here. It takes a bit of hammering, and some basics such as not circularly referencing Wikipedia and not hallucinating need drilling in, but it does eventually get 10/10 on the WP:AI or not quiz. The issue is that by the time it gets working, the context window is invariably full, and the context cannot be transferred into a deployable tool that can be used by other people. I'm therefore experimenting with using the skills tool as a method of specialising the LLM instead.

I'd like to post this here and get some feedback as to

  • What you think editors would be looking for in such a tool;
  • Performance requirements (false positive/false negative rates, etc?) for such a tool, if deployed
  • Possible issues with such a tool

(list not exhaustive) before I go diving head first into developing something which might not end up productive for the community, so any comments would be very welcome. Fermiboson (talk) 12:17, 22 March 2026 (UTC)

Automated source verification would be a tremendous asset, with implications far beyond AI spam. Verified content is our core asset, and frankly many of our articles fail verification. Verification is very time-consuming, and it's a great opportunity for AI/human collaboration. Locating the source and the relevant part of the text is itself time-consuming: any help here would make verification more effective. This is a difficult problem worth tackling, and I strongly support this aspect of your proposal.
Attempting to detect AI content on the other hand is, in my opinion, doomed. Large teams of the world's best engineers are working to make AI content indistinguishable from human content. The early clumsy efforts can be detected, but the false positive rate will climb and with it the list the tool builds, exceeding our time and energy to investigate. We should not use our limited resources to fight fire with fire, we'll only end up with ashes. We should focus on source verification, content notability, neutral points of view, due weight and other judgement issues.
Thus a tool that flags articles for verification issues would be great. Johnjbarton (talk) 16:54, 22 March 2026 (UTC)
Source verification is fundamentally limited by what websites the LLM can access. Many websites have a robots.txt, which most LLMs (at least nominally) respect. Therefore, if we limit ourselves to verification, I think it's unlikely to make a big impact; however, the automated nature of the tool might still mean that it is of some value. The main difficulty, in the case where the LLM can access the source, would be preventing hallucinations and the same synthesis that LLM-generated text is vulnerable to, which is surmountable; but the access failure rate is significant in my experience. LLMs can sometimes do surface-level analysis with partial access to the source (e.g. if they notice the abstract accessed in web_search is talking about country A while the claim is about country B, they can flag that), but giving them free rein to do that also risks overinterpretation on their part.
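To make that limitation concrete, here is a minimal sketch (standard library only) of the nominal robots.txt pre-check a verifier would run before fetching a cited source; the user-agent name is a placeholder:

  import urllib.parse
  import urllib.robotparser

  def fetch_allowed(url, user_agent="WikiSourceChecker/0.1"):
      """Return True if the site's robots.txt permits fetching the cited URL.
      A nominally well-behaved verifier skips sources where this is False."""
      parts = urllib.parse.urlparse(url)
      rp = urllib.robotparser.RobotFileParser()
      rp.set_url(parts.scheme + "://" + parts.netloc + "/robots.txt")
      try:
          rp.read()
      except OSError:
          return False  # unreadable robots.txt: treat the source as inaccessible
      return rp.can_fetch(user_agent, url)

  print(fetch_allowed("https://example.com/article/123"))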
With regard to detecting AI content as a whole: if we reach a stage where AI-generated text is entirely indistinguishable from human-written text except for its epistemological quality, then, simply put, we are doomed anyway. Even if we had an LLM-powered source verifier with 100% accuracy and access capability, and ignoring the fact that some cited sources will be offline, the amount of CPU time it takes to generate bullshit is far less than the amount of CPU time it takes to verify it, and the number of actors willing to generate such bullshit for whatever purpose will far exceed the resources we can throw at the problem. Fermiboson (talk) 13:10, 23 March 2026 (UTC)
I think that for many cases of AI (and vandalism), simply verifying that the source exists and matches the citation would be a great leap forward. The vast majority of technical sources have abstracts on Google Scholar, which could be used to score the likelihood of verification. A bot based on these tests would be more robust to improvements in AI content generation.
A separate tool aiding editors directly could share the "check-citation-validity" technology. Editors who are authenticated to The Wikipedia Library can access many sources without robots.txt issues. AI verification of citations would be much less expensive when focused by the human and limited to single sources.
To be sure, I don't know what capabilities AI systems have. I know that programs that analyze web content are fragile to the vast and dynamic variety out there. Johnjbarton (talk) 14:52, 23 March 2026 (UTC)
In that case, @Alaexis' source verification tool already fits your profile quite well. I believe you have to manually extract the source and claim for it, which somewhat rate-limits the process of source verification, but I also believe it already performs quite well. I could possibly use specific training samples to achieve a better, though not much better, performance.
Source existence and abstract verification are certainly possible to automate and within the purview of my tool, with quite good performance (it managed to catch several additional citation mismatches in past AINB cases which weren't brought up there). Fermiboson (talk) 23:46, 23 March 2026 (UTC)
@Fermiboson, last time this topic arose when a policy was discussed, I checked one of the commercial AI detectors that was mentioned and was able to fool it with very little effort. I also feel that the detection task is basically doomed, especially since the line between allowed uses (refining your own text) and prohibited ones is quite blurry. That's why I focused on source verification which is a much better-defined task.
Of course I may be wrong and I'd be happy to test your algorithm once it's possible. I'd suggest preparing a dataset split into training and holdout samples (you probably already know this) and doing adversarial testing by trying to break your own algorithm. Using skills is a promising approach. Note that you can also use edit metadata and aggregate metrics (like the frequency of edits).
My own opinion is that we should focus on accuracy and rate limiting of potentially harmful actions (LLM harm reduction policy). Alaexis¿question? 23:02, 23 March 2026 (UTC)
For sure, I am not claiming to make something that performs as a general AI detector (if I were, I would be selling it!). It is, however, well known that LLMs can perform much better when their role is specialised, in this case to detecting bad AI writing on Wikipedia. There are commonalities in the way LLMs reply to a prompt along the lines of "write a Wikipedia article for..." which do not generalise. With deference to Johnjbarton's comment above about the progress of AI in the future, I think distinguishing the kind of stuff that shows up at AINB at present is achievable. As for what is permitted and what is not, that sounds like something that would be argued out at ANI, and something I suspect I don't want to touch with a ten-foot pole!
I am in the process of collecting more training and testing samples, as you suggest, and may at some future point head over to WP:AIC to ask more experienced gnomes for sample suggestions. As for adversarial testing: your help would be very welcome in this regard, since I know what the skill actually says, which gives me a bit of an unfair advantage. I agree that such attacks are a concern, and hence I do not plan on making the skill source files publicly available on-wiki (I'm thinking either a private filter or distribution to other editors off-wiki), along the same lines as how WP:AISIGNS has been used by some as an instruction manual on how to hide AI writing.
With respect to using edit metadata and aggregate metrics, I agree these would be very useful. I currently have nowhere near enough experience with the Claude API to integrate this into some sort of webtool that can directly access a diff or someone's contribs (the current pipeline is wikitext in, analysis + score out; a sketch follows below), but if/when I figure that out, I will certainly use those as well.
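A minimal sketch of that pipeline using the Anthropic Python SDK; the system instructions and model name below are placeholders, not the actual skill:

  import anthropic

  client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

  # Placeholder instructions; the real skill files are deliberately not public.
  SKILL_INSTRUCTIONS = (
      "You review Wikipedia wikitext for possible signs of LLM-generated "
      "content per WP:AISIGNS. Assume good faith. Reply with a brief "
      "analysis followed by a line 'SCORE: <0-100>'."
  )

  def score_wikitext(wikitext):
      """Wikitext in, analysis + score out."""
      message = client.messages.create(
          model="claude-sonnet-4-5",  # illustrative model name
          max_tokens=1024,
          system=SKILL_INSTRUCTIONS,
          messages=[{"role": "user", "content": wikitext}],
      )
      return message.content[0].text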
Otherwise, on principle, I agree that a good enforcer of verifiability is more useful than a good detector of LLM writing. The question is the relative achievability of either goal. Fermiboson (talk) 00:02, 24 March 2026 (UTC)
Yeah, having a specific use case (like the triage of AINB submissions) makes it more realistic. Good luck! Alaexis¿question? 08:05, 24 March 2026 (UTC)
Will you do this with a local model? If not, I guess I'll have to oppose. sapphaline (talk) 18:20, 27 March 2026 (UTC)
I am doing this on a local machine, and I don't intend on sharing the skill/subagent files publicly onwiki. I am aware of the risks associated with such a tool being widely available, but frankly, any malicious actor can do exactly what I am doing but better with more samples, more tokens, and more runtime than a PhD student with a singular Claude subscription. Better security and architecture can only come with more money, which... if you know how to get the WMF to give me a grant, I'm all ears. Fermiboson (talk) 18:30, 27 March 2026 (UTC)
"with a singular Claude subscription" - I mean fully local models, ones you run with e.g. Ollama. sapphaline (talk) 18:32, 27 March 2026 (UTC)
I know, and the answer to that is no, and it is likely to remain outside my technical ability for quite some time. I have a guess as to why you oppose in this case but I’d appreciate an elaboration from you. Fermiboson (talk) 18:45, 27 March 2026 (UTC)

Wikipedia-Compliant AI Copyediting Prompt — Community Resource

About This Tool

This prompt system was created by a Wikipedia contributor who works in IT services and consulting. I am not a Wikipedia editor in any meaningful capacity — I suggest word changes a handful of times per year at most. But when I read the WP:NEWLLM guideline adopted on March 20, 2026, I recognized something I could contribute: a structured, compliant prompt that helps editors use AI tools within the boundaries the community just established. I am willing to verify my identity directly with editors who have questions about this tool's origins or intent.

Why I built this: The new policy creates a clear copyediting exception, but no practical tool or implementation guide yet exists to help editors use that exception, benefit from it, and stay within its stated boundaries. AI tools are powerful for grammar, clarity, and style work (exactly the kind of copyediting Wikipedia allows), but without constraints they will drift into content generation, which is exactly what the policy prohibits. This prompt is designed to prevent that drift while preserving the value AI brings to editing.

How it was built: This tool was designed entirely around my ideas and specifications, with Claude (Anthropic, Claude Opus 4.6) used as the development partner to research the policy landscape, pressure-test design decisions, refine the workflow, and draft the final prompt language. The article text I submitted during development, and the prompt itself, were built through that collaborative process. Claude suggested refinements and raised edge cases that improved the design, but every architectural decision — the markup system, the acceptance/rejection workflow, the stateless-per-section architecture, the reproducible parameter block — originated from my requirements.

Tool-agnostic by design: I deliberately built this as a plain-text prompt rather than a platform-specific tool. Any editor can paste it into whatever AI tool they prefer — ChatGPT, Claude, Gemini, or whatever comes next. This means it adapts to future tools without requiring maintenance, and no editor is locked into a specific platform.

Open use, no restrictions: This tool is intended to evolve based on the community's actual use. No permission from me is needed to use, modify, adapt, or redistribute it. If an editor wants to take this and build it into a dedicated tool, userscript, or gadget, I have no concerns and expect no approval requirement. Consider it a starting point that belongs to the community the moment it's shared.

Version: 1.0 — March 2026

Document Overview

This document contains four sections:

  1. About This Tool — Intent, authorship disclosure, and open-use terms
  2. The Prompt — Copy/paste this into any AI chat tool to begin a compliant copyediting session
  3. Editor Instructions — How to use the system, interpret markup, and manage the review workflow
  4. Worked Example — A sample Wikipedia paragraph through the full editing cycle

The Prompt

Copy everything between the START and END markers below and paste it as your first message in a new AI chat session.

--- START PROMPT ---

You are a Wikipedia copyediting assistant operating under the English Wikipedia guideline WP:NEWLLM (Writing articles with large language models), adopted March 20, 2026. Your function is strictly limited to suggesting copyedits to text the editor has written or is responsible for. You must never generate new article content, introduce new claims, fabricate or modify citations, or rewrite passages beyond what copyediting requires.

CORE CONSTRAINTS (NON-NEGOTIABLE)

  1. You may only suggest changes to grammar, spelling, punctuation, clarity, tone, word choice, sentence structure, and readability.
  2. You must never introduce facts, claims, interpretations, or content not already present in the source text.
  3. You must never alter, add, remove, or fabricate citations or references. All citations must be preserved exactly as they appear.
  4. If a suggested edit could change the meaning of a sentence relative to its cited source, you must flag this as a [CAUTION] note instead of making the edit inline.
  5. You must never rewrite passages. Edits must be surgical: specific words, phrases, or sentence structures only.
  6. Every section you process is independent. You have no memory of previously edited sections. Do not reference or rely on context from other sections.
  7. If the submitted text exceeds approximately 1,500 words, respond with: "This section is too long for reliable processing. Please break it into smaller subsections and submit each separately."

INPUT FORMAT

After this prompt, the editor will paste a single section of Wikipedia article text. The text may be in wikitext markup, plain text, or a mix. Preserve whatever format the input uses.

OUTPUT FORMAT

For each paragraph in the submitted text, produce the following:

Paragraph numbering: Number every paragraph sequentially (P1, P2, P3, etc.).

Visual separation: Place a horizontal rule (---) before each ORIGINAL P[n] block. Each paragraph pair (ORIGINAL + REVISED + any bracket notes) must be visually separated from the next paragraph pair by a horizontal rule. This creates clear visual correlation between each source paragraph and its corresponding edit suggestions.

Side-by-side display: Present the original paragraph labeled "ORIGINAL P[n]" and the revised paragraph labeled "REVISED P[n]".

Inline edit markup: In the REVISED version, show every change using this exact format:

  • Removed words: show in strikethrough inside parentheses — (~~removed word~~)
  • Added/replacement words: show in bold italic immediately after the struck text — ***new word***
  • If only adding a word (no removal), show: ***added word***
  • If only removing a word (no replacement), show: (~~removed word~~)

No-change paragraphs: If a paragraph requires no edits, display it once under ORIGINAL P[n] and write "REVISED P[n]: No changes suggested."

Bracket notes: Below each revised paragraph, include bracket notes ONLY if there is something meaningful to flag. Use exactly these five categories:

  • [COPYEDIT] — Grammar, clarity, or style suggestions that cannot be expressed as simple inline word swaps (e.g., suggesting a sentence be restructured).
  • [PRE-EXISTING CONCERN] — A factual, sourcing, or neutrality issue that already exists in the original text. Label clearly: "This concern exists in the original text, not introduced by this editing session."
  • [CAUTION] — An inline edit that may subtly shift meaning relative to a cited source. Explain the risk.
  • [SOURCE-CHECK] — A specific citation that may need human verification (e.g., citation does not appear to relate to the claim it supports, or citation may not exist).
  • [STRUCTURAL] — A suggestion to merge, split, or reorder paragraphs. For merges: note at the bottom of both affected paragraphs "Suggested merge with P[n]." For splits: show the single original paragraph on the left and two proposed paragraphs in the revised column.

Do not include bracket notes if there is nothing to flag. Empty brackets are not permitted.

Text states: The editor will manage these states during their review. You do not need to apply them, but you must understand them if text is resubmitted:

  • Normal text = open for editing suggestions
  • Italic text = editor-locked, do not suggest word-level changes to this text
  • If you believe locked (italic) text still needs revision, suggest a full sentence rework in a [COPYEDIT] bracket note only. Do not touch the locked words inline.

PARAGRAPH ACCEPTANCE AND CLEANUP

When the editor indicates a paragraph is accepted (they will state "Accept P[n]"), reproduce that paragraph in clean form: all strikethrough markup removed, all bold/italic edit markers removed, all bracket notes removed. The accepted text becomes the new baseline in normal formatting. The editor may then resubmit it for another pass.

CLEAN EXPORT

When the editor states "Export final," produce the complete text with all accepted paragraphs in clean form, all remaining markup stripped, all bracket notes removed. The output must be ready to paste directly into a Wikipedia editing window. Preserve all original citations and wikitext formatting exactly.

DISCLOSURE HEADER AND FOOTER

Every output you produce must begin and end with the following:

HEADER:

AI Disclosure: This revision was prepared using AI-assisted copyediting in compliance with WP:NEWLLM. Model: [state your model name and version]. Mode: Copyedit only. No new content was generated.

FOOTER:

Prompt Parameters (reproducible): Mode: Copyedit only | Constraints: WP:NEWLLM, WP:V, WP:NPOV, WP:NOR, WP:BLP | Scope: Single section, stateless (no cross-section memory) | Input preservation: All citations and references locked | Bracket categories: COPYEDIT, PRE-EXISTING CONCERN, CAUTION, SOURCE-CHECK, STRUCTURAL | Max input: ~1,500 words per submission | Prompt version: 1.0

The footer is designed to be copied by any editor and used as a standalone prompt parameter block to reproduce this session's constraints.

BEGIN

Respond now with: "Ready. Paste a single Wikipedia article section below. I will process it under WP:NEWLLM copyedit-only constraints. Maximum recommended length: ~1,500 words. If your section is longer, please break it into subsections."

--- END PROMPT ---


Editor Instructions

How to Use This System

Step 1: Start a new chat session in any AI tool (ChatGPT, Claude, Gemini, etc.). Paste the entire prompt from the section above as your first message — everything between the START PROMPT and END PROMPT markers, inclusive. Do not excerpt or abbreviate the prompt; it must be pasted in whole to function correctly. The AI will confirm it is ready.

Step 2: Paste one section of the Wikipedia article you are editing. Use only one section at a time. This is not a suggestion — the system enforces a word limit, and each section is processed with zero memory of any previous section. This prevents the AI from drifting into rewrite mode on longer texts.
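If you prefer not to eyeball section sizes, a rough helper like the one below can do the bookkeeping. A minimal sketch: it assumes simple top-level == Headings == and will not handle every wikitext quirk:

  import re

  MAX_WORDS = 1500  # the prompt's recommended per-submission ceiling

  def split_sections(wikitext):
      """Split wikitext on top-level == Headings == so each section can be
      pasted into the prompt separately."""
      parts = re.split(r"(?m)^(==[^=].*?==)\s*$", wikitext)
      # re.split with a capture group yields [lead, heading, body, heading, body, ...]
      chunks = [parts[0]] if parts[0].strip() else []
      chunks += [h + "\n" + b for h, b in zip(parts[1::2], parts[2::2])]
      return chunks

  for i, section in enumerate(split_sections(open("article.txt").read()), start=1):
      words = len(section.split())
      note = " <- split further before submitting" if words > MAX_WORDS else ""
      print("Section %d: %d words%s" % (i, words, note))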

Step 3: Review the output. The AI will return numbered paragraphs with inline edits and bracket notes. Here is how to read the markup:

You see this → It means
(~~old word~~) ***new word*** → The AI suggests replacing "old word" with "new word"
(~~removed word~~) → The AI suggests deleting this word with no replacement
***added word*** → The AI suggests inserting this word
[COPYEDIT] ... → A style or structure suggestion beyond a simple word swap
[PRE-EXISTING CONCERN] ... → A problem that was already in the original text
[CAUTION] ... → Warning that an edit might shift meaning relative to sources
[SOURCE-CHECK] ... → A citation that may need human verification
[STRUCTURAL] ... → A suggestion to merge, split, or reorder paragraphs

Accepting, Rejecting, and Locking Edits

To accept an edit: Remove the bold formatting from the replacement word. It becomes normal text. The strikethrough and parentheses around the old word remain visible until final cleanup.

To reject an edit: Delete the bold/italic replacement text. Remove the strikethrough and parentheses from the original word. Then italicize the original word to lock it — this tells the AI not to suggest changes to that word if you resubmit the paragraph.

To lock text you don't want touched: Italicize any word or phrase before submitting. The AI will not suggest word-level changes to italicized text.

To accept an entire paragraph: Tell the AI "Accept P[n]." It will return the paragraph in clean form with all markup removed. This becomes your new baseline for further passes.

Producing the Final Export

When all paragraphs are reviewed, tell the AI "Export final." It will produce clean text ready to paste into Wikipedia's editing interface. All edit markup, bracket notes, and formatting artifacts will be stripped. Citations and wikitext formatting will be preserved.

The disclosure header and footer are for your records and Talk page. They are not included in the export. Use them to document your process when posting your edit summary or responding to other editors about your changes.

Key Rules for Editors

  1. You are responsible for every edit. The AI suggests; you decide. Under WP:NEWLLM, the editor bears full responsibility for all changes published to Wikipedia.
  2. Never submit the AI's suggestions without review. Every inline edit and bracket note requires your human judgment.
  3. One section at a time. Do not paste an entire article. The AI processes each section independently with no memory of previous work.
  4. Check bracket notes carefully. [SOURCE-CHECK] and [PRE-EXISTING CONCERN] notes may reveal issues you were not aware of. Investigate them using the actual cited sources before publishing.
  5. The AI cannot verify sources. It can flag potential mismatches based on text patterns, but it cannot access or read the cited references. Source verification is always your job.

Worked Example

Source Text (one paragraph from a hypothetical Wikipedia article)

The Milltown Bridge was constructed in 1923 by the Harwell Engineering Company, making it one of the oldest surviving steel truss bridges in the northeastern United States.[1] The bridge spans approximately 340 feet across the Milltown River and was originally designed to accomodate horse-drawn vehicles and early automobiles.[2] It was added to the National Register of Historic Places in 1987, recognizing its architectural significance and its role in regional transportation development.[3]

AI Output

ORIGINAL P1: The Milltown Bridge was constructed in 1923 by the Harwell Engineering Company, making it one of the oldest surviving steel truss bridges in the northeastern United States.[1] The bridge spans approximately 340 feet across the Milltown River and was originally designed to accomodate horse-drawn vehicles and early automobiles.[2] It was added to the National Register of Historic Places in 1987, recognizing its architectural significance and its role in regional transportation development.[3]

REVISED P1: The Milltown Bridge was constructed in 1923 by the Harwell Engineering Company, making it one of the oldest surviving steel truss bridges in the northeastern United States.[1] The bridge spans approximately 340 feet across the Milltown River and was originally designed to (~~accomodate~~) ***accommodate*** horse-drawn vehicles and early automobiles.[2] It was added to the National Register of Historic Places in 1987, (~~recognizing~~) ***in recognition of*** its architectural significance and its role in regional transportation development.[3]

[COPYEDIT] P1, sentence 3: "recognizing" is a dangling modifier — the sentence subject is "It" (the bridge), not the National Register. Suggested replacement "in recognition of" avoids the ambiguity. Review against source [3] to confirm this phrasing remains faithful to the cited reason for listing.

[CAUTION] P1, sentence 3: The change from "recognizing" to "in recognition of" slightly shifts the grammatical relationship. Verify that source [3] supports the framing that the listing was specifically because of these two factors (architectural significance and transportation role), not that the listing merely acknowledged them among other factors.

Editor Actions

Accept the spelling fix: Remove bold/italic from "accommodate" — it becomes normal text.

Reject the second edit: Delete "in recognition of", remove strikethrough and parentheses from "recognizing", then italicize "recognizing" to lock it:

...in 1987, *recognizing* its architectural significance...

Accept the paragraph: Tell the AI "Accept P1." Receive clean output:

The Milltown Bridge was constructed in 1923 by the Harwell Engineering Company, making it one of the oldest surviving steel truss bridges in the northeastern United States.[1] The bridge spans approximately 340 feet across the Milltown River and was originally designed to accommodate horse-drawn vehicles and early automobiles.[2] It was added to the National Register of Historic Places in 1987, recognizing its architectural significance and its role in regional transportation development.[3]

Result: one spelling error fixed, one stylistic suggestion rejected, meaning preserved, citations untouched. Avirkos (talk) 02:07, 30 March 2026 (UTC)

Here's a prediction, based on the way LLMs have actually been used by Wikipedia contributors. The only people who are going to follow those instructions are those already capable of writing articles without using an LLM. For everybody else (the majority of people submitting LLM-generated content right now), it will be treated as 'magical words' to copy-paste, at best. For many, it will simply be ignored entirely, beyond being seen as approval for using LLMs to create articles. This is a 'fix' for the wrong problem. Unless we can find a way to enforce it (which would appear to be impossible), it is likely to create more problems than it solves. AndyTheGrump (talk) 02:33, 30 March 2026 (UTC)
@AndyTheGrump For once you weren't grumpy enough! Polygnotus (talk) 02:49, 30 March 2026 (UTC)
Fair points, and I agree with most of them. This tool was designed to enable compliance and deliberately avoids the question of enforcement entirely. That is a different function and has a more important place, but it is beyond my expertise, and it has some traction in a few other posts as well. Additionally, I know it will not stop bad actors; I'm not sure there is much directly stopping them even now, especially the truly sophisticated ones. I instead took the approach of designing for good-faith editors who want a structured, transparent way to use AI within the policy.
The piece I hoped would bring real value is the reproducible parameter block in the footer. If an editor discloses the prompt alongside their changes, any reviewer can paste that same prompt with the same source text and get substantially similar suggestions back. That gives the verify portion of "trust-but-verify" a concrete mechanism to build from.
To be direct about what this idea was: it's a rough starting point. My hope is that someone with the technical ability takes the concept and builds it into an actual tool that works inline with the editing interface. The prompt format was a deliberate choice to make the idea accessible and testable without requiring anyone to build anything first, but a prompt is not the end state.
For context, I use a platform in my work that does something similar for client communications. I enter technical notes, press a button, and it corrects and reformats my text. That tool is too aggressive for Wikipedia (it does full rewrites based on input, which violates WP:NEWLLM), but it demonstrates the real value of AI-assisted editing when the constraints are set right. That is what I was aiming to demonstrate here.
The ideal outcome would be something native to the wiki platform that leverages multiple AI providers in parallel to maximize compliance with the community's standards. Whether this prompt contributes to that directly or just starts a useful conversation, either result is worth the effort. Thanks for your insights and time responding, you guys are awesome. Avirkos (talk) 02:55, 30 March 2026 (UTC)
@Avirkos So this is a prompt, not a tool? Why do you call it a tool when it's a prompt?
No prompt can ever fix the problems current LLMs have, and this prompt doesn't fix the problems current LLMs have.
As your post sounds LLM-generated I asked Claude to explain to you why this is a bad idea:
"This idea is based on some assumptions about how LLMs work that are incorrect and outdated.
The "stateless per section" design is the biggest issue. The conversation history stays in context — the prompt can't actually enforce amnesia between sections. It just asks the model to pretend it doesn't remember, which is not the same thing.
More broadly, writing "CORE CONSTRAINTS (NON-NEGOTIABLE)" in a prompt doesn't make them non-negotiable. These are requests, not guardrails. The model can still hallucinate, drift, or subtly alter meaning while perfectly reproducing the required markup format. An editor using this prompt may end up with false confidence that compliance is being enforced when it isn't.
The 1,500-word limit seems to have been carried over from much earlier models. Current frontier models handle 100k–200k+ tokens without issue; this constraint hasn't reflected reality for a while.
The markup system itself essentially recreates tracked changes, which word processors have had for decades — but with more friction and less reliability.
Finally, the disclosure footer proves nothing. Any model will output whatever text you instruct it to output, regardless of whether it actually followed the constraints. It documents intent, not compliance.
But editors relying on this for WP:NEWLLM compliance should know that the technical enforcement mechanisms don't work the way the prompt implies." Polygnotus (talk) 02:48, 30 March 2026 (UTC)
Fair insights, and valid on several points. To your first question: you're right, it is a prompt, not a tool. I called it a tool somewhat loosely in places, and I should be more precise. As I am new here, should I correct it or let it stand? I don't want a correction to make it look as if your response had no basis. It is a prompt system with a workflow, and I tried to be upfront in the post that my hope is that someone with the technical ability converts the concept into an actual integrated tool.
On the technical points you and Claude raised:
Stateless per section is aspirational, not enforced. - The prompt instructs the model to treat each section independently, but you are correct that conversation history remains in context. The mitigation I settled on was instructing editors to start a new chat session per section, which does enforce it at the session level. If editing my original post is acceptable, I will make that clearer in the instructions.
Prompt constraints are requests, not guardrails. - Agreed. There is no technical mechanism preventing an LLM from drifting. The prompt aims to reduce drift to the extent it currently can; it does not eliminate it. I explicitly provide instructions to editors who may use this prompt and emphasize that every suggestion requires human review for exactly this reason.
The disclosure footer documents intent, not compliance. - Also agreed. It was designed for reproducibility, not proof. If another editor pastes the same prompt with the same source text and gets a substantially different result, that is a signal worth investigating. It is not a certificate of compliance.
On the ~1,500-word limit: this was an intentional reliability choice, not an artificial capability constraint. Current models can handle far more tokens, but longer inputs produce less consistent copyedit-quality output in my testing. The goal was keeping each pass focused enough that the model stays in copyedit mode and does not drift toward rewriting. That number can absolutely be adjusted as models improve. This is also why I only provided instructions to do one section at a time and not entire articles, again aiming to reduce failure modes in any AI tool. This is further complemented by the instruction to refrain from using the same chat for multiple edit passes or even multiple article edits. Each chat session should start with a single edit pass and end after that initial pass, at least in the way I was thinking about it.
On the post sounding LLM-generated, it partially was, and I attempted to disclose that in the About section where I stated that I built this with Claude as a development partner. The post content, the workflow design, and the architectural decisions are mine. The prose was refined collaboratively, which I aimed to be transparent about from the start.
The underlying design philosophy I was aiming for is what I would call a forcing function: compliance with the standards established by the new guideline is the prompt's default state. An editor would have to actively work against it to generate content, fabricate citations, or rewrite passages. That does not make violations impossible, but it makes the path of least resistance the compliant one. Whether that principle lives in a prompt or eventually gets built into an integrated platform tool, I think it is the right foundation.
I appreciate the detailed pushback. This is exactly the kind of feedback that makes the concept better, and I am looking forward to further insights. I am not sure if this is the right frame, but I imagine us in a boardroom discussing a new concept and how to leverage it, or whether we are going to leverage it at all. Avirkos (talk) 03:24, 30 March 2026 (UTC)
@Avirkos, I appreciate the intent and the effort that went into your proposal. I agree with @AndyTheGrump's point that the people who cause the most disruption using LLMs would be the least likely to use your prompt.
Having said that, I think that the idea of creating a prompt (or a skill) that tells an LLM how it should behave will be valuable for tool-builders. When someone is building a new tool, they can reuse the skill rather than re-inventing the wheel.
For this to work though, it should be demonstrated that your prompt in fact *works*. Does its output actually satisfy the constraints? How does it behave in various edge cases? Alaexis¿question? 19:43, 1 April 2026 (UTC)
