Wikipedia:Village pump (WMF)
Discussion page for matters concerning the Wikimedia Foundation
From Wikipedia, the free encyclopedia
| Policy | Technical | Proposals | Idea lab | WMF | Miscellaneous |
- Discussions of proposals which do not require significant foundation attention or involvement belong at Village pump (proposals)
- Discussions of bugs and routine technical issues belong at Village pump (technical).
- Consider developing new ideas at the Village pump (idea lab).
- This page is not a place to appeal decisions about article content, which the WMF does not control (except in very rare cases); see Dispute resolution for that.
- Issues that do not require project-wide attention should often be handled through Wikipedia:Contact us instead of here.
- This board is not the place to report emergencies; go to Wikipedia:Emergency for that.
Threads may be automatically archived after 14 days of inactivity.
Behaviour on this page: This page is for engaging with and discussing the Wikimedia Foundation. Editors commenting here are required to act with appropriate decorum. While grievances, complaints, or criticism of the foundation are frequently posted here, you are expected to present them without being rude or hostile. Comments that are uncivil may be removed without warning. Personal attacks against other users, including employees of the Wikimedia Foundation, will be met with sanctions.
To scrape data from Wikipedia, do you need to go through Wikipedia Business
Just wondering. ~2026-82871-0 (talk) 00:59, 7 February 2026 (UTC)
- This isn't really answerable without a lot more context, but I think the answer is "no". * Pppery * it has begun... 02:20, 7 February 2026 (UTC)
- From a Foundation article from November: "Financial support means that most AI developers should properly access Wikipedia’s content through the Wikimedia Enterprise platform. Developed by the Wikimedia Foundation, this paid-for opt-in product allows companies to use Wikipedia content at scale and sustainably without severely taxing Wikipedia’s servers, while also enabling them to support our nonprofit mission."
- I would try looking at Wikimedia Enterprise. From what I am getting from this TechCrunch article, I think it might be what you are looking for or in the right direction. --Super Goku V (talk) 02:34, 7 February 2026 (UTC)
- How much data and how frequently? Aaron Liu (talk) 16:49, 8 February 2026 (UTC)
- You don't need to as long as you comply with Wikipedia's content licence, but if you are copying a lot of data it would probably be better (for both you and Wikipedia) to. Phil Bridger (talk) 17:01, 8 February 2026 (UTC)
- Considering that our API is free for most small use cases and we freely provide dumps for everyone to use, no? Wikimedia Enterprise is if your use case meets the brief "if I do this, I will cause production outages" Sohom (talk) 18:37, 8 February 2026 (UTC)
- See WP:Database download for an overview of ways to get at our data. —Cryptic 21:16, 8 February 2026 (UTC)
- Hi @~2026-82871-0,
- Yes as other people have said here - it depends on "how much" or "how fast" you want... There are various APIs and database dumps that exist. Here's the User-Agent Policy and API Usage Guidelines for starters.
- You can also access and download content via the enterprise API service directly, at no cost, up to a fairly high limit. That same dataset is also available via several alternative methods, including Wikimedia Cloud Services and external platforms. For information on those options see meta:Wikimedia_Enterprise#Access.
- LWyatt (WMF) (talk) 14:59, 16 February 2026 (UTC)
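(Illustrative sketch, not part of the discussion above: the advice about the API and the User-Agent policy can be made concrete. The tool name and contact address below are placeholders that a real scraper would replace with its own details; the code only assembles a request and makes no network call.)

```python
from urllib.parse import urlencode

# Per the Wikimedia User-Agent policy, identify your tool and provide a
# contact address. Both values below are placeholders, not real endpoints.
USER_AGENT = "ExampleScraper/0.1 (https://example.org/contact; contact@example.org)"
API = "https://en.wikipedia.org/w/api.php"

def build_extract_request(title: str) -> tuple[str, dict]:
    """Build the URL and headers for a plain-text page-extract query."""
    params = {
        "action": "query",
        "prop": "extracts",      # TextExtracts extension
        "explaintext": 1,        # plain text rather than HTML
        "titles": title,
        "format": "json",
        "formatversion": 2,
    }
    return f"{API}?{urlencode(params)}", {"User-Agent": USER_AGENT}

url, headers = build_extract_request("Wikipedia")
```

For anything beyond small-scale lookups, the database dumps or the Enterprise endpoints mentioned above remain the better fit.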
- There are even companies that will put all of Wikipedia on a hard drive and ship it to you for a fee. See prepperdisk.com (don't know if they are any good - I just picked the first one duckduckgo listed). --Guy Macon (talk) 15:22, 16 February 2026 (UTC)
- https://what-if.xkcd.com/31/ RoySmith (talk) 16:24, 24 February 2026 (UTC)
- they ideally should but we can't legally do anything more than politely ask them to stop mghackerlady (talk) (contribs) 15:42, 9 March 2026 (UTC)
Wikimedia Foundation Bulletin 2026 Issue 3


Highlights
- Wikipedia Library: Wikipedia Library gained new content partnerships, restored access to the British Newspaper Archive, and added an Arabic language academic resource with more than 7 million records.
- Gender gap: The Celebrate Women 2026 campaign will run from March 1–31 to advance the achievements of the women’s rights and gender equity movement globally.
- Annual Planning: The Annual Plan is the Wikimedia Foundation’s description of what we hope to achieve in the coming year. We invite you to shape this plan together with us. Between now and the end of June 2026, we will have continuous conversations about how global trends may shape our future, how we can experiment, adapt and respond together.
Annual Goals Progress on Infrastructure
See also newsletters: Wikimedia Apps · Growth · Product Safety and Integrity · Readers · Research · Wikifunctions & Abstract Wikipedia · Tech News · Language and Internationalization · other newsletters on MediaWiki.org
- Patrolling improvements: A new feature available on Special:Contributions shows temporary accounts that are likely operated by the same person, and so makes patrolling less time-consuming.
- Wikifunctions: How Abstract Wikipedia articles can be integrated into Wikipedia language editions to enable Wikipedians to write an abstract article once and have it available in many languages.
- Suggestion Mode: A new Beta Feature for the VisualEditor, Suggestion Mode, is now available on English Wikipedia for experienced editors. This feature proactively suggests actions that people can consider taking to improve Wikipedia articles, such as "add citation", "improve tone", or "fix an ambiguous link".
- WDQS Blazegraph Migration: As part of the migration away from Blazegraph (the current backend of the Wikidata Query Service), an initial evaluation of open-source triple store candidates has been completed. Performance, stability, and compatibility were assessed using the published evaluation methodology.
- Tech News: Latest updates from Tech News weeks 06 and 07 include the new Watchlist labels feature that allows logged-in contributors to organise and filter watched pages in ways that improve their workflows. They also link to the 44 community-submitted tasks that were resolved over the last two weeks.
Annual Goals Progress on Volunteer Support
See also blogs: Global Advocacy blog · Global Advocacy Newsletter · Policy blog · WikiLearn News · The Wikipedia Library · list of movement events
- Funding Principles: The interim Global Resource Distribution Committee (GRDC) has published a first version of the Funding Principles, which guide the broader grantmaking ecosystem across the Wikimedia Movement. Share your feedback on the Discussion page.
- Wikipedia 25: Celebrating 25 years of Wikipedia in Warsaw.
- Responsible AI: Why the Global Index on Responsible AI matters for Wikimedians.
- Open Knowledge: Why the Open Knowledge Movement and Public Interest Journalism must unite forces. Shared principles and interdependence, points of convergence and the path forward.
- Journalism Awards: Applications for the Open the Knowledge Journalism Awards are now open until March 1. Presented by the International Center for Journalists in partnership with the Wikimedia Foundation, the awards seek to recognize African journalists whose reporting helps close knowledge gaps about Africa on Wikipedia.
- UN General Assembly: Wikimedia Foundation was invited to speak at the UN General Assembly (UNGA) hall about Wikipedia’s role in global digital governance.
- Advocacy: Wikimedia Foundation has adopted new and updated policies regarding the use of banners, logo changes, and blackouts on the projects, particularly for advocacy purposes. Specifically, the new "Use of Wikimedia sites for advocacy purposes" policy, and updates to the guidelines for CentralNotice usage and requesting wiki configuration changes. The policies establish clearer processes for advocacy activities, and require notification of Foundation staff for some proposed uses of the Wikimedia sites.
Annual Goals Progress on Effectiveness
See also: Progress on the annual plan
- Wikimedia Futures Lab: Reflections from a Wikimedian who attended the Wikimedia Futures Lab.
Other Movement curated newsletters & news
See also: Diff blog · Goings-on · Planet Wikimedia · Signpost (en) · Kurier (de) · Actualités du Wiktionnaire (fr) · Regards sur l’actualité de la Wikimedia (fr) · Wikimag (fr) · Education · GLAM · Milestones · Wikidata · Central and Eastern Europe · other newsletters
Subscribe or unsubscribe · Help translate
For information about the Bulletin and to read previous editions, see the project page on Meta-Wiki. Let foundationbulletin@wikimedia.org know if you have any feedback or suggestions for improvement!
MediaWiki message delivery 23:26, 17 February 2026 (UTC)
Error in above announcement
Re: "The Annual Plan is the Wikimedia Foundation’s description of what we hope to achieve...", the link to "Annual Plan" returns "This page doesn't currently exist". --Guy Macon (talk) 02:35, 18 February 2026 (UTC)
- Fixed. Typically when there's an error in a link and the link has a slash at the end, removing the slash fixes the error (MediaWiki interpreting the slash as part of the page name). FWIW @whomever this concerns, it would be good to have a person's name in the signature of this bulletin, so we can ping someone in particular if there's an error. I just went to do a courtesy ping since I edited it, but don't know who I'd ping. — Rhododendrites talk \\ 18:07, 18 February 2026 (UTC)
- The wikitext says:
<bdi lang="en" dir="ltr">[[User:MediaWiki message delivery|MediaWiki message delivery]]</bdi> 23:26, 17 February 2026 (UTC) <!-- Message sent by User:RAdimer-WMF@metawiki using the list at https://meta.wikimedia.org/w/index.php?title=Global_message_delivery/Targets/Wikimedia_Foundation_Bulletin&oldid=30053915 -->
- That URL isn't very helpful if you want to find the author. If you know where to look, "User:RAdimer-WMF@metawiki" eventually leads you to https://meta.wikimedia.org/wiki/User_talk:RAdimer-WMF but a straightforward signature is better than decoding a comment in the wikitext. --Guy Macon (talk) 23:01, 18 February 2026 (UTC)
AI agents are coming - what's the current state of protection?
This feels like something that must've come up already, but I'm not seeing it. As many interventions likely require WMF involvement, I'm putting it here.
With the sudden popularity of e.g. OpenClaw, AI agents are becoming more popular, and stand to be radically disruptive to our project (omitting potential applications for the time being, to avoid compiling a playbook). I'm curious what the current plans are to deal with an influx of agents.
Seems to me there are interventions that would intercept a large number of unsophisticated agent users, like using clues in the user agent (the web kind, not to be confused with AI agent). Then the question is about people who take steps to be sneakier. Rapid edits can be dealt with by captchas (assuming the captchas are hard enough). We could take action against data center IPs, but that would probably snag some humans as well (and pushing agents to residential IPs makes them more costly but not impossible to use). Then there are the various imperfect LLM output detection tools, of course.
Apologies if this discussion is already taking place somewhere - happy to receive a pointer link. — Rhododendrites talk \\ 15:51, 14 February 2026 (UTC)
- But can AI agents press edit or even be able to navigate around the editing method? ~2026-68406-1 (talk) 16:50, 14 February 2026 (UTC)
- You can edit Wikipedia through the API without using the front-end web interface. That's how bots, tools, etc. make edits. Both use the same process on the back-end, more or less, as I understand it. — Rhododendrites talk \\ 21:10, 14 February 2026 (UTC)
- They have been shown to send emails on their own accord by navigating the Gmail interface, so I bet they would be able to edit Wikipedia as well (though I don't know about the CAPTCHA). OutsideNormality (talk) 06:02, 15 February 2026 (UTC)
- I had a small moment of panic about agentic browsers in December and the consensus seemed to be that it wasn't time yet, but now the OpenClaw-enabled crabby-rathbun/matplotlib incident has me worried again. ClaudineChionh (she/her · talk · email · global) 07:13, 15 February 2026 (UTC)
- That's either (1) a human pretending to be an agent or (2) a human prompting their agent to write a hit piece. SuperPianoMan9167 (talk) 18:19, 16 February 2026 (UTC)
- Notified: Wikipedia talk:WikiProject AI Cleanup. ClaudineChionh (she/her · talk · email · global) 07:21, 15 February 2026 (UTC)
- It would be interesting to encounter AI agents that you could try breaking their instruction prompts and have them dox their creator. That would be fun to attempt. There's so many good guides out there on how to destroy AI agents (under the guise of preventing such actions, but it's still informative on how to do it purposefully). SilverserenC 07:29, 15 February 2026 (UTC)
- i hope that the doxxing is said in jest and not an encouragement to do so. – robertsky (talk) 13:47, 15 February 2026 (UTC)
- It was in jest, though also somewhat uncontrollable? There's been multiple instances of AI agents doing it spontaneously or with minimal prodding, giving up either personal details if they somehow have them or just account and password info, IP address and computer info, etc. SilverserenC 18:14, 15 February 2026 (UTC)
- Thank you for raising this. The LLM capabilities that the major providers have released in the last month pose an existential threat to the project today, let alone factoring in capabilities in future releases. Early 2025 GPT-4 era models were cute little toys in comparison; non-autonomous, with obvious output that was easily caught with deterministic edit filters. Autonomous agents are indeed coming, and output may improve to the point that detection is difficult even for experts. Big tech data center capex is ramping 20%+ YoY and given the improvements in LLM functionality in the last 6 months, much more must now be expected. The latest releases have shaken me personally and professionally. NicheSports (talk) 08:38, 15 February 2026 (UTC)
- We have an obvious place to document how much of what we see on Wikipedia (and the Internet in general) is generated by AI. That page is Dead Internet theory. Alas, a single editor has taken WP:OWNERSHIP of that page and WP:BLUDGEONS any attempt to make the topic of that page the topic that is found in most reliable sources -- whether the Internet now consists primarily of automated content. Instead the page claims that the dead Internet theory is a conspiracy theory and that the theory only refers to a coordinated effort to control the population and stop humans from communicating with each other -- something no reliable source other than the few that bother to respond to the latest 4chan bullshit talks about. There does exist such a conspiracy theory -- promoted by Infowars and 4chan -- but that's not what most sources that write about the dead internet are talking about.
- There was even an overly broad RfC that is being misused. The result was no consensus for a complete rewrite of the article, but it is now used (with the usual trick of morphing no consensus into consensus against) as a club against anyone who suggests any changes to the wording of the lead sentence.
- It's sad really. It would be great if, in discussions like this one, we could point to a page that focuses on actual research about how big the problem is that human-seeming AIs are taking over the job formerly done by easily-detected bots. I gave up on trying to improve that page. Life is too short. --Guy Macon (talk) 13:29, 15 February 2026 (UTC)
- 4chan was the origin of the phrase and the conspiracy theory the original sense of it. It seems to have gone through semantic diffusion to now just mean "there are lots of bots on the internet". The process seems complete now though, inevitably the page will be rewritten, eventually... TryKid [dubious – discuss] 18:33, 15 February 2026 (UTC)
- These can be easily blocked as unauthorized bots. sapphaline (talk) 16:46, 15 February 2026 (UTC)
- Thanks for bringing this up. We have more time than usual here, since right now we're still in the phase of these tools being used by AI tech bros and not the general public. Which doesn't mean do nothing, obviously.
- I will admit to being somewhat less concerned about this development, at least for Wikipedia. This could be premature or overly optimistic but it seems like the main benefit of agents vs. chatbots for the average person using AI to edit Wikipedia is that they don't have to copy-paste ChatGPT output, which doesn't seem like an enormous amount of friction for this use case compared to, say, doing shopping.
- I also would expect that people, particularly the kinds of people who want to edit Wikipedia maliciously (which is a smaller subset of people, though) would find different ways to spoof User-Agent etc if they are not already. (Grok apparently is already.) Gnomingstuff (talk) 17:31, 15 February 2026 (UTC)
still in the phase of these tools being used by AI tech bros
- There are some of those with access to lots of resources who have expressed an interest in messing with Wikipedia... But also, it wouldn't take a lot of careful agents to be seriously disruptive. But we're getting into WP:TECHNOBEANS territory. Hard to talk defense on a transparent project without encouraging offense. :/ — Rhododendrites talk \\ 18:19, 15 February 2026 (UTC)
- "we're getting into WP:TECHNOBEANS territory" - would you be comfortable discussing this by email? sapphaline (talk) 18:21, 15 February 2026 (UTC)
- By the way, none of the pre-emptive solutions proposed here are effective. Residential proxies are dirt cheap, user agents are easily spoofed and captchas are easily bypassed. sapphaline (talk) 18:01, 15 February 2026 (UTC)
- That they aren't going to catch everyone doesn't mean they're ineffective at catching some. Only an unsophisticated sock puppeteer, for example, would be caught by a checkuser, but it's still a valuable tool because it does catch a lot of sock puppets. It's a starting point, not a solution. — Rhododendrites talk \\ 18:14, 15 February 2026 (UTC)
- Thoughts and prayers PackMecEng (talk) 18:18, 15 February 2026 (UTC)
- guess ECPing main and project space is a (temporary) last resort Kowal2701 (talk, contribs) 22:58, 16 February 2026 (UTC)
user agents are easily spoofed
User agent spoofing can easily be detected. Look up TCP and TLS fingerprinting - while those can be spoofed, it's generally harder than spoofing a single header. With JavaScript (slightly outdated article), or even plain CSS (using a technique similar to NoScript Fingerprint), you can make it even harder to successfully spoof the user agent - especially if you don't outright block the user, but instead silently flag them in Special:SuggestedInvestigations, giving no feedback to attackers on whether their spoof was successful or not, at least until they get blocked (although this may be undesirable, as the AI edits would be visible for a short while). OutsideNormality (talk) 23:03, 16 February 2026 (UTC)
- (Of course I'm not necessarily suggesting any of this be implemented, I'm just outlining possibilities.) OutsideNormality (talk) 23:27, 16 February 2026 (UTC)
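(Illustrative sketch, not part of the discussion: the cross-check described above can be reduced to "does the browser family claimed in the User-Agent header agree with the family implied by the transport fingerprint?". The fingerprint values and the mapping below are invented for illustration; real TLS fingerprint databases are far larger and fuzzier.)

```python
# Hypothetical fingerprint-to-client mapping; real JA3-style databases map
# thousands of TLS ClientHello hashes to client families, with ambiguity.
KNOWN_FINGERPRINTS = {
    "ja3-aaa111": "Chrome",            # made-up fingerprint values
    "ja3-bbb222": "Firefox",
    "ja3-ccc333": "python-requests",
}

def claimed_family(user_agent: str) -> str:
    """Naive extraction of the browser family a User-Agent header claims."""
    for family in ("Firefox", "Chrome", "python-requests"):
        if family.lower() in user_agent.lower():
            return family
    return "unknown"

def suspicious(user_agent: str, tls_fingerprint: str) -> bool:
    """Flag sessions whose claimed UA family contradicts the TLS fingerprint."""
    implied = KNOWN_FINGERPRINTS.get(tls_fingerprint)
    if implied is None:
        return False  # unknown fingerprint: no evidence either way
    return implied != claimed_family(user_agent)
```

The point of the sketch is only the shape of the check: a mismatch (a Chrome User-Agent presented over a python-requests TLS stack) can be flagged silently for review rather than blocked outright, exactly as the comment suggests.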
- I haven't quit editing yet, but I will in the future due to the overwhelming flood that is coming from AI. As is usually the case, the WMF will barely lift a finger, and if they do it will be the wrong finger. Millions of jobs are being replaced by AI in the real world workforce. The impact here will be felt just the same. We can't really stop it. The project will be destroyed by it. It's already happening. --Hammersoft (talk) 15:51, 16 February 2026 (UTC)
- Which fingers should they lift? — Rhododendrites talk \\ 16:25, 16 February 2026 (UTC)
- Maybe cook up some AI agents that can spot fake references and references that don't support the content cited to them? I think such AI would fix roughly 90% of all AI related problems we have right now (and 50% of the future ones) and many problems from non-AI edits. Jo-Jo Eumerus (talk) 17:36, 16 February 2026 (UTC)
- This won't work: if LLMs cannot accurately characterize a source, then they definitely can't determine whether a source is accurately characterized; the same mechanism would be at work
- outright fake references are pretty rare nowadays Gnomingstuff (talk) 17:45, 16 February 2026 (UTC)
- That seems to assume that it's impossible for an AI - even a non-LLM AI - to compare sources to article claims, which is unproven (and likely false). Based on some complaints I have seen on AN and elsewhere, I am not sure that fake references are as solved as you seem to assume? Jo-Jo Eumerus (talk) 19:26, 16 February 2026 (UTC)
- Fake references aren't solved, but they have become less common with newer LLMs that have search capabilities and/or the ability to provide sources to them. Which doesn't mean that the text doesn't extrapolate beyond the source. Gnomingstuff (talk) 23:30, 16 February 2026 (UTC)
- OK, but this doesn't demonstrate that "this [cook up some AI agents that can spot fake references and references that don't support the content cited to them] won't work" at all. Jo-Jo Eumerus (talk) 08:15, 17 February 2026 (UTC)
- ...because the same process by which it summarizes a source is the process by which it "spots fake references"? Gnomingstuff (talk) 19:36, 17 February 2026 (UTC)
- @Gnomingstuff, Not really? Looking up information can be reduced to a similarity search on a vector database using transformers, "summarizing" is different in that it requires the generation of novel information based on the existing mappings. Sohom (talk) 19:58, 17 February 2026 (UTC)
- Thanks for the info, I didn't know that. At some point though, the information has to be actually conveyed, and then you're back to the LLM generating that. Gnomingstuff (talk) 04:26, 18 February 2026 (UTC)
- But that still doesn't support the contention - minutiae about how LLMs operate do not demonstrate that "this [cook up some AI agents that can spot fake references and references that don't support the content cited to them] won't work", because, for one thing, a LLM can operate recursively in a trial-and-error. Never mind that LLMs aren't the only type of AI out there. Jo-Jo Eumerus (talk) 16:33, 18 February 2026 (UTC)
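(Illustrative sketch, not part of the discussion: the retrieval-versus-generation distinction drawn above can be shown in a few lines. "Looking up" a source can be a pure nearest-neighbour search over embedding vectors, with no text generation involved. The three-dimensional vectors below are hand-made toys; real systems use learned transformer embeddings with hundreds of dimensions.)

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "vector database": stored sentence -> hand-made embedding.
DB = {
    "The sky is blue.": [0.9, 0.1, 0.0],
    "Cats are mammals.": [0.1, 0.9, 0.1],
    "Wikipedia launched in 2001.": [0.0, 0.2, 0.9],
}

def nearest(query_vec):
    """Return the stored sentence whose embedding is most similar to the query."""
    return max(DB, key=lambda s: cosine(DB[s], query_vec))
```

Nothing here is generated: the lookup returns an existing stored sentence. Generation only enters the picture afterwards, if the retrieved text has to be summarized or rephrased, which is the step the thread disagrees about.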
- Thanks for raising this idea, @Jo-Jo Eumerus! We are actually beginning to explore exactly that: whether AI models might be able to help us surface to editors times when a reference appears not to support the claim it is being used to cite. Feel free to subscribe to or comment on that Phabricator task if you'd like to be involved!
- As to your question, @Gnomingstuff, about whether or not this work is feasible for AI, we don't know either. So I want to emphasize that it is still at a very early stage, and if we ultimately find that it's not a suitable task for AI, we won't move forward with it. We'll seek community collaboration on the development of any features that come out of it long before they reach the deployment stage. Also, any such features will be informed by our AI strategy that centers human judgment. For instance, I could envision a future in which an editor opens up an article and a Suggestion Mode card appears next to a reference stating that an AI tool thinks it may not support the text it's being used to cite, prompting them to check it (this is one way to keep a human in the loop).
- Cheers, Sdkb‑WMF talk 19:49, 23 February 2026 (UTC)
- Given the capabilities recently released, with more coming, drastic action would be required. The following illustrate the magnitude of changes that could even have a chance
- Negotiation with LLM providers to build guardrails into models preventing their use in generating Wikipedia-style content
- Banning TA editing, and requiring new editors to submit real-time typed essay responses during sign up to establish a semantic and statistical baseline
- Limiting new accounts to character-limited edits for their first N edits, to ensure that new users are willing and able to contribute without LLM assistance
- Obviously, completely banning LLM assistance in the generation or rewriting of any content, anywhere on Wikipedia. The latest releases are nothing like what came before; it will completely overwhelm the community's ability to even identify it. The strictest measures are the minimum measures
- Of course, most of these will not happen, so we will turn the project over to the machines. Devastating stuff really NicheSports (talk) 18:10, 16 February 2026 (UTC)
- There's already been a massive amount of traffic in having to deal with LLM using editors. From my chair, an immediate first step that must be taken is to ban the use of LLMs by any account, including TAs, and make it a bannable offense after one warning. That's just the first step that must be taken. --Hammersoft (talk) 18:14, 16 February 2026 (UTC)
- Agreed this is the first step NicheSports (talk) 18:20, 16 February 2026 (UTC)
- Disagreed. This violates a fundamental Wikipedia guideline. SuperPianoMan9167 (talk) 18:22, 16 February 2026 (UTC)
- I feel like TAs are a red herring here -- maybe you are seeing a different slice of this since you focus on new edits that haven't stuck around long, but the vast majority of AI edits I see are by registered users. Gnomingstuff (talk) 23:36, 16 February 2026 (UTC)
- We immediately indef anyone who's rapidly spreading harmful content, and I'd consider LLM-generated content to be a much more severe problem than something like placing offensive images in articles. Thebiguglyalien (talk) 🛸 23:44, 19 February 2026 (UTC)
- Community consensus is to allow LLM-generated content with heavy guardrails and restrictions. Besides, most good-faith editors, using LLMs or not, would either not want to live-type their essays, or would be creeped out by the privacy concerns of letting Wikipedia access their keyboard to that level. ~2026-11404-95 (talk) 16:44, 24 February 2026 (UTC)
requiring new editors to submit real-time typed essay responses during sign up to establish a semantic and statistical baseline
You do realize someone could have their LLM open in another window and just type the words it generates into the form manually? SuperPianoMan9167 (talk) 18:15, 16 February 2026 (UTC)
- This will leave a wildly obvious statistical pattern that conclusively demonstrates the response was not written by a human in real time. Key stroke sequence/timing would solve this robustly NicheSports (talk) 18:19, 16 February 2026 (UTC)
- So we need to mandatorily require a keylogger installed on their computer before they even think about contributing to Wikipedia? Sohom (talk) 18:44, 16 February 2026 (UTC)
- No, why would that be required for this to be implemented during sign up? The data could be collected as the user types into a response box in the browser. Possibly I'm missing something. Also these are not all firm suggestions... rather examples to demonstrate how far we are from the types of measures required. I need to stop responding now apologies NicheSports (talk) 19:00, 16 February 2026 (UTC)
- Plus many people also write articles in word or in notepad. What would it do for that? ~2025-38536-45 (talk) 19:16, 16 February 2026 (UTC)
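(Illustrative sketch, not part of the discussion: the "statistical pattern" claim above can be made concrete with the crudest possible feature, the spread of inter-keystroke intervals. Human typing is bursty and variable; a script replaying text at a fixed rate produces near-uniform gaps. The threshold is invented for illustration, this is nowhere near a workable defence on its own, and real keystroke-dynamics work uses far richer features.)

```python
import statistics

def interval_stats(key_times_ms):
    """Mean and population stdev of gaps between successive key-down times."""
    gaps = [b - a for a, b in zip(key_times_ms, key_times_ms[1:])]
    return statistics.mean(gaps), statistics.pstdev(gaps)

def looks_machine_paced(key_times_ms, min_stdev_ms=5.0):
    """Flag timestamp sequences with near-uniform gaps (illustrative threshold)."""
    _, stdev = interval_stats(key_times_ms)
    return stdev < min_stdev_ms
```

As the replies note, this would fail for anyone composing in Word or Notepad and pasting the result, which is one reason the idea is unlikely to fly.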
- There's probably a set of smaller bandaid fixes:
- Gather data and collate findings about what newer LLM output tends to look like, and then publicize this better than we already are (and no I don't care about some rando using it to make their claude plugin go semi-viral). WP:AISIGNS has some things that still happen and a few that only started happening around 2025, but a lot of that page describes GPT-4 or GPT-4o era text. I'm sort of doing this but I need to add the current numbers; I've gotten bogged down in cleaning the data of template boilerplate so I haven't updated them in a while.
- Disable Newcomer Tasks, or at least the update, expand, and copyedit tasks; in practice these have just encouraged users to become AI fountains because they make numbers go up faster. They have proven to be a net negative.
- Create a tool, whether via edit filter, plugin or (optimistically thinking) actual WMF integrations with an AI detection service, that automatically flags and/or disallows suspect content. I've been tossing around doing this but nothing concrete thus far.
- Make WP:LLMDISCLOSE mandatory. I've said this before, but the most realistic best-case endgame is probably to disclose, as permanently as possible, any AI-generated content, and let readers make their own decisions based on that.
- Somehow convince more people to work on this than the handful who currently are. We need people working on detection, we need people working on fact-checking, and we need people doing the most grueling task of all which is getting yelled at by everyone and their mother about doing the former two.
- Gnomingstuff (talk) 23:56, 16 February 2026 (UTC)
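As a rough illustration of the flagging tool floated in the list above, an edit-filter-style heuristic might match known tells and only flag when several co-occur. The phrase list and threshold here are invented placeholders for illustration, not the actual WP:AISIGNS data or any deployed filter.

```python
import re

# Illustrative phrases only; a real filter would draw on WP:AISIGNS and
# current model data, not this made-up list.
AI_TELL_PATTERNS = [
    r"\bas an AI language model\b",
    r"\bit(?:'|’)s important to note that\b",
    r"\bin conclusion,\b",
    r"\bplays? a (?:crucial|vital|significant) role in\b",
    r"\brich (?:cultural )?tapestry\b",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in AI_TELL_PATTERNS]

def flag_suspect_text(added_text: str, threshold: int = 2) -> tuple[bool, list[str]]:
    """Return (flagged, matched patterns) for a proposed edit's added text.

    A single tell is weak evidence, so flag only above a threshold;
    the numbers here are arbitrary, not tuned values.
    """
    hits = [p.pattern for p in COMPILED if p.search(added_text)]
    return len(hits) >= threshold, hits
```

The co-occurrence threshold is the point: any one phrase is common in human writing, so a filter firing on single matches would bury patrollers in false positives.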
- Disabling newcomer tasks is something we could get in motion right now. Thebiguglyalien (talk) 🛸 23:49, 19 February 2026 (UTC)
- @Thebiguglyalien,@Gnomingstuff Disabling all newcomer tasks feels like taking a nuclear bomb to fight what is in general a good thing for newcomers. If you show numbers (and get consensus) I can/will support disabling the copyediting task pending the deployment of paste check or similar, I don't see a reason to disable (for example the "add a link" task or "find a reference" task) over this though. Sohom (talk) 23:57, 19 February 2026 (UTC)
- At the very least, a warning not to use LLMs in the newcomer tasks would mitigate the issue to some extent. But even that is going to be a tough sell because there are enough people who support LLM-generated content and will come along with "well technically it's not banned therefore we can't say anything that might be interpreted as discouraging it". Thebiguglyalien (talk) 🛸 00:00, 20 February 2026 (UTC)
- I don't really see how disabling one (1) feature that has proven to be a net negative for article quality is "a nuclear bomb." Gnomingstuff (talk) 00:37, 20 February 2026 (UTC)
- @Gnomingstuff I think there has been such significant effort poured into newcomer tasks by the WMF (and also community members) that disabling all newcomer tasks would probably be a significant undertaking that would see opposition from a lot of folks. This is not to mention that I think we would kinda be doing well-meaning newcomers a disservice by potentially breaking the Homepage (which relies on the infrastructure of Newcomer tasks), which is the first glimpse of contributor workflows they see after registering.
- I don't think the same opposition applies to disabling specific tasks that are a net negative; for what it's worth, I would not be averse to including a "don't use LLMs" notice in the "copyedit article" prompts. And if you can show stats that the copyediting tasks are just creating a newbie-biting machine/creating an undue burden on Wikipedians, I would support turning off the specific tasks that are the problem. Sohom (talk) 01:21, 20 February 2026 (UTC)
- (Please stop pinging me.)
- This is just sunk cost fallacy. Significant effort is poured into a lot of things that turn out to be a bad idea.
- At one point I was tracking this; will take a look at the recent stuff if I can find the link. Gnomingstuff (talk) 02:17, 20 February 2026 (UTC)
- (Sorry about the pings, will keep that in mind. I prefer to be pinged, since I lose track of discussions on large threads like this -- and kinda assumed similar for you)
- I don't see this as a sunk cost fallacy. My point is that I do think the newcomer tasks benefit well-meaning newcomers (who go on to be long-term editors); what you need to convince folks of is that the downsides of any newcomer tasks outweigh any benefits that come from engaging well-meaning newcomers (again stressing any here; I don't disagree that the copy-editing/expanding article ones are a bit of a mess, and I could pretty easily be convinced that it is in the community's interests to turn them off). What I'm also saying is that my understanding is that the WMF views this similarly (especially talking about the whole set of features called "newcomer tasks" in aggregate). I don't think the WMF will object to us turning off individual tasks that can be shown to be an undue burden on editors, as you or TBUA were suggesting the copy-editing task has become (which again is a position I kinda agree with). Sohom (talk) 02:40, 20 February 2026 (UTC)
- I just did a check of the 60 copyedit/expand task edits starting at the bottom of recent changes. tl;dr: not good!
- There's already been a massive amount of traffic in having to deal with LLM-using editors. From my chair, an immediate first step that must be taken is to ban the use of LLMs by any account, including TAs, and make it a bannable offense after one warning. That's just the first step that must be taken. --Hammersoft (talk) 18:14, 16 February 2026 (UTC)
- Maybe cook up some AI agents that can spot fake references and references that don't support the content cited to them? I think such AI would fix roughly 90% of all AI related problems we have right now (and 50% of the future ones) and many problems from non-AI edits. Jo-Jo Eumerus (talk) 17:36, 16 February 2026 (UTC)
- Which fingers should they lift? — Rhododendrites talk \\ 16:25, 16 February 2026 (UTC)
- Of these 60 edits, only 18 of them did not contain obvious issues, and only a handful of those 18 were obviously good. This means that over two-thirds of the edits were obviously not improvements, and some were drastically not improvements.
- These diffs are a little skewed since several of the ones at the top are the same person, but based on my experience I don't think this is an unrepresentative sample. (You can check others by going to pretty much any of these articles; since people rarely remove the copyedit tags, the articles just accumulate more and more questionable edits.) Gnomingstuff (talk) 03:15, 20 February 2026 (UTC)
- Hi @Gnomingstuff! I wanted to chime in on behalf of the Growth team, which is responsible for Newcomer Tasks. Overall, Newcomer Tasks arose out of a recognition that Wikipedia needs more editors, and to achieve that we first need to make editing easier for newcomers who may go on to become experienced contributors. We had found that many newcomers were unsure how they could contribute, or they tried to take on very challenging tasks like creating a new article immediately, so we developed Newcomer Tasks to point them toward easier edits and give them a little more guidance.
- Our early analysis showed positive results: Newcomers with access to the tasks were more likely than other newcomers to make their first edit, less likely to have it reverted, and more likely to stick around and continue editing long-term. This led us to develop Structured Tasks that provide even more guidance. We deployed the first of these, "Add a Link", here last September after we saw similar results and gathered community input/consensus. Currently we’re testing out "Revise Tone" (see this discussion), and the early data is looking great; here’s the feed of those edits.
- Now, to speak to your spot checks, first of all, thank you for doing them! It's really helpful to have that kind of information. The number of edits with issues in that sample certainly isn't great, but one thing it may be helpful to keep in mind is that these are all edits by newcomers, who by virtue of being new tend to struggle navigating Wikipedia's unfamiliar environment. I'd be curious how a random sample of 60 non-task newcomer edits would compare to your sample; the fact that task edits are reverted less often is one clue that it might be even worse. It shows the magnitude of the challenge we face.
- Digging into the diffs, the most frequent issue you identified (in 16/60 edits) was overlinking. This is a known issue for which we're exploring possible solutions. Beyond that, it looks like 3/60 edits had signs of AI usage, although it's certainly possible others also used AI that wasn't immediately visible. One way we could discourage this would be to add a warning to the help panel guidance for relevant tasks. However, we find that adding too many warnings quickly causes editors to just stop reading guidance and miss other important info. A more targeted approach would be to identify the moment when an editor appears to be pasting LLM-generated content into the edit window and engage with them then, which is what we hope to do with Paste Check. That'll be available here next week.
- We're hoping to continue developing and introducing structured editing and feedback opportunities so that we can help incubate the next generation of editors. That effort has already shown some fruits: There are more than 500 editors on this project who did a Newcomer Task as one of their first 10 edits and have since made over 1,000 edits. That said, I know from my own experience that patrolling newcomer edits is a lot of work, and we don't want to exacerbate that. We are always looking for your collaboration to design new tasks in a way that sets up newcomers for success without worsening the moderation burden experienced volunteers already bear.
- Cheers, Sdkb‑WMF talk 20:18, 24 February 2026 (UTC)
- Thanks for the update! In my experience the AI stuff comes more into play with expand/update, although the lines get blurred a lot, and like you said, a lot of times minor AI copyedits are either OK or pointless-but-not-bad. Gnomingstuff (talk) 20:50, 28 February 2026 (UTC)
- My general sense of "newcomer tasks" is that they are a patch that tries to pretend away the fundamental problem, namely, it takes being a little odd to decide that writing an encyclopedia is a fun idea of a hobby. There's going to be a long tail of drive-by contributors, and a much smaller number of serious enthusiasts. Even the best automated scheme for suggesting edits will only push that curve a little bit. And they run the real risk of leading people to make useless-to-detrimental small edits, because by construction they necessarily lead the least experienced editors to make more edits faster. Unless editors get feedback about which changes were good and which were not, that's not a learning experience; it's just racking up points. Stepwise Continuous Dysfunction (talk) 23:59, 20 February 2026 (UTC)
- Yes exactly, perfectly stated.
- They're also not necessarily small edits, either -- one of the more insidious things here is the task encourages people, probably inadvertently, to mislabel what they are actually doing. Recent-ish example: This edit claims to remove promotional tone in the original text. I have no idea what the hell this is referring to; the original text was not promotional. And it introduces a few subtle changes of meaning -- for instance, claiming a series of books was "inspired, in part" by his wife, when the original text implies his wife took a more active role in introducing the topic. Gnomingstuff (talk) 03:42, 21 February 2026 (UTC)
- Is the expand task still live? I assumed it was disabled when the obvious issues emerged. If it isn't, it should be disabled pronto. CMD (talk) 04:01, 20 February 2026 (UTC)
- _I_ don't personally know which fingers to lift. I'm not an expert in this field. Following my recommendations would be decidedly ill-informed. That doesn't mean I can't recognize a problem. If my furnace fails to run, I know my abode isn't warm. I don't know how to fix the furnace, but I know it's broken. Where this goes to is competence, or lack thereof, of the WMF. While there's a number of things the WMF has done well, they have also demonstrated incompetence on a grand scale on a variety of occasions that are enough to inspire awe. I don't expect the WMF to be on the front edge of the curve on dealing with this problem. They will be reactive (if at all) rather than proactive. --Hammersoft (talk) 18:13, 16 February 2026 (UTC)
Millions of jobs are being replaced by AI in the real world workforce. The project will be destroyed by it.
- [citation needed] We were told this a month ago, and two months ago, and six months ago, and a year ago, and two years ago, etc. We were told agents would replace humans in 2025. That didn't happen. We were promised AGI by 2026. That didn't happen. The AI industry is filled with broken promises, over and over and over again. Further reading here. SuperPianoMan9167 (talk) 18:29, 16 February 2026 (UTC)
- Citations aren't required for comments. A quick Google search will reveal many high-quality publications suggesting that it is different this time. I'm going to stop replying here but you definitely should too. This is not constructive NicheSports (talk) 18:40, 16 February 2026 (UTC)
- My point is that all these posts saying "the project will die from AI" are starting to sound like Chicken Little saying "the sky is falling". SuperPianoMan9167 (talk) 18:43, 16 February 2026 (UTC)
- Maybe the warnings are like chicken little, or maybe they are like the seven warnings of sea ice that the Titanic ignored. Or maybe the radar warning about a large formation of aircraft approaching Pearl Harbor on December 7, 1941. --Guy Macon (talk) 19:39, 16 February 2026 (UTC)
- Sometimes they are just balloons. ~2025-38536-45 (talk) 20:25, 16 February 2026 (UTC)
- See The Boy Who Cried Wolf. There have been so many equally hyperbolic previous predictions that were incorrect that many people are disinclined to believe you this time, and this will only increase with every mistaken assertion that this time the end really is nigh. Thryduulf (talk) 22:14, 16 February 2026 (UTC)
- We should at the very least have a contingency plan, this is something the WMF should have done already Kowal2701 (talk, contribs) 23:23, 16 February 2026 (UTC)
- You tell 'em! Look at all the hyperbolic previous predictions that this time Mount Vesuvius will erupt. We have been living here since 1945 and it's been fine... --Guy Macon (talk) 01:48, 17 February 2026 (UTC)
- Blueraspberry's recent Signpost article seems very applicable here:
The solution that I want for the graph split, and for many other existing Wikimedia Movement challenges, is simply to be able to see that there is some group of Wikimedians somewhere who have active communication about our challenges. I want to get public communication from leadership who acknowledges challenges and who has the social standing to publicly discuss possible solutions. I want to see that someone is piloting the ship upon which we all sail, and which no one would replace if it ever failed and sunk. For lots of issues at the intersection of technical development and social controversy – data management, software development, response to AI, adapting to changes in political technology regulation – I would like to see Wikimedia user leadership in development, and instead I get anxious for all the communication disfluency that we experience. Kowal2701 (talk, contribs) 14:42, 18 February 2026 (UTC)
- I suspect the (now-inactive) account Doughnuted was operated by an AI agent; it seems like the operator just prompted it to provide suggestions and the agent created and followed a plan of action (a very poor one, but still). If true, it's very far from fooling anyone. But it seems little different from the mindless copy-and-pasters we've been dealing with for years. I'm not too concerned. Ca talk to me! 09:39, 17 February 2026 (UTC)
- This seems basically good-faith too. The larger suggestions aren't really improvements to me but the smaller copyedits seem clearly good and I'm implementing some of them (this for instance is good). Gnomingstuff (talk) 17:25, 17 February 2026 (UTC)
- We should at least make it explicit that AI agents aren't exempted by the bot policy, to avoid future wikilawyering that might slow us down from actually doing something about the issue. Chaotic Enby (talk · contribs) 14:29, 18 February 2026 (UTC)
- The bot policy applies to bots and to bot-like editing (WP:MEATBOT):
For the purpose of dispute resolution, it is irrelevant whether high-speed or large-scale edits that a) are contrary to consensus or b) cause errors an attentive human would not make are actually being performed by a bot, by a human assisted by a script, or even by a human without any programmatic assistance
. So I'm not sure what clarification is needed - if someone is engaging in high-speed or high-volume editing they need to get consensus first, regardless of what technologies they do or do not use. Thryduulf (talk) 15:27, 18 February 2026 (UTC)
- There's no reason an AI agent would necessarily edit at high speed or high volume. Presumably they'd try to model real editors. CMD (talk) 15:35, 18 February 2026 (UTC)
- Then what would be the point of using an AI agent? My concern with agents (and bots) is automated POV-pushing, and that is effective when it is high-volume and high-speed. It would be a good policy to require preconsensus for high-volume edits, with bans if the user and their tools stray from the type of edit they said they would do. It won't solve all problematic edits, but it will stop some of them. WeirdNAnnoyed (talk) 12:01, 19 February 2026 (UTC)
- @WeirdNAnnoyed: regarding "It would be a good policy to require preconsensus for high-volume edits", the existing bot policy already requires this: "All bots that make any logged actions [...] must be approved for each of these tasks before they may operate. [...] Requests should state precisely what the bot will do, as well as any other information that may be relevant to its operation, including links to any community discussions sufficient to demonstrate consensus for the proposed task(s)". Thryduulf (talk) 12:34, 19 February 2026 (UTC)
- POV pushing can be very effective, perhaps more so in some cases, at low volumes and low speeds. There are also other potential uses for AI agents, such as maintaining a specific page a specific way, a short-term task, or even plain old testing/trolling. CMD (talk) 13:12, 19 February 2026 (UTC)
- AI agents could also be used in a good faith effort to improve the encyclopaedia. Whether the edits would be an improvement or not is both not relevant to the intent and also unknowable in the abstract. Thryduulf (talk) 13:23, 19 February 2026 (UTC)
- Anything could potentially be used in good faith, but I don't see this alone as justifying an exemption from our current bot policy. Chaotic Enby (talk · contribs) 13:25, 19 February 2026 (UTC)
- Not sure how to understand this reply; the purposes I noted could be used in good faith. The original point, that AI agents would not necessarily edit at high speed or high volume, is also applicable to good-faith uses. CMD (talk) 13:27, 19 February 2026 (UTC)
- @Chaotic Enby I was not suggesting anything of the sort. My main point in this discussion is that the existing bot policy already covers any bot-like editing from AI-agents.
- @CMD I think I misunderstood your final "trolling" comment (which is not possible to do in good faith, whether by human or AI) as indicating the tone of your whole comment. My apologies. I agree with your original point. Thryduulf (talk) 13:43, 19 February 2026 (UTC)
- Thanks, sorry for the misunderstanding. Chaotic Enby (talk · contribs) 13:52, 19 February 2026 (UTC)
- Agree we should be explicit, if for nothing else than to be clear that use of agentic AI falls under "bots" and not under "assisted or semi-automated editing". — Rhododendrites talk \\ 15:37, 18 February 2026 (UTC)
- The dividing line between "bot" and "assisted or semi-automated" is generally held to be whether the human individually reviews and approves each and every edit. If a use of agentic AI creates a proposed edit, shows it to the human (maybe as a diff or visual diff), and the edit is only posted after the human approves it, that would fall on the "assisted or semi-automated" side of the line (which, to be clear, could still be subject to WP:MEATBOT if the human isn't exercising their judgement in approving the edits). On the other hand, if the human instructs the AI "add such-and-such to this article" and the AI decides on the actual edit and submits it without further human review, that would almost certainly fall on the "bot" side of the line. There's probably plenty of grey area in between. Note that "high speed" or "high volume" aren't criteria for whether something is "a bot" or not, although higher-speed and higher-volume editing is more likely to draw attention and to be considered disruptive if people take issue with it. Anomie⚔ 23:57, 18 February 2026 (UTC)
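The dividing line Anomie describes can be pictured as a gate in the edit-submission path: nothing is saved unless the human approves the specific diff. This sketch uses hypothetical names throughout (ProposedEdit, submit_with_review, and the callbacks are inventions for illustration, not any real MediaWiki or bot-framework API).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProposedEdit:
    title: str
    old_text: str
    new_text: str
    summary: str

def submit_with_review(edit: ProposedEdit,
                       show_diff_and_ask: Callable[[ProposedEdit], bool],
                       save: Callable[[ProposedEdit], None]) -> bool:
    """'Assisted' editing: the human gate decides whether the edit is saved."""
    if show_diff_and_ask(edit):   # human individually reviews this diff
        save(edit)
        return True
    return False                  # rejected: nothing reaches the wiki

# An agent that calls save() directly, bypassing show_diff_and_ask, would
# fall on the "bot" side of the line and need bot approval first.
```

The gray area Anomie mentions lives inside show_diff_and_ask: if the human rubber-stamps every diff without reading it, the workflow is structurally "assisted" but behaviorally WP:MEATBOT.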
- I think it is inevitable that agents and AI will be the primary contributors to Wikipedia and eventually we'll only need a minority of editors to fix hallucinations and do general maintenance.
- This is also happening in the open source community.
- Writing articles the old way will still be an option for hobbyists, but we shouldn't be surprised if only 1% of the articles are done that way in a year or two... it's uncomfortable, but it is what it is and it doesn't make sense to resist it, IMO. Bocanegris (talk) 14:45, 20 February 2026 (UTC)
- That seems to be quite the overestimation of AI's ability to actually generate factual and/or encyclopedic content. If it somehow manages to make up a majority of edits to Wikipedia, there would have to be a bunch of overworked fact-checkers attempting to make the content factual still. It's not the same as code changes. ~2026-68406-1 (talk) 16:47, 20 February 2026 (UTC)
- When AI was introduced, it could barely write a high school-level essay. Last year, when generating articles for Wikipedia, almost every source was hallucinated, so it was useless. This year, hallucinations still happen but are less common, and people have noticed that. That's why I said that maybe in a year or two, it could be as good as a person doing this (still making mistakes, as human editors do, but that's why we'll still need people fact-checking).
- When this started, I dismissed people who said "just wait a year and it will be better" because they said that a lot and it didn't get good enough. Then it actually got good enough, so now I think twice before I assume AI will never be able to do X or Y.
- They're using this (officially) in the medical and military fields. It's replacing programmers and artists... I don't think it's so far-fetched to think it will replace Wikipedia editors too, as uncomfortable as that sounds. Bocanegris (talk) 17:10, 20 February 2026 (UTC)
- Articles with hallucinated sources are encountered way less often because said articles are being speedily deleted. Articles with hallucinated sources or communication intended for the user are still being produced, as a quick look at the deletion log suggests. SuperPianoMan9167 (talk) 17:38, 20 February 2026 (UTC)
- There has been a significant change in LLM-generated content, though; instead of outright nonexistent references, it's more common for there to be real references that do not support the content they are cited for. SuperPianoMan9167 (talk) 17:45, 20 February 2026 (UTC)
- This discussion is yet another example of those who are vehemently against any use of AI/LLMs at all not actually listening to people with different views. LLMs are not good enough, today, to write Wikipedia articles on their own. That is unarguable. However, the combination of some LLMs and an actively-engaged human co-author is able to produce a quality Wikipedia article. That there are a lot of humans who are not engaging sufficiently does not change this, in the same way that inattentive bot operators don't prove all bot operators are inattentive.
- Additionally none of the above means that LLMs won't be good enough to produce quality Wikipedia articles with less (or even no) active supervision in the future. I'm less confident that this will happen than some in this thread, particularly on the timescales they quote, but I'm not going to say it can never happen. The technology is changing fast and we should be writing rules, procedures, etc. based on the outcomes we want (well-written, verifiable encyclopaedia articles) not based on hysterical reactions to the technology as it exists in February 2026 (or in some cases as it existed in 2024). Thryduulf (talk) 18:54, 20 February 2026 (UTC)
LLMs are not good enough, today, to write Wikipedia articles on their own. That is unarguable. However, the combination of some LLMs and an actively-engaged human co-author is able to produce a quality Wikipedia article. That there are a lot of humans who are not engaging sufficiently does not change this in the same way that inattentive bot operators don't prove all bot operators are inattentive.
- Completely agree with this. The question then becomes "How can we make sure that human co-authors are actively engaged?" SuperPianoMan9167 (talk) 18:59, 20 February 2026 (UTC)
- "the combination of some LLMs and an actively-engaged human co-author is able to produce a quality Wikipedia article": assuming you're correct, that's a teeny tiny part of the editor community who would have that competence, and it can be perfectly addressed with a user right. We should be writing PAGs for the present and change them as things develop, not frustrating any attempt because of some distant possibility or empirically-unsupported notion. Kowal2701 (talk, contribs) 21:50, 20 February 2026 (UTC)
- Actually I'd say that the vast majority of the editing community have the competence. A smaller proportion have both the access to a good-enough* LLM and the desire to edit in that manner. A user right is one option from a social perspective, but my understanding from the last time this was discussed is that it would be technically meaningless.
- PAGs should work for the present but be flexible enough to also work as the technology develops without locking us in to things that only worked in 2026 without major discussions.
- *How good "good enough" is depends on how much effort the human is willing to put in and what tasks it's being put to (copyediting one section requires less investment than writing an article from scratch). My gut feeling is that the LLM output when asked to write an article about a western pop culture topic would require less work than the same model's output when asked to write an article about a topic less discussed in English on the machine-readable internet (say 18th century Thai poetry), but I've never seen this tested. Thryduulf (talk) 22:09, 20 February 2026 (UTC)
- In my opinion, the literal only way to use LLMs on Wikipedia without running afoul of PAGs or the risk of hallucination is to thoroughly check the text they produce and verify that all the information is sourceable and verifiable, or even just feed sources to it and hope that it doesn't spit out text lacking source-text integrity. It's just not a good idea to write articles backward, text first, sources second. ~2026-68406-1 (talk) 05:36, 21 February 2026 (UTC)
- The perfect AI policy should probably specifically prohibit raw or unedited LLM output, to prevent wikilawyering of 'oh I made this article with an LLM but I heavily edited it so you can't spot if it's LLM or not BWAHAHAHAHAH'. ~2026-68406-1 (talk) 05:38, 21 February 2026 (UTC)
- another reason why WP:LLMDISCLOSE should be mandatory; unironically, the most transparent I have ever seen anyone about their editing process was someone who almost definitely wasn't trying to be. (thanks to whoever showed this to me). Gnomingstuff (talk) 07:18, 21 February 2026 (UTC)
- Imo starting out with a ban while the technology is rubbish and disruptive, and then gradually loosening it as they develop and get better makes the most sense. People who would oppose any loosening on moral grounds are in the minority, I think CENT RfCs would function fine and ensure we don’t get locked into anything Kowal2701 (talk, contribs) 11:34, 21 February 2026 (UTC)
- Just to ring in here from the WMF team responsible for our work on on-wiki bot detection; we’re definitely thinking about the agentic AI issue as well. You’ll be hearing from us soon on how the bot detection trial described in that link has gone (in short: very well).
- I do want to caution that there really is no panacea for detecting AI agents. Like all bots, it is an arms race with a hefty gray area. As mentioned elsewhere in this thread, the way a lot of bot detection works these days (and how we have been implementing it here) is more than just popping up a puzzle sometimes. It involves assessing clients along a spectrum of confidence, and it can often mean deferring immediate action in that moment, so as not to provide deceptive bots the ability to efficiently reverse engineer defenses.
- So, while I don’t have a simple answer to the concern here, I mainly wanted to get across that we are very aware of AI agents as we work to dramatically level up Wikipedia’s bot detection game — and that dealing with those agents is an internet-wide not-fully-solved problem that is not unique to Wikipedia. EMill-WMF (talk) 23:17, 23 February 2026 (UTC)
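Purely as an illustration of the "spectrum of confidence" idea in EMill-WMF's comment (not the WMF's actual implementation, whose signals and thresholds are deliberately not public), a scorer might combine weak signals and defer any visible action in the gray zone so bots can't efficiently probe the thresholds. Every signal, weight, and cutoff below is an invented placeholder.

```python
from dataclasses import dataclass

# Illustrative signals only; real systems use many more, and keep them private.
@dataclass
class ClientSignals:
    edits_per_minute: float
    solved_js_challenge: bool
    account_age_days: int

def bot_confidence(s: ClientSignals) -> float:
    """Combine weak signals into a 0..1 'likely automated' score."""
    score = 0.0
    if s.edits_per_minute > 10:
        score += 0.5
    if not s.solved_js_challenge:
        score += 0.3
    if s.account_age_days < 1:
        score += 0.2
    return min(score, 1.0)

def action_for(score: float) -> str:
    """Defer visible action in the gray zone: the client gets no feedback,
    so it can't reverse-engineer the defenses by trial and error."""
    if score < 0.3:
        return "allow"
    if score < 0.8:
        return "log-and-observe"
    return "challenge"
```

The "log-and-observe" band is the key design choice from the comment: acting immediately on medium-confidence clients would hand adversaries a fast feedback loop for tuning their agents.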
+1 sapphaline (talk) 09:46, 7 March 2026 (UTC)
- Contributors here may be interested in the talkpage of this as well, User talk:TomWikiAssist. CMD (talk) 13:17, 12 March 2026 (UTC)
- Following the conclusion of that talkpage discussion, whether it was an elaborate roleplay or not, it does not seem practical to apply OUTING concerns to what an AI agent may reveal. An individual knowingly setting up an AI agent is responsible for their output, and especially for their contributions here. This is not the same as a third-party editor posting personal information obtained from an external site. CMD (talk) 02:52, 13 March 2026 (UTC)
Arbitrary Section Break: WMF needs your ideas
Hi all! I’m Sonja and I lead the contributor product teams (so Editing, Growth, Moderator Tools, Connections, as well as Language and Product Localization) at WMF. I’d like to take a step back and reflect again on the broader issue this thread is raising: Over the last year especially, we’ve had many discussions on how already big backlogs are increasing to unsustainable sizes because AI is making it easier for everyone to add content. At the same time we continue to see declines in active editors, leading again to larger backlog sizes. Only looking at one of these core problems without looking at the other is no longer an option at this point if we want to ensure the sustainability of the projects.
That being said, I see it as WMF’s role to both provide the tools to support and grow our ranks of editors and help experienced editors keep our content accurate, trustworthy, and neutral. The question is: how can we do that in a way that’s not overwhelming? Or said differently: what tools do we need to provide you all with to ensure that backlog sizes don’t keep increasing, even as we bring on new generations of volunteers? We’ve also touched on this in our discussion on meta as part of our annual planning process, and folks like @TheDJ , @pythoncoder, and lots of others helpfully chimed in with their perspectives. One of the requests we’ve heard the most often is building tools to identify AI slop - this is something we’re already working on but it can only do so much as the quality and sophistication of AI tools changes. So what I’d really like to know is, from your perspectives what other tools or processes could WMF build to keep up with the challenges we’re facing today? SPerry-WMF (talk) 19:12, 25 February 2026 (UTC)
- If we're talking about detecting AI-generated content, then I can't think of anything that would be more useful than a tool to detect common AI patterns; if we're talking about unauthorized bot use, then there are already rate limits and hcaptcha in place. sapphaline (talk) 20:36, 25 February 2026 (UTC)
- Talking about unauthorized bot use, maybe there could be some software in place to intentionally waste their power or bandwidth? Like Anubis, a script to completely hammer their CPU, or something different. sapphaline (talk) 20:44, 25 February 2026 (UTC)
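For reference, Anubis-style defenses work by handing each client a small proof-of-work puzzle: solving it is cheap for one human page view but expensive for a crawler making thousands of requests, while verification costs the server a single hash. A minimal sketch of the idea in Python (the challenge string and difficulty scheme here are illustrative, not Anubis's actual protocol):

```python
import hashlib

def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce whose SHA-256 digest of
    challenge+nonce starts with `difficulty` zero hex digits.
    Expected cost grows roughly 16x per extra digit."""
    nonce = 0
    prefix = "0" * difficulty
    while True:
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash, so checking stays cheap."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# One puzzle per request: trivial for a browser, costly at crawler scale.
nonce = solve("per-request-challenge", 4)
assert verify("per-request-challenge", nonce, 4)
```

The asymmetry is the point: the server spends one hash to verify what cost the client tens of thousands of hashes to find, so mass scraping becomes the expensive side of the exchange.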
- There's MediaWiki:Editcheck-config.json. Something that could assist with it would be commissioning research to determine AI signs for some of the recent models (Gnomingstuff said our current signs are largely from GPT-4). Also phab:T399642 for flagging WP:V failures Kowal2701 (talk, contribs) 21:31, 25 February 2026 (UTC)
- @Kowal2701: thank you for sharing this here. There's also the newly-introduced Special:EditChecks. This page offers a more visual view of the Edit Checks and Suggestions that are currently available. The suggestions that appear within the "Beta features" section of that page are available if you enable "Suggestion Mode" in beta features. Note: one of the experimental suggestions available via Suggestion Mode leverages Wikipedia:Signs of AI writing to highlight text that may include AI-generated content. PPelberg (WMF) (talk) 23:39, 25 February 2026 (UTC)
- To clarify: With the caveat that we virtually never know which exact LLMs people use and whether they enabled "research mode" or whatever, our current signs are skewed toward 2024-era LLM text (GPT-4o, o1, etc), with a few historical ones (GPT-4) and one or two that are common in newer text.
- The real problem with writing this page, though, is to write it in a way that people will A) believe, B) not misinterpret, and C) not see as the main problem. With "promotional tone," for instance, that isn't totally accurate; there's a way in which AI writes promotional text, that is distinct from pre-AI promotional text. With the "AI vocabulary" section much of it is used in specific parts of a sentence more than others, etc. The less specific you are, the more people will misinterpret; but the more granular you are, the less likely people are to believe you. Gnomingstuff (talk) 09:07, 3 March 2026 (UTC)
- This feels important enough to merit marshalling some funds for some sort of in-person workshop (or at minimum a concerted effort, with outreach, to pull stakeholders into a call of some kind, rather than a subsection of a more generalized forum that will then be hidden in an archive). I know this board in particular is likely to receive a bunch of "wiki stuff should stay on-wiki" comments, but diffuse, complicated, multistakeholder conversations are just difficult to have on-wiki sometimes, and tend towards splintering, hijacking, and tangents in ways a focused event could avoid. I dare say it would also make sense to hold at least some of these conversations at a project-by-project level. Enwiki, for example, already has an awful lot of resources, guidelines, RfC decisions, a wikiproject, etc. and probably deals with a different quantity of AI-generated content than most other projects. Commons, for its part, has its own distinct needs and constraints. YMMV. — Rhododendrites talk \\ 21:26, 25 February 2026 (UTC)
- Hi @Rhododendrites, great idea. We do regular calls on the enwp Discord where we discuss early-stage product features and brainstorm ideas together and this would be a perfect topic to talk through together. We've just scheduled a call for March 18, 20:30 UTC to focus on this topic. Would love to see you there, along with anyone else reading this thread. SPerry-WMF (talk) 15:45, 27 February 2026 (UTC)
- Thanks a lot for bringing up that question! I believe that the Edit Check team is doing a great job in this direction already, and, beyond that, something that could help would be to make it more intuitive for editors to edit without relying on third-party AI tools (which give convincing results but are prone to hallucinations). For example, parsing the content of the edit and suggesting potential sources (that could be added to the edit text in one click), or evaluating the quality of existing sources. Getting an edit reverted for being unsourced can be a very frustrating first experience, and I believe it is a major roadblock towards editor retention, so anything that helps editors do this more intuitively could really help them not turn towards the authoritative-sounding promises of generative LLMs. Chaotic Enby (talk · contribs) 21:31, 25 February 2026 (UTC)
- Thanks for these comments.
- Re: Helping to remind editors/newcomers to add sources, Reference Check now does this and was deployed by default here on Enwiki just two weeks ago (cf. thread), plus the Suggestion Mode (currently a Beta Feature, cf. announcement) has a suggestion-type that highlights existing un-cited paragraphs. As always, feedback on that Beta Feature would be greatly appreciated, so that all aspects of it can be further refined/improved before it is shown to actual newcomers.
- Re: "evaluating the quality of existing sources" - As Kowal2701 notes above, T399642 [Signal] Identify cases where reference does not support published claim is something we're planning on working on very soon, and are still gathering data/references/ideas for. There's also the closely related idea of T276857 Surface Reference survival signal which proposes providing information to editors (and perhaps readers) about how some sites/sources might need deeper consideration before they use them as references. If anyone has additional tools or info for those tasks, please do share.
- Re: "parsing the content of the edit and suggesting potential sources" - I believe that idea is immensely more complicated, especially to do so reliably, and I'm not aware of any current WMF work/notes towards it, though I have seen some other editors mention it as a potential future goal once LLMs improve sufficiently.
- HTH. Quiddity (WMF) (talk) 00:16, 26 February 2026 (UTC)
- Thanks again, great to know all of these! Chaotic Enby (talk · contribs) 00:36, 26 February 2026 (UTC)
- Love this—exactly the sort of AI-powered tools I've been advocating for in other discussions about this. Anything that can do quick checks or flag possible issues for editors has potential to be helpful. I imagine newer editors would use features more like Suggestion Mode while experienced editors would use tools more like Signal. I have reservations about LLM detectors since they have a poor track record elsewhere, but something narrowed specifically to Wikipedia's purpose might be worth exploring. I'm not against adding things that are visible to readers, but it would need to be very unintrusive; otherwise it will become a source of annoyance and mockery for readers like the donation banners. Thebiguglyalien (talk) 05:24, 27 February 2026 (UTC)
- Coming back to the question "what other tools or processes could WMF build to keep up with the challenges we’re facing today?": aside from ideas related to AI, what other tools could help editors deal with the backlogs currently being created by newcomers? I'm especially thinking about backlogs that newcomers could potentially help with (at both Enwiki and globally), but also backlogs that require more experience. Are there more large-scale ideas that should be added for consideration in next year's annual plan? Is there anything missing that you think could have a big impact on these problems? SPerry-WMF (talk) 03:14, 6 March 2026 (UTC)
- @SPerry-WMF Hello! What the community desperately needs is meta:Community_Wishlist/W448 and meta:Community_Wishlist/W449 and meta:Community_Wishlist/W450. These 3 proposals would save a tremendous amount of time. Polygnotus (talk) 20:29, 6 March 2026 (UTC)
Why aren't we using Perma.cc?
Inspired by the recent archive.today drama, I now have the same question as this HN commenter: why aren't we using Perma.cc for web archiving?
Based on my understanding, the process would be something like this:
- WMF will pay Perma.cc so that anyone with a Wikipedia account meeting the same threshold the Wikipedia Library has can archive an unlimited/very high number of pages monthly or annually.
- Automated archives will continue to be made on Wayback Machine.
- Perma.cc uses the same technology as Ghostarchive so captures are very high-fidelity; you can also upload PDF files and webpages as a screenshot if it can't crawl them. Unfortunately it doesn't provide options to archive audio or video files.
This seems like the perfect solution to our web archiving needs when Wayback Machine isn't enough. Could WMF work in this direction? sapphaline (talk) 15:23, 22 February 2026 (UTC)
- @Sapphaline Hi - I work on The Wikipedia Library at the Wikimedia Foundation, so I'm curious to learn more about this suggestion. We have partnered with organisations outside the typical paywalled-research category in the past (e.g. a translation website), so it's feasible that we could reach out to Perma.cc about this. I wanted to learn a bit more about this first though - when you say "when Wayback Machine isn't enough", could you be more specific? What is it that using Perma.cc would allow you to do that Internet Archive doesn't? Samwalton9 (WMF) (talk) 16:30, 24 February 2026 (UTC)
- Archive.today is usually a lot better than the Wayback Machine at archiving. Their archives sort of "freeze" the page, making their archives of e.g. Instagram work. They are also known for bypassing paywalls partly through giving the crawler subscriptions to the websites. I think @GreenC would explain this a lot better than I can. Aaron Liu (talk) 16:52, 24 February 2026 (UTC)
- @Aaron Liu Is that 'freezing' something that Perma.cc also does better than Wayback Machine? Samwalton9 (WMF) (talk) 17:23, 24 February 2026 (UTC)
- Sam, it's good that you are listening to volunteers, but it would be best, before any decision is made, if you could look at the whole market. There seem to be plenty of players in this space. Maybe Perma.cc offers the best service for the price, but we shouldn't just go for the first option suggested without checking first. Phil Bridger (talk) 18:06, 24 February 2026 (UTC)
- That totally makes sense - I'm only asking about Perma.cc because it was the option proposed here. I'd like to understand what makes an archiving service good or bad, since I don't know very much about the options! Samwalton9 (WMF) (talk) 19:40, 24 February 2026 (UTC)
- "What is it that using Perma.cc would allow you to do that Internet Archive doesn't" - Wayback Machine usually fails at archiving JavaScript-heavy websites, e.g. Mastodon. There's no option to upload a webpage manually - if Wayback Machine's crawler can't get it, it's unarchivable. It's also possible to directly download a webpage from Perma.cc in archived format (.warc) without using third-party tools like SingleFile. sapphaline (talk) 17:27, 24 February 2026 (UTC)
- Note: Perma.cc WARCs are uploaded to the Internet Archive and indexed by Wayback Machine (due to being under the 'web' collection). Obviously still affected by exclusions and there's currently a backlog since they turned it off when IA went down in 2024 but just noting. --Nintendofan885T&Cs apply 22:42, 24 February 2026 (UTC)
My bot has needed to remove many perma.cc links over the years; a significant percentage of them have stopped working. It's also my understanding that their target audience is institutional clients (courts, journalists, scholars) and that the archives are not for public viewing, i.e. you need a login/pass to view them. For example, the NY District Court may have an account where they upload millions of captures, and you need a pass to view them. They do offer public access accounts, but I don't think they are very interested in hosting copies of The Guardian there, and if you do, there's a good chance they won't last. They seem to offload (some?) WARCs to the Wayback Machine, probably as a backup option, but in that case you might as well use the Wayback Machine. They appear to be trying to keep a low profile on the legal radar. All web archives face this fundamental problem of copyright, and there are only a couple of strategies. Archive.today is the king of the jurisdictional arbitrage strategy; nobody does it better, and its loss was a major one, as there are no peers. -- GreenC 05:23, 25 February 2026 (UTC)
- If Perma.cc links are able to be viewed by the public, then this is a good idea. Guz13 (talk) 23:34, 27 February 2026 (UTC)
- Funnily enough, L235 and I just independently came up with a similar idea of using perma.cc with the Wikipedia library/some sort of gated way, which we mentioned to Eric Mill. I'm not sure that using perma.cc solves all our problems, but it could be part of a multifaceted solution. I think Kevin has a better sense of the upsides of using perma.cc so hopefully he chimes in ;) CaptainEek Edits Ho Cap'n!⚓ 21:40, 28 February 2026 (UTC)
Database server lag
What triggered this message:
Due to high database server lag, changes newer than N seconds may not appear in this list.
? sapphaline (talk) 11:01, 3 March 2026 (UTC)
- Better to post this kind of thing at WP:VPT. Looks like phab:T418839. Looks fixed now. –Novem Linguae (talk) 11:23, 3 March 2026 (UTC)
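For context, the lag figure behind that message comes from database replication, which MediaWiki exposes via the API (action=query&meta=siteinfo&siprop=dbrepllag); well-behaved bots also send a maxlag parameter so the servers can turn their requests away when replicas fall behind. A small sketch of how a client might read that data and decide to back off (the sample response below is illustrative; the field names follow the dbrepllag output shape):

```python
# Sketch: reading replica lag from a dbrepllag-style API response and
# applying the same back-off convention as the API's `maxlag` parameter.

def max_replica_lag(api_response: dict) -> float:
    """Return the largest reported replica lag, in seconds."""
    lags = api_response["query"]["dbrepllag"]
    return max(entry["lag"] for entry in lags)

def should_pause(api_response: dict, maxlag: float = 5.0) -> bool:
    """Back off when any replica is more than `maxlag` seconds behind,
    mirroring what the servers do to requests that carry maxlag=5."""
    return max_replica_lag(api_response) > maxlag

# Illustrative response, shaped like siprop=dbrepllag&sishowalldb=1 output.
sample = {
    "query": {
        "dbrepllag": [
            {"host": "db1001", "lag": 0},
            {"host": "db1002", "lag": 12},
        ]
    }
}
print(max_replica_lag(sample))  # 12
print(should_pause(sample))     # True
```

The "changes newer than N seconds may not appear" banner is the reader-facing version of the same signal: pages like Recent Changes are served from replicas, so when those fall behind the primary database, the newest edits simply haven't arrived there yet.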
Wikimedia Foundation Bulletin 2026 Issue 4


Highlights
Let's Talk continues
- Birthday mode: This limited-time campaign feature celebrates 25 years of Wikipedia with a birthday mascot, Baby Globe. When turned on, Baby Globe is shown on ~2,500 articles, waiting to be discovered by readers. The feature is available for all Wikipedias to customise through Community Configuration until 6 April 2026. So far 17 Wikipedias have joined in the fun.
- Wikipedia's 25th birthday party celebrated on Commons: Content from the January 15th global birthday party selected as Media of the day.
Annual Goals Progress on Infrastructure
See also newsletters: Wikimedia Apps · Growth · Product Safety and Integrity · Readers · Research · Wikifunctions & Abstract Wikipedia · Tech News · Language and Internationalization · other newsletters on MediaWiki.org
- Etherpad cleanup: For security and performance reasons, all current pads on Wikimedia’s Etherpad instance, the web-based "ephemeral" editor for real-time collaborative document editing, will be permanently deleted after 30 April. We will continue running this Etherpad instance to support events and other short-term collaboration, but will be periodically deleting data going forward. If you have content in Etherpad you want to keep, please create local backups, as data will be permanently deleted and will not be able to be recovered.
- Activity tab: The Wikipedia iOS app has rolled out the improved Activity tab to all users in version 7.9.0. A/B test results showed increased account creation among users with access to the feature. Updates include enhanced editing impact insights, module customization, and relocation of History into the Search tab.
- Reference Check: The feature Reference Check has been deployed to all Wikipedias. In A/B testing, the impact was substantial: newcomers shown Reference Check were approximately 2.2 times more likely to include a reference on desktop (or acknowledge/explain why they did not) and about 17.5 times more likely on mobile web.
- Semantic search: The Foundation has launched a limited Android mobile app experiment that tests hybrid search capabilities which can handle both semantic and keyword queries. The Phase 1 beta is now live on Greek Wikipedia. The goal is to understand whether combining meaning-based retrieval with keyword search helps readers find information more effectively. Testing will expand to English, French, and Portuguese Wikipedias in March.
- Navigation experience: The Foundation will run an experiment for mobile web users that adds a table of contents and automatically expands all article sections, to learn more about navigation issues they face. The test will be available on Arabic, Chinese, English, French, Indonesian, and Vietnamese Wikipedias.

- Site notices: Site notices (MediaWiki:Sitenotice and MediaWiki:Anonnotice) will now render on all platforms, not just on the desktop site, so users on mobile web will now see these notices as well.
- Tech News: Latest updates from Tech News week 08 and 09 include the new “Edit full page” button for people who are editing a page-section using the mobile visual editor. They also link to the 40 community submitted tasks that were resolved over the last two weeks.
- Wikifunctions: Abstract Wikipedia is going to have its public preview within the next few weeks; here is the preview.
Annual Goals Progress on Volunteer Support
See also blogs: Global Advocacy blog · Global Advocacy Newsletter · Policy blog · WikiLearn News · The Wikipedia Library · list of movement events
- Gender gap: Celebrate Women 2026 is coming! The Wikimedia Foundation will host a kick-off celebration that will serve as a welcome session for both organizers and participants on March 5 at 13:00 UTC.
- Language: New edition of the Language and internationalization newsletter highlights new feature developments and improvements in various language-related technical projects.
- Let’s Connect Learning Clinic: Watch the recordings of past learning clinics about Wikipedia’s 25th Birthday Tool and Strengthening Local-Language Admin Communities.
- Wikimedia Research Showcase: You can watch the recording of this month's research showcase, whose theme is "AI and Communities".
- Hubs: Lessons from hub pilots.
- Banners & logo policies: Wikimedia Foundation has adopted new and updated policies regarding the use of banners, logo changes, and blackouts on the projects, particularly for advocacy purposes.
- Digital Safety: The next edition of Digital Safety Office Hours will be on Mar 27 at 9:00 UTC and 19:00 UTC. The session will explore practical threat modelling: a structured way to think about risks, assess your exposure, and make informed choices.
Annual Goals Progress on Effectiveness
See also: Progress on the annual plan
- Wikimedia Enterprise: Ecosia Enriches Search Results and AI Answers with Wikimedia Enterprise.
- Human centered AI: Members of the Wikimedia Enterprise team presented on "Wikipedia in the Age of AI and Bots" at the seminar of Stanford’s Institute for Human-Centered Artificial Intelligence.
- Inclusive AI: Advancing Open, Inclusive AI with Free and Open Knowledge at the India AI Impact Summit 2026.
Other Movement curated newsletters & news
See also: Diff blog · Goings-on · Planet Wikimedia · Signpost (en) · Kurier (de) · Actualités du Wiktionnaire (fr) · Regards sur l’actualité de la Wikimedia (fr) · Wikimag (fr) · Education · GLAM · Milestones · Wikidata · Central and Eastern Europe · other newsletters
Subscribe or unsubscribe · Help translate
For information about the Bulletin and to read previous editions, see the project page on Meta-Wiki. Let foundationbulletin@wikimedia.org know if you have any feedback or suggestions for improvement!
MediaWiki message delivery 12:36, 3 March 2026 (UTC)
What happened?
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Closing prematurely to avoid having two identical discussions per WP:MULTI, please see the discussion at Wikipedia:Village pump (technical) § Meta-Wiki compromised instead FaviFake (talk) 17:25, 5 March 2026 (UTC)
Editing was disabled for over an hour while, on Meta-Wiki, the Foundation was editing many people's JS pages. Is there a reason why? Nighfidelity (talk) 17:15, 5 March 2026 (UTC)
Wikimedia Foundation banner fundraising campaign in Malaysia
Dear all,
I would like to take the opportunity to inform you all about the upcoming annual Wikimedia Foundation banner fundraising campaign in Malaysia on English Wikipedia only.
The fundraising campaign will have two components.
- We will send emails to people who have previously donated from Malaysia. The emails are scheduled to be sent throughout March.
- We will run banners for non-logged in users in Malaysia on English Wikipedia itself. The banners will run from the 2nd to the 30th of June 2026.
Prior to this, we are planning to run some tests, so you might see banners for 3-5 hours a couple of times before the campaign starts. This activity will ensure that our technical infrastructure works.
Generally, before and during the campaign, you can contact us:
- On the talk page of the fundraising team
- If you need to report a bug or technical issue, please create a phabricator ticket
- If you see a donor on a talk page, VRT or social media having difficulties in donating, please refer them to donate@wikimedia.org
Thank you and regards, JBrungs (WMF) (talk) 10:57, 9 March 2026 (UTC)
