User talk:Headbomb/unreliable
From Wikipedia, the free encyclopedia
If you're curious about why a source is highlighted, first check common cleanup and non-problematic cases and limitations, which should answer most questions. Feel free to make requests for various tweaks or more sources to be covered below and I'll address things as best I can. − Headbomb {t · c · p · b}
A flaw in patterns
Hi again, Headbomb. As it stands, the current patterns unintentionally catch all domain names that include the target strings, e.g. there is cbn.com on your list, but it catches the unaffiliated cbn.com.cy. Perhaps adding a slash to each pattern would solve this? Daisy Blue (talk) 12:04, 22 February 2026 (UTC)
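The over-matching described above can be reproduced with a minimal sketch. It assumes the list entries are applied as unanchored regular expressions against the full URL; the script's actual matching code is not shown in this thread, so the pattern names and the sample URLs below are hypothetical.

```python
import re

# Hypothetical list entry, applied as an unanchored search (an assumption).
LOOSE = re.compile(r"cbn\.com")

# Same entry, but the domain must be followed by a URL delimiter
# (slash, port colon, query, fragment) or by the end of the string.
ANCHORED = re.compile(r"cbn\.com(?=[/:?#]|$)")

# Sample URLs chosen for illustration only.
urls = [
    "https://www.cbn.com/news/story",
    "https://cbn.com",
    "https://cbn.com.cy/en/about",
]

loose_hits = [u for u in urls if LOOSE.search(u)]
anchored_hits = [u for u in urls if ANCHORED.search(u)]

print(loose_hits)     # all three URLs, including cbn.com.cy
print(anchored_hits)  # only the two cbn.com URLs
```

Requiring a literal slash, as proposed, would also exclude cbn.com.cy, but it would miss bare links such as `https://cbn.com` that have no trailing slash, which is why the sketch accepts the end of the string as well. Neither variant guards the left side of the domain, so a host like `notcbn.com` would still match.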
Source verification
Hi @Headbomb, I'm a long-time user of this tool and also a co-author of User:Alaexis/AI Source Verification. I'm wondering if you're planning any new features? Alaexis¿question? 13:05, 6 April 2026 (UTC)
Blogspot.com
There are also versions of Blogspot links ending in ccTLDs, e.g. *.blogspot.ca and *.blogspot.de. My proposal is to replace "blogspot\.com" with "blogspot(\.\w+)+". Alfa-ketosav (talk) 11:31, 16 April 2026 (UTC)
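A quick way to compare the two patterns is a sketch like the following. It assumes the patterns are used as unanchored searches against the hostname; the `example` hostnames are made up for illustration.

```python
import re

CURRENT = re.compile(r"blogspot\.com")       # pattern quoted in the proposal
PROPOSED = re.compile(r"blogspot(\.\w+)+")   # replacement suggested above

# Hypothetical hostnames for illustration.
hosts = [
    "example.blogspot.com",
    "example.blogspot.ca",
    "example.blogspot.de",
    "example.blogspot.com.au",
]

current_hits = [h for h in hosts if CURRENT.search(h)]
proposed_hits = [h for h in hosts if PROPOSED.search(h)]

print(current_hits)   # only the hosts that contain "blogspot.com"
print(proposed_hits)  # all four hosts
```

Under these assumptions the proposed pattern covers every country-code variant. It also accepts any suffix after `blogspot.`, so it is broader than a fixed list of ccTLDs would be.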
AI hallucinated sources
Hi Headbomb, quick question on scope: is there any plan for UPSD to help detect AI-hallucinated sources, for example made-up ISBNs or legitimate-looking source links that just go to 404s in newly expanded or destubbed articles? Is that something UPSD might eventually cover, or does it really belong in a separate tool? HerBauhaus · talk 11:35, 29 April 2026 (UTC)
- There's no way for the script to detect those. There's no difference between an AI-hallucinated source, a human-made nonexistent source, and a real source that's simply hard to find. For example, if I tell you to check out Kurgzwell, Bob-Étienne (1867). The Book of Women With Horrible Ugly Moles on their Left Knee. Alan & Francis. pp. 67–85. LCCN 80-570., did I make it up? Did an AI hallucinate it? How is someone supposed to know this doesn't exist or that I've made up the LCCN, making sure it has a correct checksum? Headbomb {t · c · p · b} 21:25, 29 April 2026 (UTC)
- The made-up ISBNs, specifically, can often be detected through Category:CS1 errors: ISBN and through the error message "Check |isbn= value" produced by the citation templates for these errors. You might be careful to make up a fake id with a correct checksum, but that is a level of care often not taken by the LLMs. But because the citation templates already flag these I don't see a lot of motivation for adding them to this script. —David Eppstein (talk) 22:58, 29 April 2026 (UTC)
- Thanks, that makes sense. For the narrower problem of reliable-looking source links that are dead, broken, or point somewhere unrelated, is there already a tool for that, or is it still mostly a manual check? HerBauhaus · talk 05:46, 30 April 2026 (UTC)
- There are bots that tag deadlinks, but I don't know whether there is any automation that finds wronglinks. —David Eppstein (talk) 07:22, 30 April 2026 (UTC)
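For reference, the kind of checksum test mentioned in the thread can be sketched as follows. This is a generic implementation of the standard ISBN-10 and ISBN-13 check-digit rules, not the code used by the citation templates; the function names are hypothetical.

```python
def _clean(isbn: str) -> str:
    """Strip hyphens and spaces and upper-case a trailing 'x'."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_ok(isbn: str) -> bool:
    """ISBN-10: weights 10..1, total must be divisible by 11; 'X' means 10."""
    s = _clean(isbn)
    if len(s) != 10 or not s.isascii():
        return False
    if not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
        return False
    digits = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def isbn13_ok(isbn: str) -> bool:
    """ISBN-13: weights alternate 1 and 3, total must be divisible by 10."""
    s = _clean(isbn)
    if len(s) != 13 or not s.isascii() or not s.isdigit():
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
    return total % 10 == 0


print(isbn13_ok("978-0-306-40615-7"))  # True
print(isbn13_ok("978-0-306-40615-8"))  # False
print(isbn10_ok("0-306-40615-2"))      # True
```

A passing checksum only shows that the number is well formed. As noted in the thread, it says nothing about whether the book exists or matches the citation.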