Archive.today
Web archive
From Wikipedia, the free encyclopedia
archive.today (formerly archive.is) is a web archiving website that saves snapshots on demand. It has support for JavaScript-heavy sites such as Google Maps and X.[2] archive.today records two snapshots: one replicates the original webpage including any functional live links; the other is a screenshot of the page.[3]
Screenshot of the archive.today home page

| Type of site | Web archiving |
|---|---|
| Available in | Multilingual |
| Registration | No |
| Launched | 16 May 2012[1] |
History
archive.today was founded in 2012 as a web archive. It allegedly registered its trademark in the Czech Republic in 2013.[4] The site originally branded itself as archive.today, but changed the primary mirror to archive.is in May 2015.[5] It began to deprecate the archive.is domain in favor of other mirrors in January 2019.[6] According to the archive.today blog, the website had saved about 500 million pages by 2021,[7][8] 700 terabytes in total size.[9]
In July 2013, archive.today began supporting the API of the Memento Project at Los Alamos National Laboratory.[10][11] Due to budget constraints at LANL, the Memento Project was disestablished in September 2025.[citation needed]
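Under the Memento protocol (RFC 7089), a client asks a TimeGate for the capture closest to a given date by sending an Accept-Datetime request header. The following is a minimal sketch of building such a request. The helper name `memento_request` and the base URL `archive.example` are hypothetical placeholders, since the endpoint archive.today exposes is not stated here, and no network request is made.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumption: placeholder TimeGate base URL, used for illustration only.
TIMEGATE_BASE = "https://archive.example/timegate/"


def memento_request(target_url: str, when: datetime) -> tuple[str, dict[str, str]]:
    """Build the URL and headers for a Memento TimeGate lookup (RFC 7089)."""
    # Accept-Datetime carries an HTTP-date in GMT, so normalise to UTC first.
    # A naive datetime would be interpreted as local time by astimezone().
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return TIMEGATE_BASE + target_url, headers


url, headers = memento_request(
    "https://example.com/",
    datetime(2013, 7, 1, 12, 0, tzinfo=timezone.utc),
)
print(url)
print(headers["Accept-Datetime"])  # Mon, 01 Jul 2013 12:00:00 GMT
```

A compliant TimeGate would normally answer with a redirect to the selected capture, whose response carries a Memento-Datetime header.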
In 2018, the Netherlands-based investigative journalism group Bellingcat recommended archive.today as a tool for preserving evidence from open-source investigations.[12]
In early 2023, a team of researchers at the University of Amsterdam identified archive.today as the most-used open-access archiving service among fact-checking organisations, based on the European Digital Media Observatory's dataset on the Russo-Ukrainian war.[13][14]
In August 2023, the Wikitravel Press co-founder and Google Cloud executive Jani Patokallio published an investigation on his blog Gyrovague regarding archive.today's funding sources and the founder's identity. He portrayed the website as a paywall bypassing service and compared it to Alexandra Elbakyan's Sci-Hub.[8][15] He alleged that archive.today employed a botnet and suggested that it was based in Russia.[8][16][17] A different investigation in 2024 alleged that a New York City software developer was the website's operator.[16]
In early October 2025, the Cypriot ad blocking company AdGuard, a DNS provider to archive.today, received demands from a newly registered opaque French entity, called Web Abuse Association Defense, to block all of the archive.today domains. According to AdGuard, the entity tried to exploit the French Bill for a Digital Republic with the use of forged legal records and falsely alleged that archive.today had failed to act on requests to remove archived child pornography content.[18][19]
On 30 October 2025, the US Federal Bureau of Investigation (FBI) subpoenaed archive.today's Canadian domain registrar, Tucows. The subpoena stated its purpose was to identify the owner(s) of the archive.today domain name, and that it was part of a criminal investigation conducted by the FBI, the nature of which was not disclosed.[20][16] The agency instructed Tucows not to disclose the subpoena, but archive.today obtained and published it.[21] Various commentators noted that the order followed the shutdown of the paywall bypassing service 12ft by the News Media Alliance in July 2025 and the removal of 749 million Anna's Archive URLs from Google search results.[22][17][21][23] The Catalan daily Ara interpreted the action as part of a campaign to selectively criminalize anonymous digital archives reliant on micro-donations (such as Anna's Archive), even though industrial datasets used for training large language models (such as the Common Crawl, financed by OpenAI and Anthropic) also fail to compensate content creators and owners.[9] News coverage of the subpoena mentioned Patokallio's report.[15][24]

On 10 January 2026, the website operator asked Patokallio to take down his 2023 blog entry about archive.today "for a few months".[25] Around 11 January 2026,[24] archive.today inserted malicious JavaScript code in its CAPTCHA page to involve visitors in a DDoS attack against Gyrovague. Over the following weeks, the archive.today blog posted public criticisms of Patokallio, accusing him of doxing the website's operator, and engaged in personal attacks against him. Emails released by Patokallio show archive.today threatening him with AI-generated pornography.[15][25] On 20 February 2026, the English Wikipedia banned links to archive.today, citing the DDoS attack against Patokallio and evidence that archived content was tampered with to insert Patokallio's name.[26] The decision was weighed against concerns about maintaining content verifiability,[26] as archive.today was the second-largest archiving service used across the Wikimedia Foundation's projects.[27] The Wikimedia Foundation had stated its readiness to take action regardless of the community verdict.[26][27] Patokallio expressed his satisfaction with the outcome.[4] The website operator said they preferred the attention resulting from Wikipedia's "fifth" ban to the website being likened to 12ft, and announced their intention to "scale down 'the DDoS'".[28][29]
Features
Archiving
archive.today can capture individual pages in response to explicit user requests.[30][31][32] Since its beginning, it has supported crawling pages with URLs containing the now-deprecated hash-bang fragment (#!).[33] The website records only text and images, excluding XML, RTF, spreadsheet (xls or ods) and other non-static content. However, videos for certain sites, like Twitter, are saved.[34] It keeps track of the history of snapshots saved, requesting confirmation before adding a new snapshot of an already saved page.[35][36] Once a web page is archived, it cannot be deleted directly by any Internet user.[37] Users can download archived pages as a ZIP file, except pages archived since 29 November 2019,[38] when archive.today changed its browser engine from PhantomJS to Chromium (non-headless).[39] archive.today does not obey robots.txt because it acts "as a direct agent of the human user."[32]
Pages are captured at a browser width of 1,024 pixels. CSS is converted to inline CSS, removing responsive web design and selectors such as :hover and :active. Content generated using JavaScript during the crawling process appears in a frozen state.[40]
HTML class names are preserved inside the old-class attribute.
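The transformation described above can be illustrated with a rough sketch. It assumes a toy stylesheet given as a mapping from class name to declarations, and the function name `inline_styles` is hypothetical. It does not reflect archive.today's actual implementation, which is not documented here.

```python
def inline_styles(attrs: dict[str, str], rules: dict[str, str]) -> dict[str, str]:
    """Fold class-based rules into an inline style and keep names in old-class."""
    # Copy every attribute except class, which is replaced by old-class below.
    out = {k: v for k, v in attrs.items() if k != "class"}
    classes = attrs.get("class", "").split()

    # Only plain class rules are looked up. A key such as "btn:hover" never
    # equals a class token, so pseudo-class rules are dropped, matching the
    # loss of :hover and :active described in the text.
    declarations = [rules[c] for c in classes if c in rules]

    # An existing inline style is appended last so that it takes precedence.
    if attrs.get("style"):
        declarations.append(attrs["style"])

    if declarations:
        out["style"] = ";".join(d.strip().rstrip(";") for d in declarations)
    if classes:
        out["old-class"] = " ".join(classes)
    return out


rules = {"btn": "color: red; padding: 4px", "btn:hover": "color: blue"}
result = inline_styles({"class": "btn wide", "href": "/x"}, rules)
print(result)
```

In this example the element keeps its `href`, gains a `style` attribute holding the `btn` declarations, and records `btn wide` in `old-class`; the `:hover` rule has no inline equivalent and is lost.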
When text is selected, a JavaScript applet generates a URL fragment seen in the browser's address bar that automatically highlights that portion of the text when visited again.[citation needed] Web pages can be duplicated from archive.today to web.archive.org as second-level backup, but archive.today does not save its snapshots in WARC format. The reverse—from web.archive.org to archive.today—is also possible,[41] but the copy usually takes more time than a direct capture.

While a page is being saved, a list of URLs for individual page elements and their content sizes, HTTP statuses and MIME types is shown. This list can only be viewed during the crawling process.[citation needed] Users can ask the owner, via his blog, to remove advertisements and popups or to expand links in archived pages.[42]
Search
The search toolbar supports advanced keyword operators, using * as the wildcard character. Paired quotation marks restrict the search to an exact sequence of keywords present in the title or in the body of the webpage, whereas the insite operator restricts it to a specific Internet domain.[43] While saving a dynamic list, the archive.today search box shows only a result that links the previous and the following section of the list (e.g. 20 links per page).[44] The other web pages saved are filtered, and sometimes may be found by one of their occurrences.[35][clarification needed] The search feature is backed by Google CustomSearch. If it delivers no results, archive.today attempts to use Yandex Search.[45]
Bypassing paywalls
archive.today users often employ the service to bypass paywalls, similarly to the defunct website 12ft.[20][46]
Worldwide availability
Australia and New Zealand
In March 2019, the site was blocked for six months by several internet providers in Australia and New Zealand in the aftermath of the Christchurch mosque shootings in an attempt to limit distribution of the footage of the attack.[47][48]
China
According to GreatFire.org, archive.today has been blocked in mainland China since March 2016,[49] archive.li since September 2017,[50] archive.fo since July 2018,[51] as well as archive.ph since December 2019.[52]
Finland
On 21 July 2015, the operators blocked access to the service from all Finnish IP addresses, stating on Twitter that they did this in order to avoid escalating a dispute they allegedly had with the Finnish government.[53][54]
Russia
In 2016, the Russian communications agency Roskomnadzor began blocking access to archive.is from Russia.[55][56][54]
See also
- Digital preservation – Practice to keep digital assets accessible in long term
- Link rot – URLs ceasing to function
- List of web archiving initiatives