The Wayback Machine (pronounced [ˈwejbak maˈʃin]) is a service and database that holds copies of a large number of Internet pages and sites. As a result, it is also possible to consult how those pages have changed over time.
The Wayback Machine is simple to use: type the address (URL) of a web page to see the most recent copy saved in the archive. To see what the page looked like at some point in the past, select the year and date of the capture you want to visit. A calendar at the top of the screen shows the captures graphically over time; the length of the bars in the graph indicates the months in which the most copies were made.
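Choosing a date can also be expressed directly in the address of an archived copy. The sketch below assumes the commonly seen replay address pattern, a `https://web.archive.org/web` prefix followed by a 14-digit timestamp and the original URL; the helper name `snapshot_url` is hypothetical, and the service decides which stored capture is actually served for that timestamp.

```python
from datetime import datetime

# Assumption: public replay prefix of the Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(page_url: str, when: datetime) -> str:
    """Build a replay address asking for a capture of page_url around `when`.

    The timestamp is formatted as YYYYMMDDhhmmss (14 digits).
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Ask for the page as it was around 15 June 2010.
    print(snapshot_url("http://example.com/", datetime(2010, 6, 15)))
```

This only builds the address; whether a capture exists near that date depends on what the archive has stored.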
The way the site stores the content of a web page is very simple but ingenious: it stores only the HTML source code and not the images; therefore, when an image server removes an image from the original website, the image is not reproduced in the copy and is instead marked as a 404 error. In 2012 it contained 10 petabytes of information and was growing by around 20 terabytes per month. In October 2019 its storage exceeded 20 petabytes, and in December 2020 it exceeded 70 petabytes.
However, the Wayback Machine is far from being a complete copy of the Internet, as several sites opt out of being indexed and recorded, for example by using a robots.txt file containing:
User-agent: ia_archiver
Disallow: /
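The effect of those two directives can be checked with the robots.txt parser in Python's standard library. This is a minimal sketch of how a well-behaved crawler might interpret the rule; the helper name `may_fetch`, the second user agent, and the example.com address are illustrative assumptions, and the sketch says nothing about how the Wayback Machine's own crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule quoted in the text, one directive per line.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"  # hypothetical page
    # The archive crawler named in the rule is excluded from the whole site.
    print("ia_archiver:", may_fetch(ROBOTS_TXT, "ia_archiver", page))
    # A crawler not named in the file is unaffected by this rule.
    print("SomeOtherBot:", may_fetch(ROBOTS_TXT, "SomeOtherBot", page))
```

Because the rule names only `ia_archiver`, other crawlers remain free to fetch the site unless the file lists them as well.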
Wayback CDX Server API
Since November 2015, the Wayback Machine has offered a capture indexing service that makes it possible to quickly look up the capture history of each URL, both in its own format and in JSON. This service, called the Wayback CDX Server API, is a project in beta phase; its source code and user manual are hosted on GitHub.
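A query against this service might look like the sketch below. It assumes the endpoint `http://web.archive.org/cdx/search/cdx` and the `url`, `output` and `limit` parameters described in the project's manual, and that the JSON output is a list of rows whose first row holds the field names. The helper names are hypothetical, and the sample row is made up for illustration rather than taken from a real capture.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public CDX endpoint as documented in the project's manual.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url: str, limit: int = 5) -> str:
    """Build a CDX query address requesting JSON output for page_url."""
    params = {"url": page_url, "output": "json", "limit": limit}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def rows_to_records(rows: list) -> list:
    """Turn JSON rows (header row first) into one dict per capture."""
    if not rows:
        return []
    header, *captures = rows
    return [dict(zip(header, capture)) for capture in captures]


def fetch_captures(page_url: str, limit: int = 5) -> list:
    """Query the service over the network and return capture records."""
    with urlopen(build_cdx_query(page_url, limit), timeout=30) as response:
        return rows_to_records(json.load(response))


if __name__ == "__main__":
    # Offline demonstration with a made-up response (not real capture data).
    sample = [
        ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
        ["com,example)/", "20020120142510", "http://example.com:80/", "text/html", "200", "EXAMPLEDIGEST", "1792"],
    ]
    print(build_cdx_query("example.com"))
    for record in rows_to_records(sample):
        print(record["timestamp"], record["statuscode"], record["original"])
```

Calling `fetch_captures("example.com")` would perform the real request; it is kept out of the demonstration so the sketch runs without network access.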
In January 2017 a plugin was developed for both the Chromium and Google Chrome browsers. It lets you save the web page being viewed, check whether that page has previously been saved in the Wayback Machine, and even run a quick search on Twitter, among other features. The plugin is listed in the Chrome Web Store.
Throughout its history, the Wayback Machine has experienced a series of incidents, of which the most important have been the following:
In October 2020, the Wayback Machine site stopped working.