Archive of webpage

From LemonWiki共筆
Revision as of 22:58, 16 February 2024 by Planetoid (talk | contribs)

Archiving webpages for backup purposes (Chinese version: 網頁歸檔)


Comparing the Article Backup Results of Different Social Media Websites

Medium:

  • Wayback Machine: Backups may fail, and successful backups may lack images. See the examples of failures[1] and successes[2].
  • Webpage archive: Backups can succeed (link); in one example the images are blurred[3].
  • Perma.cc: Backup succeeded in the example tested.
  • historio: The backup may not be visible because of CSS conflicts.
  • Diigo (private access): Backups are read in MHTML format.

PTT:

  • Wayback Machine: Partial success, with restrictions due to adult-content warnings[4][5].
  • Webpage archive: Successful backup.
  • Perma.cc: Backup failure due to 18+ warnings.
  • historio: Successful backup.
  • Diigo (private access): Successful backup.

Facebook:

  • Wayback Machine: The backup shows a login screen, even when the post is set to public.
  • Webpage archive: Error message "Not Found (yet?)".
  • Perma.cc: "You’re Temporarily Blocked" message.
  • historio: Using the bookmarklet had no effect; the backup was not successful.
  • Diigo (private access): Backups are read in MHTML format.

Dcard:

  • Wayback Machine: Backup failed due to an HTTP 403 error.
  • Webpage archive: Backup failed[6].
  • Diigo (private access): Backups are read in MHTML format.

YouTube:

  • Wayback Machine: (1) Videos cannot be played, (2) Comments are not visible [7]
  • Archive Today: (1) Videos cannot be played, (2) Comments are visible [8]
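
For the Wayback Machine rows above, whether any snapshot exists for a post can be checked programmatically before inspecting it by hand. The following is a minimal sketch, assuming the Internet Archive's public availability endpoint at https://archive.org/wayback/available and its JSON response shape; the function names are hypothetical and not part of any library. The network call is isolated in check() so that the parsing logic can be exercised offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url):
    """Return the availability-API URL for one page."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(payload):
    """Return the closest snapshot URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check(page_url, timeout=10):
    """Query the API over the network and return a snapshot URL or None."""
    with urlopen(build_query(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

A returned snapshot URL only shows that a capture exists. As the Facebook and YouTube results show, the capture may still be a login screen or lack playable video, so it has to be opened and checked.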


Desktop tools

| Approach checked | File type | Cached media (images, Flash...) | Clickable text with embedded links | Kept the saved time* | Kept the original URL | Comments |
|---|---|---|---|---|---|---|
| Fx 2.0: Save as HTML (kept images) | html | saved in a separate directory | yes | yes | no | |
| Fx 2.0: Save as HTML (html only) | html | no | yes | yes | no | |
| Fx 2.0 + ScrapBook 1.2 | html | saved in a separate directory | yes | yes* | yes | |
| Fx 1.5 + MAF 0.6.3: Save as MAF MHT Archive | mht | embedded into a single file | yes | yes | yes | |
| Fx 2.0 + Google Toolbar for Firefox 3: Send with Gmail | html | no; the original media URLs are used | yes | yes | yes | |
| IE 6.0.x: Save as MHT | mht | embedded into a single file | yes | yes | yes | |
| Acrobat PDFMaker 7.0.5 | pdf | embedded into a single file | yes | yes | yes | |
| Print to Adobe Acrobat Printer | pdf | embedded into a single file | no | yes | yes | |
| Print to pdfFactory Pro v2.45 | pdf | embedded into a single file | no | yes | yes | |
| IE + Adobe Acrobat 7: Convert web page to PDF | pdf | embedded into a single file | yes | yes | no | |
| Unipage Unifier 1.0 RC3 (kept images or Flash...) | html | embedded into a single file | yes | yes | no | |
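
The single-file entries for the mht format rely on MHTML being a MIME multipart/related message, with each resource carried as its own part and labelled by a Content-Location header. The sketch below illustrates that idea with Python's standard email package. It is an illustration only, not how MAF or Internet Explorer produce their files, and build_mht and list_locations are hypothetical names.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(page_url, html, images):
    """Bundle a page and its images into one multipart/related message.

    images maps each image URL to a (subtype, raw_bytes) pair,
    for example ("png", b"...").
    """
    archive = MIMEMultipart("related", type="text/html")
    archive["Subject"] = "Snapshot of " + page_url

    page = MIMEText(html, "html", "utf-8")
    page["Content-Location"] = page_url  # records the original URL
    archive.attach(page)

    for image_url, (subtype, data) in images.items():
        part = MIMEImage(data, _subtype=subtype)
        part["Content-Location"] = image_url
        archive.attach(part)
    return archive.as_bytes()


def list_locations(raw):
    """Return the Content-Location of every part in a bundled archive."""
    return [part["Content-Location"] for part in message_from_bytes(raw).get_payload()]
```

Because the page URL travels inside the file as a header, this layout also explains why the mht rows can report "yes" for keeping the original URL.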


Online services

| Approach checked | File type | Cached media (images, Flash...) | Clickable text with embedded links | Kept the saved time* | Kept the original URL | Information organization / Comments |
|---|---|---|---|---|---|---|
| BackupUrl.com (cache image) | html | yes | yes | yes | yes | no (visited: 2009-04-09) |
| Evernote Web (no cache image) | html | no; the original media URLs are used | yes | yes | yes | tags; also offers sync software (visited: 2008-03-29) |
| Furl (no cache image) | html | no; the original media URLs are used | yes | yes | yes | Topic (tags) |
| Yahoo My Web 2.0 Beta (no cache image) | html | no; the original media URLs are used | yes | yes | yes | tags |
| Google Notebook (no cache image) | html | no; the original media URLs are used | yes | yes | yes | tags |
| "Jump" Knowledge | html | no; the original media URLs are used | yes | yes | yes | You can annotate webpages and share the link with others. |
| toread (no cache image) | html (Email) | no; the original media URLs are used (media written with relative paths appear normally) | yes | yes | yes | |
| WebCite (access error: 2007-05-07) | html | no; the original media URLs are used (media written with relative paths appear normally) | yes | yes | yes | You can browse or back up the same page at different times. |
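
The "cached media" column can be checked for any saved HTML file by listing which media references still point at a remote host. The following is a rough sketch using only the standard library; the tag list and the names MediaCollector and split_media are assumptions made for illustration, and references set from CSS or scripts are not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class MediaCollector(HTMLParser):
    """Collect src attributes of common media tags."""

    MEDIA_TAGS = {"img", "embed", "audio", "video", "source"}

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in self.MEDIA_TAGS:
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def split_media(saved_html):
    """Return (remote, local) lists of media references in a saved page."""
    collector = MediaCollector()
    collector.feed(saved_html)
    remote, local = [], []
    for src in collector.sources:
        # A network location means the media is still loaded from its original host.
        (remote if urlparse(src).netloc else local).append(src)
    return remote, local
```

A non-empty remote list corresponds to the "no; the original media URLs are used" entries: the saved copy loses those images once the original host removes them.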

* About "kept the saved time": Most files already carry this property as a file timestamp, but it changes easily when the file is copied to different storage media or transferred by FTP to another location. Fx 1.5 + ScrapBook 0.18.4 instead stores the saved time separately as metadata.
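
Storing the saved time as data rather than relying on the file timestamp can be sketched as follows. This is an assumption-laden illustration, not ScrapBook's actual storage format: the ".meta.json" sidecar file and the function names are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_with_metadata(directory, name, html, source_url, saved_at=None):
    """Write the page plus a sidecar JSON file holding its URL and saved time."""
    saved_at = saved_at or datetime.now(timezone.utc)
    page_path = Path(directory) / (name + ".html")
    meta_path = Path(directory) / (name + ".meta.json")
    page_path.write_text(html, encoding="utf-8")
    record = {"source_url": source_url, "saved_at": saved_at.isoformat()}
    meta_path.write_text(json.dumps(record), encoding="utf-8")
    return page_path, meta_path


def read_saved_time(meta_path):
    """Read the saved time back from the sidecar file, ignoring file timestamps."""
    record = json.loads(Path(meta_path).read_text(encoding="utf-8"))
    return datetime.fromisoformat(record["saved_at"])
```

Because the time is part of the file contents, it survives a copy or an FTP transfer that resets the modification time of the page file.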


The winner is Firefox!