Archive of webpage

From LemonWiki共筆

Archiving webpages for backup purposes




Comparing the Article Backup Results of Different Social Media Websites

Medium:

  • Wayback Machine: Backups may fail, and successful backups may lack images. See the failure [1] and success [2] examples.
  • Webpage archive: Backups may succeed (link); in one example the images are blurred [3].
  • Perma.cc: Successful backup (an example is available).
  • historio: When the backup loads, the content is visible for a few seconds and then the page goes blank, apparently because of a CSS conflict. The backup can only be read in mhtml format.
  • Diigo (private access): Backups are read in mhtml format.
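
An mhtml file is a MIME multipart message, so a backup in that format can be inspected without a browser. The following is a minimal sketch using Python's standard email package. The function names (list_mhtml_parts, extract_html) and the SAMPLE bytes are made up for illustration, and real exports from historio or Diigo may be structured differently.

```python
import email
from email import policy


def list_mhtml_parts(raw: bytes):
    """Return (content type, Content-Location) for every leaf part of an MHTML archive."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        location = part.get("Content-Location")
        parts.append((part.get_content_type(),
                      str(location) if location is not None else None))
    return parts


def extract_html(raw: bytes) -> str:
    """Return the first text/html part of the archive as a string ('' if none)."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return part.get_content()
    return ""


# Hypothetical two-part archive: one HTML page and one (truncated) image.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; type="text/html"; boundary="----demo"\r\n'
    b"\r\n"
    b"------demo\r\n"
    b'Content-Type: text/html; charset="utf-8"\r\n'
    b"Content-Location: https://example.com/post\r\n"
    b"\r\n"
    b"<html><body><p>Hello</p></body></html>\r\n"
    b"------demo\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: https://example.com/a.png\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"------demo--\r\n"
)

if __name__ == "__main__":
    for content_type, location in list_mhtml_parts(SAMPLE):
        print(content_type, location)
```

For a saved file, read it in binary mode and pass the bytes to the same functions. Listing the parts is a quick way to check whether images were actually cached in the archive or only referenced by their original URL.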

PTT:

  • Wayback Machine: Partial success; some backups are blocked by the adult content (18+) warning [4][5].
  • Webpage archive: Successful backup.
  • Perma.cc: Backup failed because of the 18+ warning.
  • historio: Successful backup.
  • Diigo (private access): Successful backup.

Facebook:

  • Wayback Machine: The backup shows a login screen, even when the post is set to public.
  • Webpage archive: Error message "Not Found (yet?)".
  • Perma.cc: "You’re Temporarily Blocked" message.
  • historio: The bookmarklet had no effect; the backup failed.
  • Diigo (private access): Backups are read in mhtml format.

Dcard

YouTube


Desktop tools

| Approach | File type | Cached media (images, Flash...) | Clickable text with embedded links | Kept the saved time* | Kept the original URL | Comments |
|---|---|---|---|---|---|---|
| Fx 2.0: Save as HTML (kept images) | html | saved in a separate directory | yes | yes | no | |
| Fx 2.0: Save as HTML (html only) | html | no | yes | yes | no | |
| Fx 2.0 + ScrapBook 1.2 | html | saved in a separate directory | yes | yes* | yes | |
| Fx 1.5 + MAF 0.6.3: Save as MAF MHT Archive | mht | embedded into a single file | yes | yes | yes | |
| Fx 2.0 + Google Toolbar for Firefox 3: Send with Gmail | html | no, they use the original URL of media | yes | yes | yes | |
| IE 6.0.x: Save as MHT | mht | embedded into a single file | yes | yes | yes | |
| Acrobat PDFMaker 7.0.5 | pdf | embedded into a single file | yes | yes | yes | |
| Print to Adobe Acrobat Printer | pdf | embedded into a single file | no | yes | yes | |
| Print to pdfFactory Pro v2.45 | pdf | embedded into a single file | no | yes | yes | |
| IE + Adobe Acrobat 7: Convert web page to PDF | pdf | embedded into a single file | yes | yes | no | |
| Unipage Unifier 1.0 RC3 (kept images or flash...) | html | embedded into a single file | yes | yes | no | |


Online services

| Approach | File type | Cached media (images, Flash...) | Clickable text with embedded links | Kept the saved time* | Kept the original URL | Information organization / Comments |
|---|---|---|---|---|---|---|
| BackupUrl.com (cache image) | html | yes | yes | yes | yes | no (visited: 2009-04-09) |
| Evernote Web (no cache image) | html | no, they use the original URL of media | yes | yes | yes | tags; it also offers sync software (visited: 2008-03-29) |
| Furl (no cache image) | html | no, they use the original URL of media | yes | yes | yes | Topic (tags) |
| Yahoo My Web 2.0 Beta (no cache image) | html | no, they use the original URL of media | yes | yes | yes | tags |
| Google Notebook (no cache image) | html | no, they use the original URL of media | yes | yes | yes | tags |
| "Jump" Knowledge | html | no, they use the original URL of media | yes | yes | yes | You can annotate the webpages and share the link with others. |
| toread (no cache image) | html (Email) | no, they use the original URL of media (media written with a relative path will appear normally) | yes | yes | yes | |
| WebCite (access error: 2007-05-07) | html | no, they use the original URL of media (media written with a relative path will appear normally) | yes | yes | yes | You can browse or back up the same page at different times. |

* About "kept the saved time": Most files already carry this property as their file timestamp. It changes easily when the file is copied to different storage media or transferred by FTP to another location. Fx 1.5 + ScrapBook 0.18.4, however, saves this property separately as metadata.
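
To illustrate why separate metadata survives copying while a file timestamp may not, here is a minimal sketch that writes the saved time and the original URL to a JSON sidecar file next to the archive. It is not how ScrapBook stores its metadata; the ".meta.json" suffix, the field names, and the functions write_sidecar and read_sidecar are assumptions made for this example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention


def write_sidecar(archive_path, original_url, saved_at=None):
    """Record the original URL and the saved time next to the archived file."""
    saved_at = saved_at or datetime.now(timezone.utc)
    sidecar = Path(str(archive_path) + SIDECAR_SUFFIX)
    record = {
        "original_url": original_url,
        "saved_at": saved_at.isoformat(),  # stored as text, independent of file mtime
    }
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


def read_sidecar(archive_path):
    """Load the metadata written by write_sidecar."""
    sidecar = Path(str(archive_path) + SIDECAR_SUFFIX)
    return json.loads(sidecar.read_text(encoding="utf-8"))
```

Because the saved time is written inside the sidecar as text, copying both files to another disk or uploading them by FTP leaves the recorded value unchanged, even if the file system assigns new modification times.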


The winner is Firefox!