Web scrape troubleshooting: Difference between revisions

From LemonWiki共筆
[[Category:Programming]]
[[Category:Data science]]
[[Category:Data collecting]]

Revision as of 14:39, 22 June 2016

List of technical issues

  1. Website revision: the expected content of a DOM element comes back empty after the site changes its markup
    • Collect the same column from multiple sources, e.g. different DOM elements that carry the same value.
    • Back up the HTML text of the parent DOM element.
    • (Optional) Back up the complete HTML file.
  2. Server IP ban
  3. CAPTCHA
  4. AJAX
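
The two mitigations listed under item 1 (several sources for the same column, plus an HTML backup) could be sketched roughly as follows. This is only an illustration using the Python standard library: the helper names `regex_extractor` and `extract_with_fallback`, the `PRICE_EXTRACTORS` list, and the example markup patterns are all hypothetical and not taken from any particular site or scraper. Regular expressions stand in for whatever DOM selectors the real scraper uses.

```python
import re
from pathlib import Path
from typing import Callable, Iterable, Optional

# An extractor takes raw HTML and returns the column value, or None if empty.
Extractor = Callable[[str], Optional[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Build an extractor returning the first capture group, or None if empty."""
    compiled = re.compile(pattern, re.DOTALL)

    def extract(html: str) -> Optional[str]:
        match = compiled.search(html)
        if match is None:
            return None
        value = match.group(1).strip()
        return value or None  # treat an empty string as "no value"

    return extract


def extract_with_fallback(
    html: str, extractors: Iterable[Extractor], backup_path: Path
) -> Optional[str]:
    """Try each source in order; if all are empty, back up the HTML."""
    for extract in extractors:
        value = extract(html)
        if value:
            return value
    # Every source came back empty: keep the HTML so the new page
    # structure can be inspected later.
    backup_path.parent.mkdir(parents=True, exist_ok=True)
    backup_path.write_text(html, encoding="utf-8")
    return None


# Hypothetical example: the same "price" column read from two DOM locations.
PRICE_EXTRACTORS = [
    regex_extractor(r'<span class="price">(.*?)</span>'),
    regex_extractor(r'<meta itemprop="price" content="(.*?)"'),
]
```

Calling `extract_with_fallback(html, PRICE_EXTRACTORS, Path("backup/page.html"))` returns the first non-empty value. When it returns `None`, the file at the backup path holds the HTML that was passed in, which may be the parent element's markup or the whole page, depending on what the caller supplies.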