Web scrape troubleshooting: Difference between revisions

From LemonWiki共筆
* Stateless: [https://stackoverflow.com/questions/13200152/why-say-that-http-is-a-stateless-protocol Why say that HTTP is a stateless protocol? - Stack Overflow]
* Stateful: [http://www.webopedia.com/TERM/S/stateful.html What is stateful? Webopedia Definition]
* [https://en.wikipedia.org/wiki/List_of_HTTP_status_codes List of HTTP status codes - Wikipedia]
{{Template:Troubleshooting}}


[[Category:Programming]]
[[Category:Data science]]
[[Category:Data collecting]]

Revision as of 11:10, 23 May 2018

List of technical issues

  1. Content of the web page was changed (revised): the expected web content (of the specified DOM element) became empty.
    • Scrape the same field from multiple sources, such as different DOM elements that carry the same value.
    • Back up the HTML text of the parent DOM element.
    • (Optional) Back up the complete HTML file.
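  2. The IP address was banned by the server.
  3. CAPTCHA: the server requires a challenge to be solved before it returns the content.
  4. AJAX: content loaded by JavaScript after the initial page load is missing from the raw HTML response.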
Further reading

