Web scrape troubleshooting
Revision as of 11:10, 23 May 2018
List of technical issues
- The content of the web page was changed (revised): the expected web content (of the specified DOM element) became empty.
  - Use multiple sources for the same column, e.g. different HTML DOM elements that hold the same column value.
  - Back up the HTML text of the parent DOM element.
  - (Optional) Back up the complete HTML file.
- The IP was banned by the server.
  - Set a delay (sleep time) between requests, e.g. PHP: sleep - Manual, AutoThrottle extension — Scrapy 1.0.3 documentation.
  - The server responded with a status of 403 ('403 Forbidden'): change the network IP.
- CAPTCHA
- AJAX
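
The empty-content and banned-IP issues in the list above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names extract_with_fallback, by_id and polite_fetch are hypothetical, the regex-based by_id is only a stand-in for a real XPath or CSS selector, and the 2-second default delay is an arbitrary placeholder.

```python
import re
import time
import urllib.error
import urllib.request
from pathlib import Path


def by_id(element_id):
    """Build a toy extractor for the text of the element with this id.

    Hypothetical helper: a real scraper would use an XPath or CSS selector.
    """
    pattern = re.compile(r'id="%s"[^>]*>([^<]*)<' % re.escape(element_id))

    def extract(html):
        match = pattern.search(html)
        return match.group(1) if match else None

    return extract


def extract_with_fallback(html, extractors, backup_path=None):
    """Return the first non-empty value produced by any extractor.

    If every extractor comes back empty, the page layout has probably
    changed, so the raw HTML is saved (when backup_path is given) for
    later inspection.
    """
    for extract in extractors:
        value = extract(html)
        if value and value.strip():
            return value.strip()
    if backup_path is not None:
        Path(backup_path).write_text(html, encoding="utf-8")
    return None


def polite_fetch(url, delay_seconds=2.0, opener=urllib.request.urlopen):
    """Sleep before the request, then return (status_code, body)."""
    time.sleep(delay_seconds)
    try:
        with opener(url) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # Report the status code (for example 403) instead of raising,
        # so the caller can decide to stop or to switch network.
        return err.code, ""
```

For example, extract_with_fallback(html, [by_id("price-new"), by_id("price-old")], backup_path="page_backup.html") tries one element first and falls back to the other (both ids are made up for illustration). polite_fetch returns the status code, so the caller can pause or change the network IP when it sees 403 rather than keep sending requests.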
Further reading
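- Stateless: Why say that HTTP is a stateless protocol? - Stack Overflow (https://stackoverflow.com/questions/13200152/why-say-that-http-is-a-stateless-protocol)
- Stateful: What is stateful? Webopedia Definition (http://www.webopedia.com/TERM/S/stateful.html)
- List of HTTP status codes - Wikipedia (https://en.wikipedia.org/wiki/List_of_HTTP_status_codes)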
Troubleshooting of ...
- PHP, cURL, Python, Selenium, HTTP status code errors
- Database: SQL syntax debug, MySQL errors, MySQLTuner errors or PostgreSQL errors
- HTML/JavaScript: Troubleshooting of JavaScript, XPath
- Software: MediaWiki, Docker, FTP problems, online conference software
- Test connectivity for the web service, Web Ping, Network problem, Web user behavior, Web scrape troubleshooting