Web scrape troubleshooting
Revision as of 13:16, 27 December 2015
List of technical issues:
- Website revision: the expected web content (of a DOM element) came back empty.
  - Use multiple sources for the same column, such as different HTML DOM elements that carry the same column value.
  - Back up the HTML text of the parent DOM element.
  - (Optional) Back up the complete HTML file.
- Server IP ban
  - Set a delay (sleep time) between page requests, e.g. with `sleep` in PHP (PHP: sleep - Manual) or the AutoThrottle extension in Scrapy (Scrapy 1.0.3 documentation).
- CAPTCHA
- AJAX
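
The first two items can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the names `extract_with_fallback`, `regex_extractor` and `polite_sleep` are hypothetical, the regular expressions stand in for whatever DOM selectors the scraper really uses, and the default delay values are arbitrary assumptions.

```python
import random
import re
import time
from pathlib import Path
from typing import Callable, Iterable, Optional

# An extractor takes the HTML of the parent element and returns a value or None.
Extractor = Callable[[str], Optional[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Build a simple extractor from a regex with one capture group.

    Hypothetical helper: a real scraper would more likely use CSS or XPath selectors.
    """
    compiled = re.compile(pattern, re.DOTALL)

    def extract(html: str) -> Optional[str]:
        match = compiled.search(html)
        return match.group(1) if match else None

    return extract


def extract_with_fallback(parent_html: str,
                          extractors: Iterable[Extractor],
                          backup_path: Path) -> Optional[str]:
    """Try each source for the same column in order.

    If every source is empty (for example after a website revision),
    save the parent element's HTML to backup_path for later inspection.
    """
    for extract in extractors:
        value = extract(parent_html)
        if value and value.strip():
            return value.strip()
    # All sources were empty: keep the raw HTML so the selectors can be fixed later.
    backup_path.parent.mkdir(parents=True, exist_ok=True)
    backup_path.write_text(parent_html, encoding="utf-8")
    return None


def polite_sleep(base_seconds: float = 2.0, jitter_seconds: float = 1.0) -> float:
    """Sleep between page requests to lower the risk of a server IP ban.

    The delay is base_seconds plus a random jitter; returns the delay used.
    """
    delay = base_seconds + random.uniform(0.0, jitter_seconds)
    time.sleep(delay)
    return delay
```

A scraper would call `extract_with_fallback` once per column with its list of alternative sources, and call `polite_sleep()` after each page fetch. In Scrapy, the AutoThrottle extension mentioned above serves the same purpose as `polite_sleep` and adjusts the delay automatically.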