Web scrape troubleshooting: Difference between revisions
== List of technical issues ==
# The content of the web page was changed (revised): the expected content of the specified DOM element became empty.
#* There may be multiple sources for the same column, e.g. different HTML DOM elements that hold the same column value.
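One way to cope with a revised page is to list several candidate DOM elements for the same column and take the first one that still has text. The following is only a minimal sketch using the Python standard library; the element ids <code>price</code> and <code>price-v2</code>, the sample pages, and the helper names are hypothetical and not taken from any real site.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def text_by_id(html, element_id):
    """Return the whitespace-normalized text of the element, or ''."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def first_non_empty(html, candidate_ids):
    """Return (id, text) for the first candidate with text, else (None, '')."""
    for element_id in candidate_ids:
        text = text_by_id(html, element_id)
        if text:
            return element_id, text
    return None, ""


# Hypothetical page before and after a revision that emptied the old element.
OLD_PAGE = '<div id="price"><b>12.50</b> USD</div>'
NEW_PAGE = '<div id="price"></div><span id="price-v2">12.50 USD</span>'

if __name__ == "__main__":
    print(first_non_empty(OLD_PAGE, ["price", "price-v2"]))
    print(first_non_empty(NEW_PAGE, ["price", "price-v2"]))
```

If <code>first_non_empty</code> returns <code>(None, "")</code>, none of the known sources has content any more, which is a signal to inspect the page by hand instead of storing an empty value.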
</div>
== Before you start web scraping ==
* Does the site offer downloadable datasets?
* Does the site offer an [https://en.wikipedia.org/wiki/Application_programming_interface API] (application programming interface)?
== Further reading ==
* Stateless: [https://stackoverflow.com/questions/13200152/why-say-that-http-is-a-stateless-protocol Why say that HTTP is a stateless protocol? - Stack Overflow]
* Stateful: [http://www.webopedia.com/TERM/S/stateful.html What is stateful? Webopedia Definition]
* [https://en.wikipedia.org/wiki/List_of_HTTP_status_codes List of HTTP status codes - Wikipedia]
== References ==
<references />