Web scrape troubleshooting: Difference between revisions

Jump to navigation Jump to search
m
no edit summary
mNo edit summary
mNo edit summary
Line 11: Line 11:
# The web page needed to signed in
# The web page needed to signed in
# Blocking the request without {{kbd | key= Referer}} or other headers.
# Blocking the request without {{kbd | key= Referer}} or other headers.
# Connection timeout during a http request. e.g. In PHP {{kbd | key=default_socket_timeout}} is 30 seconds<ref>[https://www.php.net/manual/en/filesystem.configuration.php PHP: Runtime Configuration - Manual]</ref>.
# Language and [http://php.net/manual/en/function.urlencode.php URL-encodes string]
# Language and [http://php.net/manual/en/function.urlencode.php URL-encodes string]
# [[Data cleaning#Data_handling | Data cleaning]] issues e.g. [https://en.wikipedia.org/wiki/Non-breaking_space Non-breaking space] or other [https://en.wikipedia.org/wiki/Whitespace_character Whitespace character]
# [[Data cleaning#Data_handling | Data cleaning]] issues e.g. [https://en.wikipedia.org/wiki/Non-breaking_space Non-breaking space] or other [https://en.wikipedia.org/wiki/Whitespace_character Whitespace character]
Line 48: Line 49:
* Stateful: [http://www.webopedia.com/TERM/S/stateful.html What is stateful? Webopedia Definition]
* Stateful: [http://www.webopedia.com/TERM/S/stateful.html What is stateful? Webopedia Definition]
* [https://en.wikipedia.org/wiki/List_of_HTTP_status_codes List of HTTP status codes - Wikipedia]
* [https://en.wikipedia.org/wiki/List_of_HTTP_status_codes List of HTTP status codes - Wikipedia]
References
<references />


{{Template:Troubleshooting}}
{{Template:Troubleshooting}}

Navigation menu