Web scrape troubleshooting: Difference between revisions
→Skill tree of web scraping
| Line 54: | Line 54: | ||
Data extraction | Data extraction | ||
* How they build the website | * How they build the website & [[Information Architecture | information architecture]] | ||
** Understanding the navigation system ★★☆☆☆ | ** Understanding the navigation system ★★☆☆☆ | ||
** Parse the sitemap XML file ★★☆☆☆ | *** Understanding the classification ★★☆☆☆ | |
*** Parse the sitemap XML file ★★☆☆☆ | |||
* Understanding the web technology | * Understanding the web technology | |
** HTTP GET/POST ★★☆☆☆ | ** HTTP GET/POST ★★☆☆☆ | ||