How to extract content from websites: Difference between revisions
== Methods ==
[https://github.com/timothytylee/full-text-rss timothytylee/full-text-rss: Fork of Full-Text RSS to improve handling of non UTF-8 sites]
* Demo: [https://www.fivefilters.org/full-text-rss/ Full-Text RSS - FiveFilters.org]
* Requirement: PHP
* License: GNU Affero General Public License v3.0
[https://github.com/postlight/mercury-parser postlight/mercury-parser: 📜 Extract meaningful content from the chaos of a web page]
* Demo:
* Requirement: Node.js
* License: Apache License, Version 2.0 or MIT license
[https://github.com/luin/readability luin/readability: 📚 Turn any web page into a clean view]
* Demo:
* Requirement: Node.js
* License: Apache License 2.0
''$'' [https://www.diffbot.com/products/extract/ Diffbot | Extract Content From Websites Automatically] (two-week free trial)
* Demo:
* Requirement:
* License:
[https://totheweb.com/learning_center/tools-convert-html-text-to-plain-text-for-content-review/ Free Tool: Convert Your Webpage to Plain Text » ToTheWeb]
* Demo:
* Requirement:
* License:
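
For simple pages, a basic HTML-to-plain-text conversion can be sketched with Python's standard library alone. This is only a minimal illustration and is not how any of the tools listed above work: the names <code>TextExtractor</code> and <code>html_to_text</code> are hypothetical, and the sketch does not try to separate the main article from navigation or ads.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping script/style contents.

    Hypothetical helper for illustration; not part of any listed tool.
    """

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # One fragment per line; real tools do far more layout handling.
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{}</style></head>"
        "<body><h1>Title</h1><p>Hello <b>world</b></p>"
        "<script>var x=1;</script></body></html>"
    )
    print(html_to_text(sample))
```

Fetching the page and handling non UTF-8 encodings are left out on purpose; for anything beyond quick content review, one of the dedicated parsers above is likely a better fit.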
== Related pages ==