== Convert webpage to markdown ==
1. [https://github.com/deathau/markdownload?tab=readme-ov-file deathau/markdownload]: "A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file." {{Gd}} Automatically identifies and extracts the primary content area of a webpage (it integrates [https://github.com/mozilla/readability mozilla/readability.js]).
2. [https://jina.ai/reader/ Jina Reader API]
Bookmarklet
<pre>
javascript:(function(){location.href='https://r.jina.ai/'+location.href})();
</pre>
3. ''$'' [https://www.firecrawl.dev/ Firecrawl - The API to search, scrape, and interact with the web at scale. 🔥]
You can choose the scrape result format, such as `markdown`, `rawHtml`, or `json`; the formats supported by Firecrawl's scrape endpoint are documented [https://docs.firecrawl.dev/api-reference/endpoint/scrape#body-formats here].
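The two API options above can be sketched in JavaScript as follows. This is only a minimal sketch that builds the requests without sending them. The function names `jinaReaderUrl` and `firecrawlScrapeRequest` are hypothetical, and the Firecrawl endpoint path, its version, and the Bearer-token header are assumptions that should be checked against the linked API reference.

```javascript
// Sketch only: these helpers build requests but do not send them.

// Jina Reader: prefix the page URL with the reader host,
// exactly as the bookmarklet above does with location.href.
function jinaReaderUrl(pageUrl) {
  return 'https://r.jina.ai/' + pageUrl;
}

// ASSUMPTION: endpoint path and version; verify in the Firecrawl API reference.
const FIRECRAWL_SCRAPE_ENDPOINT = 'https://api.firecrawl.dev/v1/scrape';

// Firecrawl: describe a scrape request for the given result formats,
// e.g. ['markdown'] or ['markdown', 'rawHtml'].
// ASSUMPTION: Bearer-token authorization and a JSON body with `url` and `formats`.
function firecrawlScrapeRequest(pageUrl, formats, apiKey) {
  return {
    endpoint: FIRECRAWL_SCRAPE_ENDPOINT,
    options: {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer ' + apiKey,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: pageUrl, formats: formats }),
    },
  };
}

// Usage (needs network access and a real API key):
// const req = firecrawlScrapeRequest('https://example.com/', ['markdown'], 'YOUR_API_KEY');
// fetch(req.endpoint, req.options).then((r) => r.json()).then(console.log);
```

Keeping request construction separate from `fetch` makes it easy to inspect the URL and body before spending API credits.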
[[Category: Tool]]