Extract url from text: Difference between revisions
m (Planetoid moved page Regular extract url from text to Extract url from text)
Extract URLs from text content. A URL is also known as a [https://zh.wikipedia.org/zh-tw/%E7%BB%9F%E4%B8%80%E8%B5%84%E6%BA%90%E5%AE%9A%E4%BD%8D%E7%AC%A6 統一資源定位符] ([https://en.wikipedia.org/wiki/Uniform_Resource_Locator Uniform Resource Locator]).
== Extracting the full URL ==
=== Extracting the full URL with Google Sheets ===
Use the Google Sheets [https://support.google.com/docs/answer/3098244?hl=zh-Hant REGEXEXTRACT] function, which accepts a regular expression ([[Regular expression]]), to extract the first URL from the text.
<pre>
=REGEXEXTRACT(A1, "(http[s]?://[a-zA-Z0-9\-_\\._~\:\/\?#\[\]@\!\$&'\(\)\*\+,;\=%]+)")
</pre>
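For use outside a spreadsheet, the same idea can be sketched in Python. This is a minimal sketch, not part of the original page: the names <code>URL_PATTERN</code> and <code>extract_first_url</code> are hypothetical, and the character set is copied from the formula above with redundant escapes dropped.

```python
import re

# Character set taken from the spreadsheet formula above (assumed equivalent);
# it includes a literal backslash, as the original formula does.
URL_PATTERN = re.compile(r"https?://[a-zA-Z0-9\-_\\.~:/?#\[\]@!$&'()*+,;=%]+")


def extract_first_url(text):
    """Return the first URL found in text, or None when there is none."""
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(extract_first_url("Visit https://example.com/path?x=1 today"))
```

Because <code>.</code> and <code>,</code> belong to the allowed character set, punctuation that directly follows a URL in running text is captured as part of the match.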
=== Extracting the full URL with Sublime Text ===
Use Sublime Text or another text editor that supports regular expressions:
* Menu: Find --> Replace
* Enable Regular expression
| Line 14: | Line 15: | ||
* Replace with: {{kbd | key= <nowiki>\1</nowiki>}}
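Replacing a match with <code>\1</code> keeps only the first capture group. A rough Python equivalent is sketched below; the search pattern here is a hypothetical stand-in chosen for illustration (the page's own "Find" pattern is not reproduced), and <code>keep_only_url</code> is an assumed helper name.

```python
import re

# Hypothetical search pattern: capture the first whitespace-delimited URL on a
# line and match everything around it so the replacement discards the rest.
LINE_WITH_URL = re.compile(r"^.*?(https?://\S+).*$")


def keep_only_url(line):
    """Replace a line with its first URL; lines without a URL are unchanged."""
    return LINE_WITH_URL.sub(r"\1", line)


if __name__ == "__main__":
    print(keep_only_url("go to http://a.example/x now"))
```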
=== Extracting the full URL with Microsoft Excel ===
Data limitation: the URL must be separated from the surrounding text by spaces or line breaks. The following formula extracts the full URL from cell B2 (adapted from a formula provided by guitarthrower<ref>[https://stackoverflow.com/questions/25429211/extract-urls-from-a-cell-of-text-in-excel vba - Extract URL's from a Cell of Text in Excel - Stack Overflow]</ref>):
<pre>
=IF(ISERROR(MID(SUBSTITUTE(B2, " | |||
", " "),FIND("http",SUBSTITUTE(B2, " | |||
", " ")),IFERROR(FIND(" ",SUBSTITUTE(B2, " | |||
", " "),FIND("http",SUBSTITUTE(B2, " | |||
", " ")))-1,LEN(SUBSTITUTE(B2, " | |||
", " ")))-FIND("http",SUBSTITUTE(B2, " | |||
", " "))+1)), "", MID(SUBSTITUTE(B2, " | |||
", " "),FIND("http",SUBSTITUTE(B2, " | |||
", " ")),IFERROR(FIND(" ",SUBSTITUTE(B2, " | |||
", " "),FIND("http",SUBSTITUTE(B2, " | |||
", " ")))-1,LEN(SUBSTITUTE(B2, " | |||
", " ")))-FIND("http",SUBSTITUTE(B2, " | |||
", " "))+1))
</pre>
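The logic of the Excel formula is easier to follow when written out step by step. The Python sketch below mirrors it under the same data limitation; the function name <code>extract_url_by_whitespace</code> is hypothetical and not part of the original page.

```python
def extract_url_by_whitespace(cell_text):
    """Return the text from the first "http" up to the next space, or ""."""
    # SUBSTITUTE(B2, <line break>, " "): treat line breaks as spaces.
    normalized = cell_text.replace("\n", " ")
    # FIND("http", ...): position where the URL starts (case-sensitive).
    start = normalized.find("http")
    if start == -1:
        # The ISERROR branch of the formula: no URL, return an empty string.
        return ""
    # FIND(" ", ..., start) with the IFERROR fallback to the end of the text.
    end = normalized.find(" ", start)
    if end == -1:
        end = len(normalized)
    return normalized[start:end]


if __name__ == "__main__":
    print(extract_url_by_whitespace("see\nhttp://a.example/x\nfor details"))
```

Like the formula, this matches any occurrence of <code>http</code>, so a word that merely begins with those letters is also returned.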
=== Test data ===
Input data: text without the HTML [http://www.w3schools.com/tags/att_a_href.asp a href] attribute markup
<pre>
| Line 20: | Line 40: | ||
</pre>
Output data:
<pre>
| Line 31: | Line 50: | ||
== Extracting the domain part of a URL ==
=== Extracting the domain with Google Sheets ===
Use the Google Sheets [https://support.google.com/docs/answer/3098244?hl=zh-Hant REGEXEXTRACT] function:
<pre>
| Line 50: | Line 70: | ||
== Extracting URLs of a specific file type ==
=== Extracting URLs of a specific file type with Sublime Text ===
The following syntax works in [https://www.sublimetext.com/ Sublime Text]: