Extract url from text

Revision as of 17:45, 27 September 2016

Use a regular expression to extract URLs from a block of text.

The Google Sheets REGEXEXTRACT function can extract the first URL that appears in the text.

=REGEXEXTRACT(A1, "(http[s]?://[a-zA-Z0-9\-_\._~\:\/\?#\[\]@\!\$&'\(\)\*\+,;\=%]+)\b?")

Input:

Yahoo! 新聞 https://tw.news.yahoo.com/abc

Output:

https://tw.news.yahoo.com/abc

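Explanation:

A URL may begin with either http:// or https://, so the pattern starts with http[s]?://
According to RFC 3986, the characters a URL may contain are ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=; any other character must be percent-encoded with the percent sign %, which is why % also appears in the character class. ^[1]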
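
The same character-class approach can be tried outside a spreadsheet. The following is a minimal Python sketch, not the formula itself: Python's re module stands in for the regex engine used by Google Sheets, and the names ALLOWED, URL_PATTERN and extract_first_url are hypothetical, chosen only for this illustration.

```python
import re

# Characters RFC 3986 permits in a URL, plus "%" for percent-encoded bytes.
ALLOWED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-._~:/?#[]@!$&'()*+,;=%"
)

# re.escape() makes every character safe to place inside a character class.
URL_PATTERN = re.compile(r"https?://[" + re.escape(ALLOWED) + r"]+")


def extract_first_url(text):
    """Return the first URL found in text, or None if there is none."""
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_first_url("Yahoo! 新聞 https://tw.news.yahoo.com/abc"))
```

Because the match stops at the first character outside the allowed set, a space or a CJK character ends the URL. Punctuation such as a trailing period is in the allowed set, so it would be captured as part of the URL.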

To extract only the scheme and host name of the first URL, match up to the first slash after ://

=REGEXEXTRACT(A1, "(http[s]?\://[^/]+)\b?")

Input:

Yahoo! 新聞 https://tw.news.yahoo.com/abc

Output:

https://tw.news.yahoo.com

Explanation:
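
The class [^/]+ consumes everything after :// up to, but not including, the next slash, so the match ends at the host name. A minimal Python sketch of the same idea follows. It is only an illustration: the names HOST_PATTERN and extract_origin are hypothetical, and Python's re module stands in for the regex engine used by Google Sheets.

```python
import re

# Scheme, "://", then every character up to (not including) the next "/".
HOST_PATTERN = re.compile(r"https?://[^/]+")


def extract_origin(text):
    """Return the scheme and host of the first URL in text, or None."""
    match = HOST_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_origin("Yahoo! 新聞 https://tw.news.yahoo.com/abc"))
```

Note that [^/] also matches spaces, so a URL with no path followed by more text would pull that text into the match. Excluding whitespace as well, for example with [^/\s]+, is one possible way to tighten the pattern.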