Extract domain from text

Extract the domain part from URLs in article content

Using Google Sheets to extract domains

Use the Google Sheets REGEXEXTRACT function:

=REGEXEXTRACT(A1, "(http[s]?\://[^/]+)")

Input:

Yahoo! News https://tw.news.yahoo.com/abc

Output:

https://tw.news.yahoo.com

Explanation:

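Here, "domain" means the text that starts with http:// or https://, followed by one or more characters that are not a slash: [^/]+. The match stops at the first / after the host name, so the result keeps the scheme (https://) along with the host.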
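
For comparison, the same pattern can be tried outside a spreadsheet. The following is a minimal Python sketch, not part of the original method: the function name extract_domain and the constant URL_PREFIX are made up for illustration, and the pattern is the one from the formula with the optional escaping removed.

```python
import re

# Same idea as the spreadsheet formula: "http" or "https", then "://",
# then one or more characters that are not a slash.
URL_PREFIX = re.compile(r"(https?://[^/]+)")


def extract_domain(text):
    """Return the first scheme-plus-host match in text, or None if there is none."""
    match = URL_PREFIX.search(text)
    return match.group(1) if match else None


print(extract_domain("Yahoo! News https://tw.news.yahoo.com/abc"))
# prints: https://tw.news.yahoo.com
```

Because [^/]+ also accepts spaces, a URL with no path that is followed by more text will pull that text into the match; narrowing the class to [^/\s]+ is one way to avoid this.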