| (22 intermediate revisions by the same user not shown) | |||
| Line 38: | Line 38: | ||
These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_). | These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_). | ||
</pre> | </pre> | ||
Tutorial: [https://errerrors.blogspot.com/2021/01/how-to-find-abbreviations-from-article-written-in-english-and-chinese-in-mysql.html Solving MySQL queries for short English words: searching for "app" rather than "apple"] | |||
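The word-boundary rule described above can be tried outside MySQL. The following is a minimal Python sketch, not the MySQL implementation: it uses Python's `\b` with `re.ASCII` as a stand-in for the MySQL markers, so that only ASCII letters, digits and the underscore count as word characters. The helper name `contains_word` is hypothetical.

```python
import re


def contains_word(word: str, text: str) -> bool:
    """Return True if `word` occurs in `text` as a whole word.

    With re.ASCII, only [A-Za-z0-9_] are word characters, so a CJK
    character next to the word still counts as a boundary.
    """
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.ASCII) is not None


print(contains_word("app", "Download the app today"))  # whole word
print(contains_word("app", "I ate an apple"))          # part of a longer word
print(contains_word("app", "請下載app使用"))            # surrounded by Chinese text
```

The `re.ASCII` flag matters for mixed English and Chinese text: without it, Python treats Chinese characters as word characters and the third call would not find a boundary.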
== Ignore special characters == | == Ignore special characters == | ||
| Line 45: | Line 47: | ||
* Approach: (1) remove the html tag (2) remove the return symbol ([https://en.wikipedia.org/wiki/Carriage_return Carriage return]). | * Approach: (1) remove the html tag (2) remove the return symbol ([https://en.wikipedia.org/wiki/Carriage_return Carriage return]). | ||
Ignore | Ignore whitespace and the difference between [https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms halfwidth and fullwidth forms] of the same character. | ||
* Examples: | * Examples: | ||
** Searched the keywords e.g. {{kbd | key = <nowiki>"嗎有"</nowiki>}} on Google and found the search result contains {{kbd | key = <nowiki>嗎? 有</nowiki>}} & {{kbd | key = <nowiki>嗎- 有</nowiki>}}. | ** Searched the keywords e.g. {{kbd | key = <nowiki>"嗎有"</nowiki>}} on Google and found the search result contains {{kbd | key = <nowiki>嗎? 有</nowiki>}} & {{kbd | key = <nowiki>嗎- 有</nowiki>}}. | ||
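One way to approximate this behaviour is to normalize both the keyword and the document before comparing them. The sketch below is an assumption about how such a normalization could look, not a description of what Google does: it strips HTML tags, folds fullwidth forms to halfwidth with Unicode NFKC, and drops whitespace (including carriage returns) and punctuation. The names `normalize_for_search` and `loose_contains` are hypothetical.

```python
import re
import unicodedata


def normalize_for_search(text: str) -> str:
    """Strip tags, fold fullwidth/halfwidth forms, drop spaces and punctuation."""
    text = re.sub(r"<[^>]+>", "", text)          # (1) remove HTML tags
    text = unicodedata.normalize("NFKC", text)   # fullwidth -> halfwidth forms
    return "".join(
        ch
        for ch in text
        # (2) isspace() also covers carriage returns and line feeds
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def loose_contains(keyword: str, document: str) -> bool:
    """Substring search that ignores the characters removed above."""
    return normalize_for_search(keyword) in normalize_for_search(document)


print(loose_contains("嗎有", "嗎? 有"))
print(loose_contains("嗎有", "嗎- 有"))
```

The tag-stripping regular expression is deliberately naive; a real HTML parser would be safer for arbitrary markup.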
| Line 53: | Line 55: | ||
== Highlight search query keywords on resulting pages == | == Highlight search query keywords on resulting pages == | ||
Total 130 ~ 240 characters on Google resulting pages. | Returned result: show 10 characters before and after the search keywords (cf. Google result pages, which show 130 to 240 characters in total). | ||
=== MySQL approach === | === MySQL approach === | ||
Input search keywords, and returned the matched paragraph. Using MySQL [http://www.w3resource.com/mysql/string-functions/mysql-substring-function.php SUBSTRING() function], [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php POSITION() function] & [http://www.w3resource.com/mysql/string-functions/mysql-char_length-function.php CHAR_LENGTH() function]. | ==== SQL syntax ==== | ||
Input the search keywords and return the first occurrence of the matching passage, using the MySQL [http://www.w3resource.com/mysql/string-functions/mysql-substring-function.php SUBSTRING() function], [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php POSITION() function] & [http://www.w3resource.com/mysql/string-functions/mysql-char_length-function.php CHAR_LENGTH() function]. | |||
<pre> | <pre> | ||
| Line 79: | Line 82: | ||
) | ) | ||
ELSE '' | ELSE '' | ||
END AS | END AS `scrapbook` | ||
-- Returned result of scrapbook column: Show 10 characters before or after the search keywords. | -- Returned result of scrapbook column: Show 10 characters before or after the search keywords. | ||
| Line 85: | Line 88: | ||
</pre> | </pre> | ||
Run on [http://sqlfiddle.com/#!9/096df3/5/0 sqlfiddle] | |||
==== Instruction of SQL syntax ==== | |||
(1) [https://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] "MySQL POSITION() returns the position of a substring within a string." | |||
<pre> | |||
SET @term := "吸星大法"; | |||
SET @message := "笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。 | |||
原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html"; | |||
SELECT POSITION(@term IN @message) | |||
-- > returns 46 | |||
</pre> | |||
(2) Make sure the start position is never 0 or negative: the minimum start position of each paragraph is 1. | |||
<pre> | |||
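SELECT IF( | |||
POSITION(@term IN @message) > 0 AND | |||
-- "< 1" also catches a start position of exactly 0, for which SUBSTRING() returns an empty string | |||
POSITION(@term IN @message) - 10 < 1 | |||
, 1 | |||
, POSITION(@term IN @message) - 10); | |||
-- > returns 36 = 46 - 10 | |||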
</pre> | |||
(3) Show 10 characters before and after the search keywords. [https://www.w3resource.com/mysql/string-functions/mysql-substring-function.php MySQL SUBSTRING() function - w3resource]: "returns a specified number of characters from a particular position of a given string." | |||
<pre> | |||
SELECT | |||
@message | |||
, CASE | |||
WHEN POSITION(@term IN @message) > 0 THEN SUBSTRING(@message | |||
, IF( | |||
POSITION(@term IN @message) > 0 AND | |||
-- "< 1" also catches a start position of exactly 0 | |||
POSITION(@term IN @message) - 10 < 1 | |||
, 1 | |||
, POSITION(@term IN @message) - 10) | |||
, CHAR_LENGTH(@term) + 20 | |||
) | |||
ELSE '' | |||
END AS `scrapbook`; | |||
-- > returns 行比武時,以此功對付吸星大法,使其全身凍僵、天池 | |||
</pre> | |||
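The three steps can also be sketched outside the database. The Python function below is an illustrative equivalent, not part of the original query; the name `excerpt` and its `context` parameter are hypothetical. It uses 0-based indexing, so the clamp is `max(0, ...)` rather than a minimum of 1, and it always stops 10 characters after the keyword, whereas the SQL takes a fixed length of `CHAR_LENGTH(@term) + 20` from the start position.

```python
def excerpt(message: str, term: str, context: int = 10) -> str:
    """Return `term` with up to `context` characters on either side.

    Mirrors POSITION() + SUBSTRING(): only the first occurrence is used,
    and an empty string is returned when the term is absent.
    """
    index = message.find(term)           # 0-based; -1 when not found
    if index == -1:
        return ""
    start = max(0, index - context)      # never start before the string
    end = index + len(term) + context    # slicing past the end is safe
    return message[start:end]


message = (
    "笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。"
    "左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;"
    "與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。"
)
print(excerpt(message, "吸星大法"))
# -> 行比武時,以此功對付吸星大法,使其全身凍僵、天池
```

Python's `len()` counts characters rather than bytes, which corresponds to `CHAR_LENGTH()` and not `LENGTH()` in MySQL.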
<pre> | <pre> | ||
| Line 104: | Line 152: | ||
) | ) | ||
ELSE '' | ELSE '' | ||
END AS | END AS `scrapbook` | ||
-- Returned result of scrapbook column: Show 10 characters before or after the search keywords. | -- Returned result of scrapbook column: Show 10 characters before or after the search keywords. | ||
| Line 136: | Line 184: | ||
</tr> | </tr> | ||
</table> | </table> | ||
1. English Keyword Version - "AI agent/agents" | |||
Create a Google Sheets formula that extracts a short excerpt (up to 10 characters on either side) around the "AI agent" mention, e.g. to suggest a title. {{exclaim}} The match is case-insensitive: | |||
<pre> | |||
=IF( | |||
REGEXMATCH(A2, "(?i)\bAI\s*agents?\b"), | |||
REGEXEXTRACT( | |||
A2, | |||
".{0,10}(?i)\bAI\s*agents?\b.{0,10}" | |||
)&" ...", | |||
"" | |||
) | |||
</pre> | |||
Here's a breakdown of the Google Sheets formula that extracts excerpts containing "AI agent" or "AI agents": | |||
The formula has two main parts: | |||
# REGEXMATCH to check if the phrase exists | |||
# REGEXEXTRACT to get the surrounding context if found | |||
Pattern explanation: | |||
* `(?i)` makes the match case-insensitive | |||
* `\b` ensures word boundaries | |||
* `\s*` allows any number of whitespace characters, including none | |||
* `s?` makes the 's' optional (matches both singular and plural) | |||
The formula will: | |||
* Search for "AI agent" or "AI agents" in cell A2 | |||
* If found, extract up to 10 characters before and after the match | |||
* Add "..." to indicate truncation | |||
* Return empty string if no match | |||
Will match: | |||
* "AI agent" | |||
* "AI agents" | |||
* "ai Agent" | |||
* "Ai AGENTS" | |||
* "The AI agent is" | |||
* "Multiple AI agents are" | |||
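Won't match: | |||
* "AImagent" | |||
* "AI agentify" | |||
Note: "AIagent" does match, because `\s*` also accepts zero spaces. Use `\s+` instead to require at least one space. | |||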
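The match lists above can be checked with a small script. This is a sketch in Python rather than Google Sheets, so two things are assumptions: Python's `re` module stands in for the RE2 engine that Sheets uses, and the inline `(?i)` is replaced by the `re.IGNORECASE` flag because recent Python versions reject a global flag in the middle of a pattern. The function name `ai_agent_excerpt` is hypothetical.

```python
import re

# Same pattern as the formula, with (?i) expressed as a flag.
PATTERN = re.compile(r".{0,10}\bAI\s*agents?\b.{0,10}", re.IGNORECASE)


def ai_agent_excerpt(text: str) -> str:
    """Return the excerpt plus ' ...', or '' when there is no match."""
    match = PATTERN.search(text)
    return match.group(0) + " ..." if match else ""


for sample in ["The AI agent is", "Ai AGENTS", "AIagent", "AImagent", "AI agentify"]:
    print(repr(sample), "->", repr(ai_agent_excerpt(sample)))
```

Running it shows that "AIagent" produces an excerpt while "AImagent" and "AI agentify" do not, which follows from `\s*` accepting zero spaces and from the trailing `\b`.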
2. Chinese Keyword Version - "AI代理" or "AI 代理" | |||
Create a Google Sheets formula that suggests a title by extracting text containing "AI代理". {{exclaim}} case-insensitive!: | |||
<pre> | |||
=IF( | |||
REGEXMATCH(A2, "(?i)\bAI\s*代理"), | |||
REGEXEXTRACT( | |||
A2, | |||
".{0,10}(?i)\bAI\s*代理.{0,10}" | |||
)&" ...", | |||
"" | |||
) | |||
</pre> | |||
Here's a breakdown of the Google Sheets formula that extracts excerpts containing "AI代理": | |||
The formula has two main parts: | |||
# REGEXMATCH to check if the phrase exists | |||
# REGEXEXTRACT to get the surrounding context if found | |||
Pattern explanation: | |||
* `(?i)` makes the match case-insensitive (affects the "AI" part) | |||
* `\b` ensures word boundary before "AI" | |||
* `\s*` allows any number of whitespace characters, including none, between "AI" and "代理" | |||
The formula will: | |||
* Search for "AI代理" or "AI 代理" in cell A2 | |||
* If found, extract up to 10 characters before and after the match | |||
* Add "..." to indicate truncation | |||
* Return empty string if no match | |||
Will match: | |||
* "AI代理" | |||
* "AI 代理" | |||
* "ai代理" | |||
* "ai 代理" | |||
* "This is AI代理 system" | |||
* "About AI 代理 research" | |||
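Won't match: | |||
* "智能代理" (Intelligent agent) | |||
* "代理AI" (Agent AI) | |||
Note: "AI代理人" (AI agent person) does match, because the pattern has no boundary after "代理"; the "人" simply becomes part of the trailing context. | |||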
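The Chinese pattern can be checked the same way. This Python sketch rests on one assumption worth stating: RE2, the engine behind Google Sheets, treats `\b` as an ASCII word boundary, so the sketch passes `re.ASCII` to get the same behaviour (by default Python counts Chinese characters as word characters). The name `mentions_ai_agent_zh` is hypothetical.

```python
import re

# re.ASCII: only [A-Za-z0-9_] are word characters, so "是AI" has a boundary.
PATTERN = re.compile(r"\bAI\s*代理", re.IGNORECASE | re.ASCII)


def mentions_ai_agent_zh(text: str) -> bool:
    """Return True if the text contains "AI代理" or "AI 代理" in any letter case."""
    return PATTERN.search(text) is not None


for sample in ["AI代理", "ai 代理", "這是AI代理", "AI代理人", "智能代理", "代理AI"]:
    print(sample, "->", mentions_ai_agent_zh(sample))
```

Because nothing in the pattern constrains what follows "代理", a longer word such as "AI代理人" is reported as a match; excluding it would need an explicit check on the following character.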
=== Microsoft Spreadsheet approach === | === Microsoft Spreadsheet approach === | ||
Using [https://support.office.com/zh-tw/article/FIND%E3%80%81FINDB-%E5%87%BD%E6%95%B8-c7912941-af2a-4bdf-a553-d0d89b0a0628 FIND], [https://support.office.com/zh-tw/article/MID%E3%80%81MIDB-%E5%87%BD%E6%95%B8-d5f9e25c-d7d6-472e-b568-4ecb12433028 MID] & [https://support.office.com/zh-tw/article/CONCATENATE-%E5%87%BD%E6%95%B8-8f8ae884-2ca8-4f7a-b093-75d702bea31d CONCATENATE] functions | Using [https://support.office.com/zh-tw/article/FIND%E3%80%81FINDB-%E5%87%BD%E6%95%B8-c7912941-af2a-4bdf-a553-d0d89b0a0628 FIND], [https://support.office.com/zh-tw/article/MID%E3%80%81MIDB-%E5%87%BD%E6%95%B8-d5f9e25c-d7d6-472e-b568-4ecb12433028 MID] & [https://support.office.com/zh-tw/article/CONCATENATE-%E5%87%BD%E6%95%B8-8f8ae884-2ca8-4f7a-b093-75d702bea31d CONCATENATE] functions. {{exclaim}} FIND function is case-sensitive! | ||
<table border="1"> | <table border="1"> | ||
<tr> | <tr> | ||
| Line 163: | Line 299: | ||
</tr> | </tr> | ||
</table> | </table> | ||
[https://docs.google.com/spreadsheets/d/1ij-50vYqRXJwM71OEWXZHJzrkYfpZYCK-0MFJ3jvY1E/edit?usp=sharing Try it online] | |||
=== PHP approach === | === PHP approach === | ||
| Line 172: | Line 310: | ||
* [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] / [http://www.w3resource.com/mysql/string-functions/mysql-length-function.php MySQL LENGTH() function - w3resource] where the keywords located. | * [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] / [http://www.w3resource.com/mysql/string-functions/mysql-length-function.php MySQL LENGTH() function - w3resource] where the keywords located. | ||
== | == Related articles == | ||
to explore strange new worlds / related articles: | to explore strange new worlds / related articles: | ||
* [http://dev.mysql.com/doc/refman/5.1/en/regexp.html MySQL :: MySQL 5.1 Reference Manual :: 12.5.2 Regular Expressions] | * [http://dev.mysql.com/doc/refman/5.1/en/regexp.html MySQL :: MySQL 5.1 Reference Manual :: 12.5.2 Regular Expressions] | ||
| Line 179: | Line 317: | ||
* [https://search.yahoo.com/search/options?fr=fp-top&p= Yahoo Advanced Web Search] | * [https://search.yahoo.com/search/options?fr=fp-top&p= Yahoo Advanced Web Search] | ||
* [http://onlinehelp.microsoft.com/en-us/bing/ff808438.aspx Advanced search options] of bing | * [http://onlinehelp.microsoft.com/en-us/bing/ff808438.aspx Advanced search options] of bing | ||
* [http://errerrors.blogspot. | * [http://errerrors.blogspot.com/2016/10/excel.html Boolean search for multiple keywords in Excel or Google Sheets] | ||
* [https://blog.longwin.com.tw/2012/07/mysql-fulltext-search-howto-2012/ MySQL Fulltext Search 使用方式 | Tsung's Blog] (supports English only) | |||
other search cases: if the column ... (inspired by [http://www.outwit.com/ OutWit]) | other search cases: if the column ... (inspired by [http://www.outwit.com/ OutWit]) | ||
| Line 191: | Line 330: | ||
* does not equal ____ | * does not equal ____ | ||
== References == | |||
<references/> | |||
| Line 209: | Line 342: | ||
[[Category:Search]] | [[Category:Search]] | ||
[[Category:Data Science]] | [[Category:Data Science]] | ||
[[Category: Revised with LLMs]] | |||