MySQL full text search equivalents to Google search: Difference between revisions

Jump to navigation Jump to search
 
(11 intermediate revisions by the same user not shown)
Line 38: Line 38:
These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).
These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).
</pre>
</pre>
教學文章:[https://errerrors.blogspot.com/2021/01/how-to-find-abbreviations-from-article-written-in-english-and-chinese-in-mysql.html 解決簡短英文單字的 MySQL 查詢:搜尋 app 而不是 apple]


== Ignore special characters ==
== Ignore special characters ==
Line 45: Line 47:
* Approach: (1) remove the html tag (2) remove the return symbol ([https://en.wikipedia.org/wiki/Carriage_return Carriage return]).
* Approach: (1) remove the html tag (2) remove the return symbol ([https://en.wikipedia.org/wiki/Carriage_return Carriage return]).


Ignore space, [https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms Halfwidth and fullwidth symbol] (半形字元和全形字元)
Ignore white spaces, [https://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms Halfwidth and fullwidth symbol] (半形字元和全形字元)
* Examples:  
* Examples:  
** Searched the keywords e.g. {{kbd | key = <nowiki>"嗎有"</nowiki>}} on Google and found the search result contains {{kbd | key = <nowiki>嗎? 有</nowiki>}} & {{kbd | key = <nowiki>嗎- 有</nowiki>}}.
** Searched the keywords e.g. {{kbd | key = <nowiki>"嗎有"</nowiki>}} on Google and found the search result contains {{kbd | key = <nowiki>嗎? 有</nowiki>}} & {{kbd | key = <nowiki>嗎- 有</nowiki>}}.
Line 56: Line 58:


=== MySQL approach ===
=== MySQL approach ===
==== SQL syntax ====
Input search keywords, and returned the the first occurrence of matched paragraph. Using MySQL [http://www.w3resource.com/mysql/string-functions/mysql-substring-function.php SUBSTRING() function], [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php POSITION() function] & [http://www.w3resource.com/mysql/string-functions/mysql-char_length-function.php CHAR_LENGTH() function].
Input search keywords, and returned the the first occurrence of matched paragraph. Using MySQL [http://www.w3resource.com/mysql/string-functions/mysql-substring-function.php SUBSTRING() function], [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php POSITION() function] & [http://www.w3resource.com/mysql/string-functions/mysql-char_length-function.php CHAR_LENGTH() function].


Line 79: Line 82:
       )
       )
   ELSE ''
   ELSE ''
END AS "scrapbook"
END AS `scrapbook`


-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
Line 85: Line 88:
</pre>
</pre>


Run on [http://sqlfiddle.com/#!9/096df3/5/0 sqlfiddle]


Instructions:
==== Instruction of SQL syntax ====
(1) [https://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] "MySQL POSITION() returns the position of a substring within a string."
(1) [https://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] "MySQL POSITION() returns the position of a substring within a string."
<pre>
<pre>
Line 125: Line 129:
       )
       )
   ELSE ''
   ELSE ''
END AS "scrapbook";
END AS `scrapbook`;


-- > returns 行比武時,以此功對付吸星大法,使其全身凍僵、天池
-- > returns 行比武時,以此功對付吸星大法,使其全身凍僵、天池
Line 148: Line 152:
       )
       )
   ELSE ''
   ELSE ''
END AS "scrapbook"
END AS `scrapbook`


-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
Line 180: Line 184:
   </tr>
   </tr>
</table>
</table>
1. English Keyword Version - "AI agent/agents"
Create a Google Sheets formula that suggests a title by extracting text leading up to the "AI agent" mention. {{exclaim}} case-insensitive!:
<pre>
=IF(
  REGEXMATCH(A2, "(?i)\bAI\s*agents?\b"),
  REGEXEXTRACT(
    A2,
    ".{0,10}(?i)\bAI\s*agents?\b.{0,10}"
  )&" ...",
  ""
)
</pre>
Here's a breakdown of the Google Sheets formula that extracts excerpts containing "AI agent" or "AI agents":
The formula has two main parts:
# REGEXMATCH to check if the phrase exists
# REGEXEXTRACT to get the surrounding context if found
Pattern explanation:
* `(?i)` makes the match case-insensitive
* `\b` ensures word boundaries
* `\s*` allows any number of spaces
* `s?` makes the 's' optional (matches both singular and plural)
The formula will:
* Search for "AI agent" or "AI agents" in cell A2
* If found, extract up to 10 characters before and after the match
* Add "..." to indicate truncation
* Return empty string if no match
Will match:
* "AI agent"
* "AI agents"
* "ai Agent"
* "Ai AGENTS"
* "The AI agent is"
* "Multiple AI agents are"
Won't match:
* "AIagent"
* "AImagent"
* "AI agentify"
2. Chinese Keyword Version - "AI代理" or "AI 代理"
Create a Google Sheets formula that suggests a title by extracting text containing "AI代理". {{exclaim}} case-insensitive!:
<pre>
=IF(
  REGEXMATCH(A2, "(?i)\bAI\s*代理"),
  REGEXEXTRACT(
    A2,
    ".{0,10}(?i)\bAI\s*代理.{0,10}"
  )&" ...",
  ""
)
</pre>
Here's a breakdown of the Google Sheets formula that extracts excerpts containing "AI代理":
The formula has two main parts:
# REGEXMATCH to check if the phrase exists
# REGEXEXTRACT to get the surrounding context if found
Pattern explanation:
* `(?i)` makes the match case-insensitive (affects the "AI" part)
* `\b` ensures word boundary before "AI"
* `\s*` allows any number of spaces between "AI" and "代理"
The formula will:
* Search for "AI代理" or "AI 代理" in cell A2
* If found, extract up to 10 characters before and after the match
* Add "..." to indicate truncation
* Return empty string if no match
Will match:
* "AI代理"
* "AI 代理"
* "ai代理"
* "ai 代理"
* "This is AI代理 system"
* "About AI 代理 research"
Won't match:
* "AI代理人" (AI agent person)
* "智能代理" (Intelligent agent)
* "代理AI" (Agent AI)


=== Microsoft  Spreadsheet approach ===
=== Microsoft  Spreadsheet approach ===
Line 207: Line 299:
   </tr>
   </tr>
</table>
</table>
[https://docs.google.com/spreadsheets/d/1ij-50vYqRXJwM71OEWXZHJzrkYfpZYCK-0MFJ3jvY1E/edit?usp=sharing Try it online]


=== PHP approach ===
=== PHP approach ===
Line 216: Line 310:
* [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] / [http://www.w3resource.com/mysql/string-functions/mysql-length-function.php MySQL LENGTH() function - w3resource] where the keywords located.
* [http://www.w3resource.com/mysql/string-functions/mysql-position-function.php MySQL POSITION() function - w3resource] / [http://www.w3resource.com/mysql/string-functions/mysql-length-function.php MySQL LENGTH() function - w3resource] where the keywords located.


== References or related articles ==
== Related articles ==
to explore strange new worlds / related articles:
to explore strange new worlds / related articles:
* [http://dev.mysql.com/doc/refman/5.1/en/regexp.html MySQL :: MySQL 5.1 Reference Manual :: 12.5.2 Regular Expressions]
* [http://dev.mysql.com/doc/refman/5.1/en/regexp.html MySQL :: MySQL 5.1 Reference Manual :: 12.5.2 Regular Expressions]
Line 236: Line 330:
* does not equal  ____
* does not equal  ____


references
== References ==
<references/>
<references/>
== Related news ==
{{News feed | title = MySQL OR nosql related news | feed = <rss>https://news.google.com/rss/search?q=MySQL+OR+nosql&hl=zh-TW&gl=TW&ceid=TW:zh-Hant</rss>
}}




Line 253: Line 342:
[[Category:Search]]
[[Category:Search]]
[[Category:Data Science]]
[[Category:Data Science]]
[[Category: Revised with LLMs]]

Navigation menu