MySQL full text search equivalents to Google search

From LemonWiki共筆
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

AND

Google search: keyword1 keyword2 same as keyword1 AND keyword2 or keyword1 +keyword2. Icon_exclaim.gif (1) The following is exact words search. (2) Replace column_name with your column name

  • Google: 易筋經 AND 吸星大法
  • MySQL: column_name REGEXP '易筋經' AND column_name REGEXP '吸星大法'
  • MySQL: column_name LIKE '%易筋經%' AND column_name LIKE '%吸星大法%' (online demo[1])
  • MySQL: IF(LOCATE('易筋經', column_name) > 0) AND IF(LOCATE('吸星大法, column_name) > 0)
  • MySQL: column_name LIKE '%易筋經%吸星大法%' AND column_name LIKE '%吸星大法%易筋經%' Icon_exclaim.gif Trivial for multiple keywords

OR

Google search: keyword1 OR keyword2

  • Google: 易筋經 OR 吸星大法
  • MySQL: column_name REGEXP '易筋經' OR column_name REGEXP '吸星大法'
  • MySQL: IF(LOCATE('易筋經', column_name) > 0) OR IF(LOCATE('吸星大法, column_name) > 0)
  • MySQL: column_name LIKE '%易筋經%' OR column_name LIKE '%吸星大法%' (online demo[2])

NOT

Google search: keyword1 NOT keyword2 same as keyword1 -keyword2

  • Google: 易筋經 NOT 吸星大法
  • MySQL: column_name REGEXP '易筋經' AND column_name NOT REGEXP '吸星大法' (online demo)
  • MySQL: IF(LOCATE('易筋經', column_name) > 0) AND IF(LOCATE('吸星大法, column_name) = 0)
  • MySQL: column_name LIKE '%易筋經%' AND column_name NOT LIKE '%吸星大法%'

* wildcard operator

Google * wildcard operator. "Use *, an asterisk character, known as a wildcard, to match one or more words in a phrase" [1] (online demo)

  • Google: 狐狸*叫
  • MySQL: column_name LIKE '狐狸%叫'[2]

English issue

When the keyword is short and written in English e.g. AI, the query result using column_name LIKE '%AI%' may NOT what you want e.g. Tainan, main, hair and so on.

  • (1) Remove all non-alpha-numeric-characters[3] (2) REGEXP word boundaries[4] e.g. (REPLACE(CONVERT(column_name USING ascii), '?', ' ') REGEXP '([[:<:]])AI([[:>:]])')


Cited from MySQL :: MySQL 5.7 Reference Manual :: 12.5.2 Regular Expressions

[[:<:]], [[:>:]]

These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).

教學文章:解決簡短英文單字的 MySQL 查詢:搜尋 app 而不是 apple

Ignore special characters

Ignore return symbol and span tag

  • Example:
    • Searched the keywords e.g. "意法" site:ptt.cc on Google and found the search result contains 意 & 法 located in the nearest but different rows. 意 is at the end of the n-th row. 法 is at the beginning of n+1-th row [3].
  • Approach: (1) remove the html tag (2) remove the return symbol (Carriage return).

Ignore white spaces, Halfwidth and fullwidth symbol (半形字元和全形字元)

  • Examples:
    • Searched the keywords e.g. "嗎有" on Google and found the search result contains 嗎? 有 & 嗎- 有.
    • Searched the keywords e.g. "人物誌Persona" on Google and found the search result contains 人物誌(Persona), 人物誌(Persona) & 「人物誌」(persona).
  • Approach: (1) remove the space symbol (2) remove the Halfwidth and fullwidth symbol.
  • References: PHP remove symbols from string - Stack Overflow

Highlight search query keywords on resulting pages

Returned result: Show 10 characters before or after the search keywords. (cf: Total 130 ~ 240 characters on Google resulting pages.)

MySQL approach

SQL syntax

Input search keywords, and returned the the first occurrence of matched paragraph. Using MySQL SUBSTRING() function, POSITION() function & CHAR_LENGTH() function.


SET @term := "吸星大法";
SET @message := "笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。

原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html";


SELECT 
@message

, CASE
  WHEN POSITION(@term IN @message) > 0 THEN SUBSTRING(@message
        , IF(
            POSITION(@term IN @message) > 0 &&
            POSITION(@term IN @message) -10 < 0
            , 1
            , POSITION(@term IN @message) -10)
        , CHAR_LENGTH(@term) + 20
      )
  ELSE ''
END AS `scrapbook`

-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
-- 行比武時,以此功對付吸星大法,使其全身凍僵、天池

Run on sqlfiddle

Instruction of SQL syntax

(1) MySQL POSITION() function - w3resource "MySQL POSITION() returns the position of a substring within a string."

SET @term := "吸星大法";
SET @message := "笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。

原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html";
SELECT POSITION(@term IN @message)

-- > returns 46

(2) Avoid the the start position is 0 or negative. Minimum start position of each paragraph is 1.

SELECT IF(
            POSITION(@term IN @message) > 0 &&
            POSITION(@term IN @message) -10 < 0
            , 1
            , POSITION(@term IN @message) -10)

-- > returns 36 = 46 - 10

(3) Show 10 characters before or after the search keywords. MySQL SUBSTRING() function - w3resource"returns a specified number of characters from a particular position of a given string."


SELECT 
@message

, CASE
  WHEN POSITION(@term IN @message) > 0 THEN SUBSTRING(@message
        , IF(
            POSITION(@term IN @message) > 0 &&
            POSITION(@term IN @message) -10 < 0
            , 1
            , POSITION(@term IN @message) -10)
        , CHAR_LENGTH(@term) + 20
      )
  ELSE ''
END AS `scrapbook`;

-- > returns 行比武時,以此功對付吸星大法,使其全身凍僵、天池
SET @term := "吸星大法";
SET @message := "原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html";


SELECT 
@message

, CASE
  WHEN POSITION(@term IN @message) > 0 THEN SUBSTRING(@message
        , IF(
            POSITION(@term IN @message) > 0 &&
            POSITION(@term IN @message) -10 < 0
            , 1
            , POSITION(@term IN @message) -10)
        , CHAR_LENGTH(@term) + 20
      )
  ELSE ''
END AS `scrapbook`

-- Returned result of scrapbook column: Show 10 characters before or after the search keywords.
-- [EMPTY]

Google sheet approach

Using REGEXEXTRACT function Icon_exclaim.gif case-sensitive!:

A B
1 文章 笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。 原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html
2 關鍵字 吸星大法
3 搜尋結果摘要 =IF(ISERROR(REGEXEXTRACT(LOWER(B1), "(.{10}"&B2&".{10})")), "", REGEXEXTRACT(LOWER(B1), "(.{10}"&B2&".{10})"))

Microsoft Spreadsheet approach

Using FIND, MID & CONCATENATE functions. Icon_exclaim.gif FIND function is case-sensitive!

A B
1 文章 笑傲江湖中嵩山派掌門左冷禪所創掌法,可發出至陰至寒的真氣。左冷禪與任我行比武時,以此功對付吸星大法,使其全身凍僵、天池穴被封;與岳不群比劍奪帥時,左又使出寒冰神掌,與紫霞神功旗鼓相當、不分勝敗。 原文網址:https://kknews.cc/zh-tw/culture/xzaxbq.html
2 關鍵字 吸星大法
3 搜尋結果摘要 =IF(ISERROR(FIND(B2, B1)), "", CONCATENATE(MID(B1, IF(FIND(B2, B1)-10 >= 1, FIND(B2, B1)-10, 1), 10), MID(B1, FIND(B2, B1), 10+LEN(B2))))

Try it online

PHP approach

PHP solution: php - highlight multiple keywords in search - Stack Overflow Unverified

Ranking factors

Possibile factors

References or related articles

to explore strange new worlds / related articles:

other search cases: if the column ... (inspired by OutWit)

  • contains ____
  • does not contain ____
  • begins with ____
  • does not begins with ____
  • ends with ____
  • does not ends with ____
  • equals to ____
  • does not equal ____

references

Related news

MySQL OR nosql related news
AWS明年2月底自動對MySQL 5.7與PostgreSQL 11資料庫,啟用RDS延伸支援 - iThome Online
Amazon RDS MySQL 5.7版本停止更新,升級訣竅帶您一次掌握 - T客邦 Techbang
MongoDB攜朋昶 推廣NoSQL資料庫 - 經濟日報
MySQL與MongoDB:關聯式與非關聯式資料庫比較|Kevin_malamute |凱文-馬拉穆 - 方格子 - 方格子
【資安日報】11月16日,殭屍網路OracleIV、Ddostf分別鎖定Docker環境、MySQL伺服器而來,將其用於發動DDoS攻擊 - iThome Online
360 萬台以上MySQL 伺服器,曝露於Internet 上 - 資安人科技網
AWS Amplify開發框架整合CDK,支援MySQL與PostgreSQL資料庫連接 - iThome Online
實作MySQL備份還原詳實說明三種機制模式 - 網管人雜誌
甲骨文以機器學習強化MySQL Heatwave,叫戰Amazon Aurora、Google BigQuery - iThome Online
MySQL 8.0正式版來了! 高負載讀寫效能是5.7版的2倍 - iThome Online
Google資料庫服務Cloud SQL現支援MySQL 8 - iThome Online
Digitalocean擴充資料庫託管服務,新加入MySQL和Redis - iThome Online
【資安日報】2022年6月1日,逾360萬臺MySQL伺服器曝露於網際網路、中國駭客利用Office零時差漏洞Follina發動攻擊 - iThome Online
Google推出Cloud Spanner,兼具關聯式資料庫與NoSQL優點 - iThome Online
【資安日報】2022年7月29日,駭客竊取SQL Server與MySQL伺服器網路頻寬牟利、臺廠Moxa修補工控裝置伺服器設備 ... - iThome Online
AWS雲端資料庫Aurora正式支援無伺服器MySQL應用 - iThome Online
PostgreSQL 9.5來了,關聯式資料庫也要和NoSQL拼場 - iThome Online
MariaDB 10新版大躍進 - iThome Online
相容於MySQL的Amazon Aurora資料庫服務正式上線 - iThome Online
亞馬遜如何讓Aurora資料庫效能比MySQL快5倍 - iThome Online
兼具傳統關聯式資料庫與NoSQL優點,Google Cloud Spanner正式上線! - iThome Online
蘋果FoundationDB資料庫再升級,不只是鍵值NoSQL資料庫也新增文件資料庫功能 - iThome Online
借助NoSQL資料庫,Acer eDC日誌稽核系統主打超高效能分析 - iThome Online
MySQL爆最高權限漏洞,MariaDB、PerconaDB受累 - iThome Online
Splunk Hunk 6.1可存取Hadoop與NoSQL的大資料 - iThome Online
吞吐量比Cassandra大10倍的NoSQL資料庫Scylla Open Source 3.0釋出 - iThome Online

Powered by Google News