Revision as of 10:25, 29 July 2016
Simple data anonymization: simple de-identification of personal data using Excel formulas or MySQL database queries
case1: 王小明 --> 王OO
ex:
- 楊過 --> 楊OO
- 王小明 --> 王OO
- 孤獨求敗 --> 孤OO
- Guo da-xia --> GOO
methods:
- Excel:
- =REPLACE(A2, 2, LEN(A2)-1, "OO") works for names of any length (2, 3, 4 or more characters)
- =REPLACE(A2, 2, 2, "OO")
works only for names of up to 3 characters, NOT for 4 (孤獨求敗 becomes 孤OO敗)
- MySQL:
-- SET @name := "楊過";
SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";
SELECT CONCAT(LEFT(@name, 1), 'OO');
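As a cross-check of the case 1 rule (keep the first character, replace everything after it with a fixed "OO"), here is a minimal Python sketch. The function name `mask_keep_first` and its `filler` parameter are made up for this illustration and are not part of the original page; it only mirrors what the Excel and MySQL one-liners above do.

```python
def mask_keep_first(name: str, filler: str = "OO") -> str:
    """Keep the first character of name and replace the rest with filler.

    Hypothetical helper for illustration; mirrors CONCAT(LEFT(@name, 1), 'OO').
    """
    if not name:
        # Nothing to mask in an empty string.
        return name
    return name[0] + filler


if __name__ == "__main__":
    # The four sample names used in the examples above.
    for sample in ["楊過", "王小明", "孤獨求敗", "Guo da-xia"]:
        print(sample, "-->", mask_keep_first(sample))
```

Because the filler has a fixed length, the masked value no longer reveals how long the original name was.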
case2: 王小明 --> 王O明
ex:
- 楊過 --> 楊O
- 王小明 --> 王O明
- 孤獨求敗 --> 孤OO敗
- Guo da-xia --> GOOOOOOOOa
methods:
- Excel:
- =IF(LEN(A1)=2, LEFT(A1, 1)&"O", LEFT(A1, 1)&REPT("O", LEN(A1)-2)&RIGHT(A1, 1))
- =REPLACE(A1, 2, 1, "O")[1]
works only for names of 2 or 3 characters, NOT for 4 or more (孤獨求敗 becomes 孤O求敗)
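- PHP: using a regular expression (preg_match)
$string = "孤獨求敗"; // sample input (assumed; the original snippet does not define $string)
if (mb_strlen($string, "UTF-8") == 2) {
    // two characters: keep the first, mask the second
    echo mb_substr($string, 0, 1, "UTF-8") . "O";
} else {
    // three or more characters: capture the first and last, mask everything in between
    $pattern = '/^(\X)(\X+)(\X)/u';
    preg_match($pattern, $string, $matches);
    echo $matches[1] . str_repeat("O", mb_strlen($string, "UTF-8") - 2) . $matches[3];
}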
- MySQL:
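SET @name := "楊過";
-- SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";
SELECT CASE
  -- two characters: keep the first, mask the second
  WHEN CHAR_LENGTH(@name) = 2 THEN CONCAT(LEFT(@name, 1), 'O')
  -- otherwise: keep the first and last characters, mask everything in between
  ELSE CONCAT(LEFT(@name, 1), REPEAT('O', CHAR_LENGTH(@name) - 2), RIGHT(@name, 1))
END;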
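The case 2 rule (keep the first and last characters, mask everything in between, with a special case for two-character names) can be sketched in Python as follows. The name `mask_middle` and the `mask_char` parameter are hypothetical and chosen only for this illustration; the handling of names shorter than two characters is an added assumption, since the original formulas do not cover it.

```python
def mask_middle(name: str, mask_char: str = "O") -> str:
    """Mask the middle of name, keeping its first and last characters.

    Hypothetical helper for illustration; follows the same branches as the
    Excel IF(...) formula and the MySQL CASE expression above.
    """
    if len(name) < 2:
        # Assumption: names of 0 or 1 characters are returned unchanged.
        return name
    if len(name) == 2:
        # Two characters: keep the first, mask the second.
        return name[0] + mask_char
    # Three or more: keep first and last, mask the len(name) - 2 in between.
    return name[0] + mask_char * (len(name) - 2) + name[-1]


if __name__ == "__main__":
    # The four sample names used in the examples above.
    for sample in ["楊過", "王小明", "孤獨求敗", "Guo da-xia"]:
        print(sample, "-->", mask_middle(sample))
```

Unlike case 1, the output keeps the original length, so the number of characters in the name stays visible after masking.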
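reference
[1] Excel 2010: replacing the second character of a name with O @ 軟體使用教學 :: 隨意窩 Xuite日誌. http://blog.xuite.net/yh96301/blog/80724141-Excel+2010%E5%A7%93%E5%90%8D%E7%9A%84%E7%AC%AC%E4%BA%8C%E5%80%8B%E5%AD%97%E5%8F%96%E4%BB%A3%E7%82%BAO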
further reading