Simple data anonymization: Difference between revisions
Jump to navigation
Jump to search
m (Text replacement - "== references ==" to "== References ==") |
|||
| (11 intermediate revisions by one other user not shown) | |||
| Line 1: | Line 1: | ||
使用 Excel 或 MySQL 資料庫查詢方式,簡易地將個人資料去識別化 (將姓名馬賽克) 。 | |||
== 姓名的個資去識別化 == | |||
=== case1: 王小明 --> 王O明; 孤獨求敗 --> 孤OO敗 === | |||
Reserve the first and last characters of name, and replace other characters with the symbol 'OO'. | |||
ex: | ex: | ||
* 楊過 --> 楊O | * 楊過 --> 楊O | ||
| Line 35: | Line 14: | ||
* Excel: | * Excel: | ||
** {{kbd | key=<nowiki>=IF(LEN(A1)=2, LEFT(A1, 1)&"O", LEFT(A1, 1)&REPT("O", LEN(A1)-2)&RIGHT(A1, 1))</nowiki>}} | ** {{kbd | key=<nowiki>=IF(LEN(A1)=2, LEFT(A1, 1)&"O", LEFT(A1, 1)&REPT("O", LEN(A1)-2)&RIGHT(A1, 1))</nowiki>}} | ||
** {{kbd | key=<nowiki>=REPLACE(A1, 2, 1, "O")</nowiki>}}<ref>[http://blog.xuite.net/yh96301/blog/80724141-Excel+2010%E5%A7%93%E5%90%8D%E7%9A%84%E7%AC%AC%E4%BA%8C%E5%80%8B%E5%AD%97%E5%8F%96%E4%BB%A3%E7%82%BAO Excel 2010姓名的第二個字取代為O @ 軟體使用教學 :: 隨意窩 Xuite日誌]</ref> {{exclaim}} only applied for 3 | ** {{kbd | key=<nowiki>=REPLACE(A1, 2, 1, "O")</nowiki>}}<ref>[http://blog.xuite.net/yh96301/blog/80724141-Excel+2010%E5%A7%93%E5%90%8D%E7%9A%84%E7%AC%AC%E4%BA%8C%E5%80%8B%E5%AD%97%E5%8F%96%E4%BB%A3%E7%82%BAO Excel 2010姓名的第二個字取代為O @ 軟體使用教學 :: 隨意窩 Xuite日誌]</ref> {{exclaim}} only applied for 3 characters, NOT for 4 characters | ||
* PHP: using regular_replace | * PHP: using regular_replace | ||
<pre> | <pre> | ||
| Line 60: | Line 39: | ||
END; | END; | ||
</pre> | </pre> | ||
=== case2: 王小明 --> 王O明; 孤獨求敗 --> 孤O敗 === | |||
ex: | |||
* 楊過 -> 楊O | |||
* 王小明 --> 王O明 | |||
* 孤獨求敗 --> 孤O敗 | |||
* Guo da-xia --> GOa | |||
methods | |||
* Excel: | |||
** {{kbd | key=<nowiki>=IF(LEN(A2)=2, LEFT(A2, 1)&"O", LEFT(A2, 1)&REPT("O", 1)&RIGHT(A2, 1))</nowiki>}} also applied for 3 or 4 characters | |||
=== case3: 王小明 --> 王OO; 孤獨求敗 --> 孤OO === | |||
Reserve the first character of name, and replace other characters with the symbol 'OO'. | |||
ex: | |||
* 楊過 -> 楊OO | |||
* 王小明 --> 王OO | |||
* 孤獨求敗 --> 孤OO | |||
* Guo da-xia --> GOO | |||
methods | |||
* Excel: | |||
** {{kbd | key=<nowiki>=REPLACE(A2, 2, LEN(A2)-1, "OO")</nowiki>}} also applied for 3 or 4 characters | |||
** {{kbd | key=<nowiki>=REPLACE(A2, 2, 2, "O")</nowiki>}} {{exclaim}} only applied for 3 characters, NOT for 4 characters | |||
* MySQL: | |||
<pre> | |||
-- SET @name := "楊過"; | |||
SET @name := "王小明"; | |||
-- SET @name := "孤獨求敗"; | |||
-- SET @name := "Guo da-xia"; | |||
SELECT CONCAT(LEFT(@name, 1), 'OO'); | |||
</pre> | |||
=== case4: 王小明 --> 王OO; 孤獨求敗 --> 孤OO敗 === | |||
Replace the second and third characters with the symbol 'OO', and reserve other characters. | |||
* 楊過 --> 楊OO | |||
* 王小明 --> 王OO | |||
* 孤獨求敗 --> 孤OO敗 | |||
* Guo da-xia --> GOO da-xia | |||
methods: | |||
* Excel | |||
** {{kbd | key=<nowiki>=REPLACE(A1, 2, 2, "OO")</nowiki>}} | |||
* PHP | |||
<pre> | |||
//$string = '楊過'; | |||
$string = '王小明'; | |||
//$string = '孤獨求敗'; | |||
//$string = 'Guo da-xia'; | |||
$pattern = '/^(\X)(\X\X?)(\X?)/u'; | |||
$replacement = '$1OO$3'; | |||
echo preg_replace($pattern, $replacement, $string); | |||
</pre> | |||
* MySQL | |||
<pre> | |||
-- SET @name := "楊過"; | |||
SET @name := "王小明"; | |||
-- SET @name := "孤獨求敗"; | |||
-- SET @name := "Guo da-xia"; | |||
SELECT CONCAT(LEFT(@name, 1), 'OO', RIGHT(@name, CHAR_LENGTH(@name)-3)); | |||
</pre> | |||
== 手機號碼的個資去識別化 == | |||
=== case1: 0912345678 --> 09123XXXXX8 === | |||
methods: | |||
* Excel | |||
** {{kbd | key=<nowiki>=LEFT(A2, 5)&REPT("X", 5)&RIGHT(A2, 1)</nowiki>}} | |||
== References == | |||
<references /> | <references /> | ||
further reading | further reading | ||
* [https://vtaiwan.tw/personal-data-protection/init/ 個資利用與去識別化 - vTaiwan.tw] | * [https://vtaiwan.tw/personal-data-protection/init/ 個資利用與去識別化 - vTaiwan.tw] | ||
* [http://www.cc.ntu.edu.tw/chinese/epaper/0040/20170320_4008.html 淺談個資遮罩] | |||
related terms | |||
* [https://en.wikipedia.org/wiki/Data_masking Data masking] ([http://www.ithome.com.tw/tech/87443 資料遮罩]) | |||
[[Category:Data transformation]] [[Category:Data Science]] [[Category:Excel]] | [[Category:Data transformation]] [[Category:Data Science]] [[Category:Excel]] | ||
Latest revision as of 17:13, 10 July 2022
使用 Excel 或 MySQL 資料庫查詢方式,簡易地將個人資料去識別化 (將姓名馬賽克) 。
姓名的個資去識別化[edit]
case1: 王小明 --> 王O明; 孤獨求敗 --> 孤OO敗[edit]
Reserve the first and last characters of name, and replace other characters with the symbol 'OO'.
ex:
- 楊過 --> 楊O
- 王小明 --> 王O明
- 孤獨求敗 --> 孤OO敗
- Guo da-xia --> GOOOOOOOOa
methods:
- Excel:
- =IF(LEN(A1)=2, LEFT(A1, 1)&"O", LEFT(A1, 1)&REPT("O", LEN(A1)-2)&RIGHT(A1, 1))
- =REPLACE(A1, 2, 1, "O")[1]
only applied for 3 characters, NOT for 4 characters
- PHP: using regular_replace
if(mb_strlen($string, "UTF-8") == 2){
echo mb_substr($string, 0, 1, "UTF-8") . "O";
}else{
$pattern = '/^(\X)(\X+)(\X)/u';
preg_match($pattern, $string, $matches);
echo $matches[1]. str_repeat("O", mb_strlen($string, "UTF-8") - 2) . $matches[3];
}
- MySQL:
SET @name := "楊過";
-- SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";
SELECT CASE
WHEN CHAR_LENGTH(@name) =2 THEN CONCAT(LEFT(@name, 1), 'O')
ELSE CONCAT(LEFT(@name, 1), REPEAT('O', CHAR_LENGTH(@name)-2), RIGHT(@name, 1))
END;
case2: 王小明 --> 王O明; 孤獨求敗 --> 孤O敗[edit]
ex:
- 楊過 -> 楊O
- 王小明 --> 王O明
- 孤獨求敗 --> 孤O敗
- Guo da-xia --> GOa
methods
- Excel:
- =IF(LEN(A2)=2, LEFT(A2, 1)&"O", LEFT(A2, 1)&REPT("O", 1)&RIGHT(A2, 1)) also applied for 3 or 4 characters
case3: 王小明 --> 王OO; 孤獨求敗 --> 孤OO[edit]
Reserve the first character of name, and replace other characters with the symbol 'OO'.
ex:
- 楊過 -> 楊OO
- 王小明 --> 王OO
- 孤獨求敗 --> 孤OO
- Guo da-xia --> GOO
methods
- Excel:
- =REPLACE(A2, 2, LEN(A2)-1, "OO") also applied for 3 or 4 characters
- =REPLACE(A2, 2, 2, "O")
only applied for 3 characters, NOT for 4 characters
- MySQL:
-- SET @name := "楊過"; SET @name := "王小明"; -- SET @name := "孤獨求敗"; -- SET @name := "Guo da-xia"; SELECT CONCAT(LEFT(@name, 1), 'OO');
case4: 王小明 --> 王OO; 孤獨求敗 --> 孤OO敗[edit]
Replace the second and third characters with the symbol 'OO', and reserve other characters.
- 楊過 --> 楊OO
- 王小明 --> 王OO
- 孤獨求敗 --> 孤OO敗
- Guo da-xia --> GOO da-xia
methods:
- Excel
- =REPLACE(A1, 2, 2, "OO")
- PHP
//$string = '楊過'; $string = '王小明'; //$string = '孤獨求敗'; //$string = 'Guo da-xia'; $pattern = '/^(\X)(\X\X?)(\X?)/u'; $replacement = '$1OO$3'; echo preg_replace($pattern, $replacement, $string);
- MySQL
-- SET @name := "楊過"; SET @name := "王小明"; -- SET @name := "孤獨求敗"; -- SET @name := "Guo da-xia"; SELECT CONCAT(LEFT(@name, 1), 'OO', RIGHT(@name, CHAR_LENGTH(@name)-3));
手機號碼的個資去識別化[edit]
case1: 0912345678 --> 09123XXXXX8[edit]
methods:
- Excel
- =LEFT(A2, 5)&REPT("X", 5)&RIGHT(A2, 1)
References[edit]
further reading
related terms