Simple data anonymization: Difference between revisions

From LemonWiki共筆
Jump to navigation Jump to search
No edit summary
No edit summary
Line 2: Line 2:


== 姓名的個資去識別化 ==
== 姓名的個資去識別化 ==
=== case1: 王小明 --> 王O明; 孤獨求敗 --> 孤O敗 ===
=== case1: 王小明 --> 王O明; 孤獨求敗 --> 孤OO敗 ===
 
ex:
* 楊過 -> 楊O
* 王小明 --> 王O明
* 孤獨求敗 --> 孤O敗
* Guo da-xia --> GOa
 
 
methods
* Excel:
** {{kbd | key=<nowiki>=IF(LEN(A2)=2, LEFT(A2, 1)&"O", LEFT(A2, 1)&REPT("O", 1)&RIGHT(A2, 1))</nowiki>}} also applied for 3 or 4 characters
 
=== case2: 王小明 --> 王O明; 孤獨求敗 --> 孤OO敗 ===
Reserve the first and last characters of name, and replace other characters with the symbol 'OO'.
Reserve the first and last characters of name, and replace other characters with the symbol 'OO'.


Line 52: Line 39:
     END;
     END;
</pre>
</pre>
=== case2: 王小明 --> 王O明; 孤獨求敗 --> 孤O敗 ===
ex:
* 楊過 -> 楊O
* 王小明 --> 王O明
* 孤獨求敗 --> 孤O敗
* Guo da-xia --> GOa
methods
* Excel:
** {{kbd | key=<nowiki>=IF(LEN(A2)=2, LEFT(A2, 1)&"O", LEFT(A2, 1)&REPT("O", 1)&RIGHT(A2, 1))</nowiki>}} also applied for 3 or 4 characters


=== case3: 王小明 --> 王OO; 孤獨求敗 --> 孤OO ===
=== case3: 王小明 --> 王OO; 孤獨求敗 --> 孤OO ===

Revision as of 11:44, 22 July 2017

使用 Excel 或 MySQL 資料庫查詢方式,簡易地將個人資料去識別化 (將姓名馬賽克) 。

姓名的個資去識別化

case1: 王小明 --> 王O明; 孤獨求敗 --> 孤OO敗

Reserve the first and last characters of name, and replace other characters with the symbol 'OO'.

ex:

  • 楊過 --> 楊O
  • 王小明 --> 王O明
  • 孤獨求敗 --> 孤OO敗
  • Guo da-xia --> GOOOOOOOOa

methods:

  • Excel:
    • =IF(LEN(A1)=2, LEFT(A1, 1)&"O", LEFT(A1, 1)&REPT("O", LEN(A1)-2)&RIGHT(A1, 1))
    • =REPLACE(A1, 2, 1, "O")[1] Icon_exclaim.gif only applied for 3 characters, NOT for 4 characters
  • PHP: using regular_replace
if(mb_strlen($string, "UTF-8") == 2){
   echo mb_substr($string, 0, 1, "UTF-8") . "O";

}else{
  $pattern = '/^(\X)(\X+)(\X)/u';
  preg_match($pattern, $string, $matches);
  echo $matches[1]. str_repeat("O", mb_strlen($string, "UTF-8") - 2) . $matches[3];

}
  • MySQL:
SET @name := "楊過";
-- SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";

SELECT CASE  
    WHEN CHAR_LENGTH(@name) =2 THEN CONCAT(LEFT(@name, 1), 'O')
    ELSE CONCAT(LEFT(@name, 1), REPEAT('O', CHAR_LENGTH(@name)-2), RIGHT(@name, 1))
    END;

case2: 王小明 --> 王O明; 孤獨求敗 --> 孤O敗

ex:

  • 楊過 -> 楊O
  • 王小明 --> 王O明
  • 孤獨求敗 --> 孤O敗
  • Guo da-xia --> GOa


methods

  • Excel:
    • =IF(LEN(A2)=2, LEFT(A2, 1)&"O", LEFT(A2, 1)&REPT("O", 1)&RIGHT(A2, 1)) also applied for 3 or 4 characters


case3: 王小明 --> 王OO; 孤獨求敗 --> 孤OO

Reserve the first character of name, and replace other characters with the symbol 'OO'.

ex:

  • 楊過 -> 楊OO
  • 王小明 --> 王OO
  • 孤獨求敗 --> 孤OO
  • Guo da-xia --> GOO


methods

  • Excel:
    • =REPLACE(A2, 2, LEN(A2)-1, "OO") also applied for 3 or 4 characters
    • =REPLACE(A2, 2, 2, "O") Icon_exclaim.gif only applied for 3 characters, NOT for 4 characters
  • MySQL:
-- SET @name := "楊過";
SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";

SELECT CONCAT(LEFT(@name, 1), 'OO');

case4: 王小明 --> 王OO; 孤獨求敗 --> 孤OO敗

Replace the second and third characters with the symbol 'OO', and reserve other characters.

  • 楊過 --> 楊OO
  • 王小明 --> 王OO
  • 孤獨求敗 --> 孤OO敗
  • Guo da-xia --> GOO da-xia

methods:

  • Excel
    • =REPLACE(A1, 2, 2, "OO")
  • PHP
//$string = '楊過';
$string = '王小明';
//$string = '孤獨求敗';
//$string = 'Guo da-xia';

$pattern = '/^(\X)(\X\X?)(\X?)/u';
$replacement = '$1OO$3';
echo preg_replace($pattern, $replacement, $string);
  • MySQL
-- SET @name := "楊過";
SET @name := "王小明";
-- SET @name := "孤獨求敗";
-- SET @name := "Guo da-xia";

SELECT CONCAT(LEFT(@name, 1), 'OO', RIGHT(@name, CHAR_LENGTH(@name)-3));


手機號碼的個資去識別化

case1: 0912345678 --> 09123XXXXX8

methods:

  • Excel
    • =LEFT(A2, 5)&REPT("X", 5)&RIGHT(A2, 1)

references

further reading

related terms