Difference between revisions of "Named entity recognition tools"

From LemonWiki共筆
Line 1: Line 1:
Named entity recognition (NER), also called 命名實體辨識 or 專有名詞辨識 (proper noun recognition) in Chinese
+
Named entity recognition (NER), also called [https://zh.wikipedia.org/wiki/%E5%91%BD%E5%90%8D%E5%AE%9E%E4%BD%93%E8%AF%86%E5%88%AB 命名實體識別] or 專有名詞識別 (proper noun recognition) in Chinese
  
 
== CKIP Neural Chinese Word Segmentation, POS Tagging, and NER ==
 
== CKIP Neural Chinese Word Segmentation, POS Tagging, and NER ==
Line 8: Line 8:
  
 
<table border="1" class="wikitable sortable">
 
<table border="1" class="wikitable sortable">
<tr><th>Class name in English</th><th>Class name in Chinese</th></tr>
+
<tr><th>Class name in English</th><th>Class name in Traditional Chinese</th></tr>
 
<tr><td>person</td><td>人名</td></tr>
 
<tr><td>person</td><td>人名</td></tr>
 
<tr><td>norp</td><td>團體</td></tr>
 
<tr><td>norp</td><td>團體</td></tr>
Line 34: Line 34:
 
* language support:
 
* language support:
 
* classes of entity: "For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes). <ref>[https://stanfordnlp.github.io/CoreNLP/ner.html#description Named Entity Recognition – NERClassifierCombiner | Stanford CoreNLP]</ref>"
 
* classes of entity: "For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes). <ref>[https://stanfordnlp.github.io/CoreNLP/ner.html#description Named Entity Recognition – NERClassifierCombiner | Stanford CoreNLP]</ref>"
 +
 +
== spaCy ==
 +
[https://spacy.io/ spaCy · Industrial-strength Natural Language Processing in Python]
 +
* license: [https://github.com/explosion/spaCy/blob/master/LICENSE MIT License] {{Gd}}
 +
* language support:
 +
* classes of entity: "PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL and CARDINAL <ref>[https://spacy.io/api/annotation#named-entities Annotation Specifications · spaCy API Documentation]</ref>"
  
 
== Google Cloud Natural Language ==
 
== Google Cloud Natural Language ==
Line 53: Line 59:
 
* classes of entity: "Date, Duration, EmailAddress, Facility, GeographicFeature, Hashtag, IPAddress, JobTitle, Location and more ..."<ref>[https://cloud.ibm.com/docs/services/natural-language-understanding?topic=natural-language-understanding-entity-types-version-2&locale=en Entity types (Version 2)]</ref>
 
* classes of entity: "Date, Duration, EmailAddress, Facility, GeographicFeature, Hashtag, IPAddress, JobTitle, Location and more ..."<ref>[https://cloud.ibm.com/docs/services/natural-language-understanding?topic=natural-language-understanding-entity-types-version-2&locale=en Entity types (Version 2)]</ref>
  
== spaCy ==
+
== 卓騰語言科技中文斷詞 ==
[https://spacy.io/ spaCy · Industrial-strength Natural Language Processing in Python]
+
[https://api.droidtown.co/ 卓騰語言科技中文斷詞 API]
 +
* license:
 +
* language support: Traditional Chinese
 +
* classes of entity: "person, location, time, measurement and more ... <ref>[https://api.droidtown.co/document/ 卓騰語言科技中文斷詞 API]</ref>"
 +
 
 +
== BosonNLP ==
 +
[https://bosonnlp.com/ BosonNLP]
 
* license:  
 
* license:  
* language support:
+
* language support: Simplified Chinese
* classes of entity: "PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL and CARDINAL <ref>[https://spacy.io/api/annotation#named-entities Annotation Specifications · spaCy API Documentation]</ref>"
+
* classes of entity: "time, location, person_name, org_name, company_name, product_name and job_title <ref>[http://docs.bosonnlp.com/ner.html 命名实体识别 — BosonNLP HTTP API 1.0 documentation]</ref>"
 +
 
 +
<table border="1" class="wikitable sortable">
 +
<tr><th>Class name in English</th><th>Class name in Simplified Chinese</th><th>Class name in Traditional Chinese</th></tr>
 +
<tr><td>time</td><td>时间</td><td>時間</td></tr>
 +
<tr><td>location</td><td>地点</td><td>地點</td></tr>
 +
<tr><td>person_name</td><td>人名</td><td>人名</td></tr>
 +
<tr><td>org_name</td><td>组织名</td><td>組織名</td></tr>
 +
<tr><td>company_name</td><td>公司名</td><td>公司名</td></tr>
 +
<tr><td>product_name</td><td>产品名</td><td>產品名</td></tr>
 +
<tr><td>job_title</td><td>职位</td><td>職位</td></tr>
 +
</table>
  
 
== References ==
 
== References ==

Revision as of 00:39, 12 September 2019

Named entity recognition (NER), also called 命名實體識別 or 專有名詞識別 (proper noun recognition) in Chinese

CKIP Neural Chinese Word Segmentation, POS Tagging, and NER

ckiplab/ckiptagger: CKIP Neural Chinese Word Segmentation, POS Tagging, and NER

Class name in English | Class name in Traditional Chinese
person | 人名
norp | 團體
facility | 設施
organization | 組織
gpe | 地理
location | 地點
product | 商品
event | 事件
work of art | 藝術品
law | 法律
language | 語言
date | 日期
time | 時間
percent | 比例
money |
quantity | 數量
ordinal | 序數
cardinal | 數詞
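
The class table above can be carried around as a lookup table, for example when relabeling tagger output for a Chinese-language display. The following is a minimal sketch, assuming labels arrive as strings such as "person" or "WORK_OF_ART"; the names `CKIP_CLASS_ZH` and `to_chinese` are hypothetical helpers for illustration and are not part of ckiptagger.

```python
from typing import Optional

# Class names copied from the table above. The Chinese name for "money"
# is missing in the source table, so it is left as None here.
CKIP_CLASS_ZH = {
    "person": "人名",
    "norp": "團體",
    "facility": "設施",
    "organization": "組織",
    "gpe": "地理",
    "location": "地點",
    "product": "商品",
    "event": "事件",
    "work of art": "藝術品",
    "law": "法律",
    "language": "語言",
    "date": "日期",
    "time": "時間",
    "percent": "比例",
    "money": None,
    "quantity": "數量",
    "ordinal": "序數",
    "cardinal": "數詞",
}


def to_chinese(label: str) -> Optional[str]:
    """Return the Traditional Chinese class name, or None if unknown or missing."""
    # Normalize case and underscores so "WORK_OF_ART" matches "work of art".
    key = label.strip().lower().replace("_", " ")
    return CKIP_CLASS_ZH.get(key)
```

Normalizing the key is a design choice made for this sketch so that both the spaced form used in the table and an underscore form map to the same entry.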

Stanford CoreNLP

Stanford CoreNLP – Natural language software | Stanford CoreNLP

  • license: GNU General Public License v3 Good!
  • language support:
  • classes of entity: "For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes)." [2]
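
The three groups in the quoted description can be written down as a small data structure, which makes the "12 classes" count easy to check. This is only a sketch of the label inventory quoted above; `CORENLP_DEFAULT_CLASSES` and `group_of` are hypothetical names and not part of the CoreNLP API.

```python
from typing import Optional

# Default English entity classes, grouped as in the quoted CoreNLP description.
CORENLP_DEFAULT_CLASSES = {
    "named": ["PERSON", "LOCATION", "ORGANIZATION", "MISC"],
    "numerical": ["MONEY", "NUMBER", "ORDINAL", "PERCENT"],
    "temporal": ["DATE", "TIME", "DURATION", "SET"],
}


def group_of(label: str) -> Optional[str]:
    """Return the group ("named", "numerical", "temporal") a label belongs to."""
    for group, labels in CORENLP_DEFAULT_CLASSES.items():
        if label in labels:
            return group
    return None


# Total number of classes across the three groups.
total_classes = sum(len(labels) for labels in CORENLP_DEFAULT_CLASSES.values())
```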

spaCy

spaCy · Industrial-strength Natural Language Processing in Python

  • license: MIT License Good!
  • language support:
  • classes of entity: "PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL and CARDINAL" [3]
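
Since this page compares tools by their entity classes, it can help to see which label strings two tools share. The sketch below compares the spaCy list with the default English Stanford CoreNLP list quoted on this page, using plain Python sets. It matches label strings exactly, so near-equivalents such as LOC and LOCATION are not treated as the same class; the variable names are hypothetical.

```python
# Label inventories copied from the lists quoted on this page.
SPACY_LABELS = {
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT", "MONEY",
    "QUANTITY", "ORDINAL", "CARDINAL",
}
CORENLP_LABELS = {
    "PERSON", "LOCATION", "ORGANIZATION", "MISC", "MONEY", "NUMBER",
    "ORDINAL", "PERCENT", "DATE", "TIME", "DURATION", "SET",
}

# Labels spelled identically in both tools.
shared = sorted(SPACY_LABELS & CORENLP_LABELS)
# Labels that appear only in the spaCy list.
spacy_only = sorted(SPACY_LABELS - CORENLP_LABELS)

print(shared)
```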

Google Cloud Natural Language

Cloud Natural Language  |  Cloud Natural Language API  |  Google Cloud

Amazon Comprehend

Amazon Comprehend – Natural Language Processing (NLP) and Machine Learning (ML)

  • license:
  • language support:
  • classes of entity: "COMMERCIAL_ITEM, DATE, EVENT, LOCATION, ORGANIZATION, OTHER, PERSON, QUANTITY and TITLE"[4]

IBM Watson

Watson Natural Language Understanding

  • license:
  • language support:
  • classes of entity: "Date, Duration, EmailAddress, Facility, GeographicFeature, Hashtag, IPAddress, JobTitle, Location and more ..."[5]

卓騰語言科技中文斷詞

卓騰語言科技中文斷詞 API

  • license:
  • language support: Traditional Chinese
  • classes of entity: "person, location, time, measurement and more ..." [6]

BosonNLP

BosonNLP

  • license:
  • language support: Simplified Chinese
  • classes of entity: "time, location, person_name, org_name, company_name, product_name and job_title" [7]

Class name in English | Class name in Simplified Chinese | Class name in Traditional Chinese
time | 时间 | 時間
location | 地点 | 地點
person_name | 人名 | 人名
org_name | 组织名 | 組織名
company_name | 公司名 | 公司名
product_name | 产品名 | 產品名
job_title | 职位 | 職位
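
The BosonNLP class names above can likewise be kept as a lookup keyed by the English label, with one entry per script. This is a minimal sketch for display purposes only; `BOSON_CLASS_NAMES` and `display_name` are hypothetical names and it does not call the BosonNLP API.

```python
from typing import NamedTuple, Optional


class ClassName(NamedTuple):
    simplified: str
    traditional: str


# Names copied from the BosonNLP class table above.
BOSON_CLASS_NAMES = {
    "time": ClassName("时间", "時間"),
    "location": ClassName("地点", "地點"),
    "person_name": ClassName("人名", "人名"),
    "org_name": ClassName("组织名", "組織名"),
    "company_name": ClassName("公司名", "公司名"),
    "product_name": ClassName("产品名", "產品名"),
    "job_title": ClassName("职位", "職位"),
}


def display_name(label: str, script: str = "traditional") -> Optional[str]:
    """Return the Chinese class name in the requested script, or None if unknown."""
    if script not in ("simplified", "traditional"):
        raise ValueError("script must be 'simplified' or 'traditional'")
    entry = BOSON_CLASS_NAMES.get(label)
    if entry is None:
        return None
    return getattr(entry, script)
```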

References