Named entity recognition tools
Revision as of 09:50, 9 April 2020
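Named entity recognition (NER) is also known in Chinese as 命名實體識別, 實體識別, or 專有名詞辨識 (named entity identification, entity identification, or proper noun recognition).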
CKIP Neural Chinese Word Segmentation, POS Tagging, and NER
ckiplab/ckiptagger: CKIP Neural Chinese Word Segmentation, POS Tagging, and NER
- license: GNU General Public License v3.0
- language support: Traditional Chinese
- programming language: Python
- classes of entity[1]
Class name in English | Class name in Traditional Chinese |
---|---|
person | 人名 |
norp | 團體 |
FAC | 設施 |
facility | 設施* |
ORG | 組織 |
organization | 組織* |
gpe | 地理 |
LOC | 地點 |
location | 地點* |
product | 商品 |
event | 事件 |
WORK | 藝術品 |
work of art | 藝術品* |
law | 法律 |
language | 語言 |
date | 日期 |
time | 時間 |
percent | 比例 |
money | 錢 |
quantity | 數量 |
ordinal | 序數 |
cardinal | 數詞 |
- Note: the wildcard symbol (*) marks a class whose English name differs from the row above but whose Chinese name is the same.
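The class table above can be turned into a small lookup, which is handy when displaying tagger output with Chinese labels. This is a minimal sketch: the rows are copied from the table, but the names `CKIP_CLASS_ZH` and `zh_class_name` and the normalization rule (lowercasing, treating underscores as spaces) are assumptions made for illustration, not part of ckiptagger.

```python
from typing import Optional

# Rows copied from the class table above. Keys are lowercased for lookup;
# the asterisk that marks repeated Chinese names is dropped.
CKIP_CLASS_ZH = {
    "person": "人名",
    "norp": "團體",
    "fac": "設施",
    "facility": "設施",
    "org": "組織",
    "organization": "組織",
    "gpe": "地理",
    "loc": "地點",
    "location": "地點",
    "product": "商品",
    "event": "事件",
    "work": "藝術品",
    "work of art": "藝術品",
    "law": "法律",
    "language": "語言",
    "date": "日期",
    "time": "時間",
    "percent": "比例",
    "money": "錢",
    "quantity": "數量",
    "ordinal": "序數",
    "cardinal": "數詞",
}


def zh_class_name(label: str) -> Optional[str]:
    """Return the Traditional Chinese class name, or None if the label is unknown.

    Normalization (an assumption for this sketch): ignore case and
    treat underscores as spaces, so "WORK_OF_ART" matches "work of art".
    """
    key = label.strip().lower().replace("_", " ")
    return CKIP_CLASS_ZH.get(key)


print(zh_class_name("FAC"))
print(zh_class_name("WORK_OF_ART"))
```

Returning `None` for unknown labels, rather than raising, leaves it to the caller to decide how to handle a class that is missing from the table.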
Stanford CoreNLP
Stanford CoreNLP – Natural language software | Stanford CoreNLP
- license: GNU General Public License v3
- language support:
- programming language: Java
- classes of entity: "For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes). [2]"
spaCy
spaCy · Industrial-strength Natural Language Processing in Python
- license: MIT License
- language support:
- programming language: Python
- classes of entity: "PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL and CARDINAL [3]"
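Once a tagger has produced (text, label) pairs, grouping them by class is a common next step. The sketch below uses only the standard library; the helper name `group_by_label` and the sample pairs are hypothetical. With spaCy installed and a model downloaded, such pairs could be obtained roughly as shown in the comment.

```python
from collections import defaultdict

# With spaCy, the pairs could be produced like this (not run here):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   pairs = [(ent.text, ent.label_) for ent in nlp(text).ents]


def group_by_label(pairs):
    """Group entity texts by their class label, keeping input order."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


# Hypothetical sample, hand-labeled with classes from the list above.
sample = [
    ("Stanford University", "ORG"),
    ("California", "GPE"),
    ("9 April 2020", "DATE"),
    ("Google", "ORG"),
]

for label, texts in group_by_label(sample).items():
    print(label, texts)
```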
Google Cloud Natural Language
Cloud Natural Language | Cloud Natural Language API | Google Cloud
- license:
- language support: Language Support | Cloud Natural Language API | Google Cloud (includes Traditional Chinese)
- programming language: multiple
- classes of entity: see "Type of the entity" in Details on Entity | Cloud Natural Language API | Google Cloud, e.g. "UNKNOWN, PERSON, LOCATION, ORGANIZATION, EVENT, WORK_OF_ART, CONSUMER_GOOD, OTHER, PHONE_NUMBER, ADDRESS, DATE, NUMBER and PRICE"
Amazon Comprehend
Amazon Comprehend – Natural Language Processing (NLP) and Machine Learning (ML)
- license:
- language support:
- programming language:
- classes of entity: "COMMERCIAL_ITEM, DATE, EVENT, LOCATION, ORGANIZATION, OTHER, PERSON, QUANTITY and TITLE"[4]
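The class lists quoted above for Stanford CoreNLP, spaCy, Google Cloud Natural Language, and Amazon Comprehend overlap only partly, which matters when combining output from several tools. The sketch below compares the label strings exactly as quoted on this page. The names `LABELS` and `shared_labels` are illustrative, and the Google list is introduced with "e.g." in the source, so it may be incomplete.

```python
# Label sets copied from the quotations on this page.
LABELS = {
    "Stanford CoreNLP": {
        "PERSON", "LOCATION", "ORGANIZATION", "MISC",
        "MONEY", "NUMBER", "ORDINAL", "PERCENT",
        "DATE", "TIME", "DURATION", "SET",
    },
    "spaCy": {
        "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
        "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT",
        "MONEY", "QUANTITY", "ORDINAL", "CARDINAL",
    },
    "Google Cloud Natural Language": {
        "UNKNOWN", "PERSON", "LOCATION", "ORGANIZATION", "EVENT",
        "WORK_OF_ART", "CONSUMER_GOOD", "OTHER", "PHONE_NUMBER",
        "ADDRESS", "DATE", "NUMBER", "PRICE",
    },
    "Amazon Comprehend": {
        "COMMERCIAL_ITEM", "DATE", "EVENT", "LOCATION", "ORGANIZATION",
        "OTHER", "PERSON", "QUANTITY", "TITLE",
    },
}


def shared_labels(label_sets):
    """Return the labels that appear, spelled identically, in every tool."""
    return set.intersection(*label_sets.values())


print(sorted(shared_labels(LABELS)))  # ['DATE', 'PERSON']
```

Exact string matching is deliberately strict: spaCy's `ORG` and `LOC` do not match `ORGANIZATION` and `LOCATION` in the other tools, so a real comparison would need an explicit alias table.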
IBM Watson
Watson Natural Language Understanding
- license:
- language support:
- programming language:
- classes of entity: "Date, Duration, EmailAddress, Facility, GeographicFeature, Hashtag, IPAddress, JobTitle, Location and more ..."[5]
卓騰語言科技 Chinese word segmentation
- license:
- language support: Traditional Chinese
- programming language:
- classes of entity: "person, location, time, measurement and more ... [6]"
BosonNLP
- license:
- language support: Simplified Chinese
- programming language: multiple
- classes of entity: "time, location, person_name, org_name, company_name, product_name and job_title [7]"
Class name in English | Class name in Simplified Chinese | Class name in Traditional Chinese |
---|---|---|
time | 时间 | 時間 |
location | 地点 | 地點 |
person_name | 人名 | 人名 |
org_name | 组织名 | 組織名 |
company_name | 公司名 | 公司名 |
product_name | 产品名 | 產品名 |
job_title | 职位 | 職位 |
Baidu AI Open Platform (百度AI开放平台)
Basic language processing technology - Baidu AI Open Platform, "proper name recognition" (专名识别)[8]
- license:
- language support: Simplified Chinese
- programming language: multiple
- classes of entity:
Class name in English (abbreviation) | Class name in Simplified Chinese | Class name in Traditional Chinese |
---|---|---|
PER | 人名 | 人名 |
LOC | 地名 | 地名 |
ORG | 机构名 | 機構名 |
TIME | 时间 | 時間 |
Other NER tools
- Article Extraction API Documentation - Diffbot "Array of tags/entities, generated from analysis of the extracted text and cross-referenced with DBpedia and other data sources. Language-specific tags will be returned if the source text is in English, Chinese, French, German, Spanish or Russian."
- Use entity recognition with the Text Analytics API - Azure Cognitive Services | Microsoft Docs