Named entity recognition tools
Revision as of 17:04, 31 August 2020
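Named entity recognition (NER) is also known in Chinese as 命名實體識別 (named entity recognition), 實體識別 (entity recognition), or 專有名詞辨識 (proper noun recognition).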
CKIP Neural Chinese Word Segmentation, POS Tagging, and NER
ckiplab/ckiptagger: CKIP Neural Chinese Word Segmentation, POS Tagging, and NER
- license: GNU General Public License v3.0

- language support: Traditional Chinese
- programming language: Python
- classes of entity[1]
| Class name in English | Class name in Traditional Chinese |
|---|---|
| person | 人名 |
| norp | 團體 |
| FAC | 設施 |
| facility | 設施* |
| ORG | 組織 |
| organization | 組織* |
| gpe | 地理 |
| LOC | 地點 |
| location | 地點* |
| product | 商品 |
| event | 事件 |
| WORK | 藝術品 |
| work of art | 藝術品* |
| law | 法律 |
| language | 語言 |
| date | 日期 |
| time | 時間 |
| percent | 比例 |
| money | 錢 |
| quantity | 數量 |
| ordinal | 序數 |
| cardinal | 數詞 |
Note: An asterisk marks an alternative English class name that shares its Chinese class name with the row directly above it.
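The asterisked aliases can be folded into a single name before looking up the Chinese label. The following is a minimal sketch that assumes the class names exactly as spelled in the table above; `CLASS_ZH`, `ALIASES`, `normalize_class`, and `chinese_label` are hypothetical names used for illustration and are not part of ckiptagger.

```python
# Hypothetical lookup built from the table above (not a ckiptagger API).
CLASS_ZH = {
    "person": "人名", "norp": "團體", "FAC": "設施", "ORG": "組織",
    "gpe": "地理", "LOC": "地點", "product": "商品", "event": "事件",
    "WORK": "藝術品", "law": "法律", "language": "語言", "date": "日期",
    "time": "時間", "percent": "比例", "money": "錢", "quantity": "數量",
    "ordinal": "序數", "cardinal": "數詞",
}

# Asterisked rows: alternative English names for the row above them.
ALIASES = {
    "facility": "FAC",
    "organization": "ORG",
    "location": "LOC",
    "work of art": "WORK",
}

# Case-insensitive index of the primary class names.
_LOOKUP = {name.lower(): name for name in CLASS_ZH}


def normalize_class(name):
    """Return the primary class name for an alias; unknown names pass through."""
    key = name.strip().lower()
    key = ALIASES.get(key, key).lower()
    return _LOOKUP.get(key, name.strip())


def chinese_label(name):
    """Return the Traditional Chinese label, or None if the class is unknown."""
    return CLASS_ZH.get(normalize_class(name))
```

For example, `chinese_label("work of art")` and `chinese_label("WORK")` both resolve to the same Chinese class name.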
Stanford CoreNLP
Stanford CoreNLP – Natural language software | Stanford CoreNLP
- license: GNU General Public License v3

- language support: English, Chinese, and others
- programming language: Java
- classes of entity: "For English, by default, this annotator recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities (12 classes)."[2]
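The three groups in the quoted description can be written down as a small lookup, which is handy when summarizing tagger output. This is a sketch only: `COARSE_GROUPS`, `group_of`, and `count_by_group` are hypothetical helpers, not part of Stanford CoreNLP, and the code assumes labels arrive as plain strings, with anything outside the 12 classes (such as the `O` tag for non-entity tokens) counted as "other".

```python
from collections import Counter

# The 12 default English classes, grouped as in the quoted description.
COARSE_GROUPS = {
    "named": {"PERSON", "LOCATION", "ORGANIZATION", "MISC"},
    "numerical": {"MONEY", "NUMBER", "ORDINAL", "PERCENT"},
    "temporal": {"DATE", "TIME", "DURATION", "SET"},
}


def group_of(label):
    """Return 'named', 'numerical', or 'temporal'; None for any other label."""
    upper = label.upper()
    for group, labels in COARSE_GROUPS.items():
        if upper in labels:
            return group
    return None


def count_by_group(labels):
    """Count labels per group; labels outside the 12 classes go to 'other'."""
    return Counter(group_of(label) or "other" for label in labels)
```

Calling `count_by_group(["PERSON", "DATE", "O", "MONEY"])` yields one label in each of the named, temporal, other, and numerical buckets.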
spaCy
spaCy · Industrial-strength Natural Language Processing in Python
- license: MIT License

- language support:
- programming language: Python
- classes of entity: "PERSON, NORP, FAC, ORG, GPE, LOC, PRODUCT, EVENT, WORK_OF_ART, LAW, LANGUAGE, DATE, TIME, PERCENT, MONEY, QUANTITY, ORDINAL and CARDINAL"[3]
Google Cloud Natural Language
Cloud Natural Language | Cloud Natural Language API | Google Cloud
- license:
- language support: includes Traditional Chinese; see Language Support (語言支援) | Cloud Natural Language API | Google Cloud
- programming language: multiple
- score: available as a salience score in the [0, 1.0] range. "The salience score for an entity provides information about the importance or centrality of that entity to the entire document text. Scores closer to 0 are less salient, while scores closer to 1.0 are highly salient."[4]
- classes of entity: listed under the entity type in Entity | Cloud Natural Language API | Google Cloud, e.g. "UNKNOWN, PERSON, LOCATION, ORGANIZATION, EVENT, WORK_OF_ART, CONSUMER_GOOD, OTHER, PHONE_NUMBER, ADDRESS, DATE, NUMBER and PRICE"
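Because salience ranges from 0 to 1.0, it can be used to keep only the entities most central to a document. The sketch below works on an already-parsed entity analysis response and makes no network call. The `name`, `type`, and `salience` fields follow the Entity reference; the sample values, the `salient_entities` helper, and the 0.1 threshold are made up for illustration.

```python
import json

# Illustrative response body; the entity names and scores are invented.
SAMPLE_RESPONSE = json.loads("""
{
  "entities": [
    {"name": "2020", "type": "DATE", "salience": 0.07},
    {"name": "Taipei", "type": "LOCATION", "salience": 0.62},
    {"name": "Academia Sinica", "type": "ORGANIZATION", "salience": 0.31}
  ]
}
""")


def salient_entities(response, threshold=0.1):
    """Return entities with salience >= threshold, most salient first."""
    entities = response.get("entities", [])
    kept = [e for e in entities if e.get("salience", 0.0) >= threshold]
    return sorted(kept, key=lambda e: e.get("salience", 0.0), reverse=True)


for entity in salient_entities(SAMPLE_RESPONSE):
    print(entity["type"], entity["name"], entity["salience"])
```

A suitable threshold depends on document length and on how many entities the text mentions, so the value here is only a starting point.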
Amazon Comprehend
Amazon Comprehend – Natural Language Processing (NLP) and Machine Learning (ML)
- license:
- language support:
- programming language:
- classes of entity: "COMMERCIAL_ITEM, DATE, EVENT, LOCATION, ORGANIZATION, OTHER, PERSON, QUANTITY and TITLE"[5], described as follows:
| Type | Description | Type in Chinese |
|---|---|---|
| COMMERCIAL_ITEM | A branded product | 商品 |
| DATE | A full date (for example, 11/25/2017), day (Tuesday), month (May), or time (8:30 a.m.) | 日期 |
| EVENT | An event, such as a festival, concert, election, etc. | 事件 |
| LOCATION | A specific location, such as a country, city, lake, building, etc. | 地點 |
| ORGANIZATION | Large organizations, such as a government, company, religion, sports team, etc. | 機構 |
| OTHER | Entities that don't fit into any of the other entity categories | 其他 |
| PERSON | Individuals, groups of people, nicknames, fictional characters | 人名 |
| QUANTITY | A quantified amount, such as currency, percentages, numbers, bytes, etc. | 量詞 |
| TITLE | An official name given to any creation or creative work, such as movies, books, songs, etc. | 抬頭 |
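The types in the table can be used to group detected entities and attach the Chinese label to each group. The following sketch processes an already-parsed entity detection response and does not call AWS. The `Entities`, `Text`, `Type`, and `Score` keys follow the response shape described in the Detect Entities documentation; the sample data, the `group_by_type` and `with_chinese_labels` helpers, and the 0.5 score cut-off are assumptions for illustration.

```python
from collections import defaultdict

# Chinese labels copied from the table above.
TYPE_ZH = {
    "COMMERCIAL_ITEM": "商品", "DATE": "日期", "EVENT": "事件",
    "LOCATION": "地點", "ORGANIZATION": "機構", "OTHER": "其他",
    "PERSON": "人名", "QUANTITY": "量詞", "TITLE": "抬頭",
}

# Illustrative response; texts and scores are invented.
SAMPLE = {
    "Entities": [
        {"Text": "Amazon", "Type": "ORGANIZATION", "Score": 0.98},
        {"Text": "Seattle", "Type": "LOCATION", "Score": 0.95},
        {"Text": "it", "Type": "OTHER", "Score": 0.30},
    ]
}


def group_by_type(response, min_score=0.5):
    """Group entity texts by type, dropping entities scored below min_score."""
    grouped = defaultdict(list)
    for entity in response.get("Entities", []):
        if entity.get("Score", 0.0) >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)


def with_chinese_labels(grouped):
    """Rename each key to 'TYPE (Chinese label)'; unknown types keep '?'."""
    return {f"{t} ({TYPE_ZH.get(t, '?')})": texts for t, texts in grouped.items()}


print(with_chinese_labels(group_by_type(SAMPLE)))
```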
IBM Watson
Watson Natural Language Understanding
- license:
- language support:
- programming language:
- classes of entity: "Date, Duration, EmailAddress, Facility, GeographicFeature, Hashtag, IPAddress, JobTitle, Location and more ..."[6]
卓騰語言科技中文斷詞 (Chinese word segmentation by 卓騰語言科技)
- license:
- language support: Traditional Chinese
- programming language:
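- classes of entity: "person, location, time, measurement and more ..."[7]
百度AI开放平台 (Baidu AI Open Platform)
Basic language processing technology (语言处理基础技术) - Baidu AI Open Platform (百度AI开放平台), "proper noun recognition" (专名识别)[8]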
- license:
- language support: Simplified Chinese
- programming language: multiple
- classes of entity:
| Class name in English (abbreviation) | Class name in Simplified Chinese | Class name in Traditional Chinese |
|---|---|---|
| PER | 人名 | 人名 |
| LOC | 地名 | 地名 |
| ORG | 机构名 | 機構名 |
| TIME | 时间 | 時間 |
BosonNLP (out of service)
- license:
- language support: Simplified Chinese
- programming language: multiple
- classes of entity: "time, location, person_name, org_name, company_name, product_name and job_title"[9]
| Class name in English | Class name in Simplified Chinese | Class name in Traditional Chinese |
|---|---|---|
| time | 时间 | 時間 |
| location | 地点 | 地點 |
| person_name | 人名 | 人名 |
| org_name | 组织名 | 組織名 |
| company_name | 公司名 | 公司名 |
| product_name | 产品名 | 產品名 |
| job_title | 职位 | 職位 |
Other NER tools
- Article Extraction API Documentation - Diffbot "Array of tags/entities, generated from analysis of the extracted text and cross-referenced with DBpedia and other data sources. Language-specific tags will be returned if the source text is in English, Chinese, French, German, Spanish or Russian."
- Use entity recognition with the Text Analytics API (搭配文字分析 API 使用實體辨識) - Azure Cognitive Services | Microsoft Docs
References
- [1] 中文專有名詞辨識系統 簡報 (Chinese proper noun recognition system, slides)
- [2] Named Entity Recognition – NERClassifierCombiner | Stanford CoreNLP
- [3] Annotation Specifications · spaCy API Documentation
- [4] Entity | Cloud Natural Language API | Google Cloud
- [5] Detect Entities - Amazon Comprehend
- [6] Entity types (Version 2)
- [7] 卓騰語言科技中文斷詞 API (Chinese word segmentation API)
- [8] 词法分析接口 (lexical analysis API)
- [9] 命名实体识别 (named entity recognition), BosonNLP HTTP API 1.0 documentation