A Roundup of Annotation Tools for Natural Language Processing

Below is a collection of NLP annotation tools that I have found useful. If anything is missing, additions are welcome.

| Name | Year | Description | License | Website | GitHub |
| --- | --- | --- | --- | --- | --- |
| doccano | 2019 | doccano is an open source text annotation tool for humans. It provides annotation features for text classification, sequence labeling, and sequence to sequence. So, you can create labeled data for sentiment analysis, named entity recognition, text summarization, and so on. Just create a project, upload data, and start annotating. You can build a dataset in hours. | MIT | https://github.com/chakki-works/doccano | https://github.com/chakki-works/doccano |
| INCEpTION | 2018 | A semantic annotation platform offering intelligent assistance and knowledge management. The annotation of specific semantic phenomena often requires compiling task-specific corpora and creating or extending task-specific knowledge bases. Presently, researchers require a broad range of skills and tools to address such semantic annotation tasks. | Apache | https://inception-project.github.io/ | https://github.com/inception-project/inception |
| NeuroNER | 2017 | NeuroNER is a program that performs named-entity recognition (NER). | | https://github.com/Franck-Dernoncourt/NeuroNER | https://github.com/Franck-Dernoncourt/NeuroNER |
| Prodigy | 2017 | Prodigy is a machine teaching tool so efficient that a single data scientist can create end-to-end prototypes for new functionality without commissioning external annotations, and with a smooth path to production. Whether you're working on entity recognition, intent detection, or image classification, Prodigy can help you train and evaluate your models faster. | Commercial | https://prodi.gy/ | |
| Chinese-Annotator | 2017 | Annotator for Chinese Text Corpus | Apache | | https://github.com/deepwel/Chinese-Annotator |
| Chatito | 2017 | Generate datasets for AI chatbots, NLP tasks, named entity recognition, or text classification models using a simple DSL! | MIT | https://rodrigopivi.github.io/Chatito/ | https://github.com/rodrigopivi/Chatito |
| YEDDA | 2016 | YEDDA (formerly SUTDAnnotator) is developed for annotating chunks, entities, and events in text (almost all languages, including English and Chinese), symbols, and even emoji. It supports shortcut annotation, which makes annotating text by hand extremely efficient: the user only needs to select a text span and press a shortcut key, and the span is annotated automatically. It also supports a command annotation model that annotates multiple entities in batch, and it supports exporting annotated text as sequence text. | Apache | https://github.com/jiesutd/YEDDA | https://github.com/jiesutd/YEDDA |
| rasa-nlu-trainer | 2016 | This is a tool to edit your training examples for rasa NLU. Use the online version or install with npm. | Commercial | | https://github.com/RasaHQ/rasa-nlu-trainer |
| TALEN | 2016 | A lightweight web-based tool for annotating word sequences. | Research | https://github.com/CogComp/talen | https://github.com/CogComp/talen |
| WebAnno | 2014 | WebAnno is a general purpose web-based annotation tool for a wide range of linguistic annotations, including various layers of morphological, syntactical, and semantic annotations. Additionally, custom annotation layers can be defined, allowing WebAnno to be used for non-linguistic annotation tasks as well. WebAnno is a multi-user tool supporting different roles such as annotator, curator, and project manager. The progress and quality of annotation projects can be monitored and measured in terms of inter-annotator agreement. Multiple annotation projects can be conducted in parallel. | Apache | https://webanno.github.io/webanno/ | https://github.com/webanno/webanno |
| MAE | 2014 | MAE is a lightweight, general-purpose natural language annotation tool. | GPL | https://github.com/keighrim/mae-annotation | https://github.com/keighrim/mae-annotation |
| Anafora | 2013 | Anafora (pronounced "a-nuh-FOUR-uh", /ænəˈfɔɹə/) is a new annotation tool written at the University of Colorado by Wei-te Chen and Will Styler. Anafora is designed to be a lightweight, flexible annotation solution which is easy to deploy for large and small projects. | Apache | https://github.com/weitechen/anafora | https://github.com/weitechen/anafora |
| brat | 2010 | brat is a web-based tool for text annotation; that is, for adding notes to existing text documents. brat is designed in particular for structured annotation, where the notes are not freeform text but have a fixed form that can be automatically processed and interpreted by a computer. | MIT | https://github.com/nlplab/brat | https://github.com/nlplab/brat |
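
Several of the tools above produce span annotations for sequence labeling and named entity recognition. As a rough illustration of what such labeled data can look like once exported, the sketch below converts character-offset spans into per-token BIO tags. It does not reproduce any listed tool's export format: the function name `spans_to_bio`, the `(start, end, label)` span layout, and the whitespace tokenization are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical span layout: (start_char, end_char_exclusive, label).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Convert character-offset spans into (token, BIO tag) pairs.

    Tokens are obtained by splitting on whitespace, which is a
    simplification; real tools use their own tokenizers.
    """
    tagged = []
    pos = 0            # search position, so repeated tokens map to the right offset
    prev_index = None  # index of the span that covered the previous token
    for token in text.split():
        start = text.index(token, pos)
        end = start + len(token)
        pos = end

        tag = "O"
        current_index = None
        for i, (s, e, label) in enumerate(spans):
            # A token is tagged only if it lies fully inside the span.
            if start >= s and end <= e:
                current_index = i
                prefix = "I-" if i == prev_index else "B-"
                tag = prefix + label
                break

        prev_index = current_index
        tagged.append((token, tag))
    return tagged


if __name__ == "__main__":
    sentence = "Wei-te Chen works in Colorado"
    annotations = [(0, 11, "PER"), (21, 29, "LOC")]
    for token, tag in spans_to_bio(sentence, annotations):
        print(f"{token}\t{tag}")
```

Tokens that only partially overlap a span are left as `O` in this sketch; a production converter would need an explicit policy for such cases.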
