In short, there are three steps: upload the data, annotate it, and download the labeled data.
Official site: http://doccano.herokuapp.com/
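The first demo is named entity recognition, a sequence labeling task. Simply select a text span and annotate it. doccano supports keyboard shortcuts, so you can annotate spans quickly.
The second demo is topic classification, a text classification task. A text may belong to several categories, so you can assign multiple labels.
The final demo is machine translation, a sequence-to-sequence task. A sequence-to-sequence task may have more than one valid response, so you can create multiple responses.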
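Installation: I installed doccano from its Docker image:
[Not sure what Docker is, what it is for, or where to learn it? I will collect Docker material into a later blog post and link it here; stay tuned.]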
docker pull chakkiworks/doccano
docker run -d --rm --name doccano \
-e "ADMIN_USERNAME=admin" \
-e "ADMIN_EMAIL=yourself_email" \
-e "ADMIN_PASSWORD=password" \
-p 8000:8000 \
chakkiworks/doccano
You can customize the username, email, and password here:
Username: ADMIN_USERNAME=yourself_username
Email: ADMIN_EMAIL=yourself_email
Password: ADMIN_PASSWORD=yourself_password
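To make the mapping between these three settings and the `docker run` command explicit, here is a minimal Python sketch that assembles the same command from parameters. The function name `build_doccano_run_command` and its `host_port` parameter are my own illustration and not part of doccano; the flags and the image name are copied from the command above.

```python
import shlex


def build_doccano_run_command(username, email, password, host_port=8000):
    """Return the `docker run` argument list used in this post.

    Hypothetical helper: it only builds the arguments, it does not start anything.
    """
    return [
        "docker", "run", "-d", "--rm", "--name", "doccano",
        "-e", f"ADMIN_USERNAME={username}",
        "-e", f"ADMIN_EMAIL={email}",
        "-e", f"ADMIN_PASSWORD={password}",
        # Map a port on the host to port 8000 inside the container.
        "-p", f"{host_port}:8000",
        "chakkiworks/doccano",
    ]


if __name__ == "__main__":
    cmd = build_doccano_run_command(
        "yourself_username", "yourself_email", "yourself_password"
    )
    # Print a shell-safe version of the command for copy and paste.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would start the container; printing it first is a convenient way to check the values before running anything.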
Now open a web browser and go to http://IP:8000/login/, where IP is the address of the Docker host. You should see the login screen.
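If the page does not load, it can help to check from a script whether the server answers at all. The sketch below uses only the Python standard library; the helper names `login_url` and `is_reachable` are hypothetical, and the address in the example must be replaced with your own server's IP.

```python
import urllib.error
import urllib.request


def login_url(host, port=8000):
    """Build the login URL in the form used in this post."""
    return f"http://{host}:{port}/login/"


def is_reachable(url, timeout=3.0):
    """Return True if the server answers the request without an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = login_url("127.0.0.1")  # assumption: replace with your server's IP
    print(url, "reachable" if is_reachable(url) else "not reachable")
```

A "not reachable" result usually means the container is not running or port 8000 is not published on the host, which `docker ps` can confirm.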
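Now log in with the superuser account you created in the previous step. You should see the doccano project list page:
Note: only the superuser can create projects. The docker run command above is what actually creates the superuser account.
No projects exist yet. To create one, make sure you are on the project list page and click the "Create Project" button. You should see the following screen: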
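In this step you can choose one of four project types: text classification, sequence labeling, sequence to sequence, and speech to text. Choose the type that fits your purpose.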
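After you create a project, you will see the "Import Data" page; you can also reach it by clicking the "Import Data" button in the navigation bar. You should see the following screen: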
# Find the container ID
[root@docker ~]# docker ps
# View the container's logs by container ID. Format: docker logs containerID. For example, my container ID is 7c5250576e3e6d133006b80610a4c4f03896e36bd016e8c3ad0e4e4ddc8e4834
[root@docker ~]# docker logs 7c5250576e3e6d133006b80610a4c4f03896e36bd016e8c3ad0e4e4ddc8e4834
Operations to perform:
Apply all migrations: admin, api, auth, authtoken, contenttypes, sessions, social_django
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying admin.0001_initial... OK
Applying admin.0002_logentry_remove_auto_add... OK
Applying admin.0003_logentry_add_action_flag_choices... OK
Applying contenttypes.0002_remove_content_type_name... OK
Applying api.0001_initial... OK
Applying api.0002_speech2text... OK
Applying api.0002_project_single_class_classification... OK
Applying api.0003_merge_20200612_0205... OK
Applying auth.0002_alter_permission_name_max_length... OK
Applying auth.0003_alter_user_email_max_length... OK
Applying auth.0004_alter_user_username_opts... OK
Applying auth.0005_alter_user_last_login_null... OK
Applying auth.0006_require_contenttypes_0002... OK
Applying auth.0007_alter_validators_add_error_messages... OK
Applying auth.0008_alter_user_username_max_length... OK
Applying auth.0009_alter_user_last_name_max_length... OK
Applying authtoken.0001_initial... OK
Applying authtoken.0002_auto_20160226_1747... OK
Applying sessions.0001_initial... OK
Applying social_django.0001_initial... OK
Applying social_django.0002_add_related_name... OK
Applying social_django.0003_alter_email_max_length... OK
Applying social_django.0004_auto_20160423_0400... OK
Applying social_django.0005_auto_20160727_2333... OK
Applying social_django.0006_partial... OK
Applying social_django.0007_code_timestamp... OK
Applying social_django.0008_partial_timestamp... OK
Role created successfully "project_admin"
Role created successfully "annotator"
Role created successfully "annotation_approver"
Setting password for User admin.
Superuser created successfully.  # the superuser account was created
[2020-07-15 06:33:02 +0000] [26] [INFO] Starting gunicorn 19.9.0
[2020-07-15 06:33:02 +0000] [26] [INFO] Listening at: http://0.0.0.0:8000 (26)  # access URL
[2020-07-15 06:33:02 +0000] [26] [INFO] Using worker: sync
[2020-07-15 06:33:02 +0000] [29] [INFO] Booting worker with pid: 29
[2020-07-15 06:33:02 +0000] [31] [INFO] Booting worker with pid: 31
/usr/local/lib/python3.6/site-packages/django/views/generic/list.py:88: UnorderedObjectListWarning: Pagination may yield inconsistent results with an unordered object_list: QuerySet.
allow_empty_first_page=allow_empty_first_page, **kwargs)
Bad Request: /v1/projects/1
/usr/local/lib/python3.6/site-packages/django/views/generic/list.py:88: UnorderedObjectListWarning: Pagination may yield inconsistent results with an unordered object_list: QuerySet.
allow_empty_first_page=allow_empty_first_page, **kwargs)
Bad Request: /v1/projects/1
Internal Server Error: /v1/projects/2/docs/upload
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
response = get_response(request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 126, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/usr/local/lib/python3.6/site-packages/django/core/handlers/base.py", line 124, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/usr/local/lib/python3.6/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
return view_func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/django/views/generic/base.py", line 68, in view
return self.dispatch(request, *args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 497, in dispatch
response = self.handle_exception(exc)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 457, in handle_exception
self.raise_uncaught_exception(exc)
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 468, in raise_uncaught_exception
raise exc
File "/usr/local/lib/python3.6/site-packages/rest_framework/views.py", line 494, in dispatch
response = handler(request, *args, **kwargs)
File "/doccano/app/api/views.py", line 265, in post
project_id=kwargs['project_id'],
File "/doccano/app/api/views.py", line 276, in save_file
storage.save(user)
File "/usr/local/lib/python3.6/contextlib.py", line 52, in inner
return func(*args, **kwds)
File "/doccano/app/api/utils.py", line 172, in save
unique_labels = self.extract_unique_labels(labels)
File "/doccano/app/api/utils.py", line 182, in extract_unique_labels
return set([label for _, _, label in itertools.chain(*labels)])
File "/doccano/app/api/utils.py", line 182, in <listcomp>
return set([label for _, _, label in itertools.chain(*labels)])
ValueError: too many values to unpack (expected 3)
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Activate your account.
From: webmaster@localhost
To: [email protected]
Date: Wed, 15 Jul 2020 09:50:37 -0000
Message-ID: <159480663790.31.2908584818714001225@7c5250576e3e>
Hi shuchang,  # a newly registered user must open the link below to activate the account
Please click on the link to confirm your email and activate your Doccano account:
http://192.168.21.146:8000/activate/Mg/5i7-32639c44a643c5060b95
-------------------------------------------------------------------------------
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Activate your account.
From: webmaster@localhost
To: [email protected]
Date: Wed, 15 Jul 2020 10:23:53 -0000
Message-ID: <159480863343.29.15283910455911044340@7c5250576e3e>
Hi root,  # a newly registered user must open the link below to activate the account
Please click on the link to confirm your email and activate your Doccano account:
http://192.168.21.146:8000/activate/Mw/5i7-43991a26cf2c0c812c5d
-------------------------------------------------------------------------------
[root@docker ~]#
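The traceback in the log above ends in `ValueError: too many values to unpack (expected 3)`, raised by the expression `for _, _, label in itertools.chain(*labels)` in `extract_unique_labels`. That expression unpacks every label entry into exactly three values, so an uploaded file containing a label entry with more than three fields fails the upload with a 500 error. The sketch below reproduces the error outside doccano. The sample data is invented for illustration, the reading of the three fields as start offset, end offset, and label name is my assumption, and `find_bad_labels` is a hypothetical pre-upload check, not a doccano function.

```python
import itertools


def extract_unique_labels(labels):
    # Same expression as the line shown in the traceback (utils.py, line 182).
    return set([label for _, _, label in itertools.chain(*labels)])


def find_bad_labels(labels):
    """Hypothetical pre-upload check.

    Return (document_index, entry) pairs for entries that do not have
    exactly three fields.
    """
    bad_entries = []
    for doc_index, doc_labels in enumerate(labels):
        for entry in doc_labels:
            if len(entry) != 3:
                bad_entries.append((doc_index, entry))
    return bad_entries


# Invented sample data: one list of label entries per document.
good = [[[0, 5, "PER"], [10, 15, "LOC"]], [[3, 8, "PER"]]]
bad = [[[0, 5, "PER", "extra"]]]  # four fields instead of three

print(extract_unique_labels(good))
print(find_bad_labels(bad))

try:
    extract_unique_labels(bad)
except ValueError as exc:
    # Same message as in the container log.
    print("ValueError:", exc)
```

Running a check like this on the file before importing it points to the offending document, which is easier than reading the server traceback afterwards.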