https://github.com/amundsen-io/amundsen
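Note that there are two ways to start the stack. When loading the sample data, use the first one (Neo4j backend); the second one (Atlas) can be ignored.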
Make sure you have at least 3GB available to docker. Install docker and docker-compose.
Clone this repo and its submodules by running:
$ git clone --recursive git@github.com:amundsen-io/amundsen.git
Enter the cloned directory and run:
# For Neo4j Backend
$ docker-compose -f docker-amundsen.yml up
# For Atlas
$ docker-compose -f docker-amundsen-atlas.yml up
Ingest the provided sample data into Neo4j by doing the following (skip this if you are using the Atlas backend):
- In a separate terminal window, change directory to the amundsendatabuilder submodule.
- The sample_data_loader python script included in the example/scripts/ directory uses the elasticsearch client, pyhocon and other libraries. Install the dependencies in a virtual env and run the script with the commands below:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
$ python3 setup.py install
$ python3 example/scripts/sample_data_loader.py
View the UI at http://localhost:5000 and try searching for test; it should return some results. You can also do an exact-match search for a table entity. For example, search for test_table1 in the table field and it returns the matching records.
Most of the pitfalls were already hit while setting up DataHub. The main ones are installing docker-compose, Python 3, and docker.
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
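Fix:
Switch to the root user.
Run:
sysctl -w vm.max_map_count=262144
Check the result:
sysctl -a|grep vm.max_map_count
Output:
vm.max_map_count = 262144
A value set this way is lost when the virtual machine reboots, so:
Permanent fix:
Append the following line to the end of /etc/sysctl.conf
vm.max_map_count=262144
This makes the change permanent.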
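The same check can be scripted before starting the containers. The following is a minimal sketch, assuming a Linux host that exposes the current value at /proc/sys/vm/max_map_count. The function names are illustrative and not part of Amundsen.

```python
from pathlib import Path

# Minimum value required by Elasticsearch, taken from the error message above.
REQUIRED_MAX_MAP_COUNT = 262144


def parse_sysctl_value(text, key="vm.max_map_count"):
    """Return the last integer assigned to `key` in sysctl-style text, or None.

    Accepts both `key=value` (sysctl.conf) and `key = value` (sysctl -a output).
    """
    value = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, _, raw = line.partition("=")
        if name.strip() == key and raw.strip().isdigit():
            value = int(raw.strip())
    return value


def is_sufficient(value, required=REQUIRED_MAX_MAP_COUNT):
    """True when the value is set and meets the Elasticsearch minimum."""
    return value is not None and value >= required


if __name__ == "__main__":
    proc_file = Path("/proc/sys/vm/max_map_count")  # assumption: Linux host
    if proc_file.exists():
        current = int(proc_file.read_text().strip())
        print("current:", current, "ok" if is_sufficient(current) else "too low")
```

Passing the contents of /etc/sysctl.conf to parse_sysctl_value shows whether the setting has also been persisted across reboots.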
Edit the search service Dockerfile so that pip uses the Douban mirror:
vim ./amundsensearchlibrary/public.Dockerfile
RUN pip3 install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
Edit the metadata service Dockerfile in the same way:
vim ./amundsenmetadatalibrary/public.Dockerfile
npm config set registry http://mirrors.cloud.tencent.com/npm/   # Tencent mirror; the Taobao mirror seems to be dead
RUN pip3 install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
Edit the frontend Dockerfile:
vim ./amundsenfrontendlibrary/local.Dockerfile
ARG METADATASERVICE_BASE
ARG SEARCHSERVICE_BASE
FROM node:12-slim as node-stage
#COPY sources.list /etc/apt/sources.list
#RUN sudo apt-get update
#run cat /etc/apt/sources.list
WORKDIR /app/amundsen_application/static
RUN cat /etc/issue
COPY amundsen_application/static/package.json /app/amundsen_application/static/package.json
COPY amundsen_application/static/package-lock.json /app/amundsen_application/static/package-lock.json
RUN npm config set registry https://registry.npm.taobao.org
#RUN npm install -g cnpm --registry http://mirrors.cloud.tencent.com/npm/
RUN npm install -g cnpm --registry https://registry.npm.taobao.org
RUN npm install
COPY amundsen_application/static/ /app/amundsen_application/static/
RUN cnpm install cross-env
RUN cnpm rebuild node-sass
RUN cnpm run dev-build
COPY . /app
FROM python:3.7-slim
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
COPY --from=node-stage /app /app
RUN python3 setup.py install
ENTRYPOINT [ "python3" ]
CMD [ "amundsen_application/wsgi.py" ]
Error 2
Processing amundsen_databuilder-4.0.4-py3.8.egg
Removing /app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/amundsen_databuilder-4.0.4-py3.8.egg
Copying amundsen_databuilder-4.0.4-py3.8.egg to /app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages
amundsen-databuilder 4.0.4 is already the active version in easy-install.pth
Installed /app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/amundsen_databuilder-4.0.4-py3.8.egg
Processing dependencies for amundsen-databuilder==4.0.4
error: urllib3 1.26.2 is installed but urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 is required by {'requests'}
Fix:
pip3 install --upgrade pip urllib3==1.25.2 -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
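To see why 1.25.2 resolves the conflict while 1.26.2 does not, the constraint from the error message can be evaluated by hand. The sketch below is a simplified check that only handles plain dotted numeric versions. It is not how pip itself resolves requirements, and the function names are made up for illustration.

```python
def parse_version(text):
    """Turn a plain dotted version such as '1.25.2' into (1, 25, 2).

    Short versions are padded with zeros, so '1.25' becomes (1, 25, 0).
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def satisfies_requests_pin(version):
    """Check the constraint from the error: !=1.25.0, !=1.25.1, <1.26, >=1.21.1."""
    v = parse_version(version)
    excluded = {(1, 25, 0), (1, 25, 1)}
    return (1, 21, 1) <= v < (1, 26, 0) and v not in excluded


for candidate in ("1.26.2", "1.25.1", "1.25.2"):
    print(candidate, satisfies_requests_pin(candidate))
```

Of the three candidates, only 1.25.2 passes, which matches the version pinned in the command above.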
Error 3 (while loading the sample data)
python3 example/scripts/sample_data_loader.py
Traceback (most recent call last):
  File "example/scripts/sample_data_loader.py", line 28, in <module>
    from elasticsearch import Elasticsearch
  File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/elasticsearch/__init__.py", line 24, in <module>
    from .client import Elasticsearch
  File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/elasticsearch/client/__init__.py", line 4, in <module>
    from ..transport import Transport
  File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 4, in <module>
    from .connection import Urllib3HttpConnection
  File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/elasticsearch/connection/__init__.py", line 3, in <module>
    from .http_urllib3 import Urllib3HttpConnection, create_ssl_context
  File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 2, in <module>
    import ssl
  File "/usr/local/python3/lib/python3.8/ssl.py", line 98, in <module>
    import _ssl # if we can't import it, let the error propagate
Solution:
https://blog.csdn.net/qq_23889009/article/details/100887640
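The traceback ends at the import of _ssl, a C extension module that is only present when Python was built with OpenSSL support. A quick way to confirm whether an interpreter is missing such modules is sketched below. The helper name is hypothetical, and checking _lzma as well is an assumption based on the lzma warning that pandas prints later in these notes.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # _ssl backs the ssl module; _lzma backs the lzma module.
    absent = missing_modules(["_ssl", "_lzma"])
    print("missing:", absent or "none")
```

If the list is not empty, the interpreter itself is incomplete, so reinstalling packages inside the virtual env does not help. The fix is to rebuild or reinstall Python with the relevant system libraries available.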
Error 5
python3 example/scripts/sample_data_loader.py
/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/pandas/compat/__init__.py:120: UserWarning: Could not import the lzma module. Your installed Python is incomplete. Attempting to use lzma compression will result in a RuntimeError.
warnings.warn(msg)
WARNING:elasticsearch:PUT http://localhost:9200/table_7a462ed7-cd6c-4306-8596-5d638c14a7dc [status:N/A request:0.001s]
Traceback (most recent call last):
File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
conn = connection.create_connection(
File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/urllib3/util/connection.py", line 80, in create_connection
raise err
File "/app/ttt/amundsen/amundsendatabuilder/venv/lib/python3.8/site-packages/urllib3/util/connection.py", line 70, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
……
Cause: the es_amundsen container was not started or exited abnormally.
Reason for the exit:
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
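Because the loader fails with a long traceback when Elasticsearch is down, it can help to test the port first. This is a minimal sketch using only the standard library. Port 9200 comes from the warning above, while the helper name is illustrative and not part of Amundsen.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    if port_open("localhost", 9200):
        print("Elasticsearch port is reachable; the loader can be started.")
    else:
        print("Nothing is listening on localhost:9200; check the es_amundsen container.")
```

An open port only shows that something is listening. It does not prove that Elasticsearch has finished starting, so the container logs remain the place to look for the vm.max_map_count message.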
Screenshot for the record.