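First, some background:
I am working on a project that runs RabbitMQ together with a Node.js consumer (using the amqplib library). When the Node.js consumer starts, it often fails to connect to the RabbitMQ server and reports connect ECONNREFUSED. In docker-compose.yml I had already listed the RabbitMQ service under depends_on for the Node.js consumer, so the consumer is only started after the RabbitMQ container. So why does the connection still fail?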
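After some research I finally found the cause.
After the RabbitMQ container has started, it still takes a short while before the service inside it is ready to accept connections. depends_on only controls the order in which containers are started; it does not wait for the service to become available.
The Node.js consumer container, on the other hand, tries to connect to the RabbitMQ server as soon as it starts. That leaves a time gap: if the consumer connects while RabbitMQ is not yet responding, the connection is refused.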
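So how can this be solved?
The solution I found is wait-for-it, a script that polls a given host and port and only moves on to the next step once that port is reachable. That is exactly what is needed to bridge the startup gap described above.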
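To make the idea concrete, here is a minimal sketch of the polling loop that wait-for-it is built around. It is not the real script: wait_for_port is a hypothetical helper name, and I am assuming bash, because the /dev/tcp redirection is a bash feature.

```bash
#!/usr/bin/env bash
# Minimal sketch of "wait until a TCP port is reachable".
# wait_for_port is a hypothetical name, not part of wait-for-it.
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-15}
  local deadline=$(( $(date +%s) + timeout ))
  while :; do
    # In bash, redirecting to /dev/tcp/HOST/PORT attempts a TCP connection.
    if (echo > "/dev/tcp/$host/$port") >/dev/null 2>&1; then
      return 0   # port is reachable
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1   # gave up after the timeout
    fi
    sleep 1
  done
}

# Example (assumed service name and port, as used later in this post):
# wait_for_port rabbitmq 5672 120 && npm start
```

The real wait-for-it script adds argument parsing, a quiet mode, and BusyBox support on top of the same kind of loop.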
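For the exact syntax, see https://stackoverflow.com/questions/48015477/docker-and-rabbitmq-econnrefused-between-containers
Add the wait-for-it.sh file to the Node.js consumer project, copy it into the image in the Dockerfile, and change the start command so that the Node.js service is only started once the RabbitMQ service is reachable. With that in place, the consumer starts normally.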
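Note: the maximum wait time is whatever is passed to the script. I set it to 120 seconds (two minutes). If RabbitMQ is still not reachable after two minutes, the wait times out; because I do not pass -s/--strict, the script then runs npm start anyway, which will most likely fail with the same connection error.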
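For the contents of wait-for-it.sh, it is best to refer to the official GitHub address (https://github.com/vishnubob/wait-for-it/edit/master/wait-for-it.sh). The copy below works for me today, but I cannot guarantee it will keep working in the future.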
#!/usr/bin/env bash
# Use this script to test if a given TCP host/port are available

WAITFORIT_cmdname=${0##*/}

echoerr() { if [[ $WAITFORIT_QUIET -ne 1 ]]; then echo "$@" 1>&2; fi }

usage()
{
    cat << USAGE >&2
Usage:
    $WAITFORIT_cmdname host:port [-s] [-t timeout] [-- command args]
    -h HOST | --host=HOST       Host or IP under test
    -p PORT | --port=PORT       TCP port under test
                                Alternatively, you specify the host and port as host:port
    -s | --strict               Only execute subcommand if the test succeeds
    -q | --quiet                Don't output any status messages
    -t TIMEOUT | --timeout=TIMEOUT
                                Timeout in seconds, zero for no timeout
    -- COMMAND ARGS             Execute command with args after the test finishes
USAGE
    exit 1
}

wait_for()
{
    if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then
        echoerr "$WAITFORIT_cmdname: waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT"
    else
        echoerr "$WAITFORIT_cmdname: waiting for $WAITFORIT_HOST:$WAITFORIT_PORT without a timeout"
    fi
    WAITFORIT_start_ts=$(date +%s)
    while :
    do
        if [[ $WAITFORIT_ISBUSY -eq 1 ]]; then
            nc -z $WAITFORIT_HOST $WAITFORIT_PORT
            WAITFORIT_result=$?
        else
            (echo > /dev/tcp/$WAITFORIT_HOST/$WAITFORIT_PORT) >/dev/null 2>&1
            WAITFORIT_result=$?
        fi
        if [[ $WAITFORIT_result -eq 0 ]]; then
            WAITFORIT_end_ts=$(date +%s)
            echoerr "$WAITFORIT_cmdname: $WAITFORIT_HOST:$WAITFORIT_PORT is available after $((WAITFORIT_end_ts - WAITFORIT_start_ts)) seconds"
            break
        fi
        sleep 1
    done
    return $WAITFORIT_result
}

wait_for_wrapper()
{
    # In order to support SIGINT during timeout: http://unix.stackexchange.com/a/57692
    if [[ $WAITFORIT_QUIET -eq 1 ]]; then
        timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --quiet --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT &
    else
        timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT &
    fi
    WAITFORIT_PID=$!
    trap "kill -INT -$WAITFORIT_PID" INT
    wait $WAITFORIT_PID
    WAITFORIT_RESULT=$?
    if [[ $WAITFORIT_RESULT -ne 0 ]]; then
        echoerr "$WAITFORIT_cmdname: timeout occurred after waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT"
    fi
    return $WAITFORIT_RESULT
}

# process arguments
while [[ $# -gt 0 ]]
do
    case "$1" in
        *:* )
        WAITFORIT_hostport=(${1//:/ })
        WAITFORIT_HOST=${WAITFORIT_hostport[0]}
        WAITFORIT_PORT=${WAITFORIT_hostport[1]}
        shift 1
        ;;
        --child)
        WAITFORIT_CHILD=1
        shift 1
        ;;
        -q | --quiet)
        WAITFORIT_QUIET=1
        shift 1
        ;;
        -s | --strict)
        WAITFORIT_STRICT=1
        shift 1
        ;;
        -h)
        WAITFORIT_HOST="$2"
        if [[ $WAITFORIT_HOST == "" ]]; then break; fi
        shift 2
        ;;
        --host=*)
        WAITFORIT_HOST="${1#*=}"
        shift 1
        ;;
        -p)
        WAITFORIT_PORT="$2"
        if [[ $WAITFORIT_PORT == "" ]]; then break; fi
        shift 2
        ;;
        --port=*)
        WAITFORIT_PORT="${1#*=}"
        shift 1
        ;;
        -t)
        WAITFORIT_TIMEOUT="$2"
        if [[ $WAITFORIT_TIMEOUT == "" ]]; then break; fi
        shift 2
        ;;
        --timeout=*)
        WAITFORIT_TIMEOUT="${1#*=}"
        shift 1
        ;;
        --)
        shift
        WAITFORIT_CLI=("$@")
        break
        ;;
        --help)
        usage
        ;;
        *)
        echoerr "Unknown argument: $1"
        usage
        ;;
    esac
done

if [[ "$WAITFORIT_HOST" == "" || "$WAITFORIT_PORT" == "" ]]; then
    echoerr "Error: you need to provide a host and port to test."
    usage
fi

WAITFORIT_TIMEOUT=${WAITFORIT_TIMEOUT:-15}
WAITFORIT_STRICT=${WAITFORIT_STRICT:-0}
WAITFORIT_CHILD=${WAITFORIT_CHILD:-0}
WAITFORIT_QUIET=${WAITFORIT_QUIET:-0}

# check to see if timeout is from busybox?
WAITFORIT_TIMEOUT_PATH=$(type -p timeout)
WAITFORIT_TIMEOUT_PATH=$(realpath $WAITFORIT_TIMEOUT_PATH 2>/dev/null || readlink -f $WAITFORIT_TIMEOUT_PATH)
if [[ $WAITFORIT_TIMEOUT_PATH =~ "busybox" ]]; then
    WAITFORIT_ISBUSY=1
    WAITFORIT_BUSYTIMEFLAG="-t"
else
    WAITFORIT_ISBUSY=0
    WAITFORIT_BUSYTIMEFLAG=""
fi

if [[ $WAITFORIT_CHILD -gt 0 ]]; then
    wait_for
    WAITFORIT_RESULT=$?
    exit $WAITFORIT_RESULT
else
    if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then
        wait_for_wrapper
        WAITFORIT_RESULT=$?
    else
        wait_for
        WAITFORIT_RESULT=$?
    fi
fi

if [[ $WAITFORIT_CLI != "" ]]; then
    if [[ $WAITFORIT_RESULT -ne 0 && $WAITFORIT_STRICT -eq 1 ]]; then
        echoerr "$WAITFORIT_cmdname: strict mode, refusing to execute subprocess"
        exit $WAITFORIT_RESULT
    fi
    exec "${WAITFORIT_CLI[@]}"
else
    exit $WAITFORIT_RESULT
fi
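I thought this solution was a sure thing, but another problem showed up at runtime: the Node.js consumer now failed with the error No such file or directory/usr/bin/env: bash. What was going on?
After looking through similar reports online, I narrowed the cause down to the difference in line endings between Windows and Linux files (CRLF versus LF). An ELK docker-compose project ran into the same problem: https://github.com/deviantony/docker-elk/issues/36
That issue suggests two things: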
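1. Use Notepad++ to convert the line endings of the file
2. Run docker-compose build once before starting with docker-compose up -d
So, step one: convert wait-for-it.sh to Unix (LF) line endings in Notepad++.
Step two: run docker-compose build once before docker-compose up -d.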
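If Notepad++ is not at hand, the same check and fix can be sketched in the shell. This is only an illustration under my own assumptions: has_crlf and strip_crlf are hypothetical helper names, and the fix simply deletes every carriage return in the file, which is fine for a script like wait-for-it.sh but not for arbitrary binary files.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for spotting and removing Windows line endings.

# Succeeds if the file contains at least one carriage return (CR) character.
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# Removes all CR characters from the file, keeping its permissions.
strip_crlf() {
  local tmp
  tmp=$(mktemp)
  tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Example:
# if has_crlf wait-for-it.sh; then strip_crlf wait-for-it.sh; fi
```

After the conversion the image still has to be rebuilt with docker-compose build, otherwise the container keeps using the old copy of the script.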
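Finally, after starting the containers, the Node.js consumer log shows wait-for-it doing its job: the Node.js consumer was only started after a nine-second wait.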
Update 2019-03-29: the second time I used the container, a different error appeared.
ERROR: for mes-run_mesrabbitmqmonitoringclient_1 Cannot start service mesrabbitmqmonitoringclient: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"./wait-for-it.sh\": permission denied": unknown
ERROR: for mesrabbitmqmonitoringclient Cannot start service mesrabbitmqmonitoringclient: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"./wait-for-it.sh\": permission denied": unknown
ERROR: Encountered errors while bringing up the project.
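My guess at the cause: the code was developed on a Windows machine, and when an image is built with Docker on Windows, files get the execute (x) permission by default. I then tagged the image and pushed it straight to Docker Hub, and on the server docker-compose pulled the image from Docker Hub and ran the container.
So why does it fail now? Because the image is now built on the Azure DevOps platform, where the execute (x) permission is not set by default, so after the image is pulled, running wait-for-it.sh fails with a permission error.
The next thing to work out was how to get sudo privileges before the script runs and execute chmod +x ./wait-for-it.sh.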
2019-03-29: finally solved.
The image is still built on Azure DevOps; adding one line, chmod +x ./wait-for-it.sh, was enough.
# FROM BASIC IMAGE
FROM node:7

# CREATE AND SET WORK-DIRECTORY
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

# INSTALL APP DEPENDENCIES
COPY package.json /usr/src/app/
RUN npm install -g cross-env babel-cli
RUN npm install

# DB ENVIRONMENT SETTING
# ENV NODE_ENV="local"

# BUILD SOURCE CODE
COPY . /usr/src/app
COPY wait-for-it.sh /usr/src/app

# This line is very important
RUN chmod +x ./wait-for-it.sh
RUN npm run build

# No Need PORT(Just Command)
# EXPOSE 3000

# DEFAULT COMMAND
# CMD [ "npm", "start" ]
CMD ["./wait-for-it.sh", "rabbitmq:5672", "-t", "120", "--", "npm", "start"]
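The same permission problem can also be caught outside the Dockerfile, for example in a build script that runs before the image is built. The sketch below only illustrates the check; ensure_executable is a hypothetical helper name, not something from Docker or Azure DevOps.

```bash
#!/usr/bin/env bash
# Hypothetical helper: make sure a script has the execute bit set.
ensure_executable() {
  local file=$1
  if [ ! -x "$file" ]; then
    chmod +x "$file"
  fi
}

# Example:
# ensure_executable ./wait-for-it.sh
#
# Assumption: if the project lives in a Git repository, the execute bit
# can also be recorded in the repository itself with
#   git update-index --chmod=+x wait-for-it.sh
```

Keeping RUN chmod +x ./wait-for-it.sh in the Dockerfile is still the most direct fix, since it does not depend on how the build machine checks out the files.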
If the problem is still not solved, see: https://github.com/nodejs/docker-node/issues/6
References:
"Docker and Rabbitmq: ECONNREFUSED between containers"
"docker-compose start kibana failed!--No such file or directory/usr/bin/env: bash"