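TiDB Ansible Deployment and Problems Encountered

1. TiDB Ansible Deployment

For the deployment itself, follow the official documentation, which is very detailed: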

https://pingcap.com/docs-cn/op-guide/ansible-deployment/

 

 

2. Problems Encountered During Deployment

2.1.1 Problem 1

One of the configured repositories failed (Zabbix Official Repository - x86_64),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo=zabbix ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable zabbix
        or
            subscription-manager repos --disable=zabbix

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=zabbix.skip_if_unavailable=true

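2.1.2 Solution 1

The output above lists the fixes that yum itself suggests. Zabbix had been installed online on this machine and a zabbix repository was added along with it, so every later yum command fails with this error. The fix is to delete that repository file: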

rm -fr /etc/yum.repos.d/xxx.repo
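
If it is not obvious which file under /etc/yum.repos.d defines the failing repository, a small helper along these lines can locate it before anything is deleted. This is only a sketch: the function name find_repo_files is made up for illustration, and it assumes the repository section name starts with zabbix, as the yum message above suggests.

```shell
# Hypothetical helper: print every *.repo file in a directory that
# contains a section header starting with the given prefix.
find_repo_files() {
    repo_dir="$1"   # directory holding the .repo files
    prefix="$2"     # start of the section name, e.g. zabbix
    grep -l "^\[${prefix}" "$repo_dir"/*.repo 2>/dev/null
}

# Example (not executed here):
#   find_repo_files /etc/yum.repos.d zabbix
```

Checking the printed file name first makes it less likely that an unrelated repository file is removed by mistake.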

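2.2.1 Problem 2

Running this command:

ansible-playbook -i hosts.ini deploy_ntp.yml -u tidb -b

or this one (if every server returns tidb, SSH mutual trust is configured correctly):

ansible -i inventory.ini all -m shell -a 'whoami'

or this one (if every server returns root, passwordless sudo for the tidb user is configured correctly):

ansible -i inventory.ini all -m shell -a 'whoami' -b

may fail with the following error: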

Ansible FAILED!

Start to adjust time with pool.ntp.org

ntpdate[50809]:no server suitable for synchronization found


2.2.2 Solution 2

  • Check the server entry in /etc/ntp.conf; suppose it is ntp02.intstg.sfdc.com.cn.
  • Change ntp_server in tidb_ansible_master/hosts.ini to the same value, i.e. ntp02.intstg.sfdc.com.cn.
  • Then set up the NTP service by following the link below. The NTP server you configure there must be the same one as above, i.e. ntp02.intstg.sfdc.com.cn.
  • https://pingcap.com/docs-cn/op-guide/ansible-deployment/#%E5%A6%82%E4%BD%95%E6%A3%80%E6%B5%8B-ntp-%E6%9C%8D%E5%8A%A1%E6%98%AF%E5%90%A6%E6%AD%A3%E5%B8%B8
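
The comparison in the first two steps can be done by eye, but a rough sketch like the following prints both values side by side. The function names are hypothetical, and the sketch assumes that hosts.ini stores the setting as a line of the form `ntp_server = <host>`; adjust the pattern if your file differs.

```shell
# Hypothetical helper: print the host of every active "server" line
# in an ntp.conf-style file (commented-out lines are skipped).
ntp_servers() {
    awk '$1 == "server" { print $2 }' "$1"
}

# Hypothetical helper: print the value of ntp_server from an
# inventory file, assuming a "ntp_server = host" line.
inventory_ntp_server() {
    awk -F= '$1 ~ /^[ \t]*ntp_server[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# Example (not executed here):
#   ntp_servers /etc/ntp.conf
#   inventory_ntp_server hosts.ini
```

If the two outputs differ, edit hosts.ini so that ntp_server matches the server listed in /etc/ntp.conf.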

2.3.1 Problem 3

Running the command

ansible-playbook deploy.yml

fails with the following error:

Ansible FAILED!

playbook:deploy.yml;TASK:node_exporter:deploy node_exporter binary;

Could not find or access '/home/tidb/tidb-ansible-master/resources/bin/node_exporter'

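2.3.2 Solution 3

This happens because a file is missing from the installation package: node_exporter is absent from the /home/tidb/tidb-ansible-master/resources/bin directory. Download a complete installation package again and rerun the installation.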
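
To catch this before running deploy.yml, a quick pre-flight check on the resources/bin directory can help. The sketch below uses a made-up function name, check_binaries, and only tests for the file named in the error above; the full list of binaries the playbook expects is not given here, so pass whichever names you need.

```shell
# Hypothetical helper: report which of the named files are missing
# from a directory. Returns 0 if all are present, 1 otherwise.
check_binaries() {
    bin_dir="$1"
    shift
    missing=0
    for name in "$@"; do
        if [ ! -f "$bin_dir/$name" ]; then
            echo "missing: $bin_dir/$name"
            missing=1
        fi
    done
    return "$missing"
}

# Example (not executed here):
#   check_binaries /home/tidb/tidb-ansible-master/resources/bin node_exporter
```

A non-zero return value means the package is incomplete and should be downloaded again.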


2.4.1 Problem 4

If one server in a TiDB cluster goes down, how do you add it back to the cluster once the underlying problem has been fixed?

 

 

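2.4.2 Solution 4

Suppose a server (for example 10.222.222.123) has gone down. First check whether the NTP service on that server is running normally, with the following command: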

sudo systemctl status ntpd.service


If the NTP service reports errors, run the following three steps (the ntpdate target is the NTP server mentioned above):

sudo systemctl stop ntpd.service

sudo ntpdate ntp02.intstg.sfdc.com.cn

sudo systemctl start ntpd.service

 

Then restart the TiDB services from the control machine, and the server rejoins the cluster. The commands are:

ansible-playbook stop.yml

ansible-playbook start.yml

 

 

 
