在学习Linux运维时,普遍反馈是:Linux Shell是一个很难的知识板块。虽然大家都认真学,基本的语法也都掌握了,但有需求时,很难直接上手编程,要么写了很久,要么写不好!
也有很多做运维很多年的朋友也是如此,Shell脚本一直写的不6!在网上看例子能照猫画虎写出来,完全独立写就困难了。对于初学者而言,因为没有实战经验,写不出来Shell脚本很正常,如果工作了几年的运维老年还是写不出来,那就是没主动找需求,缺乏练习,缺乏经验。
针对以上问题,总结了30个生产环境中经典的Shell脚本,通过这些需求案例,希望能帮助大家提升Shell编写思路,掌握编写技巧。

先了解下编写Shell过程中注意事项:
开头加解释器:#!/bin/bash
语法缩进,使用四个空格;多加注释说明。
命名建议规则:变量名大写、局部变量小写,函数名小写,名字体现出实际作用。
默认变量是全局的,在函数中变量local指定为局部变量,避免污染其他作用域。
有两个命令能帮助我调试脚本:set -e 遇到执行非0时退出脚本,set-x 打印执行过程。
写脚本一定先测试再到生产上。

1、获取随机字符串或数字

获取随机8位字符串:
方法1:
# echo $RANDOM |md5sum |cut -c 1-8
471b94f2
方法2:
# openssl rand -base64 4
vg3BEg==
方法3:
# cat /proc/sys/kernel/random/uuid |cut -c 1-8
ed9e032c
获取随机8位数字:
方法1:
# echo $RANDOM |cksum |cut -c 1-8
23648321
方法2:
# openssl rand -base64 4 |cksum |cut -c 1-8
38571131
方法3:
# date +%N |cut -c 1-8
69024815
cksum:打印CRC效验和统计字节

2、定义一个颜色输出字符串函数

方法1:
function echo_color() {
if [ $1 == "green" ]; then
echo -e "\033[32;40m$2\033[0m"
elif [ $1 == "red" ]; then
echo -e "\033[31;40m$2\033[0m"
fi
}
方法2:
function echo_color() {
case $1 in
green)
echo -e "[32;40m$2[0m"
;;
red)
echo -e "[31;40m$2[0m"
;;
*)
echo "Example: echo_color red string"
esac
}
使用方法:echo_color green "test"
function关键字定义一个函数,可加或不加。

3、批量创建用户

#!/bin/bash
DATE=$(date +%F_%T)
USER_FILE=user.txt
echo_color(){
if [ $1 == "green" ]; then
echo -e "[32;40m$2[0m"
elif [ $1 == "red" ]; then
echo -e "[31;40m$2[0m"
fi
}
# 如果用户文件存在并且大小大于0就备份
if [ -s $USER_FILE ]; then
mv $USER_FILE ${USER_FILE}-${DATE}.bak
echo_color green "$USER_FILE exist, rename ${USER_FILE}-${DATE}.bak"
fi
echo -e "User Password" >> $USER_FILE
echo "----------------" >> $USER_FILE
for USER in user{1..10}; do
if ! id $USER &>/dev/null; then
PASS=$(echo $RANDOM |md5sum |cut -c 1-8)
useradd $USER
echo $PASS |passwd --stdin $USER &>/dev/null
echo -e "$USER $PASS" >> $USER_FILE
echo "$USER User create successful."
else
echo_color red "$USER User already exists!"
fi
done

4、检查软件包是否安装

#!/bin/bash
if rpm -q sysstat &>/dev/null; then
echo "sysstat is already installed."
else
echo "sysstat is not installed!"
fi

5、检查服务状态

#!/bin/bash
PORT_C=$(ss -anu |grep -c 123)
PS_C=$(ps -ef |grep ntpd |grep -vc grep)
if [ $PORT_C -eq 0 -o $PS_C -eq 0 ]; then
echo "内容" | mail -s "主题" [email protected]
fi

6、检查主机存活状态

方法1:将错误IP放到数组里面判断是否ping失败三次
#!/bin/bash
IP_LIST="192.168.18.1 192.168.1.1 192.168.18.2"
for IP in $IP_LIST; do
NUM=1
while [ $NUM -le 3 ]; do
if ping -c 1 $IP > /dev/null; then
echo "$IP Ping is successful."
break
else

echo "$IP Ping is failure $NUM"

        FAIL_COUNT[$NUM]=$IP
        let NUM++
    fi
done
if [ ${#FAIL_COUNT[*]} -eq 3 ];then
    echo "${FAIL_COUNT[1]} Ping is failure!"
    unset FAIL_COUNT[*]
fi

done
方法2:将错误次数放到FAIL_COUNT变量里面判断是否ping失败三次
#!/bin/bash
IP_LIST="192.168.18.1 192.168.1.1 192.168.18.2"
for IP in $IP_LIST; do
FAIL_COUNT=0
for ((i=1;i<=3;i++)); do
if ping -c 1 $IP >/dev/null; then
echo "$IP Ping is successful."
break
else

echo "$IP Ping is failure $i"

        let FAIL_COUNT++
    fi
done
if [ $FAIL_COUNT -eq 3 ]; then
    echo "$IP Ping is failure!"
fi

done
方法3:利用for循环将ping通就跳出循环继续,如果不跳出就会走到打印ping失败
#!/bin/bash
ping_success_status() {
if ping -c 1 $IP >/dev/null; then
echo "$IP Ping is successful."
continue
fi
}
IP_LIST="192.168.18.1 192.168.1.1 192.168.18.2"
for IP in $IP_LIST; do
ping_success_status
ping_success_status
ping_success_status
echo "$IP Ping is failure!"
done

7、监控CPU、内存和硬盘利用率

1)CPU
借助vmstat工具来分析CPU统计信息。
#!/bin/bash
DATE=$(date +%F" "%H:%M)
IP=$(ifconfig eth0 |awk -F [ :]+ /inet addr/{print $4} ) # 只支持CentOS6br/>MAIL="[email protected]"
if ! which vmstat &>/dev/null; then
echo "vmstat command no found, Please install procps package."
exit 1
fi
US=$(vmstat |awk NR==3{print $13} )
SY=$(vmstat |awk NR==3{print $14} )
IDLE=$(vmstat |awk NR==3{print $15} )
WAIT=$(vmstat |awk NR==3{print $16} )
USE=$(($US+$SY))
if [ $USE -ge 50 ]; then
echo "
Date: $DATE
Host: $IP
Problem: CPU utilization $USE
" | mail -s "CPU Monitor" $MAIL
fi
2)内存
#!/bin/bash
DATE=$(date +%F" "%H:%M)
IP=$(ifconfig eth0 |awk -F [ :]+ /inet addr/{print $4} ) br/>MAIL="[email protected]"
TOTAL=$(free -m |awk /Mem/{print $2} )
USE=$(free -m |awk /Mem/{print $3-$6-$7} )
FREE=$(($TOTAL-$USE))
# 内存小于1G发送报警邮件
if [ $FREE -lt 1024 ]; then
echo "
Date: $DATE
Host: $IP
Problem: Total=$TOTAL,Use=$USE,Free=$FREE
" | mail -s "Memory Monitor" $MAIL
fi
3)硬盘
#!/bin/bash
DATE=$(date +%F" "%H:%M)
IP=$(ifconfig eth0 |awk -F [ :]+ /inet addr/{print $4} ) br/>MAIL="[email protected]"
TOTAL=$(fdisk -l |awk -F [: ]+ BEGIN{OFS="="}/^Disk /dev/{printf "%s=%sG,",$2,$3} )
PART_USE=$(df -h |awk BEGIN{OFS="="}/^/dev/{print $1,int($5),$6} )
for i in $PART_USE; do
PART=$(echo $i |cut -d"=" -f1)
USE=$(echo $i |cut -d"=" -f2)
MOUNT=$(echo $i |cut -d"=" -f3)
if [ $USE -gt 80 ]; then
echo "
Date: $DATE
Host: $IP
Total: $TOTAL
Problem: $PART=$USE($MOUNT)
" | mail -s "Disk Monitor" $MAIL
fi
done

8、批量主机磁盘利用率监控

前提监控端和被监控端SSH免交互登录或者密钥登录。
写一个配置文件保存被监控主机SSH连接信息,文件内容格式:IP User Port

#!/bin/bash
HOST_INFO=host.info
for IP in $(awk /^[^#]/{print $1} $HOST_INFO); do
USER=$(awk -v ip=$IP ip==$1{print $2} $HOST_INFO)
PORT=$(awk -v ip=$IP ip==$1{print $3} $HOST_INFO)
TMP_FILE=/tmp/disk.tmp
ssh -p $PORT $USER@$IP df -h > $TMP_FILE
USE_RATE_LIST=$(awk BEGIN{OFS="="}/^/dev/{print $1,int($5)} $TMP_FILE)
for USE_RATE in $USE_RATE_LIST; do
PART_NAME=${USE_RATE%=}
USE_RATE=${USE_RATE#
=}
if [ $USE_RATE -ge 80 ]; then
echo "Warning: $PART_NAME Partition usage $USE_RATE%!"
fi
done
done

9、检查网站可用性

1)检查URL可用性
方法1:
check_url() {
HTTP_CODE=$(curl -o /dev/null --connect-timeout 3 -s -w "%{http_code}" $1)
if [ $HTTP_CODE -ne 200 ]; then
echo "Warning: $1 Access failure!"
fi
}
方法2:
check_url() {
if ! wget -T 10 --tries=1 --spider $1 >/dev/null 2>&1; then
#-T超时时间,--tries尝试1次,--spider爬虫模式
echo "Warning: $1 Access failure!"
fi
}
使用方法:check_url www.baidu.com

2)判断三次URL可用性
思路与上面检查主机存活状态一样。
方法1:利用循环技巧,如果成功就跳出当前循环,否则执行到最后一行
#!/bin/bash
check_url() {
HTTP_CODE=$(curl -o /dev/null --connect-timeout 3 -s -w "%{http_code}" $1)
if [ $HTTP_CODE -eq 200 ]; then
continue
fi
}
URL_LIST="www.baidu.com www.agasgf.com"
for URL in $URL_LIST; do
check_url $URL
check_url $URL
check_url $URL
echo "Warning: $URL Access failure!"
done
方法2:错误次数保存到变量
#!/bin/bash
URL_LIST="www.baidu.com www.agasgf.com"
for URL in $URL_LIST; do
FAIL_COUNT=0
for ((i=1;i<=3;i++)); do
HTTP_CODE=$(curl -o /dev/null --connect-timeout 3 -s -w "%{http_code}" $URL)
if [ $HTTP_CODE -ne 200 ]; then
let FAIL_COUNT++
else
break
fi
done
if [ $FAIL_COUNT -eq 3 ]; then
echo "Warning: $URL Access failure!"
fi
done
方法3:错误次数保存到数组

#!/bin/bash
URL_LIST="www.baidu.com www.agasgf.com"
for URL in $URL_LIST; do
NUM=1
while [ $NUM -le 3 ]; do
HTTP_CODE=$(curl -o /dev/null --connect-timeout 3 -s -w "%{http_code}" $URL)
if [ $HTTP_CODE -ne 200 ]; then
FAIL_COUNT[$NUM]=$IP #创建数组,以$NUM下标,$IP元素
let NUM++
else
break
fi
done
if [ ${#FAIL_COUNT[]} -eq 3 ]; then
echo "Warning: $URL Access failure!"
unset FAIL_COUNT[
] #清空数组
fi
done

10、检查MySQL主从同步状态

#!/bin/bash
USER=bak
PASSWD=123456
IO_SQLSTATUS=$(mysql -u$USER -p$PASSWD -e show slave statusG |awk -F: /Slave._Running/{gsub(": ",":");print $0} ) #gsub去除冒号后面的空格
for i in $IO_SQL_STATUS; do
THREAD_STATUS_NAME=${i%:
}
THREAD_STATUS=${i#*:}
if [ "$THREAD_STATUS" != "Yes" ]; then
echo "Error: MySQL Master-Slave $THREAD_STATUS_NAME status is $THREAD_STATUS!"
fi
done

11、iptables自动屏蔽访问网站频繁的IP

场景:恶意访问,安全防范

1)屏蔽每分钟访问超过200的IP

方法1:根据访问日志(Nginx为例)

#!/bin/bash
DATE=$(date +%d/%b/%Y:%H:%M)
ABNORMAL_IP=$(tail -n5000 access.log |grep $DATE |awk '{a[$1]++}END{for(i in a)if(a[i]>100)print i}')
#先tail防止文件过大,读取慢,数字可调整每分钟最大的访问量。awk不能直接过滤日志,因为包含特殊字符。
for IP in $ABNORMAL_IP; do
if [ $(iptables -vnL |grep -c "$IP") -eq 0 ]; then
iptables -I INPUT -s $IP -j DROP
fi
done
方法2:通过TCP建立的连接

#!/bin/bash
ABNORMAL_IP=$(netstat -an |awk '$4~/:80$/ && $6~/ESTABLISHED/{gsub(/:[0-9]+/,"",$5);{a[$5]++}}END{for(i in a)if(a[i]>100)print i}')
#gsub是将第五列(客户端IP)的冒号和端口去掉
for IP in $ABNORMAL_IP; do
if [ $(iptables -vnL |grep -c "$IP") -eq 0 ]; then
iptables -I INPUT -s $IP -j DROP
fi
done

2)屏蔽每分钟SSH尝试登录超过10次的IP

方法1:通过lastb获取登录状态:

#!/bin/bash
DATE=$(date +"%a %b %e %H:%M") #星期月天时分 %e单数字时显示7,而%d显示07
ABNORMAL_IP=$(lastb |grep "$DATE" |awk '{a[$3]++}END{for(i in a)if(a[i]>10)print i}')
for IP in $ABNORMAL_IP; do
if [ $(iptables -vnL |grep -c "$IP") -eq 0 ]; then
iptables -I INPUT -s $IP -j DROP
fi
done
方法2:通过日志获取登录状态
#!/bin/bash
DATE=$(date +"%b %d %H")
ABNORMAL_IP="$(tail -n10000 /var/log/auth.log |grep "$DATE" |awk '/Failed/{a[$(NF-3)]++}END{for(i in a)if(a[i]>5)print i}')"
for IP in $ABNORMAL_IP; do
if [ $(iptables -vnL |grep -c "$IP") -eq 0 ]; then
iptables -A INPUT -s $IP -j DROP
echo "$(date +"%F %T") - iptables -A INPUT -s $IP -j DROP" >>~/ssh-login-limit.log
fi
done

12、判断用户输入的是否为IP地址

方法1:

#!/bin/bash
function check_ip(){
IP=$1
VALID_CHECK=$(echo $IP|awk -F. '$1<=255&&$2<=255&&$3<=255&&$4<=255{print "yes"}')
if echo $IP|grep -E "^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$">/dev/null; then
if [ $VALID_CHECK == "yes" ]; then
echo "$IP available."
else
echo "$IP not available!"
fi
else
echo "Format error!"
fi
}
check_ip 192.168.1.1
check_ip 256.1.1.1
方法2:
#!/bin/bash
function check_ip(){
IP=$1
if [[ $IP =~ ^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$ ]]; then
FIELD1=$(echo $IP|cut -d. -f1)
FIELD2=$(echo $IP|cut -d. -f2)
FIELD3=$(echo $IP|cut -d. -f3)
FIELD4=$(echo $IP|cut -d. -f4)
if [ $FIELD1 -le 255 -a $FIELD2 -le 255 -a $FIELD3 -le 255 -a $FIELD4 -le 255 ]; then
echo "$IP available."
else
echo "$IP not available!"
fi
else
echo "Format error!"
fi
}
check_ip 192.168.1.1
check_ip 256.1.1.1
增加版:
加个死循环,如果IP可用就退出,不可用提示继续输入,并使用awk判断。
#!/bin/bash
function check_ip(){
local IP=$1
VALID_CHECK=$(echo $IP|awk -F. '$1<=255&&$2<=255&&$3<=255&&$4<=255{print "yes"}')
if echo $IP|grep -E "^[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}$" >/dev/null; then
if [ $VALID_CHECK == "yes" ]; then
return 0
else
echo "$IP not available!"
return 1
fi
else
echo "Format error! Please input again."
return 1
fi
}
while true; do
read -p "Please enter IP: " IP
check_ip $IP
[ $? -eq 0 ] && break || continue
done

13、判断用户输入的是否为数字

方法1:
#!/bin/bash
if [[ $1 =~ ^[0-9]+$ ]]; then
echo "Is Number."
else
echo "No Number."
fi
方法2:
#!/bin/bash
if [ $1 -gt 0 ] 2>/dev/null; then
echo "Is Number."
else
echo "No Number."
fi
方法3:
#!/bin/bash
echo $1 |awk '{print $0~/^[0-9]+$/?"Is Number.":"No Number."}' #三目运算符
12.14 找出包含关键字的文件
DIR=$1
KEY=$2
for FILE in $(find $DIR -type f); do
if grep $KEY $FILE &>/dev/null; then
echo "--> $FILE"
fi
done

14、给定目录找出包含关键字的文件

#!/bin/bash
DIR=$1
KEY=$2
for FILE in $(find $DIR -type f); do
if grep $KEY $FILE &>/dev/null; then
echo "--> $FILE"
fi
done

15、监控目录,将新创建的文件名追加到日志中

场景:记录目录下文件操作。

需先安装inotify-tools软件包。

#!/bin/bash
MON_DIR=/opt
inotifywait -mq --format %f -e create $MON_DIR |\
while read files; do
echo $files >> test.log
done

16、给用户提供多个网卡选择
场景:服务器多个网卡时,获取指定网卡,例如网卡流量

#!/bin/bash
function local_nic() {
local NUM ARRAY_LENGTH
NUM=0
for NIC_NAME in $(ls /sys/class/net|grep -vE "lo|docker0"); do
NIC_IP=$(ifconfig $NIC_NAME |awk -F'[: ]+' '/inet addr/{print $4}')
if [ -n "$NIC_IP" ]; then
NIC_IP_ARRAY[$NUM]="$NIC_NAME:$NIC_IP" #将网卡名和对应IP放到数组
let NUM++
fi
done
ARRAY_LENGTH=${#NIC_IP_ARRAY[]}
if [ $ARRAY_LENGTH -eq 1 ]; then #如果数组里面只有一条记录说明就一个网卡
NIC=${NIC_IP_ARRAY[0]%:
}
return 0
elif [ $ARRAY_LENGTH -eq 0 ]; then #如果没有记录说明没有网卡
echo "No available network card!"
exit 1
else
#如果有多条记录则提醒输入选择
for NIC in ${NIC_IP_ARRAY[]}; do
echo $NIC
done
while true; do
read -p "Please enter local use to network card name: " INPUT_NIC_NAME
COUNT=0
for NIC in ${NIC_IP_ARRAY[
]}; do
NIC_NAME=${NIC%:}
if [ $NIC_NAME == "$INPUT_NIC_NAME" ]; then
NIC=${NIC_IP_ARRAY[$COUNT]%:
}
return 0
else
COUNT+=1
fi
done
echo "Not match! Please input again."
done
fi
}
local_nic

17、查看网卡实时流量

适用于CentOS6操作系统。

#!/bin/bash
# Description: Only CentOS6
traffic_unit_conv() {
local traffic=$1
if [ $traffic -gt 1024000 ]; then
printf "%.1f%s" "$(($traffic/1024/1024))" "MB/s"
elif [ $traffic -lt 1024000 ]; then
printf "%.1f%s" "$(($traffic/1024))" "KB/s"
fi
}
NIC=$1
echo -e " In ------ Out"
while true; do
OLD_IN=$(awk -F'[: ]+' '$0~"'$NIC'"{print $3}' /proc/net/dev)
OLD_OUT=$(awk -F'[: ]+' '$0~"'$NIC'"{print $11}' /proc/net/dev)
sleep 1
NEW_IN=$(awk -F'[: ]+' '$0~"'$NIC'"{print $3}' /proc/net/dev)
NEW_OUT=$(awk -F'[: ]+' '$0~"'$NIC'"{print $11}' /proc/net/dev)
IN=$(($NEW_IN-$OLD_IN))
OUT=$(($NEW_OUT-$OLD_OUT))
echo "$(traffic_unit_conv $IN) $(traffic_unit_conv $OUT)"
sleep 1
done
使用:./traffic.sh eth0

18、MySQL数据库备份

#!/bin/bash
DATE=$(date +%F_%H-%M-%S)
HOST=192.168.1.120
DB=test
USER=bak
PASS=123456
MAIL="[email protected] [email protected]"
BACKUP_DIR=/data/db_backup
SQL_FILE=${DB}full$DATE.sql
BAK_FILE=${DB}full$DATE.zip
cd $BACKUP_DIR
if mysqldump -h$HOST -u$USER -p$PASS --single-transaction --routines --triggers -B $DB > $SQL_FILE; then
zip $BAK_FILE $SQL_FILE && rm -f $SQL_FILE
if [ ! -s $BAK_FILE ]; then
echo "$DATE 内容" | mail -s "主题" $MAIL
fi
else
echo "$DATE 内容" | mail -s "主题" $MAIL
fi
find $BACKUP_DIR -name '*.zip' -ctime +14 -exec rm {} \;
19、Nginx服务管理脚本
场景:使用源码包安装Nginx不含带服务管理脚本,也就是不能使用"service nginx start"或"/etc/init.d/nginx start",所以写了以下的服务管理脚本。

#!/bin/bash
# Description: Only support RedHat system
. /etc/init.d/functions
WORD_DIR=/usr/local/nginx
DAEMON=$WORD_DIR/sbin/nginx
CONF=$WORD_DIR/conf/nginx.conf
NAME=nginx
PID=$(awk -F'[; ]+' '/^[^#]/{if($0~/pid;/)print $2}' $CONF)
if [ -z "$PID" ]; then
PID=$WORD_DIR/logs/nginx.pid
else
PID=$WORD_DIR/$PID
fi
stop() {
$DAEMON -s stop
sleep 1
[ ! -f $PID ] && action "
Stopping $NAME" /bin/true || action " Stopping $NAME" /bin/false
}
start() {
$DAEMON
sleep 1
[ -f $PID ] && action "
Starting $NAME" /bin/true || action " Starting $NAME" /bin/false
}
reload() {
$DAEMON -s reload
}
test_config() {
$DAEMON -t
}
case "$1" in
start)
if [ ! -f $PID ]; then
start
else
echo "$NAME is running..."
exit 0
fi
;;
stop)
if [ -f $PID ]; then
stop
else
echo "$NAME not running!"
exit 0
fi
;;
restart)
if [ ! -f $PID ]; then
echo "$NAME not running!"
start
else
stop
start
fi
;;
reload)
reload
;;
testconfig)
test_config
;;
status)
[ -f $PID ] && echo "$NAME is running..." || echo "$NAME not running!"
;;
)
echo "Usage: $0 {start|stop|restart|reload|testconfig|status}"
exit 3
;;
esac

20、用户根据菜单选择要连接的Linux主机
Linux主机SSH连接信息:
**# cat host.txt
Web 192.168.1.10 root 22
DB 192.168.1.11 root 22
内容格式:主机名 IP User Port

#!/bin/bash
PS3="Please input number: "
HOST_FILE=host.txt
while true; do
select NAME in $(awk '{print $1}' $HOST_FILE) quit; do
[ ${NAME:=empty} == "quit" ] && exit 0
IP=$(awk -v NAME=${NAME} '$1==NAME{print $2}' $HOST_FILE)
USER=$(awk -v NAME=${NAME} '$1==NAME{print $3}' $HOST_FILE)
PORT=$(awk -v NAME=${NAME} '$1==NAME{print $4}' $HOST_FILE)
if [ $IP ]; then
echo "Name: $NAME, IP: $IP"
ssh -o StrictHostKeyChecking=no -p $PORT -i id_rsa $USER@$IP # 密钥免交互登录
break
else
echo "Input error, Please enter again!"
break
fi
done
done