[Intermittent sysupgrade failures on OpenWrt]

1. Serial console output from a failed upgrade

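Recently, sysupgrade on OpenWrt has been failing intermittently. The failure happens while the process list is being killed, before the upgrade proper begins, and the log confirms that some processes were not killed. Reading the sysupgrade script together with the stage2 script explains the rest: the upgrade exits before it ever reaches the flash read/write step, and the device then simply reboots.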

root@Runaiot:/tmp# sy
sync        sysctl      syslogd     sysupgrade
root@Runaiot:/tmp# sysupgrade -F firmware.bin 
Thu Dec  8 17:43:32 CST 2022 upgrade: Saving config files...
Thu Dec  8 17:43:32 CST 2022 upgrade: Commencing upgrade. Closing all shell sessions.
Watchdog handover: fd=3
- watchdog -
killall: telnetd: no process killed
Thu Dec  8 17:43:32 CST 2022 upgrade: Sending TERM to remaining processes ... ubusd askfirst analysis_string bandwidth_serve urngd uhttpd sh syslogd snmpd sh sh ntpd sh cat sh logd sleep rpcd sleep hostapd wpa_supplicant netifd odhcpd
[  346.386249] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 20
[  346.407998] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 13
[  346.414120] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 12
[  346.418816] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 10
[  346.423890] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 5
[  346.431180] device wlan1 left promiscuous mode
[  346.434326] br-lan: port 4(wlan1) entered disabled state
[  346.503481] ath10k_ahb a000000.wifi: peer-unmap-event: unknown peer id 1
[  346.513654] batman_adv: bat0: Interface deactivated: mesh0
[  346.514071] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
[  346.518224] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
***Thu Dec  8 17:43:36 CST 2022 upgrade: Sending KILL to remaining processes ... hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant
Thu Dec  8 17:43:36 CST 2022 upgrade: Failed to kill alormat: Log Type - Time(microsec) - Message - Optional Info***  
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.BF.3.1.1-00126
S - IMAGE_VARIANT_STRING=DAABANAZA
S - OEM_IMAGE_VERSION_STRING=CRM
S - Boot Config, 0x00000021
S - Reset stat

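The key lines are the two below. The second message is stage2's "Failed to kill all processes."; on the serial console it appears to be cut off by the boot log that starts right after the reboot.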

***Thu Dec  8 17:43:36 CST 2022 upgrade: Sending KILL to remaining processes ... hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant hostapd wpa_supplicant
Thu Dec  8 17:43:36 CST 2022 upgrade: Failed to kill alormat: Log Type - Time(microsec) - Message - Optional Info***  
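A quick way to read the "Sending KILL" line is to count how often each process name occurs in it. The sketch below assumes the log line is passed as a single argument; `count_names` is a hypothetical helper name, not part of OpenWrt. In the log above, hostapd and wpa_supplicant each appear ten times, which matches loop_limit=10 in kill_remaining: both were found again on every pass.

```shell
# count_names: hypothetical helper, not part of OpenWrt.
# $1: one "Sending <sig> to remaining processes ..." log line.
# Prints "<name> <count>" for every process name listed on that line.
count_names() {
    # Drop everything up to and including "processes ... ", then put one
    # name per line, skip empty lines, and count the duplicates.
    printf '%s\n' "${1#*processes ... }" \
        | tr -s ' ' '\n' \
        | grep -v '^$' \
        | sort \
        | uniq -c \
        | awk '{print $2, $1}'
}

# Usage (paste the real line from the serial log):
#   count_names "... upgrade: Sending KILL to remaining processes ... hostapd wpa_supplicant"
```

A name that shows up once per pass until the limit is reached is being restarted, not merely slow to exit.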

2. Fix

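The occasional failure is mainly caused by the hostapd daemon starting again after it has been killed. The fix is to edit the kill_remaining function in package/base-files/files/lib/upgrade/stage2 and comment out the exit 1 that runs when not all processes could be killed: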

kill_remaining() { # [ <signal> [ <loop> ] ]
    local loop_limit=10

    local sig="${1:-TERM}"
    local loop="${2:-0}"
    local run=true
    local stat
    local proc_ppid=$(cut -d' ' -f4  /proc/$$/stat)

    vn "Sending $sig to remaining processes ..."

    while $run; do
        run=false
        for stat in /proc/[0-9]*/stat; do
            [ -f "$stat" ] || continue

            local pid name state ppid rest
            read pid name state ppid rest < $stat
            name="${name#(}"; name="${name%)}"

            # Skip PID1, our parent, ourself and our children
            [ $pid -ne 1 -a $pid -ne $proc_ppid -a $pid -ne $$ -a $ppid -ne $$ ] || continue

            local cmdline
            read cmdline < /proc/$pid/cmdline

            # Skip kernel threads
            [ -n "$cmdline" ] || continue

            _vn " $name"
            kill -$sig $pid 2>/dev/null

            [ $loop -eq 1 ] && run=true
        done

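        let loop_limit--
        [ $loop_limit -eq 0 ] && {
            _v
            v "Failed to kill all processes."
            # exit 1 is commented out so the upgrade is no longer aborted here;
            # the loop keeps retrying until a pass finds no process left.
            #exit 1
        }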
    done
    _v
}
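One thing to keep in mind: with exit 1 commented out, loop_limit no longer bounds the loop. If a process came back on every pass, the loop would not end. The sketch below is a variant, not the fix described in this post: it replaces the commented-out exit with break, so the retries stay bounded while the upgrade still continues. It is a standalone model of the loop only; `retry_passes` and its argument are hypothetical and stand in for "number of passes that still find a process".

```shell
# retry_passes: hypothetical standalone model of the kill_remaining loop.
# $1: number of passes during which a process is still found.
retry_passes() {
    alive=$1
    loop_limit=10
    passes=0
    run=true
    while $run; do
        run=false
        passes=$((passes + 1))
        # Stand-in for "the /proc scan found something to kill".
        [ "$passes" -le "$alive" ] && run=true

        loop_limit=$((loop_limit - 1))
        if [ "$loop_limit" -eq 0 ]; then
            echo "Failed to kill all processes."
            break   # stop retrying, but return so the caller can carry on
        fi
    done
    echo "passes=$passes"
}

# Usage:
#   retry_passes 3      # processes are gone after three passes
#   retry_passes 100    # processes keep coming back; stops at the limit
```

Whether it is safe to continue flashing while hostapd is still running depends on the device, so this trade-off should be tested on the target hardware.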
