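I. Problem description
During a permission change, the ssh session suddenly dropped. Investigation showed that sshd could not be restarted: the unit stayed in an abnormal state and each start attempt ended with a timeout. The on-site environment runs version 8.2: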
polkitd[542]: Unregistered Authentication Agent for unix-process:6501:2207619775 (system bus name :1.1204804, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
systemd: sshd.service start operation timed out. Terminating.
systemd: sshd.service start operation timed out. Terminating.
sshd[6508]: Received signal 15; terminating.
systemd: Failed to start OpenSSH server daemon.
systemd: Unit sshd.service entered failed state.
systemd: sshd.service failed.
systemd: Failed to start OpenSSH server daemon.
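The signature to look for in these logs is the start timeout. A minimal sketch for counting it in saved log text, assuming the log is piped in on stdin (the helper name count_start_timeouts is illustrative and not part of any tool):

```shell
#!/bin/sh
# Count "start operation timed out" events for a unit in log text read from stdin.
# Hypothetical helper; the unit name defaults to sshd.service.
count_start_timeouts() {
    unit="${1:-sshd.service}"
    # -F: fixed-string match, -c: print the number of matching lines.
    # grep exits 1 when the count is 0, so the status is ignored here.
    grep -F -c "${unit} start operation timed out" || true
}

# Example (on a systemd host):
#   journalctl -u sshd --no-pager | count_start_timeouts
```

A non-zero count alongside "Received signal 15; terminating." suggests that sshd itself came up and was then killed by systemd when the start timeout expired.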
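II. Troubleshooting process
1) Check the errors in the logs, then debug as follows:
# Experience suggests the service unit may not be invoked correctly; try re-registering it as follows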
mv /usr/lib/systemd/system/sshd.service /usr/lib/systemd/system/sshd.service.bak
systemctl daemon-reload
mv /usr/lib/systemd/system/sshd.service.bak /usr/lib/systemd/system/sshd.service
systemctl start sshd
systemctl enable --now sshd.service
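# /var/run had been granted permissions recursively during this change, so check the permissions on /var
ll -d /var/ # 755 on site; the original note suggested trying 744, but that removes search (x) permission for non-owners, so 755 is the safer value to keep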
drwxr-xr-x. 23 root root 4096 May 24 2022 /var/
ll /var/empty/
d--x--x--x. 2 root root 4096 Apr 15 2020 sshd
ll -d /etc/pki/
drwxr-xr-x. 10 root root 4096 Jul 2 2018 /etc/pki/
ll /etc/pki/
total 32
drwxr-xr-x. 6 root root 4096 Aug 9 2019 CA
drwxr-xr-x. 4 root root 4096 Jul 2 2018 ca-trust
drwxr-xr-x. 2 root root 4096 May 23 2022 java
drwxr-xr-x. 2 root root 4096 Feb 23 2020 nssdb
drwxr-xr-x. 2 root root 4096 Feb 20 2020 nss-legacy
drwxr-xr-x. 2 root root 4096 Jul 2 2018 rpm-gpg
drwx------. 2 root root 4096 Apr 11 2018 rsyslog
drwxr-xr-x. 5 root root 4096 May 23 2022 tls
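The directory checks above can be scripted. A minimal sketch, assuming GNU stat from coreutils and a hypothetical helper name check_dir_mode, that reports a directory whose mode differs from the expected one (755 for /var and /etc/pki, as in the listings above):

```shell
#!/bin/sh
# Report whether a path has the expected octal mode.
# Usage: check_dir_mode PATH EXPECTED_MODE
check_dir_mode() {
    path="$1"
    want="$2"
    # %a prints the permission bits in octal (GNU stat)
    have=$(stat -c '%a' "$path") || return 2
    if [ "$have" = "$want" ]; then
        echo "OK   $path ($have)"
    else
        echo "DIFF $path (have $have, expected $want)"
        return 1
    fi
}

# Example:
#   check_dir_mode /var 755
#   check_dir_mode /etc/pki 755
```

Comparing against a known-good host of the same release is usually more reliable than guessing the expected modes.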
# semanage was used to add a port while restarting Nginx, so a temporary SELinux effect was suspected
getenforce
setenforce 0
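Before switching SELinux to permissive mode, it may help to confirm that a denial was actually logged for sshd. A minimal sketch, assuming the audit log lives at /var/log/audit/audit.log (the usual auditd location) and using a hypothetical helper name:

```shell
#!/bin/sh
# Print AVC denial records that mention the sshd command name.
# The default log path is an assumption; pass another path as $1 if needed.
sshd_avc_denials() {
    log="${1:-/var/log/audit/audit.log}"
    grep 'avc:' "$log" | grep 'denied' | grep 'comm="sshd"'
}

# Example (as root):
#   sshd_avc_denials || echo "no sshd denials logged"
```

If nothing is printed, SELinux is less likely to be the cause, and setenforce 0 would not be expected to change the outcome.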
# Try starting it manually
/etc/rc.d/init.d/sshd start
/usr/sbin/sshd -f /etc/ssh/sshd_config
# Test that the configuration file is valid
sshd -t
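Since sshd -t only validates the configuration and keys, it can be used as a guard before any restart. A minimal sketch, assuming a systemd host; SSHD_BIN, RESTART_CMD and safe_restart_sshd are illustrative names introduced here:

```shell
#!/bin/sh
# Restart sshd only if its configuration passes the built-in test mode.
# Both variables can be overridden from the environment.
SSHD_BIN="${SSHD_BIN:-/usr/sbin/sshd}"
RESTART_CMD="${RESTART_CMD:-systemctl restart sshd}"

safe_restart_sshd() {
    if "$SSHD_BIN" -t; then
        # Word splitting of RESTART_CMD is intentional here
        $RESTART_CMD
    else
        echo "sshd -t failed; not restarting" >&2
        return 1
    fi
}

# Example:
#   safe_restart_sshd
```

This does not help with the timeout described in this article, since the configuration was valid here, but it rules out one common cause early.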
# Reference init script
#!/bin/bash
#
# Init file for OpenSSH server daemon
#
# chkconfig: 2345 55 25
# description: OpenSSH server daemon
#
# processname: sshd
# config: /etc/ssh/ssh_host_key
# config: /etc/ssh/ssh_host_key.pub
# config: /etc/ssh/ssh_random_seed
# config: /etc/ssh/sshd_config
# pidfile: /var/run/sshd.pid

# source function library
. /etc/rc.d/init.d/functions

# pull in sysconfig settings
[ -f /etc/sysconfig/sshd ] && . /etc/sysconfig/sshd

RETVAL=0
prog="sshd"

# Some functions to make the below more readable
SSHD=/usr/sbin/sshd
PID_FILE=/var/run/sshd.pid

do_restart_sanity_check()
{
    $SSHD -t
    RETVAL=$?
    if [ $RETVAL -ne 0 ]; then
        failure $"Configuration file or keys are invalid"
        echo
    fi
}

start()
{
    # Create keys if necessary
    /usr/bin/ssh-keygen -A
    if [ -x /sbin/restorecon ]; then
        /sbin/restorecon /etc/ssh/ssh_host_rsa_key.pub
        /sbin/restorecon /etc/ssh/ssh_host_dsa_key.pub
        /sbin/restorecon /etc/ssh/ssh_host_ecdsa_key.pub
    fi
    echo -n $"Starting $prog:"
    $SSHD $OPTIONS && success || failure
    RETVAL=$?
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/sshd
    echo
}

stop()
{
    echo -n $"Stopping $prog:"
    killproc $SSHD -TERM
    RETVAL=$?
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/sshd
    echo
}

reload()
{
    echo -n $"Reloading $prog:"
    killproc $SSHD -HUP
    RETVAL=$?
    echo
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    reload)
        reload
        ;;
    condrestart)
        if [ -f /var/lock/subsys/sshd ] ; then
            do_restart_sanity_check
            if [ $RETVAL -eq 0 ] ; then
                stop
                # avoid race
                sleep 3
                start
            fi
        fi
        ;;
    status)
        status $SSHD
        RETVAL=$?
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|reload|condrestart|status}"
        RETVAL=1
esac
exit $RETVAL
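2) The previous evening, sshd could not be brought into a normal state no matter what was tried. The process was listed as /usr/sbin/sshd -D [listener] 0 of 10-100 startups, yet the unit status always ended in a timeout. Interestingly, when checked the next day, a restart succeeded.
When the same problem appeared again later, it turned out that after retiring the sshd.service unit under systemd, a restart falls back to the sshd init script and works normally:
The original startup status was as follows:
Note: reported experience suggests editing the Makefile and appending -lsystemd to the LIBS variable, for example LIBS=-lcrypto -ldl -lutil -lz -lcrypt -lresolv -lsystemd, then recompiling. In other words, a build made without it has trouble being started and managed by systemd. Separately, if sshd or the system shared libraries have missing or corrupted files or wrong permissions, sshd can also fail to start, or conflict and restart.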
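The Makefile change in the note can be applied mechanically. A minimal sketch, assuming the Makefile has a single line starting with LIBS= as in the example above; add_lsystemd is a hypothetical helper name:

```shell
#!/bin/sh
# Append -lsystemd to the LIBS= line of a Makefile if it is not already there.
# Usage: add_lsystemd path/to/Makefile
add_lsystemd() {
    mk="$1"
    # Nothing to do when the flag is already on the LIBS line
    if grep -q '^LIBS=.*-lsystemd' "$mk"; then
        return 0
    fi
    # Keep a backup, then append the flag to the end of the LIBS line
    cp "$mk" "$mk.bak" &&
    sed 's/^LIBS=.*/& -lsystemd/' "$mk.bak" > "$mk"
}

# Example (run in the OpenSSH source directory after ./configure):
#   add_lsystemd Makefile
```

Because the Makefile is regenerated by configure, the edit has to be repeated after each configure run.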
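3) Appendix: about polkit
polkit is an application-level toolkit on Linux for authorization management (Authorization Manager). By defining and evaluating permission rules, it mediates communication between processes of different privilege levels: the decision is centralized in one framework, which determines whether an unprivileged process may access a privileged one.
Polkit controls permissions at the system level and provides a way for unprivileged and privileged processes to communicate. Unlike programs such as sudo, Polkit does not grant a process full root privileges; it grants finer-grained authorization through a centralized policy system.
Polkit defines a set of actions, such as running GParted, and classifies users by group or user name, such as members of the wheel group. It then defines whether each action may be performed by given users and whether extra confirmation is required first, for example entering a password to confirm that the user belongs to a certain group.
When some services are started, polkit itself may not be running properly, and the following error is reported:
Authorization not available. Check if polkit service is running or see debug mes
Checking polkit's status may then show it as failed; try restarting it:
# Confirm the user and group names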
cat /etc/passwd|grep polkit
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
# Check the service status
systemctl status polkit
● polkit.service - Authorization Manager
Loaded: loaded (/usr/lib/systemd/system/polkit.service; static; vendor preset: enabled)
Active: active (running) since Tue 2022-05-24 10:43:46 CST; 8 months 12 days ago
Docs: man:polkit(8)
Main PID: 542 (polkitd)
CGroup: /system.slice/polkit.service
└─542 /usr/lib/polkit-1/polkitd --no-debug
Feb 04 00:48:05 Yangguang-011 polkitd[542]: Registered Authentication Agent for unix-process:26570:2208281632 (system bus name :1.1205214 ...S.UTF-8)
Feb 04 00:49:13 Yangguang-011 polkitd[542]: Unregistered Authentication Agent for unix-process:26570:2208281632 (system bus name :1.120521...rom bus)
Feb 04 00:51:42 Yangguang-011 polkitd[542]: Registered Authentication Agent for unix-process:27093:2208303356 (system bus name :1.1205227 ...S.UTF-8)
Feb 04 00:51:42 Yangguang-011 polkitd[542]: Unregistered Authentication Agent for unix-process:27093:2208303356 (system bus name :1.120522...rom bus)
Feb 04 00:54:09 Yangguang-011 polkitd[542]: Registered Authentication Agent for unix-process:27680:2208318010 (system bus name :1.1205240 ...S.UTF-8)
Feb 04 00:55:33 Yangguang-011 polkitd[542]: Unregistered Authentication Agent for unix-process:27680:2208318010 (system bus name :1.120524...rom bus)
Feb 04 10:21:39 Yangguang-011 polkitd[542]: Registered Authentication Agent for unix-process:2065:2211723054 (system bus name :1.1207542 [...S.UTF-8)
Feb 04 10:21:39 Yangguang-011 polkitd[542]: Unregistered Authentication Agent for unix-process:2065:2211723054 (system bus name :1.1207542...rom bus)
Feb 04 10:55:40 Yangguang-011 polkitd[542]: Registered Authentication Agent for unix-process:8172:2211927116 (system bus name :1.1207682 [...S.UTF-8)
Feb 04 10:55:40 Yangguang-011 polkitd[542]: Unregistered Authentication Agent for unix-process:8172:2211927116 (system bus name :1.1207682...rom bus)
Hint: Some lines were ellipsized, use -l to show in full.
systemctl restart polkit.service # restart
/usr/lib/polkit-1/polkitd --no-debug & # start manually
ll /usr/lib/polkit-1/polkitd
-rwxr-xr-x. 1 root root 120432 Jan 26 2022 /usr/lib/polkit-1/polkitd
# Check the dbus service status
systemctl status dbus.service
● dbus.service - D-Bus System Message Bus
Loaded: loaded (/usr/lib/systemd/system/dbus.service; static; vendor preset: disabled)
Active: active (running) since Tue 2022-05-24 10:43:46 CST; 8 months 12 days ago
Docs: man:dbus-daemon(1)
Main PID: 553 (dbus-daemon)
CGroup: /system.slice/dbus.service
└─553 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
Jul 08 22:04:26 Yangguang-011 dbus[553]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.free...service'
Jul 08 22:04:26 Yangguang-011 dbus[553]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 18 10:28:48 Yangguang-011 dbus[553]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.free...service'
Jul 18 10:28:48 Yangguang-011 dbus[553]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Jul 23 11:52:17 Yangguang-011 dbus[553]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.free...service'
Jul 23 11:52:17 Yangguang-011 dbus[553]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
Hint: Some lines were ellipsized, use -l to show in full.
systemctl restart dbus.service # restart if abnormal
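The status checks in this appendix can be condensed into one pass. A minimal sketch using systemctl is-active; the helper name check_units is illustrative:

```shell
#!/bin/sh
# Print the active state of each unit named on the command line.
check_units() {
    for u in "$@"; do
        # is-active exits non-zero for inactive or failed units, so ignore the status
        state=$(systemctl is-active "$u" 2>/dev/null) || true
        echo "$u: ${state:-unknown}"
    done
}

# Example:
#   check_units polkit.service dbus.service sshd.service
```

In the incident described here both polkit and dbus were active, which points away from them as the cause of the sshd timeout.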
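4) Appendix: about the sshd service configuration
Reported experience suggests that the error "sshd.service holdoff time over, scheduling restart." appears because sshd, once started, never sends a readiness message to systemd. systemd keeps waiting (as it does for units declared Type=notify), and when the start timeout expires it restarts sshd. The service therefore hangs repeatedly in the start phase without ever being reported as started, even though logins sometimes still appear to work.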
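Whether systemd waits for such a message depends on the unit's Type= setting; Type=notify means the daemon must report readiness itself. A minimal sketch for reading that setting from a unit file, assuming the unit path used earlier in this article (the helper name unit_type is illustrative):

```shell
#!/bin/sh
# Print the value of the last Type= line in a systemd unit file.
unit_type() {
    file="${1:-/usr/lib/systemd/system/sshd.service}"
    # Strip the key and keep only the value; the last assignment wins
    sed -n 's/^Type=//p' "$file" | tail -n 1
}

# Example:
#   [ "$(unit_type)" = "notify" ] && echo "systemd expects READY=1 from sshd"
```

This reads only the one file; drop-in overrides are not considered, so systemctl show -p Type sshd.service is the better check on a live system.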
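Fix: patch the source. In the openssh-8.2p1 source directory, open sshd.c (the file containing main), find the line that calls server_accept_loop, and add the line sd_notify(0, "READY=1"); before it. Then add the header near the top of the file: #include <systemd/sd-daemon.h>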
/* Signal systemd that we are ready to accept connections */
sd_notify(0, "READY=1");
/* Accept a connection and return in a forked child */
server_accept_loop(&sock_in, &sock_out, &newsock, config_s);
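The two source edits can also be scripted rather than made by hand. A minimal sketch, assuming sshd.c contains the line #include "includes.h" and a call that begins server_accept_loop(&sock_in; the helper name patch_sshd_c is hypothetical, and the result is written to a separate .new file for review:

```shell
#!/bin/sh
# Insert the sd_notify call and its header into a copy of sshd.c.
# Usage: patch_sshd_c path/to/sshd.c   (writes path/to/sshd.c.new)
patch_sshd_c() {
    src="$1"
    # Files that already contain the call are copied unchanged
    if grep -q 'sd_notify' "$src"; then
        cp "$src" "$src.new"
        return 0
    fi
    awk '
        # Add the header after the first include of includes.h (assumed present)
        /^#include "includes.h"/ && !inc {
            print; print "#include <systemd/sd-daemon.h>"; inc = 1; next
        }
        # Add the readiness call just before the first server_accept_loop call
        /server_accept_loop\(&sock_in/ && !called {
            print "\tsd_notify(0, \"READY=1\");"; called = 1
        }
        { print }
    ' "$src" > "$src.new"
}

# Example:
#   patch_sshd_c sshd.c && diff -u sshd.c sshd.c.new
```

Reviewing the diff before replacing sshd.c is advisable, since the match patterns are guesses about the source layout.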
Then compile and install:
# sd_notify is not available from the default dependencies, so install the package that provides it
yum install systemd-devel
# Edit the generated Makefile: find the LIBS variable and append -lsystemd, as follows:
LIBS=-lcrypto -ldl -lutil -lz -lcrypt -lresolv -lsystemd
# Then
make && make install
Alternatively, delete the old sshd.service and regenerate the sshd service from the init script.
Article source: https://www.toymoban.com/news/detail-596233.html