Linux: subprocess (launching an ssh process in the background) communicate hangs if stderr is enabled

Posted by disbfnqx on 2023-10-16 in Linux


I have the following code, which does two things:

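1. ssh -f -M makes ssh start a child process in the background that owns a shared socket.
2. Because that process stays in the background, a second ssh connection can reuse the socket /tmp/control-channel to connect to the ssh server without a password.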

test.py:

import subprocess
import os
import sys
import stat

ssh_user = "my_user"       # change to your account
ssh_passwd = "my_password" # change to your password

try:
    os.remove("/tmp/control-channel")
except:
    pass

# prepare passwd file
file = open("./passwd","w")
passwd_content = f"#!/bin/sh\necho {ssh_passwd}"
file.write(passwd_content)
file.close()
os.chmod("./passwd", stat.S_IRWXU)

# setup shared ssh socket, put it in background
env = {'SSH_ASKPASS': "./passwd", 'DISPLAY':'', 'SSH_ASKPASS_REQUIRE':'force'}
args = ['ssh', '-f', '-o', 'LogLevel=ERROR', '-x', '-o', 'ConnectTimeout=30', '-o', 'ControlPersist=300', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no', '-o', 'ServerAliveInterval=15', '-MN', '-S', '/tmp/control-channel', '-p', '22', '-l', ssh_user, 'localhost']
process = subprocess.Popen(args, env=env,
        stdout=subprocess.PIPE,
#        stderr=subprocess.STDOUT,   # uncomment this line to enable stderr will make subprocess hang
        stdin=subprocess.DEVNULL,
        start_new_session=True)
sout, serr = process.communicate()
print(sout)
print(serr)

# use shared socket
args2 = ['ssh', '-o', 'LogLevel=ERROR', '-o', 'ControlPath=/tmp/control-channel', '-p', '22', '-l', ssh_user, 'localhost', 'uname -a']
process2 = subprocess.Popen(args2,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL)
content, _ = process2.communicate()
print(content)

Run:

$ python3 test.py
b''
None
b'Linux shmachine 4.19.0-21-amd64 #1 SMP Debian 4.19.249-2 (2022-06-30) x86_64 GNU/Linux\n'

So far so good. But if I uncomment stderr=subprocess.STDOUT in the first subprocess, it hangs:

$ python3 test.py
^CTraceback (most recent call last):
  File "test.py", line 29, in <module>
    sout, serr = process.communicate()
  File "/usr/lib/python3.7/subprocess.py", line 926, in communicate
    stdout = self.stdout.read()
KeyboardInterrupt

What is going wrong here?
My environment:

$ python3 --version
Python 3.7.3
$ ssh -V
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1n  15 Mar 2022
$ cat /etc/issue
Debian GNU/Linux 10 \n \l

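Update: I found this post, which is similar to my problem, but it has no answer.
Question 2: Changing communicate to wait makes it work, but the pipe size that wait relies on is certainly smaller than the memory that communicate can use, so I would still like to know why I can't make it work with communicate.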

linux

Source: https://stackoverflow.com/questions/77206483/subprocess-launch-ssh-process-in-background-communicate-hang-if-enable-stderr

3 answers


qmelpv7a1#

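The answer is actually hidden in the Python documentation for subprocess.Popen. The documentation for Popen.communicate says:
Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
So one (potentially) workable solution (I don't actually have an SSH setup to test with) is to add a timeout to the communicate call.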
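
A minimal sketch of the timeout idea, not tested against ssh. It assumes a POSIX sh is available and uses `sh -c "sleep 2 &"` as a stand-in for ssh -f: the shell exits at once, while the background sleep keeps the pipe open. The helper name communicate_with_timeout is hypothetical.

```python
import subprocess


def communicate_with_timeout(args, timeout):
    """Run args with stderr merged into stdout, giving up after `timeout` seconds.

    Returns (exit_code, timed_out). exit_code is None if the direct child
    is still running when the timeout expires.
    """
    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
    )
    try:
        process.communicate(timeout=timeout)
        return process.returncode, False
    except subprocess.TimeoutExpired:
        # The pipe did not reach end-of-file in time. Stop reading from it;
        # the direct child may well have exited already.
        process.stdout.close()
        return process.poll(), True


# Stand-in for `ssh -f` (an assumption, not real ssh): the shell returns
# immediately, but the background sleep inherits the pipe for two seconds.
result = communicate_with_timeout(["sh", "-c", "sleep 2 &"], timeout=0.5)
print(result)
```

The timeout bounds the wait but does not explain it: the direct child has already exited with status 0 here, and only the inherited pipe keeps communicate from returning.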


62lalag42#

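I removed the stdout pipe setting and kept only stderr to make a minimal test case, and finally used strace to confirm that this is a bug in older ssh versions that was fixed in 2020, in OpenSSH 8.4. So Python is behaving correctly.
1. I saw it stuck as follows: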

$ strace python3 test.py
lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
read(3, 0x1744f00, 8192)                = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=26779, si_uid=1001, si_status=0, si_utime=1, si_stime=0} ---
read(3,

$ sudo lsof -a -c python -d 3
COMMAND   PID     USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
python3 26778 nxa13855    3r  FIFO   0,12      0t0 9186593 pipe

$ ls -l "/proc/26778/fd/3"
lr-x------ 1 nxa13855 atg 64 Oct  5 20:22 /proc/26778/fd/3 -> 'pipe:[9186593]'
$ lsof | grep 9186593
python3   26778                   nxa13855    3r     FIFO               0,12      0t0    9186593 pipe
ssh       26783                   nxa13855    2w     FIFO               0,12      0t0    9186593 pipe

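2. This means that ssh -f did not close stderr when it put ssh in the background. The background ssh (PID 26783 above) still holds the write end of the pipe, so the read in communicate never sees end-of-file and hangs here: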

File "/usr/lib/python3.7/subprocess.py", line 929, in communicate
stderr = self.stderr.read()
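
The blocking read can be reproduced without ssh. The sketch below assumes a POSIX sh and uses a background sleep as a stand-in for the ssh process that keeps stderr open. Only stderr is piped, as in the minimal test described above.

```python
import subprocess
import time

start = time.monotonic()

# The shell exits immediately, but the background sleep inherits the
# write end of the stderr pipe and holds it for one second.
process = subprocess.Popen(
    ["sh", "-c", "sleep 1 &"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.PIPE,
    stdin=subprocess.DEVNULL,
)

# With a single pipe and no timeout, communicate() reads that pipe until
# end-of-file, which only arrives once the sleep has exited.
_, err = process.communicate()
elapsed = time.monotonic() - start

print(f"returncode={process.returncode} stderr={err!r} elapsed={elapsed:.1f}s")
```

Nothing is ever written to the pipe; the delay comes purely from waiting for the last holder of the write end to go away.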

I confirmed this with the OpenSSH commit.
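
On an OpenSSH version without that fix, one possible workaround is to capture output in a regular file instead of a pipe, so that wait() is enough and no pipe buffer limit applies. This is only a sketch under assumptions: the helper name run_capture_via_file is hypothetical, and a POSIX sh with a background sleep stands in for ssh -f.

```python
import subprocess
import tempfile


def run_capture_via_file(args, timeout=None):
    """Run args with stdout and stderr sent to a temporary file.

    Returns (exit_code, captured_bytes). A background process that keeps
    the file open does not delay the result, because nothing here waits
    for end-of-file.
    """
    with tempfile.TemporaryFile() as log:
        process = subprocess.Popen(
            args,
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
        )
        returncode = process.wait(timeout=timeout)
        log.seek(0)  # rewind to read what the child wrote
        return returncode, log.read()


# Stand-in for `ssh -f` (an assumption): print one line, then leave a
# background process holding the output descriptor.
result = run_capture_via_file(["sh", "-c", "echo ready; sleep 2 &"], timeout=10)
print(result)
```

The file only contains what was written before wait() returned; anything the background process writes later is not captured.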


jxct1oxe3#

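The problem is that the ssh -f command keeps running in the background after the process you started has exited. communicate() reads the standard output and standard error pipes until end-of-file and only then waits for the process, and a pipe reaches end-of-file only once every process holding its write end has closed it.
When you redirect the standard error of ssh -f into its standard output pipe, communicate() does not return, because the background ssh keeps that pipe open for as long as it runs.
To work around this, you can either:
1. Call process.terminate() to explicitly terminate the ssh -f command.
2. Use the timeout keyword argument of communicate() to set a time limit, after which communicate() raises subprocess.TimeoutExpired even if the pipe is still open.
Here is an example that takes the process.terminate() approach: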

import subprocess
import os
import stat

ssh_user = "my_user"       # change to your account
ssh_passwd = "my_password" # change to your password

try:
    os.remove("/tmp/control-channel")  # drop any stale control socket
except FileNotFoundError:
    pass

# prepare the askpass helper script that prints the password
with open("./passwd", "w") as file:
    file.write(f"#!/bin/sh\necho {ssh_passwd}")
os.chmod("./passwd", stat.S_IRWXU)

# setup shared ssh socket, put it in background
env = {'SSH_ASKPASS': "./passwd", 'DISPLAY':'', 'SSH_ASKPASS_REQUIRE':'force'}
args = ['ssh', '-f', '-o', 'LogLevel=ERROR', '-x', '-o', 'ConnectTimeout=30', '-o', 'ControlPersist=300', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no', '-o', 'ServerAliveInterval=15', '-MN', '-S', '/tmp/control-channel', '-p', '22', '-l', ssh_user, 'localhost']
process = subprocess.Popen(args, env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # enable stderr
        stdin=subprocess.DEVNULL,
        start_new_session=True)

# wait for the ssh command to start up
process.wait(timeout=10)

# use shared socket
args2 = ['ssh', '-o', 'LogLevel=ERROR', '-o', 'ControlPath=/tmp/control-channel', '-p', '22', '-l', ssh_user, 'localhost', 'uname -a']
process2 = subprocess.Popen(args2,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL)
content, _ = process2.communicate()

# terminate the ssh command
process.terminate()

print(content)

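Before using the shared socket, this code waits up to 10 seconds for the ssh -f command to start up, that is, for the foreground ssh process to exit after it has put itself in the background. If that does not happen within 10 seconds, wait() raises subprocess.TimeoutExpired.
Once the shared socket is available, the code uses it to connect to the SSH server and run uname -a, and the output of uname -a is printed to the console.
Finally, the code calls process.terminate(). Note that wait() has already returned by this point, so the process Python started has exited and terminate() has no effect on the ssh master that is still running in the background.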
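
Popen.terminate() only signals the direct child and does nothing once that child has exited, so stopping the background master needs another route. One option is ssh's own control command, `-O exit`, sent over the same control socket. The sketch below is untested against a live server; the helper names control_exit_args and stop_master are hypothetical, and the path and host are the ones used in the question.

```python
import subprocess


def control_exit_args(control_path, host):
    """Build the argv that asks the master behind control_path to exit."""
    return ["ssh", "-S", control_path, "-O", "exit", host]


def stop_master(control_path, host, timeout=10):
    """Send the exit request and return the exit status of that ssh client."""
    completed = subprocess.run(
        control_exit_args(control_path, host),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        timeout=timeout,
    )
    return completed.returncode


# Only the argv is shown here; stop_master() would need a running master.
print(control_exit_args("/tmp/control-channel", "localhost"))
```

In the script above, a call such as stop_master("/tmp/control-channel", "localhost") would take the place of process.terminate().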

