Paramiko channel gets stuck when reading large output

Source: Stack Overflow, http://stackoverflow.com/questions/14643861/ (translated page retrieved 2020-08-18 12:02:02 via igfitidea).
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors and keep the same license.

Tags: python, paramiko

Asked by vipulb

I have code that executes a command on a remote Linux machine and reads the output using Paramiko. The code looks like this:

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])


chan = self.ssh.get_transport().open_session()

chan.settimeout(10800)

try:
    # Execute the command
    chan.exec_command(cmd)

    contents = StringIO.StringIO()

    data = chan.recv(1024)

    # Capturing data from chan buffer.
    while data:
        contents.write(data)
        data = chan.recv(1024)

except socket.timeout:
    raise socket.timeout


output = contents.getvalue()

return output,chan.recv_stderr(600),chan.recv_exit_status()

The above code works for small outputs, but it gets stuck on larger outputs.

Is there a buffer-related issue here?

Accepted answer by bruce_w

I see no problem related to the stdout channel, but I'm not sure about the way you are handling stderr. Can you confirm that it's not the stderr capturing that's causing the problem? I'll try out your code and let you know.

Update: when the command you execute writes lots of messages to STDERR, your code freezes. I'm not sure why, but recv_stderr(600) might be the reason. So capture the error stream the same way you capture standard output, something like:

contents_err = StringIO.StringIO()

data_err = chan.recv_stderr(1024)
while data_err:
    contents_err.write(data_err)
    data_err = chan.recv_stderr(1024)

You may even first try changing recv_stderr(600) to recv_stderr(1024) or higher.

Answer by Spencer Rathbun

It's easier if you use the high-level representation of an open ssh session. Since you already use ssh-client to open your channel, you can just run your command from there and avoid the extra work.

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

stdin, stdout, stderr = ssh.exec_command(cmd)
for line in stdout.readlines():
    print line
for line in stderr.readlines():
    print line

You will need to come back and read from these file handles again if you receive additional data afterwards.

Answer by vipulb

I am posting the final code, which worked with input from Bruce Wayne ( :) )

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

chan = self.ssh.get_transport().open_session()
chan.settimeout(10800)

try:
    # Execute the given command
    chan.exec_command(cmd)

    # To capture Data. Need to read the entire buffer to capture output
    contents = StringIO.StringIO()
    error = StringIO.StringIO()

    while not chan.exit_status_ready():
        if chan.recv_ready():
            data = chan.recv(1024)
            #print "Inside stdout"
            while data:
                contents.write(data)
                data = chan.recv(1024)

        if chan.recv_stderr_ready():            
            error_buff = chan.recv_stderr(1024)
            while error_buff:
                error.write(error_buff)
                error_buff = chan.recv_stderr(1024)

    exit_status = chan.recv_exit_status()

except socket.timeout:
    raise socket.timeout

output = contents.getvalue()
error_value = error.getvalue()

return output, error_value, exit_status
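
The inner `while data:` loops above keep calling `recv` on one stream until it reaches EOF, so a command that floods the other stream in the meantime could still stall. A minimal sketch of a variant that reads at most one chunk per stream per pass is shown below. The function name `drain_channel` and its parameters are hypothetical; it assumes `chan` behaves like a `paramiko.Channel` on which `exec_command` has already been called, and that any output still pending when the exit status arrives is already buffered locally.

```python
import time


def drain_channel(chan, chunk_size=1024, poll_interval=0.05):
    """Collect stdout, stderr and the exit status from a Paramiko-style channel.

    `chan` is assumed to expose recv_ready(), recv(), recv_stderr_ready(),
    recv_stderr(), exit_status_ready() and recv_exit_status(), as
    paramiko.Channel does. The function itself is an illustrative helper.
    """
    out_chunks, err_chunks = [], []

    def read_available():
        # Read at most one chunk per stream, and only when data is waiting,
        # so neither recv call blocks while the other stream fills up.
        got_data = False
        if chan.recv_ready():
            out_chunks.append(chan.recv(chunk_size))
            got_data = True
        if chan.recv_stderr_ready():
            err_chunks.append(chan.recv_stderr(chunk_size))
            got_data = True
        return got_data

    while not chan.exit_status_ready():
        if not read_available():
            time.sleep(poll_interval)  # nothing pending; avoid a busy loop

    # The command has exited, but buffered data may still be waiting.
    while read_available():
        pass

    return b"".join(out_chunks), b"".join(err_chunks), chan.recv_exit_status()
```

With a real connection, `chan` would come from `ssh.get_transport().open_session()` followed by `chan.exec_command(cmd)`, as in the code above. The function returns bytes, so decode them if text is needed.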

Answer by fubupc

Actually, I think none of the above answers resolves the real problem:

If the remote program produces a large amount of stderr output first, then

stdout.readlines()
stderr.readlines()

would hang forever. Although

stderr.readlines()
stdout.readlines()

would resolve this case, it would fail if the remote program produces a large amount of stdout output first.

I don't have a solution yet...
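
One common way around this ordering problem is to read both streams at the same time, so that neither can fill up while the other is being drained. The sketch below uses one thread per stream. The helper name `read_streams_concurrently` is hypothetical, and it only assumes each argument has a blocking `read()` method, as the `stdout` and `stderr` objects returned by `ssh.exec_command(cmd)` do. Whether concurrent reads on one channel are safe in your Paramiko version is an assumption worth verifying.

```python
import threading


def read_streams_concurrently(stdout, stderr):
    """Read two file-like objects to EOF at the same time.

    Returns (stdout_data, stderr_data). Illustrative helper, not part of
    Paramiko; each argument only needs a read() method.
    """
    results = {}

    def reader(name, stream):
        # Each thread blocks on its own stream until EOF.
        results[name] = stream.read()

    threads = [
        threading.Thread(target=reader, args=("stdout", stdout)),
        threading.Thread(target=reader, args=("stderr", stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results["stdout"], results["stderr"]
```

Usage would look like `out, err = read_streams_concurrently(stdout, stderr)` right after `stdin, stdout, stderr = ssh.exec_command(cmd)`. The order in which the remote program writes to the two streams then no longer matters.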

Answer by d0n

To have paramiko commands behave like subprocess.call, you may use this piece of code (tested with python-3.5 and paramiko-2.1.1):

#!/usr/bin/env python3

import os                                                                  
import sys                                                                                                                    
from paramiko import SSHClient, AutoAddPolicy               
from socket import getfqdn                                       

class SecureSHell(object):                                                 
    reuser = os.environ['USER']                                            
    remote = ''                                                            
    def __init__(self, *args, **kwargs):                                   
        for arg in args:                                                   
            if hasattr(self, arg):                                         
                setattr(self, arg, True)                                   
        for (key, val) in kwargs.items():                                  
            if hasattr(self, key):                                         
                setattr(self, key, val)

    @staticmethod                                                          
    def _ssh_(remote, reuser, port=22):                                    
        if '@' in remote:                                                  
            _reuser, remote = remote.split('@')                            
        _fqdn = getfqdn(remote)                                            
        remote = _fqdn if _fqdn else remote                                
        ssh = SSHClient()                                                  
        ssh.set_missing_host_key_policy(AutoAddPolicy()) 
        ssh.connect(remote, int(port), username=reuser)                                                                     
        return ssh                                                         

    def call(self, cmd, remote=None, reuser=None):                         
        remote = remote if remote else self.remote                         
        reuser = reuser if reuser else self.reuser              
        ssh = self._ssh_(remote, reuser)                                   
        chn = ssh.get_transport().open_session()                           
        chn.settimeout(10800)                                              
        chn.exec_command(cmd)                                              
        while not chn.exit_status_ready():                                 
            if chn.recv_ready():                                           
                och = chn.recv(1024)                                       
                while och:                                                 
                    sys.stdout.write(och.decode())                         
                    och = chn.recv(1024)                                   
            if chn.recv_stderr_ready():                                    
                ech = chn.recv_stderr(1024)                                
                while ech:                                                 
                    sys.stderr.write(ech.decode())
                    ech = chn.recv_stderr(1024)                            
        return int(chn.recv_exit_status())                                 

ssh = SecureSHell(remote='example.com', reuser='d0n')  # the attribute is named reuser; an unknown keyword is silently ignored
ssh.call('find')                                                           
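
One caveat with calling `.decode()` on each 1024-byte chunk, as the loop above does: a multi-byte UTF-8 character can be split across two chunks, which raises `UnicodeDecodeError`. A minimal sketch of a workaround using the standard library's incremental decoder follows. The generator name `decode_chunks` is hypothetical, and UTF-8 output from the remote command is an assumption.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield text for each bytes chunk without breaking multi-byte characters.

    Illustrative helper: `chunks` is any iterable of bytes objects, such as
    successive results of chan.recv(1024).
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        # Incomplete trailing bytes are held back until the next chunk arrives.
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # flush anything left at EOF
    if tail:
        yield tail
```

In the loop above, this would mean creating one decoder per stream and passing each received chunk through it before writing to `sys.stdout` or `sys.stderr`.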

Answer by jeremysprofile

TL;DR: Call stdout.readlines() before stderr.readlines() if using ssh.exec_command()

If you use @Spencer Rathbun's answer:

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(IPAddress, username=user['username'], password=user['password'])

stdin, stdout, stderr = ssh.exec_command(cmd)

You might want to be aware of the limitations that can arise from having large outputs.

Experimentally, stdin, stdout, stderr = ssh.exec_command(cmd) will not be able to write the full output immediately to stdout and stderr. More specifically, a buffer appears to hold 2^21 (2,097,152) characters before filling up. If any buffer is full, the command started by exec_command will block on writing to that buffer, and will stay blocked until that buffer is emptied enough to continue. This means that if your stdout is too large, you'll hang on reading stderr, as you won't receive EOF in either buffer until the command can write its full output.

The easy way around this is the one Spencer uses: get all the normal output via stdout.readlines() before trying to read stderr. This will only fail if you have more than 2^21 characters in stderr, which is an acceptable limitation in my use case.

I'm mainly posting this because I'm dumb and spent far, far too long trying to figure out how I broke my code, when the answer was that I was reading from stderr before stdout and my stdout was too big to fit in the buffer.
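
If you do not need stdout and stderr kept apart, another option is to merge them so there is only one buffer to drain. The sketch below relies on `Channel.set_combine_stderr`, which Paramiko provides for this purpose. The helper name `run_combined` is hypothetical, and it assumes `chan` is a fresh session channel from `ssh.get_transport().open_session()`.

```python
def run_combined(chan, cmd, chunk_size=1024):
    """Run cmd on an open session channel with stderr merged into stdout.

    Illustrative helper: only set_combine_stderr(), exec_command(), recv()
    and recv_exit_status() are called on `chan`.
    """
    chan.set_combine_stderr(True)  # set before the command starts producing output
    chan.exec_command(cmd)

    chunks = []
    while True:
        data = chan.recv(chunk_size)
        if not data:  # empty bytes means the stream is closed
            break
        chunks.append(data)
    return b"".join(chunks), chan.recv_exit_status()
```

The trade-off is that error messages and normal output arrive interleaved in one byte string, so this only fits cases where the two do not need to be distinguished.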