In Python, how to write a string to a file on a remote machine?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same CC BY-SA terms, link to the original, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/19202314/

Date: 2020-08-19 13:08:16  Source: igfitidea

In Python, how to write a string to a file on a remote machine?

Tags: python, files, ssh, network-programming, rsync

Asked by Iron Pillow

On Machine1, I have a Python2.7 script that computes a big (up to 10MB) binary string in RAM that I'd like to write to a disk file on Machine2, which is a remote machine. What is the best way to do this?


Constraints:


  • Both machines are Ubuntu 13.04. The connection between them is fast -- they are on the same network.

  • The destination directory might not yet exist on Machine2, so it might need to be created.

  • If it's easy, I would like to avoid writing the string from RAM to a temporary disk file on Machine1. Does that eliminate solutions that might use a system call to rsync?

  • Because the string is binary, it might contain bytes that could be interpreted as a newline. This would seem to rule out solutions that might use a system call to the echo command on Machine2.

  • I would like this to be as lightweight on Machine2 as possible. Thus, I would like to avoid running services like ftp on Machine2 or engage in other configuration activities there. Plus, I don't understand security that well, and so would like to avoid opening additional ports unless truly necessary.

  • I have ssh keys set up on Machine1 and Machine2, and would like to use them for authentication.

  • EDIT: Machine1 is running multiple threads, and so it is possible that more than one thread could attempt to write to the same file on Machine2 at overlapping times. I do not mind the inefficiency caused by having the file written twice (or more) in this case, but the resulting datafile on Machine2 should not be corrupted by simultaneous writes. Maybe an OS lock on Machine2 is needed?

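Regarding the last constraint (concurrent writers), one common trick worth noting (a sketch of my own, not from the question) is to have each writer write to a unique temporary file and then rename it into place. os.rename() is atomic on POSIX when source and destination are on the same filesystem, so no reader or competing writer ever sees a half-written file; the last rename simply wins. Demonstrated locally here, the same final mv can be run on Machine2 through ssh:

```python
# Sketch: corruption-free concurrent writes via write-to-temp-then-rename.
# Paths and names are illustrative.
import os
import tempfile

def atomic_write(path, data):
    """Write data to path so readers never see a partial file."""
    directory = os.path.dirname(path) or '.'
    # mkstemp gives each writer its own unique temporary file in the
    # destination directory (same filesystem, so rename stays atomic).
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    os.rename(tmp_path, path)  # atomic replace of the destination
```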

I'm rooting for an rsync solution, since it is a self-contained entity that I understand reasonably well, and requires no configuration on Machine2.


Accepted answer by Erik Kaplun

You open a new SSH process to Machine2 using subprocess.Popen and then write your data to its STDIN.

import subprocess

cmd = ['ssh', 'user@machine2',
       'mkdir -p output/dir; cat - > output/dir/file.dat']

p = subprocess.Popen(cmd, stdin=subprocess.PIPE)

your_inmem_data = 'foobarbaz\n' * 1024 * 1024
for chunk_ix in range(0, len(your_inmem_data), 1024):
    chunk = your_inmem_data[chunk_ix:chunk_ix + 1024]
    p.stdin.write(chunk)
p.stdin.close()  # send EOF so cat finishes writing the file
p.wait()

I've just verified that it works as advertised and copies all of the 10485760 dummy bytes.

P.S. A potentially cleaner/more elegant solution would be to have the Python program write its output to sys.stdout instead and do the piping to ssh externally:

$ python process.py | ssh <the same ssh command>
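The sending side of that piped variant can be sketched as follows (process.py and the chunk size are my illustrations, not from the answer; all it must do is emit the raw bytes on stdout so ssh receives them unmangled, embedded newlines and all):

```python
# Sketch of the sending side for "python process.py | ssh ...":
# stream raw bytes to a binary file object in bounded chunks.
import sys

CHUNK = 64 * 1024

def stream(data, out):
    """Write data to a binary file object CHUNK bytes at a time."""
    for i in range(0, len(data), CHUNK):
        out.write(data[i:i + CHUNK])

# In process.py you would call:
#     stream(your_inmem_data, getattr(sys.stdout, 'buffer', sys.stdout))
# (sys.stdout.buffer on Python 3; plain sys.stdout on Python 2.7)
```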

Answered by Rob

Paramiko supports opening files on remote machines:

import paramiko

def put_file(machinename, username, dirname, filename, data):
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(machinename, username=username)
    sftp = ssh.open_sftp()
    try:
        sftp.mkdir(dirname)
    except IOError:
        pass  # directory already exists
    f = sftp.open(dirname + '/' + filename, 'w')
    f.write(data)
    f.close()
    ssh.close()

data = 'This is arbitrary data\n'.encode('ascii')
put_file('v13', 'rob', '/tmp/dir', 'file.bin', data)

Answered by LeuX

If just calling a subprocess is all you want, maybe sh.py could be the right thing:

from sh import ssh
remote_host = ssh.bake(<remote host>)
remote_host.dd(_in = <your binary string>, of=<output filename on remote host>)
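One caveat with the Paramiko approach above: SFTP's mkdir creates a single directory level only, so a nested destination such as /tmp/a/b fails when /tmp/a is missing. A small helper can enumerate the levels to create; the path-splitting part below is runnable, while the SFTP calls are sketched in comments (the helper name parent_chain is mine, not from the answer; absolute POSIX remote paths are assumed):

```python
def parent_chain(path):
    """Return every directory level of an absolute POSIX path,
    shallowest first, e.g. '/tmp/a/b' -> ['/tmp', '/tmp/a', '/tmp/a/b']."""
    chain = []
    current = ''
    for part in path.strip('/').split('/'):
        if part:
            current += '/' + part
            chain.append(current)
    return chain

# With an open SFTP session, create each level, ignoring the IOError
# that mkdir raises for directories that already exist (mirroring the
# try/except in the answer):
#
#     for d in parent_chain('/tmp/dir/sub'):
#         try:
#             sftp.mkdir(d)
#         except IOError:
#             pass
```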

Answered by brm

A solution in which you don't explicitly send your data over a connection yourself would be to use sshfs. You can use it to mount a directory from Machine2 somewhere on Machine1; writing to a file in that directory then automatically results in the data being written to Machine2.
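A minimal sshfs session for this approach might look like the following sketch (the mount point and remote path are illustrative; sshfs reuses the existing ssh keys, so nothing beyond sshd needs to run on Machine2):

```shell
# Mount a directory from Machine2 onto Machine1 over SSH.
mkdir -p /mnt/machine2
sshfs user@machine2:/home/user/output /mnt/machine2

# Ordinary local file I/O now lands on Machine2, e.g. from Python:
#   open('/mnt/machine2/dir/file.dat', 'wb').write(data)

# Unmount when finished.
fusermount -u /mnt/machine2
```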