Is echo atomic when writing single lines

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must likewise follow the CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/9926616/

Date: 2020-09-18 01:54:16  Source: igfitidea

Is echo atomic when writing single lines

bash, scripting, concurrency

Asked by LiKao

I am currently trying to get a script to correctly write the output of other started commands into a log file. The script will write its own messages to the log file using echo, and there is a function to which I can pipe the lines from the other program.

The main problem is that the program which produces the output is started in the background, so my function that does the read may write concurrently to the log file. Could this be a problem? Echo always writes only a single line, so it should not be too hard to ensure atomicity. However, I have searched Google and found no way to make sure it actually is atomic.

Here is the current script:

LOG_FILE=/path/to/logfile

write_log() {
  echo "$(date +%Y%m%d%H%M%S); $@" >> "${LOG_FILE}"
}

write_output() {
  while read data; do
    write_log "Message from SUB process: [ $data ]"
  done
}

write_log "Script started"
# do some stuff
call_complicated_program 2>&1 | write_output &
SUB_PID=$!
#do some more stuff
write_log "Script exiting"
wait $SUB_PID
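
The flow above can be exercised end to end with a runnable sketch. Here `call_complicated_program` is replaced by a hypothetical stand-in (`fake_program`), and `LOG_FILE` points at a temporary file; both are assumptions for the demo, not part of the original script.

```shell
# Runnable sketch of the logging pattern; fake_program stands in for the
# real background command (one line on stdout, one on stderr).
LOG_FILE=$(mktemp)

write_log() {
  echo "$(date +%Y%m%d%H%M%S); $@" >> "${LOG_FILE}"
}

write_output() {
  while read -r data; do
    write_log "Message from SUB process: [ $data ]"
  done
}

# Hypothetical stand-in for call_complicated_program.
fake_program() {
  echo "line one"
  echo "line two" >&2
}

write_log "Script started"
fake_program 2>&1 | write_output &
SUB_PID=$!
write_log "Script exiting"
wait "$SUB_PID"

# Two entries from the main script plus two from the sub process.
entries=$(grep -c ';' "${LOG_FILE}")
echo "$entries log entries"
rm -f "${LOG_FILE}"
```

After `wait` returns, all four timestamped entries are in the file, regardless of how the main script's writes interleaved with the background reader's.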

As you can see, the script might write both on its own as well as because of redirected output. Could this cause havoc in the file?

Answered by Brian Campbell

echo is just a simple wrapper around write (this is a simplification; see the edit below for the gory details), so to determine whether echo is atomic, it's useful to look up write. From the Single UNIX Specification:

Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically. This maximum is called {PIPE_BUF}. This volume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF} or fewer bytes shall be atomic.

You can check PIPE_BUF on your system with a simple C program. If you're just printing a single line of output that is not ridiculously long, it should be atomic.

Here is a simple program to check the value of PIPE_BUF:

#include <limits.h>
#include <stdio.h>

int main(void) {
  printf("%d\n", PIPE_BUF);

  return 0;
}

On Mac OS X, that gives me 512 (the minimum allowed value for PIPE_BUF). On Linux, I get 4096. So if your lines are fairly long, make sure you check it on the system in question.
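
As an alternative to the C program, the value can also be queried from the shell on POSIX systems with getconf. Since PIPE_BUF is a pathconf variable, it is asked for relative to a path (the value may in principle vary by filesystem); this is my own addition, not from the answer.

```shell
# Query PIPE_BUF for the root filesystem; POSIX guarantees at least 512.
pipe_buf=$(getconf PIPE_BUF /)
echo "PIPE_BUF: $pipe_buf"
```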

edit to add: I decided to check the implementation of echo in Bash, to confirm that it will print atomically. It turns out, echo uses putchar or printf depending on whether you use the -e option. These are buffered stdio operations, which means that they fill up a buffer, and actually write it out only when a newline is reached (in line-buffered mode), the buffer is filled (in block-buffered mode), or you explicitly flush the output with fflush. By default, a stream will be in line-buffered mode if it is an interactive terminal, and block-buffered mode if it is any other file. Bash never sets the buffering type, so for your log file, it should default to block-buffered mode. At the end of the echo builtin, Bash calls fflush to flush the output stream. Thus, the output will always be flushed at the end of echo, but may be flushed earlier if it doesn't fit into the buffer.

The size of the buffer used may be BUFSIZ, though it may be different; BUFSIZ is the default size if you set the buffer explicitly using setbuf, but there's no portable way to determine the actual size of your buffer. There are also no portable guidelines for what BUFSIZ is, but when I tested it on Mac OS X and Linux, it was twice the size of PIPE_BUF.

What does this all mean? Since the output of echo is all buffered, it won't actually call write until the buffer is filled or fflush is called. At that point, the output should be written, and the atomicity guarantee I mentioned above should apply. If the stdout buffer size is larger than PIPE_BUF, then PIPE_BUF will be the smallest atomic unit that can be written out. If PIPE_BUF is larger than the stdout buffer size, then the stream will write the buffer out when the buffer fills up.

So, echo is only guaranteed to atomically write sequences shorter than the smaller of PIPE_BUF and the size of the stdout buffer, which is most likely BUFSIZ. On most systems, BUFSIZ is larger than PIPE_BUF.

tl;dr: echo will atomically output lines, as long as those lines are short enough. On modern systems, you're probably safe up to 512 bytes, but it's not possible to determine the limit portably.

Answered by pizza

There is no involuntary file locking, but the >> operator is safe while the > operator is unsafe. So your practice is safe to do.
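
A quick experiment supporting this claim (my own sketch, not from the answer): several background writers each append short lines to one shared file with >>. Because >> opens the file with O_APPEND and each echo issues one short write, every line should come out intact.

```shell
# Five background writers, 200 short appends each, to one shared file.
log=$(mktemp)
for w in 1 2 3 4 5; do
  (
    i=1
    while [ "$i" -le 200 ]; do
      echo "writer $w line $i" >> "$log"
      i=$((i + 1))
    done
  ) &
done
wait

# Every one of the 1000 lines should match the expected shape exactly,
# i.e. no two writers' output got interleaved within a line.
total=$(wc -l < "$log")
intact=$(grep -c '^writer [1-5] line [0-9][0-9]*$' "$log")
echo "$total lines, $intact intact"
rm -f "$log"
```

With > instead of >> each writer would track its own file offset, and the writers would overwrite one another.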

Answered by Bruno Bronosky

I tried the approach from user:pizza and could not get it to work like the answer from user:Brian Campbell does. Let me know if I am doing something wrong and I'll update the answer. (And yes, this is an answer, because I'm actually giving a complete working demo.)

basic concurrency

This just illustrates the problem:

$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done | grep GET
> GET / HTTP/1.1
>>  GET / HTTP/1.1
GET / HTTP/1.1
>>>  GET / HTTP/1.1
>>GET / HTTP/1.1

using echo on each line of output

This solves the problem using Brian Campbell's method. (Note that the length of the line for which this works is limited.)

$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 | while read; do echo "${REPLY}"; done &) done | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1
> GET / HTTP/1.1

redirecting the for loop to append to stdout

Instinct should tell you that this is not going to work, because it redirects after all the output of the forked curls has been merged.

$ for n in {1..5}; do (curl -svo /dev/null example.com 2>&1 &) done >> /dev/stdout | grep GET
> GET / HTTP/1.1
> GET / HTTP/1.1
>> >GET / HTTP/1.1
 >  GET / HTTP/1.1
 GET / HTTP/1.1

redirecting each curl to append to stdout

I suspect this failure is due to the fact that the entire content of each curl is being redirected and the size is greater than what the kernel is willing to block for. I have not taken the time to confirm that, but what Brian Campbell shared seems to support it.

$ for n in {1..5}; do (curl -svo /dev/null example.com >>/dev/stdout 2>&1 &) done | grep GET
>>  GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1
GET / HTTP/1.1
> GET / HTTP/1.1