git: How to measure IOPS for a command in Linux?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/24442386/

Date: 2020-09-19 10:09:22  Source: igfitidea

How to measure IOPS for a command in linux?

Tags: linux, git, command-line, benchmarking, provisioned-iops

Asked by dukeofgaming

I'm working on a simulation model, where I want to determine when the storage IOPS capacity becomes a bottleneck (e.g. an HDD has ~150 IOPS, while an SSD can have 150,000). So I'm trying to come up with a way to benchmark the IOPS of a command (git) for some of its different operations (push, pull, merge, clone).

So far I have found tools like iostat; however, I am not sure how to limit the report to what a single command does.

The best idea I can come up with is to determine my HDD IOPS capacity, use time on the actual command, see how long it lasts, multiply that by IOPS and those are my IOPS:

HDD ->150 IOPS
time df -h

real    0m0.032s

150 * .032 = 4.8 IOPS

But, this is of course very stupid, because the duration of the execution may have been related to CPU usage rather than HDD usage, so unless usage of HDD was 100% for that time, it makes no sense to measure things like that.

So, how can I measure the IOPS for a command?

Accepted answer by pndc

There are multiple time(1) commands on a typical Linux system; the default is a bash(1) builtin which is somewhat basic. There is also /usr/bin/time, which you can run by either calling it exactly like that, or telling bash(1) to not use aliases and builtins by prefixing it with a backslash thus: \time. Debian has it in the "time" package which is installed by default, Ubuntu is likely identical, and other distributions will be quite similar.

Invoking it in a similar fashion to the shell builtin is already more verbose and informative, albeit perhaps more opaque unless you're already familiar with what the numbers really mean:

$ \time df
[output elided]
0.00user 0.00system 0:00.01elapsed 66%CPU (0avgtext+0avgdata 864maxresident)k
0inputs+0outputs (0major+261minor)pagefaults 0swaps

However, I'd like to draw your attention to the man page, which lists the -f option to customise the output format, and in particular the %w format, which counts the number of times the process gave up its CPU timeslice for I/O:

$ \time -f 'ios=%w' du Maildir >/dev/null
ios=184
$ \time -f 'ios=%w' du Maildir >/dev/null
ios=1

Note that the first run stopped for I/O 184 times, but the second run stopped just once. The first figure is credible, as there are 124 directories in my ~/Maildir: the reading of the directory and the inode gives roughly two IOPS per directory, less a bit because some inodes were likely next to each other and read in one operation, plus some extra again for mapping in the du(1) binary, shared libraries, and so on.
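
To turn the %w count into a rate, one option is to capture the elapsed time (%e) as well and divide. The following is a minimal sketch, assuming GNU time is installed at /usr/bin/time; iops_from_waits and measure_waits are hypothetical helper names, not part of any tool. Keep in mind that %w counts voluntary context switches, so it is only a proxy for I/O operations.

```shell
# iops_from_waits WAITS ELAPSED_SECONDS
# Prints waits per second, or "n/a" when the elapsed time rounds to zero.
iops_from_waits() {
    awk -v w="$1" -v t="$2" 'BEGIN {
        if (t <= 0) { print "n/a"; exit }
        printf "%.1f\n", w / t
    }'
}

# measure_waits COMMAND [ARGS...]
# Runs the command under GNU time and reports waits, elapsed time and rate.
measure_waits() {
    out=$(mktemp)
    # -o keeps the statistics out of the command's own stderr.
    /usr/bin/time -f '%w %e' -o "$out" "$@" > /dev/null
    # Take the last line: GNU time may prepend a notice when the
    # command exits with a non-zero status.
    tail -n 1 "$out" | {
        read -r waits elapsed
        echo "waits=$waits elapsed=${elapsed}s rate=$(iops_from_waits "$waits" "$elapsed")/s"
    }
    rm -f "$out"
}
```

For example, measure_waits du Maildir would print one summary line for the du run shown above. Very short commands can report an elapsed time of 0.00, in which case no rate is printed.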

The second figure is of course lower due to Linux's disk cache. So the final piece is to flush the cache. sync(1) is a familiar command which flushes dirty writes to disk, but doesn't flush the read cache. You can flush that one by writing 3 to /proc/sys/vm/drop_caches. (Other values are also occasionally useful, but you want 3 here.) As a non-root user, the simplest way to do this is:

echo 3 | sudo tee /proc/sys/vm/drop_caches

Combining that with /usr/bin/time should allow you to build the scripts you need to benchmark the commands you're interested in.
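
Such a script might look like the sketch below. It assumes GNU time at /usr/bin/time and a user who may run sudo; drop_caches, cold_run and the DROP_CACHES_FILE variable are hypothetical names introduced here for illustration. The %I and %O formats (file system inputs and outputs) are reported alongside %w.

```shell
# Path written to when flushing the cache; overridable for dry runs.
DROP_CACHES_FILE="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"

# Flush dirty pages to disk, then drop the page cache, dentries and inodes.
drop_caches() {
    sync
    if [ -w "$DROP_CACHES_FILE" ]; then
        echo 3 > "$DROP_CACHES_FILE"
    else
        # Not writable as this user: let tee do the write as root.
        echo 3 | sudo tee "$DROP_CACHES_FILE" > /dev/null
    fi
}

# cold_run COMMAND [ARGS...]
# Runs the command against a cold cache and prints its I/O statistics.
cold_run() {
    drop_caches &&
        /usr/bin/time -f 'waits=%w fs_inputs=%I fs_outputs=%O elapsed=%es' "$@"
}
```

A call such as cold_run git status inside a repository would then report figures for a cold-cache run. Dropping caches affects the whole machine, so this is best done on a test system.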

As a minor aside, tee(1) is used because this won't work:

sudo echo 3 >/proc/sys/vm/drop_caches

The reason? Although the echo(1) runs as root, the redirection is as your normal user account, which doesn't have write permissions to drop_caches. tee(1) effectively does the redirection as root.
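
The ordering can be seen without root. In the sketch below (redirect_demo is a hypothetical name, and /nonexistent_dir is assumed not to exist), the shell fails to open the redirect target and therefore never starts the command at all:

```shell
redirect_demo() {
    marker=$(mktemp -u)   # a file name that does not exist yet
    # The shell tries to open the redirect target before running touch.
    # The open fails, so touch is never executed.
    ( touch "$marker" > /nonexistent_dir/out ) 2>/dev/null || true
    if [ -e "$marker" ]; then
        echo "command ran"
    else
        echo "command did not run"
    fi
    rm -f "$marker"
}

# An alternative to tee is to have a root shell perform the redirection:
#   sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
```

Here the quoted redirection is part of the string passed to sh, so it is the root shell that opens drop_caches.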

Answer by Christophe Vu-Brugier

The iotop command collects I/O usage information about processes on Linux. By default, it is an interactive command but you can run it in batch mode with -b / --batch. Also, you can give a list of processes with -p / --pid. Thus, you can monitor the activity of a git command with:

$ sudo iotop -p $(pidof git) -b

You can change the delay with -d / --delay.
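
If iotop is not available, the kernel's per-process counters in /proc/PID/io can be sampled directly. This is a rough sketch under stated assumptions, not what iotop itself does: io_field and sample_io_rate are hypothetical helper names, the interval must be a whole number of seconds, and syscr counts read system calls (including ones served from the cache) rather than physical disk operations.

```shell
# io_field NAME
# Reads /proc/PID/io text on stdin and prints the value of one counter
# (rchar, wchar, syscr, syscw, read_bytes, write_bytes, ...).
io_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# sample_io_rate PID [INTERVAL_SECONDS]
# Prints read system calls per second until the process exits.
sample_io_rate() {
    pid=$1
    interval=${2:-1}
    prev=$(io_field syscr < "/proc/$pid/io") || return
    while sleep "$interval" && [ -r "/proc/$pid/io" ]; do
        cur=$(io_field syscr < "/proc/$pid/io") || break
        echo "read syscalls/s: $(( (cur - prev) / interval ))"
        prev=$cur
    done
}
```

For instance, after starting a long git clone in the background, sample_io_rate $! 2 would print a figure every two seconds. Reading another user's /proc/PID/io generally requires root.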

Answer by totti

You can use pidstat:
pidstat -d 2
More specifically, pidstat -d 2 | grep COMMAND or pidstat -C COMMANDNAME -d 2

The pidstat command is used for monitoring individual tasks currently being managed by the Linux kernel. It writes to standard output activities for every task selected with option -p or for every task managed by the Linux kernel if option -p ALL has been used. Not selecting any tasks is equivalent to specifying -p ALL but only active tasks (tasks with non-zero statistics values) will appear in the report. The pidstat command can also be used for monitoring the child processes of selected tasks.

-C comm: Display only tasks whose command name includes the string comm. This string can be a regular expression.
