Linux: How to use grep efficiently?

Question

Asked by Legend

I have a large number of small files to search. I have been looking for a good de facto multi-threaded version of grep but could not find one. How can I improve my usage of grep? At the moment I am doing this:

grep -R "string" >> Strings

Answer 1

Accepted answer by Legend

If xargs is available and you are on a multi-core machine, the following approach can help. I am posting my results in case anyone is interested.

Environment:

Processor: Dual Quad-core 2.4GHz
Memory: 32 GB
Number of files: 584450
Total Size: ~ 35 GB

Tests:

1. Find the necessary files, pipe them to xargs and tell it to execute 8 instances.

time find ./ -name "*.ext" -print0 | xargs -0 -n1 -P8 grep -H "string" >> Strings_find8

real    3m24.358s
user    1m27.654s
sys     9m40.316s

2. Find the necessary files, pipe them to xargs and tell it to execute 4 instances.

time find ./ -name "*.ext" -print0 | xargs -0 -n1 -P4 grep -H "string" >> Strings

real    16m3.051s
user    0m56.012s
sys     8m42.540s

3. Suggested by @Stephen: Find the necessary files and use find's -exec ... + form instead of xargs.

time find ./ -name "*.ext" -exec grep -H "string" {} \+ >> Strings

real    53m45.438s
user    0m5.829s
sys     0m40.778s

4. Regular recursive grep.

time grep -R "string" >> Strings

real    235m12.823s
user    38m57.763s
sys     38m8.301s

For my purposes, the first command worked just fine.
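
To see what the + in test 3 actually changes, here is a minimal sketch that counts how many commands find starts with each -exec form. The scratch directory, the file names, and the `sh -c 'echo run'` stand-in for grep are made up for illustration and are not part of my test setup.

```shell
#!/bin/sh
# Hypothetical scratch directory with 50 small files (not the original data set).
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    echo "line $i" > "$demo_dir/file$i.ext"
    i=$((i + 1))
done

# "{} \;" starts one command per file. Each start prints "run" once.
per_file=$(find "$demo_dir" -name "*.ext" -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# "{} +" packs as many file names as fit into each command line.
batched=$(find "$demo_dir" -name "*.ext" -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'commands started with \\; : %s\n' "$per_file"
printf 'commands started with +  : %s\n' "$batched"

rm -rf "$demo_dir"
```

With only 50 short file names everything fits into a single command in the + form, so there is far less process start-up cost, but also no parallelism, which is probably why test 3 had such low user and sys times yet a long wall-clock time.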

Answer 2

Answered by Karthik Gurusamy

I wonder why -n1 is used below. Wouldn't it be faster to use a higher value (say -n8), or to leave it out so that xargs does the right thing?

xargs -0 -n1 -P8 grep -H "string"

It seems more efficient to give each forked grep more than one file to process (I assume -n1 puts only one file name in argv for each grep). As I see it, we should be able to use the highest n the system allows (based on the argc/argv maximum length limit), so that the setup cost of starting a new grep process is incurred less often.
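
A rough sketch of what I mean, run against throwaway files. The scratch directory, the file contents, and the values -n50 and -P4 are assumptions chosen for the demo, not tuned recommendations.

```shell
#!/bin/sh
# Hypothetical scratch data: 200 small files, every tenth one contains "string".
work_dir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    if [ $((i % 10)) -eq 0 ]; then
        echo "a string here" > "$work_dir/f$i.ext"
    else
        echo "nothing" > "$work_dir/f$i.ext"
    fi
    i=$((i + 1))
done

# -n50 hands each grep up to 50 file names instead of one;
# -P4 keeps up to 4 grep processes running at the same time.
# The result file does not end in .ext, so find does not feed it back to grep.
find "$work_dir" -name "*.ext" -print0 |
    xargs -0 -n50 -P4 grep -H "string" > "$work_dir/Strings"

matches=$(wc -l < "$work_dir/Strings" | tr -d ' ')
printf 'matching lines: %s\n' "$matches"

rm -rf "$work_dir"
```

This starts 4 grep processes for 200 files, where -n1 would start 200. One caveat: with -P the parallel greps all write to the same output, so the order of the result lines is not stable, and with very large outputs lines from different processes could in principle get mixed.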
