Source: http://stackoverflow.com/questions/13088807/
This page reproduces a popular Stack Overflow question. The content is provided under the CC BY-SA 4.0 license: you are free to use and share it, but you must attribute it to the original authors on Stack Overflow and distribute it under the same license.
Shell script vs C performance
Asked by Kohakukun
I was wondering how badly performance would suffer if a program were migrated from C to a shell script.
I have intensive I/O operations.
For example, in C, I have a loop that reads from a file and writes into another one. I take parts of each line that have no consistent relation to one another. I do this using pointers. It is a really simple program.
In the shell script, to move through a line, I'm using ${var:(char):(num_bytes)}. After I finish processing each line, I just append it to another file.
"$out" >> "$filename"
The program does something like:
while read line; do
    out="$out${line:10:16}.${line:45:2}"
    out="$out${line:106:61}"
    out="$out${line:189:3}"
    out="$out${line:215:15}"
    ...
    echo "$out" >> "outFileName"
done < "$fileName"
The problem is that C takes about half a minute to process a 400 MB file, while the shell script takes 15 minutes.
I don't know if I'm doing something wrong or not using the right operator in the shell script.
Edit: I cannot use awk, since there is no pattern for processing the line.
I tried commenting out the echo "$out" >> "$outFileName" line, but it doesn't get much better. I think the problem is the ${line:106:61} operation. Any suggestions?
Thanks for your help.
Accepted answer by Kohakukun
As donitor and Dietrich suggested, I did a little research on the AWK language and, just as they said, it was a total success. Here is a little example of the AWK program:
#!/bin/awk -f
{
    # $0 is the whole input line; substr() positions start at 1.
    option=substr($0, 5, 9);
    if (option=="SOMETHING"){
        type=substr($0, 80, 1)
        if (type=="A"){
            type="01";
        }else if (type=="B"){
            type="02";
        }else if (type=="C"){
            type="03";
        }
        print substr($0, 7, 3) substr($0, 49, 8) substr($0, 86, 8) type\
        substr($0, 568, 30) >> ARGV[2]
    }
}
And it works like a charm. It takes barely 1 minute to process a 500 MB file.
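For comparison, the same kind of slicing can be driven from the shell, with the output handled by ordinary redirection rather than ARGV[2]. This is only a minimal sketch, not the asker's program: the function name slice_with_awk, the column positions, and the A/B mapping are all invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: fixed-width slicing with awk, reading stdin and writing stdout.
# The layout (type code in column 1, payload in columns 3-5) is hypothetical.

slice_with_awk() {
    awk '{
        type = substr($0, 1, 1)              # substr() positions start at 1
        if (type == "A")      type = "01"
        else if (type == "B") type = "02"
        # Any other type code is passed through unchanged.
        print substr($0, 3, 3) type
    }'
}

# Usage (file names are placeholders):
#   slice_with_awk < "$fileName" > "$outFileName"
```

Because awk reads the whole file in a single process, the shell only pays for one fork and one open of each file, no matter how many lines there are.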
Answer by Brian Agnew
Answer by Jens
What's wrong with the C program? Is it broken? Too hard to maintain? Too inflexible? Are you more of a shell expert than a C expert?
If it ain't broke, don't fix it.
A look at Perl might be an option, too. It is easier to modify than C and still has speedy I/O, and it's much harder to create useless forks in Perl than in the shell.
If you told us exactly what the C program does, maybe there's a simple and faster-than-light solution with sed, grep, awk or other gizmos in the Unix toolbox. In other words, tell us what you actually want to achieve; don't ask us to solve some random problem you ran into while pursuing what you think is a step towards your actual goal.
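Alright, one problem with your shell script is the repeated open in echo "$out" >> "outFileName": the output file is reopened for every line. Use this instead, which opens it once for the whole loop:
while read line; do
    echo "${line:10:16}.${line:45:2}${line:106:61}${line:189:3}${line:215:15}..."
done < "$fileName" > "$outFileName"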
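As an alternative, simply use the cut utility (but note that it doesn't insert the dot after the first part, and that cut counts characters from 1 while the ${line:offset:length} offsets start at 0):
cut -c 11-26,46-47,107-167 "$fileName" > "$outFileName"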
You get the idea?
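The two suggestions in this answer can be sketched side by side on a toy layout. This is an illustration under assumptions, not the asker's real record format: the function names extract_fields and extract_with_cut and the field positions are made up, and the sketch assumes bash (for substring expansion) and single-byte characters.

```shell
#!/usr/bin/env bash
# Sketch: pull fixed-width fields out of each line, opening the output once.
# Field positions are hypothetical and chosen only to keep the example short.

extract_fields() {
    local line
    # IFS= and -r keep leading whitespace and backslashes intact.
    while IFS= read -r line; do
        # Offsets are zero-based: ${line:2:3} is characters 3-5 of the line.
        printf '%s.%s\n' "${line:2:3}" "${line:7:2}"
    done
}

extract_with_cut() {
    # cut counts from 1, so the same fields are 3-5 and 8-9.
    # Unlike the loop above, cut cannot insert the dot between them.
    cut -c 3-5,8-9
}

# Usage (file names are placeholders); the redirection sits on the whole
# command, so the output file is opened a single time:
#   extract_fields < "$fileName" > "$outFileName"
```

The cut variant runs as one external process over the entire file, which is usually the cheaper option when no extra separators or per-line logic are needed.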

