Shell/Bash parsing text file
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/22546395/
Asked by FanaticD
I have a text file that looks like this:
Item:
SubItem01
SubItem02
SubItem03
Item2:
SubItem0201
SubItem0202
Item3:
SubItem0301
...etc...
What I need is to get it to look like this:
Item=>SubItem01
Item=>SubItem02
Item=>SubItem03
Item2=>SubItem0201
Item2=>SubItem0202
Item3=>SubItem0301
I believe I need two for loops to get this. I did some tests, but... well, it didn't go well.
for(( c=1; c<=lineCount; c++ ))
do
var=`sed -n "${c}p" TMPFILE`
echo "$var"
if [[ "$var" == *:* ]];
then
printf "%s->" $var
else
printf "%s\n"
fi
done
Could anyone please put me back on the right track? I tried a bunch of different approaches, but I am not getting anywhere. Thanks.
Answer by Digital Trauma
Answer by jaypal singh
Text parsing is best done with awk:
$ awk '/:$/{sub(/:$/,"");h=$0;next}{print h"=>"$0}' file
Item=>SubItem01
Item=>SubItem02
Item=>SubItem03
Item2=>SubItem0201
Item2=>SubItem0202
Item3=>SubItem0301
Answer by BMW
Using awk:
awk '/:/{s=$1;next}{print s OFS $0}' FS=: OFS="=>" file
Answer by Cole Tierney
Here's another awk alternative:
awk -F: '/^Item/{ITM=$1} !/^Item/{print ITM"=>"$0}'
If a line begins with 'Item', save the item name in ITM. If the line does not begin with 'Item', print the previously saved item name (ITM), '=>', and the sub item. Splitting on : makes it easier to get the item name.
The assumption is that groups of subitems will always be preceded by an Item: entry, so the variable ITM should always have the name of the current group.
A TXR solution, saved as regroup.txr:
@(collect)
@left:
@ (collect)
@right
@ (until)
@(skip):
@ (end)
@(end)
@(output)
@ (repeat)
@ (repeat)
@left=>@right
@ (end)
@ (end)
@(end)
$ txr regroup.txr data.txt
Item=>SubItem01
Item=>SubItem02
Item=>SubItem03
Item2=>SubItem0201
Item2=>SubItem0202
Item3=>SubItem0301
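The idea shared by the awk answers (remember the most recent header line, then prefix every following line with it) needs only one loop, and it can also be sketched in plain shell. This is a minimal sketch rather than any answerer's code: the function name regroup is hypothetical, and it assumes a header is any line ending in a colon, that blank lines can be dropped, and that sub items appearing before the first header should be skipped. It sticks to POSIX sh constructs, so it should run under bash as well.

```shell
#!/bin/sh
# regroup: hypothetical helper name, not part of the original thread.
# Reads header/sub-item lines on stdin and prints "Header=>SubItem" lines.
regroup() {
    header=""
    # IFS= and -r keep each line intact; the || test also picks up a
    # final line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *:)
                # Header line: remember it without the trailing colon.
                header=${line%:}
                ;;
            "")
                # Blank line: nothing to print (assumption).
                ;;
            *)
                # Sub item: print it only once a header has been seen.
                if [ -n "$header" ]; then
                    printf '%s=>%s\n' "$header" "$line"
                fi
                ;;
        esac
    done
}
```

It would be called as regroup < TMPFILE. Reading the file once with read avoids running sed once per line, which is what makes the loop in the question slow on larger inputs.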