bash: Is it possible to use two different field separators in awk and store values from both in variables?
Original source: http://stackoverflow.com/questions/12047613/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Jayson
My general question is whether it is possible to give awk a field separator, store one of the tokens in a variable, then give awk another field separator, store one of the tokens in a second variable, and finally print both variables' values. It seems like the variables store a reference to the $nth token, not the value itself.
The specific example I had in mind more or less follows this form: {Animal}, {species} class
Cat, Felis catus MAMMAL
Dog, Canis lupus familiaris MAMMAL
Peregrine Falcon, Falco peregrinus AVIAN
...
and you want it to output something like:
Cat MAMMAL
Dog MAMMAL
Peregrine Falcon AVIAN
...
What you want is output of the form: {Animal} class
where anything enclosed in {} may contain any number of spaces.
My original idea was something like this:
cat test.txt | awk '{FS=","}; {animal=$1}; {FS=" "}; {class=$NF}; {print animal, class}' > animals.txt
I expected the variable "animal" to store what's to the left of the comma, and "class" to hold the class of that animal (MAMMAL, etc.). But what ends up happening is that only the last field separator used is applied, so this breaks for names that contain spaces, like Peregrine Falcon.
So the output looks something like:
Cat, MAMMAL
Dog, MAMMAL
Peregrine AVIAN
Answered by Steve
One way using awk:
awk -F, '{ n = split($2,array," "); printf "%s, %s\n", $1, array[n] }' file.txt
Results:
Cat, MAMMAL
Dog, MAMMAL
Peregrine Falcon, AVIAN
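The question asked for the two values to be stored in variables and printed without the comma. A minimal sketch of that variant follows, assuming the sample lines from the question; the function name animal_class, the array name words, and the temp file are illustrative and not part of the answer.

```shell
# Sample input in the question's format, written to a temp file for the demo.
input=$(mktemp)
printf '%s\n' \
  'Cat, Felis catus MAMMAL' \
  'Dog, Canis lupus familiaris MAMMAL' \
  'Peregrine Falcon, Falco peregrinus AVIAN' > "$input"

# Hypothetical helper: split on the comma first, then split the rest on whitespace.
animal_class() {
  awk -F, '{
    animal = $1                  # everything to the left of the comma
    n = split($2, words, " ")    # split() returns the number of pieces
    class = words[n]             # the last piece is the class
    print animal, class
  }' "$1"
}

animal_class "$input"
```

Because the values are copied into animal and class, they stay available for the rest of the action regardless of how the record is split afterwards.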
Answered by Thor
The field separator for awk can be any regular expression, but in this case it might be easier to use the record separator: setting it to [,\n] alternates between the fields you want:
awk -v RS='[,\n]' 'NR % 2 { printf("%s, ", $0) } NR % 2 == 0 { print $NF }'
So odd-numbered records are output in their entirety, and even-numbered records output only their last field.
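The remark that the field separator can be any regular expression can also be used directly. The sketch below is one possible illustration, not the answerer's method: it sets FS to the regex ",.* " (a comma, then the longest run of characters ending in a space), so everything between the animal name and the class is treated as a single separator. The function name first_and_last and the temp file are assumptions, and the approach presumes lines have no trailing spaces.

```shell
# Sample input in the question's format, written to a temp file for the demo.
input=$(mktemp)
printf '%s\n' \
  'Cat, Felis catus MAMMAL' \
  'Dog, Canis lupus familiaris MAMMAL' \
  'Peregrine Falcon, Falco peregrinus AVIAN' > "$input"

# Hypothetical helper: the separator regex matches from the first comma
# through the last space, leaving $1 = animal name and $2 = class.
first_and_last() {
  awk -F',.* ' '{ print $1, $2 }' "$1"
}

first_and_last "$input"
```

This relies on the separator match being leftmost and longest, so it starts at the first comma and extends to the final space on the line.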
Answered by ghoti
You can always split() inside your awk script. You can also manipulate fields, which causes the entire line to be re-parsed. For example, this gets the results in your question:
awk '{cl=$NF; split($0,a,", "); printf("%s, %s\n", a[1], cl)}' test.txt
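The re-parsing idea can be sketched in a form close to the original attempt in the question. Assigning $0 to itself makes awk split the current record again with whatever FS is set at that moment, so each variable can be filled under a different separator. This is a sketch under the assumption of the sample input from the question; the function name two_separators and the temp file are illustrative.

```shell
# Sample input in the question's format, written to a temp file for the demo.
input=$(mktemp)
printf '%s\n' \
  'Cat, Felis catus MAMMAL' \
  'Dog, Canis lupus familiaris MAMMAL' \
  'Peregrine Falcon, Falco peregrinus AVIAN' > "$input"

# Hypothetical helper: change FS, then force a re-split of the current record.
two_separators() {
  awk '{
    FS = ","; $0 = $0    # re-split the current record on commas
    animal = $1          # copy the value before the fields change again
    FS = " "; $0 = $0    # re-split the same record on whitespace
    class = $NF
    print animal, class
  }' "$1"
}

two_separators "$input"
```

Without the $0 = $0 assignments, a new FS value only takes effect when the next record is read, which is why the attempt in the question appeared to use a single separator.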
Answered by kev
paste -d, <(cut -d, -f1 input.txt) <(awk '{print $NF}' input.txt)
Results:
Cat,MAMMAL
Dog,MAMMAL
Peregrine Falcon,AVIAN
cut extracts the first column
awk gets the last column
paste joins them together

