Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/12075260/

Time: 2020-09-18 03:06:07  Source: igfitidea

bash: combine five lines of input to each line of output

bash shell unix

Asked by Raj

I have an input file as follows:

MB1 00134141 
MB1 12415085 
MB1 13253590
MB1 10598105
MB1 01141484
...
...
MB1 10598105

I want to merge every 5 lines into one line. I want my bash script to process the input file and produce output as follows:

MB1 00134141 MB1 12415085 MB1 13253590 MB1 10598105 MB1 01141484
...
...
...                                                 

I have written the following script. It works, but it is slow for a file of 23051 lines. Can I write better code to make it faster?

#!/bin/bash
file=timing.csv
x=0
while [ $x -lt $(cat $file | wc -l) ]
do
   line=`head -n $x $file | tail -n 1`
   echo -n $line " "
   let "remainder = $x % 5"
   if [ "$remainder" -eq 0 ] 
   then
        echo ""
   fi
   let x=x+1
done
exit 0

I tried to execute the following command but it messes up some numbers.

cat timing_deleted.csv | pr -at5

Accepted answer by Charles Duffy

In pure bash, with no external processes (for speed):

while true; do
  out=()
  for (( i=0; i<5; i++ )); do
    read && out+=( "$REPLY" )
  done
  if (( ${#out[@]} > 0 )); then
    printf '%s ' "${out[@]}"
    echo
  fi
  if (( ${#out[@]} < 5 )); then break; fi
done <input-file >output-file

This correctly handles files where the number of lines is not a multiple of 5.
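
The same loop can be wrapped in a function so the group size becomes a parameter. This is only a sketch: join_n is a hypothetical name that does not appear in the original answer, and unlike the loop above it joins with "${out[*]}", so no trailing space is printed.

```bash
#!/bin/bash
# join_n is a hypothetical helper name, not part of the original answer.
# Usage: join_n [N] < input-file   (N defaults to 5)
join_n() {
  local n=${1:-5} i line
  local IFS=' '            # "${out[*]}" joins elements with the first IFS character
  local -a out
  while true; do
    out=()
    for (( i = 0; i < n; i++ )); do
      # IFS= and -r keep surrounding whitespace and backslashes intact
      IFS= read -r line && out+=( "$line" )
    done
    if (( ${#out[@]} > 0 )); then
      printf '%s\n' "${out[*]}"
    fi
    if (( ${#out[@]} < n )); then break; fi
  done
}

# Made-up sample: seven lines give one full group and one short group
printf '%s\n' a b c d e f g | join_n 5
```

With the seven sample lines, the first output line holds a through e and the second holds f and g.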

Answer by user3721740

Using tr:

cat input_file | tr "\n" " "

Answer by chepner

Use the paste command:

 paste -d ' ' - - - - - < tmp.txt
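
Each - makes paste take one line from standard input, so five of them consume five lines per output row. When the line count is not a multiple of 5, paste still emits the delimiters for the missing columns, which leaves trailing spaces on the last row. A minimal sketch of one way to trim them, assuming made-up sample data and a hypothetical helper name join5:

```bash
# join5 is a hypothetical helper name; the sed step strips the padding
# spaces that paste adds when the final group has fewer than five lines.
join5() {
  paste -d ' ' - - - - - | sed 's/ *$//'
}

# Made-up sample: seven lines, so the second row holds only two entries
printf 'MB1 %s\n' 01 02 03 04 05 06 07 | join5
```

Note that the sed step also removes trailing spaces that were already present on the last input line of each group.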


paste is far better, but I couldn't bring myself to delete my previous mapfile-based solution.

[UPDATE: mapfile reads too many lines prior to version 4.2.35 when used with -n]

#!/bin/bash
file=timing.csv
while true; do
    mapfile -t -n 5 arr
    (( ${#arr[@]} > 0 )) || break   # stop once mapfile reads nothing
    echo "${arr[*]}"
done < "$file"
exit 0

We can't do while mapfile ...; do because mapfile exits with status 0 even when it doesn't read any input.
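
The exit-status point can be checked directly. The sketch below assumes bash 4 or later (mapfile is not available in older versions) and uses a hypothetical function name, join5_mapfile, with made-up sample data. It tests the element count rather than the exit status.

```bash
#!/bin/bash
# Reading from empty input: mapfile still succeeds and leaves the array empty.
printf '' | { mapfile -t -n 5 arr; echo "status=$? count=${#arr[@]}"; }

# join5_mapfile is a hypothetical name; the loop ends when no lines were read.
join5_mapfile() {
  local -a arr
  local IFS=' '                         # "${arr[*]}" joins with a space
  while true; do
    mapfile -t -n 5 arr
    (( ${#arr[@]} > 0 )) || break       # element count, not exit status
    printf '%s\n' "${arr[*]}"
  done
}

printf '%s\n' a b c d e f g | join5_mapfile
```

The first command reports a status of 0 and a count of 0, which is why the loop condition has to look at the array instead.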

Answer by newfurniturey

You can use xargs, if your input always contains a consistent number of spaces per line:

cat timing_deleted.csv | xargs -n 10

This takes the output of cat timing_deleted.csv and groups it into 10 (-n 10) whitespace-separated arguments per output line. Both the space inside each line, such as MB1 00134141, and the newline at the end of each line act as separators, so every input line contributes two arguments. So, for 5 lines, you'll need to use 10.
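
A small sketch of that grouping, using made-up sample data and a hypothetical helper name join5_xargs. The second command shows why the field count per line has to be consistent: a line with a missing field shifts every later group.

```bash
# join5_xargs is a hypothetical name; with no command given, xargs runs echo.
join5_xargs() {
  xargs -n 10
}

# Six two-field lines: one row of 10 arguments, then one row of 2
printf 'MB1 %s\n' 01 02 03 04 05 06 | join5_xargs

# With -n 2, the one-field line "b" pulls "c" onto its row and strands "3"
printf 'a 1\nb\nc 3\n' | xargs -n 2
```

xargs also gives quotes and backslashes in its input a special meaning, so this approach is best kept to plain data like the sample above.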

EDIT
As commented by Charles, you can skip cat and feed the data directly to xargs with:

xargs -n 10 < timing_deleted.csv

I didn't notice any performance gains using a really large file, but it doesn't require multiple commands.

Answer by perreal

Using sed, but this one will not join the last few lines if the line count is not a multiple of 5:

 sed 'N;N;N;N;s/\n/ /g;' input_file

The N command reads the next line and appends it to the current line, preserving the newline. This script reads four additional lines for each line it reads, accumulating chunks of 5 lines in the buffer. For each such chunk, it replaces all of the newlines with a space.
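
A common variant, not part of the original answer, guards each N with $! so that it runs only when the current line is not the last one. The final short chunk is then joined too. A minimal sketch with made-up sample data and a hypothetical helper name join5_sed:

```bash
# join5_sed is a hypothetical name. $!N appends the next line only if the
# current line is not the last, so sed never hits end of input inside N.
join5_sed() {
  sed '$!N;$!N;$!N;$!N;s/\n/ /g'
}

# Made-up sample: seven lines give one row of five and one row of two
printf '%s\n' 1 2 3 4 5 6 7 | join5_sed
```

On the second cycle sed reads 6, appends 7, then skips the remaining N commands because 7 is the last line, and the substitution joins the two.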

Answer by someone

An awk script would do that. A sed replace too, I guess. I don't know sed well, so here you go.

# merge.awk: join every 5 non-empty input lines into one output line
NF {
    if (i >= 5) {
        line = line "\n" $0            # group is full: start a new output line
        i = 1
    } else {
        line = line (i ? " " : "") $0  # append to the current group
        i++
    }
}
END { print line }

Call that, say, merge.awk. Here is how you invoke it:

awk -f merge.awk filetomerge.txt

or cat filetomerge.txt | awk -f merge.awk

Should be rather fast too.
