Note: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/16910057/
How to paste columns from separate files using bash?
Asked by blehman
Using the following data:
$ cat date1.csv
Bob,2013-06-03T17:18:07
James,2013-06-03T17:18:07
Kevin,2013-06-03T17:18:07
$ cat date2.csv
2012-12-02T18:30:31
2012-12-02T18:28:37
2013-06-01T12:16:05
How can date1.csv and date2.csv files be merged? Output desired:
$ cat merge-date1-date2.csv
Bob,2013-06-03T17:18:07,2012-12-02T18:30:31
James,2013-06-03T17:18:07,2012-12-02T18:28:37
Kevin,2013-06-03T17:18:07,2013-06-01T12:16:05
Please note: the best solution will be able to quickly manage a massive number of lines.
Answered by Carl Norum
You were on the right track with paste(1):
$ paste -d , date1.csv date2.csv
Bob,2013-06-03T17:18:07,2012-12-02T18:30:31
James,2013-06-03T17:18:07,2012-12-02T18:28:37
Kevin,2013-06-03T17:18:07,2013-06-01T12:16:05
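To try this end to end, here is a minimal sketch that recreates the two sample files from the question in a scratch directory and writes the merged result to merge-date1-date2.csv. The scratch directory and the `merged` variable are only there for illustration. Since paste works through its inputs line by line, it should not need to hold either file in memory, which matters for very large inputs.

```shell
#!/usr/bin/env bash
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Bob,2013-06-03T17:18:07' \
              'James,2013-06-03T17:18:07' \
              'Kevin,2013-06-03T17:18:07' > "$tmpdir/date1.csv"
printf '%s\n' '2012-12-02T18:30:31' \
              '2012-12-02T18:28:37' \
              '2013-06-01T12:16:05' > "$tmpdir/date2.csv"

# Join line N of date1.csv with line N of date2.csv, separated by a comma,
# and redirect the result into the output file.
paste -d , "$tmpdir/date1.csv" "$tmpdir/date2.csv" > "$tmpdir/merge-date1-date2.csv"

# Keep a copy of the result in a variable, print it, then clean up.
merged=$(cat "$tmpdir/merge-date1-date2.csv")
printf '%s\n' "$merged"
rm -rf "$tmpdir"
```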
It's a bit unclear from your question whether there are leading spaces on those lines. If you want to get rid of them in the final output, you can use cut(1) to snip them off before pasting:
$ cut -c 2- date2.csv | paste -d , date1.csv -
Bob,2013-06-03T17:18:07,2012-12-02T18:30:31
James,2013-06-03T17:18:07,2012-12-02T18:28:37
Kevin,2013-06-03T17:18:07,2013-06-01T12:16:05
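Note that cut -c 2- drops exactly one character from every line, whether or not it is a space. If the amount of leading whitespace varies, or some lines have none, a sed substitution may be a safer choice. The sketch below is one possible variant, not part of the original answer; the unevenly indented sample data is made up for illustration.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
printf '%s\n' 'Bob,2013-06-03T17:18:07' \
              'James,2013-06-03T17:18:07' > "$tmpdir/date1.csv"
# Hypothetical input: the second file has uneven leading whitespace.
printf '%s\n' ' 2012-12-02T18:30:31' \
              '   2012-12-02T18:28:37' > "$tmpdir/date2.csv"

# Strip any run of leading whitespace, then let paste read the cleaned
# lines from standard input (the "-" operand).
trimmed=$(sed 's/^[[:space:]]*//' "$tmpdir/date2.csv" | paste -d , "$tmpdir/date1.csv" -)
printf '%s\n' "$trimmed"
rm -rf "$tmpdir"
```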
Answered by jaypal singh
Another way of doing it is with pr:
pr -mts, file1 file2
Test:
[jaypal:~/Temp] cat file1
Bob,2013-06-03T17:18:07
James,2013-06-03T17:18:07
Kevin,2013-06-03T17:18:07
[jaypal:~/Temp] cat file2
2012-12-02T18:30:31
2012-12-02T18:28:37
2013-06-01T12:16:05
[jaypal:~/Temp] pr -mts, file1 file2
Bob,2013-06-03T17:18:07,2012-12-02T18:30:31
James,2013-06-03T17:18:07,2012-12-02T18:28:37
Kevin,2013-06-03T17:18:07,2013-06-01T12:16:05
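For reference, the flags are -m (merge the files side by side, one per column), -t (omit page headers and trailers) and -s, (use a comma as the column separator). The following sketch, which assumes a pr that behaves like the one shown in the test above, checks that pr and paste agree on the sample files; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
printf '%s\n' 'Bob,2013-06-03T17:18:07' \
              'James,2013-06-03T17:18:07' \
              'Kevin,2013-06-03T17:18:07' > "$tmpdir/file1"
printf '%s\n' '2012-12-02T18:30:31' \
              '2012-12-02T18:28:37' \
              '2013-06-01T12:16:05' > "$tmpdir/file2"

# Merge the same two files with both tools.
pr_out=$(pr -mts, "$tmpdir/file1" "$tmpdir/file2")
paste_out=$(paste -d , "$tmpdir/file1" "$tmpdir/file2")

# Report whether the two results match.
if [ "$pr_out" = "$paste_out" ]; then
    echo "pr and paste agree"
else
    echo "pr and paste differ"
fi
rm -rf "$tmpdir"
```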
Answered by blehman
I wanted to extend jaypal's solution, as I've run into a need to edit files prior to merging the columns.
$ cat date1.csv
Bob,2013-06-03T17:18:07
James,2013-06-03T17:18:07
Kevin,2013-06-03T17:18:07
$ cat date2.csv
2012-12-02T18:30:31
2012-12-02T18:28:37
2013-06-01T12:16:05
Merging column 1 from date1.csv with column 1 from date2.csv can be accomplished as follows:
$ pr -mts, <(cut -d, -f1 date1.csv) date2.csv
Bob,2012-12-02T18:30:31
James,2012-12-02T18:28:37
Kevin,2013-06-01T12:16:05
You can apply further edits with a pipe if desired:
$ pr -mts, <(cut -d, -f1 date1.csv | sort) date2.csv
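The <(...) syntax is process substitution, which bash supports but plain POSIX sh does not. Where that is a concern, a similar result can be had by piping the edited column into paste and naming standard input with "-". This is a sketch of that alternative using the sample data above, not the approach from the answer itself; the variable name is illustrative.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
printf '%s\n' 'Bob,2013-06-03T17:18:07' \
              'James,2013-06-03T17:18:07' \
              'Kevin,2013-06-03T17:18:07' > "$tmpdir/date1.csv"
printf '%s\n' '2012-12-02T18:30:31' \
              '2012-12-02T18:28:37' \
              '2013-06-01T12:16:05' > "$tmpdir/date2.csv"

# Extract column 1 of date1.csv and pipe it into paste; the "-" operand
# tells paste to read that column from standard input.
first_cols=$(cut -d , -f 1 "$tmpdir/date1.csv" | paste -d , - "$tmpdir/date2.csv")
printf '%s\n' "$first_cols"
rm -rf "$tmpdir"
```

Any further edits (sort, sed, and so on) can be added to the pipeline before paste in the same way.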
Anyway, this has been handy for me and I just wanted to pass along the knowledge. Hope it helps someone.
Answered by David Ries
If you just want to paste specific columns of different files side-by-side, you can use a combination of paste and cut.
For example, if you have three files with the same lines, only differing in some columns that you want to bring together:
$ head file1.csv
chr1H 1 240 RLC 2 138 239 0.5774059
chr1H 641 1787 RLC 12 1135 1146 0.9904014
chr1H 2009 3436 RLC 15 1413 1427 0.9901892
chr1H 4935 6106 RLG 12 1060 1171 0.9052092
chr1H 11523 11997 RLG 4 371 474 0.7827004
chr1H 11998 12882 RLX 9 776 884 0.8778281
chr1H 20340 21529 RLC 13 1177 1189 0.9899075
chr1H 27889 36240 RLC 82 8118 8351 0.9720991
chr1H 36241 39978 RLC 36 3542 3737 0.9478191
chr1H 40384 41273 RLX 10 880 889 0.9898763
$ head file2.csv
chr1H 1 240 RLC 1 39 239 0.1631799
chr1H 641 1787 RLC 11 1049 1146 0.9153578
chr1H 2009 3436 RLC 6 594 1427 0.4162579
chr1H 4935 6106 RLG 11 995 1171 0.8497011
chr1H 11523 11997 RLG 3 275 474 0.5801688
chr1H 11998 12882 RLX 4 378 884 0.4276018
chr1H 20340 21529 RLC 11 979 1189 0.8233810
chr1H 27889 36240 RLC 74 7238 8351 0.8667225
chr1H 36241 39978 RLC 31 3047 3737 0.8153599
chr1H 40384 41273 RLX 10 880 889 0.9898763
$ head file3.csv
chr1H 1 240 RLC 2 138 239 0.5774059
chr1H 641 1787 RLC 12 1135 1146 0.9904014
chr1H 2009 3436 RLC 15 1413 1427 0.9901892
chr1H 4935 6106 RLG 12 1060 1171 0.9052092
chr1H 11523 11997 RLG 4 371 474 0.7827004
chr1H 11998 12882 RLX 9 776 884 0.8778281
chr1H 20340 21529 RLC 13 1177 1189 0.9899075
chr1H 27889 36240 RLC 82 8118 8351 0.9720991
chr1H 36241 39978 RLC 36 3542 3737 0.9478191
chr1H 40384 41273 RLX 10 880 889 0.9898763
The first four columns of the files are identical. We want to keep these, but additionally paste the 8th column of each file side-by-side:
$ paste file1.csv file2.csv file3.csv | cut -f 1,2,3,4,8,16,24 | head
results in:
chr1H 1 240 RLC 0.5774059 0.1631799 0.0000000
chr1H 641 1787 RLC 0.9904014 0.9153578 0.6448517
chr1H 2009 3436 RLC 0.9901892 0.4162579 0.2081289
chr1H 4935 6106 RLG 0.9052092 0.8497011 0.1690862
chr1H 11523 11997 RLG 0.7827004 0.5801688 0.0000000
chr1H 11998 12882 RLX 0.8778281 0.4276018 0.1119910
chr1H 20340 21529 RLC 0.9899075 0.8233810 0.1068124
chr1H 27889 36240 RLC 0.9720991 0.8667225 0.4043827
chr1H 36241 39978 RLC 0.9478191 0.8153599 0.3914905
chr1H 40384 41273 RLX 0.9898763 0.9898763 0.3217098
This needs almost no memory and is probably as fast as it gets.
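The field numbers 8, 16 and 24 follow from the file width: after paste, column c of the i-th file ends up at position (i-1)*K + c, where K is the number of columns per file. The sketch below works through that arithmetic on two small made-up files with three columns each (a.tsv and b.tsv are hypothetical names). It assumes tab-separated input, which is the default delimiter for both paste and cut.

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
# Hypothetical tab-separated files: identical first two columns,
# differing only in the third (last) column.
printf 'chr1\t1\t0.5\nchr1\t2\t0.9\n' > "$tmpdir/a.tsv"
printf 'chr1\t1\t0.1\nchr1\t2\t0.4\n' > "$tmpdir/b.tsv"

ncols=3                   # columns per file (K)
want=3                    # column to pull from every file (c)
second=$((ncols + want))  # position of that column in the second file: 6

# Keep all columns of the first file plus the wanted column of the second.
combined=$(paste "$tmpdir/a.tsv" "$tmpdir/b.tsv" | cut -f "1,2,3,$second")
printf '%s\n' "$combined"
rm -rf "$tmpdir"
```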