bash: Split text file into multiple files

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/16273069/

Split text file into multiple files

Tags: bash, unix, awk

Asked by shalini

I have a large text file containing 1000 abstracts, with an empty line between each abstract. I want to split this file into 1000 text files. My file looks like this:

16503654    Three-dimensional structure of neuropeptide k bound to dodecylphosphocholine micelles.      Neuropeptide K (NPK), an N-terminally extended form of neurokinin A (NKA), represents the most potent and longest lasting vasodepressor and cardiomodulatory tachykinin reported thus far.  

16504520    Computer-aided analysis of the interactions of glutamine synthetase with its inhibitors.        Mechanism of inhibition of glutamine synthetase (EC 6.3.1.2; GS) by phosphinothricin and its analogues was studied in some detail using molecular modeling methods. 

Answered by Alper

You can use split and set "NUMBER lines per output file" to 2. Each file would have one text line and one empty line.

split -l 2 file
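
As a rough illustration of the command above, here is a minimal sketch run on a tiny made-up input. The file name `abstracts.txt`, the `abstract_` output prefix, and the three sample records are assumptions for the demo, not part of the original question. It only works when every abstract occupies exactly one line followed by one empty line.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: three one-line records, each followed by an empty line.
printf '101 first abstract\n\n102 second abstract\n\n103 third abstract\n\n' > abstracts.txt

# -l 2 puts two lines in each piece; "abstract_" is the prefix of the output names.
split -l 2 abstracts.txt abstract_

# List the pieces; split appends alphabetic suffixes (aa, ab, ac, ...) to the prefix.
ls abstract_*
```

Each piece then holds one record line plus its trailing empty line.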

Answered by Guru

Something like this:

awk 'NF{print > $1; close($1)}' file

This will create 1000 files, each named after its abstract number. The awk code writes each record to a file whose name is taken from the first field ($1), and it does so only when the line has at least one field (NF is nonzero), which skips the empty lines. Calling close() after each write keeps awk from running out of open file descriptors.
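
If an abstract could span more than one line, the per-line approach would split it up. A possible variant, sketched here under that assumption, is awk's paragraph mode (an empty RS), which treats each block of text between empty lines as one record. The sample file, its contents, and the `.txt` suffix are made up for the demo.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: two records separated by an empty line; the second spans two lines.
printf '16503654\tTitle one.\tBody one.\n\n16504520\tTitle two.\nBody two continues here.\n' > abstracts.txt

# RS="" switches awk to paragraph mode: records are separated by empty lines.
# $1 is still the first whitespace-delimited field, i.e. the abstract number.
awk 'BEGIN { RS = "" } { out = $1 ".txt"; print > out; close(out) }' abstracts.txt

ls ./*.txt
```

The output files are named after the abstract number, as in the answer above, with `.txt` appended so they are easy to tell apart from the input.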

Answered by FreudianSlip

You could also use the csplit command. It is a file splitter that cuts the input at lines matching a regex.

Something along the lines of:

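csplit -ks -f /tmp/files INPUTFILENAMEGOESHERE '/^$/' '{*}'

It is untested and may need a little tweaking. The '{*}' argument (a GNU csplit extension) repeats the pattern as many times as possible; without it, csplit splits only at the first empty line.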
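
A minimal sketch of the csplit approach on a made-up three-record file. The file name, the `piece_` prefix, and the sample records are assumptions for the demo. Instead of relying on GNU's `'{*}'`, it derives an explicit repeat count from the number of empty lines, which the POSIX `{num}` form accepts.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: three records separated by single empty lines.
printf '101 first\n\n102 second\n\n103 third\n' > abstracts.txt

# Count the empty lines; the pattern is used once, then repeated (count - 1) more times.
n=$(grep -c '^$' abstracts.txt)

# -k keeps pieces on error, -s suppresses the size report, -f sets the output prefix.
csplit -ks -f piece_ abstracts.txt '/^$/' "{$((n - 1))}"

ls piece_*
```

Every piece after the first starts with the empty line that triggered the split, so a follow-up cleanup step may be wanted.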
