Linux: How do I split a text file into multiple *.txt files?

Question

Asked by Kris

I have a text file, file.txt (12 MB), containing:

something1
something2
something3
something4
(...)

Is there any way to split file.txt into 12 *.txt files, say file2.txt, file3.txt, file4.txt (...)?

Answer 1

Accepted answer by CS Pei

You can use split, one of the core utilities (GNU coreutils) on Linux:

split -b 1M -d  file.txt file

Note that both M and MB are accepted, but the sizes differ: MB is 1000 * 1000 bytes, M is 1024^2 bytes.
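To make the M versus MB difference concrete, here is a minimal sketch assuming GNU split. The sample file sample.bin and the prefixes mib_ and mb_ are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 3 MiB sample file (3 * 1024 * 1024 bytes of zeros).
head -c 3145728 /dev/zero > sample.bin

split -b 1M  -d sample.bin mib_   # pieces of 1024 * 1024 bytes
split -b 1MB -d sample.bin mb_    # pieces of 1000 * 1000 bytes

mib_first=$(wc -c < mib_00)
mb_first=$(wc -c < mb_00)
echo "M piece: $mib_first bytes, MB piece: $mb_first bytes"
```

With M the file divides into exactly three pieces, while with MB a fourth, smaller piece holds the remainder.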

If you want to split by lines instead, you can use the -l parameter.

UPDATE

lines=$(( ($(wc -l < file.txt) + 11) / 12 ))   # lines per piece, rounded up so there are at most 12 pieces
split -l "$lines" -d file.txt file

Another solution, suggested by Kirill, is the following:

split -n l/12 file.txt

Note that this is the letter l, not the digit one (1). split -n has a few options, like N, k/N, l/k/N, r/N, r/k/N.
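As a rough illustration of the l/12 form, the sketch below assumes a GNU split recent enough to support -n and --additional-suffix. The 1000-line input and the part_ prefix are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: 1000 numbered lines.
seq 1 1000 > file.txt

# 12 chunks, never cutting a line in two; -d gives numeric suffixes.
split -n l/12 -d --additional-suffix=.txt file.txt part_

pieces=$(ls part_*.txt | wc -l)
total=$(cat part_*.txt | wc -l)
echo "$pieces pieces, $total lines in total"
```

The l/N mode balances the pieces by byte size, so the line counts per piece can differ when line lengths vary.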

Answer 2

Answered by konsolebox

Using bash:

readarray -t LINES < file.txt
COUNT=${#LINES[@]}
for I in "${!LINES[@]}"; do
    INDEX=$(( (I * 12 - 1) / COUNT + 1 ))
    echo "${LINES[I]}" >> "file${INDEX}.txt"
done

Using awk:

awk '{
    a[NR] = $0
}
END {
    for (i = 1; i in a; ++i) {
        x = (i * 12 - 1) / NR + 1
        sub(/\..*$/, "", x)
        print a[i] > "file" x ".txt"
    }
}' file.txt

Unlike split, this one makes sure that the number of lines per file is as even as possible.
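The even distribution can be checked on a small input. This is a sketch of the same idea, not the answer's exact code: it uses int() instead of sub() to drop the fractional part, and the parts variable and 100-line input are assumptions for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 100 > file.txt    # hypothetical 100-line input

awk -v parts=12 '
    { a[NR] = $0 }                                  # buffer every line
    END {
        for (i = 1; i <= NR; ++i) {
            x = int((i * parts - 1) / NR) + 1       # target file number, 1..parts
            print a[i] > ("file" x ".txt")
        }
    }' file.txt

# file[0-9]*.txt matches the pieces but not file.txt itself.
wc -l file[0-9]*.txt
```

With 100 lines and 12 pieces, every piece ends up with either 8 or 9 lines. The whole file is held in memory, which is fine for 12 MB but worth keeping in mind for much larger inputs.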

Answer 3

Answered by amruta takawale

split -l 100 input_file output_file

where -l is the number of lines in each file. This will create:

output_fileaa
output_fileab
output_fileac
output_filead
....
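A small sketch of the default naming, assuming a made-up 250-line input_file:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 250 > input_file    # hypothetical 250-line input

# Without -d, split appends alphabetic suffixes: aa, ab, ac, ...
split -l 100 input_file output_file

wc -l output_file*
```

Here the first two pieces hold 100 lines each and the last one holds the remaining 50.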

Answer 4

Answered by schoon

John's answer won't produce .txt files as the OP wants. Use:

split -b 1M -d file.txt file --additional-suffix=.txt

Answer 5

Answered by Nicolas D

Regardless of what is said above, on my Ubuntu 16 I had to do:

split -b 10M -d system.log system_split.log

Please note the space between -b and the value.

Answer 6

Answered by Morgan32

Try something like this:

awk -v c=1 'NR % 1000000 == 0 { ++c } { print $0 > (c ".txt") }' Datafile.txt

for filename in *.txt; do mv "$filename" "Prefix_$filename"; done;
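A variant of the same awk idea, sketched under a few assumptions: the chunk size n, the prefix variable, and the 2500-line Datafile.txt are invented for the example. It writes the prefixed names directly, so no renaming loop is needed, and it closes each piece before starting the next one.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 2500 > Datafile.txt    # hypothetical input

awk -v n=1000 -v prefix="Prefix_" '
    (NR - 1) % n == 0 {
        if (out != "") close(out)                 # finished with the previous piece
        out = sprintf("%s%d.txt", prefix, ++c)    # Prefix_1.txt, Prefix_2.txt, ...
    }
    { print > out }
' Datafile.txt

wc -l Prefix_*.txt
```

Each piece except possibly the last holds exactly n lines. Building the name up front also avoids a rename loop over *.txt, which would pick up Datafile.txt as well.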

Answer 7

Answered by Ryan

I agree with @CS Pei, however this didn't work for me:

split -b=1M -d file.txt file

...as the = after -b threw it off. Instead, I simply deleted it, left no space between -b and the value, and used a lowercase "m":

split -b1m -d file.txt file

And to append ".txt", we use what @schoon said:

split -b1m -d file.txt file --additional-suffix=.txt

I had a 188.5MB txt file and I used this command [but with -b5m for 5.2MB files], and it returned 35 split files, all of which were txt files of 5.2MB except the last, which was 5.0MB. Now, since I wanted my lines to stay whole, I wanted to split the main file every 1 million lines, but the split command didn't allow me to even do -100000, let alone -1000000, so splitting by large line counts did not work for me.
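For keeping lines whole, two documented GNU split options may be worth trying: -l with an explicit line count, and -C, which caps each piece at a byte size but only cuts at line ends. The sketch below assumes GNU split; the 200,000-line input and the bylines_ and bysize_ prefixes are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 200000 > file.txt    # hypothetical input, roughly 1.3 MB

# At most 100000 lines per piece, using the explicit -l option.
split -l 100000 -d --additional-suffix=.txt file.txt bylines_

# At most 512 KiB per piece, cutting only at line boundaries.
split -C 512K -d --additional-suffix=.txt file.txt bysize_

wc -l bylines_*.txt
wc -c bysize_*.txt
```

The -l form gives predictable line counts, while -C gives predictable sizes without breaking any line.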

Answer 8

Answered by stackoverflowuser2010

On my Linux system (Red Hat Enterprise 6.9), the split command does not have the command-line options -n or --additional-suffix.

Instead, I've used this:

split -d -l NUM_LINES really_big_file.txt split_files.txt.

where -d adds a numeric suffix to the end of split_files.txt. and -l specifies the number of lines per file.

For example, suppose I have a really big file like this:

$ ls -laF
total 1391952
drwxr-xr-x 2 user.name group         40 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group       4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt

This file has 100,000 lines, and I want to split it into files with at most 30,000 lines. This command will run the split and append an integer at the end of the output file pattern split_files.txt.:

$ split -d -l 30000 really_big_file.txt split_files.txt.

The resulting files are split correctly with at most 30,000 lines per file.

$ ls -laF
total 2783904
drwxr-xr-x 2 user.name group        156 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group       4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt
-rw-r--r-- 1 user.name group  428604626 Sep 14 15:43 split_files.txt.00
-rw-r--r-- 1 user.name group  427152423 Sep 14 15:43 split_files.txt.01
-rw-r--r-- 1 user.name group  427141443 Sep 14 15:43 split_files.txt.02
-rw-r--r-- 1 user.name group  142454325 Sep 14 15:43 split_files.txt.03


$ wc -l *.txt*
    100000 really_big_file.txt
     30000 split_files.txt.00
     30000 split_files.txt.01
     30000 split_files.txt.02
     10000 split_files.txt.03
    200000 total
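If the pieces need to end in .txt on a system whose split lacks --additional-suffix, one option is to rename them afterwards. This is only a sketch: the split_files_ prefix and the 100,000-line stand-in file are assumptions for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 100000 > really_big_file.txt    # stand-in for the real file

split -d -l 30000 really_big_file.txt split_files_

# Append .txt to every piece that split produced.
for piece in split_files_[0-9]*; do
    mv -- "$piece" "$piece.txt"
done

wc -l split_files_*.txt
```

The glob split_files_[0-9]* only matches the numbered pieces, so the original file is left alone.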

Answer 9

Answered by bcag2

If each part should have the same number of lines, for example 22, here is my solution:

split --numeric-suffixes=2 --additional-suffix=.txt -l 22 file.txt file

and you obtain file02.txt with the first 22 lines, file03.txt with the next 22 lines, and so on (the numeric suffix is two digits wide by default).
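A quick check of the resulting names, assuming a GNU split that supports --numeric-suffixes=FROM and --additional-suffix; the 50-line input is made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 50 > file.txt    # hypothetical 50-line input

split --numeric-suffixes=2 --additional-suffix=.txt -l 22 file.txt file

# file?*.txt matches the pieces but not file.txt itself.
wc -l file?*.txt
```

The numbering starts at 2 as requested, with the default two-digit suffix width; the -a option changes that width.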

Thanks to @hamruta-takawale, @dror-s and @stackoverflowuser2010.
