Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/23593556/

Time: 2020-09-09 11:34:32  Source: igfitidea

Batch split a text file

Tags: windows, file, batch-file, text, cmd

Asked by 09stephenb

I have this batch file to split a txt file:

@echo off
for /f "tokens=1*delims=:" %%a in ('findstr /n "^" "PASSWORD.txt"') do for /f "delims=~" %%c in ("%%~b") do >"text%%a.txt" echo(%%c
pause

It works, but it splits the file line by line. How do I make it split every 5000 lines? Thanks in advance.

Edit:

I have just tried this:

@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=passwordAll.txt
REM Edit this value to change the number of lines per file.
SET LPF=50000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile

REM Do not change beyond this line.

SET SFX=%BFN:~-3%

SET /A LineNum=0
SET /A FileNum=1

For /F "delims==" %%l in (%BFN%) Do (
SET /A LineNum+=1

echo %%l >> %SFN%!FileNum!.%SFX%

if !LineNum! EQU !LPF! (
SET /A LineNum=0
SET /A FileNum+=1
)

)
endlocal
Pause
exit

But I get an error saying: Not enough storage is available to process this command

Answered by MC ND

This will give you a basic skeleton. Adapt as needed.

@echo off
    setlocal enableextensions disabledelayedexpansion

    set "nLines=5000"
    set "line=0"

    for /f "usebackq delims=" %%a in ("passwords.txt") do (
        set /a "file=line/%nLines%", "line+=1"
        setlocal enabledelayedexpansion
        for %%b in (!file!) do (
            endlocal
            >>"passwords_%%b.txt" echo(%%a
        )
    )

    endlocal
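
The skeleton above picks the output file with integer division: set /a discards the remainder, so line indexes 0 to 4999 map to file 0, 5000 to 9999 map to file 1, and so on. A minimal sketch of just that arithmetic follows; the sample indexes are arbitrary values chosen for illustration and no file is read or written.

```batch
@echo off
setlocal enableextensions enabledelayedexpansion

rem Same chunk size as in the skeleton above.
set "nLines=5000"

rem Illustrative zero-based line indexes (assumed values, not taken from a real file).
for %%n in (0 4999 5000 9999 10000) do (
    rem set /a performs integer division, so the remainder is dropped.
    set /a "file=%%n/nLines"
    echo line index %%n goes to passwords_!file!.txt
)

endlocal
```

Changing nLines is enough to change the chunk size, since the file number is always derived from the running line counter.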

EDITED

As the comments indicated, a 4.3GB file is hard to manage. for /f needs to load the full file into memory, and the buffer needed is twice the file size because the file is converted to Unicode in memory.

This is a fully ad hoc solution. I have not tested it on a file that large, but at least in theory it should work (unless 5000 lines need a lot of memory; that depends on the line length).

AND, with such a file it will be SLOW

@echo off
    setlocal enableextensions disabledelayedexpansion

    set "line=0"
    set "tempFile=%temp%\passwords.tmp"

    findstr /n "^" passwords.txt > "%tempFile%"
    for /f %%a in ('type passwords.txt ^| find /c /v "" ') do set /a "nFiles=%%a/5000"

    for /l %%a in (0 1 %nFiles%) do (
        set /a "e1=%%a*5", "e2=e1+1", "e3=e2+1", "e4=e3+1", "e5=e4+1"
        setlocal enabledelayedexpansion
        if %%a equ 0 (
            set "e=/c:"[1-9]:" /c:"[1-9][0-9]:" /c:"[1-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
        ) else (
            set "e=/c:"!e1![0-9][0-9][0-9]:" /c:"!e2![0-9][0-9][0-9]:" /c:"!e3![0-9][0-9][0-9]:" /c:"!e4![0-9][0-9][0-9]:" /c:"!e5![0-9][0-9][0-9]:" "
        )
        for /f "delims=" %%e in ("!e!") do (
            endlocal & (for /f "tokens=1,* delims=:" %%b in ('findstr /r /b %%e "%tempFile%"') do @echo(%%c)>passwords_%%a.txt
        )
    )

    del "%tempFile%" >nul 2>nul

    endlocal

EDITED, again: The previous code will not work correctly for lines starting with a colon, because the colon is used as a delimiter in the for command to separate line numbers from data.

An alternative, still pure batch but still SLOW:

@echo off
    setlocal enableextensions disabledelayedexpansion

    set "nLines=5000"
    set "line=0"
    for /f %%a in ('type passwords.txt^|find /c /v ""') do set "fileLines=%%a"

    < "passwords.txt" (for /l %%a in (1 1 %fileLines%) do (
        set /p "data="
        set /a "file=line/%nLines%", "line+=1"
        setlocal enabledelayedexpansion
        >>"passwords_!file!.txt" echo(!data!
        endlocal
    ))

    endlocal
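
The colon problem mentioned in the second edit can be reproduced in isolation. The sketch below is only an illustration: the string 12::secret is a made-up example of what findstr /n would emit for a line ":secret" at line 12. for /f collapses consecutive delimiters, so the leading colon of the data disappears, while a substring replacement that removes only the text up to the first colon keeps it.

```batch
@echo off
setlocal enableextensions enabledelayedexpansion

rem Hypothetical numbered line: line number 12, data ":secret".
set "numbered=12::secret"

rem for /f treats the two consecutive colons as a single delimiter,
rem so the colon that belongs to the data is lost.
for /f "tokens=1,* delims=:" %%b in ("!numbered!") do echo for /f result [%%c]

rem Removing everything up to and including the FIRST colon keeps the data intact.
echo replace result [!numbered:*:=!]

endlocal
```

The first echo should show the data without its leading colon, the second with it.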

Answered by foxidrive

Test this: the input file is "file.txt" and the output files are named like "splitfile-5000.txt", for example.

This uses a helper batch file called findrepl.bat; download it from: https://www.dropbox.com/s/rfdldmcb6vwi9xc/findrepl.bat

Place findrepl.bat in the same folder as the batch file or on the path.

@echo off
:: splits file.txt into 5000 line chunks. 
set chunks=5000

set /a s=1-chunks
:loop
set /a s=s+chunks
set /a e=s+chunks-1
echo %s% to %e%
call findrepl /o:%s%:%e% <"file.txt" >"splitfile-%e%.txt"
for %%b in ("splitfile-%e%.txt") do (if %%~zb EQU 0 del "splitfile-%e%.txt" & goto :done)
goto :loop
:done
pause

A limitation is the number of lines in the file: the largest usable line number is 2^31 - 1, where batch math tops out.
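
The limit comes from set /a, which works with 32-bit signed integers. As a small sketch of what "tops out" means (the variable names max and next are only for illustration), adding 1 to the largest value is expected to wrap around to a negative number rather than raise an error:

```batch
@echo off
setlocal

rem 2147483647 is 2^31 - 1, the largest value set /a can hold.
set /a "max=2147483647"

rem One more than the maximum no longer fits in 32 bits and wraps around.
set /a "next=max+1"

echo max  = %max%
echo next = %next%

endlocal
```

For the loop above this would only matter if the s and e counters ever went past that value, which requires a file with more than two billion lines.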

Answered by Aacini

@echo off
setlocal EnableDelayedExpansion

findstr /N "^" PASSWORD.txt > temp.txt
set part=0
call :splitFile < temp.txt
del temp.txt
goto :EOF

:splitFile
set /A part+=1
(for /L %%i in (1,1,5000) do (
   set "line="
   set /P line=
   if defined line echo(!line:*:=!
)) >  text%part%.txt
if defined line goto splitFile
exit /B

If the input file has no empty lines, the previous method can be modified to run faster.
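
One possible shape of that modification, given here as a sketch rather than as the author's own code: skip the findstr /N pass and the temporary file, and read PASSWORD.txt directly. It assumes there are no empty lines, because an empty read is the only end-of-file signal left; an empty line in the data would be dropped, and one at the end of a chunk would stop the split early. It also assumes lines are short enough for set /P, which truncates long lines (roughly 1 KB).

```batch
@echo off
setlocal EnableDelayedExpansion

rem Assumption: PASSWORD.txt contains no empty lines.
set part=0
call :splitFile < PASSWORD.txt
goto :EOF

:splitFile
set /A part+=1
rem Copy up to 5000 lines from the redirected input into the current part file.
(for /L %%i in (1,1,5000) do (
   set "line="
   set /P line=
   if defined line echo(!line!
)) > text%part%.txt
rem An undefined "line" after the loop means the input is exhausted.
if defined line goto splitFile
exit /B
```

As with the original version, an input whose line count is an exact multiple of 5000 leaves one empty trailing part file, which can simply be deleted.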