Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/9310711/

Time: 2020-09-09 09:14:29  Source: igfitidea

Remove trailing spaces from a file using Windows batch?

Tags: windows, batch-file, cmd

Asked by HeinrichStack

How could I trim all trailing spaces from a text file using the Windows command prompt?

Answered by dbenham

The DosTips RTRIM function that Ben Hocking cites can be used to create a script that can right trim each line in a text file. However, the function is relatively slow.

DosTips user (and moderator) aGerman developed a very efficient right trim algorithm. He implemented the algorithm as a batch "macro" - an interesting concept of storing complex mini scripts in environment variables that can be executed from memory. Macros with arguments are a major discussion topic in and of themselves that is not relevant to this question.

I have extracted aGerman's algorithm and put it in the following batch script. The script expects the name of a text file as the only parameter and proceeds to right trim the spaces off each line in the file.

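@echo off
setlocal enableDelayedExpansion
rem Build a string of 4096 spaces by doubling a single space 12 times.
set "spcs= "
for /l %%n in (1 1 12) do set "spcs=!spcs!!spcs!"
rem Prefix every line with its line number so that blank lines survive FOR /F.
findstr /n "^" "%~1" >"%~1.tmp"
setlocal disableDelayedExpansion
(
  for /f "usebackq delims=" %%L in ("%~1.tmp") do (
    rem Assign while delayed expansion is off so exclamation marks in the text survive.
    set "ln=%%L"
    setlocal enableDelayedExpansion
    rem Remove the line number prefix that FINDSTR added.
    set "ln=!ln:*:=!"
    rem Halving search: try to drop 4096, 2048, and so on down to 1 trailing spaces.
    set /a "n=4096"
    for /l %%i in (1 1 13) do (
      if defined ln for %%n in (!n!) do (
        if "!ln:~-%%n!"=="!spcs:~-%%n!" set "ln=!ln:~0,-%%n!"
        set /a "n/=2"
      )
    )
    echo(!ln!
    endlocal
  )
) >"%~1"
del "%~1.tmp" 2>nul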

Assuming the script is called rtrimFile.bat, then it can be called from the command line as follows:

rtrimFile "fileName.txt"

A note about performance
The original DosTips rtrim function performs a linear search and defaults to trimming a maximum of 32 spaces. It has to iterate once per space.

aGerman's algorithm uses a binary search and it is able to trim the maximum string size allowed by batch (up to ~8k spaces) in 13 iterations.
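
The halving idea can be sketched outside batch as well. The following Python function is only an illustration of the technique described above, not aGerman's macro; the name rtrim_halving and the max_power parameter are made up for this sketch.

```python
def rtrim_halving(line: str, max_power: int = 12) -> str:
    """Strip trailing spaces by testing chunks of 4096, 2048, ..., 1 spaces.

    Hypothetical helper for illustration; it mirrors the 13 halving steps
    of the batch script, so it removes at most 8191 trailing spaces.
    """
    n = 2 ** max_power          # 4096, the starting chunk size used in the script
    spaces = " " * n
    while n >= 1:
        # If the last n characters are all spaces, drop them in one step.
        if len(line) >= n and line[-n:] == spaces[-n:]:
            line = line[:-n]
        n //= 2                 # halve the chunk size for the next test
    return line
```

Each chunk size is tested exactly once, so the number of comparisons depends only on the maximum length, not on how many spaces the line actually ends with.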

Unfortunately, batch is very SLOW when it comes to processing text. Even with the efficient rtrim function, it takes ~70 seconds to trim a 1MB file on my machine. The problem is, just reading and writing the file without any modification takes significant time. This answer uses a FOR loop to read the file, coupled with FINDSTR to prefix each line with the line number so that blank lines are preserved. It toggles delayed expansion to prevent ! from being corrupted, and uses a search and replace operation to remove the line number prefix from each line. All that before it even begins to do the rtrim.

Performance could be nearly doubled by using an alternate file read mechanism that uses set /p. However, the set /p method is limited to ~1k bytes per line, and it strips trailing control characters from each line.

If you need to regularly trim large files, then even a doubling of performance is probably not adequate. Time to download (if possible) any one of many utilities that could process the file in the blink of an eye.

If you can't use non-native software, then you can try VBScript or JScript executed via the CSCRIPT batch command. Either one would be MUCH faster.

UPDATE - Fast solution with JREPL.BAT

JREPL.BAT is a regular expression find/replace utility that can very efficiently solve the problem. It is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward. No 3rd party exe files are needed.

With JREPL.BAT somewhere within your PATH, you can strip trailing spaces from file "test.txt" with this simple command:

jrepl " +$" "" /f test.txt /o -

If you put the command within a batch script, then you must precede the command with CALL:

call jrepl " +$" "" /f test.txt /o -
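
To see what the " +$" pattern does in isolation, here is a rough Python sketch of the same regular expression applied to a string. It says nothing about how JREPL.BAT works internally; the helper name strip_trailing_spaces is hypothetical, and the sketch assumes lines end in a plain "\n".

```python
import re

# One or more spaces immediately before the end of a line.
# re.MULTILINE makes "$" match before every "\n", not only at the end of the text.
TRAILING_SPACES = re.compile(r" +$", re.MULTILINE)


def strip_trailing_spaces(text: str) -> str:
    """Remove runs of spaces at the end of each line (assumes "\\n" line endings)."""
    return TRAILING_SPACES.sub("", text)
```

Because the pattern matches only the space character, trailing tabs are left alone, and spaces between words are untouched.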

Answered by paxdiablo

Go get yourself a copy of CygWin or the sed package from GnuWin32.

Then use that with the command:

sed "s/ *$//" inputFile >outputFile
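
For readers without sed, the same input-file-to-output-file flow might look roughly like this in Python. This is a sketch rather than a drop-in replacement: the function name rtrim_file is made up, and UTF-8 is assumed as the file encoding.

```python
def rtrim_file(input_path: str, output_path: str, encoding: str = "utf-8") -> None:
    """Copy input_path to output_path with trailing spaces removed from every line.

    Hypothetical helper; the encoding default is an assumption.
    """
    # newline="" disables newline translation, so "\r\n" endings are kept as they are.
    with open(input_path, "r", encoding=encoding, newline="") as src, \
         open(output_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            body = line.rstrip("\r\n")      # the text of the line without its terminator
            ending = line[len(body):]       # the original terminator, possibly empty
            dst.write(body.rstrip(" ") + ending)
```

Like the sed command, it writes to a separate output file instead of modifying the input in place.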

Answered by Ben Hocking

Dos Tips has an implementation of RTrim that works for batch files:

:rTrim string char max -- strips white spaces (or other characters) from the end of a string
::                     -- string [in,out] - string variable to be trimmed
::                     -- char   [in,opt] - character to be trimmed, default is space
::                     -- max    [in,opt] - maximum number of characters to be trimmed from the end, default is 32
:$created 20060101 :$changed 20080219 :$categories StringManipulation
:$source http://www.dostips.com
SETLOCAL ENABLEDELAYEDEXPANSION
call set string=%%%~1%%
set char=%~2
set max=%~3
if "%char%"=="" set char= &rem one space
if "%max%"=="" set max=32
for /l %%a in (1,1,%max%) do if "!string:~-1!"=="%char%" set string=!string:~0,-1!
( ENDLOCAL & REM RETURN VALUES
    IF "%~1" NEQ "" SET %~1=%string%
)
EXIT /b

If you're not used to using functions in batch files, read this.
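
The logic of the rTrim function above is a simple linear loop. As a reading aid, it can be paraphrased in Python as below; rtrim_linear is a hypothetical name, and the defaults copy the documented ones (space, 32).

```python
def rtrim_linear(string: str, char: str = " ", max_count: int = 32) -> str:
    """Remove up to max_count trailing copies of char, one per iteration."""
    for _ in range(max_count):
        # Same test as the batch line: compare the last character with char.
        if string[-1:] == char:
            string = string[:-1]
    return string
```

A line that ends in more than max_count copies of the character keeps the excess, which is the limitation the default of 32 implies.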

Answered by aschipfl

There is a nice trick to remove trailing spaces based on this answer by user Aacini; I modified it so that all other spaces occurring in the string are preserved. So here is the code:

@echo off
setlocal EnableDelayedExpansion

rem // This is the input string:
set "x=  This is   a text  string     containing  many   spaces.   "

rem // Ensure there is at least one trailing space; then initialise auxiliary variables:
set "y=%x% " & set "wd=" & set "sp="

rem // Now here is the algorithm:
set "y=%y: =" & (if defined wd (set "y=!y!!sp!!wd!" & set "sp= ") else (set "sp=!sp! ")) & set "wd=%"

rem // Return messages:
echo  input: "%x%"
echo output: "%y%"

endlocal

However, this approach fails when a character of the set ^, !, "occurs in the string.
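
The substitution trick is dense, so here is one possible reading of it as a Python sketch: the string is split on spaces, and pending spaces are only written out when another word follows, which is why the trailing ones disappear. The function name rtrim_by_tokens is invented for illustration, and the sketch does not share the batch version's problems with ^, ! or ".

```python
def rtrim_by_tokens(text: str) -> str:
    """Drop trailing spaces while keeping leading and interior spaces."""
    pieces = text.split(" ")    # empty pieces stand for consecutive spaces
    result = pieces[0]          # corresponds to the first assignment to y
    pending = " "               # spaces waiting for a following word (sp)
    for word in pieces[1:]:
        if word:
            # A word follows, so the pending spaces were interior: keep them.
            result += pending + word
            pending = " "
        else:
            # Another space; remember it, but do not emit it yet.
            pending += " "
    return result               # spaces still pending were trailing and are dropped
```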

Answered by aschipfl

I just found a very nice solution for trimming white-spaces off a string:
Have you ever called a sub-routine using call and expanded all arguments using %*? You will notice that any leading and/or trailing white-spaces are removed. Any white-spaces occurring in between other characters are preserved; so are all the other command token separators "," ";" and "=", and also the non-break space (character code 0xFF). This effect I am going to utilise for my script:

@echo off

set "STR="
set /P STR="Enter string: "

rem /* Enable Delayed Expansion to avoid trouble with
rem    special characters: `&`, `<`, `>`, `|`, `^` */
setlocal EnableDelayedExpansion
echo You entered: `!STR!`
call :TRIM !STR!
echo And trimmed: `!RES!`
endlocal

exit /B

:TRIM
set "RES=%*"
exit /B

This script expects a string entered by the user which is then trimmed. This can of course also be applied on lines of a file (which the original question is about, but reading such line by line using for /F is shown in other answers anyway, so I skip this herein). To trim the string on one side only, add a single character to the opposite side prior to trimming and remove it afterwards.
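
The one-side-only idea at the end of that paragraph is independent of batch. A minimal Python sketch of it follows, using str.strip as a stand-in for the call/%* behaviour; the sentinel "x" and the function names are arbitrary choices for this example, and only spaces and tabs are treated as white-space here.

```python
def trim_both(text: str) -> str:
    """Two-sided trim, standing in for what call with %* does in the script."""
    return text.strip(" \t")


def trim_right_only(text: str) -> str:
    """Protect the left side with a sentinel, trim both sides, then drop the sentinel."""
    guarded = "x" + text
    return trim_both(guarded)[1:]


def trim_left_only(text: str) -> str:
    """Same trick mirrored: the sentinel protects the right side."""
    guarded = text + "x"
    return trim_both(guarded)[:-1]
```

For the original question, only the right-side variant is needed.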

This approach has some limitations though: it does not handle the characters %, !, ^ and " properly. To overcome this, several intermediate string manipulation operations are required:

@echo off
setlocal EnableExtensions DisableDelayedExpansion

set "STR="
set /P STR="Enter string: "

setlocal EnableDelayedExpansion
echo You entered: `!STR!`
set "STR=!STR:%%=%%%%!"
set "STR=!STR:"=""!^"
if not "%STR%"=="%STR:!=%" set "STR=!STR:^=^^^^!"
set "STR=%STR:!=^^^!%"
call :TRIM !STR!
set "RES=!RES:""="!^"
echo And trimmed: `!RES!`
endlocal

endlocal
exit /B

:TRIM
set "RES=%*"
exit /B

Update: I just realised that the characters &, <, > and | still cause trouble. Once I find a solution I am going to come back here and fix the code accordingly...

Answered by John

Good tool for removing trailing spaces in files in windows: http://mountwhite.net/en/spaces.html

Answered by anatoly techtonik

I use this Python 2 script to print lines with trailing whitespace and remove them manually:

#!/usr/bin/env python2
import sys

if not sys.argv[1:]:
  sys.exit('usage: whitespace.py <filename>')

for no, line in enumerate(open(sys.argv[1], 'rb').read().splitlines()):
  if line.endswith(' '):
    print no+1, line

I know that Python is not preinstalled for Windows, but at least it works cross-platform.
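
Where only Python 3 is available, the same check might be written as below. This is a sketch, not the author's script: the function name lines_with_trailing_spaces is made up, and it returns the hits instead of printing them so the caller can decide what to do.

```python
def lines_with_trailing_spaces(path):
    """Return (line_number, line_bytes) pairs for lines that end with a space."""
    hits = []
    # Read as bytes, like the script above, so no encoding has to be assumed.
    with open(path, "rb") as handle:
        for number, line in enumerate(handle.read().splitlines(), start=1):
            if line.endswith(b" "):
                hits.append((number, line))
    return hits
```

As in the original, only the space character counts; a line ending in a tab is not reported.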
