Original source: http://stackoverflow.com/questions/4094699/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on Stack Overflow.
How does the Windows Command Interpreter (CMD.EXE) parse scripts?
Asked by Benoit
I ran into ss64.com, which provides good help on how to write batch scripts that the Windows Command Interpreter will run.
However, I have been unable to find a good explanation of the grammar of batch scripts, how things expand or do not expand, and how to escape things.
Here are sample questions that I have not been able to solve:
- How is the quote system managed? I made a TinyPerl script (`foreach $i (@ARGV) { print '*' . $i ; }`), compiled it and called it this way:
  - `my_script.exe "a ""b"" c"` → output is `*a "b*c`
  - `my_script.exe """a b c"""` → output is `*"a*b*c"`
- How does the internal `echo` command work? What is expanded inside that command?
- Why do I have to use `for [...] %%I` in file scripts, but `for [...] %I` in interactive sessions?
- What are the escape characters, and in what context? How to escape a percent sign? For example, how can I echo `%PROCESSOR_ARCHITECTURE%` literally? I found that `echo.exe %""PROCESSOR_ARCHITECTURE%` works, is there a better solution?
- How do pairs of `%` match? Example:
  - `set b=a`, `echo %a %b% c%` → `%a a c%`
  - `set a =b`, `echo %a %b% c%` → `bb c%`
- How do I ensure a variable passes to a command as a single argument if ever this variable contains double quotes?
- How are variables stored when using the `set` command? For example, if I do `set a=a" b` and then `echo.%a%` I obtain `a" b`. If I however use `echo.exe` from the UnxUtils, I get `a b`. How come `%a%` expands in a different way?
Thank you for any light you can shed.
Answered by jeb
We performed experiments to investigate the grammar of batch scripts. We also investigated differences between batch and command line mode.
Batch Line Parser:
Here is a brief overview of phases in the batch file line parser:
Phase 0) Read Line:
Phase 1) Percent Expansion:
Phase 2) Process special characters, tokenize, and build a cached command block: This is a complex process that is affected by things such as quotes, special characters, token delimiters, and caret escapes.
Phase 3) Echo the parsed command(s): Only if the command block did not begin with `@`, and ECHO was ON at the start of the preceding step.
Phase 4) FOR `%X` variable expansion: Only if a FOR command is active and the commands after DO are being processed.
Phase 5) Delayed Expansion: Only if delayed expansion is enabled
Phase 5.3) Pipe processing: Only if commands are on either side of a pipe
Phase 5.5) Execute Redirection:
Phase 6) CALL processing/Caret doubling: Only if the command token is CALL
Phase 7) Execute: The command is executed
Here are details for each phase:
Note that the phases described below are only a model of how the batch parser works. The actual cmd.exe internals may not reflect these phases. But this model is effective at predicting behavior of batch scripts.
Phase 0) Read Line: Read line of input through first `<LF>`.
- When reading a line to be parsed as a command, `<Ctrl-Z>` (0x1A) is read as `<LF>` (LineFeed 0x0A)
- When GOTO or CALL reads lines while scanning for a :label, `<Ctrl-Z>` is treated as itself - it is not converted to `<LF>`
Phase 1) Percent Expansion:
- A double `%%` is replaced by a single `%`
- Expansion of arguments (`%*`, `%1`, `%2`, etc.)
- Expansion of `%var%`, if var does not exist replace it with nothing
- Line is truncated at first `<LF>` not within `%var%` expansion
- For a complete explanation, read the first half of dbenham's answer in the same thread: Percent Phase
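The Phase 1 rules for `%%` and `%var%` can be illustrated with a small Python model. This is only a sketch of the rules as listed here, not how cmd.exe is implemented: the function name `percent_expand` is hypothetical, argument expansion (`%1`, `%*`) and `%var:old=new%` modifiers are left out, and dropping a lone unpaired `%` is an assumption about batch mode that the list does not state.

```python
def percent_expand(line, env):
    """Simplified model of phase 1 for one line of a batch file.

    Only `%%` and `%var%` are modelled. Arguments (%1, %*) and
    modifiers such as %var:old=new% are deliberately ignored.
    """
    # Variable names are looked up case-insensitively in this model.
    env = {name.upper(): value for name, value in env.items()}
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch != "%":
            out.append(ch)
            i += 1
        elif line.startswith("%%", i):
            out.append("%")  # a double %% becomes a single %
            i += 2
        else:
            end = line.find("%", i + 1)
            if end == -1:
                i += 1  # assumption: a lone % is dropped in batch mode
            else:
                name = line[i + 1:end]
                # An undefined variable is replaced with nothing.
                out.append(env.get(name.upper(), ""))
                i = end + 1
    return "".join(out)


print(percent_expand("echo %USER% is 100%% done", {"user": "jeb"}))
# prints: echo jeb is 100% done
print(percent_expand("for %%I in (a b) do echo %%I%missing%", {}))
# prints: for %I in (a b) do echo %I
```

The second call shows why a batch file writes `%%I`: after this phase only `%I` is left for the later FOR variable expansion.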
Phase 2) Process special characters, tokenize, and build a cached command block: This is a complex process that is affected by things such as quotes, special characters, token delimiters, and caret escapes. What follows is an approximation of this process.
There are concepts that are important throughout this phase.
- A token is simply a string of characters that is treated as a unit.
- Tokens are separated by token delimiters. The standard token delimiters are `<space>` `<tab>` `;` `,` `=` `<0x0B>` `<0x0C>` and `<0xFF>`
  Consecutive token delimiters are treated as one - there are no empty tokens between token delimiters
- There are no token delimiters within a quoted string. The entire quoted string is always treated as part of a single token. A single token may consist of a combination of quoted strings and unquoted characters.
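The token concepts above can be sketched in Python. This is a minimal model under stated assumptions: `split_tokens` and `DELIMITERS` are hypothetical names, caret escapes and the other special characters of Phase 2 are ignored, and only the three rules just listed are implemented (delimiters split tokens, consecutive delimiters collapse, quoted text never splits).

```python
# The standard token delimiters named in the text.
DELIMITERS = set(" \t;,=\x0b\x0c\xff")


def split_tokens(text):
    """Split text into tokens; delimiters inside quotes do not split."""
    tokens, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes  # a quote toggles the quote flag
            current.append(ch)
        elif ch in DELIMITERS and not in_quotes:
            if current:  # consecutive delimiters never yield empty tokens
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


print(split_tokens('copy,,a.txt;"my file.txt"=b'))
# prints: ['copy', 'a.txt', '"my file.txt"', 'b']
print(split_tokens('pre"a b"post c'))
# prints: ['pre"a b"post', 'c']
```

The second call shows a single token that mixes quoted and unquoted characters.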
The following characters may have special meaning in this phase, depending on context: `<CR>` `^` `(` `@` `&` `|` `<` `>` `<LF>` `<space>` `<tab>` `;` `,` `=` `<0x0B>` `<0x0C>` `<0xFF>`
Look at each character from left to right:
- If `<CR>` then remove it, as if it were never there (except for weird redirection behavior)
- If a caret (`^`), the next character is escaped, and the escaping caret is removed. Escaped characters lose all special meaning (except for `<LF>`).
- If a quote (`"`), toggle the quote flag. If the quote flag is active, then only `"` and `<LF>` are special. All other characters lose their special meaning until the next quote toggles the quote flag off. It is not possible to escape the closing quote. All quoted characters are always within the same token.
- `<LF>` always turns off the quote flag. Other behaviors vary depending on context, but quotes never alter the behavior of `<LF>`.
  - Escaped `<LF>`
    - `<LF>` is stripped
    - The next character is escaped. If at the end of line buffer, then the next line is read and processed by phases 1 and 1.5 and appended to the current one before escaping the next character. If the next character is `<LF>`, then it is treated as a literal, meaning this process is not recursive.
  - Unescaped `<LF>` not within parentheses
    - `<LF>` is stripped and parsing of the current line is terminated.
    - Any remaining characters in the line buffer are simply ignored.
  - Unescaped `<LF>` within a FOR IN parenthesized block
    - `<LF>` is converted into a `<space>`
    - If at the end of the line buffer, then the next line is read and appended to the current one.
  - Unescaped `<LF>` within a parenthesized command block
    - `<LF>` is converted into `<LF><space>`, and the `<space>` is treated as part of the next line of the command block.
    - If at the end of line buffer, then the next line is read and appended to the space.
- If one of the special characters `&` `|` `<` or `>`, split the line at this point in order to handle pipes, command concatenation, and redirection.
  - In the case of a pipe (`|`), each side is a separate command (or command block) that gets special handling in phase 5.3
  - In the case of `&`, `&&`, or `||` command concatenation, each side of the concatenation is treated as a separate command.
  - In the case of `<`, `<<`, `>`, or `>>` redirection, the redirection clause is parsed, temporarily removed, and then appended to the end of the current command. A redirection clause consists of an optional file handle digit, the redirection operator, and the redirection destination token.
    - If the token that precedes the redirection operator is a single unescaped digit, then the digit specifies the file handle to be redirected. If the handle token is not found, then output redirection defaults to 1 (stdout), and input redirection defaults to 0 (stdin).
- If the very first token for this command (prior to moving redirection to the end) begins with `@`, then the `@` has special meaning. (`@` is not special in any other context)
  - The special `@` is removed.
  - If ECHO is ON, then this command, along with any following concatenated commands on this line, are excluded from the phase 3 echo. If the `@` is before an opening `(`, then the entire parenthesized block is excluded from the phase 3 echo.
- Process parenthesis (provides for compound statements across multiple lines):
  - If the parser is not looking for a command token, then `(` is not special.
  - If the parser is looking for a command token and finds `(`, then start a new compound statement and increment the parenthesis counter
  - If the parenthesis counter is > 0 then `)` terminates the compound statement and decrements the parenthesis counter.
  - If the line end is reached and the parenthesis counter is > 0 then the next line will be appended to the compound statement (starts again with phase 0)
  - If the parenthesis counter is 0 and the parser is looking for a command, then `)` functions similar to a `REM` statement as long as it is immediately followed by a token delimiter, special character, newline, or end-of-file
    - All special characters lose their meaning except `^` (line concatenation is possible)
    - Once the end of the logical line is reached, the entire "command" is discarded.
- Each command is parsed into a series of tokens. The first token is always treated as a command token (after special `@` have been stripped and redirection moved to the end).
  - Leading token delimiters prior to the command token are stripped
  - When parsing the command token, `(` functions as a command token delimiter, in addition to the standard token delimiters
  - The handling of subsequent tokens depends on the command.
- Most commands simply concatenate all arguments after the command token into a single argument token. All argument token delimiters are preserved. Argument options are typically not parsed until phase 7.
- Three commands get special handling - IF, FOR, and REM
  - IF is split into two or three distinct parts that are processed independently. A syntax error in the IF construction will result in a fatal syntax error.
    - The comparison operation is the actual command that flows all the way through to phase 7
      - All IF options are fully parsed in phase 2.
      - Consecutive token delimiters collapse into a single space.
      - Depending on the comparison operator, there will be one or two value tokens that are identified.
    - The True command block is the set of commands after the condition, and is parsed like any other command block. If ELSE is to be used, then the True block must be parenthesized.
    - The optional False command block is the set of commands after ELSE. Again, this command block is parsed normally.
    - The True and False command blocks do not automatically flow into the subsequent phases. Their subsequent processing is controlled by phase 7.
  - FOR is split in two after the DO. A syntax error in the FOR construction will result in a fatal syntax error.
    - The portion through DO is the actual FOR iteration command that flows all the way through phase 7
      - All FOR options are fully parsed in phase 2.
      - The IN parenthesized clause treats `<LF>` as `<space>`. After the IN clause is parsed, all tokens are concatenated together to form a single token.
      - Consecutive unescaped/unquoted token delimiters collapse into a single space throughout the FOR command through DO.
    - The portion after DO is a command block that is parsed normally. Subsequent processing of the DO command block is controlled by the iteration in phase 7.
  - REM detected in phase 2 is treated dramatically differently than all other commands.
    - Only one argument token is parsed - the parser ignores characters after the first argument token.
    - The REM command may appear in phase 3 output, but the command is never executed, and the original argument text is echoed - escaping carets are not removed, except...
      - If there is only one argument token that ends with an unescaped `^` that ends the line, then the argument token is thrown away, and the subsequent line is parsed and appended to the REM. This repeats until there is more than one token, or the last character is not `^`.
- If the command token begins with `:`, and this is the first round of phase 2 (not a restart due to CALL in phase 6) then
  - The token is normally treated as an Unexecuted Label.
    - The remainder of the line is parsed, however `)`, `<`, `>`, `&` and `|` no longer have special meaning. The entire remainder of the line is considered to be part of the label "command".
    - The `^` continues to be special, meaning that line continuation can be used to append the subsequent line to the label.
    - An Unexecuted Label within a parenthesized block will result in a fatal syntax error unless it is immediately followed by a command or Executed Label on the next line.
      - `(` no longer has special meaning for the first command that follows the Unexecuted Label.
    - The command is aborted after label parsing is complete. Subsequent phases do not take place for the label
  - There are three exceptions that can cause a label found in phase 2 to be treated as an Executed Label that continues parsing through phase 7.
    - There is redirection that precedes the label token, and there is a `|` pipe or `&`, `&&`, or `||` command concatenation on the line.
    - There is redirection that precedes the label token, and the command is within a parenthesized block.
    - The label token is the very first command on a line within a parenthesized block, and the line above ended with an Unexecuted Label.
  - The following occurs when an Executed Label is discovered in phase 2
    - The label, its arguments, and its redirection are all excluded from any echo output in phase 3
    - Any subsequent concatenated commands on the line are fully parsed and executed.
  - For more information about Executed Labels vs. Unexecuted Labels, see https://www.dostips.com/forum/viewtopic.php?f=3&t=3803&p=55405#p55405
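The caret and quote rules at the start of the character scan above can be sketched as a small Python model. It is only an illustration of those two rules: `strip_escapes` is a hypothetical name, `<LF>` handling, line continuation, and the splitting at `&`, `|`, `<`, `>` are not modelled, and a caret at the very end of the text is simply dropped.

```python
def strip_escapes(text):
    """Remove escaping carets outside quotes; quoted text is untouched."""
    out, in_quotes, i = [], False, 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes  # toggle the quote flag
            out.append(ch)
            i += 1
        elif ch == "^" and not in_quotes:
            # The caret is removed and the next character is kept as a
            # literal, so an escaped quote does not toggle the flag.
            if i + 1 < len(text):
                out.append(text[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_escapes('echo a^&b "c^&d" ^^'))
# prints: echo a&b "c^&d" ^
print(strip_escapes('^"a^&b'))
# prints: "a&b
```

In the first call the caret inside the quoted string survives, because only `"` and `<LF>` are special while the quote flag is active.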
Phase 3) Echo the parsed command(s): Only if the command block did not begin with `@`, and ECHO was ON at the start of the preceding step.
Phase 4) FOR `%X` variable expansion: Only if a FOR command is active and the commands after DO are being processed.
- At this point, phase 1 of batch processing will have already converted a FOR variable like `%%X` into `%X`. The command line has different percent expansion rules for phase 1. This is the reason that command lines use `%X` but batch files use `%%X` for FOR variables.
- FOR variable names are case sensitive, but `~modifiers` are not case sensitive.
- `~modifiers` take precedence over variable names. If a character following `~` is both a modifier and a valid FOR variable name, and there exists a subsequent character that is an active FOR variable name, then the character is interpreted as a modifier.
- FOR variable names are global, but only within the context of a DO clause. If a routine is CALLed from within a FOR DO clause, then the FOR variables are not expanded within the CALLed routine. But if the routine has its own FOR command, then all currently defined FOR variables are accessible to the inner DO commands.
- FOR variable names can be reused within nested FORs. The inner FOR value takes precedence, but once the INNER FOR closes, then the outer FOR value is restored.
- If ECHO was ON at the start of this phase, then phase 3) is repeated to show the parsed DO commands after the FOR variables have been expanded.
---- From this point onward, each command identified in phase 2 is processed separately.
---- Phases 5 through 7 are completed for one command before moving on to the next.
Phase 5) Delayed Expansion: Only if delayed expansion is on, the command is not in a parenthesized block on either side of a pipe, and the command is not a "naked" batch script (script name without parentheses, CALL, command concatenation, or pipe).
- Each token for a command is parsed for delayed expansion independently.
  - Most commands parse two or more tokens - the command token, the arguments token, and each redirection destination token.
  - The FOR command parses the IN clause token only.
  - The IF command parses the comparison values only - either one or two, depending on the comparison operator.
- For each parsed token, first check if it contains any `!`. If not, then the token is not parsed - important for `^` characters. If the token does contain `!`, then scan each character from left to right:
  - If it is a caret (`^`) the next character has no special meaning, the caret itself is removed
  - If it is an exclamation mark, search for the next exclamation mark (carets are not observed anymore), expand to the value of the variable.
    - Consecutive opening `!` are collapsed into a single `!`
    - Any remaining unpaired `!` is removed
  - Expanding vars at this stage is "safe", because special characters are not detected anymore (even `<CR>` or `<LF>`)
  - For a more complete explanation, read the 2nd half of dbenham's answer in the same thread - Exclamation Point Phase
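The per-token scan described for Phase 5 can be modelled in a few lines of Python. This is a sketch under stated assumptions, not cmd.exe's implementation: `delayed_expand` is a hypothetical name, modifiers such as `!var:old=new!` are ignored, and treating an undefined variable as empty is an assumption that the list above does not spell out.

```python
def delayed_expand(token, env):
    """Simplified model of phase 5 for a single token."""
    if "!" not in token:
        return token  # token is not parsed at all, so carets survive
    env = {name.upper(): value for name, value in env.items()}
    out, i = [], 0
    while i < len(token):
        ch = token[i]
        if ch == "^":
            # The caret is removed; the next character is kept literally.
            if i + 1 < len(token):
                out.append(token[i + 1])
            i += 2
        elif ch == "!":
            # Consecutive opening ! collapse into a single !.
            while token.startswith("!", i + 1):
                i += 1
            end = token.find("!", i + 1)  # carets are not observed here
            if end == -1:
                i += 1  # an unpaired ! is removed
            else:
                # Assumption: an undefined variable expands to nothing.
                out.append(env.get(token[i + 1:end].upper(), ""))
                i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(delayed_expand("a^^b", {}))
# prints: a^^b
print(delayed_expand("!name! says hi^!", {"name": "jeb"}))
# prints: jeb says hi!
print(delayed_expand("50! off", {}))
# prints: 50 off
```

The first call shows why the initial check for `!` matters: without any exclamation mark the carets are left exactly as they were.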
Phase 5.3) Pipe processing: Only if commands are on either side of a pipe.
Each side of the pipe is processed independently and asynchronously.
- If command is internal to cmd.exe, or it is a batch file, or if it is a parenthesized command block, then it is executed in a new cmd.exe thread via `%comspec% /S /D /c" commandBlock"`, so the command block gets a phase restart, but this time in command line mode.
  - If a parenthesized command block, then all `<LF>` with a command before and after are converted to `<space>&`. Other `<LF>` are stripped.
- This is the end of processing for the pipe commands.
- See "Why does delayed expansion fail when inside a piped block of code?" for more about pipe parsing and processing
Phase 5.5) Execute Redirection: Any redirection that was discovered in phase 2 is now executed.
- The results of phases 4 and 5 can impact the redirection that was discovered in phase 2.
- If the redirection fails, then the remainder of the command is aborted. Note that failed redirection does not set ERRORLEVEL to 1 unless `||` is used.
Phase 6) CALL processing/Caret doubling: Only if the command token is CALL, or if the text before the first occurring standard token delimiter is CALL. If CALL is parsed from a larger command token, then the unused portion is prepended to the arguments token before proceeding.
- Scan the arguments token for an unquoted `/?`. If found anywhere within the tokens, then abort phase 6 and proceed to Phase 7, where the HELP for CALL will be printed.
- Remove the first `CALL`, so multiple CALLs can be stacked
- Double all carets
- Restart phases 1, 1.5, and 2, but do not continue to phase 3
  - Any doubled carets are reduced back to one caret as long as they are not quoted. But unfortunately, quoted carets remain doubled.
  - Phase 1 changes a bit
    - Expansion errors in step 1.2 or 1.3 abort the CALL, but the error is not fatal - batch processing continues.
  - Phase 2 tasks are altered a bit
    - Any newly appearing unquoted, unescaped redirection that was not detected in the first round of phase 2 is detected, but it is removed (including the file name) without actually performing the redirection
    - Any newly appearing unquoted, unescaped caret at the end of the line is removed without performing line continuation
    - The CALL is aborted without error if any of the following are detected
      - Newly appearing unquoted, unescaped `&` or `|`
      - The resultant command token begins with unquoted, unescaped `(`
      - The very first token after the removed CALL began with `@`
    - If the resultant command is a seemingly valid IF or FOR, then execution will subsequently fail with an error stating that `IF` or `FOR` is not recognized as an internal or external command.
    - Of course the CALL is not aborted in this 2nd round of phase 2 if the resultant command token is a label beginning with `:`.
- If the resultant command token is CALL, then restart Phase 6 (repeats until no more CALL)
- If the resultant command token is a batch script or a :label, then execution of the CALL is fully handled by the remainder of Phase 6.
  - Push the current batch script file position on the call stack so that execution can resume from the correct position when the CALL is completed.
  - Setup the %0, %1, %2, ...%N and %* argument tokens for the CALL, using all resultant tokens
  - If the command token is a label that begins with `:`, then
    - Restart Phase 5. This can impact what :label is CALLed. But since the %0 etc. tokens have already been setup, it will not alter the arguments that are passed to the CALLed routine.
    - Execute GOTO label to position the file pointer at the beginning of the subroutine (ignore any other tokens that may follow the :label). See Phase 7 for rules on how GOTO works.
    - If the :label token is missing, or the :label is not found, then the call stack is immediately popped to restore the saved file position, and the CALL is aborted.
    - If the :label happens to contain /?, then GOTO help is printed instead of searching for the :label. The file pointer does not move, such that code after the CALL is executed twice, once in the CALL context, and then again after the CALL return. See "Why CALL prints the GOTO help message in this script? And why command after that are executed twice?" for more info.
  - Else transfer control to the specified batch script.
  - Execution of the CALLed :label or script continues until either EXIT /B or end-of-file is reached, at which point the CALL stack is popped and execution resumes from the saved file position.
    Phase 7 is not executed for CALLed scripts or :labels.
- Else the result of phase 6 falls through into phase 7 for execution.
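The caret doubling in Phase 6 can be illustrated by combining two of the steps listed above: double every caret, then run the text through the Phase 2 caret and quote rules again. The Python below is a rough model only. `strip_escapes` and `call_round_trip` are hypothetical names, the input is assumed to be the arguments token as it looks after the first round of phase 2, and everything else in Phase 6 (percent expansion, redirection removal, abort conditions) is left out.

```python
def strip_escapes(text):
    """Phase 2 model: remove escaping carets outside quotes only."""
    out, in_quotes, i = [], False, 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
            i += 1
        elif ch == "^" and not in_quotes:
            if i + 1 < len(text):
                out.append(text[i + 1])  # keep the escaped character
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def call_round_trip(args):
    """Double all carets, then re-apply the phase 2 caret rules."""
    doubled = args.replace("^", "^^")  # phase 6: double all carets
    return strip_escapes(doubled)      # restarted phase 2


print(call_round_trip('echo a^b "c^d"'))
# prints: echo a^b "c^^d"
```

The unquoted caret comes back as a single caret, while the quoted one stays doubled, which matches the behavior described for the restart of phase 2.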
Phase 7) Execute: The command is executed
- 7.1 - Execute internal command - If the command token is quoted, then skip this step. Otherwise, attempt to parse out an internal command and execute.
  - The following tests are made to determine if an unquoted command token represents an internal command:
    - If the command token exactly matches an internal command, then execute it.
    - Else break the command token before the first occurrence of `+` `/` `[` `]` `<space>` `<tab>` `,` `;` or `=`
      If the preceding text is an internal command, then remember that command
      - If in command line mode, or if the command is from a parenthesized block, IF true or false command block, FOR DO command block, or involved with command concatenation, then execute the internal command
      - Else (must be a stand-alone command in batch mode) scan the current folder and the PATH for a .COM, .EXE, .BAT, or .CMD file whose base name matches the original command token
        - If the first matching file is a .BAT or .CMD, then goto 7.3.exec and execute that script
        - Else (match not found or first match is .EXE or .COM) execute the remembered internal command
    - Else break the command token before the first occurrence of `.` `\` or `:`
      If the preceding text is not an internal command, then goto 7.2
      Else the preceding text may be an internal command. Remember this command.
    - Break the command token before the first occurrence of `+` `/` `[` `]` `<space>` `<tab>` `,` `;` or `=`
      If the preceding text is a path to an existing file, then goto 7.2
      Else execute the remembered internal command.
  - If an internal command is parsed from a larger command token, then the unused portion of the command token is included in the argument list
  - Just because a command token is parsed as an internal command does not mean that it will execute successfully. Each internal command has its own rules as to how the arguments and options are parsed, and what syntax is allowed.
  - All internal commands will print help instead of performing their function if `/?` is detected. Most recognize `/?` if it appears anywhere in the arguments. But a few commands like ECHO and SET only print help if the first argument token begins with `/?`.
  - SET has some interesting semantics:
    - If a SET command has a quote before the variable name and extensions are enabled
      `set "name=content" ignored` --> value=`content`
      then the text between the first equal sign and the last quote is used as the content (first equal and last quote excluded). Text after the last quote is ignored. If there is no quote after the equal sign, then the rest of the line is used as content.
    - If a SET command does not have a quote before the name
      `set name="content" not ignored` --> value=`"content" not ignored`
      then the entire remainder of the line after the equal is used as content, including any and all quotes that may be present.
  - An IF comparison is evaluated, and depending on whether the condition is true or false, the appropriate already parsed dependent command block is processed, starting with phase 5.
  - The IN clause of a FOR command is iterated appropriately.
    - If this is a FOR /F that iterates the output of a command block, then:
      - The IN clause is executed in a new cmd.exe process via CMD /C.
      - The command block must go through the entire parsing process a second time, but this time in a command line context
      - ECHO will start out ON, and delayed expansion will usually start out disabled (dependent on the registry setting)
      - All environment changes made by the IN clause command block will be lost once the child cmd.exe process terminates
    - For each iteration:
      - The FOR variable values are defined
      - The already parsed DO command block is then processed, starting with phase 4.
  - GOTO uses the following logic to locate the :label
    - The label is parsed from the first argument token
    - The script is scanned for the next occurrence of the label
      - The scan starts from the current file position
      - If the end of file is reached, then the scan loops back to the beginning of the file and continues to the original starting point.
    - The scan stops at the first occurrence of the label that it finds, and the file pointer is set to the line immediately following the label. Execution of the script resumes from that point. Note that a successful true GOTO will immediately abort any parsed block of code, including FOR loops.
- If the label is not found, or the label token is missing, then the GOTO fails, an error message is printed, and the call stack is popped. This effectively functions as an EXIT /B, except any already parsed commands in the current command block that follow the GOTO are still executed, but in the context of the CALLer (the context that exists after EXIT /B)
- See https://www.dostips.com/forum/viewtopic.php?f=3&t=3803for a more precise description of the rules used for parsing labels.
- RENAME and COPY both accept wildcards for the source and target paths. But Microsoft does a terrible job documenting how the wildcards work, especially for the target path. A useful set of wildcard rules may be found at How does the Windows RENAME command interpret wildcards?
- The following tests are made to determine if an unquoted command token represents an internal command:
- 7.2 - Execute volume change - Else if the command token does not begin with a quote, is exactly two characters long, and the 2nd character is a colon, then change the volume
  - All argument tokens are ignored
  - If the volume specified by the first character cannot be found, then abort with an error
  - A command token of :: will always result in an error unless SUBST is used to define a volume for ::
    If SUBST is used to define a volume for ::, then the volume will be changed; it will not be treated as a label.
- 7.3 - Execute external command - Else try to treat the command as an external command.
  - If in command line mode and the command is not quoted and does not begin with a volume specification, white-space, , ; = or + then break the command token at the first occurrence of <space> , ; or = and prepend the remainder to the argument token(s).
  - If the 2nd character of the command token is a colon, then verify the volume specified by the 1st character can be found.
    If the volume cannot be found, then abort with an error.
  - If in batch mode and the command token begins with :, then goto 7.4
    Note that if the label token begins with ::, then this will not be reached because the preceding step will have aborted with an error unless SUBST is used to define a volume for ::.
  - Identify the external command to execute.
    - This is a complex process that may involve the current volume, current directory, PATH variable, PATHEXT variable, and/or file associations.
    - If a valid external command cannot be identified, then abort with an error.
  - If in command line mode and the command token begins with :, then goto 7.4
    Note that this is rarely reached because the preceding step will have aborted with an error unless the command token begins with ::, and SUBST is used to define a volume for ::, and the entire command token is a valid path to an external command.
  - 7.3.exec - Execute the external command.
- 7.4 - Ignore a label - Ignore the command and all its arguments if the command token begins with :.
  Rules in 7.2 and 7.3 may prevent a label from reaching this point.
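The GOTO label search described in 7.1 can be sketched in Python (not batch, so that it can be run anywhere). This is only a rough model: the function name goto_target, the list-of-lines representation of the script, and the simplified label matching (first word on the line, case-insensitive) are assumptions made for illustration. The real label parsing rules are more involved; see the DosTips thread mentioned in 7.1.

```python
def goto_target(lines, current, label):
    """Return the index of the line that follows the matching label, or None.

    lines   -- the script as a list of lines (hypothetical representation)
    current -- index of the line following the GOTO (the "current file position")
    label   -- label name, with or without the leading colon
    """
    wanted = label.lstrip(":").lower()
    if not wanted:
        return None  # missing label token: the GOTO fails
    total = len(lines)
    for step in range(total):
        # Scan forward from the current position and wrap around at end of file.
        index = (current + step) % total
        words = lines[index].strip().lower().split()
        if words and words[0] == ":" + wanted:
            return index + 1  # execution resumes on the line after the label
    return None  # not found: the real GOTO prints an error and pops the call stack


script = ["@echo off", ":a", "echo first", "goto a", ":a", "echo second"]
print(goto_target(script, 4, "a"))        # search starts right after "goto a"
print(goto_target(script, 6, ":a"))       # starting at end of file wraps to the top
print(goto_target(script, 0, "missing"))  # no such label
```

The first call finds the second :a because the scan begins at the current position, not at the top of the file. The second call starts past the last line, so it wraps and finds the first :a.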
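The two SET rules from 7.1 are easy to get wrong, so here is a small Python model of them. It is a sketch under assumptions, not cmd.exe's actual code: the name parse_set is made up, command extensions are assumed to be enabled, and switches such as /A and /P as well as leading white space are ignored.

```python
def parse_set(arguments):
    """Split the argument text of SET into (name, value)."""
    if arguments.startswith('"'):
        # Quote before the name: content runs from the first = to the last quote.
        name, _, rest = arguments[1:].partition("=")
        last_quote = rest.rfind('"')
        # No quote after the equal sign: the rest of the line is the content.
        value = rest if last_quote == -1 else rest[:last_quote]
    else:
        # No leading quote: everything after the first = is content, quotes included.
        name, _, value = arguments.partition("=")
    return name, value


print(parse_set('"name=content" ignored'))      # ('name', 'content')
print(parse_set('name="content" not ignored'))  # ('name', '"content" not ignored')
print(parse_set('"name=content'))               # ('name', 'content')
```

This is also why set "var=value" is the usual defensive form: trailing text after the closing quote, including stray spaces, never ends up in the value.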
Command Line Parser:
Works like the BatchLine-Parser, except:
Phase 1) Percent Expansion:
- No %*, %1 etc. argument expansion
- If var is undefined, then %var% is left unchanged.
- No special handling of %%. If var=content, then %%var%% expands to %content%.
Phase 3) Echo the parsed command(s)
- This is not performed after phase 2. It is only performed after phase 4 for the FOR DO command block.
Phase 5) Delayed Expansion: only if DelayedExpansion is enabled
- If var is undefined, then !var! is left unchanged.
Phase 7) Execute Command
- Attempts to CALL or GOTO a :label result in an error.
- As already documented in phase 7, an executed label may result in an error under different scenarios.
  - Batch executed labels can only cause an error if they begin with ::
  - Command line executed labels almost always result in an error
Parsing of integer values
There are many different contexts where cmd.exe parses integer values from strings, and the rules are inconsistent:
- SET /A
- IF
- %var:~n,m% (variable substring expansion)
- FOR /F "TOKENS=n"
- FOR /F "SKIP=n"
- FOR /L %%A in (n1 n2 n3)
- EXIT [/B] n
Details for these rules may be found at Rules for how CMD.EXE parses numbers
For anyone wishing to improve the cmd.exe parsing rules, there is a discussion topic on the DosTips forum where issues can be reported and suggestions made.
Hope it helps
Jan Erik (jeb) - Original author and discoverer of phases
Dave Benham (dbenham) - Much additional content and editing
Answer by Mike Clark
When invoking a command from a command window, tokenization of the command line arguments is not done by cmd.exe (a.k.a. "the shell"). Most often the tokenization is done by the newly formed process's C/C++ runtime, but this is not necessarily so: for example, the new process may not be written in C/C++, or it may choose to ignore argv and process the raw command line for itself (e.g. with GetCommandLine()). At the OS level, Windows passes command lines untokenized as a single string to new processes. This is in contrast to most *nix shells, where the shell tokenizes arguments in a consistent, predictable way before passing them to the newly formed process. All this means that you may experience wildly divergent argument tokenization behavior across different programs on Windows, as individual programs often take argument tokenization into their own hands.
If it sounds like anarchy, it kind of is. However, since a large number of Windows programs do utilize the Microsoft C/C++ runtime's argv, it may be generally useful to understand how the MSVCRT tokenizes arguments. Here is an excerpt:
- Arguments are delimited by white space, which is either a space or a tab.
- A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument. Note that the caret (^) is not recognized as an escape character or delimiter.
- A double quotation mark preceded by a backslash, \", is interpreted as a literal double quotation mark (").
- Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
- If an even number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\), and the double quotation mark (") is interpreted as a string delimiter.
- If an odd number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\) and the double quotation mark is interpreted as an escape sequence by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.
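To make the excerpt concrete, here is a Python sketch that applies only the rules quoted above to the argument part of a command line (everything after the program name). The function name split_args is made up for illustration, and this is not the runtime's actual source. In particular, the excerpt does not say what a doubled quote ("") inside a quoted string does, so this sketch simply treats every unescaped quote as a toggle; inputs such as "a ""b"" c" may therefore come out differently than they do from a real runtime.

```python
def split_args(cmdline):
    """Tokenize an argument string using only the rules from the excerpt."""
    args, current = [], []
    in_quotes = False
    have_arg = False          # lets an empty quoted string "" count as an argument
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Count the run of backslashes and look at what follows it.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                current.append("\\" * (count // 2))   # one backslash per pair
                if count % 2:
                    current.append('"')               # odd run: literal quote
                else:
                    in_quotes = not in_quotes         # even run: quote is a delimiter
                i = j + 1
            else:
                current.append("\\" * count)          # backslashes are literal
                i = j
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes                 # quote is dropped, mode toggles
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:                              # white space ends the argument
                args.append("".join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


print(split_args('a "b" c'))
print(split_args('a "b c" "d e'))
print(split_args(r'a \"b\" c'))
```

The three calls mirror the simpler test.exe transcripts in this answer and yield the same argument lists.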
The Microsoft "batch language" (.bat) is no exception to this anarchic environment, and it has developed its own unique rules for tokenization and escaping. It also looks like cmd.exe's command prompt does do some preprocessing of the command line argument (mostly for variable substitution and escaping) before passing the argument off to the newly executing process. You can read more about the low-level details of the batch language and cmd escaping in the excellent answers by jeb and dbenham on this page.
Let's build a simple command line utility in C and see what it says about your test cases:
#include <stdio.h>

/* Print every argument exactly as the C runtime tokenized it. */
int main(int argc, char* argv[]) {
    int i;
    for (i = 0; i < argc; i++) {
        printf("argv[%d][%s]\n", i, argv[i]);
    }
    return 0;
}
(Notes: argv[0] is always the name of the executable, and is omitted below for brevity. Tested on Windows XP SP3. Compiled with Visual Studio 2005.)
> test.exe "a ""b"" c"
argv[1][a "b" c]
> test.exe """a b c"""
argv[1]["a b c"]
> test.exe "a"" b c
argv[1][a" b c]
And a few of my own tests:
> test.exe a "b" c
argv[1][a]
argv[2][b]
argv[3][c]
> test.exe a "b c" "d e
argv[1][a]
argv[2][b c]
argv[3][d e]
> test.exe a \"b\" c
argv[1][a]
argv[2]["b"]
argv[3][c]
Answer by dbenham
Percent Expansion Rules
Here is an expanded explanation of Phase 1 in jeb's answer (valid for both batch mode and command line mode).
Phase 1) Percent Expansion
Starting from the left, scan each character for % or <LF>. If found, then
- 1.05 (truncate line at <LF>)
  - If the character is <LF> then
    - Drop (ignore) the remainder of the line from the <LF> onward
    - Goto Phase 1.5 (Strip <CR>)
  - Else the character must be %, so proceed to 1.1
- 1.1 (escape %) skipped if command line mode
  - If batch mode and followed by another % then
    Replace %% with single % and continue scan
- 1.2 (expand argument) skipped if command line mode
  - Else if batch mode then
    - If followed by * and command extensions are enabled then
      Replace %* with the text of all command line arguments (replace with nothing if there are no arguments) and continue scan.
    - Else if followed by <digit> then
      Replace %<digit> with argument value (replace with nothing if undefined) and continue scan.
    - Else if followed by ~ and command extensions are enabled then
      - If followed by optional valid list of argument modifiers followed by required <digit> then
        Replace %~[modifiers]<digit> with modified argument value (replace with nothing if not defined or if specified $PATH: modifier is not defined) and continue scan.
        Note: modifiers are case insensitive and can appear multiple times in any order, except $PATH: modifier can only appear once and must be the last modifier before the <digit>
      - Else invalid modified argument syntax raises fatal error: All parsed commands are aborted, and batch processing aborts if in batch mode!
- 1.3 (expand variable)
  - Else if command extensions are disabled then
    Look at next string of characters, breaking before % or end of buffer, and call them VAR (may be an empty list)
    - If next character is % then
      - If VAR is defined then
        Replace %VAR% with value of VAR and continue scan
      - Else if batch mode then
        Remove %VAR% and continue scan
      - Else goto 1.4
    - Else goto 1.4
  - Else if command extensions are enabled then
    Look at next string of characters, breaking before % : or end of buffer, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is % then include : as the last character in VAR and break before %.
    - If next character is % then
      - If VAR is defined then
        Replace %VAR% with value of VAR and continue scan
      - Else if batch mode then
        Remove %VAR% and continue scan
      - Else goto 1.4
    - Else if next character is : then
      - If VAR is undefined then
        - If batch mode then
          Remove %VAR: and continue scan.
        - Else goto 1.4
      - Else if next character is ~ then
        - If next string of characters matches pattern of [integer][,[integer]]% then
          Replace %VAR:~[integer][,[integer]]% with substring of value of VAR (possibly resulting in empty string) and continue scan.
        - Else goto 1.4
      - Else if followed by = or *= then
        Invalid variable search and replace syntax raises fatal error: All parsed commands are aborted, and batch processing aborts if in batch mode!
      - Else if next string of characters matches pattern of [*]search=[replace]%, where search may include any set of characters except =, and replace may include any set of characters except %, then
        Replace %VAR:[*]search=[replace]% with value of VAR after performing search and replace (possibly resulting in empty string) and continue scan
      - Else goto 1.4
- 1.4 (strip %)
  - Else If batch mode then
    Remove % and continue scan starting with the next character after the %
  - Else preserve the leading % and continue scan starting with the next character after the preserved leading %
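The phase 1 rules can be followed more easily with a small model. The Python sketch below covers only steps 1.05, 1.1, 1.4 and the extensions-disabled branch of 1.3, so there is no argument expansion (1.2), no substring or search-and-replace syntax, and no special treatment of colons. The function name percent_expand and the dictionary used as the environment are assumptions for illustration only.

```python
def percent_expand(line, env, batch_mode=True):
    """Simplified phase 1: steps 1.05, 1.1, 1.3 (extensions disabled) and 1.4."""
    table = {name.upper(): value for name, value in env.items()}  # names ignore case
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\n":                      # 1.05: drop everything from <LF> onward
            break
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if batch_mode and line.startswith("%%", i):   # 1.1: %% -> % (batch only)
            out.append("%")
            i += 2
            continue
        end = line.find("%", i + 1)         # 1.3: VAR runs up to the next %
        if end != -1:
            value = table.get(line[i + 1:end].upper())
            if value is not None:           # defined: substitute the value
                out.append(value)
                i = end + 1
                continue
            if batch_mode:                  # undefined in batch mode: remove %VAR%
                i = end + 1
                continue
        if not batch_mode:                  # 1.4: batch strips the %, command line keeps it
            out.append("%")
        i += 1
    return "".join(out)


# The two examples from the question, typed at the command prompt:
print(percent_expand("echo %a %b% c%", {"b": "a"}, batch_mode=False))             # echo %a a c%
print(percent_expand("echo %a %b% c%", {"a ": "b", "b": "a"}, batch_mode=False))  # echo bb% c%
# The same line inside a batch file, where undefined variables vanish:
print(percent_expand("echo %a %b% c%", {"b": "a"}, batch_mode=True))              # echo b
```

The model pairs percent signs strictly from left to right, which is why %a % is consumed as a variable named "a " (with the trailing space) before %b% is ever considered.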
The above helps explain why this batch
@echo off
setlocal enableDelayedExpansion
set "1var=varA"
set "~f1var=varB"
call :test "arg1"
exit /b
::
:test "arg1"
echo %%1var%% = %1var%
echo ^^^!1var^^^! = !1var!
echo --------
echo %%~f1var%% = %~f1var%
echo ^^^!~f1var^^^! = !~f1var!
exit /b
Gives these results:
%1var% = "arg1"var
!1var! = varA
--------
%~f1var% = P:\arg1var
!~f1var! = varB
Note 1 - Phase 1 occurs prior to the recognition of REM statements. This is very important because it means even a remark can generate a fatal error if it has invalid argument expansion syntax or invalid variable search and replace syntax!
@echo off
rem %~x This generates a fatal argument expansion error
echo this line is never reached
Note 2 - Another interesting consequence of the % parsing rules: Variables containing : in the name can be defined, but they cannot be expanded unless command extensions are disabled. There is one exception: a variable name containing a single colon at the end can be expanded while command extensions are enabled. However, you cannot perform substring or search and replace operations on variable names ending with a colon. The batch file below (courtesy of jeb) demonstrates this behavior
@echo off
setlocal
set var=content
set var:=Special
set var::=double colon
set var:~0,2=tricky
set var::~0,2=unfortunate
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%
echo Now with DisableExtensions
setlocal DisableExtensions
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%
Note 3 - An interesting outcome of the order of the parsing rules that jeb lays out in his post: When performing find and replace with delayed expansion, special characters in both the find and replace terms must be escaped or quoted. But the situation is different for percent expansion: the find term must not be escaped (though it can be quoted). The percent replace string may or may not require escape or quote, depending on your intent.
@echo off
setlocal enableDelayedExpansion
set "var=this & that"
echo %var:&=and%
echo "%var:&=and%"
echo !var:^&=and!
echo "!var:&=and!"
Delayed Expansion Rules
Here is an expanded and more accurate explanation of phase 5 in jeb's answer (valid for both batch mode and command line mode).
Phase 5) Delayed Expansion
This phase is skipped if any of the following conditions apply:
- Delayed expansion is disabled.
- The command is within a parenthesized block on either side of a pipe.
- The incoming command token is a "naked" batch script, meaning it is not associated with CALL, a parenthesized block, any form of command concatenation (&, && or ||), or a pipe |.
The delayed expansion process is applied to tokens independently. A command may have multiple tokens:
- The command token. For most commands the command name itself is a token. But a few commands have specialized regions that are considered a TOKEN for phase 5.
  - for ... in(TOKEN) do
  - if defined TOKEN
  - if exist TOKEN
  - if errorlevel TOKEN
  - if cmdextversion TOKEN
  - if TOKEN comparison TOKEN, where comparison is one of ==, equ, neq, lss, leq, gtr, or geq
- The arguments token
- The destination token of redirection (one per redirection)
No change is made to tokens that do not contain !.
For each token that does contain at least one !, scan each character from left to right for ^ or !, and if found, then
- 5.1 (caret escape) Needed for ! or ^ literals
  - If character is a caret ^ then
    - Remove the ^
    - Scan the next character and preserve it as a literal
    - Continue the scan
- 5.2 (expand variable)
  - If character is !, then
    - If command extensions are disabled then
      Look at next string of characters, breaking before ! or <LF>, and call them VAR (may be an empty list)
      - If next character is ! then
        - If VAR is defined, then
          Replace !VAR! with value of VAR and continue scan
        - Else if batch mode then
          Remove !VAR! and continue scan
        - Else goto 5.2.1
      - Else goto 5.2.1
    - Else if command extensions are enabled then
      Look at next string of characters, breaking before !, :, or <LF>, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is ! then include : as the last character in VAR and break before !
      - If next character is ! then
        - If VAR exists, then
          Replace !VAR! with value of VAR and continue scan
        - Else if batch mode then
          Remove !VAR! and continue scan
        - Else goto 5.2.1
      - Else if next character is : then
        - If VAR is undefined then
          - If batch mode then
            Remove !VAR: and continue scan
          - Else goto 5.2.1
        - Else if next character is ~ then
          - If next string of characters matches pattern of [integer][,[integer]]! then
            Replace !VAR:~[integer][,[integer]]! with substring of value of VAR (possibly resulting in empty string) and continue scan.
          - Else goto 5.2.1
        - Else if next string of characters matches pattern of [*]search=[replace]!, where search may include any set of characters except =, and replace may include any set of characters except !, then
          Replace !VAR:[*]search=[replace]! with value of VAR after performing search and replace (possibly resulting in an empty string) and continue scan
        - Else goto 5.2.1
      - Else goto 5.2.1
    - 5.2.1
      - If batch mode then remove the leading !
        Else preserve the leading !
      - Continue the scan starting with the next character after the preserved leading !
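As with phase 1, a reduced model may help. The Python sketch below handles one token and covers 5.1, 5.2.1 and the extensions-disabled branch of 5.2 only (no substring or search-and-replace forms, no colon handling). The name delayed_expand and the dictionary environment are assumptions for illustration. The input is the token as it looks when it reaches phase 5, that is, after the earlier phases have already consumed their own carets.

```python
def delayed_expand(token, env, batch_mode=True):
    """Simplified phase 5 for a single token."""
    if "!" not in token:
        return token                        # tokens without ! are left untouched
    table = {name.upper(): value for name, value in env.items()}
    out = []
    i = 0
    while i < len(token):
        ch = token[i]
        if ch == "^":                       # 5.1: drop the caret, keep the next character
            if i + 1 < len(token):
                out.append(token[i + 1])
            i += 2
        elif ch == "!":                     # 5.2: VAR runs up to the next !
            end = token.find("!", i + 1)
            value = table.get(token[i + 1:end].upper()) if end != -1 else None
            if value is not None:
                out.append(value)           # defined: substitute the value
                i = end + 1
            elif end != -1 and batch_mode:
                i = end + 1                 # undefined in batch mode: remove !VAR!
            else:                           # 5.2.1
                if not batch_mode:
                    out.append("!")         # command line mode keeps the leading !
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# What "echo ^^^!1var^^^! = !1var!" from the batch example looks like on arrival at phase 5:
print(delayed_expand("^!1var^! = !1var!", {"1var": "varA"}))     # !1var! = varA
print(delayed_expand("!undefined!", {}, batch_mode=False))       # !undefined!
print(delayed_expand("a^b", {}))                                 # a^b (no ! in the token)
```

The last call shows a consequence of the "no change to tokens without !" rule: a caret is only consumed by phase 5 when the token contains at least one exclamation mark.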
Answer by bobbogo
As pointed out, commands are passed the entire argument string in μSoft land, and it is up to them to parse this into separate arguments for their own use. There is no consistency in this between different programs, and therefore there is no one set of rules to describe this process. You really need to check each corner case for whatever C library your program uses.
As far as the system .bat files go, here is that test:
c> type args.cmd
@echo off
echo cmdcmdline:[%cmdcmdline%]
echo 0:[%0]
echo *:[%*]
set allargs=%*
if not defined allargs goto :eof
setlocal
@rem Wot about a nice for loop?
@rem Then we are in the land of delayedexpansion, !n!, call, etc.
@rem Plays havoc with args like %t%, a"b etc. ugh!
set n=1
:loop
echo %n%:[%1]
set /a n+=1
shift
set param=%1
if defined param goto :loop
endlocal
Now we can run some tests. See if you can figure out just what μSoft are trying to do:
C>args a b c
cmdcmdline:[cmd.exe ]
0:[args]
*:[a b c]
1:[a]
2:[b]
3:[c]
Fine so far. (I'll leave out the uninteresting %cmdcmdline% and %0 from now on.)
C>args *.*
*:[*.*]
1:[*.*]
No filename expansion.
C>args "a b" c
*:["a b" c]
1:["a b"]
2:[c]
No quote stripping, though quotes do prevent argument splitting.
c>args ""a b" c
*:[""a b" c]
1:[""a]
2:[b" c]
Consecutive double quotes cause them to lose any special parsing abilities they may have had. @Benoit's example:
C>args "a """ b "" c"""
*:["a """ b "" c"""]
1:["a """]
2:[b]
3:[""]
4:[c"""]
Quiz: How do you pass the value of any environment var as a single argument (i.e., as %1) to a bat file?
c>set t=a "b c
c>set t
t=a "b c
c>args %t%
1:[a]
2:["b c]
c>args "%t%"
1:["a "b]
2:[c"]
c>Aaaaaargh!
Sane parsing seems forever broken.
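The transcripts above are consistent with a very simple rule: every double quote toggles a "quoted" state and is kept in the parameter, and a delimiter only splits parameters while that state is off. The Python sketch below encodes that reading. It is a guess fitted to the observed output, not a description of cmd.exe internals; the name batch_params is made up, and the delimiter set (space, tab, comma, semicolon, equals) is an assumption that the transcripts only partly exercise.

```python
DELIMITERS = " \t,;="   # assumed parameter delimiters; only the space is shown above


def batch_params(arg_text):
    """Split the text after the script name into %1, %2, ... (quotes are kept)."""
    params, current = [], ""
    in_quotes = False
    for ch in arg_text:
        if ch == '"':
            in_quotes = not in_quotes       # every quote toggles, none is stripped
            current += ch
        elif ch in DELIMITERS and not in_quotes:
            if current:                     # a delimiter outside quotes ends the parameter
                params.append(current)
                current = ""
        else:
            current += ch
    if current:
        params.append(current)
    return params


print(batch_params('"a b" c'))
print(batch_params('""a b" c'))
print(batch_params('"a """ b "" c"""'))
print(batch_params('"a "b c"'))      # args "%t%" with t=a "b c
```

Each call reproduces the parameter list printed by args.cmd for the corresponding test, including the quiz case where wrapping %t% in quotes makes things worse because the quote inside the value flips the state back off.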
For your entertainment, try adding miscellaneous ^, \, ', & (&c.) characters to these examples.
Answer by SS64
You have some great answers above already, but to answer one part of your question:
set a =b, echo %a %b% c% → bb c%
What is happening there is that, because you have a space before the =, a variable is created called %a<space>%, so when you echo %a % it is evaluated correctly as b.
The remaining part b% c% is then evaluated as plain text + an undefined variable % c%, which should be echoed as typed. For me, echo %a %b% c% returns bb% c%
I suspect that the ability to include spaces in variable names is more of an oversight than a planned 'feature'
Answer by Benoit
edit: see accepted answer, what follows is wrong and explains only how to pass a command line to TinyPerl.
Regarding quotes, I have the feeling that the behaviour is the following:
- when a " is found, string globbing begins
- when string globbing occurs:
  - every character that is not a " is globbed
  - when a " is found:
    - if it is followed by "" (thus a triple ") then a double quote is added to the string
    - if it is followed by " (thus a double ") then a double quote is added to the string and string globbing ends
    - if the next character is not ", string globbing ends
  - when line ends, string globbing ends.
In short:
"a """ b "" c""" consists of two strings: a " b " and c"
"a"", "a""" and "a"""" are all the same string if at the end of a line