What characters are forbidden in Windows and Linux directory names?
Original source: http://stackoverflow.com/questions/1976007/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Jeff
I know that / is illegal in Linux, and the following are illegal in Windows (I think):
* . " / \ [ ] : ; | ,
What else am I missing?
I need a comprehensive guide, however, and one that takes into account double-byte characters. Linking to outside resources is fine with me.
I need to first create a directory on the filesystem using a name that may contain forbidden characters, so I plan to replace those characters with underscores. I then need to write this directory and its contents to a zip file (using Java), so any additional advice concerning the names of zip directories would be appreciated.
Accepted answer by Dour High Arch
A “comprehensive guide” of forbidden filename characters is not going to work on Windows because it reserves filenames as well as characters. Yes, characters like * " ? and others are forbidden, but there are an infinite number of names composed only of valid characters that are forbidden. For example, spaces and dots are valid filename characters, but names composed only of those characters are forbidden.
Windows does not distinguish between upper-case and lower-case characters, so you cannot create a folder named A if one named a already exists. Worse, seemingly-allowed names like PRN and CON, and many others, are reserved and not allowed. Windows also has several length restrictions; a filename valid in one folder may become invalid if moved to another folder. The rules for naming files and folders are on the Microsoft docs.
You cannot, in general, use user-generated text to create Windows directory names. If you want to allow users to name anything they want, you have to create safe names like A, AB, A2 et al., store user-generated names and their path equivalents in an application data file, and perform path mapping in your application.
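The mapping approach described above might look roughly like the following Python sketch. The helper names (safe_name_for, save_mapping, load_mapping), the D0001-style naming scheme, and the choice of a JSON file as the "application data file" are all assumptions made for illustration; the answer does not prescribe any of them.

```python
import json
from pathlib import Path


def safe_name_for(user_name: str, mapping: dict) -> str:
    """Return a generated directory name for user_name (hypothetical helper).

    The generated names (D0001, D0002, ...) use only ASCII letters and digits,
    so they avoid forbidden characters and the reserved device names.
    """
    if user_name not in mapping:
        mapping[user_name] = f"D{len(mapping) + 1:04d}"
    return mapping[user_name]


def save_mapping(mapping: dict, path: Path) -> None:
    """Persist the user-name -> safe-name table as JSON (assumed storage format)."""
    path.write_text(json.dumps(mapping, ensure_ascii=False, indent=2), encoding="utf-8")


def load_mapping(path: Path) -> dict:
    """Load the table written by save_mapping, or start empty if it is missing."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))
```

The application would then display the user's name everywhere and only touch the file system through the generated name.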
If you absolutely must allow user-generated folder names, the only way to tell if they are invalid is to catch exceptions and assume the name is invalid. Even that is fraught with peril, as the exceptions thrown for denied access, offline drives, and out of drive space overlap with those that can be thrown for invalid names. You are opening up one huge can of hurt.
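As a rough illustration of the "catch exceptions" approach, here is a minimal Python sketch. The function name try_mkdir and its return values are hypothetical. Note how the same exception family covers invalid names, denied access, and full or offline drives, which is exactly the ambiguity the answer warns about.

```python
import os


def try_mkdir(parent: str, name: str) -> str:
    """Try to create parent/name; return "ok", "exists", or "failed"."""
    try:
        os.mkdir(os.path.join(parent, name))
        return "ok"
    except FileExistsError:
        return "exists"
    except (OSError, ValueError):
        # OSError also covers permission, disk-space, and offline-drive errors,
        # so "failed" only *suggests* that the name is invalid.
        # ValueError is what Python raises for an embedded NUL character.
        return "failed"
```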
Answer by Leonardo Herrera
Well, if only for research purposes, then your best bet is to look at this Wikipedia entry on Filenames.
If you want to write a portable function to validate user input and create filenames based on that, the short answer is don't. Take a look at a portable module like Perl's File::Spec to get a glimpse of all the hoops needed to accomplish such a "simple" task.
Answer by Jonathan Leffler
Under Linux and other Unix-related systems, there are only two characters that cannot appear in the name of a file or directory, and those are NUL '\0' and slash '/'. The slash, of course, can appear in a path name, separating directory components.
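The two-character rule is simple enough to sketch as a check on a single path component. The function name is hypothetical, and rejecting the empty string, "." and ".." is an addition of this sketch (they are directory entries rather than names you can create), not part of the rule as stated.

```python
def is_valid_unix_component(name: str) -> bool:
    """True if name could be a single file or directory name on Linux/Unix."""
    if name in ("", ".", ".."):
        # Assumption of this sketch: these cannot be created as new entries.
        return False
    # The only two forbidden characters: slash and NUL.
    return "/" not in name and "\0" not in name
```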
Rumour¹ has it that Steven Bourne (of 'shell' fame) had a directory containing 254 files, one for every single letter (character code) that can appear in a file name (excluding / and '\0'; the name . was the current directory, of course). It was used to test the Bourne shell and routinely wrought havoc on unwary programs such as backup programs.
Other people have covered the Windows rules.
Note that MacOS X has a case-insensitive file system.
¹ It was Kernighan & Pike, in The Practice of Programming, who said as much in Chapter 6, Testing, §6.5 Stress Tests:
When Steve Bourne was writing his Unix shell (which came to be known as the Bourne shell), he made a directory of 254 files with one-character names, one for each byte value except '\0' and slash, the two characters that cannot appear in Unix file names. He used that directory for all manner of tests of pattern-matching and tokenization. (The test directory was of course created by a program.) For years afterwards, that directory was the bane of file-tree-walking programs; it tested them to destruction.
Note that the directory must have contained entries . and .., so it was arguably 253 files (and 2 directories), or 255 name entries, rather than 254 files. This doesn't affect the effectiveness of the anecdote, or the careful testing it describes.
Answer by AeonOfTime
Instead of creating a blacklist of characters, you could use a whitelist. All things considered, the range of characters that make sense in a file or directory name context is quite short, and unless you have some very specific naming requirements your users will not hold it against your application if they cannot use the whole ASCII table.
It does not solve the problem of reserved names in the target file system, but with a whitelist it is easier to mitigate the risks at the source.
In that spirit, this is a range of characters that can be considered safe:
- Letters (a-z A-Z) - Unicode characters as well, if needed
- Digits (0-9)
- Underscore (_)
- Hyphen (-)
- Space
- Dot (.)
And any additional safe characters you wish to allow. Beyond this, you just have to enforce some additional rules regarding spaces and dots. This is usually sufficient:
- Name must contain at least one letter or number (to avoid only dots/spaces)
- Name must start with a letter or number (to avoid leading dots/spaces)
- Name may not end with a dot or space (simply trim those if present, like Explorer does)
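A whitelist with these rules can be sketched in a few lines of Python. The function name sanitize_whitelist and the underscore replacement are assumptions for illustration, and the sketch restricts itself to ASCII letters; extending the character class would be needed if Unicode letters are allowed.

```python
import re

# Everything outside the whitelist: letters, digits, underscore, hyphen, space, dot.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_\- .]")
_FIRST_ALNUM = re.compile(r"[A-Za-z0-9]")


def sanitize_whitelist(name: str, replacement: str = "_"):
    """Return a cleaned name, or None if no letter or digit is left."""
    cleaned = _DISALLOWED.sub(replacement, name)
    cleaned = cleaned.rstrip(" .")  # may not end with a dot or space
    first = _FIRST_ALNUM.search(cleaned)
    if first is None:
        return None  # must contain at least one letter or number
    return cleaned[first.start():]  # must start with a letter or number
```

Returning None rather than inventing a fallback name leaves the decision (reject, or prompt again) to the caller.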
This already allows quite complex and nonsensical names. For example, these names would be possible with these rules, and be valid file names in Windows/Linux:
A...........ext
B -.- .ext
In essence, even with so few whitelisted characters you should still decide what actually makes sense, and validate/adjust the name accordingly. In one of my applications, I used the same rules as above but stripped any duplicate dots and spaces.
Answer by Christopher Oezbek
Let's keep it simple and answer the question, first.
The forbidden printable ASCII characters are:
Linux/Unix:
- / (forward slash)
Windows:
- < (less than)
- > (greater than)
- : (colon - sometimes works, but is actually NTFS Alternate Data Streams)
- " (double quote)
- / (forward slash)
- \ (backslash)
- | (vertical bar or pipe)
- ? (question mark)
- * (asterisk)
Non-printable characters
If your data comes from a source that would permit non-printable characters then there is more to check for.
Linux/Unix:
- 0 (NUL byte)
Windows:
- 0-31 (ASCII control characters)
Note: While it is legal under Linux/Unix file systems to create files with control characters in the filename, it might be a nightmare for the users to deal with such files.
Reserved file names
The following filenames are reserved:
Windows:
CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (both on their own and with arbitrary file extensions, e.g. LPT1.txt).
Other rules
Windows:
Filenames cannot end in a space or dot.
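The Windows rules listed above could be checked with something like the following Python sketch. The names windows_name_problems and replace_forbidden are hypothetical, and comparing reserved names case-insensitively is an assumption of this sketch (based on Windows not distinguishing case). It covers only the rules in this answer, not length limits or anything else in the Microsoft documentation.

```python
import re

# < > : " / \ | ? * plus ASCII control characters 0-31.
WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

WINDOWS_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def windows_name_problems(name: str) -> list:
    """Return a list of rule violations for a single file or directory name."""
    problems = []
    if WINDOWS_FORBIDDEN.search(name):
        problems.append("forbidden character")
    # Reserved both on their own and with any extension, e.g. LPT1.txt.
    if name.split(".")[0].upper() in WINDOWS_RESERVED:
        problems.append("reserved name")
    if name.endswith((" ", ".")):
        problems.append("ends with space or dot")
    return problems


def replace_forbidden(name: str, replacement: str = "_") -> str:
    """Replace each forbidden character with an underscore, as the question plans."""
    return WINDOWS_FORBIDDEN.sub(replacement, name)
```

Replacing characters does not fix reserved names or trailing dots, so a name should still be passed through windows_name_problems after replacement.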
Answer by chrisjej
The easy way to get Windows to tell you the answer is to attempt to rename a file via Explorer and type in / for the new name. Windows will pop up a message box telling you the list of illegal characters.
A filename cannot contain any of the following characters:
\ / : * ? " < > |
Answer by Meng Lu
I had the same need and was looking for recommendations or standard references and came across this thread. My current blacklist of characters that should be avoided in file and directory names is:
$CharactersInvalidForFileName = {
"pound" -> "#",
"left angle bracket" -> "<",
"dollar sign" -> "$",
"plus sign" -> "+",
"percent" -> "%",
"right angle bracket" -> ">",
"exclamation point" -> "!",
"backtick" -> "`",
"ampersand" -> "&",
"asterisk" -> "*",
"single quotes" -> "'",
"pipe" -> "|",
"left bracket" -> "{",
"question mark" -> "?",
"double quotes" -> "\"",
"equal sign" -> "=",
"right bracket" -> "}",
"forward slash" -> "/",
"colon" -> ":",
"back slash" -> "\\",
"blank spaces" -> " ",
"at sign" -> "@"
};
Answer by Dogg Bookins
Though the only illegal Unix chars might be / and NULL, some consideration for command line interpretation should be included.
For example, while it might be legal to name a file 1>&2 or 2>&1 in Unix, file names such as this might be misinterpreted when used on a command line.
Similarly it might be possible to name a file $PATH, but when trying to access it from the command line, the shell will translate $PATH to its variable value.
Answer by forthy42
In Unix shells, you can quote almost every character in single quotes '. The exception is the single quote itself, and you can't express control characters, because \ is not expanded. Accessing the single quote itself from within a quoted string is possible, because you can concatenate strings with single and double quotes, like 'I'"'"'m', which can be used to access a file called "I'm" (double quote also possible here).
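If the file name is being passed to a shell from a program, the same concatenation trick does not have to be written by hand. As a small sketch, Python's standard shlex.quote produces this style of quoting; the wrapper name cat_command is hypothetical and only there to show the quoted name inside a command line.

```python
import shlex


def cat_command(filename: str) -> str:
    """Build a POSIX shell command line that prints the given file."""
    # shlex.quote wraps the name in single quotes and splices in any
    # embedded single quote as '"'"', as described above.
    return "cat " + shlex.quote(filename)
```

For a file called I'm this yields the command text cat 'I'"'"'m', and names such as $PATH or 2>&1 come out single-quoted so the shell treats them literally.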
So you should avoid all control characters, because they are too difficult to enter in the shell. The rest still is funny, especially files starting with a dash, because most commands read those as options unless you have two dashes -- before, or you specify them with ./, which also hides the starting -.
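The ./ trick for names with a leading dash is easy to apply mechanically before handing a name to an external command. A tiny sketch, with a hypothetical function name:

```python
def protect_leading_dash(path: str) -> str:
    """Prefix ./ to a relative name that starts with '-' so it is not read as an option."""
    if path.startswith("-"):
        return "./" + path
    return path
```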
If you want to be nice, don't use any of the characters the shell and typical commands use as syntactical elements, sometimes position dependent, so e.g. you can still use -, but not as first character; same with ., you can use it as first character only when you mean it ("hidden file"). When you are mean, your file names are VT100 escape sequences ;-), so that an ls garbles the output.
Answer by FCastro
As of 18/04/2017, no simple black or white list of characters and filenames is evident among the answers to this topic - and there are many replies.
The best suggestion I could come up with was to let the user name the file however he likes. Using an error handler when the application tries to save the file, catch any exceptions, assume the filename is to blame (obviously after making sure the save path was ok as well), and prompt the user for a new file name. For best results, place this checking procedure within a loop that continues until either the user gets it right or gives up. Worked best for me (at least in VBA).
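The answer's loop was written in VBA; a rough Python equivalent might look like the sketch below. The function save_with_retry and its two callback parameters are hypothetical: ask_name stands in for whatever prompts the user (returning None when the user gives up), and save stands in for the application's own save routine.

```python
from typing import Callable, Optional


def save_with_retry(
    ask_name: Callable[[], Optional[str]],
    save: Callable[[str], None],
) -> Optional[str]:
    """Keep prompting until save() succeeds or the user gives up."""
    while True:
        name = ask_name()
        if name is None:
            return None  # user gave up
        try:
            save(name)
            return name  # saved under this name
        except (OSError, ValueError):
            # Assume the file name is to blame and ask again.
            continue
```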