C#: How to check whether a given string is a legal/valid file name under Windows

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/62771/

Date: 2020-08-03 10:21:28  Source: igfitidea

How do I check if a given string is a legal/valid file name under Windows?

Asked by tomash

I want to include batch file rename functionality in my application. A user can type a destination filename pattern and (after replacing some wildcards in the pattern) I need to check whether the result is going to be a legal filename under Windows. I've tried to use a regular expression like [a-zA-Z0-9_]+ but it doesn't include many national-specific characters from various languages (e.g. umlauts and so on). What is the best way to do such a check?

Accepted answer by Eugene Katz

You can get a list of invalid characters from Path.GetInvalidPathChars and GetInvalidFileNameChars.

UPD: See Steve Cooper's suggestion on how to use these in a regular expression.

UPD2: Note that according to the Remarks section in MSDN "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." The answer provided by sixlettervaliables goes into more detail.
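
As a rough illustration of the first suggestion, the array returned by Path.GetInvalidFileNameChars can be passed straight to string.IndexOfAny, which avoids building a regular expression. The class and method names below are hypothetical, and because the array is platform-specific and not guaranteed to be complete, this is only a first filter, not a full validity check.

```csharp
using System.IO;

static class FileNameChars
{
    // Hypothetical helper: true when the name contains none of the characters
    // that Path.GetInvalidFileNameChars() reports for the current platform.
    // Assumption: an empty or null name is treated as not usable.
    public static bool HasOnlyAllowedChars(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;
        return name.IndexOfAny(Path.GetInvalidFileNameChars()) < 0;
    }
}
```

A name that passes this test can still be rejected by Windows for other reasons (reserved device names, trailing periods, path length), which the later answers cover.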

Answer by ConroyP

Rather than explicitly include all possible characters, you could do a regex to check for the presence of illegal characters, and report an error then. Ideally your application should name the files exactly as the user wishes, and only cry foul if it stumbles across an error.

Answer by Mark Biek

From MSDN, here's a list of characters that aren't allowed:

Use almost any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255), except for the following:

  • The following reserved characters are not allowed: < > : " / \ | ? *
  • Characters whose integer representations are in the range from zero through 31 are not allowed.
  • Any other character that the target file system does not allow.
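
The first two bullets can be checked without consulting the framework at all. The following is a minimal sketch with hypothetical names; the third bullet (whatever else the target file system forbids) cannot be known in advance and is deliberately not covered.

```csharp
using System;

static class ReservedChars
{
    // The reserved characters quoted from MSDN above.
    private static readonly char[] Reserved =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    // Hypothetical helper: true if the name contains a reserved character
    // or a character whose integer value is in the range 0 through 31.
    public static bool ContainsForbiddenChar(string name)
    {
        foreach (char c in name)
        {
            if (c < 32) return true;                        // control characters 0-31
            if (Array.IndexOf(Reserved, c) >= 0) return true;
        }
        return false;
    }
}
```

Because everything else is allowed, names with umlauts or other non-ASCII letters pass, which is the behavior the question asks for.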

Answer by Justin Poliey

Windows filenames are pretty unrestrictive, so really it might not even be that much of an issue. The characters that are disallowed by Windows are:

\ / : * ? " < > |

You could easily write an expression to check if those characters are present. A better solution though would be to try and name the files as the user wants, and alert them when a filename doesn't stick.

Answer by Justin Poliey

Also CON, PRN, AUX, NUL, COM# and a few others are never legal filenames in any directory with any extension.
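
A check for these names might look like the sketch below. The answer only lists CON, PRN, AUX, NUL and "COM#"; expanding that to COM1-COM9 and LPT1-LPT9 is an assumption based on the commonly documented device names, and the class and method names are hypothetical. Treating everything before the first period as the base name is also an assumption about how "with any extension" is applied.

```csharp
using System;
using System.Collections.Generic;

static class DeviceNames
{
    private static readonly HashSet<string> Reserved = BuildSet();

    private static HashSet<string> BuildSet()
    {
        // Case-insensitive, since Windows file names are case-insensitive.
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
            { "CON", "PRN", "AUX", "NUL" };
        for (int i = 1; i <= 9; i++)
        {
            set.Add("COM" + i);   // assumption: COM1..COM9
            set.Add("LPT" + i);   // assumption: LPT1..LPT9
        }
        return set;
    }

    // Hypothetical helper: compares only the part before the first period,
    // so "aux.txt" is reported as reserved just like "AUX".
    public static bool IsReservedDeviceName(string fileName)
    {
        int dot = fileName.IndexOf('.');
        string baseName = dot < 0 ? fileName : fileName.Substring(0, dot);
        return Reserved.Contains(baseName);
    }
}
```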

Answer by Martin Faartoft

Microsoft Windows: Windows kernel forbids the use of characters in range 1-31 (i.e., 0x01-0x1F) and characters " * : < > ? \ |. Although NTFS allows each path component (directory or filename) to be 255 characters long and paths up to about 32767 characters long, the Windows kernel only supports paths up to 259 characters long. Additionally, Windows forbids the use of the MS-DOS device names AUX, CLOCK$, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, CON, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9, NUL and PRN, as well as these names with any extension (for example, AUX.txt), except when using Long UNC paths (e.g. \\.\C:\nul.txt or \\?\D:\aux\con). (In fact, CLOCK$ may be used if an extension is provided.) These restrictions only apply to Windows; Linux, for example, allows use of " * : < > ? \ | even in NTFS.

Source: http://en.wikipedia.org/wiki/Filename

Answer by Steve Cooper

For .Net Frameworks prior to 3.5 this should work:

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.InvalidPathChars constant:

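// requires: using System.Text.RegularExpressions;
bool IsValidFilename(string testName)
{
    // InvalidPathChars is a char[], so wrap it in a string for Regex.Escape
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}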

For .Net Frameworks after 3.0 this should work:

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars(v=vs.90).aspx

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.GetInvalidPathChars() method:

// requires: using System.Text.RegularExpressions;
bool IsValidFilename(string testName)
{
    // Build a character class from the characters that are invalid in paths
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}

Once you know that, you should also check for different formats, e.g. c:\my\drive and \\server\share\dir\file.ext
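
One way to start on the "different formats" check is to classify the string before validating its components. The sketch below uses plain string tests with hypothetical names (PathFormats, PathKind, Classify); it is intentionally simplistic and, for instance, also reports paths beginning with the \\?\ or \\.\ prefixes as Unc because they start with two backslashes.

```csharp
using System;

static class PathFormats
{
    public enum PathKind { Relative, DriveRooted, Unc }

    // Hypothetical helper: distinguishes c:\my\drive style paths from
    // \\server\share\dir\file.ext style paths. Anything else is Relative.
    public static PathKind Classify(string path)
    {
        if (path.StartsWith(@"\\", StringComparison.Ordinal))
            return PathKind.Unc;

        bool hasDrive = path.Length >= 3
            && IsAsciiLetter(path[0])
            && path[1] == ':'
            && (path[2] == '\\' || path[2] == '/');
        return hasDrive ? PathKind.DriveRooted : PathKind.Relative;
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

After classification, the remaining components can be split on the separator and each one checked as a file name.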

Answer by kfh

The question is whether you are trying to determine if a path name is a legal Windows path, or if it's legal on the system where the code is running. I think the latter is more important, so personally, I'd probably decompose the full path and try to use _mkdir to create the directory the file belongs in, then try to create the file.

This way you know not only if the path contains only valid windows characters, but if it actually represents a path that can be written by this process.
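
In C#, the same idea could be sketched with Directory.CreateDirectory and a FileStream in place of _mkdir. The names PathProbe and CanCreate are hypothetical. Note the side effects: any directories created by the probe are left behind, and the set of caught exception types is an assumption about which failures should count as "not usable".

```csharp
using System;
using System.IO;

static class PathProbe
{
    // Hypothetical helper: tries to create the parent directory and then the
    // file itself. The probe file is deleted again; created directories stay.
    public static bool CanCreate(string fullPath)
    {
        try
        {
            var dir = Path.GetDirectoryName(Path.GetFullPath(fullPath));
            if (!string.IsNullOrEmpty(dir)) Directory.CreateDirectory(dir);

            // An existing file already proves that the name is usable.
            if (File.Exists(fullPath)) return true;

            // CreateNew never overwrites an existing file.
            using (new FileStream(fullPath, FileMode.CreateNew)) { }
            File.Delete(fullPath);
            return true;
        }
        catch (Exception ex) when (ex is ArgumentException
                                || ex is IOException
                                || ex is NotSupportedException
                                || ex is UnauthorizedAccessException)
        {
            return false;
        }
    }
}
```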

Answer by user7116

From MSDN's "Naming a File or Directory," here are the general conventions for what a legal file name is under Windows:

You may use any character in the current code page (Unicode/ANSI above 127), except:

  • <>:"/\|?*
  • Characters whose integer representations are 0-31 (less than ASCII space)
  • Any other character that the target file system does not allow (say, trailing periods or spaces)
  • Any of the DOS names: CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (and avoid AUX.txt, etc)
  • The file name is all periods
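
The character and device-name rules are easy to spot; the trailing period or space rule and the all-periods rule are the ones most often missed. A minimal sketch for just those two, with hypothetical names, could be:

```csharp
static class NameEndings
{
    // Hypothetical helper: true if the name ends with a period or a space.
    // A name consisting only of periods also ends with a period, so it is
    // reported here as well. Assumption: an empty name is treated as a problem.
    public static bool HasProblematicEnding(string name)
    {
        if (name.Length == 0) return true;
        char last = name[name.Length - 1];
        return last == '.' || last == ' ';
    }
}
```

A leading period, as in ".gitignore", is not affected by this rule.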

Some optional things to check:

  • File paths (including the file name) may not have more than 260 characters (when they don't use the \\?\ prefix)
  • Unicode file paths (including the file name) may not have more than 32,000 characters when using \\?\ (note that the prefix may expand directory components and cause the path to overflow the 32,000 limit)
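
The two length limits can be sketched as a single check that picks the limit based on the prefix. The constants are taken from the bullets as stated; the Wikipedia excerpt in an earlier answer quotes 259 rather than 260, so treat the exact boundary as an assumption. The names are hypothetical, and the check does not account for the prefix expanding directory components.

```csharp
using System;

static class PathLength
{
    private const string ExtendedPrefix = @"\\?\";
    public const int ClassicLimit = 260;      // as stated in the first bullet
    public const int ExtendedLimit = 32000;   // as stated in the second bullet

    // Hypothetical helper: compares the raw string length against the limit
    // that applies to the path's form (with or without the \\?\ prefix).
    public static bool IsWithinLengthLimit(string fullPath)
    {
        bool extended = fullPath.StartsWith(ExtendedPrefix, StringComparison.Ordinal);
        int limit = extended ? ExtendedLimit : ClassicLimit;
        return fullPath.Length <= limit;
    }
}
```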

Answer by user7116

Try to use it, and trap for the error. The allowed set may change across file systems, or across different versions of Windows. In other words, if you want to know if Windows likes the name, hand it the name and let it tell you.
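
For the batch-rename scenario in the question, "hand it the name" could mean simply attempting File.Move and reporting the failure. The sketch below uses hypothetical names (SafeRename, TryRename), and the list of caught exception types is an assumption. It also assumes newFileName is a bare file name: Path.Combine returns the second argument unchanged when it is rooted, which would move the file elsewhere.

```csharp
using System;
using System.IO;

static class SafeRename
{
    // Hypothetical helper: renames sourcePath to newFileName within the same
    // directory. Returns false and sets error when the system rejects it.
    public static bool TryRename(string sourcePath, string newFileName, out string error)
    {
        try
        {
            var dir = Path.GetDirectoryName(Path.GetFullPath(sourcePath));
            File.Move(sourcePath, Path.Combine(dir, newFileName));
            error = null;
            return true;
        }
        catch (Exception ex) when (ex is ArgumentException
                                || ex is IOException
                                || ex is NotSupportedException
                                || ex is UnauthorizedAccessException)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

The caller can then show the error next to the offending pattern and let the user correct it, instead of predicting every rule up front.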
