C#: How to make a valid Windows filename from an arbitrary string?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/620605/

Date: 2020-08-04 10:35:44  Source: igfitidea

How to make a valid Windows filename from an arbitrary string?

Tags: c#, windows, filenames

Asked by Ken

I've got a string like "Foo: Bar" that I want to use as a filename, but on Windows the ":" char isn't allowed in a filename.

Is there a method that will turn "Foo: Bar" into something like "Foo- Bar"?

Accepted answer by Diego Jancic

Try something like this:

string fileName = "something";
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
   fileName = fileName.Replace(c, '_');
}

Edit:

Since GetInvalidFileNameChars() will return 10 or 15 chars, it's better to use a StringBuilder instead of a simple string; the original version will take longer and consume more memory.
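
The edit above recommends a StringBuilder but does not show one. A minimal sketch of that idea follows; the class name FileNameSanitizer and the method name ReplaceInvalidChars are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileNameSanitizer
{
    // Hypothetical helper: walks the input once and builds the result in a
    // StringBuilder instead of calling string.Replace once per invalid char.
    public static string ReplaceInvalidChars(string fileName, char replacement = '_')
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(fileName.Length);
        foreach (char c in fileName)
        {
            // Array.IndexOf returns -1 when c is not in the invalid set.
            sb.Append(Array.IndexOf(invalid, c) >= 0 ? replacement : c);
        }
        return sb.ToString();
    }
}
```

Called as FileNameSanitizer.ReplaceInvalidChars("Foo/Bar"), this returns "Foo_Bar". Note that the set returned by Path.GetInvalidFileNameChars() depends on the platform the code runs on, so ":" is only replaced when running on Windows.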

Answer by Phil Price

fileName = fileName.Replace(":", "-");

However ":" is not the only illegal character for Windows. You will also have to handle:

/, \, :, *, ?, ", <, > and |

These are contained in System.IO.Path.GetInvalidFileNameChars();

Also (on Windows), "." cannot be the only character in the filename (".", "..", "...", and so on are all invalid). Be careful when naming files with ".", for example:

echo "test" > .test.

Will generate a file named ".test"

Lastly, if you really want to do things correctly, there are some special file names you need to look out for. On Windows you can't create files named:

CON, PRN, AUX, CLOCK$, NUL
COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
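
The two caveats above (trailing dots and reserved names) could be handled with a small check like the following sketch. The class ReservedNames, its methods, and the "prefix an underscore" policy are all assumptions made for illustration; the name list is copied from this answer rather than from an authoritative source.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedNames
{
    // List taken from the answer above; treat it as an assumption.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "CON", "PRN", "AUX", "CLOCK$", "NUL",
            "COM0", "COM1", "COM2", "COM3", "COM4",
            "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
            "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

    public static bool IsReserved(string fileName)
    {
        // Compare only the part before the first dot, because a reserved
        // name followed by an extension (e.g. "NUL.txt") is still reserved.
        int dot = fileName.IndexOf('.');
        string stem = dot >= 0 ? fileName.Substring(0, dot) : fileName;
        return Reserved.Contains(stem.TrimEnd(' '));
    }

    // Hypothetical policy: drop trailing dots and spaces, fall back to "_"
    // when nothing is left, and prefix "_" to reserved names.
    public static string Escape(string fileName)
    {
        string trimmed = fileName.TrimEnd('.', ' ');
        if (trimmed.Length == 0) return "_";
        return IsReserved(trimmed) ? "_" + trimmed : trimmed;
    }
}
```

With this sketch, Escape("con.txt") gives "_con.txt", Escape("...") gives "_", and Escape("console.log") is returned unchanged. It only covers these two caveats; invalid characters still need to be replaced separately.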

Answer by leggetter

Diego does have the correct solution but there is one very small mistake in there. The version of string.Replace being used should be string.Replace(char, char); there isn't a string.Replace(char, string).

I can't edit the answer or I would have just made the minor change.

So it should be:

string fileName = "something";
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
   fileName = fileName.Replace(c, '_');
}

Answer by D W

You can do this with a sed command:

 sed -e "
 s/[?()\[\]=+<>:;??”,*|]/_/g
 s/"$'\t'"/ /g
 s/–/-/g
 s/\"/_/g
 s/[[:cntrl:]]/_/g"

Answer by Joseph Gabriel

This isn't more efficient, but it's more fun :)

var fileName = "foo:bar";
var invalidChars = System.IO.Path.GetInvalidFileNameChars();
var cleanFileName = new string(fileName.Where(m => !invalidChars.Contains(m)).ToArray<char>());

Answer by Joan Vilariño

I needed to do this today... in my case, I needed to concatenate a customer name with the date and time for a final .kmz file. My final solution was this:

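 string name = "Whatever name with valid/invalid chars";
 char[] invalid = System.IO.Path.GetInvalidFileNameChars();
 // invalid.Contains (System.Linq) stands in for the original o.In(invalid),
 // a custom extension method that the answer did not show.
 string validFileName = string.Join(string.Empty,
                            string.Format("{0}.{1:G}.kmz", name, DateTime.Now)
                            .ToCharArray().Select(o => invalid.Contains(o) ? '_' : o));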

You can even make it replace spaces if you add the space char to the invalid array.

Maybe it's not the fastest, but as performance wasn't an issue, I found it elegant and understandable.

Cheers!

Answer by Joan Vilariño

Cleaning up my code a little and doing a bit of refactoring... I created an extension method for the string type:

public static string ToValidFileName(this string s, char replaceChar = '_', char[] includeChars = null)
{
  var invalid = Path.GetInvalidFileNameChars();
  if (includeChars != null) invalid = invalid.Union(includeChars).ToArray();
  // invalid.Contains (System.Linq) stands in for the original o.In(invalid), a custom extension not shown in the answer.
  return string.Join(string.Empty, s.ToCharArray().Select(o => invalid.Contains(o) ? replaceChar : o));
}

Now it's easier to use with:

var name = "Any string you want using ? / \\ or even +.zip";
var validFileName = name.ToValidFileName();

If you want to replace with a different char than "_" you can use:

var validFileName = name.ToValidFileName(replaceChar:'#');

And you can add chars to replace, for example if you don't want spaces or commas:

var validFileName = name.ToValidFileName(includeChars: new [] { ' ', ',' });

Hope it helps...

Cheers

Answer by rkagerer

Here's a slight twist on Diego's answer.

If you're not afraid of Unicode, you can retain a bit more fidelity by replacing the invalid characters with valid Unicode symbols that resemble them. Here's the code I used in a recent project involving lumber cutlists:

static string MakeValidFilename(string text) {
  text = text.Replace('\'', '’'); // U+2019 right single quotation mark
  text = text.Replace('"',  '”'); // U+201D right double quotation mark
  text = text.Replace('/', '⁄');  // U+2044 fraction slash
  foreach (char c in System.IO.Path.GetInvalidFileNameChars()) {
    text = text.Replace(c, '_');
  }
  return text;
}

This produces filenames like 1⁄2” spruce.txt instead of 1_2_ spruce.txt

Yes, it really works:

[Image: Explorer sample]

Caveat Emptor

I knew this trick would work on NTFS but was surprised to find it also works on FAT and FAT32 partitions. That's because long filenames are stored in Unicode, even as far back as Windows 95/NT. I tested on Win7, XP, and even a Linux-based router and they showed up OK. Can't say the same for inside a DOSBox.

That said, before you go nuts with this, consider whether you really need the extra fidelity. The Unicode look-alikes could confuse people or old programs, e.g. older OS's relying on codepages.

Answer by Qwertie

In case anyone wants an optimized version based on StringBuilder, use this. Includes rkagerer's trick as an option.

// Requires: using System.IO; using System.Linq; using System.Text;
static char[] _invalids;

/// <summary>Replaces characters in <c>text</c> that are not allowed in 
/// file names with the specified replacement character.</summary>
/// <param name="text">Text to make into a valid filename. The same string is returned if it is valid already.</param>
/// <param name="replacement">Replacement character, or null to simply remove bad characters.</param>
/// <param name="fancy">Whether to replace quotes and slashes with the non-ASCII characters ” and ⁄.</param>
/// <returns>A string that can be used as a filename. If the output string would otherwise be empty, returns "_".</returns>
public static string MakeValidFileName(string text, char? replacement = '_', bool fancy = true)
{
    StringBuilder sb = new StringBuilder(text.Length);
    var invalids = _invalids ?? (_invalids = Path.GetInvalidFileNameChars());
    bool changed = false;
    for (int i = 0; i < text.Length; i++) {
        char c = text[i];
        if (invalids.Contains(c)) {
            changed = true;
            var repl = replacement ?? '\0';
            if (fancy) {
                if (c == '"')       repl = '”'; // U+201D right double quotation mark
                else if (c == '\'') repl = '’'; // U+2019 right single quotation mark
                else if (c == '/')  repl = '⁄'; // U+2044 fraction slash
            }
            if (repl != '\0')
                sb.Append(repl);
        } else
            sb.Append(c);
    }
    if (sb.Length == 0)
        return "_";
    return changed ? sb.ToString() : text;
}

Answer by jnm2

Here's a version that uses StringBuilder and IndexOfAny with bulk append for full efficiency. It also returns the original string when nothing needs replacing, rather than creating a duplicate string.

Last but not least, it has a switch statement that returns look-alike characters which you can customize any way you wish. Check out Unicode.org's confusables lookup to see what options you might have, depending on the font.

public static string GetSafeFilename(string arbitraryString)
{
    var invalidChars = System.IO.Path.GetInvalidFileNameChars();
    var replaceIndex = arbitraryString.IndexOfAny(invalidChars, 0);
    if (replaceIndex == -1) return arbitraryString;

    var r = new StringBuilder();
    var i = 0;

    do
    {
        r.Append(arbitraryString, i, replaceIndex - i);

        switch (arbitraryString[replaceIndex])
        {
            case '"':
                r.Append("''");
                break;
            case '<':
                r.Append('\u02c2'); // '˂' (modifier letter left arrowhead)
                break;
            case '>':
                r.Append('\u02c3'); // '˃' (modifier letter right arrowhead)
                break;
            case '|':
                r.Append('\u2223'); // '∣' (divides)
                break;
            case ':':
                r.Append('-');
                break;
            case '*':
                r.Append('\u2217'); // '∗' (asterisk operator)
                break;
            case '\\':
            case '/':
                r.Append('\u2044'); // '⁄' (fraction slash)
                break;
            case '\0':
            case '\f':
            case '?':
                break;
            case '\t':
            case '\n':
            case '\r':
            case '\v':
                r.Append(' ');
                break;
            default:
                r.Append('_');
                break;
        }

        i = replaceIndex + 1;
        replaceIndex = arbitraryString.IndexOfAny(invalidChars, i);
    } while (replaceIndex != -1);

    r.Append(arbitraryString, i, arbitraryString.Length - i);

    return r.ToString();
}

It doesn't check for ., .., or reserved names like CON because it isn't clear what the replacement should be.