C#: How to make a valid Windows filename from an arbitrary string?
Original question: http://stackoverflow.com/questions/620605/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Ken
I've got a string like "Foo: Bar" that I want to use as a filename, but on Windows the ":" char isn't allowed in a filename.
Is there a method that will turn "Foo: Bar" into something like "Foo- Bar"?
Accepted answer by Diego Jancic
Try something like this:
string fileName = "something";
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
    fileName = fileName.Replace(c, '_');
}
Edit:
Since GetInvalidFileNameChars() returns dozens of chars on Windows (the ASCII control characters plus punctuation such as : * ? " < > |), it's better to use a StringBuilder instead of a simple string; the original version may allocate a new string on each pass, so it can take longer and consume more memory.
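The edit above recommends a StringBuilder but shows no code for it. Here is a minimal sketch of what that might look like. The class and method names (FileNameSanitizer, ReplaceInvalidChars) are made up for illustration and are not from the original answer. Note that Path.GetInvalidFileNameChars() is platform-dependent, so the result can differ outside Windows.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileNameSanitizer
{
    // Hypothetical helper: copies the input into a StringBuilder in one pass,
    // substituting `replacement` for every char that the current platform
    // reports as invalid in a file name.
    public static string ReplaceInvalidChars(string fileName, char replacement = '_')
    {
        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(fileName.Length);
        foreach (char c in fileName)
        {
            // Array.IndexOf returns -1 when c is not in the invalid set.
            sb.Append(Array.IndexOf(invalid, c) >= 0 ? replacement : c);
        }
        return sb.ToString();
    }
}
```

Calling FileNameSanitizer.ReplaceInvalidChars("a/b") should give "a_b", since "/" is rejected on Windows and on Unix-like systems alike.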
Answer by Phil Price
fileName = fileName.Replace(":", "-")
However, ":" is not the only illegal character on Windows. You will also have to handle:
/, \, :, *, ?, ", <, > and |
These are contained in System.IO.Path.GetInvalidFileNameChars();
Also (on Windows), "." cannot be the only character in the filename (".", "..", "...", and so on are all invalid). Be careful when naming files with ".", for example:
echo "test" > .test.
will generate a file named ".test"
Lastly, if you really want to do things correctly, there are some special file names you need to look out for. On Windows you can't create files named:
CON, PRN, AUX, CLOCK$, NUL
COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9
LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9.
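One way to act on the two caveats above (trailing dots and reserved device names) is sketched below. The class ReservedNames and the "_" prefix are assumptions made for illustration. The sketch also assumes, based on my reading of Microsoft's file-naming guidance, that a reserved name followed by an extension (such as NUL.txt) is reserved as well.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedNames
{
    // The device names listed above, compared case-insensitively.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "CON", "PRN", "AUX", "CLOCK$", "NUL",
            "COM0", "COM1", "COM2", "COM3", "COM4",
            "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
            "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

    // Assumption: only the part before the first dot is checked,
    // so "NUL.txt" counts as reserved too.
    public static bool IsReserved(string fileName)
    {
        int dot = fileName.IndexOf('.');
        string stem = dot < 0 ? fileName : fileName.Substring(0, dot);
        return Reserved.Contains(stem.TrimEnd(' '));
    }

    // Hypothetical policy: strip trailing dots and spaces, fall back to the
    // prefix if nothing is left, and prepend the prefix to reserved names.
    public static string MakeSafe(string fileName, string prefix = "_")
    {
        string trimmed = fileName.TrimEnd('.', ' ');
        if (trimmed.Length == 0) return prefix;
        return IsReserved(trimmed) ? prefix + trimmed : trimmed;
    }
}
```

Prefixing is only one possible policy; a caller might prefer to reject such names outright.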
Answer by leggetter
Diego does have the correct solution, but there is one very small mistake in there. The version of string.Replace being used should be string.Replace(char, char); there isn't a string.Replace(char, string).
I can't edit the answer or I would have just made the minor change.
So it should be:
string fileName = "something";
foreach (char c in System.IO.Path.GetInvalidFileNameChars())
{
    fileName = fileName.Replace(c, '_');
}
Answer by D W
You can do this with a sed command:
sed -e "
s/[?()\[\]=+<>:;??”,*|]/_/g
s/"$'\t'"/ /g
s/–/-/g
s/\"/_/g
s/[[:cntrl:]]/_/g"
Answer by Joseph Gabriel
This isn't more efficient, but it's more fun :)
var fileName = "foo:bar";
var invalidChars = System.IO.Path.GetInvalidFileNameChars();
var cleanFileName = new string(fileName.Where(m => !invalidChars.Contains(m)).ToArray<char>());
Answer by Joan Vilariño
I needed to do this today... in my case, I needed to concatenate a customer name with the date and time for a final .kmz file. My final solution was this:
string name = "Whatever name with valid/invalid chars";
char[] invalid = System.IO.Path.GetInvalidFileNameChars();
string validFileName = string.Join(string.Empty,
string.Format("{0}.{1:G}.kmz", name, DateTime.Now)
.ToCharArray().Select(o => o.In(invalid) ? '_' : o));
You can even make it replace spaces if you add the space char to the invalid array.
Maybe it's not the fastest, but as performance wasn't an issue, I found it elegant and understandable.
Cheers!
Answer by Joan Vilariño
After cleaning up my code and refactoring it a little, I created an extension method for the string type:
public static string ToValidFileName(this string s, char replaceChar = '_', char[] includeChars = null)
{
    var invalid = Path.GetInvalidFileNameChars();
    if (includeChars != null) invalid = invalid.Union(includeChars).ToArray();
    return string.Join(string.Empty, s.ToCharArray().Select(o => o.In(invalid) ? replaceChar : o));
}
Now it's easier to use:
var name = "Any string you want using ? / \\ or even +.zip";
var validFileName = name.ToValidFileName();
If you want to replace with a different char than "_", you can use:
var validFileName = name.ToValidFileName(replaceChar:'#');
And you can add chars to replace. For example, if you don't want spaces or commas:
var validFileName = name.ToValidFileName(includeChars: new [] { ' ', ',' });
Hope it helps...
Cheers
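Both snippets in this answer call o.In(invalid). In is not a method on char in the .NET base class library, so it is presumably a custom extension method the author had in scope. A minimal sketch of such a helper follows, assuming it simply tests membership; the class name CharExtensions is hypothetical.

```csharp
using System;

public static class CharExtensions
{
    // Hypothetical stand-in for the author's In helper:
    // returns true when c appears anywhere in `set`.
    public static bool In(this char c, params char[] set)
    {
        return Array.IndexOf(set, c) >= 0;
    }
}
```

With this in scope, '/'.In('/', ':') is true and 'a'.In('/', ':') is false. The built-in invalid.Contains(o) from System.Linq would work in its place.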
Answer by rkagerer
Here's a slight twist on Diego's answer.
If you're not afraid of Unicode, you can retain a bit more fidelity by replacing the invalid characters with valid Unicode symbols that resemble them. Here's the code I used in a recent project involving lumber cutlists:
static string MakeValidFilename(string text) {
    text = text.Replace('\'', '’'); // U+2019 right single quotation mark
    text = text.Replace('"', '”'); // U+201D right double quotation mark
    text = text.Replace('/', '⁄'); // U+2044 fraction slash
    foreach (char c in System.IO.Path.GetInvalidFileNameChars()) {
        text = text.Replace(c, '_');
    }
    return text;
}
This produces filenames like 1⁄2” spruce.txt instead of 1_2_ spruce.txt
Yes, it really works.
Caveat Emptor
I knew this trick would work on NTFS but was surprised to find it also works on FAT and FAT32 partitions. That's because long filenames are stored in Unicode, even as far back as Windows 95/NT. I tested on Win7, XP, and even a Linux-based router and they showed up OK. Can't say the same for inside a DOSBox.
That said, before you go nuts with this, consider whether you really need the extra fidelity. The Unicode look-alikes could confuse people or old programs, e.g. older OSes relying on codepages.
Answer by Qwertie
In case anyone wants an optimized version based on StringBuilder, use this. It includes rkagerer's trick as an option.
// Requires: using System.IO; using System.Linq; using System.Text;
static char[] _invalids;
/// <summary>Replaces characters in <c>text</c> that are not allowed in
/// file names with the specified replacement character.</summary>
/// <param name="text">Text to make into a valid filename. The same string is returned if it is valid already.</param>
/// <param name="replacement">Replacement character, or null to simply remove bad characters.</param>
/// <param name="fancy">Whether to replace quotes and slashes with the non-ASCII characters ” and ⁄.</param>
/// <returns>A string that can be used as a filename. If the output string would otherwise be empty, returns "_".</returns>
public static string MakeValidFileName(string text, char? replacement = '_', bool fancy = true)
{
    StringBuilder sb = new StringBuilder(text.Length);
    var invalids = _invalids ?? (_invalids = Path.GetInvalidFileNameChars());
    bool changed = false;
    for (int i = 0; i < text.Length; i++) {
        char c = text[i];
        if (invalids.Contains(c)) {
            changed = true;
            var repl = replacement ?? '\0';
            if (fancy) {
                if (c == '"') repl = '”'; // U+201D right double quotation mark
                else if (c == '\'') repl = '’'; // U+2019 right single quotation mark
                else if (c == '/') repl = '⁄'; // U+2044 fraction slash
            }
            if (repl != '\0')
                sb.Append(repl);
        } else
            sb.Append(c);
    }
    if (sb.Length == 0)
        return "_";
    return changed ? sb.ToString() : text;
}
Answer by jnm2
Here's a version that uses StringBuilder and IndexOfAny with bulk append for full efficiency. It also returns the original string rather than creating a duplicate string when nothing needs replacing.
Last but not least, it has a switch statement that returns look-alike characters which you can customize any way you wish. Check out Unicode.org's confusables lookup to see what options you might have, depending on the font.
// Requires: using System.Text;
public static string GetSafeFilename(string arbitraryString)
{
    var invalidChars = System.IO.Path.GetInvalidFileNameChars();
    var replaceIndex = arbitraryString.IndexOfAny(invalidChars, 0);
    if (replaceIndex == -1) return arbitraryString;
    var r = new StringBuilder();
    var i = 0;
    do
    {
        // Bulk-append the run of valid chars before the next invalid one.
        r.Append(arbitraryString, i, replaceIndex - i);
        switch (arbitraryString[replaceIndex])
        {
            case '"':
                r.Append("''");
                break;
            case '<':
                r.Append('\u02c2'); // '˂' (modifier letter left arrowhead)
                break;
            case '>':
                r.Append('\u02c3'); // '˃' (modifier letter right arrowhead)
                break;
            case '|':
                r.Append('\u2223'); // '∣' (divides)
                break;
            case ':':
                r.Append('-');
                break;
            case '*':
                r.Append('\u2217'); // '∗' (asterisk operator)
                break;
            case '\\':
            case '/':
                r.Append('\u2044'); // '⁄' (fraction slash)
                break;
            case '\0':
            case '\f':
            case '?':
                break;
            case '\t':
            case '\n':
            case '\r':
            case '\v':
                r.Append(' ');
                break;
            default:
                r.Append('_');
                break;
        }
        i = replaceIndex + 1;
        replaceIndex = arbitraryString.IndexOfAny(invalidChars, i);
    } while (replaceIndex != -1);
    r.Append(arbitraryString, i, arbitraryString.Length - i);
    return r.ToString();
}
It doesn't check for ., .., or reserved names like CON because it isn't clear what the replacement should be.