C#: How to determine if a file matches a file mask?
Original question: http://stackoverflow.com/questions/725341/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to determine if a File Matches a File Mask?
Asked by jing
I need to decide whether a file name fits a file mask. The file mask can contain * or ? characters. Is there a simple solution for this?
bool bFits = Fits("myfile.txt", "my*.txt");
private bool Fits(string sFileName, string sFileMask)
{
    // ??? anything simple here ???
}
Accepted answer by Joel Coehoorn
Try this:
private bool FitsMask(string sFileName, string sFileMask)
{
    Regex mask = new Regex(sFileMask.Replace(".", "[.]").Replace("*", ".*").Replace("?", "."));
    return mask.IsMatch(sFileName);
}
Answer by Richard
If PowerShell is available, it has direct support for wildcard matching (as well as regex).
WildcardPattern pat = new WildcardPattern("a*.b*");
if (pat.IsMatch(filename)) { ... }
Answer by Michael Sorens
I appreciated finding Joel's answer; it saved me some time as well! I did, however, have to make a few changes to make the method do what most users would expect:
- I removed the 'this' keyword preceding the first argument. It does nothing here (though it would be useful if the method were intended to be an extension method, in which case the method needs to be public and static and contained within a static class).
- I made the regular expression case-insensitive to match standard Windows wildcard behavior (so, for example, "c*.*" and "C*.*" both return the same result).
- I added starting and ending anchors to the regular expression, again to match standard Windows wildcard behavior (so, for example, "stuff.txt" would be matched by "stuff*" or "s*" or "s*.*" but not by just "s").
private bool FitsMask(string fileName, string fileMask)
{
    Regex mask = new Regex(
        '^' +
        fileMask
            .Replace(".", "[.]")
            .Replace("*", ".*")
            .Replace("?", ".")
        + '$',
        RegexOptions.IgnoreCase);
    return mask.IsMatch(fileName);
}
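The effect of the anchors and of the case-insensitive option can be seen by comparing two variants side by side. This is only a sketch: the class name AnchorDemo and the methods Translate, FitsLoose, and FitsStrict are hypothetical names introduced for illustration, and the translation deliberately ignores other regex metacharacters.

```csharp
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Translates the three characters the mask syntax cares about.
    // Other regex metacharacters are NOT escaped in this sketch.
    static string Translate(string mask)
    {
        return mask.Replace(".", "[.]").Replace("*", ".*").Replace("?", ".");
    }

    // Unanchored and case-sensitive: succeeds if the pattern matches
    // anywhere inside the file name.
    public static bool FitsLoose(string fileName, string mask)
    {
        return Regex.IsMatch(fileName, Translate(mask));
    }

    // Anchored and case-insensitive: the whole file name must match.
    public static bool FitsStrict(string fileName, string mask)
    {
        return Regex.IsMatch(fileName, "^" + Translate(mask) + "$", RegexOptions.IgnoreCase);
    }
}
```

With these definitions, the mask "s" fits "stuff.txt" in the loose variant but not in the strict one, while "S*.*" fits only in the strict variant because of the case-insensitive option.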
2009.11.04 Update: Match one of several masks
For even more flexibility, here is a plug-compatible method built on top of the original. This version lets you pass multiple masks (hence the plural in the second parameter name, fileMasks) separated by line breaks, commas, vertical bars, or spaces. I wanted it so that I could let the user put as many choices as desired in a ListBox and then select all files matching any of them. Note that some controls (like a ListBox) use CR-LF for line breaks while others (e.g. RichTextBox) use just LF, which is why both "\r\n" and "\n" show up in the Split list.
private bool FitsOneOfMultipleMasks(string fileName, string fileMasks)
{
    return fileMasks
        .Split(new string[] {"\r\n", "\n", ",", "|", " "},
            StringSplitOptions.RemoveEmptyEntries)
        .Any(fileMask => FitsMask(fileName, fileMask));
}
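A short, self-contained sketch of how the separator handling behaves may help here. The class name MultiMaskDemo is hypothetical, and the single-mask helper inside it is a simplified stand-in (anchored, case-insensitive, escaping via Regex.Escape) rather than the exact method from this answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MultiMaskDemo
{
    // Simplified single-mask check: anchored, case-insensitive,
    // only * and ? are treated as wildcards.
    static bool FitsMask(string fileName, string fileMask)
    {
        string pattern = "^" + Regex.Escape(fileMask).Replace("\\*", ".*").Replace("\\?", ".") + "$";
        return Regex.IsMatch(fileName, pattern, RegexOptions.IgnoreCase);
    }

    // Splits on line breaks, commas, vertical bars, and spaces,
    // then succeeds if any single mask fits.
    public static bool FitsOneOfMultipleMasks(string fileName, string fileMasks)
    {
        return fileMasks
            .Split(new[] { "\r\n", "\n", ",", "|", " " }, StringSplitOptions.RemoveEmptyEntries)
            .Any(mask => FitsMask(fileName, mask));
    }
}
```

One consequence of treating the space as a separator is that a mask containing a space cannot be expressed: "my file*.txt" is split into the two masks "my" and "file*.txt", so it fits "file1.txt" but not "my file1.txt".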
2009.11.17 Update: Handle fileMask inputs more gracefully
The earlier version of FitsMask (which I have left in for comparison) does a fair job, but because the mask is treated as a regular expression, it throws an exception if the translated mask is not a valid regular expression. What we actually want is for any regex metacharacters in the input fileMask to be treated as literals, not metacharacters, while still treating period, asterisk, and question mark specially. So this improved version of FitsMask safely moves these three characters out of the way, transforms all remaining metacharacters into literals, and then puts the three interesting characters back in their "regex'ed" form.
One other minor improvement is to allow for case-independence, per standard Windows behavior.
private bool FitsMask(string fileName, string fileMask)
{
    string pattern =
        '^' +
        Regex.Escape(fileMask.Replace(".", "__DOT__")
                             .Replace("*", "__STAR__")
                             .Replace("?", "__QM__"))
            .Replace("__DOT__", "[.]")
            .Replace("__STAR__", ".*")
            .Replace("__QM__", ".")
        + '$';
    return new Regex(pattern, RegexOptions.IgnoreCase).IsMatch(fileName);
}
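To see why the escaping matters, the sketch below contrasts a translation without escaping against one that escapes first. The names EscapeDemo, FitsNaive, FitsEscaped, and NaiveThrows are hypothetical, and the escaped variant uses Regex.Escape followed by restoring the wildcards, which should have a comparable effect to the placeholder approach but is not the same code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // No escaping: characters such as ( or + in the mask keep their regex meaning.
    public static bool FitsNaive(string fileName, string fileMask)
    {
        string pattern = "^" + fileMask.Replace(".", "[.]").Replace("*", ".*").Replace("?", ".") + "$";
        return Regex.IsMatch(fileName, pattern, RegexOptions.IgnoreCase);
    }

    // Escape every metacharacter, then turn the escaped wildcards back into regex.
    public static bool FitsEscaped(string fileName, string fileMask)
    {
        string pattern = "^" + Regex.Escape(fileMask).Replace("\\*", ".*").Replace("\\?", ".") + "$";
        return Regex.IsMatch(fileName, pattern, RegexOptions.IgnoreCase);
    }

    // Reports whether the naive translation produces an invalid regular expression.
    public static bool NaiveThrows(string fileMask)
    {
        try { FitsNaive("x", fileMask); return false; }
        catch (ArgumentException) { return true; }
    }
}
```

A mask such as "report(1*.txt" makes the naive variant throw because of the unbalanced parenthesis, while a mask such as "a+b.txt" is worse in a way: it silently fails to match the file "a+b.txt" because + is read as a quantifier.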
2010.09.30 Update: Somewhere along the way, passion ensued...
I have been remiss in not updating this earlier, but these references will likely be of interest to readers who have made it to this point:
- I embedded the FitsMask method as the heart of a WinForms user control aptly called a FileMask; see the API here.
- I then wrote an article featuring the FileMask control, published on Simple-Talk.com, entitled Using LINQ Lambda Expressions to Design Customizable Generic Components. (While the method itself does not use LINQ, the FileMask user control does, hence the title of the article.)
Answer by Nissim
Many people don't know this, but .NET includes an internal class called "PatternMatcher" (in the "System.IO" namespace).
This static class contains only one method:
public static bool StrictMatchPattern(string expression, string name)
.NET uses this method whenever it needs to compare file names against a wildcard (FileSystemWatcher, GetFiles(), etc.).
Using Reflector, I exposed the code here. I didn't really go through it to understand how it works, but it works great.
So this is the code for anyone who doesn't want to use the less efficient regex approach:
public static class PatternMatcher
{
// Fields
private const char ANSI_DOS_QM = '<';
private const char ANSI_DOS_STAR = '>';
private const char DOS_DOT = '"';
private const int MATCHES_ARRAY_SIZE = 16;
// Methods
public static bool StrictMatchPattern(string expression, string name)
{
expression = expression.ToLowerInvariant();
name = name.ToLowerInvariant();
int num9;
char ch = '\0';
char ch2 = '\0';
int[] sourceArray = new int[16];
int[] numArray2 = new int[16];
bool flag = false;
if (((name == null) || (name.Length == 0)) || ((expression == null) || (expression.Length == 0)))
{
return false;
}
if (expression.Equals("*") || expression.Equals("*.*"))
{
return true;
}
if ((expression[0] == '*') && (expression.IndexOf('*', 1) == -1))
{
int length = expression.Length - 1;
if ((name.Length >= length) && (string.Compare(expression, 1, name, name.Length - length, length, StringComparison.OrdinalIgnoreCase) == 0))
{
return true;
}
}
sourceArray[0] = 0;
int num7 = 1;
int num = 0;
int num8 = expression.Length * 2;
while (!flag)
{
int num3;
if (num < name.Length)
{
ch = name[num];
num3 = 1;
num++;
}
else
{
flag = true;
if (sourceArray[num7 - 1] == num8)
{
break;
}
}
int index = 0;
int num5 = 0;
int num6 = 0;
while (index < num7)
{
int num2 = (sourceArray[index++] + 1) / 2;
num3 = 0;
Label_00F2:
if (num2 != expression.Length)
{
num2 += num3;
num9 = num2 * 2;
if (num2 == expression.Length)
{
numArray2[num5++] = num8;
}
else
{
ch2 = expression[num2];
num3 = 1;
if (num5 >= 14)
{
int num11 = numArray2.Length * 2;
int[] destinationArray = new int[num11];
Array.Copy(numArray2, destinationArray, numArray2.Length);
numArray2 = destinationArray;
destinationArray = new int[num11];
Array.Copy(sourceArray, destinationArray, sourceArray.Length);
sourceArray = destinationArray;
}
if (ch2 == '*')
{
numArray2[num5++] = num9;
numArray2[num5++] = num9 + 1;
goto Label_00F2;
}
if (ch2 == '>')
{
bool flag2 = false;
if (!flag && (ch == '.'))
{
int num13 = name.Length;
for (int i = num; i < num13; i++)
{
char ch3 = name[i];
num3 = 1;
if (ch3 == '.')
{
flag2 = true;
break;
}
}
}
if ((flag || (ch != '.')) || flag2)
{
numArray2[num5++] = num9;
numArray2[num5++] = num9 + 1;
}
else
{
numArray2[num5++] = num9 + 1;
}
goto Label_00F2;
}
num9 += num3 * 2;
switch (ch2)
{
case '<':
if (flag || (ch == '.'))
{
goto Label_00F2;
}
numArray2[num5++] = num9;
goto Label_028D;
case '"':
if (flag)
{
goto Label_00F2;
}
if (ch == '.')
{
numArray2[num5++] = num9;
goto Label_028D;
}
break;
}
if (!flag)
{
if (ch2 == '?')
{
numArray2[num5++] = num9;
}
else if (ch2 == ch)
{
numArray2[num5++] = num9;
}
}
}
}
Label_028D:
if ((index < num7) && (num6 < num5))
{
while (num6 < num5)
{
int num14 = sourceArray.Length;
while ((index < num14) && (sourceArray[index] < numArray2[num6]))
{
index++;
}
num6++;
}
}
}
if (num5 == 0)
{
return false;
}
int[] numArray4 = sourceArray;
sourceArray = numArray2;
numArray2 = numArray4;
num7 = num5;
}
num9 = sourceArray[num7 - 1];
return (num9 == num8);
}
}
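For readers who want a regex-free matcher but find the reflected code hard to follow, here is a much smaller sketch. It is explicitly not the framework's algorithm: it is a plain greedy matcher with backtracking over the last *, it handles only * and ?, and it does not reproduce the DOS special cases visible above (the <, >, and " characters or the "*.*" shortcut). The names SimpleWildcard and Matches are hypothetical.

```csharp
public static class SimpleWildcard
{
    // Returns true if the whole name matches the mask.
    // '*' matches any run of characters (including none), '?' matches exactly one.
    // Comparison is case-insensitive (invariant upper-casing per character).
    public static bool Matches(string name, string mask)
    {
        int n = 0, m = 0;        // current positions in name and mask
        int starMask = -1;       // position of the most recent '*' in mask
        int starName = 0;        // position in name where that '*' started matching

        while (n < name.Length)
        {
            if (m < mask.Length && mask[m] == '*')
            {
                // Remember the star and first try matching it against nothing.
                starMask = m++;
                starName = n;
            }
            else if (m < mask.Length &&
                     (mask[m] == '?' ||
                      char.ToUpperInvariant(mask[m]) == char.ToUpperInvariant(name[n])))
            {
                n++;
                m++;
            }
            else if (starMask >= 0)
            {
                // Mismatch: let the last star absorb one more character and retry.
                m = starMask + 1;
                n = ++starName;
            }
            else
            {
                return false;
            }
        }

        // Any trailing stars in the mask can match the empty remainder.
        while (m < mask.Length && mask[m] == '*') m++;
        return m == mask.Length;
    }
}
```

Because it ignores the DOS rules, this sketch will not always agree with GetFiles(); if that consistency matters, the framework's own matcher is the safer reference.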
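Answer by Mr. TA
Fastest version of the previously proposed function:
// Cache of compiled regexes, keyed by the mask list joined with '|'.
// The original left a placeholder for an unordered string[] comparer; a joined
// string key is a simple stand-in ('|' is not valid in Windows file names).
static readonly Dictionary<string, Regex> _maskRegexes = new Dictionary<string, Regex>();

public static bool FitsMasks(string filePath, params string[] fileMasks)
{
    return FileMasksToRegex(fileMasks).IsMatch(filePath);
}

// Alternative entry point that returns the Regex itself.
public static Regex FileMasksToRegex(params string[] fileMasks)
{
    string key = string.Join("|", fileMasks);
    Regex regex;
    if (!_maskRegexes.TryGetValue(key, out regex))
    {
        // Group the alternatives so that ^ and $ apply to every mask.
        StringBuilder sb = new StringBuilder("^(?:");
        bool first = true;
        foreach (string fileMask in fileMasks)
        {
            if (first) first = false; else sb.Append("|");
            sb.Append('(');
            foreach (char c in fileMask)
            {
                switch (c)
                {
                    case '*': sb.Append(".*"); break;
                    case '?': sb.Append("."); break;
                    default:
                        sb.Append(Regex.Escape(c.ToString()));
                        break;
                }
            }
            sb.Append(')');
        }
        sb.Append(")$");
        regex = new Regex(sb.ToString(), RegexOptions.IgnoreCase);
        _maskRegexes[key] = regex;
    }
    return regex;
}
Notes:
- Re-using Regex objects.
- Using StringBuilder to optimize Regex creation (multiple .Replace() calls are slow).
- Multiple masks, combined with OR.
- Another version returning the Regex.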
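One detail worth illustrating when several masks are combined with OR: in a regular expression, alternation binds more loosely than the anchors, so "^a|b$" is read as "(^a)|(b$)". The sketch below compares an ungrouped and a grouped combination; the names MultiMaskRegex, MaskToPattern, BuildUngrouped, and BuildGrouped are hypothetical and only serve the comparison.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class MultiMaskRegex
{
    // Escapes the mask, then restores * and ? as regex wildcards.
    static string MaskToPattern(string mask)
    {
        return Regex.Escape(mask).Replace("\\*", ".*").Replace("\\?", ".");
    }

    // Anchors without grouping: ^ applies only to the first mask
    // and $ only to the last one.
    public static Regex BuildUngrouped(params string[] masks)
    {
        string body = string.Join("|", masks.Select(m => MaskToPattern(m)));
        return new Regex("^" + body + "$", RegexOptions.IgnoreCase);
    }

    // Anchors around a non-capturing group: every mask must match the whole name.
    public static Regex BuildGrouped(params string[] masks)
    {
        string body = string.Join("|", masks.Select(m => MaskToPattern(m)));
        return new Regex("^(?:" + body + ")$", RegexOptions.IgnoreCase);
    }
}
```

With the masks "*.txt" and "*.doc", the ungrouped regex accepts "a.txt.bak" (the first alternative only needs to match from the start), while the grouped one rejects it and still accepts "a.doc" and "A.TXT".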
Answer by Nyerguds
None of these answers quite seems to do the trick, and msorens's is needlessly complex. This one should work just fine:
public static Boolean Fits(string sFileName, string sFileMask)
{
    String convertedMask = "^" + Regex.Escape(sFileMask).Replace("\\*", ".*").Replace("\\?", ".") + "$";
    Regex regexMask = new Regex(convertedMask, RegexOptions.IgnoreCase);
    return regexMask.IsMatch(sFileName);
}
This makes sure any regex characters in the mask are escaped, replaces the \* and \? with their regex equivalents, and surrounds it all with ^ and $ to mark the boundaries.
Of course, in most situations it is far more useful to make this into a FileMaskToRegex tool function that returns the Regex object, so you build it once and can then loop over all the strings in your file list and check each one against it.
public static Regex FileMaskToRegex(string sFileMask)
{
    String convertedMask = "^" + Regex.Escape(sFileMask).Replace("\\*", ".*").Replace("\\?", ".") + "$";
    return new Regex(convertedMask, RegexOptions.IgnoreCase);
}
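The "build once, match many" idea can be sketched as follows. The class name MaskFilter and the method FilterByMask are hypothetical; the mask-to-regex helper is repeated here only so the example is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MaskFilter
{
    // Escapes the mask, restores * and ? as wildcards, and anchors the result.
    public static Regex FileMaskToRegex(string mask)
    {
        string pattern = "^" + Regex.Escape(mask).Replace("\\*", ".*").Replace("\\?", ".") + "$";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // Builds the Regex a single time and reuses it for every file name.
    public static List<string> FilterByMask(IEnumerable<string> fileNames, string mask)
    {
        Regex regex = FileMaskToRegex(mask);
        return fileNames.Where(name => regex.IsMatch(name)).ToList();
    }
}
```

For example, filtering the names "myfile.txt", "other.txt", "MyNotes.TXT", and "my.doc" with the mask "my*.txt" keeps "myfile.txt" and "MyNotes.TXT", in that order.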
Answer by Thomas Hoekstra
Nissim mentioned the PatternMatcher class in his answer...
There is an explanation available here:
http://referencesource.microsoft.com/#System/services/io/system/io/PatternMatcher.cs
So you don't have to use the reflected code and guess how it works.
Also, I think using this code is probably the best solution, because it guarantees consistent behavior when using the same pattern in your comparisons and in Framework methods like GetFiles().
Answer by Guillaume
Use the WildcardPattern class from System.Management.Automation, available as a NuGet package or in the Windows PowerShell SDK.
WildcardPattern pattern = new WildcardPattern("my*.txt");
bool fits = pattern.IsMatch("myfile.txt");
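Answer by Sergey Azarkevich
My version, which supports the ** wildcard:
static Regex FileMask2Regex(string mask)
{
    var sb = new StringBuilder(mask);
    // hide wildcards
    sb.Replace("**", "affefa0d52e84c2db78f5510117471aa-StarStar");
    sb.Replace("*", "affefa0d52e84c2db78f5510117471aa-Star");
    sb.Replace("?", "affefa0d52e84c2db78f5510117471aa-Question");
    sb.Replace("/", "affefa0d52e84c2db78f5510117471aa-Slash");
    sb.Replace("\\", "affefa0d52e84c2db78f5510117471aa-Slash");
    sb = new StringBuilder(Regex.Escape(sb.ToString()));
    // unhide wildcards
    sb.Replace("affefa0d52e84c2db78f5510117471aa-StarStar", @".*");
    sb.Replace("affefa0d52e84c2db78f5510117471aa-Star", @"[^/\\]*");
    sb.Replace("affefa0d52e84c2db78f5510117471aa-Question", @"[^/\\]");
    sb.Replace("affefa0d52e84c2db78f5510117471aa-Slash", @"[/\\]");
    sb.Append("$");
    // allowed to have prefix
    sb.Insert(0, @"^(?:.*?[/\\])?");
    return new Regex(sb.ToString(), RegexOptions.IgnoreCase);
}
Answer by David Růžička
On Windows 7 and later, using P/Invoke (without the 260-character limit):
// UNICODE_STRING for Rtl... method
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct UNICODE_STRING
{
    public ushort Length;
    public ushort MaximumLength;
    [MarshalAs(UnmanagedType.LPWStr)]
    string Buffer;
    public UNICODE_STRING(string buffer)
    {
        if (buffer == null)
            Length = MaximumLength = 0;
        else
            Length = MaximumLength = unchecked((ushort)(buffer.Length * 2));
        Buffer = buffer;
    }
}
// RtlIsNameInExpression method from NtDll.dll system library
public static class NtDll
{
    [DllImport("NtDll.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
    [return: MarshalAs(UnmanagedType.U1)]
    public extern static bool RtlIsNameInExpression(
        ref UNICODE_STRING Expression,
        ref UNICODE_STRING Name,
        [MarshalAs(UnmanagedType.U1)]
        bool IgnoreCase,
        IntPtr Zero
    );
}
public bool MatchMask(string mask, string fileName)
{
    // Expression must be uppercase for IgnoreCase == true (see MSDN for RtlIsNameInExpression)
    UNICODE_STRING expr = new UNICODE_STRING(mask.ToUpper());
    UNICODE_STRING name = new UNICODE_STRING(fileName);
    // Returns true when the file name matches the mask.
    return NtDll.RtlIsNameInExpression(ref expr, ref name, true, IntPtr.Zero);
}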