Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/334850/

Time: 2020-08-03 23:39:20  Source: igfitidea

C# How to replace Microsoft's Smart Quotes with straight quotation marks?

c# smart-quotes

Asked by

My post below asked what the curly quotation marks were and why my app wouldn't work with them. My question now is: how can I replace them when my program comes across them? How can I do this in C#? Are they special characters?

curly-quotation-marks-vs-square-quotation-marks-what-gives

Thanks

Answered by Mark Ransom

According to the Character Map application that comes with Windows, the Unicode values for the curly quotes are 0x201c and 0x201d. Replace those values with the straight quote 0x0022, and you should be good to go.

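// s is the string to clean; Replace returns a new string, so assign the result.
s = s.Replace('\u201c', '"');  // left double quotation mark
s = s.Replace('\u201d', '"');  // right double quotation mark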

Answered by Rob Kennedy

Note that what you have is inherently a corrupt CSV file. Indiscriminately replacing all typographer's quotes with straight quotes won't necessarily fix your file. For all you know, some of the typographer's quotes were supposed to be there, as part of a field's value. Replacing them with straight quotes might not leave you with a valid CSV file, either.

I don't think there is an algorithmic way to fix a file that is corrupt in the way you describe. Your time might be better spent investigating how you come to have such invalid files in the first place, and then putting a stop to it. Is someone using Word to edit your data files, for instance?
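
As a starting point for that investigation, it may help to list where the curly quotes occur before changing anything. The following is only a sketch: the class name SmartQuoteReport and its Find method are hypothetical, and it assumes the whole file fits in memory as one string and that lines are separated by '\n'.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not from the original answers): reports where
// curly quotes appear so they can be reviewed before any replacement.
public static class SmartQuoteReport
{
    // Returns the 1-based line and column of every curly single or double quote.
    // A '\r' before '\n' is counted as an ordinary column.
    public static List<(int Line, int Column, char Character)> Find(string text)
    {
        var hits = new List<(int Line, int Column, char Character)>();
        int line = 1, column = 0;
        foreach (char c in text)
        {
            if (c == '\n') { line++; column = 0; continue; }
            column++;
            if (c == '\u2018' || c == '\u2019' || c == '\u201c' || c == '\u201d')
                hits.Add((line, column, c));
        }
        return hits;
    }
}
```

Calling SmartQuoteReport.Find on the file's contents and printing each hit (for example with the code point formatted as ((int)hit.Character).ToString("X4")) shows whether a curly quote sits at a field boundary or inside a field's value.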

Answered by Matthew Ruston

When I encountered this problem I wrote an extension method for the String class in C#.

public static class StringExtensions
{
    public static string StripIncompatableQuotes(this string s)
    {
        if (!string.IsNullOrEmpty(s))
            return s.Replace('\u2018', '\'').Replace('\u2019', '\'').Replace('\u201c', '\"').Replace('\u201d', '\"');
        else
            return s;
    }
}

This simply replaces the silly 'smart quotes' with normal quotes.

[EDIT] Fixed to also support replacement of 'double smart quotes'.
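
For comparison, the same four replacements could also be written with regular-expression character classes instead of chained Replace calls. This is a sketch of an alternative, not the method from this answer; the class name RegexQuoteCleaner and the method name Straighten are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical alternative: one character class per kind of quote.
public static class RegexQuoteCleaner
{
    // The C# compiler turns the \u escapes into the actual characters,
    // so each pattern is a plain character class.
    private static readonly Regex SingleQuotes = new Regex("[\u2018\u2019]");
    private static readonly Regex DoubleQuotes = new Regex("[\u201c\u201d]");

    public static string Straighten(string s)
    {
        // Null and empty strings are returned unchanged.
        if (string.IsNullOrEmpty(s)) return s;
        s = SingleQuotes.Replace(s, "'");
        return DoubleQuotes.Replace(s, "\"");
    }
}
```

For only four characters the chained Replace calls are probably just as readable; the character classes mainly help if more quote variants need to be folded into each group.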

Answered by Dmitri Nesteruk

I have a whole great big... program... that does precisely this. You can rip out the script and use it at your leisure. It does all sorts of replacements, and is located at http://bitbucket.org/nesteruk/typografix

Answered by Dmitri Nesteruk

Try this for smart single quotes if the above doesn't work:

s = s.Replace("\u2018", "'");  // left single quotation mark
s = s.Replace("\u2019", "'");  // right single quotation mark

Try this as well for smart double quotes:

s = s.Replace("\u201c", "\"");  // left double quotation mark
s = s.Replace("\u201d", "\"");  // right double quotation mark

Answered by Nick van Esch

A more extensive listing of problematic Word characters:

if (buffer.IndexOf('\u2013') > -1) buffer = buffer.Replace('\u2013', '-');
if (buffer.IndexOf('\u2014') > -1) buffer = buffer.Replace('\u2014', '-');
if (buffer.IndexOf('\u2015') > -1) buffer = buffer.Replace('\u2015', '-');
if (buffer.IndexOf('\u2017') > -1) buffer = buffer.Replace('\u2017', '_');
if (buffer.IndexOf('\u2018') > -1) buffer = buffer.Replace('\u2018', '\'');
if (buffer.IndexOf('\u2019') > -1) buffer = buffer.Replace('\u2019', '\'');
if (buffer.IndexOf('\u201a') > -1) buffer = buffer.Replace('\u201a', ',');
if (buffer.IndexOf('\u201b') > -1) buffer = buffer.Replace('\u201b', '\'');
if (buffer.IndexOf('\u201c') > -1) buffer = buffer.Replace('\u201c', '\"');
if (buffer.IndexOf('\u201d') > -1) buffer = buffer.Replace('\u201d', '\"');
if (buffer.IndexOf('\u201e') > -1) buffer = buffer.Replace('\u201e', '\"');
if (buffer.IndexOf('\u2026') > -1) buffer = buffer.Replace("\u2026", "...");
if (buffer.IndexOf('\u2032') > -1) buffer = buffer.Replace('\u2032', '\'');
if (buffer.IndexOf('\u2033') > -1) buffer = buffer.Replace('\u2033', '\"');
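
The same mappings could also be kept in a lookup table and applied in a single pass, which avoids scanning the string once per character. The sketch below assumes the replacements listed above; the names WordCharacterCleaner and Clean are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical table-driven variant of the list above.
public static class WordCharacterCleaner
{
    // Same character-to-replacement pairs as the individual Replace calls.
    private static readonly Dictionary<char, string> Replacements = new Dictionary<char, string>
    {
        { '\u2013', "-" }, { '\u2014', "-" }, { '\u2015', "-" },
        { '\u2017', "_" },
        { '\u2018', "'" }, { '\u2019', "'" }, { '\u201a', "," }, { '\u201b', "'" },
        { '\u201c', "\"" }, { '\u201d', "\"" }, { '\u201e', "\"" },
        { '\u2026', "..." },
        { '\u2032', "'" }, { '\u2033', "\"" },
    };

    public static string Clean(string buffer)
    {
        if (string.IsNullOrEmpty(buffer)) return buffer;
        var result = new StringBuilder(buffer.Length);
        foreach (char c in buffer)
        {
            // Append the mapped text if there is one, otherwise the character itself.
            if (Replacements.TryGetValue(c, out string replacement))
                result.Append(replacement);
            else
                result.Append(c);
        }
        return result.ToString();
    }
}
```

Adding or changing a mapping then only means editing the table, at the cost of building a new string even when nothing needs replacing.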

Answered by pospi

I also have a program which does this; the source is in this file of CP-1252 Fixer. It additionally defines some mappings for converting characters within RTF strings whilst preserving all formatting, which may be useful to some.

It is also a complete mapping of all "smart quote" characters to their low-ASCII counterparts, entity codes and character references.
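
To illustrate what a mapping to character references can look like, here is a minimal sketch that rewrites the four curly quotes as decimal numeric character references such as &#8220;. It is not taken from CP-1252 Fixer; the names SmartQuoteEncoder and ToCharacterReferences are assumptions made for this example.

```csharp
using System.Text;

// Hypothetical illustration only; not the mapping used by CP-1252 Fixer.
public static class SmartQuoteEncoder
{
    public static string ToCharacterReferences(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        var result = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (c == '\u2018' || c == '\u2019' || c == '\u201c' || c == '\u201d')
                // Emit "&#" + decimal code point + ";", e.g. U+201C becomes &#8220;
                result.Append("&#").Append((int)c).Append(';');
            else
                result.Append(c);
        }
        return result.ToString();
    }
}
```

This keeps the typographic quotes recoverable in HTML output, whereas replacing them with straight quotes discards that information.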

Answered by cjbarth

The VB equivalent of what @Matthew wrote:

Imports System.Runtime.CompilerServices

Public Module StringExtensions

    <Extension()>
    Public Function StripIncompatableQuotes(BadString As String) As String
        If Not String.IsNullOrEmpty(BadString) Then
            Return BadString.Replace(ChrW(&H2018), "'").Replace(ChrW(&H2019), "'").Replace(ChrW(&H201C), """").Replace(ChrW(&H201D), """")
        Else
            Return BadString
        End If
    End Function
End Module

Answered by Barbara from Boston

To expand on Nick van Esch's popular answer, here is the code with the names of the characters in the comments.

if (buffer.IndexOf('\u2013') > -1) buffer = buffer.Replace('\u2013', '-'); // en dash
if (buffer.IndexOf('\u2014') > -1) buffer = buffer.Replace('\u2014', '-'); // em dash
if (buffer.IndexOf('\u2015') > -1) buffer = buffer.Replace('\u2015', '-'); // horizontal bar
if (buffer.IndexOf('\u2017') > -1) buffer = buffer.Replace('\u2017', '_'); // double low line
if (buffer.IndexOf('\u2018') > -1) buffer = buffer.Replace('\u2018', '\''); // left single quotation mark
if (buffer.IndexOf('\u2019') > -1) buffer = buffer.Replace('\u2019', '\''); // right single quotation mark
if (buffer.IndexOf('\u201a') > -1) buffer = buffer.Replace('\u201a', ','); // single low-9 quotation mark
if (buffer.IndexOf('\u201b') > -1) buffer = buffer.Replace('\u201b', '\''); // single high-reversed-9 quotation mark
if (buffer.IndexOf('\u201c') > -1) buffer = buffer.Replace('\u201c', '\"'); // left double quotation mark
if (buffer.IndexOf('\u201d') > -1) buffer = buffer.Replace('\u201d', '\"'); // right double quotation mark
if (buffer.IndexOf('\u201e') > -1) buffer = buffer.Replace('\u201e', '\"'); // double low-9 quotation mark
if (buffer.IndexOf('\u2026') > -1) buffer = buffer.Replace("\u2026", "..."); // horizontal ellipsis
if (buffer.IndexOf('\u2032') > -1) buffer = buffer.Replace('\u2032', '\''); // prime
if (buffer.IndexOf('\u2033') > -1) buffer = buffer.Replace('\u2033', '\"'); // double prime

Answered by Asif Ghanchi

This worked for me; you can try the code below:

string replacedstring = ("your string with smart quotes").Replace('\u201d', '\'');

Thanks!
