Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/24048400/

Date: 2020-09-12 03:22:15  Source: igfitidea

Function to trim leading and trailing whitespace in vba

Tags: regex, excel, vba, trim

Asked by Marcus Widerberg

I have checked quite a few suggestions for trimming leading and trailing whitespace in VBA (Excel, incidentally).

I have found this solution, but it also trims å ä ö (also caps), and I am too weak in regex to see why:

Function MultilineTrim (Byval TextData)
    Dim textRegExp
    Set textRegExp = new regexp
    textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"
    textRegExp.Global = False
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    If textRegExp.Test (TextData) Then
      MultilineTrim = textRegExp.Replace (TextData, "")
    Else
      MultilineTrim = ""
    End If
End Function

(This is from an answer here on SO, where the user account seems inactive: https://stackoverflow.com/a/1606433/3701019)

So, I would love it if anyone could help with either (a) an alternative solution to the problem or (b) a version of the regexp / code that would not strip out (single) å ä ö characters.

Thanks for any help!

Details: Problem

  • The built-in Trim functions in VBA do not handle all whitespace characters (tabs, for instance), so a custom trim is needed.
  • The best solution I found is the one above, but it also removes single å ä ö characters.
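To illustrate the first point, here is a minimal sketch (the procedure name DemoBuiltInTrim is made up for this example) showing that the built-in Trim removes only ordinary spaces and leaves tabs in place:

```vba
Sub DemoBuiltInTrim()
    Dim s As String
    ' space, tab, "text", tab, space: 8 characters in total
    s = " " & vbTab & "text" & vbTab & " "

    Debug.Print Len(s)          ' 8
    ' Trim removes only the two outer spaces; both tabs remain
    Debug.Print Len(Trim(s))    ' 6
End Sub
```

Run from the Immediate window with `DemoBuiltInTrim`.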

My context is an XML parser in VBA, which receives chunks of XML to parse. Sometimes it gets just a single character from the stream, which may be å, ä or ö, and this function then strips it away completely.
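A likely explanation, assuming the chunk really is a single character: the pattern in MultilineTrim contains two mandatory \S{1} tokens, so it can only match text with at least two non-whitespace characters. For a one-character string, Test returns False and the function takes the Else branch, returning an empty string. This would affect any single non-whitespace character, not just particular letters. The procedure name below is hypothetical and only demonstrates the Test result:

```vba
Sub DemoSingleCharacter()
    Dim re As Object
    ' Late binding, so no library reference is required
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}"

    Debug.Print re.Test("ab")   ' True: two non-whitespace characters
    Debug.Print re.Test("a")    ' False: only one, so MultilineTrim would return ""
End Sub
```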

I would be happy to clarify or edit this question, of course.

FYI: I have shared exactly what I did based on the answers, see below.

Accepted answer by Ron Rosenfeld

For a regex I would use:

^[\s\xA0]+|[\s\xA0]+$

This will match the "usual" whitespace characters as well as the NBSP, commonly found in HTML documents.

The VBA code would look something like the following, where S is the string to trim:

Dim RE as Object, ResultString as String
Set RE = CreateObject("vbscript.regexp")
RE.MultiLine = True
RE.Global = True
RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
ResultString = RE.Replace(S, "")

And an explanation of the regex:

Trim whitespace at the start and the end of each line
-----------------------------------------------------

^[\s\xA0]+|[\s\xA0]+$

Options:  ^$ match at line breaks

Match this alternative (attempting the next alternative only if this one fails) «^[\s\xA0]+»
   Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
Or match this alternative (the entire match attempt fails if this one fails to match) «[\s\xA0]+$»
   Match a single character present in the list below «[\s\xA0]+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      A “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
      The character with position 0xA0 (160 decimal) in the character set «\xA0»
   Assert position at the end of a line (at the end of the string or before a line break character) «$»

Created with RegexBuddy
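
For reuse, the snippet from this answer could be wrapped in a function along these lines. This is only a sketch: the names TrimLines and DemoTrimLines are made up for illustration, and late binding is assumed so that no library reference is needed.

```vba
Function TrimLines(ByVal S As String) As String
    Dim RE As Object
    Set RE = CreateObject("VBScript.RegExp")   ' late binding
    RE.MultiLine = True
    RE.Global = True
    RE.Pattern = "^[\s\xA0]+|[\s\xA0]+$"
    ' Remove every leading/trailing whitespace run that was matched
    TrimLines = RE.Replace(S, "")
End Function

Sub DemoTrimLines()
    Dim sample As String
    ' tab, space, "x", space, NBSP (character 160)
    sample = vbTab & " x " & ChrW(160)
    Debug.Print Len(TrimLines(sample))   ' 1
    ' A single character is left untouched
    Debug.Print Len(TrimLines("x"))      ' 1
End Sub
```

Creating the RegExp object on every call keeps the sketch self-contained; for heavy use it may be worth creating it once and reusing it.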

Answer by Richard Vivian

You can create a custom function that strips out exactly the characters you don't want.

Private Function CleanMyString(sInput As String) As String
   Dim sResult As String

   ' Remove leading and trailing spaces
   sResult = Trim(sInput)
   ' Remove other characters that you don't want (line feed, carriage return, tab)
   sResult = Replace(sResult, Chr(10), "")
   sResult = Replace(sResult, Chr(13), "")
   sResult = Replace(sResult, Chr(9), "")

   ' Return the cleaned string (the assignment was missing in the original snippet)
   CleanMyString = sResult
End Function

This does not use regex, though. Not sure if that's OK for your requirements?

Answer by AnotherParker

Try this:

Function MultilineTrim (Byval TextData)
    Dim textRegExp
    Set textRegExp = new regexp
    textRegExp.Pattern = "(^[ \t]+|[ \t]+$)"
    textRegExp.Global = True
    textRegExp.IgnoreCase = True
    textRegExp.Multiline = True

    MultilineTrim = textRegExp.Replace (TextData, "")
End Function

Answer by Marcus Widerberg

After consulting with Stack Exchange people on how to do this, I am adding the edit of the question as my own answer instead. Here it is:

Answer / Used code

Thanks to the answer(s), this is what I will be using:

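' Module-level regex shared by both procedures (assumed; the original snippet did not show the declaration).
' Early binding requires a reference to "Microsoft VBScript Regular Expressions 5.5".
Private textRegExp As RegExp

Function MultilineTrim(ByVal TextData)
    ' Create the regex on first use (added so the function works without a separate init call)
    If textRegExp Is Nothing Then InitRegExp
    MultilineTrim = textRegExp.Replace(TextData, "")

'    If textRegExp.Test(TextData) Then
'        MultilineTrim = textRegExp.Replace(TextData, "")
'    Else
'        MultilineTrim = "" ' ??
'    End If
End Function

Private Sub InitRegExp()
    Set textRegExp = New RegExp
    'textRegExp.Pattern = "\s{0,}(\S{1}[\s,\S]*\S{1})\s{0,}" 'this removes å ä ö - bug!
    'textRegExp.Global = False

    'textRegExp.Pattern = "(^[ \t]+|[ \t]+$)" ' leaves a line break at start
    textRegExp.Pattern = "^[\s\xA0]+|[\s\xA0]+$" ' works! Ron Rosenfeld's submission

    textRegExp.Global = True

    textRegExp.IgnoreCase = True
    textRegExp.MultiLine = True
End Sub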

Thanks again all! (nod to Ron Rosenfeld)

Answer by Sebastian Viereck

A refactored and improved version of Richard Vivian's function:

Function cleanMyString(ByVal sInput)
    ' ByVal keeps the caller's variable unchanged (VBA passes arguments ByRef by default)
    ' Remove leading and trailing spaces
    sInput = Trim(sInput)
    ' Remove other characters that you don't want (line feed, carriage return, tab)
    sInput = Replace(sInput, Chr(10), "")
    sInput = Replace(sInput, Chr(13), "")
    sInput = Replace(sInput, Chr(9), "")
    cleanMyString = sInput
End Function

Answer by Mark Karam

I would call Trim after replacing all the other characters. This way, if there are spaces next to the other characters, they will also be removed.
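
The ordering suggested here could look roughly like this. It is a sketch only: the names CleanThenTrim and DemoCleanThenTrim are hypothetical, and the three removed characters are the same ones used in the earlier answers.

```vba
Function CleanThenTrim(ByVal sInput As String) As String
    Dim sResult As String
    ' First remove line feeds, carriage returns and tabs
    sResult = Replace(sInput, Chr(10), "")
    sResult = Replace(sResult, Chr(13), "")
    sResult = Replace(sResult, Chr(9), "")
    ' Then trim, so spaces exposed by the removals are stripped too
    CleanThenTrim = Trim(sResult)
End Function

Sub DemoCleanThenTrim()
    ' "text", space, tab: trimming first would leave the space before the tab
    Debug.Print Len(CleanThenTrim("text " & vbTab))   ' 4
End Sub
```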