vba 如何使用excel VBA脚本删除某些字符

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/8195996/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 14:30:02  来源:igfitidea点击:

How to delete certain characters using excel VBA script

regexexcelvbaexcel-vba

提问by SammyJ

The following VBA script gets rid of unwanted characters but unfortunately only NUMBERS.

以下 VBA 脚本删除了不需要的字符,但不幸的是只有 NUMBERS。

Could you please assist me, It needs to rid letters too as in the table example(bolded) below.

你能帮我吗,它也需要删除字母,如下表示例(粗体)。

the Range could be anywhere from 0 to 15000+ cells

范围可以是从 0 到 15000+ 个单元格的任何地方

.....................................................

………………………………………………………………………………………………………………………………………………………… ...

anew ayork atimes a

纽约时间

bnew byork btimes b

bb纽约bb

cnew cyork cwatertown cny c

çç纽约ç水城Ç纽约Ç

6ave 6new 6york 6city 6

6AVE 66纽约6城市6

......................................................

……………………………………………………………………………………………………………………………………………………………… ....

The VBA script:

VBA 脚本:

Sub Remove()

Application.ScreenUpdating = False
Dim R As RegExp, C As Range
For Each C In Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row)
If R Is Nothing Then
Set R = New RegExp
R.Global = True
R.Pattern = "\D"
C.Offset(0, 1) = R.Replace(C, "")
R.Pattern = "\d"
C = R.Replace(C, "")
End If
Set R = Nothing
Next C
Application.ScreenUpdating = True
End Sub

EDIT1

编辑1

Sub Remove()
Call BackMeUp

Dim cell As Range
Dim RE As Object
Dim Whitecell As Range
Dim strFind As String, strReplace As String
Dim lLoop As Long
Dim Loop1 As Long

    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False
Range("A3:L3").Select
Selection.Delete Shift:=xlUp
'--------------------------------------------------Remove JUNK
Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
On Error Resume Next
For lLoop = 1 To 100
    strFind = Choose(lLoop, "~??", "~?", "~.", "~!", "~?", "~-", "~§", "~$", "~%", "~&", "~/", "~\", "~,", "~(", "~)", "~=", "~www", "~WWW", "~.com", "~.net", "~.org", "~{", "~}", "~[", "~]", "~?", "~?", "~?", "~:", "~;", "~_", "~μ", "~@", "~#", "~'", "~|", "~", "~?", "~?", "~ü", "~?", "~ü", "~?", "~+", "~<", "~>", "~nbsp", "~a", "~|", "~?", "~?", "~–", "~?", "~?")
    strReplace = Choose(lLoop, " ")

    Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
    ReplaceFormat:=False

Next lLoop
'--------------------------------------------------Remove Numbers
Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
On Error Resume Next
For Loop1 = 1 To 40
    strFind = Choose(lLoop, "~1", "~2", "~3", "~4", "~5", "~6", "~7", "~8", "~9", "~0")
    strReplace = Choose(Loop1, " ")

    Selection.Replace What:=strFind, Replacement:=strReplace, LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
    ReplaceFormat:=False

Next Loop1
'--------------------------------------------------Remove Single Letters
Set RE = CreateObject("vbscript.regexp")
RE.Global = True
RE.MultiLine = True
RE.Pattern = "^[a-z]\b | \b[a-z]\b"

For Each cell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row)
    cell.Value = RE.Replace(cell.Value, "")

Next

'--------------------------------------------------Remove WHITE SPACES

For Each Whitecell In Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row)
    Whitecell = WorksheetFunction.Trim(Whitecell)
Next Whitecell

'--------------------------------------------------Remove DUPES

ActiveSheet.Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).RemoveDuplicates Columns:=1, Header:=xlYes
ActiveWorkbook.Worksheets(ActiveSheet.Name).Sort.SortFields.Clear

'--------------------------------------------------Copy to B - REPLACE ALL WHITE IN B

Range("A3:A" & Cells(Rows.Count, 1).End(xlUp).Row).Select
    Selection.Copy
    Range("B3:B" & Cells(Rows.Count, 1).End(xlUp).Row).Select
    ActiveSheet.Paste
    Selection.Replace What:=" ", Replacement:="", LookAt:=xlPart, _
        SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
        ReplaceFormat:=False
        Columns("A:L").EntireColumn.AutoFit

'--------------------------------------------------END
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True
    Range("a1").Select
End Sub

回答by aevanko

EDIT(deleted original answer as it was not applicable after recieving more info on what you wanted, but leaving advice)

编辑删除原始答案,因为它在收到有关您想要的更多信息后不适用,但留下建议)

  • You are creating/destroying the RE object every cell, which is expensive/unnessessary
    • If other users will use the function, create the object inside the code instead of adding references
    • There is no need to set the regex object to nothing at the end - variables are released from memory at the end of the function automatically
    • Improving your variable naming and using proper indentation could help improve readability and make it easier to edit
    • Add the multiline option in case your cells have line breaks inside them.
    • You might want to use a variant array if working with a lot of cells
  • 您正在创建/销毁每个单元格的 RE 对象,这是昂贵的/不必要的
    • 如果其他用户将使用该函数,请在代码内部创建对象,而不是添加引用
    • 不需要在最后将正则表达式对象设置为空 - 变量会在函数结束时自动从内存中释放
    • 改进变量命名和使用适当的缩进有助于提高可读性并使其更易于编辑
    • 如果您的单元格内部有换行符,请添加多行选项。
    • 如果处理大量单元格,您可能想要使用变体数组

UDPATE 2

UDPATE 2

Based one the comments below, here is how to get only occurances of two or more lowercase characters and the single spaces in-between. Instead of replacing what you DON'Twant, I personally think a good way is to extract what you DOwant. I have shared the below function quite a bit on this site as it's really useful. Here's an example of how to call it on the contents of Column A and put the results in Column B.

基于下面的评论之一,这里是如何只出现两个或更多小写字符和中间的单个空格。而不是取代你什么的不要想,我个人认为一个好方法是什么提取你DO想。我在这个网站上分享了很多下面的功能,因为它真的很有用。这是一个示例,说明如何在 A 列的内容上调用它并将结果放在 B 列中。

Sub test()

' Show how to run this on cells in A and transpose result in B
Dim varray As Variant
Dim i As Long

Application.ScreenUpdating = False
varray = Range("A1:A15000").Value

For i = 1 To UBound(varray, 1)
    varray(i, 1) = RegexExtract(varray(i, 1), "([a-z]{2,})", " ")
Next

Range("B1").Resize(UBound(varray, 1)).Value = _
Application.WorksheetFunction.Transpose(varray)

Application.ScreenUpdating = True

End Sub

And make sure this is in the module:

并确保这是在模块中:

Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String, _
                      Optional seperator As String = "") As String

Dim i As Long
Dim j As Long
Dim result As String
Dim allMatches As Object
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")

RE.Pattern = extract_what
RE.Global = True
Set allMatches = RE.Execute(text)

For i = 0 To allMatches.Count - 1
    For j = 0 To allMatches.Item(i).submatches.Count - 1
        result = result & seperator & allMatches.Item(i).submatches.Item(j)
    Next
Next

If Len(result) <> 0 Then
    result = Right$(result, Len(result) - Len(seperator))
End If

RegexExtract = result

End Function

回答by phatfingers

Your "R.Pattern = "\d" is the only line you need to change. The "\d" is a regular expression describing a "digit".

您的 "R.Pattern = "\d" 是唯一需要更改的行。"\d" 是描述“数字”的正则表达式。

I would suggest changing "\d" to "^[a-z0-9] | [a-z0-9]\b" as a starting point.

我建议将 "\d" 更改为 "^[a-z0-9] | [a-z0-9]\b" 作为起点。

回答by brettdj

I rewrote your code below so that

我在下面重写了您的代码,以便

  • The RegExp is only created once. Your current code creates a new object then destroys it for each cell being tested as it is inside your loop
  • The code below uses a variant array to minimise process time when manipulating each cell value. The constant VbNullStringis slightly quicker than "".
  • you case use the simpler \w in a regex to match any a-z0-9
  • late binding on the RegExp object avoids the need to ak a third party to set a reference, setting ignore case to true makes your replacement case insenstive

         Sub Remove()
         Dim R As Object
         Dim C As Range
         Dim lngrow As Long
         Dim rng1 As Range
         Dim X
         Set R = CreateObject("vbscript.regexp")
         With R
           .Global = True
           .Pattern = "^\w\s|\b\w\b"
           .ignoreCase = True
         End With
         Application.ScreenUpdating = False
         Set rng1 = Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row)
         X = rng1.Value2
         For lngrow = 1 To UBound(X, 1)
           X(lngrow, 1) = R.Replace(X(lngrow, 1), vbNullString)
         Next lngrow
         rng1.Value2 = X
         Application.ScreenUpdating = True
          End Sub
    
  • RegExp 只创建一次。您当前的代码创建一个新对象,然后为每个被测试的单元格销毁它,因为它在您的循环中
  • 下面的代码使用变体数组来最小化处理每个单元格值时的处理时间。常数VbNullString比“”稍快。
  • 你在正则表达式中使用更简单的 \w 来匹配任何 a-z0-9
  • RegExp 对象上的后期绑定避免了需要第三方来设置引用,将 ignore case 设置为 true 会使您的替换大小写不敏感

         Sub Remove()
         Dim R As Object
         Dim C As Range
         Dim lngrow As Long
         Dim rng1 As Range
         Dim X
         Set R = CreateObject("vbscript.regexp")
         With R
           .Global = True
           .Pattern = "^\w\s|\b\w\b"
           .ignoreCase = True
         End With
         Application.ScreenUpdating = False
         Set rng1 = Range("A1:A" & Cells(Rows.Count, 1).End(xlUp).Row)
         X = rng1.Value2
         For lngrow = 1 To UBound(X, 1)
           X(lngrow, 1) = R.Replace(X(lngrow, 1), vbNullString)
         Next lngrow
         rng1.Value2 = X
         Application.ScreenUpdating = True
          End Sub