VBA: How to remove comma-separated duplicates inside Excel cells?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/21533574/

Time: 2020-09-12 01:42:41  Source: igfitidea

How to remove duplicates separated by a comma inside cells in excel?

Tags: excel, vba, duplicates

Asked by user3263960

I was handed a very long Excel file (up to 11000 rows and 7 columns) that has a lot of repeated data inside individual cells. I am looking for a macro to get rid of it but couldn't find one.

Example of one such cell:

Ciencias de la Educación,Educación,Pedagogía,Ciencias de la Educación,Educación,Pedagogía

It should look like:

Ciencias de la Educación,Educación,Pedagogía

How can I get rid of the thousands of repeats (not to mention the extra, orphaned, commas)?

Accepted answer by Dmitry Pavliv

This code runs in 6 seconds on my machine and in 2 seconds on @SiddharthRout's machine :) (with data in cells A1:G20000: 20000 x 7 = 140000 non-empty cells).

Sub test2()
    Dim c, arr, el, data, it
    Dim i As Long, j As Long
    Dim start As Date
    Dim targetRange As Range

    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    Application.ScreenUpdating = False

    Set targetRange = Range("A1:G20000")

    ' Read the whole range into a 2-D array once instead of reading cell by cell
    data = targetRange.Value

    start = Now
    For i = LBound(data) To UBound(data)
        For j = LBound(data, 2) To UBound(data, 2)
            c = data(i, j)
            dict.RemoveAll              ' start with an empty dictionary for every cell
            arr = Split(c, ",")
            For Each el In arr
                ' Adding an existing key raises an error, so duplicates are skipped
                On Error Resume Next
                dict.Add Trim(el), Trim(el)
                On Error GoTo 0
            Next
            ' Rebuild the cell text from the unique items
            c = ""
            For Each it In dict.Items
               c = c & it & ","
            Next
            If c <> "" Then c = Left(c, Len(c) - 1)   ' drop the trailing comma
            data(i, j) = c
        Next j
    Next i
    ' Write the cleaned array back in one operation
    targetRange.Value = data
    Application.ScreenUpdating = True

    MsgBox "Working time: " & Format(Now - start, "hh:nn:ss")

End Sub


You can make this code slightly faster by changing the following two lines

Dim dict As Object
Set dict = CreateObject("Scripting.dictionary")

to

Dim dict As New Dictionary

after adding a reference to the library: go to Tools -> References and select "Microsoft Scripting Runtime".
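
As a rough sketch of what the early-bound variant could look like when pulled into a helper, assuming the "Microsoft Scripting Runtime" reference is already ticked: the function name DedupeList and its parameter name are hypothetical and not part of the original answer. It also skips empty pieces, which is one possible way to deal with the orphaned commas mentioned in the question.

```vba
' Requires Tools -> References -> "Microsoft Scripting Runtime" (early binding).
' DedupeList is a hypothetical helper name, not part of the original answer.
Function DedupeList(ByVal cellText As String) As String
    Dim dict As New Scripting.Dictionary
    Dim part As Variant
    Dim key As String

    For Each part In Split(cellText, ",")
        key = Trim$(part)
        ' Skip empty pieces left by orphaned commas; keep the first occurrence only
        If Len(key) > 0 Then
            If Not dict.Exists(key) Then dict.Add key, Empty
        End If
    Next part

    ' Keys come back in insertion order, so the original order is preserved
    If dict.Count > 0 Then DedupeList = Join(dict.Keys, ",")
End Function
```

Inside the loops of test2, the per-cell work could then be reduced to something like data(i, j) = DedupeList(CStr(data(i, j))), at the cost of creating a new dictionary for every cell.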

Answer by Siddharth Rout

Here is a basic example

Sub Sample()
    Dim sString As String
    Dim MyAr As Variant
    Dim Col As New Collection
    Dim itm
    Dim i As Long

    sString = "Ciencias de la Educación,Educación,Pedagogía,Ciencias de la Educación,Educación,Pedagogía"

    MyAr = Split(sString, ",")

    For i = LBound(MyAr) To UBound(MyAr)
        On Error Resume Next
        Col.Add Trim(MyAr(i)), CStr(Trim(MyAr(i)))
        On Error GoTo 0
    Next i

    sString = ""

    For Each itm In Col
        sString = sString & "," & itm
    Next

    sString = Mid(sString, 2)

    Debug.Print sString
End Sub
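
The same Collection idea can be packaged as a function so it can be called once per cell. This is only a sketch: the name DedupeCsv and its parameter are hypothetical, and skipping empty pieces is an added assumption about how orphaned commas should be handled.

```vba
' DedupeCsv is a hypothetical name; it wraps the Collection approach shown above.
Function DedupeCsv(ByVal cellText As String) As String
    Dim seen As New Collection
    Dim parts() As String
    Dim piece As String
    Dim result As String
    Dim i As Long
    Dim itm As Variant

    parts = Split(cellText, ",")
    For i = LBound(parts) To UBound(parts)
        piece = Trim$(parts(i))
        If Len(piece) > 0 Then
            ' Adding a key that already exists raises an error, which is ignored here
            On Error Resume Next
            seen.Add piece, piece
            On Error GoTo 0
        End If
    Next i

    ' Join the unique items, then strip the leading comma
    For Each itm In seen
        result = result & "," & itm
    Next itm

    DedupeCsv = Mid$(result, 2)
End Function
```

Placed in a standard module, it could be called from VBA or from a worksheet formula such as =DedupeCsv(A1). Note that Collection keys are compared case-insensitively, so "Educación" and "educación" would count as the same item here.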

EDIT

Tried and tested in Excel 2010 with A1:G20000 filled with Ciencias de la Educación,Educación,Pedagogía,Ciencias de la Educación,Educación,Pedagogía

Time Taken: 2 Seconds

Code

Sub Sample()
    Dim sString As String
    Dim MyAr As Variant, rngAr
    Dim Col As New Collection
    Dim itm
    Dim i As Long, j As Long, k As Long
    Dim rng As Range

    Debug.Print "StartTime: " & Now

    Set rng = ThisWorkbook.Sheets("Sheet1").Range("A1:G20000")
    rngAr = rng.Value

    For i = LBound(rngAr) To UBound(rngAr)
        For j = LBound(rngAr, 2) To UBound(rngAr, 2)
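            ' Start with an empty collection for each cell; otherwise items from earlier cells carry over
            Set Col = New Collection
            MyAr = Split(rngAr(i, j), ",")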

            For k = LBound(MyAr) To UBound(MyAr)
                On Error Resume Next
                Col.Add Trim(MyAr(k)), CStr(Trim(MyAr(k)))
                On Error GoTo 0
            Next k

            sString = ""

            For Each itm In Col
                sString = sString & "," & itm
            Next

            sString = Mid(sString, 2)

            rngAr(i, j) = sString
        Next j
    Next i

    ThisWorkbook.Sheets("Sheet1").Range("A1").Resize(20000, 7).Value = rngAr

    Debug.Print "EndTime: " & Now
End Sub
