Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow post: http://stackoverflow.com/questions/9638260/

Time: 2020-09-08 12:44:17  Source: igfitidea

Is there a way to check for duplicate values in Excel WITHOUT using the CountIf function?

Tags: excel, vba, excel-vba, duplicates

Asked by phan

A lot of the solutions here on SO involve using CountIf to find duplicates. When I have a list of 100,000+ values, however, it often takes minutes for CountIf to search for duplicates.

Is there a quicker way to search for duplicates within an Excel column WITHOUT using CountIf?

Thanks!

EDIT #1:
After reading the comments and replies I realize I need to go into greater detail. Let's pretend I'm a birdwatcher, and after I return from a birdwatching trip I input anywhere from 1 to 25 or 50 new birds that I saw on my trip into my "Master List of Birds Seen". This is really a dynamically growing list, and with each addition I want to make sure I'm not duplicating something that already exists in my list.

So, in column A of my file are the names of the birds. Column B-M might contain other attributes of the birds. I want to know if a bird that I just added in column A after my latest birdwatching trip ALREADY exists somewhere ELSE in my list. And, if it does, I would manually merge the data of the 2 entries and throw away some and keep some after careful review. I clearly don't want to have duplicate entries of the same bird in my database.

So, ultimately I want some indication that there is or isn't a duplicate somewhere else, and if there is duplicate please tell me what row to look in (or highlight or color both of the duplicates).

Accepted answer by lori_m

If you are using Excel 2007 or later (which is likely, given the 100,000+ values), you can choose:

Home Tab | Conditional Formatting > Highlight Cell Rules > Duplicate Values...

Right-click a highlighted cell and filter by selected cell color to show just the duplicates (be aware however this can be slow with conditional formatting).

Alternatively, run this code and then filter for colored cells; it takes only about a second on 100,000 cells:

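Sub HighlightDupes()
    ' Colors the font of every duplicated value in the current selection red.
    ' Assumes a single-column selection of at least two cells.

    Dim i As Long, dic As Variant, v As Variant

    Application.ScreenUpdating = False
    Set dic = CreateObject("Scripting.Dictionary")

    ' First pass: store the position of each value's first occurrence;
    ' a value that is seen again is flagged by overwriting that position with "".
    i = 1
    For Each v In Selection.Value2
        If dic.exists(v) Then dic(v) = "" Else dic.Add v, i
        i = i + 1
    Next v

    ' Second pass: make everything red, then turn values seen only once back to black.
    Selection.Font.Color = 255
    For Each v In dic
        If dic(v) <> "" Then Selection(dic(v)).Font.Color = 0
    Next v

    Application.ScreenUpdating = True
End Sub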
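
The question also asks which row a duplicate lives in. The same dictionary idea can be adapted to report that. What follows is only a sketch under assumptions that are not part of the original answer: the names are in column A of the active sheet, matching ignores case and surrounding spaces, and the results are written to the Immediate window. The procedure name ListDuplicateRows is made up for illustration.

```vba
' Sketch only: lists each duplicate in column A of the active sheet together
' with the row where the same value was first seen. The column, the matching
' rules and the Debug.Print output are illustrative assumptions.
Sub ListDuplicateRows()
    Dim data As Variant, dic As Object
    Dim lastRow As Long, r As Long, key As String

    lastRow = ActiveSheet.Cells(ActiveSheet.Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub             ' a single cell cannot hold a duplicate

    ' Read the column into an array once; cell-by-cell reads are much slower
    data = ActiveSheet.Range("A1:A" & lastRow).Value2
    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = vbTextCompare          ' treat "Magpie" and "magpie" as equal

    For r = 1 To lastRow
        If Not IsError(data(r, 1)) Then      ' skip cells holding #N/A and similar
            key = Trim$(CStr(data(r, 1)))
            If Len(key) > 0 Then
                If dic.Exists(key) Then
                    Debug.Print "Row " & r & " duplicates row " & dic(key) & ": " & key
                Else
                    dic.Add key, r           ' remember the first row for this value
                End If
            End If
        End If
    Next r
End Sub
```

Run it after adding new birds and open the Immediate window (Ctrl+G in the VBA editor) to see the row pairs to review. Because every value is looked up once in the dictionary, the work grows roughly linearly with the list length instead of quadratically as with a CountIf per row.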

Addendum:

To select only duplicate values without code or formulas, I have found this method useful:

Data Tab | Advanced Filter...Filter in Place, Unique Records Only, OK.

Now select the range of unique values and press Alt+; (Goto Special... Visible cells only). With this selection in place, clear the filter and you will see that all unselected cells are duplicates. You can then press Ctrl+9 (Hide Rows) to hide the selected rows and show just the duplicates. These rows can be copied to another sheet if needed or marked with an "X".

Answer by Siddharth Rout

The fastest way that I know of (in case you are using Excel 2007/2010/2011) is to use Data (In Ribbon) | Remove Duplicates to find the total number of duplicates OR to remove duplicates. You might want to move the data to a temp sheet before you test this.

The 2nd fastest way is to use Countif. Now Countif can be used in many ways to find duplicates. Here are two main ways.

1) Inserting a new column next to the data, putting the formula in it, and simply copying it down.

2) Using Countif in Conditional formatting to highlight cells which are duplicates. For more details, please see this link.

suggestions for a macro to find duplicates in a SINGLE column

EDIT:

My Apologies :)

Countif is the 3rd fastest way!

The 2nd fastest way is to use Pivot Tables ;)

What exactly is your main purpose in finding duplicates? Do you want to delete them? Or do you want to highlight them? Or something else?

FOLLOWUP

Seems like I made a typo in the formula. Yes, for a large number of rows, CountIf does take minutes, as you suggested.

Let me see if I can come up with some VBA code to suit your exact needs.

Sid

Answer by assylias

You can use VBA - the following function returns a list of unique entries within a list of 100,000 in less than a second. Usage: select a range, type the formula (=getUniqueListFromRange(YourRange)) and validate with CTRL+SHIFT+ENTER.

Public Function getUniqueListFromRange(parRange As Range) As Variant
' Returns a (1 to n,1 to 1) array with all the values without duplicates

  Dim i As Long
  Dim j As Long
  Dim locKey As Variant
  Dim locData As Variant
  Dim locUniqueDict As Variant
  Dim locUniqueList As Variant

  On Error GoTo error_handler
  locData = Intersect(parRange.Parent.UsedRange, parRange)

  Set locUniqueDict = CreateObject("Scripting.Dictionary")

  On Error Resume Next
  For i = 1 To UBound(locData, 1)
    For j = 1 To UBound(locData, 2)
      locKey = UCase(locData(i, j))
      If locKey <> "" Then locUniqueDict.Add locKey, locData(i, j)
    Next j
  Next i

  If locUniqueDict.Count > 0 Then
    ReDim locUniqueList(1 To locUniqueDict.Count, 1 To 1) As Variant
    i = 1
    For Each locKey In locUniqueDict
      locUniqueList(i, 1) = locUniqueDict(locKey)
      i = i + 1
    Next
    getUniqueListFromRange = locUniqueList
  End If

error_handler:         'Empty range

End Function

Answer by brettdj

Preventing Duplicates with Data Validation
You can use Data Validation to prevent yourself from entering duplicate bird names. See Debra Dalgleish's site here.

Handling existing duplicates
My free Duplicate Master add-in will let you

  • Select
  • Colour
  • List
  • Delete

duplicates.

But more importantly, it will let you run more complex matching than exact strings, i.e.

  • Case Insensitive / Case Sensitive searches (sample below)
  • Trim/Clean data
  • Remove all blank spaces (including CHAR(160)) see the " mapgie" and "magpie" example below
  • Run regular expression matches (for example, the sample below replaces s$ with "" to remove plurals)
  • Match on any combination of columns (ie Column A, all columns, Column A&B etc)

enter image description here

Answer by datatoo

You do not mention what you want to do when you find them. If you merely want to see where they are...

Sub HighLightCells()
   ActiveSheet.UsedRange.Cells.FormatConditions.Delete
   ActiveSheet.UsedRange.Cells.FormatConditions.Add Type:=xlCellValue, Operator:=xlEqual,  Formula1:=ActiveCell
   ActiveSheet.UsedRange.Cells.FormatConditions(1).Interior.ColorIndex = 4
End Sub

Answer by user2573063

I'm surprised that no one has mentioned the RemoveDuplicates method.

ActiveSheet.Range("A:A").RemoveDuplicates Columns:=1

This will simply remove any duplicate entries on the active worksheet in column A. It takes milliseconds to run (tested with 200k rows). Mind you, this will strictly delete all the duplicate entries. Although that isn't how the original question was worded, I do believe that this still serves your purpose.
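
Since RemoveDuplicates deletes in place and the asker wants to review and merge entries by hand, it may be safer to run it on a copy. The following is a minimal sketch under assumptions not stated in the answer: the list is in column A of the active worksheet, row 1 holds data rather than a header, and the procedure name UniqueCopyOfColumnA is made up for illustration.

```vba
' Sketch only: copies column A to a new scratch sheet and removes duplicates
' there, so the original list is left untouched.
' Assumes the active sheet is a worksheet and that row 1 is data, not a header.
Sub UniqueCopyOfColumnA()
    Dim src As Worksheet, tmp As Worksheet

    Set src = ActiveSheet
    Set tmp = src.Parent.Worksheets.Add(After:=src)   ' scratch sheet in the same workbook

    src.Range("A:A").Copy Destination:=tmp.Range("A1")
    tmp.Range("A:A").RemoveDuplicates Columns:=1, Header:=xlNo
End Sub
```

Comparing the row counts of the two sheets afterwards gives the number of duplicates that were removed, without losing any of the original rows.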

Answer by polaco

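Sort the range and in the next column put =IF(A2=A1;1;IF(A2=A3;1;0)) (use commas instead of semicolons if your regional settings use a comma as the list separator).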

"1" will be displayed for duplicates.

Answer by Baala

One simple way of finding unique values is to use the Advanced Filter, filter for unique records only, and copy and paste the result into another sheet, because once the filter is removed you get the whole data set back with the duplicates still in it.
