Java: function to remove duplicate characters in a string

Question

Asked by Jony

The following code is trying to remove any duplicate characters in a string. I'm not sure if the code is right. Can anybody help me work through the code (i.e. what's actually happening when there is a match in characters)?

public static void removeDuplicates(char[] str) {
  if (str == null) return;
  int len = str.length;
  if (len < 2) return;
  int tail = 1;
  for (int i = 1; i < len; ++i) {
    int j;
    for (j = 0; j < tail; ++j) {
      if (str[i] == str[j]) break;
    }
    if (j == tail) {
      str[tail] = str[i];
      ++tail;
    }
  }
  str[tail] = 0;
}

Answer 1

Accepted answer by codaddict

The function looks fine to me. I've written inline comments. Hope it helps:

// function takes a char array as input.
// modifies it to remove duplicates and adds a 0 to mark the end
// of the unique chars in the array.
public static void removeDuplicates(char[] str) {
  if (str == null) return; // if the array does not exist..nothing to do return.
  int len = str.length; // get the array length.
  if (len < 2) return; // if its less than 2..can't have duplicates..return.
  int tail = 1; // number of unique char in the array.
  // start at 2nd char and go till the end of the array.
  for (int i = 1; i < len; ++i) { 
    int j;
    // for every char in outer loop check if that char is already seen.
    // char in [0,tail) are all unique.
    for (j = 0; j < tail; ++j) {
      if (str[i] == str[j]) break; // break if we find duplicate.
    }
    // if j reaches tail, we did not break, which implies the char at pos i
    // is not a duplicate. So we need to add it to our "unique char list".
    // we add it to the end, that is, at pos tail.
    if (j == tail) {
      str[tail] = str[i]; // add
      ++tail; // increment tail...[0,tail) is still "unique char list"
    }
  }
  str[tail] = 0; // add a 0 at the end to mark the end of the unique char.
}

Answer 2

Answered by cjk

This would be much easier if you just looped through the array and added all new characters to a list, then returned that list.

With your current approach, you need to reshuffle the array as you step through it and eventually resize it to the appropriate length.
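
The list-based idea above could look roughly like the sketch below. The class and method names (ListDedup, uniqueChars) are made up for illustration, and returning a List<Character> is an assumption about what "returned that list" means.

```java
import java.util.ArrayList;
import java.util.List;

class ListDedup {
    // Hypothetical helper: keeps each character the first time it is seen,
    // in order of first appearance.
    static List<Character> uniqueChars(char[] input) {
        List<Character> seen = new ArrayList<>();
        for (char c : input) {
            // contains() is a linear scan, so the whole method is O(n^2),
            // the same as the nested loops in the question.
            if (!seen.contains(c)) {
                seen.add(c);
            }
        }
        return seen;
    }
}
```

Because the result is a fresh list, the input array is never reshuffled and no terminator is needed.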

Answer 3

Answered by polygenelubricants

Your code is, I'm sorry to say, very C-like.

A Java String is not a char[]. You say you want to remove duplicates from a String, but you take a char[] instead.

Is this char[] \0-terminated? It doesn't look like it, because you take the whole .length of the array. But then your algorithm tries to \0-terminate a portion of the array. What happens if the array contains no duplicates?

Well, as it is written, your code actually throws an ArrayIndexOutOfBoundsException on the last line! There is no room for the \0 because all slots are used up!
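
To see the failure described above, here is a small sketch that copies the question's function unchanged and wraps it in a hypothetical helper, throwsOnInput, which only reports whether the call threw. The class name TerminatorDemo is also made up for illustration.

```java
class TerminatorDemo {
    // The function from the question, copied unchanged.
    static void removeDuplicates(char[] str) {
        if (str == null) return;
        int len = str.length;
        if (len < 2) return;
        int tail = 1;
        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) break;
            }
            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }
        }
        str[tail] = 0; // tail == len when every character is unique
    }

    // Hypothetical helper: true if the call throws on the given input.
    static boolean throwsOnInput(String s) {
        try {
            removeDuplicates(s.toCharArray());
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

With this sketch, throwsOnInput("abc") should be true because tail ends up equal to the array length, while throwsOnInput("aab") should be false because the duplicate frees one slot for the terminator.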

You can add a check not to add \0 in this exceptional case, but then how are you planning to use this code anyway? Are you planning to have a strlen-like function to find the first \0 in the array? And what happens if there isn't any (due to the all-unique exceptional case above)?

What happens if the original String/char[] contains a \0? (This is perfectly legal in Java, by the way; see JLS 10.9, An Array of Characters Is Not a String.)

The result will be a mess, and all because you want to do everything C-like, and in place without any additional buffer. Are you sure you really need to do this? Why not work with String, indexOf, lastIndexOf, replace, and all the higher-level API of String? Is it provably too slow, or do you only suspect that it is?

"Premature optimization is the root of all evil". I'm sorry, but if you can't even understand what the original code does, then figuring out how it will fit into the bigger (and messier) system will be a nightmare.

My minimal suggestion is to do the following:

Make the function take and return a String, i.e. public static String removeDuplicates(String in)
Internally, work with char[] str = in.toCharArray();
Replace the last line with return new String(str, 0, tail);

This does use additional buffers, but at least the interface to the rest of the system is much cleaner.
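
Applied to the code in the question, the three steps above might come out as the sketch below. The class name StringDedup and the handling of null and very short input are assumptions that the steps themselves do not spell out.

```java
class StringDedup {
    public static String removeDuplicates(String in) {
        if (in == null) return null;      // assumption: pass null through
        char[] str = in.toCharArray();    // step 2: work on a private copy
        int len = str.length;
        if (len < 2) return in;           // nothing to remove
        int tail = 1;                     // [0, tail) holds the unique chars
        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) break;
            }
            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }
        }
        // step 3: no terminator needed, just take the first tail chars
        return new String(str, 0, tail);
    }
}
```

Since the result length is carried by the String itself, the all-unique case no longer needs a spare slot, and an embedded \0 in the input is treated like any other character.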

Alternatively, you can use StringBuilder as such:

static String removeDuplicates(String s) {
    StringBuilder noDupes = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        String si = s.substring(i, i + 1);
        if (noDupes.indexOf(si) == -1) {
            noDupes.append(si);
        }
    }
    return noDupes.toString();
}

Note that this is essentially the same algorithm as what you had, but much cleaner and without as many little corner cases, etc.

Answer 4

Answered by Shrivatsan

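// Requires: import java.util.HashSet;
// s is the input String.
char[] chars = s.toCharArray();
HashSet<Character> charz = new HashSet<Character>();

// add() ignores characters that are already in the set,
// so no separate contains() check is needed.
for (char c : chars) {
    charz.add(c);
}

// Note: HashSet does not guarantee iteration order, so the unique
// characters may not print in their original order.
for (Character c : charz) {
    System.out.print(c);
}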
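
One caveat with the snippet above is that HashSet makes no promise about iteration order. If the original order matters, a possible variation (not part of this answer) is to swap in LinkedHashSet, which iterates in insertion order. The class name OrderedSetDedup is made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class OrderedSetDedup {
    // Sketch: keep the first occurrence of each character, in original order.
    static String removeDuplicates(String s) {
        Set<Character> seen = new LinkedHashSet<>();
        for (char c : s.toCharArray()) {
            seen.add(c); // no effect if c is already present
        }
        StringBuilder out = new StringBuilder(seen.size());
        for (char c : seen) {
            out.append(c);
        }
        return out.toString();
    }
}
```

This trades the in-place requirement for an expected O(n) running time, at the cost of boxing every character.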

Answer 5

Answered by jcrshankar

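// Requires: import java.util.ArrayList; import java.util.List; import java.util.ListIterator;
String s = "Javajk";
List<Character> charz = new ArrayList<Character>();
for (Character c : s.toCharArray()) {
    // Skip the character if its upper- or lower-case form was already kept.
    if (!(charz.contains(Character.toUpperCase(c)) || charz
            .contains(Character.toLowerCase(c)))) {
        charz.add(c);
    }
}
ListIterator<Character> litr = charz.listIterator();
while (litr.hasNext()) {
    Character element = litr.next();
    System.err.println(":" + element);
}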

This removes a duplicate even when the character appears in both upper and lower case.

Answer 6

Answered by Shrivatsan

public class StringRedundantChars {
    /**
     * @param args
     */
    public static void main(String[] args) {

        //initializing the string to be sorted
        String sent = "I love painting and badminton";

        //Translating the sentence into an array of characters
        char[] chars = sent.toCharArray();

        System.out.println("Before Sorting");
        showLetters(chars);

        //Sorting the characters based on the ASCII character code.
        java.util.Arrays.sort(chars);

        System.out.println("Post Sorting");
        showLetters(chars);

        System.out.println("Removing Duplicates");
        stripDuplicateLetters(chars);

        System.out.println("Post Removing Duplicates");
        //Sorting to collect all unique characters
        java.util.Arrays.sort(chars);
        showLetters(chars);

    }

    /**
     * This function prints all valid characters in a given array, except empty values
     *
     * @param chars Input set of characters to be displayed
     */
    private static void showLetters(char[] chars) {

        int i = 0;
        //The following loop skips the leading '\0' placeholders
        while (i < chars.length && '\0' == chars[i]) {
            i++;
        }
        for (; i < chars.length; i++) {
            System.out.print(" " + chars[i]);
        }
        System.out.println();
    }

    private static char[] stripDuplicateLetters(char[] chars) {

        // Basic cursor that is used to traverse through the unique-characters
        int cursor = 0;
        // Probe which is used to traverse the string for redundant characters
        int probe = 1;

        // Stop once the probe runs off the end, so chars[probe] stays in bounds
        while (probe < chars.length) {

            // Checking if the cursor and probe indices contain the same
            // characters
            if (chars[cursor] == chars[probe]) {
                System.out.println("Removing char : " + chars[probe]);
                // Feel free to mark the redundant character with any other
                // character. I have used '\0'
                chars[probe] = '\0';
                // Pushing the probe to the next character
                probe++;
            } else {
                // The character at probe differs from the one at cursor, so
                // the run of duplicates has ended.
                // Hence set cursor to the probe value
                cursor = probe;
                // Push the probe to refer to the next character
                probe++;
            }
        }
        System.out.println();

        return chars;
    }
}

Answer 7

Answered by Dhruv Gairola

Given the following question:

Write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.

Since one or two additional variables are fine but no buffer is allowed, you can simulate the behaviour of a hashmap by using an integer to store bits instead. This simple solution runs in O(n), which is faster than yours. It is also conceptually simple and works in place:

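public static void removeDuplicates(char[] str) {
    int map = 0;
    for (int i = 0; i < str.length; i++) {
        if ((map & (1 << (str[i] - 'a'))) > 0) // duplicate detected
            str[i] = 0;
        else // add unique char as a bit '1' to the map
            map |= 1 << (str[i] - 'a');
    }
}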

The drawback is that the duplicates (which are replaced with 0's) will not be placed at the end of the str[] array. However, this can easily be fixed by looping through the array one last time. Also, a 32-bit integer only has enough bits for a small alphabet such as the lowercase letters a to z.
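
The final pass mentioned above could be sketched as follows. This is one possible reading, not code from the answer: the class name BitmapDedup, the returned length, and the explicit rejection of anything outside 'a' to 'z' are all assumptions added for illustration.

```java
import java.util.Arrays;

class BitmapDedup {
    // Returns the number of unique characters, which end up in str[0..n).
    // Assumption: input is limited to 'a'..'z' so one int can hold the bitmap.
    static int removeDuplicates(char[] str) {
        int map = 0;
        // Pass 1: mark duplicates with 0, as in the answer's code.
        for (int i = 0; i < str.length; i++) {
            int bit = str[i] - 'a';
            if (bit < 0 || bit > 25) {
                throw new IllegalArgumentException("only 'a'..'z' supported");
            }
            if ((map & (1 << bit)) != 0) {
                str[i] = 0;          // duplicate detected
            } else {
                map |= 1 << bit;     // remember this letter
            }
        }
        // Pass 2: slide the surviving characters to the front.
        int tail = 0;
        for (int i = 0; i < str.length; i++) {
            if (str[i] != 0) {
                str[tail++] = str[i];
            }
        }
        // Clear the leftover slots so all the 0's sit at the end.
        Arrays.fill(str, tail, str.length, (char) 0);
        return tail;
    }
}
```

Both passes are linear and use only a few int variables, so the approach stays O(n) and in place. Returning the count avoids having to scan for a 0 terminator afterwards.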

Answer 8

Answered by Arvind Krishnakumar

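private static String removeDuplicateCharactersFromWord(String word) {

    String result = new String("");

    for (int i = 0; i < word.length(); i++) {
        if (!result.contains("" + word.charAt(i))) {
            result += "" + word.charAt(i);
        }
    }

    return result;
}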

Answer 9

Answered by Deven

This solves the problem with fast, simple code. It runs in O(n).

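void removeduplicate(char *str)
{
    int checker = 0;
    int cnt = 0;
    for(int i = 0; i < strlen(str); i++)
    {
        int val = *(str + i) - (int)'a';
        if ((checker & (1 << val)) > 0) continue;
        else {
            *(str + cnt) = *(str + i);
            cnt++;
        }
        checker |= (1 << val);
    }
    *(str+cnt) = '\0';
}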

Answer 10

Answered by dasa

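public class RemoveDuplicateInString {
    public static void main(String[] args) {
        String s = "ABCDDCA";
        RemoveDuplicateInString rs = new RemoveDuplicateInString();
        System.out.println(rs.removeDuplicate(s));
    }

    public String removeDuplicate(String s) {
        String retn = null;
        boolean[] b = new boolean[256];

        char[] ch = s.toCharArray();
        for (int i = 0; i < ch.length; i++) {
            if (b[ch[i]]) {
                ch[i]=' ';
            }
            else {
                b[ch[i]] = true;
            }
        }

        retn = new String(ch);
        return retn;
    }
}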
