How do I convert CamelCase into human-readable names in Java?

Question

Asked by Frederik

I'd like to write a method that converts CamelCase into a human-readable name.

Here's the test case:

public void testSplitCamelCase() {
    assertEquals("lowercase", splitCamelCase("lowercase"));
    assertEquals("Class", splitCamelCase("Class"));
    assertEquals("My Class", splitCamelCase("MyClass"));
    assertEquals("HTML", splitCamelCase("HTML"));
    assertEquals("PDF Loader", splitCamelCase("PDFLoader"));
    assertEquals("A String", splitCamelCase("AString"));
    assertEquals("Simple XML Parser", splitCamelCase("SimpleXMLParser"));
    assertEquals("GL 11 Version", splitCamelCase("GL11Version"));
}

Answer 1

Accepted answer by polygenelubricants

This works with your test cases:

static String splitCamelCase(String s) {
   return s.replaceAll(
      String.format("%s|%s|%s",
         "(?<=[A-Z])(?=[A-Z][a-z])",
         "(?<=[^A-Z])(?=[A-Z])",
         "(?<=[A-Za-z])(?=[^A-Za-z])"
      ),
      " "
   );
}

Here's a test harness:

    String[] tests = {
        "lowercase",        // [lowercase]
        "Class",            // [Class]
        "MyClass",          // [My Class]
        "HTML",             // [HTML]
        "PDFLoader",        // [PDF Loader]
        "AString",          // [A String]
        "SimpleXMLParser",  // [Simple XML Parser]
        "GL11Version",      // [GL 11 Version]
        "99Bottles",        // [99 Bottles]
        "May5",             // [May 5]
        "BFG9000",          // [BFG 9000]
    };
    for (String test : tests) {
        System.out.println("[" + splitCamelCase(test) + "]");
    }

It uses a zero-length matching regex with lookbehind and lookahead to find where to insert spaces. Basically there are 3 patterns, and I use String.format to put them together to make it more readable.

The three patterns are:

UC behind me, UC followed by LC in front of me

  XMLParser   AString    PDFLoader
    /\        /\           /\

non-UC behind me, UC in front of me

 MyClass   99Bottles
  /\        /\

Letter behind me, non-letter in front of me

 GL11    May5    BFG9000
  /\       /\      /\
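
To see that each of the three patterns matches a position between characters and consumes nothing, here's a small sketch that splits on each pattern separately. The class name, constant names and the splitAt helper are made up for illustration; only the three regexes come from the answer above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class CamelCaseBoundaries {
    // UC behind me, UC followed by LC in front of me
    static final Pattern UPPER_THEN_WORD = Pattern.compile("(?<=[A-Z])(?=[A-Z][a-z])");
    // non-UC behind me, UC in front of me
    static final Pattern NON_UPPER_THEN_UPPER = Pattern.compile("(?<=[^A-Z])(?=[A-Z])");
    // Letter behind me, non-letter in front of me
    static final Pattern LETTER_THEN_NON_LETTER = Pattern.compile("(?<=[A-Za-z])(?=[^A-Za-z])");

    // Splits s at every position where the zero-length boundary pattern matches.
    // No characters are lost, because the lookarounds consume nothing.
    static String[] splitAt(Pattern boundary, String s) {
        return boundary.split(s);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAt(UPPER_THEN_WORD, "XMLParser")));      // [XML, Parser]
        System.out.println(Arrays.toString(splitAt(NON_UPPER_THEN_UPPER, "99Bottles"))); // [99, Bottles]
        System.out.println(Arrays.toString(splitAt(LETTER_THEN_NON_LETTER, "BFG9000"))); // [BFG, 9000]
    }
}
```

Joining the pieces back together reproduces the input exactly, which is why replaceAll can drop a space into those positions without touching any letters.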

References

regular-expressions.info/Lookarounds

Answer 2

Answered by Jens

The following Regex can be used to identify the capitals inside words:

"((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))"

It matches every capital letter that either comes after a lowercase letter or digit, or sits between a capital letter and a lowercase letter, as well as every digit that comes directly after a letter.

How to insert a space before them is beyond my Java skills =)
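
For what it's worth, one way to do the insertion in Java is to replace each match with a space followed by the captured character. This is only a sketch: the class, constant and method names are invented here, and it assumes the digit alternative in the pattern is the plain character class [0-9].

```java
import java.util.regex.Pattern;

class CapitalSpacer {
    // Group 1 captures the character that starts a new word.
    private static final Pattern WORD_START = Pattern.compile(
        "((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))");

    // Puts a space in front of every captured word-start character.
    static String splitCamelCase(String s) {
        return WORD_START.matcher(s).replaceAll(" $1");
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("SimpleXMLParser")); // Simple XML Parser
        System.out.println(splitCamelCase("GL11Version"));     // GL 11 Version
    }
}
```

In the replacement string, $1 refers back to the first capturing group, so the matched character is kept and only the space is new.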

Edited to include the digit case and the PDF Loader case.

Answer 3

Answered by Felix

I think you will have to iterate over the string and detect changes from lowercase to uppercase, uppercase to lowercase, alphabetic to numeric, and numeric to alphabetic. On every change you detect, insert a space, with one exception: on a change from uppercase to lowercase, insert the space one character earlier, i.e. before the uppercase letter.
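
A rough sketch of that loop, under my own assumptions: the class and method names are invented, the input is expected to contain only ASCII letters and digits, and a space that would land at the very start of the string or duplicate an existing one is simply not emitted.

```java
class CamelCaseScanner {
    // Returns true when a space belongs immediately before index i (i >= 1).
    static boolean isBoundary(String s, int i) {
        char prev = s.charAt(i - 1);
        char cur = s.charAt(i);
        // Alphabetic <-> numeric change, in either direction.
        if (Character.isDigit(prev) != Character.isDigit(cur)) return true;
        // Lowercase -> uppercase change.
        if (Character.isLowerCase(prev) && Character.isUpperCase(cur)) return true;
        // Uppercase -> lowercase change happens between cur and the next char;
        // the space goes one character earlier, i.e. before cur. When prev is
        // not uppercase, one of the rules above has already placed that space.
        return Character.isUpperCase(prev) && Character.isUpperCase(cur)
                && i + 1 < s.length() && Character.isLowerCase(s.charAt(i + 1));
    }

    static String splitCamelCase(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            if (i > 0 && isBoundary(s, i)) out.append(' ');
            out.append(s.charAt(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("PDFLoader"));   // PDF Loader
        System.out.println(splitCamelCase("GL11Version")); // GL 11 Version
    }
}
```

Deciding per index whether a space goes before it, instead of inserting into the string while scanning, avoids index bookkeeping and double spaces.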

Answer 4

Answered by BeesonBison

http://code.google.com/p/inflection-js/

You could chain the String.underscore().humanize() methods to take a CamelCase string and convert it into a human-readable string.

Answer 5

Answered by Joel

I'm not a regex ninja, so I'd iterate over the string, keeping the indexes of the current position being checked & the previous position. If the current position is a capital letter, I'd insert a space after the previous position and increment each index.
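
Something like the following, as a minimal sketch of that idea (the class and method names are invented here). Note that it treats every capital as a word start, so it handles "MyClass" but breaks acronyms such as "HTML" apart, which the test case in the question does not want.

```java
class NaiveCapitalSplitter {
    // Inserts a space before every capital letter except the first character.
    static String splitAtCapitals(String s) {
        StringBuilder sb = new StringBuilder(s);
        for (int cur = 1; cur < sb.length(); cur++) {
            if (Character.isUpperCase(sb.charAt(cur))) {
                // Insert after the previous position, i.e. at the current index.
                sb.insert(cur, ' ');
                // The capital has shifted one place to the right; step past it.
                cur++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(splitAtCapitals("MyClass")); // My Class
        System.out.println(splitAtCapitals("HTML"));    // H T M L
    }
}
```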

Answer 6

Answered by Hendy Irawan

You can use org.modeshape.common.text.Inflector.

Specifically:

String humanize(String lowerCaseAndUnderscoredWords,
    String... removableTokens) 
Capitalizes the first word and turns underscores into spaces and strips trailing "_id" and any supplied removable tokens.

Maven artifact is: org.modeshape:modeshape-common:2.3.0.Final

on JBoss repository: https://repository.jboss.org/nexus/content/repositories/releases

Here's the JAR file: https://repository.jboss.org/nexus/content/repositories/releases/org/modeshape/modeshape-common/2.3.0.Final/modeshape-common-2.3.0.Final.jar

Answer 7

Answered by gerferra

For the record, here is an almost (*) compatible Scala version:

  object Str { def unapplySeq(s: String): Option[Seq[Char]] = Some(s) }

  def splitCamelCase(str: String) =
    String.valueOf(
      (str + "A" * 2) sliding (3) flatMap {
        case Str(a, b, c) =>
          (a.isUpper, b.isUpper, c.isUpper) match {
            case (true, false, _) => " " + a
            case (false, true, true) => a + " "
            case _ => String.valueOf(a)
          }
      } toArray
    ).trim

Once compiled it can be used directly from Java if the corresponding scala-library.jar is in the classpath.

(*) It fails for the input "GL11Version", for which it returns "G L11 Version".

Answer 8

Answered by jlb83

If you don't like "complicated" regexes, and aren't at all bothered about efficiency, then I've used this example to achieve the same effect in three stages.

String name =
    camelName.replaceAll("([A-Z][a-z]+)", " $1") // Words beginning with UC
             .replaceAll("([A-Z][A-Z]+)", " $1") // "Words" of only UC
             .replaceAll("([^A-Za-z ]+)", " $1") // "Words" of non-letters
             .trim();

It passes all the test cases above, including those with digits.

As I say, this isn't as good as using a single regular expression, as some other examples here do, but someone might well find it useful.

Answer 9

Answered by vbullinger

I took the Regex from polygenelubricants and turned it into a C# extension method on objects:

    /// <summary>
    /// Turns a given object into a sentence by:
    /// Converting the given object into a <see cref="string"/>.
    /// Adding spaces before each capital letter except for the first letter of the string representation of the given object.
    /// Makes the entire string lower case except for the first word and any acronyms.
    /// </summary>
    /// <param name="original">The object to turn into a proper sentence.</param>
    /// <returns>A string representation of the original object that reads like a real sentence.</returns>
    public static string ToProperSentence(this object original)
    {
        Regex addSpacesAtCapitalLettersRegEx = new Regex(@"(?<=[A-Z])(?=[A-Z][a-z]) | (?<=[^A-Z])(?=[A-Z]) | (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);
        string[] words = addSpacesAtCapitalLettersRegEx.Split(original.ToString());
        if (words.Length > 1)
        {
            List<string> wordsList = new List<string> { words[0] };
            wordsList.AddRange(words.Skip(1).Select(word => word.Equals(word.ToUpper()) ? word : word.ToLower()));
            words = wordsList.ToArray();
        }
        return string.Join(" ", words);
    }

This turns everything into a readable sentence. It calls ToString on the object passed in, then uses the Regex given by polygenelubricants to split the string, then lowercases each word except for the first word and any acronyms. Thought it might be useful for someone out there.

Answer 10

Answered by Ralph

You can do it using org.apache.commons.lang.StringUtils:

StringUtils.join(
     StringUtils.splitByCharacterTypeCamelCase("ExampleTest"),
     ' '
);
