Java: the ideal method to truncate a string with an ellipsis

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/3597550/

Date: 2020-08-14 02:25:18  Source: igfitidea

Ideal method to truncate a string with ellipsis

Tags: java, ellipsis

Asked by Amy B

I'm sure all of us have seen ellipses on Facebook statuses (or elsewhere), clicked "Show more", and found that there are only another 2 characters or so. I'd guess this is because of lazy programming, because surely there is an ideal method.

Mine counts slim characters [iIl1] as "half characters", but this doesn't get around ellipses looking silly when they hide barely any characters.

Is there an ideal method? Here is mine:

/**
 * Return a string with a maximum length of <code>length</code> characters.
 * If there are more than <code>length</code> characters, then string ends with an ellipsis ("...").
 *
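 * @param text   the string to shorten
 * @param length the nominal maximum length, before the allowance for slim characters
 * @return the original string, or a truncated copy ending in "..."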
 */
public static String ellipsis(final String text, int length)
{
    // The letters [iIl1] are slim enough to only count as half a character.
    length += Math.ceil(text.replaceAll("[^iIl1]", "").length() / 2.0d);

    if (text.length() > length)
    {
        return text.substring(0, length - 3) + "...";
    }

    return text;
}

Language doesn't really matter, but tagged as Java because that's what I'm mostly interested in seeing.

Accepted answer by aioobe

I like the idea of letting "thin" characters count as half a character. Simple and a good approximation.

The main issue with most ellipsizing approaches, however, is (imho) that they chop off words in the middle. Here is a solution that takes word boundaries into account (but does not dive into pixel math and the Swing API).

private final static String NON_THIN = "[^iIl1\\.,']";

private static int textWidth(String str) {
    return (int) (str.length() - str.replaceAll(NON_THIN, "").length() / 2);
}

public static String ellipsize(String text, int max) {

    if (textWidth(text) <= max)
        return text;

    // Start by chopping off at the word before max
    // This is an over-approximation due to thin-characters...
    int end = text.lastIndexOf(' ', max - 3);

    // Just one long word. Chop it off.
    if (end == -1)
        return text.substring(0, max-3) + "...";

    // Step forward as long as textWidth allows.
    int newEnd = end;
    do {
        end = newEnd;
        newEnd = text.indexOf(' ', end + 1);

        // No more spaces.
        if (newEnd == -1)
            newEnd = text.length();

    } while (textWidth(text.substring(0, newEnd) + "...") < max);

    return text.substring(0, end) + "...";
}

A test of the algorithm looks like this:

enter image description here
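
To isolate the word-boundary step from the thin-character bookkeeping, here is a minimal sketch that only backs up to the last space. The class name WordBoundaryEllipsis and the method ellipsizeAtWord are made up for illustration, and the sketch assumes max >= 3; it is not the answer's exact algorithm.

```java
public final class WordBoundaryEllipsis {

    private WordBoundaryEllipsis() {}

    /** Cuts text at the last space that still leaves room for "..." within max characters. */
    public static String ellipsizeAtWord(String text, int max) {
        if (text.length() <= max) {
            return text; // already short enough
        }
        // Search backwards for a space, starting at the last index that leaves room for "...".
        int end = text.lastIndexOf(' ', max - 3);
        if (end == -1) {
            // No space found: a single long word, so cut it mid-word.
            return text.substring(0, max - 3) + "...";
        }
        return text.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        System.out.println(ellipsizeAtWord("The quick brown fox jumps", 15)); // The quick...
        System.out.println(ellipsizeAtWord("abcdefghijklmnopqrst", 10));      // abcdefg...
    }
}
```

Unlike the full version above, this never steps forward again, so it can cut earlier than strictly necessary when the text contains many thin characters.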

Answer by trashgod

It seems like you might get more accurate geometry from the Java graphics context's FontMetrics.

Addendum: In approaching this problem, it may help to distinguish between the model and the view. The model is a String, a finite sequence of UTF-16 code units, while the view is a series of glyphs, rendered in some font on some device.

In the particular case of Java, one can use SwingUtilities.layoutCompoundLabel() to effect the translation. The example below intercepts the layout call in BasicLabelUI to demonstrate the effect. It may be possible to use the utility method in other contexts, but the appropriate FontMetrics would have to be determined empirically.

import java.awt.Color;
import java.awt.EventQueue;
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.GridLayout;
import java.awt.Rectangle;
import java.awt.event.ComponentAdapter;
import java.awt.event.ComponentEvent;
import javax.swing.BorderFactory;
import javax.swing.Icon;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.border.EmptyBorder;
import javax.swing.border.LineBorder;
import javax.swing.plaf.basic.BasicLabelUI;

/** @see http://stackoverflow.com/questions/3597550 */
public class LayoutTest extends JPanel {

    private static final String text =
        "A damsel with a dulcimer in a vision once I saw.";
    private final JLabel sizeLabel = new JLabel();
    private final JLabel textLabel = new JLabel(text);
    private final MyLabelUI myUI = new MyLabelUI();

    public LayoutTest() {
        super(new GridLayout(0, 1));
        this.setBorder(BorderFactory.createCompoundBorder(
            new LineBorder(Color.blue), new EmptyBorder(5, 5, 5, 5)));
        textLabel.setUI(myUI);
        textLabel.setFont(new Font("Serif", Font.ITALIC, 24));
        this.add(sizeLabel);
        this.add(textLabel);
        this.addComponentListener(new ComponentAdapter() {

            @Override
            public void componentResized(ComponentEvent e) {
                sizeLabel.setText(
                    "Before: " + myUI.before + " after: " + myUI.after);
            }
        });
    }

    private static class MyLabelUI extends BasicLabelUI {

        int before, after;

        @Override
        protected String layoutCL(
            JLabel label, FontMetrics fontMetrics, String text, Icon icon,
            Rectangle viewR, Rectangle iconR, Rectangle textR) {
            before = text.length();
            String s = super.layoutCL(
                label, fontMetrics, text, icon, viewR, iconR, textR);
            after = s.length();
            System.out.println(s);
            return s;
        }
    }

    private void display() {
        JFrame f = new JFrame("LayoutTest");
        f.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        f.add(this);
        f.pack();
        f.setLocationRelativeTo(null);
        f.setVisible(true);
    }

    public static void main(String[] args) {
        EventQueue.invokeLater(new Runnable() {

            @Override
            public void run() {
                new LayoutTest().display();
            }
        });
    }
}
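
Outside of a Swing label, the same pixel-based idea can be approximated by hand. The following is only a sketch under stated assumptions: the class PixelEllipsis and its fit method are hypothetical, the measuring function is passed in so that FontMetrics.stringWidth (or any stand-in) can be used, and the off-screen BufferedImage is just one assumed way of obtaining a FontMetrics without a visible component.

```java
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.function.ToIntFunction;

public final class PixelEllipsis {

    private PixelEllipsis() {}

    /**
     * Returns text unchanged if it fits in maxWidth (as measured by width),
     * otherwise the longest prefix that fits once "..." is appended.
     */
    public static String fit(String text, int maxWidth, ToIntFunction<String> width) {
        if (width.applyAsInt(text) <= maxWidth) {
            return text;
        }
        // Drop one character at a time until prefix + "..." fits.
        for (int end = text.length() - 1; end > 0; end--) {
            String candidate = text.substring(0, end) + "...";
            if (width.applyAsInt(candidate) <= maxWidth) {
                return candidate;
            }
        }
        return "..."; // nothing fits; fall back to the bare ellipsis
    }

    public static void main(String[] args) {
        // Assumption: an off-screen image is an acceptable source of FontMetrics.
        BufferedImage img = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = img.createGraphics();
        FontMetrics fm = g.getFontMetrics(new Font("Serif", Font.ITALIC, 24));
        String text = "A damsel with a dulcimer in a vision once I saw.";
        System.out.println(fit(text, 300, fm::stringWidth));
        g.dispose();
    }
}
```

The result depends on the font and rendering environment, so the printed string will differ between machines. Cutting by char index can also split a surrogate pair, which a more careful version would need to handle.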

Answer by davmac

If you're worried about the ellipsis only hiding a very small number of characters, why not just check for that condition?

public static String ellipsis(final String text, int length)
{
    // The letters [iIl1] are slim enough to only count as half a character.
    length += Math.ceil(text.replaceAll("[^iIl1]", "").length() / 2.0d);

    if (text.length() > length + 20)
    {
        return text.substring(0, length - 3) + "...";
    }

    return text;
}

Answer by Gopi

For me this would be ideal -

 public static String ellipsis(final String text, int length)
 {
     return text.substring(0, length - 3) + "...";
 }

I would not worry about the size of every character unless I really knew where and in what font it is going to be displayed. Many fonts are fixed-width fonts, where every character has the same width.

Even if it's a variable-width font, and you count 'i' and 'l' as taking half the width, then why not count 'w' and 'm' as taking double the width? A mix of such characters in a string will generally average out the effect of their size, and I would prefer to ignore such details. Choosing the value of 'length' wisely matters the most.

Answer by Chris

I'd go with something similar to the standard model that you have. I wouldn't bother with the character widths thing; as @Gopi said, it is probably going to all balance out in the end. What I'd do that is new is add another parameter called something like "minNumberOfhiddenCharacters" (maybe a bit less verbose). Then when doing the ellipsis check I'd do something like:

if (text.length() > length+minNumberOfhiddenCharacters)
{
    return text.substring(0, length - 3) + "...";
}

This means that if your text length is 35, your "length" is 30 and your minimum number of characters to hide is 10, then you get your string in full. If your minimum number of characters to hide were 3, you would get the ellipsis instead: the first 27 characters followed by "...".

The main thing to be aware of is that I've subverted the meaning of "length" so that it is no longer a maximum length. The length of the outputted string can now be anything from 30 characters (when the text length is >40) to 40 characters (when the text length is 40 characters long). Effectively our max length becomes length+minNumberOfhiddenCharacters. The string could of course be shorter than 30 characters when the original string is less than 30 but this is a boring case that we should ignore.
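
The arithmetic above can be checked with a small sketch. The class name SoftLimitEllipsis and the shortened parameter name minHidden are made up for illustration, and the sketch assumes length >= 3.

```java
public final class SoftLimitEllipsis {

    private SoftLimitEllipsis() {}

    /** Truncates only when at least minHidden characters lie beyond length. */
    public static String ellipsis(String text, int length, int minHidden) {
        if (text.length() > length + minHidden) {
            return text.substring(0, length - 3) + "...";
        }
        return text; // too little would be hidden, so show everything
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 35; i++) {
            sb.append('a');
        }
        String text35 = sb.toString();
        System.out.println(ellipsis(text35, 30, 10).length()); // 35 (returned in full)
        System.out.println(ellipsis(text35, 30, 3).length());  // 30 (27 characters + "...")
    }
}
```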

If you want length to be a hard and fast maximum then you'd want something more like:

if (text.length() > length)
{
    if (text.length() - length < minNumberOfhiddenCharacters-3)
    {
        return text.substring(0, text.length() - minNumberOfhiddenCharacters) + "...";
    }
    else
    {
        return text.substring(0, length - 3) + "...";
    }
}

So in this example, if text.length() is 37, length is 30 and minNumberOfhiddenCharacters = 10, then we go into the second branch of the inner if and get 27 characters + "..." to make 30. This is actually the same result the first branch would have given (which is a sign that we have our boundary conditions right). If the text length were 36, we'd get 26 characters + the ellipsis, giving us 29 characters with 10 hidden.

I was debating whether rearranging some of the comparison logic would make it more intuitive, but in the end decided to leave it as it is. You might find that text.length() - minNumberOfhiddenCharacters < length-3 makes it more obvious what you are doing, though.
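
For reference, the hard-maximum variant can be wrapped into a complete method along these lines. This is a sketch, not a definitive implementation: the class name HardMaxEllipsis, the shortened parameter names and the argument check are assumptions added for illustration.

```java
public final class HardMaxEllipsis {

    private HardMaxEllipsis() {}

    /**
     * Returns at most maxLength characters (including "...") and hides
     * at least minHidden characters whenever it truncates.
     */
    public static String ellipsis(String text, int maxLength, int minHidden) {
        // Assumption: these bounds keep every substring index non-negative.
        if (maxLength < 3 || minHidden > maxLength) {
            throw new IllegalArgumentException("need maxLength >= 3 and minHidden <= maxLength");
        }
        if (text.length() <= maxLength) {
            return text; // fits, nothing to hide
        }
        if (text.length() - maxLength < minHidden - 3) {
            // Cutting at maxLength - 3 would hide too little, so cut earlier.
            return text.substring(0, text.length() - minHidden) + "...";
        }
        return text.substring(0, maxLength - 3) + "...";
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 36; i++) {
            sb.append('a');
        }
        System.out.println(ellipsis(sb.toString(), 30, 10).length()); // 29
        sb.append('a'); // now 37 characters
        System.out.println(ellipsis(sb.toString(), 30, 10).length()); // 30
    }
}
```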

Answer by rompetroll

In my eyes, you can't get good results without pixel math.

Thus, Java is probably the wrong end to fix this problem when you are in a web application context (like facebook).

I'd go for JavaScript. Since JavaScript is not my primary field of interest, I can't really judge if this is a good solution, but it might give you a pointer.

Answer by Spudley

If you're talking about a web site - ie outputting HTML/JS/CSS, you can throw away all these solutions because there is a pure CSS solution.

text-overflow:ellipsis;

It's not quite as simple as just adding that style to your CSS, because it interacts with other CSS; e.g. it requires that the element has overflow:hidden; and if you want your text on a single line, white-space:nowrap; is good too.

I have a stylesheet that looks like this:

.myelement {
  word-wrap:normal;
  white-space:nowrap;
  overflow:hidden;
  -o-text-overflow:ellipsis;
  text-overflow:ellipsis;
  width: 120px;
}

You can even have a "read more" button that simply runs a javascript function to change the styles, and bingo, the box will re-size and the full text will be visible. (in my case though, I tend to use the html title attribute for the full text, unless it's likely to get very long)

Hope that helps. It's a much simpler solution than trying to calculate the text size and truncate it, and all that. (Of course, if you're writing a non-web-based app, you may still need to do that.)

There is one downside to this solution: Firefox doesn't support the ellipsis style. Annoying, but I don't think critical: it does still truncate the text correctly, as that is dealt with by overflow:hidden; it just doesn't display the ellipsis. It does work in all the other browsers (including IE, all the way back to IE5.5!), so it's a bit annoying that Firefox doesn't do it yet. Hopefully a new version of Firefox will solve this issue soon.

[EDIT]
People are still voting on this answer, so I should edit it to note that Firefox does now support the ellipsis style. The feature was added in Firefox 7. If you're using an earlier version (FF3.6 and FF4 still have some users) then you're out of luck, but most FF users are now okay. There's a lot more detail about this here: text-overflow:ellipsis in Firefox 4? (and FF5)

Answer by gayavat

 public static String getTruncated(String str, int maxSize){
    int limit = maxSize - 3;
    return (str.length() > maxSize) ? str.substring(0, limit) + "..." : str;
 }

Answer by Adam Gent

I'm shocked no one mentioned Commons Lang StringUtils#abbreviate().

Update: yes, it doesn't take the slim characters into account, but I don't agree with doing that, considering that everyone has different screen and font setups and that a large portion of the people who land on this page are probably looking for a maintained library like the above.

Answer by yegor256

How about this (to get a string of 50 chars):

text.replaceAll("(?<=^.{47}).*$", "...");
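
A quick check of where this regex fires may be useful. In the sketch below (the class RegexTruncateDemo and both method names are made up for illustration), strings shorter than 47 characters pass through untouched, but a string of exactly 47 characters gains an ellipsis, because the lookbehind also succeeds at the very end of the input and .* matches the empty string there. Guarding with a length check, as in the second method, is one assumed way to avoid that.

```java
public final class RegexTruncateDemo {

    private RegexTruncateDemo() {}

    /** The one-liner from the answer, unchanged. */
    public static String truncate(String text) {
        return text.replaceAll("(?<=^.{47}).*$", "...");
    }

    /** Same regex, but applied only when the text is longer than 50 characters. */
    public static String truncateIfLonger(String text) {
        return text.length() > 50 ? truncate(text) : text;
    }

    private static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(truncate(repeat('a', 40)).length());         // 40
        System.out.println(truncate(repeat('a', 47)).length());         // 50
        System.out.println(truncate(repeat('a', 60)).length());         // 50
        System.out.println(truncateIfLonger(repeat('a', 47)).length()); // 47
    }
}
```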