Best way to count characters in JavaScript for a tweet

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/6245487/

Time: 2020-10-25 20:02:18  Source: igfitidea

Tags: javascript, twitter

Asked by PeeHaa

From the Twitter API docs (http://dev.twitter.com/pages/counting_characters):

the 140 chars tweet limit doesn't really count the characters but rather the bytes of the string.

How can I count the bytes in a string using JavaScript? Or does every character in my string always use 2 bytes, since I set the encoding of my page to UTF-8?

Perhaps there is already a nice counter function for me to use?

Answer by moluv00

Actually, because of the t.co url shortener, just counting characters doesn't work anymore. Check out these two Twitter references to see how to handle shortened links:

https://support.twitter.com/articles/78124-how-to-shorten-links-urls

https://dev.twitter.com/docs/tco-url-wrapper/how-twitter-wrap-urls

If you're looking for help on the client side, you'll have to make friends with twitter-text.js:

https://github.com/twitter/twitter-text-js

I also posted a walk-through of a function I use to count the remaining characters in a tweet:

http://blog.pay4tweet.com/2012/04/27/twitter-lifts-140-character-limit/

The function looks like this:

function charactersleft(tweet) {
    // Requires twitter-text.js, which provides twttr.txt.
    var filler = "01234567890123456789"; // 20-character stand-in for a t.co link
    var extractedUrls = twttr.txt.extractUrlsWithIndices(tweet);
    var virtualTweet = tweet;
    for (var i = 0; i < extractedUrls.length; i++) {
        // Swap each URL for the fixed-length filler.
        virtualTweet = virtualTweet.replace(extractedUrls[i].url, filler);
    }
    return 140 - virtualTweet.length;
}

The function returns the number of characters remaining, assuming that all URLs, including those shortened to less than 20 characters, have been "shortened" by t.co to 19 characters plus a space.

It assumes that twitter-text.js is being included.

Answer by rd3n

Thanks moluv00 for your answer, which saved me some searching and put me on the right track. I just wanted to share how I dealt with Twitter character counting (complicated by shortened URLs) in my app.

A pull request was merged into the GitHub repository on 2012-05-31, introducing the twttr.txt.getTweetLength(text, options) function, which takes t.co URLs into account and is defined as follows:

twttr.txt.getTweetLength = function(text, options) {
    if (!options) {
        options = {
            short_url_length: 22,
            short_url_length_https: 23
        };
    }
    var textLength = text.length;
    var urlsWithIndices = twttr.txt.extractUrlsWithIndices(text);

    for (var i = 0; i < urlsWithIndices.length; i++) {
        // Subtract the length of the original URL
        textLength += urlsWithIndices[i].indices[0] - urlsWithIndices[i].indices[1];

        // Add short_url_length_https for URLs starting with https://
        // Otherwise add short_url_length
        if (urlsWithIndices[i].url.toLowerCase().match(/^https:\/\//)) {
            textLength += options.short_url_length_https;
        } else {
            textLength += options.short_url_length;
        }
    }

    return textLength;
};

So your function simply becomes:

function charactersleft(tweet) {
    return 140 - twttr.txt.getTweetLength(tweet);
}
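
To trace the arithmetic without loading twitter-text.js, here is a minimal sketch. It assumes a hypothetical stand-in extractor, extractUrlsWithIndicesSketch, which only recognises http:// and https:// links and is much cruder than the real twttr.txt.extractUrlsWithIndices. The default lengths of 22 and 23 are copied from the function above.

```javascript
// Hypothetical stand-in for twttr.txt.extractUrlsWithIndices: it only
// recognises http:// and https:// links, unlike the real library.
function extractUrlsWithIndicesSketch(text) {
    var pattern = /https?:\/\/\S+/g;
    var found = [];
    var match;
    while ((match = pattern.exec(text)) !== null) {
        found.push({
            url: match[0],
            indices: [match.index, match.index + match[0].length]
        });
    }
    return found;
}

// Same arithmetic as getTweetLength above, using the stand-in extractor.
function tweetLengthSketch(text, options) {
    options = options || { short_url_length: 22, short_url_length_https: 23 };
    var length = text.length;
    var urls = extractUrlsWithIndicesSketch(text);
    for (var i = 0; i < urls.length; i++) {
        // Remove the real URL length, then add the fixed t.co length.
        length -= urls[i].indices[1] - urls[i].indices[0];
        length += /^https:\/\//i.test(urls[i].url)
            ? options.short_url_length_https
            : options.short_url_length;
    }
    return length;
}

var sample = "Docs: https://example.com/a/very/long/path and http://example.org";
console.log(sample.length);                   // 65 characters as typed
console.log(tweetLengthSketch(sample));       // 56 once both links are counted as t.co links
console.log(140 - tweetLengthSketch(sample)); // 84 characters left
```

The 36-character https link counts as 23 and the 18-character http link as 22, so a short link can cost more than it appears to.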

Plus, following the best practices for t.co, we should retrieve the short_url_length and short_url_length_https values from Twitter and pass them as the options parameter to the twttr.txt.getTweetLength function:

Request GET help/configuration once daily in your application and cache the "short_url_length" (t.co's current maximum length value) for 24 hours. Cache "short_url_length_https" (the maximum length for HTTPS-based t.co links) and use it as the length of HTTPS-based URLs.

This matters all the more because some changes to the t.co URL length will take effect on 2013-02-20, as described in the Twitter developer blog.
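
The daily caching advice could be sketched roughly as follows. The names createShortUrlOptionsCache and fetchConfiguration are hypothetical: the caller supplies fetchConfiguration, which is assumed to return a Promise for the parsed GET help/configuration response. How that request is authenticated and sent is left out.

```javascript
var ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper. `fetchConfiguration` is supplied by the caller and is
// assumed to return a Promise resolving to the parsed help/configuration
// response. `now` is optional and exists only to make the clock replaceable.
function createShortUrlOptionsCache(fetchConfiguration, now) {
    now = now || Date.now;
    var cached = null;
    var fetchedAt = 0;

    return function getShortUrlOptions() {
        // Reuse the cached values while they are less than 24 hours old.
        if (cached !== null && now() - fetchedAt < ONE_DAY_MS) {
            return Promise.resolve(cached);
        }
        return fetchConfiguration().then(function (config) {
            cached = {
                short_url_length: config.short_url_length,
                short_url_length_https: config.short_url_length_https
            };
            fetchedAt = now();
            return cached;
        });
    };
}

// Possible usage, assuming twitter-text.js is loaded:
//   getShortUrlOptions().then(function (options) {
//       var remaining = 140 - twttr.txt.getTweetLength(tweet, options);
//   });
```

Passing the fetcher in keeps the cache independent of any particular HTTP client.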

Answer by Kemal Dağ

As others mentioned, Twitter counts each link as a string of length 20. In our small project we ended up using the following piece of code:

function getTweetLength(input) {
  // Build a 20-character placeholder, then substitute it for every link.
  var tmp = "";
  for (var i = 0; i < 20; i++) { tmp += "o"; }
  return input.replace(/(http[s]?:\/\/[\S]*)/g, tmp).length;
}

In case you are using angular.js, here is a small filter you can use in your angular.js app:

app.filter('tweetLength', function() {
  return function(input) {
    var tmp = "";
    for(var i = 0; i < 20; i++){tmp+="o"}
    return input.replace(/(http[s]?:\/\/[\S]*)/g, tmp).length;
  };
});

Usage is as simple as:

Tweet length is {{tweet|tweetLength}}

Answer by Tomalak

How would I be able to count the bytes in a string using Javascript or does every character in my string always use 2 bytes since I set the encoding of my page to UTF-8?

JavaScript counts characters and not bytes. You don't have a problem at all.

"嘰嘰喳喳".length == 4
"Twitter".length == 7

Update: The above is only correct for strings that contain nothing but characters in the Basic Multilingual Plane (BMP).

Determining string length is not quite so simple when the string contains characters from outside the BMP (like Emoji) or combining marks. The following blog post discusses the matter exhaustively, and reading it is highly recommended: https://mathiasbynens.be/notes/javascript-unicode
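
As a rough illustration of the difference, the sketch below measures one string three ways: UTF-16 code units (what .length reports), Unicode code points, and bytes when encoded as UTF-8. The helper name measure is made up for this example, and it assumes an environment with Array.from and TextEncoder (modern browsers and Node.js).

```javascript
// Hypothetical helper: reports three different "lengths" of one string.
function measure(text) {
    return {
        codeUnits: text.length,                          // UTF-16 code units
        codePoints: Array.from(text).length,             // Unicode code points
        utf8Bytes: new TextEncoder().encode(text).length // bytes when encoded as UTF-8
    };
}

console.log(measure("Twitter"));  // { codeUnits: 7, codePoints: 7, utf8Bytes: 7 }
console.log(measure("嘰嘰喳喳")); // { codeUnits: 4, codePoints: 4, utf8Bytes: 12 }
console.log(measure("😀"));       // { codeUnits: 2, codePoints: 1, utf8Bytes: 4 }
console.log(measure("e\u0301"));  // { codeUnits: 2, codePoints: 2, utf8Bytes: 3 }
```

So a character does not always take 2 bytes in UTF-8: these samples use between 1 and 4 bytes per code point. The last sample is an "e" followed by a combining acute accent, which displays as a single character yet counts as two code points.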