Java: how to grab a substring after a specific word

Question

Asked by quibblify

I'm creating an IRC bot that grabs Twitter links and sends the text to the channel. This is my code:

if (messageIC.contains("https://twitter.com/") && messageIC.contains("/status/")) {
    try {
        String tweeter = message.substring(20);
        String[] tweety = tweeter.split(" ");
        String tweety1 = tweety[0];
        String url = "https://twitter.com/" + tweety1;
        Document doc = Jsoup.connect(url).get();
        Element tweetText = doc.select("p.js-tweet-text.tweet-text").first();
        sendMessage(channel, "Twitter: " + tweetText.text());
    } catch (IOException ex) {
        Logger.getLogger(Ampersand.class.getName()).log(Level.SEVERE, null, ex);
    }
}

This works if the user sends only the link, or types something after the link. It fails if the user types something before the link, for example "blahblahblah http://www.twitter.com/user/status/xxxx", because substring(20) counts from the start of the message rather than from the end of twitter.com.

Is there a way to only grab the substring after twitter.com?

Answer 1

Answered by Bubletan

You can use indexOf and substring. First find the start of the link by getting the index of "https://twitter.com/". Then look for a space after the start of the link: if one exists, the link ends there; otherwise it ends at the end of the message. Then use the substring method to get the link:

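// Start of the link; indexOf returns -1 if the prefix is absent, so check for it first.
int startIndex = message.indexOf("https://twitter.com/");
// The link ends at the next space, or at the end of the message if there is none.
int endIndex = message.indexOf(" ", startIndex);
if (endIndex == -1) {
    endIndex = message.length();
}
String link = message.substring(startIndex, endIndex);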
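
The fragment above relies on the message containing the prefix; if it does not, indexOf returns -1 and substring throws. As a minimal sketch, the same steps can be wrapped in a method that handles that case. The class name TweetLinkFinder and the method extractLink are hypothetical names chosen for illustration, not part of the original answer:

```java
import java.util.Optional;

final class TweetLinkFinder {
    private static final String PREFIX = "https://twitter.com/";

    // Returns the text from the first occurrence of PREFIX up to the next space
    // (or the end of the message), or an empty Optional if PREFIX is absent.
    static Optional<String> extractLink(String message) {
        int startIndex = message.indexOf(PREFIX);
        if (startIndex == -1) {
            return Optional.empty();
        }
        int endIndex = message.indexOf(' ', startIndex);
        if (endIndex == -1) {
            endIndex = message.length();
        }
        return Optional.of(message.substring(startIndex, endIndex));
    }

    public static void main(String[] args) {
        // Text before and after the link is ignored.
        System.out.println(extractLink("blahblahblah https://twitter.com/user/status/xxxx more text"));
        System.out.println(extractLink("no link here"));
    }
}
```

Returning an Optional is only one way to signal "no link found"; returning null or an empty string would work equally well for a small bot.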

Another easy way: split the message on spaces and check whether each word matches the requirements:

String[] words = message.split(" ");
for (String word : words) {
    if (word.startsWith("https://twitter.com/")) {
        // ...
    }
}
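
One possible way to fill in the loop body is sketched below. It collects every matching word into a list, and also checks for "/status/" as the question's code does. The names TweetLinkSplitter and findLinks are hypothetical, and splitting on the regex "\\s+" (any run of whitespace) instead of a single space is an assumption made here so that repeated spaces do not produce empty words:

```java
import java.util.ArrayList;
import java.util.List;

final class TweetLinkSplitter {
    // Returns every whitespace-separated word that looks like a tweet link.
    static List<String> findLinks(String message) {
        List<String> links = new ArrayList<>();
        for (String word : message.trim().split("\\s+")) {
            if (word.startsWith("https://twitter.com/") && word.contains("/status/")) {
                links.add(word);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        System.out.println(findLinks("look at  https://twitter.com/user/status/xxxx please"));
    }
}
```

Unlike the indexOf approach, this handles several links in one message, at the cost of allocating an array of all the words.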

Answer 2

Answered by The Guy with The Hat

You can use String's indexOf(String str) method to find where the "http://" etc. is. You can then use the indexOf(String str, int fromIndex) method to find the first space after the URL. Lastly, call substring(int beginIndex, int endIndex) with those two values.

I won't give you the full code; you'll learn by writing it yourself.

Answer 3

Answered by connorp

Call String's indexOf(String s) method on the full string. Then add the returned index to the length of the target string (in this case "www.twitter.com") and use the sum as the starting index for substring.

String s = "http://www.twitter.com/user/status/xxxx";
String target = "www.twitter.com";
int index = s.indexOf(target);          // where the target starts
int subIndex = index + target.length(); // first character after the target
System.out.print(s.substring(subIndex)); // prints "/user/status/xxxx"
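
The same idea can be sketched as a reusable helper that also copes with a missing target, where indexOf returns -1. The class and method names here (SubstringAfter, substringAfter) are hypothetical, and returning an empty string when the target is absent is an assumption rather than something the answer specifies:

```java
final class SubstringAfter {
    // Returns the part of s that follows the first occurrence of target,
    // or an empty string if target does not occur in s.
    static String substringAfter(String s, String target) {
        int index = s.indexOf(target);
        if (index == -1) {
            return "";
        }
        return s.substring(index + target.length());
    }

    public static void main(String[] args) {
        String message = "blahblahblah http://www.twitter.com/user/status/xxxx";
        // Leading text before the link does not matter.
        System.out.println(substringAfter(message, "twitter.com")); // prints "/user/status/xxxx"
    }
}
```

Using "twitter.com" rather than "www.twitter.com" as the target lets the helper match both the "www." form from the question's example and the bare "https://twitter.com/" form from its code.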
