Java: How to retrieve more than 100 results using Twitter4j

Question

Asked by hapless_cap

I'm using the Twitter4j library to retrieve tweets, but I'm not getting nearly enough for my purposes. Currently, I get the maximum of 100 from one page. How do I use maxId and sinceId in the Processing code below to retrieve more than 100 results from the Twitter search API? I'm totally new to Processing (and programming in general), so any direction on this would be awesome! Thanks!

void setup() {

  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey("xxxx");
  cb.setOAuthConsumerSecret("xxxx");
  cb.setOAuthAccessToken("xxxx");
  cb.setOAuthAccessTokenSecret("xxxx");

  Twitter twitter = new TwitterFactory(cb.build()).getInstance();
  Query query = new Query("#peace");
  query.setCount(100);

  try {
    QueryResult result = twitter.search(query);
    ArrayList tweets = (ArrayList) result.getTweets();

    for (int i = 0; i < tweets.size(); i++) {
      Status t = (Status) tweets.get(i);

      GeoLocation loc = t.getGeoLocation();

      if (loc!=null) {
        tweets.get(i++);

        String user = t.getUser().getScreenName();
        String msg = t.getText();

        Double lat = t.getGeoLocation().getLatitude();
        Double lon = t.getGeoLocation().getLongitude();

        println("USER: " + user + " wrote: " + msg + " located at " + lat + ", " + lon);

      }
    }
  }

  catch (TwitterException te) {
    println("Couldn't connect: " + te);
  };
}

void draw() {
}

Answer 1

Accepted answer by Petros Koutsolampros

Unfortunately you can't, at least not in a direct way such as doing

query.setCount(101);

As the javadoc says, it will only allow up to 100 tweets.

To get around this, request the tweets in batches, and for each batch set the maximum ID to 1 less than the lowest ID received so far. To wrap this up, gather every tweet into an ArrayList (which, by the way, should not stay a raw type but be declared as ArrayList<Status>, an ArrayList that holds Status objects) and then print everything. Here's an implementation:

void setup() {

  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey("xxxx");
  cb.setOAuthConsumerSecret("xxxx");
  cb.setOAuthAccessToken("xxxx");
  cb.setOAuthAccessTokenSecret("xxxx");

  Twitter twitter = new TwitterFactory(cb.build()).getInstance();
  Query query = new Query("#peace");
  int numberOfTweets = 512;
  long lastID = Long.MAX_VALUE;
  ArrayList<Status> tweets = new ArrayList<Status>();
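  while (tweets.size() < numberOfTweets) {
    // The search API returns at most 100 tweets per request
    query.setCount(Math.min(100, numberOfTweets - tweets.size()));
    try {
      QueryResult result = twitter.search(query);
      if (result.getTweets().isEmpty()) break; // no older tweets left
      tweets.addAll(result.getTweets());
      println("Gathered " + tweets.size() + " tweets");
      // Track the lowest ID seen so far
      for (Status t : result.getTweets())
        if (t.getId() < lastID) lastID = t.getId();
    }
    catch (TwitterException te) {
      println("Couldn't connect: " + te);
      break; // stop instead of repeating the same failing request forever
    }
    // maxId is inclusive, so subtract 1 to avoid fetching the oldest tweet again
    query.setMaxId(lastID - 1);
  }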

  for (int i = 0; i < tweets.size(); i++) {
    Status t = tweets.get(i);
    GeoLocation loc = t.getGeoLocation();

    String user = t.getUser().getScreenName();
    String msg = t.getText();
    if (loc != null) {
      double lat = loc.getLatitude();
      double lon = loc.getLongitude();
      println(i + " USER: " + user + " wrote: " + msg + " located at " + lat + ", " + lon);
    } else {
      println(i + " USER: " + user + " wrote: " + msg);
    }
  }
}

Note: The line

ArrayList<Status> tweets = new ArrayList<Status>();

should properly be:

List<Status> tweets = new ArrayList<Status>();

because it is generally better to declare the variable by its interface, in case you later want to swap in a different implementation. If you are on Processing 2.x, this requires the following import at the top of the sketch:

import java.util.List;
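
The batching idea above can be tried without Twitter credentials. The following is a minimal sketch, not Twitter4j code: `fakeSearch` is a hypothetical stand-in for `twitter.search(query)` that returns plain IDs (newest first) from an imaginary pool numbered 1 to `totalAvailable`, and the class and method names are made up for illustration. It shows the count being capped at 100 per request, the maximum ID being lowered after each batch, and the loop stopping when nothing older is left.

```java
import java.util.ArrayList;
import java.util.List;

class MaxIdPaging {
    static final int PAGE_LIMIT = 100; // per-request cap, as in the search API

    // Hypothetical stand-in for twitter.search(query): returns up to `count`
    // IDs that are <= maxId, highest (newest) first, from the pool 1..totalAvailable.
    static List<Long> fakeSearch(long totalAvailable, long maxId, int count) {
        List<Long> page = new ArrayList<>();
        long start = Math.min(maxId, totalAvailable);
        for (long id = start; id >= 1 && page.size() < count; id--) {
            page.add(id);
        }
        return page;
    }

    // Collects up to `wanted` IDs by repeatedly lowering maxId.
    static List<Long> collect(long totalAvailable, int wanted) {
        List<Long> collected = new ArrayList<>();
        long maxId = Long.MAX_VALUE;
        while (collected.size() < wanted) {
            int count = Math.min(PAGE_LIMIT, wanted - collected.size());
            List<Long> page = fakeSearch(totalAvailable, maxId, count);
            if (page.isEmpty()) {
                break; // nothing older is left, so stop instead of looping forever
            }
            collected.addAll(page);
            long lowest = Long.MAX_VALUE;
            for (long id : page) {
                lowest = Math.min(lowest, id);
            }
            maxId = lowest - 1; // maxId is inclusive, so step past the oldest ID
        }
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(collect(1000, 512).size()); // prints 512
        System.out.println(collect(250, 512).size());  // prints 250 (pool exhausted)
    }
}
```

The second call shows why the empty-page check matters: when fewer tweets exist than were requested, the loop ends with what it has rather than retrying indefinitely.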

Answer 2

Answer by Jonathan

Just keep track of the lowest Status id and use that to set the max_id for subsequent search calls. This will allow you to step back through the results 100 at a time until you've got enough, e.g.:

boolean finished = false;
while (!finished) {
    final QueryResult result = twitter.search(query);    

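    final List<Status> statuses = result.getTweets();
    if (statuses.isEmpty()) {
        break; // no older results left; avoids resetting max_id to Long.MAX_VALUE - 1
    }
    long lowestStatusId = Long.MAX_VALUE;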
    for (Status status : statuses) {
        // do your processing here and work out if you are 'finished' etc... 

        // Capture the lowest (earliest) Status id
        lowestStatusId = Math.min(status.getId(), lowestStatusId);
    }

    // Subtracting one here because 'max_id' is inclusive
    query.setMaxId(lowestStatusId - 1);
}

See Twitter's guide on Working with Timelines for more information.
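
To see why the code subtracts one, here is a small sketch that compares two consecutive pages with and without the subtraction. It is not Twitter4j code: `page` is a hypothetical stand-in for a search with max_id over an imaginary pool of IDs 1 to 10, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class InclusiveMaxIdDemo {
    static final long NEWEST_ID = 10; // imaginary pool holds IDs 1..10

    // Hypothetical stand-in for a search with max_id: IDs <= maxId, newest first.
    static List<Long> page(long maxId, int count) {
        List<Long> ids = new ArrayList<>();
        for (long id = Math.min(maxId, NEWEST_ID); id >= 1 && ids.size() < count; id--) {
            ids.add(id);
        }
        return ids;
    }

    static long lowestId(List<Long> ids) {
        long lowest = Long.MAX_VALUE;
        for (long id : ids) {
            lowest = Math.min(lowest, id);
        }
        return lowest;
    }

    // Fetches two pages of three IDs, optionally subtracting one from the boundary.
    static List<Long> twoPages(boolean subtractOne) {
        List<Long> all = new ArrayList<>(page(Long.MAX_VALUE, 3));
        long lowest = lowestId(all);
        all.addAll(page(subtractOne ? lowest - 1 : lowest, 3));
        return all;
    }

    public static void main(String[] args) {
        System.out.println(twoPages(false)); // prints [10, 9, 8, 8, 7, 6]
        System.out.println(twoPages(true));  // prints [10, 9, 8, 7, 6, 5]
    }
}
```

Without the subtraction, the boundary ID (8 here) comes back a second time and takes a slot that could have held a new result.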

Answer 3

Answer by Rodrigo

Here's the function I made for my app based on the past answers. Thank you everybody for your solutions.

List<Status> tweets = new ArrayList<Status>();

// Assumes a configured Twitter instance named `twitter` is declared elsewhere in the sketch.
void getTweets(String term)
{
  int wantedTweets = 112;
  int remainingTweets = wantedTweets;
  Query query = new Query(term);
  try
  {
    while (remainingTweets > 0)
    {
      // The search API returns at most 100 tweets per request
      query.setCount(Math.min(100, remainingTweets));
      QueryResult result = twitter.search(query);
      if (result.getTweets().isEmpty())
      {
        break; // no older tweets available
      }
      tweets.addAll(result.getTweets());
      // Assumes results are ordered newest first, so the last tweet has the lowest ID
      long lastSearchID = tweets.get(tweets.size() - 1).getId();
      // maxId is inclusive; subtract 1 so the same tweet is not fetched again
      query.setMaxId(lastSearchID - 1);
      remainingTweets = wantedTweets - tweets.size();
    }

    println("tweets.size() " + tweets.size());
  }
  catch (TwitterException te)
  {
    System.out.println("Failed to search tweets: " + te.getMessage());
    System.exit(-1);
  }
}

Answer 4

Answer by user3032707

From the Twitter search API doc: "At this time, users represented by access tokens can make 180 requests/queries per 15 minutes. Using application-only auth, an application can make 450 queries/requests per 15 minutes on its own behalf without a user context." You can wait for 15 minutes and then collect another batch of 400 tweets, something like:

if (tweets.size() % 400 == 0) {
    try {
        // Sleep for 15 minutes (900,000 ms) until the rate-limit window resets
        Thread.sleep(900000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
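
As a rough illustration of the quoted limits, the sketch below estimates how many requests a given number of tweets needs and how many 15-minute waits that implies under the 180-requests-per-window user limit. It assumes every request returns a full page of 100 tweets, which real searches often do not, so treat the numbers as a lower bound. The class and method names are hypothetical.

```java
class RateLimitPlan {
    static final int TWEETS_PER_REQUEST = 100;  // per-request cap of the search API
    static final int REQUESTS_PER_WINDOW = 180; // user-auth limit quoted above
    static final long WINDOW_MILLIS = 15L * 60L * 1000L;

    // Requests needed if every page comes back full (an optimistic assumption).
    static int requestsNeeded(int wantedTweets) {
        return (wantedTweets + TWEETS_PER_REQUEST - 1) / TWEETS_PER_REQUEST;
    }

    // Number of full 15-minute waits before the final request can be sent.
    static int waitsNeeded(int wantedTweets) {
        int requests = requestsNeeded(wantedTweets);
        if (requests == 0) {
            return 0;
        }
        return (requests - 1) / REQUESTS_PER_WINDOW;
    }

    static long totalWaitMillis(int wantedTweets) {
        return waitsNeeded(wantedTweets) * WINDOW_MILLIS;
    }

    public static void main(String[] args) {
        System.out.println(requestsNeeded(512));     // prints 6
        System.out.println(waitsNeeded(512));        // prints 0
        System.out.println(waitsNeeded(18001));      // prints 1
        System.out.println(totalWaitMillis(18001));  // prints 900000
    }
}
```

Under these assumptions, a few hundred tweets fit comfortably inside one window, and a pause only becomes necessary once more than 180 requests are sent within 15 minutes.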
