Python: Avoiding the Twitter API rate limit with Tweepy

Question

Asked by 4m1nh4j1

I saw in some question on Stack Exchange that the limitation can be a function of the number of requests per 15 minutes and depends also on the complexity of the algorithm, except that this is not a complex one.

So I use this code:

import tweepy
import sqlite3
import time

db = sqlite3.connect('data/MyDB.db')

# Get a cursor object
cursor = db.cursor()
cursor.execute('''CREATE TABLE IF NOT EXISTS MyTable(id INTEGER PRIMARY KEY, name TEXT, geo TEXT, image TEXT, source TEXT, timestamp TEXT, text TEXT, rt INTEGER)''')
db.commit()

consumer_key = ""
consumer_secret = ""
key = ""
secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(key, secret)

api = tweepy.API(auth)

search = "#MyHashtag"

for tweet in tweepy.Cursor(api.search,
                           q=search,
                           include_entities=True).items():
    while True:
        try:
            cursor.execute('''INSERT INTO MyTable(name, geo, image, source, timestamp, text, rt) VALUES(?,?,?,?,?,?,?)''',(tweet.user.screen_name, str(tweet.geo), tweet.user.profile_image_url, tweet.source, tweet.created_at, tweet.text, tweet.retweet_count))
        except tweepy.TweepError:
                time.sleep(60 * 15)
                continue
        break
db.commit()
db.close()

I always get the Twitter limitation error:

Traceback (most recent call last):
  File "stream.py", line 25, in <module>
    include_entities=True).items():
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 153, in next
    self.current_page = self.page_iterator.next()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/cursor.py", line 98, in next
    data = self.method(max_id = max_id, *self.args, **self.kargs)
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 200, in _call
    return method.execute()
  File "/usr/local/lib/python2.7/dist-packages/tweepy/binder.py", line 176, in execute
    raise TweepError(error_msg, resp)
tweepy.error.TweepError: [{'message': 'Rate limit exceeded', 'code': 88}]

Answer 1

Accepted answer by Aaron Hill

The problem is that your try/except block is in the wrong place. Inserting data into the database will never raise a TweepError; iterating over Cursor.items() is what raises it. I would suggest refactoring your code to call the next method of Cursor.items() in an infinite loop. That call should be placed inside the try/except block, since it is the one that can raise the error.

Here's (roughly) what the code should look like:

# above omitted for brevity
c = tweepy.Cursor(api.search,
                  q=search,
                  include_entities=True).items()
while True:
    try:
        tweet = c.next()
        # Insert into db
    except tweepy.TweepError:
        time.sleep(60 * 15)
        continue
    except StopIteration:
        break

This works because when Tweepy raises a TweepError, it hasn't updated any of the cursor data. The next time it makes the request, it will use the same parameters as the request that triggered the rate limit, effectively repeating it until it goes through.
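The retry pattern can be tried out without touching the Twitter API. The following is only a sketch: `RateLimitError`, `FlakyCursor` and `drain` are hypothetical names invented for this illustration and are not part of Tweepy. `FlakyCursor` imitates the behaviour described above by raising an error without advancing its position, and `sleep` is injectable so the example does not actually wait 15 minutes.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate-limit TweepError (code 88)."""


class FlakyCursor(object):
    """Hypothetical iterator: raises once at the given positions, and
    does not advance when it raises (as described for Tweepy's cursor)."""

    def __init__(self, items, fail_before):
        self._items = list(items)
        self._pos = 0
        self._fail_before = set(fail_before)

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._items):
            raise StopIteration
        if self._pos in self._fail_before:
            # Fail once for this position; the position is left unchanged.
            self._fail_before.discard(self._pos)
            raise RateLimitError("Rate limit exceeded")
        item = self._items[self._pos]
        self._pos += 1
        return item

    next = __next__  # Python 2 spelling of the same method


def drain(cursor, wait_seconds=15 * 60, sleep=time.sleep):
    """Collect every item, sleeping and retrying on a rate-limit error."""
    results = []
    while True:
        try:
            item = next(cursor)
        except RateLimitError:
            sleep(wait_seconds)  # wait out the window, then retry the same step
            continue
        except StopIteration:
            break
        results.append(item)
    return results


# Record the requested sleeps instead of really sleeping.
waits = []
tweets = drain(FlakyCursor(["a", "b", "c"], fail_before=[1]), sleep=waits.append)
```

Because the failed step is retried rather than skipped, no item is lost: `tweets` ends up holding all three items and `waits` records a single 900-second pause.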

Answer 2

Answer by Till Hoffmann

If you want to avoid errors and respect the rate limit, you can use the following function, which takes your api object as an argument. It reads the number of remaining requests of the same type as the last request and, if desired, waits until the rate limit has been reset.

from datetime import datetime
from time import sleep


def test_rate_limit(api, wait=True, buffer=.1):
    """
    Tests whether the rate limit of the last request has been reached.
    :param api: The `tweepy` api instance.
    :param wait: A flag indicating whether to wait for the rate limit reset
                 if the rate limit has been reached.
    :param buffer: A buffer time in seconds that is added on to the waiting
                   time as an extra safety margin.
    :return: True if it is ok to proceed with the next request. False otherwise.
    """
    # Get the number of remaining requests
    remaining = int(api.last_response.getheader('x-rate-limit-remaining'))
    # Check if we have reached the limit
    if remaining == 0:
        limit = int(api.last_response.getheader('x-rate-limit-limit'))
        reset = int(api.last_response.getheader('x-rate-limit-reset'))
        # Convert the reset time (epoch seconds) to a local datetime,
        # so it can be compared with datetime.now() below
        reset = datetime.fromtimestamp(reset)
        # Let the user know we have reached the rate limit
        print("0 of {} requests remaining until {}.".format(limit, reset))

        if wait:
            # Determine the delay and sleep
            delay = (reset - datetime.now()).total_seconds() + buffer
            print("Sleeping for {}s...".format(delay))
            sleep(delay)
            # We have waited for the rate limit reset. OK to proceed.
            return True
        else:
            # We have reached the rate limit. The user needs to handle the rate limit manually.
            return False

    # We have not reached the rate limit
    return True
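The delay arithmetic in the function above can also be done directly on epoch seconds, which avoids the datetime conversion. This is a minimal sketch, assuming the x-rate-limit-reset header holds the reset time as seconds since the Unix epoch; `seconds_until_reset` is a hypothetical helper name, not part of Tweepy. It also clamps the wait at zero, so a reset time that has already passed never produces a negative sleep.

```python
import time


def seconds_until_reset(reset_epoch, now_epoch=None, buffer=0.1):
    """Return how long to sleep until the rate limit resets.

    reset_epoch: reset time in seconds since the Unix epoch (assumed to be
                 the value of the x-rate-limit-reset header).
    now_epoch:   current time in epoch seconds; defaults to time.time().
    buffer:      extra safety margin in seconds.
    """
    if now_epoch is None:
        now_epoch = time.time()
    # Never return a negative wait if the window has already reset.
    return max(0.0, reset_epoch - now_epoch) + buffer


# Example with fixed numbers: the reset is 60 seconds in the future.
delay = seconds_until_reset(1000, now_epoch=940, buffer=0.5)
```

The result could be passed straight to `time.sleep`.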

Answer 3

Answer by Dan Nguyen

For anyone who stumbles upon this via Google: tweepy 3.2+ has additional parameters for the tweepy.API class, in particular:

wait_on_rate_limit – Whether or not to automatically wait for rate limits to replenish
wait_on_rate_limit_notify – Whether or not to print a notification when Tweepy is waiting for rate limits to replenish

Setting these flags to True delegates the waiting to the API instance, which is good enough for most simple use cases.

Answer 4

Answer by Mayank Khullar

Just replace

api = tweepy.API(auth)

with

api = tweepy.API(auth, wait_on_rate_limit=True)

Answer 5

Answer by Malik Faiq

import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# Tweepy prints a notification when the rate limit is hit and waits by itself; no manual sleep is needed.
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
