Python: Get All Follower IDs on Twitter with Tweepy

Question

Asked by user1056824

Is it possible to get the full follower list of an account who has more than one million followers, like McDonald's?

I use Tweepy with the following code:

c = tweepy.Cursor(api.followers_ids, id = 'McDonalds')
ids = []
for page in c.pages():
    ids.append(page)

I also tried this:

for id in c.items():
    ids.append(id)

But I always get a 'Rate limit exceeded' error, and I only ever receive 5000 follower IDs.

Answer 1

Accepted answer by alecxe

To avoid hitting the rate limit, you can (and should) wait before requesting the next page of followers. It looks hacky, but it works:

import time
import tweepy

auth = tweepy.OAuthHandler(..., ...)
auth.set_access_token(..., ...)

api = tweepy.API(auth)

ids = []
for page in tweepy.Cursor(api.followers_ids, screen_name="McDonalds").pages():
    ids.extend(page)
    time.sleep(60)

print(len(ids))

Hope that helps.

Answer 2

Answered by aspiringGuru

Use the rate-limiting arguments when creating the API object. Tweepy will then throttle itself to stay within the rate limit.

The sleep pause is not bad; I use it to simulate a human and to spread activity out over a time frame, with the API's rate limiting as a final safeguard.

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True, compression=True)

Also add try/except to catch and handle errors.
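
As a minimal sketch of that try/except idea, the helper below drains an iterator of ID pages and, when the iterator raises, pauses and retries instead of losing what has been collected. The name `collect_pages` and all of its parameters are hypothetical, not part of Tweepy. It assumes the iterator can be called again after it raises, which holds for class-based iterators but not for generators; I believe Tweepy's cursor page iterator behaves this way, but treat that as an assumption.

```python
import time


def collect_pages(pages, retry_on=(Exception,), wait_seconds=15 * 60,
                  max_retries=3, sleep=time.sleep):
    """Drain an iterator of ID pages, pausing and retrying when it raises.

    Hypothetical helper: `pages` must be an iterator whose __next__ can be
    called again after raising (a generator cannot be resumed this way).
    """
    ids = []
    retries = 0
    while True:
        try:
            page = next(pages)
        except StopIteration:
            break  # no more pages
        except retry_on:
            retries += 1
            if retries > max_retries:
                raise  # give up after too many consecutive failures
            sleep(wait_seconds)  # wait for the rate-limit window to reset
            continue
        retries = 0  # a successful request resets the failure counter
        ids.extend(page)
    return ids
```

With Tweepy 3.x this might be called as `collect_pages(tweepy.Cursor(api.followers_ids, screen_name="McDonalds").pages(), retry_on=(tweepy.TweepError,))`. Capping consecutive retries keeps a non-rate-limit failure, such as bad credentials, from sleeping forever.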

Example code: https://github.com/aspiringguru/twitterDataAnalyse/blob/master/sample_rate_limit_w_cursor.py

I put my keys in an external file to make management easier.

https://github.com/aspiringguru/twitterDataAnalyse/blob/master/keys.py

Answer 3

Answered by irritable_phd_syndrom

The answer from alecxe is good, but no one has referred to the docs. The information that actually answers the question is in the Twitter API documentation. From the documentation:

Results are given in groups of 5,000 user IDs and multiple “pages” of results can be navigated through using the next_cursor value in subsequent requests.
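
A rough estimate shows why a million followers takes hours rather than minutes. The sketch below uses the 5,000 IDs per page quoted above and assumes a limit of 15 requests per 15-minute window. That limit is my recollection for the v1.1 followers/ids endpoint and is not stated in this thread, so check the current documentation. `estimate_fetch` is a hypothetical helper name.

```python
import math


def estimate_fetch(follower_count, ids_per_page=5000,
                   requests_per_window=15, window_minutes=15):
    """Return (pages needed, minimum minutes spent waiting on the rate limit).

    Hypothetical helper; the rate-limit defaults are assumptions.
    """
    pages = math.ceil(follower_count / ids_per_page)
    # The first window's requests need no waiting; each later window
    # costs one full wait before its requests can be sent.
    waits = max(pages - 1, 0) // requests_per_window
    return pages, waits * window_minutes


pages, wait_minutes = estimate_fetch(1_000_000)
print(pages, wait_minutes)
```

Under these assumptions, an account with one million followers needs 200 requests and at least 195 minutes of waiting, which is consistent with the original poster hitting the rate limit almost immediately when requesting pages without pausing.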

Answer 4

Answered by zana saedpanah

I use this code and it works for accounts with a large number of followers. There are two functions: one saves the follower IDs collected so far before each sleep period, and the other fetches the list. It is a little messy, but I hope it is useful.

import csv
import time

import tweepy

# Assumes `api` is an authenticated tweepy.API instance (Tweepy 3.x naming).

def save_followers_status(filename, follower_ids):
    """Append follower IDs to <filename>_followers_status.csv, one per row."""
    if len(follower_ids) == 0:
        return
    path = '//content//drive//My Drive//Colab Notebooks//twitter//' + filename
    print("save followers status of ", filename)
    # newline='' avoids blank lines between rows:
    # https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row
    # mode='a' creates the file if it does not exist yet.
    with open(path + '_followers_status.csv', mode='a', newline='') as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        for follower_id in follower_ids:
            writer.writerow([follower_id])

def get_followers_id(person):
    follower_ids = []
    count = 0

    influencer = api.get_user(screen_name=person)
    number_of_followers = influencer.followers_count
    print("number of followers count : ", number_of_followers, '\n', 'user id : ', influencer.id)
    status = tweepy.Cursor(api.followers_ids, screen_name=person).items()
    while True:
        try:
            follower_ids.append(next(status))
            count += 1
        except StopIteration:
            # The cursor is exhausted: every follower ID has been read.
            print('end of followers ', count, 'all followers count is : ', number_of_followers)
            break
        except tweepy.TweepError:
            # Most likely the rate limit: save progress, then sleep for 15 minutes.
            # Note that any other TweepError is retried the same way.
            print('Twitter rate limit reached, sleeping for 15 min')
            print(time.strftime("%d.%m.%Y %H:%M:%S", time.localtime()))
            print('fetched so far :', count, 'all followers count is : ', number_of_followers)
            save_followers_status(person, follower_ids)
            follower_ids = []
            time.sleep(15 * 60)
    save_followers_status(person, follower_ids)
    # Only the IDs collected since the last pause; earlier batches are on disk.
    return follower_ids
