Using Python Requests: Sessions, Cookies, and POST
Note: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/15778466/
Asked by user2238685
I am trying to scrape some selling data using the StubHub API. An example of this data can be seen here:
https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata
You'll notice that if you try to visit that URL without logging in to stubhub.com, it won't work. You will need to log in first.
Once I've signed in via my web browser, I open the URL which I want to scrape in a new tab, then use the following command to retrieve the scraped data:
r = requests.get('https://sell.stubhub.com/sellapi/event/4236070/section/null/seatmapdata')
However, once the browser session expires after ten minutes, I get this error:
<FormErrors>
<FormField>User Auth Check</FormField>
<ErrorMessage>
Either is not active or the session might have expired. Please login again.
</ErrorMessage>
I think that I need to implement the session ID via cookie to keep my authentication alive and well.
The Requests library documentation is pretty terrible for someone who has never done this sort of thing before, so I was hoping you folks might be able to help.
The example provided by Requests is:
s = requests.Session()
s.get('http://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get("http://httpbin.org/cookies")
print(r.text)
# '{"cookies": {"sessioncookie": "123456789"}}'
I honestly can't make heads or tails of that. How do I preserve cookies between POST requests?
Answered by Micha?
I don't know how StubHub's API works, but generally it should look like this:
s = requests.Session()
data = {"login":"my_login", "password":"my_password"}
url = "http://example.net/login"
r = s.post(url, data=data)
Now your session contains the cookies set by the login form. To access the cookies of this session, simply use
s.cookies
Any further requests made through this session will send these cookies along automatically.
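To make the idea concrete, here is a minimal sketch of how a Session carries cookies from one request to the next. The URLs, the form field names, and the login helper are hypothetical placeholders, not StubHub's real API. The second half runs offline: it puts a cookie into the session's jar by hand, standing in for what a successful login response would do, and then builds a follow-up request without sending it to show which Cookie header the session would attach.

```python
import requests

# Hypothetical endpoints, following the placeholder host used in the answer.
LOGIN_URL = "http://example.net/login"
DATA_URL = "http://example.net/data"


def login(session, username, password):
    """POST the credentials; cookies set by the response are stored on the session."""
    # The field names "login" and "password" are assumptions; use whatever
    # names the real login form expects.
    response = session.post(LOGIN_URL, data={"login": username, "password": password})
    response.raise_for_status()
    return response


# Offline illustration (login() is not called here, so nothing touches the network).
s = requests.Session()

# Stand-in for the cookie a real login response would set.
s.cookies.set("sessioncookie", "123456789")

# Build, but do not send, a later request made through the same session.
prepared = s.prepare_request(requests.Request("GET", DATA_URL))
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)
```

In real use you would call login(s, ...) once and then make every later call through the same object, for example s.get(DATA_URL), rather than through the module-level requests.get, which starts with an empty cookie jar each time. If the server expires the session, logging in again with the same Session object should refresh the stored cookies.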

