jsoup posting and cookie
Source: http://stackoverflow.com/questions/6432970/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors: Stack Overflow.
Asked by Gwindow
I'm trying to use jsoup to log in to a site and then scrape information, but I've run into a problem. I can log in successfully and create a Document from index.php, but I cannot get other pages on the site. I know I need to store a cookie after I post and then send it when I open another page on the site, but how do I do this? The following code lets me log in and get index.php:
Document doc = Jsoup.connect("http://www.example.com/login.php")
.data("username", "myUsername",
"password", "myPassword")
.post();
I know I could use Apache HttpClient to do this, but I don't want to.
Accepted answer by Jonathan Hedley
When you log in to the site, it is probably setting an authorised session cookie, which needs to be sent on subsequent requests to maintain the session.
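To make the mechanism concrete, here is a minimal sketch of the cookie round trip at the HTTP level, using only the standard library. The class name `CookieRoundTrip` and the sample header values are made up for illustration; jsoup handles this parsing for you, so this is only meant to show what travels over the wire.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of jsoup: shows how Set-Cookie response
// headers turn into the Cookie request header on the next request.
class CookieRoundTrip {

    /** Extracts name=value pairs from Set-Cookie values, ignoring attributes such as Path or HttpOnly. */
    static Map<String, String> parseSetCookie(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            // Everything before the first ';' is the name=value pair.
            String pair = header.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq > 0) {
                cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return cookies;
    }

    /** Builds the value of the Cookie request header from the stored cookies. */
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        cookies.forEach((name, value) -> joiner.add(name + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Sample headers a login response might carry (assumed values).
        List<String> headers = List.of(
                "SESSIONID=abc123; Path=/; HttpOnly",
                "lang=en; Max-Age=3600");
        Map<String, String> cookies = parseSetCookie(headers);
        System.out.println("Cookie: " + toCookieHeader(cookies));
    }
}
```

If the follow-up request does not carry that `Cookie` header, the server has no way to tie it to the logged-in session, which is why the other pages fail.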
You can get the cookie like this:
Connection.Response res = Jsoup.connect("http://www.example.com/login.php")
.data("username", "myUsername", "password", "myPassword")
.method(Method.POST)
.execute();
Document doc = res.parse();
String sessionId = res.cookie("SESSIONID"); // you will need to check what the right cookie name is
And then send it on the next request, like this:
Document doc2 = Jsoup.connect("http://www.example.com/otherPage")
.cookie("SESSIONID", sessionId)
.get();
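Since the right cookie name depends on the site, one way to find it is to look through everything the login response set (in jsoup, the map returned by `res.cookies()`). The sketch below is a rough heuristic, not anything jsoup provides: it assumes session cookies usually have "sess" somewhere in the name, as in PHP's default `PHPSESSID` or the servlet default `JSESSIONID`. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: guesses which cookies look like session cookies.
class SessionCookieFinder {

    /** Returns the names of cookies whose name contains "sess", ignoring case. */
    static List<String> likelySessionCookies(Map<String, String> cookies) {
        List<String> names = new ArrayList<>();
        for (String name : cookies.keySet()) {
            if (name.toLowerCase(Locale.ROOT).contains("sess")) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by res.cookies(); values are invented.
        Map<String, String> cookies = Map.of("PHPSESSID", "abc123", "lang", "en");
        System.out.println(likelySessionCookies(cookies));
    }
}
```

If nothing matches, printing the whole map or checking the browser's developer tools after a manual login is the more reliable route.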
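Answer by Igor Brusamolin Lobo Santos
// Assumes imports of org.jsoup.Jsoup, org.jsoup.Connection.Response and org.jsoup.Connection.Method.
// This will get you the response.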
Response res = Jsoup
.connect("loginPageUrl")
.data("loginField", "[email protected]", "passField", "pass1234")
.method(Method.POST)
.execute();
//This will get you cookies
Map<String, String> loginCookies = res.cookies();
//And this is the easiest way I've found to remain in session
Document doc = Jsoup.connect("urlYouNeedToBeLoggedInToAccess")
.cookies(loginCookies)
.get();
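Some sites set or refresh cookies on later pages too, so it can help to keep one map and merge each response's cookies into it. Below is a minimal sketch of that idea in plain Java; `SimpleCookieJar` is a made-up name, and the maps stand in for what `res.cookies()` would return.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: accumulates cookies across several responses.
class SimpleCookieJar {
    private final Map<String, String> cookies = new LinkedHashMap<>();

    /** Merges cookies from a response; a later value replaces an earlier one with the same name. */
    void update(Map<String, String> responseCookies) {
        cookies.putAll(responseCookies);
    }

    /** Returns a copy that can be handed to the next request. */
    Map<String, String> asMap() {
        return new LinkedHashMap<>(cookies);
    }

    public static void main(String[] args) {
        SimpleCookieJar jar = new SimpleCookieJar();
        jar.update(Map.of("SESSIONID", "abc123")); // e.g. from the login response
        jar.update(Map.of("csrf", "t0ken"));       // e.g. from a later page
        System.out.println(jar.asMap());
    }
}
```

With jsoup you would call `jar.update(res.cookies())` after each `execute()` and pass `jar.asMap()` to `.cookies(...)` on each new connection. Newer jsoup releases also appear to offer `Jsoup.newSession()`, which keeps cookies between requests for you, so check whether your version has it before rolling your own.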
Answer by user1935501
My code originally was:
Document doc = Jsoup.connect("urlYouNeedToBeLoggedInToAccess").cookies().get();
I was having difficulties until I changed it to pass the cookie map:
Document doc = Jsoup.connect("urlYouNeedToBeLoggedInToAccess").cookies(cookies).get();
Now it works flawlessly.
Answer by iamvinitk
Here is what you can try:
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
Connection.Response res = null;
try {
res = Jsoup
.connect("http://www.example.com/login.php")
.data("username", "your login id", "password", "your password")
.method(Connection.Method.POST)
.execute();
} catch (IOException e) {
e.printStackTrace();
}
Now save all your cookies and make a request to the other page you want.
// Store the cookies from the login response (requires java.util.Map)
Map<String, String> cookies = res.cookies();
Make the request to another page:
try {
Document doc = Jsoup.connect("your-second-page-link").cookies(cookies).get();
}
catch(Exception e){
e.printStackTrace();
}
Ask if you need further help.