JavaScript: scraping dynamic page content with PhantomJS

Question

Asked by user985590

My company uses a website that hosts all of our FAQ and customer questions. We plan to wipe out all of the old data and enter new data, but the service has no backup or archive option for questions we no longer want to appear.


I've tried to scrape the site using Perl and Mechanize, but I'm missing the customer comments on the page because they are loaded through AJAX. I have looked at PhantomJS and can save pages to an image using an example page; however, I'd like to get a full HTML dump of the page and can't figure out how. I used this example code on our site:


var page = new WebPage();

page.open('http://espn.go.com/nfl/', function (status) {
    // once page loaded, include jQuery from cdn
    page.includeJs("http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js", function () {
        // once jQuery loaded, run some code
        // inserts our custom text into the page
        page.evaluate(function () {
            $("h2").html('Many NFL Players Scared that Chad Moon Will Enter League');
        });
        // take screenshot and exit
        page.render('espn.png');
        phantom.exit();
    });
});

Is there a way, using PhantomJS, to get a full page dump of the data, similar to doing a view source in Chrome? I can do this with Perl + Mechanize, but don't see how to do it using PhantomJS.


Answer 1

Answered by McMeep

You can use page.content to get the full HTML DOM.
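As a rough illustration of how page.content might be used for this, here is a minimal sketch rather than a tested solution for the asker's site. The URL, the three-second wait for AJAX content, and the fileNameFor helper are all assumptions made for the example; only page.content comes from the answer itself. The PhantomJS part is guarded so that it runs only under phantomjs.

```javascript
// Sketch of a PhantomJS script; run with: phantomjs dump.js
// The URL, wait time, and fileNameFor helper are placeholders, not from the question.

// Hypothetical helper: turn a URL into a safe file name for the saved HTML.
function fileNameFor(url) {
    return url.replace(/^https?:\/\//, '').replace(/[^a-zA-Z0-9._-]+/g, '_') + '.html';
}

// Only run the scraping part when executed by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var fs = require('fs');
    var url = 'http://example.com/faq';   // assumed URL
    var waitMs = 3000;                    // assumed delay for AJAX content to load

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + url);
            phantom.exit(1);
            return;
        }
        // Wait briefly so AJAX-loaded comments can arrive, then
        // write the current DOM, serialized as HTML, to a file.
        window.setTimeout(function () {
            fs.write(fileNameFor(url), page.content, 'w');
            phantom.exit();
        }, waitMs);
    });
}
```

A fixed delay is the simplest option; if the comments take longer to load, polling for a known element before reading page.content would likely be more reliable.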


Answer 2

Answered by Radhouane Fazai

I would recommend pjscrape (http://nrabinowitz.github.com/pjscrape/) if you want to scrape using PhantomJS.

