JavaScript: PhantomJS does not execute the function passed to page.evaluate
Original question: http://stackoverflow.com/questions/12555203/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Phantomjs does not execute function in page.evaluate function
Asked by philipDS
I'm scraping a Facebook page with the PhantomJS node module (https://github.com/sgentle/phantomjs-node), but when I try to evaluate the page, it does not run the function I pass to it. Executing the code as a standalone script with the Node interpreter works. The same code in an Express.js app does not work.
This is my code:
facebookScraper.prototype.scrapeFeed = function (url, cb) {
    f = ':scrapeFeed:';
    var evaluator = function (s) {
        var posts = [];
        for (var i = 0; i < FEED_ITEMS; i++) {
            log.info(__filename+f+' iterating step ' + i);
            log.info(__filename+f+util.inspect(document, false, null));
        }
        return {
            news: posts
        };
    }
    phantom.create(function (ph) {
        ph.createPage(function (page) {
            log.fine(__filename+f+' opening url ' + url);
            page.open(url, function (status) {
                log.fine(__filename+f+' opened site? ' + status);
                setTimeout(function() {
                    page.evaluate(evaluator, function (result) {
                        log.info(__filename+f+'Scraped feed: ' + util.inspect(result, false, null));
                        cb(result, ph);
                    });
                }, 5000);
            });
        });
    });
};
The output I get:
{"level":"fine","message":"PATH/fb_regular.js:scrapeFeed: opening url <URL> ","timestamp":"2012-09-23T18:35:10.151Z"}
{"level":"fine","message":"PATH/fb_regular.js:scrapeFeed: opened site? success","timestamp":"2012-09-23T18:35:12.682Z"}
{"level":"info","message":"PATH/fb_regular.js:scrapeFeed: Scraped feed: null","timestamp":"2012-09-23T18:35:12.687Z"}
So, as you see, it calls the phantom callback function (second parameter in the evaluate function) with a null argument, but it doesn't execute the first parameter (my evaluator function, which prints iterating step X).
Does anyone know what the problem is?
Answered by DeadAlready
I'm unsure which version of PhantomJS you are using, but according to the documentation for versions 1.6+, logging inside the evaluated script is written to the console of the contained page. It will not appear in your own console. To see it, you have to bind a handler to the page's onConsoleMessage event:
page.onConsoleMessage = function (msg) { console.log(msg); };
As for the result not being available: in plain PhantomJS, page.evaluate takes the function to execute as its first argument, and the remaining arguments are passed as input to that function. The result is returned directly:
var title = page.evaluate(function (s) {
    return document.querySelector(s).innerText;
}, 'title');
console.log(title);
Answered by Sriram Srinivasan
evaluate is run in sandbox mode, which means that none of the variables defined in the containing environment are available, including cb or even the phantom object or any functions that you may have defined.
You can explicitly tunnel information into the sandbox as additional arguments to evaluate.
page.evaluate(function(cb){...}, cb);
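The sandboxing described above can be imitated in plain Node without PhantomJS. The sketch below is only an approximation: evaluateInSandbox is a hypothetical helper (not part of PhantomJS or phantomjs-node) that turns the function into source text and runs it in a fresh context via Node's vm module, which is roughly the situation an evaluated function finds itself in. It shows that a variable from the outer scope is missing inside the sandbox, while a value passed as an argument arrives intact.

```javascript
const vm = require('vm');

// Hypothetical stand-in for page.evaluate: serialize the function to source,
// JSON-encode the arguments, and run both in a fresh, empty context.
function evaluateInSandbox(fn, ...args) {
  const source = '(' + fn.toString() + ').apply(null, ' + JSON.stringify(args) + ')';
  return vm.runInNewContext(source, {});
}

const FEED_ITEMS = 3; // exists only in the outer (Node) scope

function usesOuterVariable() {
  return FEED_ITEMS; // not defined inside the sandbox
}

function usesArgument(count) {
  return count * 2; // count was passed in explicitly
}

let outerError = null;
try {
  evaluateInSandbox(usesOuterVariable);
} catch (err) {
  outerError = err.name;
}

const doubled = evaluateInSandbox(usesArgument, FEED_ITEMS);
console.log(outerError); // ReferenceError
console.log(doubled);    // 6
```

In the question's code, FEED_ITEMS, log, util and __filename are all outer-scope names of this kind, so they would need to be passed in as arguments or replaced by page-side equivalents.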
Answered by Artjom B.
PhantomJS' page.evaluate() function is the door to the DOM context (page context). It is only possible to access the DOM through this function. Since the function is sandboxed, you cannot use variables defined outside of it; they have to be passed in explicitly. There are limitations on what can be passed in and out, though (docs):
Note: The arguments and the return value to the evaluate function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine. Closures, functions, DOM nodes, etc. will not work!
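The JSON rule of thumb can be checked in plain Node. The sketch below uses a hypothetical helper, roundTrip, that pushes a value through JSON.stringify and JSON.parse; this only approximates what crosses the boundary between the two contexts, but it shows which kinds of values survive.

```javascript
// Round-trip a value through JSON, approximating what can cross the boundary.
function roundTrip(value) {
  const encoded = JSON.stringify(value);
  // JSON.stringify yields undefined for a bare function, so there is nothing to parse.
  return encoded === undefined ? undefined : JSON.parse(encoded);
}

const plain = roundTrip({ news: ['a', 'b'], count: 2 });
const withFunction = roundTrip({ name: 'cb', run: function () {} });
const bareFunction = roundTrip(function () {});

console.log(plain);        // plain data survives unchanged
console.log(withFunction); // the 'run' property is dropped
console.log(bareFunction); // undefined
```

Plain data comes through unchanged, a function-valued property silently disappears, and a bare function does not survive at all, which is why a callback cannot be handed to the evaluated function as an argument.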
phantomjs-node is a bridge between PhantomJS and node.js and as such has a slightly different API than PhantomJS itself. Functions that are synchronous in PhantomJS don't return anything in phantomjs-node; instead, they take a callback to which the result is passed. The callback executes in the outer context and is not sandboxed.
The arguments can be passed in this way:
page.evaluate(function(arg1, arg2){
    // use arg1 and arg2 in the page
    // return `result`
}, function(result){
    // use `result` in the node context
}, "some arg1", "another arg");
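To make the calling convention concrete without a PhantomJS installation, here is a small mock. mockPage is purely hypothetical and is not the real phantomjs-node implementation; it only imitates the argument order described above (function, callback, then arguments), runs the function in a fresh context via Node's vm module, and hands the JSON-encoded result to the callback asynchronously.

```javascript
const vm = require('vm');

// Hypothetical mock of the bridge's calling convention; NOT the real phantomjs-node code.
const mockPage = {
  evaluate(fn, callback, ...args) {
    // Run the function source in an empty context and JSON-encode its result there.
    const source = 'JSON.stringify((' + fn.toString() + ').apply(null, ' +
      JSON.stringify(args) + '))';
    const encoded = vm.runInNewContext(source, {});
    const result = encoded === undefined ? null : JSON.parse(encoded);
    // Deliver the result later, as a cross-process bridge would.
    setImmediate(() => callback(result));
    // Nothing is returned to the caller.
  }
};

const received = [];
const returned = mockPage.evaluate(
  function (selector, limit) {
    // "page" side: only the arguments are visible here
    return { selector: selector, limit: limit };
  },
  function (result) {
    // "node" side: outer variables such as `received` are visible here
    received.push(result);
    console.log('callback got', result);
  },
  'div.post', 5
);

console.log('return value:', returned); // undefined
```

The point of the sketch is that the value of the evaluate call itself is not useful in the bridge; everything that depends on the result belongs inside the callback.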
Answered by Fernando Gabrieli
The following worked for me to evaluate a page:
page.evaluate(function(s) {
    return document.querySelector(s)
}, 'body').then(res => {
    console.log(res)
})
Answered by ghf56dhv54
Someone else had an evaluate block containing only a console.log line, and it never executed, so it's not always a sandbox problem.
See the question "On PhantomJS I can't include jQuery and without jQuery I can't post form data".