Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/10514604/

Date: 2020-10-26 10:07:29  Source: igfitidea

Get source code after JavaScript execution with curl

Tags: javascript, html, curl

Asked by reox

Is it possible to get the HTML source code of a webpage with curl and then run a JavaScript interpreter over it, so that I get the generated content?

The page I need to fetch contains some encoded and generated content, so I want to run the JavaScript first to get the unescaped, generated output. Or do I need to pull the JavaScript out with a regex and "compile" it on my own? Like

curl <myurl> | perl -ne 'print "$1\n" if m/unescape\((.*)\)/' | <now do something with that>
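As a rough illustration of the "regex it and decode it yourself" route, here is a minimal Node.js sketch. It assumes the page calls unescape() with a plain quoted string literal; the function name decodeUnescapeCalls and the sample markup are hypothetical, and anything built up dynamically by the page's scripts would not be caught this way.

```javascript
// Find every unescape("...") or unescape('...') call in the HTML text and
// decode its string-literal argument. Only literal arguments are handled.
function decodeUnescapeCalls(html) {
  var pattern = /unescape\(\s*(['"])(.*?)\1\s*\)/g;
  var results = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    // match[2] is the text between the quotes; unescape() decodes %XX sequences.
    results.push(unescape(match[2]));
  }
  return results;
}

// Hypothetical sample standing in for what curl would return.
var sample = '<script>document.write(unescape("%3Cb%3Ehi%3C/b%3E"));</script>';
console.log(decodeUnescapeCalls(sample).join('\n')); // prints <b>hi</b>
```

This only covers the simplest case. Once the argument is computed at runtime, a real JavaScript engine is needed.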

I know there is no JavaScript engine in curl, but can I just call another script or program to do the job?

Answered by dcow

You can do it, but it's more involved than I think you realize. Neither curl nor wget has a JavaScript engine, so you'll need a tool that does.

I would start by looking at PhantomJS.
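For a sense of what that could look like, here is a minimal PhantomJS script sketch that loads a page, lets its scripts run, and prints the resulting HTML. The file name dump.js and the helper dumpRendered are hypothetical. It assumes PhantomJS is installed, and that the generated content is already in place when the load callback fires, which may not hold for pages that fill in content asynchronously.

```javascript
// dump.js: print a page's HTML after its scripts have run.
// Usage (assumes PhantomJS is installed): phantomjs dump.js <myurl>

// Open `url` in `page` and pass the rendered HTML (or an Error) to `done`.
function dumpRendered(page, url, done) {
  page.open(url, function (status) {
    if (status !== 'success') {
      done(new Error('failed to load ' + url));
      return;
    }
    // page.content is the serialized document after script execution.
    done(null, page.content);
  });
}

// Only wire things up when actually running inside PhantomJS.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = require('webpage').create();
  dumpRendered(page, system.args[1], function (err, html) {
    if (err) {
      console.error(err.message);
      phantom.exit(1);
      return;
    }
    console.log(html);
    phantom.exit(0);
  });
}
```

The output can then be piped onward much like curl's, for example: phantomjs dump.js <myurl> | <now do something with that>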
