Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/8789069/

Time: 2020-10-26 04:34:47  Source: igfitidea

viewing actual source code of a website

javascript html

Asked by tomermes

I'll explain my question with an example. Suppose I go to the URL: http://www.google.co.il/#q=university

Then I right-click and choose "view source". I don't get the real HTML source; I'm sure of that because when I search the code for unique words that appear in the document, I get no results.

I know that in Chrome I can select something and inspect the element, and then I can see the real source code. But I want to use a Java program to get the code, so I want to understand why I don't see the real HTML source when I go to "view source".

Accepted answer by MrHelper

Well, if you select "view source" you see the actual HTML source code of the page whose URL is in your address bar. However, the page(s) you want to view may be "obfuscated" by embedded code that loads external content and inserts it into the HTML.

If you still want to parse such a page automatically in a "nice" way, you need to run a whole HTML interpreter such as WebKit, which is a hell of a lot of work and is in principle what "inspect element" does for you. The other way is to find the lines in the page's HTML that load the external content and then load that content yourself. If you are lucky, this is not obfuscated on purpose and is fairly easy to achieve for small tasks.
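
The second approach, finding the lines that load external content, can be sketched in Java roughly as follows. This is only an illustration under simple assumptions: the class name `ScriptSources`, the method `findScriptSources`, and the sample markup are made up for this example, and a regular expression is a rough stand-in for a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScriptSources {
    // Rough pattern for <script ... src="..."> (single or double quotes).
    // Assumption: the markup is simple enough for a regex; a real parser is more robust.
    private static final Pattern SCRIPT_SRC = Pattern.compile(
            "<script\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']",
            Pattern.CASE_INSENSITIVE);

    /** Returns the src attribute of every external script tag found in the HTML. */
    public static List<String> findScriptSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = SCRIPT_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(1));
        }
        return sources;
    }

    public static void main(String[] args) {
        // Hypothetical page: two external scripts and one inline script.
        String html = "<html><head>"
                + "<script src=\"/js/app.js\"></script>"
                + "<script>var inline = 1;</script>"
                + "<script type='text/javascript' src='https://example.com/lib.js'></script>"
                + "</head><body></body></html>";
        for (String src : findScriptSources(html)) {
            System.out.println(src);
        }
    }
}
```

Each URL found this way would still have to be resolved against the page URL and downloaded separately, and the downloaded script may itself be hard to read.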

However, if you need the whole DOM structure, you should think about implementing one of the browser engines...

Answer by Joachim Isaksson

View source usually does not show any JavaScript-generated content; to see that, you'll want to use a plugin such as Firebug.
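
A minimal Java sketch of what "view source" corresponds to, assuming Java 11 or later: it downloads the response body without running any scripts and checks whether a phrase appears in it. The names `RawSource`, `fetch`, and `containsText`, and the default URL and phrase, are placeholders chosen for this illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Locale;

public class RawSource {
    /** Downloads the response body as sent by the server; no JavaScript is executed. */
    public static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    /** Case-insensitive check for a phrase in the raw markup. */
    public static boolean containsText(String html, String phrase) {
        return html.toLowerCase(Locale.ROOT).contains(phrase.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) throws Exception {
        // Placeholder defaults; pass your own URL and phrase as arguments.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        String phrase = args.length > 1 ? args[1] : "Example Domain";
        String html = fetch(url);
        System.out.println("Raw response length: " + html.length());
        System.out.println("Contains phrase: " + containsText(html, phrase));
    }
}
```

If a phrase is visible in the browser but `containsText` reports false for the raw response, the phrase was most likely added to the page by a script after loading. Note also that the part of a URL after `#` is never sent to the server, so the raw response cannot depend on it.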

Answer by mrembisz

The only way I know of to see the actual source from Java, including modifications made by JavaScript, is through a virtual browser framework like HtmlUnit.

HtmlUnit can execute JS scripts and apply all changes to the DOM tree. You would have to serialize it to get the actual page. Keep in mind there is no such thing as "complete HTML source". You can only get the DOM tree and possibly serialize it.

Answer by gibffe

In the example page you gave, each result element is generated by a JS function from one of the loaded script files; moreover, the text is not written with plain characters but with Unicode instead.
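
One possible reading of this, and it is only an assumption, is that the scripts carry the text as JavaScript-style escape sequences such as `\u0075` rather than as literal characters, which would explain why a plain-text search of the source finds nothing. Under that assumption, a small Java sketch for decoding such sequences before searching might look like this (the class `UnicodeEscapes` and method `decode` are hypothetical names):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapes {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    /** Replaces each four-digit escape sequence with the character it encodes. */
    public static String decode(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and backslashes in the result literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical fragment of script text with one escaped character.
        String raw = "\\u0075niversity";
        System.out.println(decode(raw));
    }
}
```

Searching the decoded text rather than the raw script text gives a fairer test of whether a word is present in the downloaded files.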

Answer by Azodious

What word did you search for?

I guess view source will show the complete HTML code, even the parts that are not visible on the page. Try searching again after trimming the search string, and also search for the same string in Chrome, as you tried earlier.

Plus, view source will not be updated if JS changes the HTML after the onload event completes.

Answer by Jeremiah Orr

The text you're looking for could have been rendered from JavaScript. If you're using Chrome (since you mentioned it), the web developer pane that comes up when you do "inspect element" has a "Resources" tab that lists JavaScript files, stylesheets, etc.

Answer by yatskevich

"View source" gives you a pure response generated by a server. As Joachim Isaksson has already mentioned - use Chrome or Firebug for Firefox.
