How exactly does Java compilation take place?
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/3406942/
Asked by nash
Confused by java compilation process
OK, I know this: we write Java source code, the compiler, which is platform independent, translates it into bytecode, then the JVM, which is platform dependent, translates it into machine code.
So from the start, we write Java source code. The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the Java compiler written in Java? Then how come there is a .exe file which executes it? If the compiler code is written in Java, then how come the compiler code is executed at the compilation stage, since it's the job of the JVM to execute Java code? How can a language compile its own code? It all seems like a chicken and egg problem to me.
Now what exactly does the .class file contain? Is it an abstract syntax tree in text form, is it tabular information, what is it?
Can anybody explain, in a clear and detailed way, how my Java source code gets converted into machine code?
Accepted answer by Rekin
OK, I know this: we write Java source code, the compiler, which is platform independent, translates it into bytecode,
Actually, the compiler itself works as a native executable (hence javac.exe). And true, it transforms source files into bytecode. The bytecode is platform independent, because it's targeted at the Java Virtual Machine.
then the JVM, which is platform dependent, translates it into machine code.
Not always. Sun's JVM comes in two variants: client and server. Both can compile to native code, but neither has to.
So from the start, we write Java source code. The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the Java compiler written in Java? Then how come there is a .exe file which executes it?
This exe file is a wrapper around the compiler's Java bytecode. It's there for convenience, to avoid complicated batch scripts. It starts a JVM and executes the compiler.
If the compiler code is written in Java, then how come the compiler code is executed at the compilation stage, since it's the job of the JVM to execute Java code?
That's exactly what the wrapping code does.
How can a language compile its own code? It all seems like a chicken and egg problem to me.
True, it's confusing at first glance. It's not unique to Java, though. The Ada compiler is also written in Ada itself. It may look like a "chicken and egg problem", but in truth it's only a bootstrapping problem.
Now what exactly does the .class file contain? Is it an abstract syntax tree in text form, is it tabular information, what is it?
It's not an abstract syntax tree. An AST is only used by the parser and compiler at compile time to represent code in memory. A .class file is like assembly, but for the JVM. The JVM, in turn, is an abstract machine which runs a specialized machine language, targeted only at the virtual machine. At its simplest, a .class file has a structure very similar to that of an ordinary assembled object file: first comes a table of constants (names, literals, and the signatures of external methods), then the declarations of fields, and lastly the methods with their machine code (bytecode).
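To make the "binary, not text" point concrete, here is a minimal sketch that decodes the first few fields of a class file: the magic number, the version, and the size of the constant table. The class name ClassFileHeader and the method readHeader are made up for this illustration; only the field layout (a 4-byte magic 0xCAFEBABE followed by three 2-byte big-endian values) comes from the class file format.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of any JDK API.
class ClassFileHeader {
    /** Returns {minor, major, constantPoolCount} decoded from the first 10 bytes. */
    static int[] readHeader(byte[] head) {
        ByteBuffer buf = ByteBuffer.wrap(head);  // big-endian by default, like class files
        int magic = buf.getInt();                // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF;     // u2 minor_version
        int major = buf.getShort() & 0xFFFF;     // u2 major_version (52 means Java 8)
        int poolCount = buf.getShort() & 0xFFFF; // u2 constant_pool_count
        return new int[] { minor, major, poolCount };
    }

    public static void main(String[] args) throws IOException {
        // Look at the class file of java.lang.Object from the running JDK.
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (in == null) {
                System.out.println("class file not found");
                return;
            }
            byte[] head = new byte[10];
            new DataInputStream(in).readFully(head);
            int[] h = readHeader(head);
            System.out.println("major=" + h[1] + " minor=" + h[0]
                    + " constant_pool_count=" + h[2]);
        }
    }
}
```

The numbers printed depend on the JDK you run it on, so no expected output is given here.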
If you are really curious, you can dig into a class file using the "javap" utility. Here is sample (obfuscated) output of invoking javap -c Main:
0: new #2; //class SomeObject
3: dup
4: invokespecial #3; //Method SomeObject."<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4; //Method SomeObject.doSomething:()V
12: return
So you should already have an idea of what it really is.
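For comparison, here is a guess at the kind of source that would compile to a listing like the one above. SomeObject and doSomething are taken from the javap output; the class name JavapDemo, the calls counter, and the method body are assumptions added only so the example does something observable. The constant pool indices (#2, #3, #4) in your own output may differ.

```java
// Hypothetical reconstruction; the real SomeObject from the listing is unknown.
class SomeObject {
    static int calls = 0;            // assumption: lets us observe the call

    void doSomething() {
        calls++;
    }
}

class JavapDemo {
    public static void main(String[] args) {
        // new, dup, invokespecial <init>, astore_1 (slot 0 holds args)
        SomeObject o = new SomeObject();
        // aload_1, invokevirtual doSomething
        o.doSomething();
        // the implicit end of a void method compiles to: return
    }
}
```

Compiling this file and running javac's companion tool as javap -c JavapDemo should show the same seven instructions for main.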
Can anybody explain, in a clear and detailed way, how my Java source code gets converted into machine code?
I think it should be clearer by now, but here's a short summary:
- You invoke javac, pointing it at your source code file.
- The internal reader (or tokenizer) of javac reads your file and builds an actual AST out of it. All syntax errors come from this stage.
- javac hasn't finished its job yet. Once it has the AST, the true compilation can begin. It uses the visitor pattern to traverse the AST and resolves external dependencies to add meaning (semantics) to the code. The finished product is saved as a .class file containing bytecode.
- Now it's time to run the thing. You invoke java with the name of the class (the name of the .class file without its extension). Now the JVM starts again, but this time to interpret your code.
- The JVM may or may not compile your abstract bytecode into native machine code. Sun's HotSpot compiler, using just-in-time compilation, may do so if needed. The running code is constantly profiled by the JVM and recompiled to native code when certain rules are met. Most commonly, hot code (the code that runs most often) is the first to be compiled natively.
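The "traverse the AST with a visitor and emit code" step can be sketched on a toy scale. Everything below (Node, Visitor, Num, Add, Mul, CodeGen, and the push/add/mul instruction names) is invented for this illustration and is not how javac names its internals; it only shows the general technique of a visitor walking a tree in post-order to produce stack-machine instructions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-AST for arithmetic expressions.
interface Node {
    <R> R accept(Visitor<R> v);
}

interface Visitor<R> {
    R visitNum(Num n);
    R visitAdd(Add a);
    R visitMul(Mul m);
}

final class Num implements Node {
    final int value;
    Num(int value) { this.value = value; }
    public <R> R accept(Visitor<R> v) { return v.visitNum(this); }
}

final class Add implements Node {
    final Node left, right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    public <R> R accept(Visitor<R> v) { return v.visitAdd(this); }
}

final class Mul implements Node {
    final Node left, right;
    Mul(Node left, Node right) { this.left = left; this.right = right; }
    public <R> R accept(Visitor<R> v) { return v.visitMul(this); }
}

// Emits operands first, then the operator: the order a stack machine needs.
final class CodeGen implements Visitor<Void> {
    final List<String> code = new ArrayList<>();

    public Void visitNum(Num n) {
        code.add("push " + n.value);
        return null;
    }

    public Void visitAdd(Add a) {
        a.left.accept(this);
        a.right.accept(this);
        code.add("add");
        return null;
    }

    public Void visitMul(Mul m) {
        m.left.accept(this);
        m.right.accept(this);
        code.add("mul");
        return null;
    }
}

class VisitorDemo {
    static List<String> compile(Node ast) {
        CodeGen gen = new CodeGen();
        ast.accept(gen);
        return gen.code;
    }

    public static void main(String[] args) {
        // AST for the expression 2 + 3 * 4
        Node ast = new Add(new Num(2), new Mul(new Num(3), new Num(4)));
        System.out.println(compile(ast)); // prints [push 2, push 3, push 4, mul, add]
    }
}
```

A real compiler does far more between parsing and code generation (name resolution, type checking, and so on), but the shape of the final pass is similar.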
Edit: Without the javac wrapper, one would have to invoke the compiler using something similar to this:
%JDK_HOME%/bin/java.exe -cp myclasspath com.sun.tools.javac.Main fileToCompile
As you can see, it calls Sun's private API, so it's bound to the Sun JDK implementation. Build systems would become dependent on it. If one switched to any other JDK (the wiki lists 5 other than Sun's), the above command would have to be updated to reflect the change (since it's unlikely the compiler would reside in the com.sun.tools.javac package). Other compilers could be written in native code.
So the standard way is to ship a javac wrapper with the JDK.
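If you want to call the compiler from Java code without naming com.sun.tools.javac directly, the standard javax.tools API (available since Java 6) is one way to do it. The sketch below is only an illustration: the class name CompileFromJava, the method compileHello, and the Hello source are made up, and it assumes it runs on a full JDK, since ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class; only JavaCompiler and ToolProvider are real JDK APIs.
class CompileFromJava {
    /** Writes a tiny source file to a temp directory, compiles it, returns the .class path. */
    static Path compileHello() {
        try {
            Path dir = Files.createTempDirectory("javac-demo");
            Path source = dir.resolve("Hello.java");
            Files.write(source, "public class Hello { }".getBytes(StandardCharsets.UTF_8));

            // The compiler is an ordinary Java object running inside this JVM.
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("no system Java compiler; run this on a JDK");
            }

            // Arguments are the same as on the javac command line; 0 means success.
            int exitCode = compiler.run(null, null, null, source.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("compilation failed with code " + exitCode);
            }
            // Without a -d option the class file is written next to the source file.
            return dir.resolve("Hello.class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("compiled to " + compileHello());
    }
}
```

This keeps a build tool independent of where a particular JDK happens to keep its compiler classes.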
Answer by ZoFreX
The .class file contains bytecode, which is sort of like very high-level assembly. The compiler could very well be written in Java, but the JVM would have to be compiled to native code to avoid the chicken/egg problem. I believe it is written in C, as are the lower levels of the standard libraries. When the JVM runs, it performs just-in-time compilation to turn that bytecode into native instructions.
Answer by matt b
Isn't the Java compiler written in Java? Then how come there is a .exe file which executes it?
Where do you get this information from? The javac executable could be written in any programming language; that is irrelevant. All that matters is that it is an executable which turns .java files into .class files.
For details on the binary specification of a .class file you might find these chapters in the Java Language Specification useful (although possibly a bit technical):
You can also take a look at the Virtual Machine Specification which covers:
Answer by Mike Caron
Well, javac and the JVM are typically native binaries. They're written in C or whatever. It's certainly possible to write them in Java, you just need a native version first. This is called "bootstrapping".
Fun fact: most compilers that compile to native code are written in their own language. However, they all had to have a native version written in another language first (usually C). The first C compiler, by comparison, was written in assembler. I presume that the first assembler was written in machine code. (Or, using butterflies ;)
.class files are bytecode generated by javac. They're not textual; they're binary code similar to machine code (but with a different instruction set and architecture).
The JVM, at run time, has two options: it can either interpret the bytecode (pretending to be a CPU itself), or it can JIT (just-in-time) compile it into native machine code. The latter is faster, of course, but more complex.
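"Pretending to be a CPU" boils down to a fetch-decode-execute loop in software. Here is a minimal sketch of that idea for a made-up stack machine. The opcodes PUSH, ADD, MUL and HALT and the class TinyInterpreter are assumptions for this example; the real JVM instruction set and interpreter are far richer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical four-instruction stack machine, not real JVM bytecode.
class TinyInterpreter {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;                              // program counter
        while (true) {
            int op = code[pc++];                 // fetch
            switch (op) {                        // decode and execute
                case PUSH:
                    stack.push(code[pc++]);      // operand follows the opcode
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();          // result is the top of the stack
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // 2 + 3 * 4 in postfix order
        int[] program = { PUSH, 2, PUSH, 3, PUSH, 4, MUL, ADD, HALT };
        System.out.println(run(program)); // prints 14
    }
}
```

A JIT compiler replaces this loop, for the code that matters, with native instructions generated at run time, which is why it is faster but more complex.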
Answer by Paolo
The compiler was originally written in C with bits of C++ and I assume that it still is (why do you think the compiler is written in Java as well?). javac.exe is just the C/C++ code that is the compiler.
As a side point, you could write the compiler in Java, but you're right, you have to avoid the chicken and egg problem. To do this you would typically write one or more bootstrapping tools in something like C to be able to compile the compiler.
The .class file contains the bytecodes, the output of the javac compilation process, and these are the instructions that tell the JVM what to do. At runtime these bytecodes are translated to native CPU instructions (machine code) so they can execute on the specific hardware under the JVM.
To complicate this a little, the JVM also optimises and caches the machine code produced from the bytecodes to avoid repeatedly translating them. This is known as JIT compilation and occurs while the program is running and bytecodes are being interpreted.
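One way to watch this happen, assuming you are running a HotSpot-based JVM, is to give it a method that is called often enough to become "hot" and start the program with the -XX:+PrintCompilation flag. The class HotLoop and the method sumOfSquares below are made up for this purpose, and the exact log format and compilation thresholds vary between JVM versions.

```java
// Hypothetical workload; run with: java -XX:+PrintCompilation HotLoop
class HotLoop {
    /** Small, pure method that the JIT is likely to compile once it runs often. */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method many times so the profiler marks it as hot.
        for (int round = 0; round < 20_000; round++) {
            result = sumOfSquares(1_000);
        }
        System.out.println(result); // prints 333833500
    }
}
```

With the flag enabled, the log should at some point mention HotLoop::sumOfSquares, which is the JVM switching that method from interpretation to compiled native code.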
Answer by Michael Borgwardt
The compiler javac.exe is a .exe file. What exactly is this .exe file? Isn't the Java compiler written in Java? Then how come there is a .exe file which executes it?
The Java compiler (at least the one that comes with the Sun/Oracle JDK) is indeed written in Java. javac.exe is just a launcher that processes the command line arguments, some of which are passed on to the JVM that runs the compiler, and others to the compiler itself.
If the compiler code is written in Java, then how come the compiler code is executed at the compilation stage, since it's the job of the JVM to execute Java code? How can a language compile its own code? It all seems like a chicken and egg problem to me.
Many (if not most) compilers are written in the language they compile. Obviously, at some early stage the compiler itself had to be compiled by something else, but after that "bootstrapping", any new version of the compiler can be compiled by an older version.
Now what exactly does the .class file contain? Is it an abstract syntax tree in text form, is it tabular information, what is it?
The details of the class file format are described in the Java Virtual Machine specification.
Answer by Thorbjørn Ravn Andersen
Windows doesn't know how to invoke Java programs before a Java runtime is installed, and Sun chose to have native commands which collect arguments and then invoke the JVM, instead of binding the jar suffix to the Java engine.
Answer by user3684728
- .java file
- compiler (JAVA BUILD)
- .class (bytecode)
- JVM (system software usually built with 'C')
- OPERATING PLATFORM
- PROCESSOR
Answer by Arvind Purushotham
Short Explanation
Write code in a text editor and save it in a format that the compiler understands, the ".java" file extension. javac (the Java compiler) converts this to a ".class" format file (bytecode, the class file). The JVM executes the .class file on the operating system that it sits on.
Long Explanation
Always remember that Java is not the base language that the operating system recognizes. Java code is interpreted for the operating system by a translator called the Java Virtual Machine (JVM). The JVM can't understand the code that you write in an editor; it needs compiled code. This is where the compiler comes into the picture.
Every computer process involves memory manipulation. We can't just write code in a text editor and compile it. We need to put it in the computer's memory, i.e. save it before compiling.
How will javac (the Java compiler) recognize the saved text as the one to be compiled? - We have a separate text format that the compiler recognizes, i.e. .java. Save the file with the .java extension and the compiler will recognize it and compile it when asked.
What happens while compiling? - The compiler is a second translator (not a technical term) involved in the process. It translates the language the user understands (Java) into the language the JVM understands (bytecode, the .class format).
What happens after compiling? - The compiler produces a .class file that the JVM understands. The program is then executed, i.e. the .class file is executed by the JVM on the operating system.
Facts you should know
1) Java is not multi-platform, it is platform independent.
2) The JVM is developed using C/C++. This is one of the reasons people call Java a slower language than C/C++.
3) Java bytecode (.class) is in "assembly language", the only language understood by the JVM. Any code that produces a .class file on compilation, or any generated bytecode, can be run on the JVM.