Preparing for an interview or exam, or just curious about what compilers and interpreters do? In the most basic sense, a compiler translates an entire program ahead of time for later use, whereas an interpreter reads and executes the code line by line at run time.
However, to understand the depths of how modern-day compilers and interpreters work for various programming languages, we need to go through a lot more details.
Given below are the topics we will cover to settle the compiler vs interpreter debate –
- What is a Compiler
- What is an Interpreter
- Just in Time Compiler
- Compiler vs Interpreter: Head to Head Comparison
- C Compiler – Generates Machine Code
- Java Compiler and JVM Interpreter
- Python: Compiler or Interpreter
- Closing Remarks
What is a Compiler
Historically, a compiler is defined as software that converts the source code of a computer program into machine instructions, or machine code. Source code is what developers write, whereas machine code is a sequence of 0s and 1s that encodes the instructions the CPU executes.
This definition means that a compiler must understand the rules of the programming language in which the source code is written, and for that reason compilers are language-specific. For example, GCC is one of the C compilers, and javac is the Java bytecode compiler.
Furthermore, machine code is not generic either: it has to contain the specific instructions that a given processor family understands. An ARM processor, for example, cannot run machine code generated for an x86 processor from Intel or AMD. So compilers need to be platform-specific too, though not all of them are, as you will see shortly.
There is more.
Nowadays, the term compiler covers many other use cases. It is also applied to software that translates code from one format to another.
The input to a compiler doesn’t necessarily have to be programming language source code, and the output doesn’t necessarily have to be machine code. Take the example of the Java compiler, which translates .java files to .class files. A .class file is not the final machine code; it is intermediate bytecode, which needs further translation or interpretation before the machine can run it. Still, we call the software that converts .java files to .class files a compiler.
You can also find alternative compilers for Java that translate Java source files to C source files. The C code can then be compiled to machine code by a C compiler. Many would call this type of software a source-to-source converter rather than a compiler, but the boundary is blurry.
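To make the idea of "translating from one format to another" concrete, here is a minimal sketch of a toy compiler. It assumes a made-up stack machine whose instruction names (PUSH, ADD, SUB, MUL, DIV) are hypothetical, and it borrows Python's standard `ast` module for parsing. The point is only that the whole input is translated up front into a different format that can be stored and run later.

```python
import ast
import operator

# Hypothetical instruction set for an imaginary stack machine.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
OPERATIONS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
}


def compile_expr(source):
    """Translate a whole arithmetic expression into stack-machine instructions."""
    tree = ast.parse(source, mode="eval")
    program = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            program.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
            emit(node.left)   # operands are emitted first ...
            emit(node.right)
            program.append((OPCODES[type(node.op)], None))  # ... then the operator
        else:
            raise SyntaxError("unsupported construct: " + ast.dump(node))

    emit(tree.body)
    return program


def run(program):
    """Execute previously compiled instructions on a simple stack."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[opcode](left, right))
    return stack.pop()
```

Calling `compile_expr("1 + 2 * 3")` yields five instructions (three PUSH, then MUL, then ADD), and `run` on that list evaluates to 7. The output is neither source code nor machine code, yet the translation step is still reasonably called compiling.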
What is an Interpreter?
Again, looking at the historical definition, an interpreter is software that reads the source code line by line and carries out the corresponding machine instructions at run time. It does not pre-compile anything; it interprets the provided input on the fly and directs the CPU to perform the tasks in sequence.
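The line-by-line behaviour described above can be sketched with a toy interpreter. The three-command language here (`set`, `add`, `print`) is invented purely for illustration. Each line is read and acted on immediately, and nothing is translated or stored ahead of time.

```python
def interpret(source):
    """Read a toy program one line at a time and act on each line immediately."""
    variables = {}
    output = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "set":      # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif command == "add":    # add NAME VALUE
            variables[args[0]] += int(args[1])
        elif command == "print":  # print NAME
            output.append(variables[args[0]])
        else:
            # A bad line is only discovered when execution reaches it.
            raise SyntaxError(f"line {line_number}: unknown command {command!r}")
    return output
```

Running `interpret("set x 5\nadd x 3\nprint x")` returns `[8]`. Note that a mistake on the last line of a long program would only surface after every earlier line had already run, which is one practical difference from compiling the whole program first.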
Wait, What about JIT (Just in Time Compiler)
A just-in-time (JIT) compiler is another variation of compiler that you encounter today. A JIT compiler typically reads bytecode that was generated earlier by a compiler and translates it into machine code on the fly, at run time.
Let us extend our understanding of the Java compiler. The Java compiler converts .java files to .class files. A .class file contains the bytecode that runs on the Java virtual machine (JVM). But what does that have to do with JIT?
Early JVM implementations read the bytecode and interpreted it on the fly, instruction by instruction. Soon after, JVMs started to include a JIT compiler as well, which converts bytecode into machine code in memory, just before it is executed.
Why JIT? Compiled machine code is typically optimized, so the CPU runs it faster than an interpreter can work through the same code line by line. In short, interpreting bytecode at run time is slower than compiling it to machine code with a JIT compiler and then executing the result.
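The trade-off can be illustrated with a rough sketch. It is not how a real JVM works: the class name `TinyJit` and the `HOT_THRESHOLD` value are hypothetical, and Python's built-in `compile()` produces a Python code object that merely stands in for machine code here. The sketch interprets a formula by walking its syntax tree, and once the formula has been called often enough it compiles the formula once and reuses the compiled form.

```python
import ast

HOT_THRESHOLD = 3  # hypothetical: number of calls before a formula counts as "hot"


def interpret_node(node, env):
    """Walk the syntax tree on every call: the slow, interpreted path."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp):
        left = interpret_node(node.left, env)
        right = interpret_node(node.right, env)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    raise SyntaxError("unsupported construct")


class TinyJit:
    """Interpret a formula at first, then switch to a compiled form once it is hot."""

    def __init__(self, source):
        self.tree = ast.parse(source, mode="eval")
        self.calls = 0
        self.compiled = None  # filled in once the formula becomes hot

    def run(self, **env):
        self.calls += 1
        if self.compiled is not None:
            # Fast path: reuse the code object that was compiled earlier.
            return eval(self.compiled, {"__builtins__": {}}, env)
        if self.calls >= HOT_THRESHOLD:
            # Compile once, in memory, while the program is running.
            self.compiled = compile(self.tree, "<jit>", "eval")
        return interpret_node(self.tree.body, env)
```

With `jit = TinyJit("x * 2 + 1")`, the first three calls to `jit.run(x=...)` go through `interpret_node`, and every later call uses the compiled object. Both paths return the same results, which is the property a JIT has to preserve.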
Compiler Vs Interpreter: Head to Head Comparison
| Compiler | Interpreter |
|---|---|
| Picks up the entire piece of code and produces optimized output | Picks up code line by line |
| Output is stored on disk for later use | Works on the provided input on the fly and does not store anything |
| Runs faster, since the full code is compiled and optimized before actual execution | Slower than compiled code, since the code is read on the fly and interpreted line by line without overall optimization |
| May or may not produce the machine code required by the CPU | Always results in the machine instructions the CPU needs to execute the task |
Let us see how source code of various programming languages ultimately translates to machine instructions –
C Language (Machine Code by Compiler)
This is the straightforward case, where a C compiler translates C code to machine instructions. The compiler performs several tasks along the way, though –
- Preprocessing – Expand macros and include files. Also, remove the comments, etc. to generate one clean set of code.
- Compiling – Convert source code to assembly
- Assembly – Convert assembly code to CPU instructions
- Linking – Put together all the modules either beforehand (Static Linking) or at run time (Dynamic Linking).
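The four stages above can be run one at a time with GCC. The sketch below is a small helper that builds the command for each stage. It assumes GCC is installed and that a file named `hello.c` exists; both the file name and the helper function names are placeholders. The flags used (`-E`, `-S`, `-c`, `-o`) are standard GCC options.

```python
import shutil
import subprocess


def build_stages(source="hello.c"):
    """Return the GCC command for each compilation stage of one C file."""
    stem = source.rsplit(".", 1)[0]
    return [
        # -E stops after preprocessing: macros expanded, headers included.
        ("preprocess", ["gcc", "-E", source, "-o", f"{stem}.i"]),
        # -S stops after compiling: preprocessed C becomes assembly.
        ("compile", ["gcc", "-S", f"{stem}.i", "-o", f"{stem}.s"]),
        # -c stops after assembling: assembly becomes an object file.
        ("assemble", ["gcc", "-c", f"{stem}.s", "-o", f"{stem}.o"]),
        # With no stage flag, gcc links the object file into an executable.
        ("link", ["gcc", f"{stem}.o", "-o", stem]),
    ]


def run_stages(source="hello.c"):
    """Run every stage in order; requires gcc on the PATH."""
    if shutil.which("gcc") is None:
        raise RuntimeError("gcc not found on PATH")
    for name, command in build_stages(source):
        print(name, "->", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_stages("hello.c")` leaves the intermediate files `hello.i`, `hello.s`, and `hello.o` next to the final executable, so you can open each one and see what that stage produced. In everyday use a single `gcc hello.c -o hello` performs all four stages in one go.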
Once the developer-written source code is compiled, it is no longer needed to run the program. All the CPU needs is the final compiled code.
Java (Compiler and JVM Interpreter)
Java is clearly a two-step process. The first step is to generate platform-independent bytecode, in the form of .class files, from .java files. This is what makes Java a platform-independent language from a developer's perspective, since developers only need to worry about generating standard class files.
If class files are platform-independent, then what about the different processor architectures? That is taken care of by the JVM: each platform has its own JVM, which produces the instructions required by that specific platform.
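One way to see that class files follow a single standard layout on every platform is to look at their first eight bytes: a fixed magic number, 0xCAFEBABE, followed by the minor and major version of the class file format, all stored big-endian. The helper below is a small sketch for reading that header. The function name is made up for this example.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE  # every valid .class file starts with these four bytes


def read_class_header(data):
    """Return (major, minor) format version from the start of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version, big-endian.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return major, minor
```

To try it on a real file, read the bytes with `open("Hello.class", "rb").read()` (the file name is a placeholder) and pass them in. A class compiled for Java 8 reports major version 52, regardless of the operating system or processor it was compiled on.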
Python (Compiler or Interpreter)
Python is much like Java from a life-cycle perspective. There is a minor difference, though: developers do not need to compile the code themselves. The Python implementation takes care of it and converts the source code in .py files to compiled bytecode in .pyc files behind the scenes.
.pyc files are then interpreted by the PVM – Python Virtual Machine, at runtime, similar to how Java bytecode is interpreted by the JVM.
From a developer's perspective, Python looks like an interpreted language, but in practice the source code is first compiled to bytecode, and it is that bytecode which gets interpreted.
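You can watch this compile step happen using only the standard library. The sketch below assumes the standard CPython implementation. The helper names and the sample `double` function are made up for illustration, while `compile()`, `dis`, and `py_compile` are real standard-library tools.

```python
import dis
import os
import py_compile
import tempfile

SOURCE = "def double(n):\n    return n * 2\n"


def compile_to_pyc(source_text, directory):
    """Write source_text to a .py file and byte-compile it to a .pyc file."""
    source_path = os.path.join(directory, "example.py")
    with open(source_path, "w") as handle:
        handle.write(source_text)
    target_path = os.path.join(directory, "example.pyc")
    # py_compile.compile returns the path of the byte-compiled file it wrote.
    return py_compile.compile(source_path, cfile=target_path, doraise=True)


def bytecode_listing(source_text):
    """Compile source text in memory and return a readable bytecode listing."""
    code = compile(source_text, "<example>", "exec")
    return dis.Bytecode(code).dis()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        print("wrote", compile_to_pyc(SOURCE, folder))
    print(bytecode_listing(SOURCE))
```

The exact instruction names in the listing vary between Python versions, so treat the output as a peek at the bytecode rather than a stable format. In normal use you never call these functions yourself: Python writes .pyc files into a `__pycache__` folder when a module is imported.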
Furthermore, the Python ecosystem also includes Jython, which converts .py code to bytecode that runs on the JVM instead of the PVM, and IronPython, which runs Python code on .NET environments.
Closing Remarks
Compiler vs interpreter is more of a scholarly discussion these days, one that brings out differing views without concrete definitions. The reason is that both have evolved over time.
The ultimate goal is to get to machine code. Whether you get there one way or another, and with one tool or several, depends purely on the use case.
Furthermore, the journey of the source code is not as simple as it might seem. There are multiple steps in between, including code cleanup, comment removal, inclusion of referenced files, pre-compilation, assembly, module linking, language conversion in some cases, and bytecode generation in others.
Well, don’t overlearn either and focus only on the things you really need to deal with! Also, do leave a comment for our readers, my friend!