Digital computers don't understand the languages that humans speak. Instead, they use a binary language called machine code or machine language. Machine code consists of a sequence of simple computer instructions. Each instruction consists of one or more integers, but we can conveniently view them as a string of binary digits or bits (i.e., 1's and 0's). Different computers typically "speak" or "understand" different machine languages. For example, one computer may represent the ADD operation as 10011111 while another might represent the same operation as 000110. The size of machine code instructions may also vary from one computer to another, as does the size of the data a processor handles at once: 32-bit processors are still in use, but 64-bit processors are now the most common.
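To make the idea concrete, the following Python sketch models two imaginary machines that encode the same ADD operation with the two bit patterns from the example above. The machine names and the table layout are assumptions made purely for illustration; real instruction sets are far more involved.

```python
# Hypothetical opcode tables: each imaginary machine uses its own bit
# pattern for the same operation (patterns taken from the text's example).
OPCODES = {
    "machine_a": {"ADD": "10011111"},
    "machine_b": {"ADD": "000110"},
}


def encode(machine: str, operation: str) -> str:
    """Return the bit string that `machine` uses for `operation`."""
    return OPCODES[machine][operation]


for machine in OPCODES:
    bits = encode(machine, "ADD")
    # int(bits, 2) shows the same instruction viewed as an integer.
    print(f"{machine}: ADD = {bits} ({len(bits)} bits, integer {int(bits, 2)})")
```

The same operation name maps to a different bit string, and a different integer, on each machine, which is why machine code written for one machine is meaningless to the other.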
Furthermore, when a program runs, the operating system (e.g., Windows, Linux, or macOS) acts as a host environment that provides services to the program. These services include essential support such as keyboard, screen, and hard drive access. Unfortunately, how the program accesses those services differs from one operating system to the next. As a result of the differences between machine languages and operating system requirements, programs written in machine language are more focused on the system running the program than on how the program solves a problem. These differences also mean that we can't move machine code between different computers without providing a translation service, usually in the form of a virtual machine.
Writing programs in machine language is slow, tedious, and error-prone. Today, we write most programs in higher-level programming languages that focus more on the problem and less on the system (the hardware and the operating system) running the program (see Figure 2). But computers can't (with rare exceptions) directly execute programs written in high-level languages, so there must be some way of translating a program written in a high-level language into machine language. Two kinds of computer programs perform the necessary translation: compilers and interpreters.
A compiler is a program that translates other programs written in a high-level programming language like C or C++ into machine code or machine language. Some languages, such as Java and C#, take a different route. Compilers for these languages translate the high-level source code into an intermediate form (a representation that lies somewhere between the high-level and actual machine code) called virtual machine code. The virtual machine code then becomes the input to another program called an interpreter or virtual machine (VM), a program that simulates a hardware CPU.
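The two-stage route can be sketched in a few lines of Python. The example below is only an illustration, not how the Java or C# tools work internally: the instruction names (PUSH, ADD, SUB, MUL) are invented, and Python's standard ast module stands in for the parsing stage of a real compiler. The function compile_expr plays the compiler, translating an arithmetic expression into virtual machine code, and run plays the virtual machine that executes it.

```python
import ast

# Hypothetical virtual machine instructions for a tiny stack machine.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source: str) -> list[tuple]:
    """Translate an arithmetic expression into virtual machine code."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)    # operands first ...
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)],))  # ... then the operator
        else:
            raise SyntaxError("unsupported expression")

    emit(ast.parse(source, mode="eval").body)
    return code


def run(code: list[tuple]):
    """Simulate a CPU: execute each virtual instruction on a stack."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            if instruction[0] == "ADD":
                stack.append(left + right)
            elif instruction[0] == "SUB":
                stack.append(left - right)
            elif instruction[0] == "MUL":
                stack.append(left * right)
    return stack.pop()


virtual_code = compile_expr("2 + 3 * 4")
print(virtual_code)       # the intermediate form
print(run(virtual_code))  # prints 14
```

The list stored in virtual_code is the intermediate form: it is no longer high-level source code, but it is not machine code for any real CPU either. Any system that has the run function available can execute it.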
Other languages, such as JavaScript and Perl, are traditionally fully interpreted. In their classic form, these languages don't use a separate compiler at all. The interpreter reads the source code, written in a high-level language, and interprets the instructions one at a time. That is, the interpreter itself carries out each instruction in the program.
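A minimal sketch of this idea follows. The three-instruction language it accepts (set, add, print) is hypothetical and exists only for this example; the point is that the interpreter reads one line of source at a time and carries out that instruction itself, without ever producing machine code.

```python
def interpret(source: str) -> list[str]:
    """Read a program one line at a time and carry out each instruction."""
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "set":       # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif command == "add":     # add NAME VALUE
            variables[args[0]] += int(args[1])
        elif command == "print":   # print NAME
            output.append(str(variables[args[0]]))
        else:
            raise SyntaxError(f"unknown instruction: {command}")
    return output


program = """
set total 5
add total 3
print total
"""
print("\n".join(interpret(program)))  # prints 8
```

Every time the program runs, the interpreter must split and examine each line again, which hints at why purely interpreted programs tend to run more slowly than compiled ones.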
Each approach to running a program written in a high-level programming language has advantages and disadvantages. Programs written in fully compiled languages (e.g., C and C++) execute faster than programs written in partially compiled languages (e.g., Java and C#) and run much faster than programs written in fully interpreted languages (e.g., JavaScript and Perl). To give a rough idea of the difference in performance, let's say that a C++ program, once compiled, executes in time 1. A program in a hybrid language (compiled and interpreted) will generally run in time 3 to 10. In a purely interpreted language, the same program runs in a time of about 100. Contemporary versions of the Java and C# VMs use a just-in-time (JIT) compiler that compiles some of the virtual machine code to machine code while the program runs. JIT compilers reduce run time to about 1.5 times that of purely compiled language systems. The Python programming language is a bit different.
Python programs have some fully interpreted parts, but in many programs these parts are a small portion of the overall work and have limited impact on the overall runtime. Many Python libraries, where such programs spend most of their time, are written in C and run very fast. So, Python programs that rely heavily on these libraries can run almost as fast as an equivalent C program, while programs that do most of their work in interpreted Python code run considerably slower.
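One way to observe the effect of C-implemented library code is to time the same computation done two ways. The sketch below assumes CPython, the reference Python implementation, in which the built-in sum function is implemented in C; the function name python_loop_sum and the list size are choices made for this example. The measured times depend on the machine and the Python version, so no particular numbers are claimed here.

```python
import timeit

numbers = list(range(100_000))


def python_loop_sum(values):
    """Add the values with an explicit loop executed by the interpreter."""
    total = 0
    for value in values:
        total += value
    return total


# Both approaches compute the same result; only the execution path differs.
assert python_loop_sum(numbers) == sum(numbers)

loop_time = timeit.timeit(lambda: python_loop_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)
print(f"explicit loop: {loop_time:.4f} s")
print(f"built-in sum:  {builtin_time:.4f} s")
```

On a typical CPython installation we would expect the built-in version to finish noticeably sooner, because its loop runs as compiled C code rather than as interpreted Python instructions.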
On the other hand, once we compile a program written in a purely compiled language, we can't easily move the resulting executable machine code to a different platform (e.g., we can't run a Windows program on an Apple computer). In contrast, we can easily move programs we write in interpreted languages between different computers.
Interpreted programs are portable because they run on a VM or interpreter. The interpreter is the running program from the hardware and operating system perspectives. We write interpreters and VMs in purely compiled languages, so they are not portable, but the programs they run are. Once we install the interpreter on a system, we can move interpretable programs to the system and run them without further processing.
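As a small illustration, the following Python script can be copied unchanged to any system with a Python interpreter installed. It uses only the standard sys and platform modules; the function name describe_host is our own. The script itself never changes, yet what it reports depends on the interpreter, operating system, and hardware underneath it.

```python
import platform
import sys


def describe_host() -> dict[str, str]:
    """Collect facts about the interpreter and the system it runs on."""
    return {
        "interpreter": sys.implementation.name,   # e.g. "cpython"
        "operating_system": platform.system(),    # e.g. "Linux" or "Windows"
        "hardware": platform.machine(),           # e.g. "x86_64" or "arm64"
    }


for key, value in describe_host().items():
    print(f"{key}: {value}")
```

The interpreter named in the first line of output is the non-portable part: it must be built separately for each combination of operating system and hardware, while the script above runs on all of them as-is.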