Beginning Programming All-in-One For Dummies. Wallace Wang
Translating source code with an assembler or compiler
An editor lets you type and save program commands (or source code) in a file. Unless you’ve written a program completely in machine language, your source code may as well have been written in Swahili because processors don’t understand any language other than machine language.
So, to convert your source code into machine language commands, you have to use an assembler (if you wrote your program commands in assembly language) or a compiler (if you wrote your program commands in the C language or a high-level language like Java).
After converting your source code into equivalent machine language commands, an assembler or compiler saves these machine language commands in a separate file, often called an executable file (or just an EXE file). When you buy a program, such as a video game or an antivirus program, you’re really buying an executable file. Without an assembler or a compiler, you can’t create your program.
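To get a feel for what an assembler does, here is a minimal sketch in Python of a toy assembler. The instruction names (LOAD, ADD, STORE, HALT) and their numeric codes are made up for this illustration and don't match any real processor; a real assembler handles far more detail.

```python
# Hypothetical instruction set: these mnemonics and opcode numbers are
# invented for illustration and do not match any real processor.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

def assemble(source: str) -> bytes:
    """Translate each line of toy assembly into numeric machine code."""
    machine_code = bytearray()
    for line in source.splitlines():
        parts = line.split()
        if not parts:                                   # skip blank lines
            continue
        mnemonic, operands = parts[0], parts[1:]
        machine_code.append(OPCODES[mnemonic])          # one byte for the command
        machine_code.extend(int(n) for n in operands)   # one byte per operand
    return bytes(machine_code)

source_code = """
LOAD 7
ADD 5
STORE 0
HALT
"""

executable = assemble(source_code)
print(executable.hex(" "))   # 01 07 02 05 03 00 ff
```

The readable commands go in, and a string of numbers comes out. Saving those numbers in a separate file is, in spirit, what creating an executable file means.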
Compilers translate source code into machine language, which is the native language of a specific processor. But what if you want your program to run on different processors? To do this, you have to compile your program into machine language for each different processor. You wind up with one executable file for each processor, such as an executable file for an Intel processor and a separate executable file for an ARM processor.
Many Mac programs advertise themselves as a universal binary, which means the program actually consists of two executable files smashed into a single file:
One executable file contains machine language code for the M-series processor (used in newer Mac computers).
The second executable file contains machine language code for the Intel processor (used in old Mac computers).
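Because each executable file works on only one kind of processor, something has to pick the right file for the computer at hand. The following Python sketch shows the idea. The file names are hypothetical, and the processor names are the ones Python's platform.machine() commonly reports; treat the table as an assumption, not a complete list.

```python
import platform

# Hypothetical file names: one executable per processor family.
EXECUTABLES = {
    "x86_64": "myprogram-intel",
    "AMD64": "myprogram-intel",   # Windows reports 64-bit Intel/AMD chips this way
    "arm64": "myprogram-arm",     # macOS name for Apple's M-series chips
    "aarch64": "myprogram-arm",   # Linux name for 64-bit ARM
}

def pick_executable(processor: str):
    """Return the matching executable name, or None if none was compiled."""
    return EXECUTABLES.get(processor)

this_processor = platform.machine()
print(this_processor, "->", pick_executable(this_processor))
```

If the function returns None, nobody compiled the program for that processor, so there's no executable file that can run on it.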
Most compilers work only on one specific operating system and processor. So, a Windows compiler may only create programs that run under the Windows operating system. Likewise, a Linux compiler may only create programs that run under the Linux operating system.
If you write a program that runs under Windows, you can recompile it to run under Linux. Unfortunately, you may have to modify your program slightly (or a lot) to make it run under Linux.
Big companies, like Adobe and Microsoft, can afford to pay programmers to write and modify programs to run under different operating systems, such as macOS and Windows. Most smaller companies and individuals don’t have the time to rewrite a program to run under multiple operating systems. That’s why most small companies write programs for Windows — because it’s the largest market. If the program proves popular, they can later justify the time and expense to rewrite that program and compile it to run under macOS.
Choose your compiler carefully. If you use a compiler that can create only Windows programs, you may never be able to recompile that program to run on a different operating system, such as Linux or macOS. One reason Microsoft gives away its compilers for free is to trap people into writing programs that can run only under Windows. For example, if you write a program in C#, you may not be able to run that program on Linux or macOS without major modifications, which most people will probably never do.
To make it easy to create programs for multiple operating systems, you can use a cross-platform compiler. This means you can write a program once and then choose to compile it for two or more operating systems such as macOS and Windows or Android and iOS. Cross-platform tools make it easy to write the same program for multiple operating systems, but you may need to write additional code to take advantage of the unique features of each operating system.
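The "additional code" for each operating system often looks like the following Python sketch: one program, with a small branch for each operating system's conventions. The program name (MyProgram) and the function are hypothetical; the folder locations follow the usual conventions on each system.

```python
import sys

def settings_folder(platform_name: str) -> str:
    """Return where a hypothetical program would keep its settings."""
    if platform_name.startswith("win"):       # Windows
        return r"%APPDATA%\MyProgram"
    if platform_name == "darwin":             # macOS
        return "~/Library/Application Support/MyProgram"
    return "~/.config/myprogram"              # Linux and similar systems

# sys.platform identifies the operating system the program is running on.
print(settings_folder(sys.platform))
```

Most of the program stays the same everywhere. Only small pieces like this one need to know which operating system they're running on.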
Translating source code with an interpreter
In the old days, compilers were notoriously slow. You could feed source code to a compiler and literally come back the next morning to see if the compiler was done. If you made a single mistake in your program, you had to correct it and recompile your program all over again — with another overnight wait to see if it even worked.
Trying to write a program with such slow compilers proved maddening, so computer scientists created something faster called an interpreter. A computer interpreter is just like a foreign language interpreter who listens to each sentence you speak and then translates that sentence into another language. Type a program command into an interpreter, and the interpreter immediately translates that command into its equivalent machine language command. Type in another command, and the interpreter translates that second command right away.
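Here is a minimal sketch of that one-command-at-a-time behavior, written in Python. The three commands (SET, ADD, PRINT) are invented for this example, and this toy interpreter carries out each command directly instead of producing real machine language, but the rhythm is the same: read a command, act on it, move to the next.

```python
def interpret(command: str, variables: dict) -> None:
    """Translate and carry out one command right away."""
    words = command.split()
    if words[0] == "SET":            # SET name number
        variables[words[1]] = int(words[2])
    elif words[0] == "ADD":          # ADD name number
        variables[words[1]] += int(words[2])
    elif words[0] == "PRINT":        # PRINT name
        print(variables[words[1]])
    else:
        raise ValueError(f"Unknown command: {command}")

variables = {}
for command in ["SET total 10", "ADD total 5", "PRINT total"]:
    interpret(command, variables)    # prints 15 when it reaches PRINT
```

Nothing gets saved to a file along the way. If you want to run these commands again, you need both the commands and the interpreter.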
The problem with interpreters is that they only store the equivalent machine language commands in memory instead of in a separate file like a compiler does. If you want to sell or distribute your program, you have to give people your source code, along with an interpreter that can convert your source code into machine language commands. Because giving away your source code essentially means giving away your program, everyone who wants to sell their programs uses a compiler instead of an interpreter.
Computer scientists originally developed interpreters because compilers were so slow. But after computer scientists started creating faster compilers, most people stopped using interpreters and just used compilers. Nowadays, computer scientists use interpreters for running certain types of programming languages known as scripting languages. (Find out more about scripting languages in Book 1, Chapter 3.)
Combining a compiler with an interpreter to create p-code
Creating separate executable files for each processor can get clumsy, and giving away your source code with an interpreter may be unreasonable. A third approach is to compile your program into an intermediate format called bytecode or pseudocode (often abbreviated as p-code). Instead of compiling source code directly into machine language, you compile your program into a p-code file.
You can take this p-code file and copy it to any computer. To run a p-code file, you need a special p-code interpreter, or a virtual machine. The virtual machine acts like an interpreter and runs the instructions compiled into the p-code file.
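You can watch the two steps (compile first, then run on a virtual machine) in a few lines of Python. Python isn't the language discussed here; it's used only because it happens to compile source code into bytecode and run that bytecode on its own virtual machine, and it lets you do each step by hand. Treat this as a sketch of the idea, not as how Java or .NET work internally.

```python
import dis

source_code = "total = 2 + 3\nresult = total * 10\n"

# Step 1: compile the source code into a code object that holds bytecode.
bytecode = compile(source_code, "<example>", "exec")
print(type(bytecode.co_code))   # the raw bytecode is stored as bytes

# Step 2: let Python's virtual machine run the compiled bytecode.
namespace = {}
exec(bytecode, namespace)
print(namespace["result"])      # 50

# Optional: list the bytecode instructions in readable form.
dis.dis(bytecode)
```

The instructions that dis.dis() lists aren't machine language for any real processor. They're instructions for the virtual machine, which is why the same bytecode needs a matching virtual machine on every computer where you want to run it.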
The advantage of p-code is that you can distribute a single p-code version of your program, which can run on multiple computers. But p-code has a couple of disadvantages:
P-code programs don’t run as fast as programs compiled into machine language.
If a computer doesn’t have the right virtual machine installed, it can’t run your program.
The most popular programming language that uses p-code is Java. After you write a Java program, you can compile it into a p-code file, which can run on any computer that has a copy of the Java virtual machine, whether that computer runs Linux, macOS, or Windows. Microsoft’s .NET framework takes a similar approach that (theoretically) lets you run a program on any computer that can run the complete .NET framework.
The theory behind p-code is that you write a program once, and you can run it anywhere. The reality is that every operating system has its quirks, so it’s more common to write a program and be forced to test it on multiple operating systems. More often than not, a p-code program runs perfectly fine on one operating system (like Windows) but suffers mysterious problems when running on another operating system (such as Linux). Languages such as Java are getting better at letting you run the same program on multiple operating systems without major modifications, but be careful because p-code doesn’t always work as well as you may think.