Introduction

At its most fundamental, a computer deals with two types of information: data and programs. Recall from our discussion of ALUs (Arithmetic & Logic Units) that an instruction is a set of bits that controls what the ALU does. A long string of instructions, designed to accomplish some particular task, is called a program. The life of a processor is dull: fetch an instruction, execute it, fetch the next, and so on, over and over, for as long as the power is on.
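That fetch-and-execute loop can be sketched in a few lines of Python. Everything here is invented for illustration (the four opcodes, the single accumulator register); this is not any real processor:

```python
# A toy processor: the "program" is just numbers sitting in memory.
# The processor fetches an instruction, executes it, and moves on to
# the next one, until it hits a HALT instruction.

LOAD, ADD, STORE, HALT = 0, 1, 2, 3   # made-up opcodes

def run(memory):
    acc = 0          # accumulator register
    pc = 0           # program counter
    while True:
        opcode = memory[pc]          # fetch the instruction...
        operand = memory[pc + 1]     # ...and its operand
        pc += 2
        if opcode == LOAD:           # decode and execute
            acc = memory[operand]
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == STORE:
            memory[operand] = acc
        elif opcode == HALT:
            return memory

# Program: load memory[8], add memory[9], store the result in memory[10].
memory = [LOAD, 8, ADD, 9, STORE, 10, HALT, 0, 2, 3, 0]
print(run(memory)[10])   # 2 + 3 = 5
```

Notice that instructions and data sit side by side in the same memory, which is exactly why, as discussed below, you cannot tell them apart just by looking.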
Low-Level Languages

The instructions that command the processor are, as already mentioned, just strings of bits. At this fundamental level, that's all the processor "understands". This is called machine language, and it is the lowest level of computer language. By "low", I don't mean to imply some sort of judgement. I'm not dissing machine language. What I mean is that this is the real language, the one that actually controls what the various transistors and logic gates do. It is the bottom floor of a tall building, supporting all of the more complex things that a computer can do. Here's a very short machine-language program written for the 6502, the processor at the heart of the old Apple // computers that were popular in the early 1980s. The bits have been grouped into bytes and converted to hexadecimal: each two-digit number is one byte.
Can you tell what it does? No, you can't. Some of those bytes are instructions, some are data, but there's no easy way to tell which is which. (If you're dying of curiosity, this program draws a short flickering line on the screen.) We can make it a little more readable by replacing the instruction bytes with short names that at least remind us of their function. We call this assembly language. Here is the same program, rewritten in assembly.
The AE command loads a particular register inside the CPU (called the X register) with the byte that is stored in RAM location C057h. It's a lot easier to remember that LDX does that, rather than AE. It's still not clear why the X register is being loaded with that number. It turns out that in the Apple // circuit design, any access to that memory location would put the screen in graphics mode, rather than text-only. (Keep in mind, this was before GUIs.) The point is, you have no way of knowing that from looking at the program. What this program does depends on the code itself, and also on the processor and the other circuitry that is part of the overall design. That makes learning machine or assembly language really hard. Even if you become a master at programming the Apple //, if you wanted to program a different computer, you'd have to start learning again from scratch. What we would really like is a computer programming language that has two critical features:

1. Readability: commands named so that humans can easily read, write, and remember them.
2. Portability: the same program should run on many different computers, regardless of processor.
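To make the translation concrete, here is a Python sketch of the core job an assembler does: replacing each mnemonic with its opcode byte. The opcode values below are the real 6502 absolute-addressing encodings, but the assembler itself is a bare-bones toy that handles only this one instruction format:

```python
# A sketch of an assembler: translate mnemonics like LDX into opcode bytes.
# The two address bytes follow the opcode, low byte first, because the
# 6502 stores addresses little-endian.

OPCODES = {"LDA": 0xAD, "LDX": 0xAE, "STA": 0x8D, "JMP": 0x4C}

def assemble(lines):
    machine_code = []
    for line in lines:
        mnemonic, operand = line.split()
        address = int(operand.lstrip("$"), 16)
        machine_code.append(OPCODES[mnemonic])
        machine_code.append(address & 0xFF)   # low byte first
        machine_code.append(address >> 8)     # then high byte
    return machine_code

print([f"{b:02X}" for b in assemble(["LDX $C057"])])   # ['AE', '57', 'C0']
```

Running it backwards, a disassembler, is what turned the unreadable hex dump above into the assembly listing.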
It was in quest of these attributes that high-level languages were developed.
High-Level Languages

High-level languages are themselves programs, designed to take text written by a programmer (the so-called source code) and translate that text into the machine language that the CPU actually executes (the object code). When that translation takes place is the distinction between the two basic types of high-level languages: compiled and interpreted.

In a compiled language, the program is fed to a compiler, which converts the program into machine language once. That machine-language file is then saved, and executed ("run") every time the program is needed. The compiler is not needed again after the conversion has taken place. In contrast, an interpreted language converts the program into machine code every time it is needed, as it is executed. A program called an interpreter needs to be running. It takes the high-level language code, and does what the program says by running appropriate machine code through the processor. The distinction between compilers and interpreters is analogous to the difference between human-language translators and interpreters. A translator takes a work, such as a book, and converts it from one language to another. An interpreter, on the other hand, converts language on the fly, as a conversation is taking place.

Readability is achieved by naming commands with human-language equivalents that are, if not self-evident, at least easy to remember once learned. For example, the BASIC command PRINT prints something on the screen. The statement

    PRINT "Hello World"
will cause the words Hello World to appear on the screen. The program

    10 LET X = 1
    20 PRINT X
    30 LET X = X + 1
    40 GOTO 20
will cause a sequence of integers, counting towards infinity, to scroll up the screen as fast as the computer can do it. Portability is promoted by setting standards, or a specification, in which the developer (or developers) define what the language will look like: what commands it will have. Programmers then write (perhaps in assembly language) either an interpreter or a compiler that implements that language for a particular processor. Some examples of high-level languages you may have heard of are listed below. The year of invention is in parentheses.

    BASIC (1964)
    C (1972)
    C++ (1983)
    Perl (1987)
    Java (1995)

Each of these languages has evolved through the years as developers add new features or other improvements, so there are many "dialects" of the BASIC language, for example. Ideally, a program written in BASIC will run, without change, on a Macintosh, a Windows machine, or a Linux machine. Unless you are using the same version of BASIC, however, you may need to make minor changes in a program to get it to work on another machine.
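To see what "doing what the program says, on the fly" means, here is a minimal Python sketch of an interpreter for a toy two-statement language. The statement names are borrowed from BASIC; everything else is invented for illustration:

```python
# A toy interpreter: it reads one statement of source code at a time and
# carries out the action it names, rather than translating the whole
# program ahead of time. max_steps keeps infinite loops (like GOTO back
# to the top) from running forever in this demonstration.

def interpret(program, max_steps=10):
    output = []
    line = 0
    steps = 0
    while line < len(program) and steps < max_steps:
        steps += 1
        statement = program[line]
        if statement.startswith("PRINT"):
            output.append(statement[len("PRINT"):].strip().strip('"'))
            line += 1
        elif statement.startswith("GOTO"):
            line = int(statement.split()[1])   # jump to another line
    return output

print(interpret(['PRINT "Hello World"']))   # ['Hello World']
```

A compiler for the same toy language would instead walk the whole program once, emit machine code for each statement, and never be consulted again at run time.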
Most of these exist in both compiled and interpreted versions. Exceptions are C and C++ (which are always compiled) and Perl (which is always interpreted). Java is a special case, being both compiled and interpreted. The source code is compiled into object code for a processor that does not actually exist, the so-called Java Virtual Machine (JVM). An interpreter, the Java Runtime Environment (JRE), then executes the JVM code, converting it into machine code for whatever computer it is actually running on.
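The Java arrangement can be sketched in miniature: compile once to bytecode for an imaginary stack machine, and any host that has an interpreter for that bytecode can run the program. The three opcodes below are invented; real JVM bytecode is far richer:

```python
# "Object code" for a made-up stack machine. A compiler would have
# produced this list once, from source like: print(2 + 3)
PUSH, ADD, PRINT = "PUSH", "ADD", "PRINT"
bytecode = [(PUSH, 2), (PUSH, 3), (ADD, None), (PRINT, None)]

def run_vm(code):
    """Interpret the bytecode on whatever machine this Python runs on."""
    stack, printed = [], []
    for op, arg in code:
        if op == PUSH:
            stack.append(arg)          # push a constant
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)        # replace top two values with sum
        elif op == PRINT:
            printed.append(stack.pop())
    return printed

print(run_vm(bytecode))   # [5]
```

The same bytecode list never changes; only the interpreter underneath it differs from one kind of computer to the next, which is the source of Java's portability.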