High-level programming language
Programming language that abstracts details of computing hardware
From Wikipedia, the free encyclopedia
A high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate (or even hide entirely) significant areas of computing systems (e.g. memory management), making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.[1]
High-level refers to a level of abstraction from the hardware details of a processor inherent in machine and assembly code. Rather than dealing with registers, memory addresses, and call stacks, high-level languages deal with variables, arrays, objects, arithmetic and Boolean expressions, functions, loops, threads, locks, and other computer science abstractions, intended to facilitate correctness and maintainability. Unlike low-level assembly languages, high-level languages have few, if any, language elements that translate directly to a machine's native opcodes. Other features, such as string handling, object-oriented programming features, and file input/output, may also be provided. A high-level language allows for source code that is detached and separated from the machine details. That is, unlike low-level languages like assembly and machine code, high-level language code may result in data movements without the programmer's knowledge. Some control of what instructions to execute is handed to the compiler.
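The contrast between the two vocabularies can be made concrete with a minimal sketch. The Python below computes the same sum twice: once with built-in abstractions, and once in a style that imitates registers and memory addresses. The function names, the flat `memory` list standing in for RAM, and the "register" variables are all assumptions made for illustration; real machine code would look quite different.

```python
def total_high_level(values):
    """Sum a list using built-in abstractions: no addresses or counters appear."""
    return sum(values)


def total_register_style(memory, base, count):
    """Sum the same data step by step, imitating assembly-like code.

    `memory` is a flat list standing in for RAM; `acc`, `addr`, and
    `remaining` play the role of registers. All names are illustrative.
    """
    acc = 0            # accumulator "register"
    addr = base        # address "register" pointing at the first element
    remaining = count  # loop counter "register"
    while remaining != 0:
        acc += memory[addr]  # load from memory and add
        addr += 1            # advance to the next address
        remaining -= 1       # decrement the counter
    return acc


memory = [0, 0, 7, 11, 13, 0]  # the data occupies "addresses" 2 to 4
print(total_high_level(memory[2:5]))       # 31
print(total_register_style(memory, 2, 3))  # 31
```

Both functions return the same result, but only the second requires the programmer to track where the data lives and how many elements remain.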
History
In the early decades of electronic computing, programs were usually written in machine code or assembly language, which closely reflected the instruction set of a particular machine. High-level programming languages developed as a way to express algorithms in forms closer to mathematics, business rules, or human-readable procedure, while a compiler or interpreter translated the program into machine-executable instructions. In the 1950s and 1960s, the term autocode was often used for compiler-based high-level languages; examples included early Autocode systems as well as languages such as Fortran and COBOL.[2]
The earliest high-level language designed for a computer is generally identified as Plankalkül, developed by Konrad Zuse in the 1940s.[3] It included advanced features such as structured data, assignment, conditional execution, iteration, and subroutines, but it was not implemented during Zuse's lifetime. Because Zuse worked in wartime and postwar Germany, his work remained largely isolated from the main line of language development, though it influenced Heinz Rutishauser's Superplan and, to a lesser extent, early work associated with ALGOL.[4]
The first high-level language to achieve wide practical adoption was Fortran, developed at IBM in the 1950s under John Backus.[5] Fortran was aimed at scientific and numerical computing, allowing programmers to write algebraic formulas and loops without manually managing every machine instruction. Its success helped demonstrate that compiled high-level languages could produce efficient programs while greatly reducing programming effort.[6]
Several other influential languages appeared around the same period. Lisp, created by John McCarthy, introduced symbolic list processing and made ideas from the lambda calculus central to practical programming-language design.[7] COBOL was designed for business data processing and became associated with English-like syntax, records, and file-oriented commercial applications.[8] The ALGOL family, especially ALGOL 60, influenced later language design through block structure, nested procedures, recursion, lexical scoping, a distinction between call by value and call by name, and the use of Backus–Naur form to define syntax formally.[9]
During the 1960s and 1970s, high-level languages diversified into distinct programming styles. BASIC made interactive programming accessible in time-sharing and educational environments, while Pascal emphasized structured programming and teaching. C combined high-level control structures with comparatively low-level access to memory and hardware, making it influential in operating systems and systems software. Simula introduced the class-and-object model that became central to object-oriented programming, and Smalltalk later developed object-oriented programming around message passing and a highly interactive programming environment.[10][11]
By the 1980s and 1990s, high-level language development increasingly reflected large-scale software engineering, graphical computing, and the growth of the Internet. Ada was designed for large, long-lived systems, particularly in defense and embedded contexts. C++ extended C with classes and other abstraction mechanisms, allowing object-oriented and generic programming while retaining systems-programming capabilities.[12] Scripting and rapid-application languages such as Perl, Python, Ruby, and PHP emphasized programmer productivity and text, web, or application scripting. Java promoted portable bytecode execution through the Java virtual machine, while JavaScript became the standard high-level scripting language of web browsers.[13][14][15]
In the 21st century, newer high-level languages have often focused on safer abstraction, concurrency, large codebases, and interoperability with existing platforms. C# developed alongside the .NET platform, Scala and F# blended object-oriented and functional programming, and Kotlin targeted interoperability with Java while adding concise syntax and null-safety features.[16] Go was designed for simple syntax, fast compilation, and built-in concurrency support.[17] Rust emphasized memory safety and thread safety without a garbage collector, while Swift was introduced as a modern language for Apple-platform software development.[18][19] TypeScript extended JavaScript with static typing for large-scale JavaScript applications.[20]
Abstraction penalty
A high-level language provides features that standardize common tasks, permit rich debugging, and maintain architectural agnosticism. On the other hand, a low-level language requires the programmer to work at a lower level of abstraction, which is generally more challenging but allows optimizations that are not possible with a high-level language. This abstraction penalty for using a high-level language instead of a low-level language is real, but in practice, low-level optimizations rarely improve performance at the user experience level.[21][22][23] Nonetheless, code that needs to run quickly and efficiently may require the use of a lower-level language, even if a higher-level language would make the code easier to write and maintain. In many cases, critical portions of a program written mostly in a high-level language are coded in assembly in order to meet tight timing or memory constraints. A well-designed compiler for a high-level language can produce code comparable in efficiency to what could be coded by hand in assembly, and the higher-level abstractions sometimes allow for optimizations that beat the performance of hand-coded assembly.[24] Since a high-level language is designed independently of any specific computing system architecture, a program written in such a language can run in any computing context with a compatible compiler or interpreter.
Unlike a low-level language that is inherently tied to processor hardware, a high-level language can be improved, and new high-level languages can evolve from others with the goal of aggregating the most popular constructs with improved features. For example, Scala is interoperable with Java, so code written in Java continues to be usable even if a developer switches to Scala. This makes the transition easier and extends the lifespan of a codebase. In contrast, low-level programs rarely survive beyond the system architecture for which they were written.
Relative meaning
The terms high-level and low-level are inherently relative, and languages can be compared as higher or lower level than each other. C is sometimes considered high-level and sometimes low-level, depending on one's perspective. Regardless, most agree that C is higher level than assembly and lower level than most other languages.
C supports constructs such as expression evaluation, parameterized and recursive functions, and data types and structures, which are generally not supported in assembly or directly by a processor. At the same time, C provides lower-level features such as auto-increment and pointer arithmetic, and it lacks many higher-level abstractions common in other languages, such as garbage collection and a built-in string type. In the introduction of The C Programming Language (second edition) by Brian Kernighan and Dennis Ritchie, C is described as "not a very high level" language.[25]
Assembly language is higher-level than machine code, but still closely tied to the processor hardware. However, assembly may provide some higher-level features such as macros, relatively limited expressions, constants, variables, procedures, and data structures.
Machine code is at a slightly higher level of abstraction than the microcode or micro-operations used internally in many processors.[26]
Execution modes
The source code of a high-level language may be processed in various ways, such as:
- Compiled
  - A compiler transforms source code into other code. In some cases, a compiler generates native machine code that is executed directly by the processor; however, many execution models today involve generating an intermediate representation (e.g. bytecode) that is later interpreted in software or converted to native code at runtime (via JIT compilation).
  - Sometimes the hardware is optimized for the runtime requirements of specific languages. For example, the Burroughs Large Systems were target machines for ALGOL 60.[27]
- Transpiled
  - Code may be translated into source code of another language (typically lower-level) for which a compiler or interpreter is available. JavaScript and C are common targets for such translators. For example, the EiffelStudio IDE can generate C and C++ code from Eiffel code. In Eiffel, the translation process is referred to as transcompiling, and the Eiffel compiler as a transcompiler or source-to-source compiler.
- Software interpreted
  - A software interpreter performs the actions encoded in source code without generating native machine code.
- Hardware interpreted
  - Although uncommon, a processor with a high-level language computer architecture can process a high-level language without a programmed compilation step.[28]
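The compile-to-bytecode path described in the list can be observed in one widely used implementation. The sketch below assumes CPython, the reference implementation of Python, and uses the standard `compile` built-in and `dis` module. The variable names in the expression are made up for the example, and the exact instruction names printed differ between CPython versions, so none are listed here.

```python
import dis

source = "price * quantity + shipping"  # hypothetical expression for illustration

# Step 1: compile the source text into a code object that holds bytecode.
code = compile(source, "<example>", "eval")

# Step 2: list the intermediate instructions (names vary between versions).
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)

# Step 3: the virtual machine executes the bytecode with concrete values.
result = eval(code, {}, {"price": 4, "quantity": 3, "shipping": 5})
print(result)  # 17
```

The same code object can be evaluated repeatedly with different values, so the translation step is paid once while execution remains the job of the software interpreter.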
Note that a language is not strictly interpreted or compiled. Rather, an execution model involves a compiler or an interpreter, and the same language might be used with different execution models. For example, ALGOL 60 and Fortran have both been interpreted even though they were more typically compiled. Similarly, Java shows the difficulty of applying these labels to languages rather than to implementations: Java is compiled to bytecode, which is then executed by a Java virtual machine (JVM) that either interprets it or JIT-compiles it to native code.
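That one language can have several execution models can be sketched with a toy arithmetic language. The example below borrows Python's `ast` module only as a parser, then runs the same expression two ways: by walking the syntax tree directly, and by compiling it to instructions for a small stack machine. The instruction names `PUSH` and `APPLY`, the function names, and the set of supported operators are assumptions invented for this sketch and do not correspond to any real virtual machine.

```python
import ast
import operator

# Operators supported by the toy language (an illustrative subset).
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def interpret(node):
    """Execution model 1: walk the syntax tree and evaluate it directly."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](interpret(node.left), interpret(node.right))
    raise ValueError("unsupported expression")


def compile_to_bytecode(node, program=None):
    """Execution model 2, step 1: translate the tree to stack-machine code."""
    program = [] if program is None else program
    if isinstance(node, ast.Constant):
        program.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp) and type(node.op) in OPS:
        compile_to_bytecode(node.left, program)
        compile_to_bytecode(node.right, program)
        program.append(("APPLY", type(node.op)))
    else:
        raise ValueError("unsupported expression")
    return program


def run_bytecode(program):
    """Execution model 2, step 2: a small virtual machine runs the code."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
        else:  # "APPLY": pop two operands and push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[argument](left, right))
    return stack.pop()


tree = ast.parse("2 + 3 * (10 - 4)", mode="eval").body
print(interpret(tree))                          # 20
print(run_bytecode(compile_to_bytecode(tree)))  # 20
```

Both paths produce the same value from the same source text, which is why "interpreted" and "compiled" describe an implementation rather than the language itself.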