C (programming language)
Paradigm | Multi-paradigm: imperative (procedural), structured |
---|---|
Designed by | Dennis Ritchie |
Developer | Dennis Ritchie & Bell Labs (creators); ANSI X3J11 (ANSI C); ISO/IEC JTC1/SC22/WG14 (ISO C) |
First appeared | 1972[2] |
Stable release | C17
/ June 2018 |
Preview release | |
Typing discipline | Static, weak, manifest, nominal |
OS | Cross-platform |
Filename extensions | .c, .h |
Website | www www |
Major implementations | |
K&R C, GCC, Clang, Intel C, C++Builder, Microsoft Visual C++, Watcom C | |
Dialects | |
Cyclone, Unified Parallel C, Split-C, Cilk, C* | |
Influenced by | |
B (BCPL, CPL), ALGOL 68,[4] Assembly, PL/I, FORTRAN | |
Influenced | |
Numerous: AMPL, AWK, csh, C++, C--, C#, Objective-C, D, Go, Java, JavaScript, Julia, Limbo, LPC, Perl, PHP, Pike, Processing, Python, Rust, Seed7, Vala, Verilog (HDL),[5] Nim, Zig | |
|
C (/ˈsiː/, as in the letter c) is a general-purpose, procedural computer programming language supporting structured programming, lexical variable scope, and recursion, with a static type system. By design, C provides constructs that map efficiently to typical machine instructions. It has found lasting use in applications previously coded in assembly language. Such applications include operating systems and various application software for computer architectures that range from supercomputers to PLCs and embedded systems.
A successor to the programming language B, C was originally developed at Bell Labs by Dennis Ritchie between 1972 and 1973 to construct utilities running on Unix. It was applied to re-implementing the kernel of the Unix operating system.[6] During the 1980s, C gradually gained popularity. It has become one of the most widely used programming languages,[7][8] with C compilers from various vendors available for the majority of existing computer architectures and operating systems. C has been standardized by ANSI since 1989 (ANSI C) and by the International Organization for Standardization (ISO).
C is an imperative procedural language. It was designed to be compiled to provide low-level access to memory and language constructs that map efficiently to machine instructions, all with minimal runtime support. Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant C program written with portability in mind can be compiled for a wide variety of computer platforms and operating systems with few changes to its source code.[9]
As of January 2021, C was ranked first in the TIOBE index, a measure of the popularity of programming languages, moving up from the no. 2 spot the previous year.[10]
Overview[]
Like most procedural languages in the ALGOL tradition, C has facilities for structured programming and allows lexical variable scope and recursion. Its static type system prevents unintended operations. In C, all executable code is contained within subroutines (also called "functions", though not strictly in the sense of functional programming). Function parameters are always passed by value (except arrays). Pass-by-reference is simulated in C by explicitly passing pointer values. C program source text is free-format, using the semicolon as a statement terminator and curly braces for grouping blocks of statements.
The C language also exhibits the following characteristics:
- The language has a small, fixed number of keywords, including a full set of control flow primitives:
if/else
,for
,do/while
,while
, andswitch
. User-defined names are not distinguished from keywords by any kind of sigil. - It has a large number of arithmetic, bitwise, and logic operators:
+
,+=
,++
,&
,||
, etc. - More than one assignment may be performed in a single statement.
- Functions:
- Function return values can be ignored, when not needed.
- Function and data pointers permit ad hoc run-time polymorphism.
- Functions may not be defined within the lexical scope of other functions.
- Data typing is static, but weakly enforced; all data has a type, but implicit conversions are possible.
- Declaration syntax mimics usage context. C has no "define" keyword; instead, a statement beginning with the name of a type is taken as a declaration. There is no "function" keyword; instead, a function is indicated by the presence of a parenthesized argument list.
- User-defined (typedef) and compound types are possible.
- Heterogeneous aggregate data types (
struct
) allow related data elements to be accessed and assigned as a unit. - Union is a structure with overlapping members; only the last member stored is valid.
- Array indexing is a secondary notation, defined in terms of pointer arithmetic. Unlike structs, arrays are not first-class objects: they cannot be assigned or compared using single built-in operators. There is no "array" keyword in use or definition; instead, square brackets indicate arrays syntactically, for example
month[11]
. - Enumerated types are possible with the
enum
keyword. They are freely interconvertible with integers. - Strings are not a distinct data type, but are conventionally implemented as null-terminated character arrays.
- Heterogeneous aggregate data types (
- Low-level access to computer memory is possible by converting machine addresses to typed pointers.
- Procedures (subroutines not returning values) are a special case of function, with an untyped return type
void
. - A preprocessor performs macro definition, source code file inclusion, and conditional compilation.
- There is a basic form of modularity: files can be compiled separately and linked together, with control over which functions and data objects are visible to other files via
static
andextern
attributes. - Complex functionality such as I/O, string manipulation, and mathematical functions are consistently delegated to library routines.
While C does not include certain features found in other languages (such as object orientation and garbage collection), these can be implemented or emulated, often through the use of external libraries (e.g., the GLib Object System or the Boehm garbage collector).
Relations to other languages[]
Many later languages have borrowed directly or indirectly from C, including C++, C#, Unix's C shell, D, Go, Java, JavaScript (including transpilers), Julia, Limbo, LPC, Objective-C, Perl, PHP, Python, Ruby, Rust, Swift, Verilog and SystemVerilog (hardware description languages).[5] These languages have drawn many of their control structures and other basic features from C. Most of them (Python being a dramatic exception) also express highly similar syntax to C, and they tend to combine the recognizable expression and statement syntax of C with underlying type systems, data models, and semantics that can be radically different.
History[]
Early developments[]
Year | C Standard[9] |
---|---|
1972 | Birth |
1978 | K&R C |
1989/1990 | ANSI C and ISO C |
1999 | C99 |
2011 | C11 |
2017 | C17 |
TBD | C2x |
The origin of C is closely tied to the development of the Unix operating system, originally implemented in assembly language on a PDP-7 by Dennis Ritchie and Ken Thompson, incorporating several ideas from colleagues. Eventually, they decided to port the operating system to a PDP-11. The original PDP-11 version of Unix was also developed in assembly language.[6]
Thompson desired a programming language to make utilities for the new platform. At first, he tried to make a Fortran compiler, but soon gave up the idea. Instead, he created a cut-down version of the recently developed BCPL systems programming language. The official description of BCPL was not available at the time,[11] and Thompson modified the syntax to be less wordy, producing the similar but somewhat simpler B.[6] However, few utilities were ultimately written in B because it was too slow, and B could not take advantage of PDP-11 features such as byte addressability.
In 1972, Ritchie started to improve B, which resulted in creating a new language C.[12] The C compiler and some utilities made with it were included in Version 2 Unix.[13]
At Version 4 Unix, released in November 1973, the Unix kernel was extensively re-implemented in C.[6] By this time, the C language had acquired some powerful features such as struct
types.
Preprocessor was introduced around 1973 at the urging of and also in recognition of the usefulness of the file-inclusion mechanisms available in BCPL and PL/I. Its original version provided only included files and simple string replacements: #include
and #define
of parameterless macros. Soon after that, it was extended, mostly by Mike Lesk and then by John Reiser, to incorporate macros with arguments and conditional compilation.[6]
Unix was one of the first operating system kernels implemented in a language other than assembly. Earlier instances include the Multics system (which was written in PL/I) and Master Control Program (MCP) for the Burroughs B5000 (which was written in ALGOL) in 1961. In around 1977, Ritchie and Stephen C. Johnson made further changes to the language to facilitate portability of the Unix operating system. Johnson's Portable C Compiler served as the basis for several implementations of C on new platforms.[12]
K&R C[]
In 1978, Brian Kernighan and Dennis Ritchie published the first edition of The C Programming Language.[1] This book, known to C programmers as K&R, served for many years as an informal specification of the language. The version of C that it describes is commonly referred to as "K&R C". As this was released in 1978, it is also referred to as C78.[14] The second edition of the book[15] covers the later ANSI C standard, described below.
K&R introduced several language features:
- Standard I/O library
long int
data typeunsigned int
data type- Compound assignment operators of the form
=op
(such as=-
) were changed to the formop=
(that is,-=
) to remove the semantic ambiguity created by constructs such asi=-10
, which had been interpreted asi =- 10
(decrementi
by 10) instead of the possibly intendedi = -10
(leti
be −10).
Even after the publication of the 1989 ANSI standard, for many years K&R C was still considered the "lowest common denominator" to which C programmers restricted themselves when maximum portability was desired, since many older compilers were still in use, and because carefully written K&R C code can be legal Standard C as well.
In early versions of C, only functions that return types other than int
must be declared if used before the function definition; functions used without prior declaration were presumed to return type int
.
For example:
long some_function();
/* int */ other_function();
/* int */ calling_function()
{
long test1;
register /* int */ test2;
test1 = some_function();
if (test1 > 0)
test2 = 0;
else
test2 = other_function();
return test2;
}
The int
type specifiers which are commented out could be omitted in K&R C, but are required in later standards.
Since K&R function declarations did not include any information about function arguments, function parameter type checks were not performed, although some compilers would issue a warning message if a local function was called with the wrong number of arguments, or if multiple calls to an external function used different numbers or types of arguments. Separate tools such as Unix's lint utility were developed that (among other things) could check for consistency of function use across multiple source files.
In the years following the publication of K&R C, several features were added to the language, supported by compilers from AT&T (in particular PCC[16]) and some other vendors. These included:
void
functions (i.e., functions with no return value)- functions returning
struct
orunion
types (rather than pointers) - assignment for
struct
data types - enumerated types
The large number of extensions and lack of agreement on a standard library, together with the language popularity and the fact that not even the Unix compilers precisely implemented the K&R specification, led to the necessity of standardization.
ANSI C and ISO C[]
During the late 1970s and 1980s, versions of C were implemented for a wide variety of mainframe computers, minicomputers, and microcomputers, including the IBM PC, as its popularity began to increase significantly.
In 1983, the American National Standards Institute (ANSI) formed a committee, X3J11, to establish a standard specification of C. X3J11 based the C standard on the Unix implementation; however, the non-portable portion of the Unix C library was handed off to the IEEE working group 1003 to become the basis for the 1988 POSIX standard. In 1989, the C standard was ratified as ANSI X3.159-1989 "Programming Language C". This version of the language is often referred to as ANSI C, Standard C, or sometimes C89.
In 1990, the ANSI C standard (with formatting changes) was adopted by the International Organization for Standardization (ISO) as ISO/IEC 9899:1990, which is sometimes called C90. Therefore, the terms "C89" and "C90" refer to the same programming language.
ANSI, like other national standards bodies, no longer develops the C standard independently, but defers to the international C standard, maintained by the working group ISO/IEC JTC1/SC22/WG14. National adoption of an update to the international standard typically occurs within a year of ISO publication.
One of the aims of the C standardization process was to produce a superset of K&R C, incorporating many of the subsequently introduced unofficial features. The standards committee also included several additional features such as function prototypes (borrowed from C++), void
pointers, support for international character sets and locales, and preprocessor enhancements. Although the syntax for parameter declarations was augmented to include the style used in C++, the K&R interface continued to be permitted, for compatibility with existing source code.
C89 is supported by current C compilers, and most modern C code is based on it. Any program written only in Standard C and without any hardware-dependent assumptions will run correctly on any platform with a conforming C implementation, within its resource limits. Without such precautions, programs may compile only on a certain platform or with a particular compiler, due, for example, to the use of non-standard libraries, such as GUI libraries, or to a reliance on compiler- or platform-specific attributes such as the exact size of data types and byte endianness.
In cases where code must be compilable by either standard-conforming or K&R C-based compilers, the __STDC__
macro can be used to split the code into Standard and K&R sections to prevent the use on a K&R C-based compiler of features available only in Standard C.
After the ANSI/ISO standardization process, the C language specification remained relatively static for several years. In 1995, Normative Amendment 1 to the 1990 C standard (ISO/IEC 9899/AMD1:1995, known informally as C95) was published, to correct some details and to add more extensive support for international character sets.[17]
C99[]
The C standard was further revised in the late 1990s, leading to the publication of ISO/IEC 9899:1999 in 1999, which is commonly referred to as "C99". It has since been amended three times by Technical Corrigenda.[18]
C99 introduced several new features, including inline functions, several new data types (including long long int
and a complex
type to represent complex numbers), variable-length arrays and flexible array members, improved support for IEEE 754 floating point, support for variadic macros (macros of variable arity), and support for one-line comments beginning with //
, as in BCPL or C++. Many of these had already been implemented as extensions in several C compilers.
C99 is for the most part backward compatible with C90, but is stricter in some ways; in particular, a declaration that lacks a type specifier no longer has int
implicitly assumed. A standard macro __STDC_VERSION__
is defined with value 199901L
to indicate that C99 support is available. GCC, Solaris Studio, and other C compilers now support many or all of the new features of C99. The C compiler in Microsoft Visual C++, however, implements the C89 standard and those parts of C99 that are required for compatibility with C++11.[19][needs update]
In addition, support for Unicode identifiers (variable / function names) in the form of escaped characters (e.g. \U0001f431
) is now required. Support for raw Unicode names like