Evaluation strategy

From Wikipedia, the free encyclopedia

In a programming language, an evaluation strategy is a set of rules for evaluating expressions.[1] The term is often used to refer to the more specific notion of a parameter-passing strategy[2] that defines whether to evaluate the parameters of a function call, and if so in what order (the evaluation order)[3] and the kind of value that is passed to the function for each parameter (the binding strategy).[4] The notion of reduction strategy is distinct,[5] although some authors conflate the two terms and the definition of each term is not widely agreed upon.[6]

To illustrate, a function application may evaluate the argument before evaluating the function's body and pass the address, giving the function the ability to look up the argument's current value and modify it via assignment.[7]

Evaluation strategy is specified by the programming language definition, and is not a function of any specific implementation. The calling convention defines implementation-specific parameter passing details.

Table[]

This is a table of evaluation strategies and representative languages by year introduced. The representative languages are listed in chronological order, starting with the language(s) that introduced the strategy (if any) and followed by prominent languages that use the strategy.[8]: 434

Evaluation strategy Representative Languages Year first introduced
Call by reference FORTRAN II, PL/I 1958
Call by value ALGOL, C, Scheme 1960
Call by name ALGOL 60, Simula 1960
Call by copy-restore Fortran IV, Ada[9] 1962
Call by need Haskell, R[10] 1971[11]
Call by reference parameters C++, PHP,[12] C#,[13] Visual Basic .NET[14] ?
Call by reference to const C, C++ ?
Call by sharing Java, Python, Ruby ?

Evaluation orders[]

While the order of operations defines the abstract syntax tree of the expression, the evaluation order defines the order in which expressions are evaluated. For example, the Python program

def f(x):
    print(x)
    return x

f(1) + f(2)

outputs 1 2 due to Python's left-to-right evaluation order, but a similar program in OCaml:

let f x =  print_string (string_of_int x); x ;;
f 1 + f 2

outputs 2 1 due to OCaml's right-to-left evaluation order.

The evaluation order is mainly visible in code with side effects, but it also affects the performance of the code because a rigid order inhibits instruction scheduling. For this reason language standards such as C++ traditionally left the order undefined, although languages such as Java and C# define the evaluation order as left-to-right[8]: 240–241 and the C++17 standard has added constraints on the evaluation order.[15]

Strict evaluation []

Applicative order is a family of evaluation orders in which a function's arguments are evaluated completely before the function is applied. [16] This has the effect of making the function strict, i.e. the function's result is undefined if any of the arguments are undefined, so applicative order evaluation more commonly called strict evaluation. Furthermore, a function call is performed as soon as it is encountered in a procedure, so it is also called eager evaluation or greedy evaluation.[17][18] Some authors refer to strict evaluation as "call by value" due to the call-by-value binding strategy requiring strict evaluation.[3]

Common Lisp, Eiffel and Java evaluate function arguments left-to-right. C leaves the order undefined.[19] Scheme requires the execution order to be the sequential execution of an unspecified permutation of the arguments.[20] OCaml similarly leaves the order unspecified, but in practice evaluates arguments right-to-left due to the design of its abstract machine.[21] All of these are strict evaluation.

Non-strict evaluation []

A non-strict evaluation order is an evaluation order that is not strict, that is, a function may return a result before all of its arguments are fully evaluated.[22]: 46–47 The prototypical example is normal order evaluation, which does not evaluate any of the arguments until they are needed in the body of the function.[23] Normal order evaluation has the property that it terminates without error whenever any other evaluation order terminates without error.[24] Note that lazy evaluation is classified in this article as a binding technique rather than an evaluation order. But this distinction is not always followed and some authors define lazy evaluation as normal order evaluation or vice-versa,[25][26] or confuse non-strictness with lazy evaluation.[22]: 43–44

Boolean expressions in many languages use a form of non-strict evaluation called short-circuit evaluation, where evaluation returns as soon as it can be determined that an unambiguous Boolean will result—for example, in a disjunctive expression (OR) where true is encountered, or in a conjunctive expression (AND) where false is encountered, and so forth.[26] Conditional expressions similarly use non-strict evaluation - only one of the branches is evaluated.[22]

Comparison of applicative order and normal order evaluation[]

With normal order evaluation, expressions containing an expensive computation, an error, or an infinite loop will be ignored if not needed,[3] allowing the specification of user-defined control flow constructs, a facility not available with applicative order evaluation. Normal order evaluation uses complex structures such as thunks for unevaluated expressions, compared to the call stack used in applicative order evaluation.[27] Normal order evaluation has historically had a lack of usable debugging tools due to its complexity.[28]

Strict binding strategies[]

Call by value[]

In call by value, the evaluated value of the argument expression is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local variable is assigned—that is, anything passed into a function call is unchanged in the caller's scope when the function returns.

Implicit limitations[]

In some cases, the term "call by value" is problematic, as the value which is passed is not the value of the variable as understood by the ordinary meaning of value, but an implementation-specific reference to the value. The effect is that what syntactically looks like call by value may end up rather behaving like call by reference or call by sharing, often depending on very subtle aspects of the language semantics.

The reason for passing a reference is often that the language technically does not provide a value representation of complicated data, but instead represents them as a data structure while preserving some semblance of value appearance in the source code. Exactly where the boundary is drawn between proper values and data structures masquerading as such is often hard to predict. In C, an array (of which strings are special cases) is a data structure but the name of an array is treated as (has as value) the reference to the first element of the array, while a struct variable's name refers to a value even if it has fields that are vectors. In Maple, a vector is a special case of a table and therefore a data structure, but a list (which gets rendered and can be indexed in exactly the same way) is a value. In Tcl, values are "dual-ported" such that the value representation is used at the script level, and the language itself manages the corresponding data structure, if one is required. Modifications made via the data structure are reflected back to the value representation and vice versa.

The description "call by value where the value is a reference" is common (but should not be understood as being call by reference); another term is call by sharing. Thus the behaviour of call by value Java or Visual Basic and call by value C or Pascal are significantly different: in C or Pascal, calling a function with a large structure as an argument will cause the entire structure to be copied (except if it's actually a reference to a structure), potentially causing serious performance degradation, and mutations to the structure are invisible to the caller. However, in Java or Visual Basic only the reference to the structure is copied, which is fast, and mutations to the structure are visible to the caller.

Call by reference[]

Call by reference (or pass by reference) is an evaluation strategy where a parameter is bound to an implicit reference to the variable used as argument, rather than a copy of its value.

This typically means that the function can modify (i.e., assign to) the variable used as argument—something that will be seen by its caller. Call by reference can therefore be used to provide an additional channel of communication between the called function and the calling function. A call-by-reference language makes it more difficult for a programmer to track the effects of a function call, and may introduce subtle bugs. A simple litmus test for whether a language supports call-by-reference semantics is if it's possible to write a traditional swap(a, b) function in the language.[29]

Call by reference can be simulated in languages that use call by value and don't exactly support call by reference, by making use of references (objects that refer to other objects), such as pointers (objects representing the memory addresses of other objects). Languages such as C, ML and Rust use this technique. It is not a separate evaluation strategy—the language calls by value—but sometimes it is referred to as "call by address" or "pass by address". In ML, references are type- and memory-safe, similar to Rust.

In purely functional languages there is typically no semantic difference between the two strategies (since their data structures are immutable, so there is no possibility for a function to modify any of its arguments), so they are typically described as call by value even though implementations frequently use call by reference internally for the efficiency benefits.

Following is an example that demonstrates call by reference in the E programming language:

def modify(var p, &q) {
    p := 27 # passed by value: only the local parameter is modified
    q := 27 # passed by reference: variable used in call is modified
}

? var a := 1
# value: 1
? var b := 2
# value: 2
? modify(a, &b)
? a
# value: 1
? b
# value: 27

Following is an example of call by address that simulates call by reference in C:

void modify(int p, int* q, int* r) {
    p = 27; // passed by value: only the local parameter is modified
    *q = 27; // passed by value or reference, check call site to determine which
    *r = 27; // passed by value or reference, check call site to determine which
}

int main() {
    int a = 1;
    int b = 1;
    int x = 1;
    int* c = &x;
    modify(a, &b, c); // a is passed by value, b is passed by reference by creating a pointer (call by value),
                    // c is a pointer passed by value
                    // b and x are changed
    return 0;
}

Call by sharing[]

Call by sharing (also known as "call by object" or "call by object-sharing") is an evaluation strategy first noted by Barbara Liskov in 1974 for the CLU language.[30] It is used by languages such as Python,[31] Java (for object references), Ruby, JavaScript, Scheme, OCaml, AppleScript, and many others. However, the term "call by sharing" is not in common use; the terminology is inconsistent across different sources. For example, in the Java community, they say that Java is call by value.[32] Call by sharing implies that values in the language are based on objects rather than primitive types, i.e., that all values are "boxed". Because they are boxed they can be said to pass by copy of reference (where primitives are boxed before passing and unboxed at called function).

The semantics of call by sharing differ from call by reference: "In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects".[33] So, for example, if a variable was passed, it is not possible to simulate an assignment on that variable in the callee's scope.[34] However, since the function has access to the same object as the caller (no copy is made), mutations to those objects, if the objects are mutable, within the function are visible to the caller, which may appear to differ from call by value semantics. Mutations of a mutable object within the function are visible to the caller because the object is not copied or cloned—it is shared.

For example, in Python, lists are mutable, so:

def f(a_list):
    a_list.append(1)

m = []
f(m)
print(m)

outputs [1] because the append method modifies the object on which it is called.

Assignments within a function are not noticeable to the caller, because, in these languages, passing the variable only means passing (access to) the actual object referred to by the variable, not access to the original (caller's) variable. Since the rebound variable only exists within the scope of the function, the counterpart in the caller retains its original binding.

Compare the Python mutation above with the code below, which binds the formal argument to a new object:

def f(a_list):
    a_list = [1]

m = []
f(m)
print(m)

outputs [], because the statement a_list = [1] reassigns a new list to the variable rather than to the location it references.

For immutable objects, there is no real difference between call by sharing and call by value, except if object identity is visible in the language. The use of call by sharing with mutable objects is an alternative to input/output parameters: the parameter is not assigned to (the argument is not overwritten and object identity is not changed), but the object (argument) is mutated.[35]

Call by copy-restore[]

Call by copy-restore—also known as "copy-in copy-out", "call by value result", "call by value return" (as termed in the Fortran community)—is a special case of call by reference where the provided reference is unique to the caller. This variant has gained attention in multiprocessing contexts and Remote procedure call:[36] if a parameter to a function call is a reference that might be accessible by another thread of execution, its contents may be copied to a new reference that is not; when the function call returns, the updated contents of this new reference are copied back to the original reference ("restored").

The semantics of call by copy-restore also differ from those of call by reference, where two or more function arguments alias one another (i.e., point to the same variable in the caller's environment). Under call by reference, writing to one will affect the other; call by copy-restore avoids this by giving the function distinct copies, but leaves the result in the caller's environment undefined depending on which of the aliased arguments is copied back first—will the copies be made in left-to-right order both on entry and on return?

When the reference is passed to the callee uninitialized, this evaluation strategy may be called "call by result".

Non-strict binding strategies[]

Call by name[]

Call by name is an evaluation strategy where the arguments to a function are not evaluated before the function is called—rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears. (See Jensen's Device.)

Call-by-name evaluation is occasionally preferable to call-by-value evaluation. If a function's argument is not used in the function, call by name will save time by not evaluating the argument, whereas call by value will evaluate it regardless. If the argument is a non-terminating computation, the advantage is enormous. However, when the function argument is used, call by name is often slower, requiring a mechanism such as a thunk.

Today's .NET languages can simulate call by name using delegates or Expression<T> parameters. The latter results in an abstract syntax tree being given to the function. Eiffel provides agents, which represent an operation to be evaluated when needed. Seed7 provides call by name with function parameters. Java programs can accomplish similar lazy evaluation using lambda expressions and the java.util.function.Supplier<T> interface.

Call by need[]

Call by need is a memoized variant of call by name, where, if the function argument is evaluated, that value is stored for subsequent use. If the argument is pure (i.e., free of side effects), this produces the same results as call by name, saving the cost of recomputing the argument.

Haskell is a well-known language that uses call-by-need evaluation. Because evaluation of expressions may happen arbitrarily far into a computation, Haskell only supports side effects (such as mutation) via the use of monads. This eliminates any unexpected behavior from variables whose values change prior to their delayed evaluation.

In R's implementation of call by need, all arguments are passed, meaning that R allows arbitrary side effects.

Lazy evaluation is the most common implementation of call-by-need semantics, but variations like optimistic evaluation exist. .NET languages implement call by need using the type Lazy<T>.

Graph reduction is an efficient implementation of lazy evaluation.

Call by macro expansion[]

Call by macro expansion is similar to call by name, but uses textual substitution rather than capture, thereby avoiding substitution. But macro substitution may cause mistakes, resulting in , leading to undesired behavior. Hygienic macros avoid this problem by checking for and replacing shadowed variables that are not parameters.

Call by future[]

"Call by future", also known as "parallel call by name", is a concurrent evaluation strategy in which the value of a future expression is computed concurrently with the flow of the rest of the program with promises, also known as futures. When the promise's value is needed, the main program blocks until the promise has a value (the promise or one of the promises finishes computing, if it has not already completed by then).

This strategy is non-deterministic, as the evaluation can occur at any time between creation of the future (i.e., when the expression is given) and use of the future's value. It is similar to call by need in that the value is only computed once, and computation may be deferred until the value is needed, but it may be started before. Further, if the value of a future is not needed, such as if it is a local variable in a function that returns, the computation may be terminated partway through.

If implemented with processes or threads, creating a future will spawn one or more new processes or threads (for the promises), accessing the value will synchronize these with the main thread, and terminating the computation of the future corresponds to killing the promises computing its value.

If implemented with a coroutine, as in .NET async/await, creating a future calls a coroutine (an async function), which may yield to the caller, and in turn be yielded back to when the value is used, cooperatively multitasking.

Optimistic evaluation[]

Optimistic evaluation is another call-by-need variant where the function's argument is partially evaluated for some amount of time (which may be adjusted at runtime). After that time has passed, evaluation is aborted and the function is applied using call by need.[37] This approach avoids some the call-by-need strategy's runtime expenses while retaining desired termination characteristics.

See also[]

References[]

  1. ^ Araki, Shota; Nishizaki, Shin-ya (November 2014). "Call-by-name evaluation of RPC and RMI calculi". Theory and Practice of Computation: 1. doi:10.1142/9789814612883_0001. Retrieved 21 August 2021.
  2. ^ Turbak, Franklyn; Gifford, David (18 July 2008). Design Concepts in Programming Languages. MIT Press. p. 309. ISBN 978-0-262-30315-6.
  3. ^ Jump up to: a b c Wilhelm, Reinhard; Seidl, Helmut (10 November 2010). Compiler Design: Virtual Machines. Springer Science & Business Media. p. 61. ISBN 978-3-642-14909-2.
  4. ^ Crank, Erik; Felleisen, Matthias (1991). "Parameter-passing and the lambda calculus". Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '91: 2. CiteSeerX 10.1.1.23.4385. doi:10.1145/99583.99616.
  5. ^ Nita, Stefania Loredana; Mihailescu, Marius (2017). "Introduction". Practical Concurrent Haskell: 3. doi:10.1007/978-1-4842-2781-7_1.
  6. ^ Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. p. 56. ISBN 0-262-16209-1.
  7. ^ Daniel P. Friedman; Mitchell Wand (2008). Essentials of Programming Languages (third ed.). Cambridge, MA: The MIT Press. ISBN 978-0262062794.
  8. ^ Jump up to: a b Scott, Michael Lee (2016). Programming language pragmatics (Fourth ed.). Waltham, MA: Elsevier. ISBN 9780124104778.
  9. ^ Hasti, Rebecca. "Parameter Passing". CS 536: Introduction to Programming Languages and Compilers. University of Wisconsin. Retrieved 22 August 2021.
  10. ^ Fay, Colin (30 July 2018). "About lazy evaluation". R-bloggers. Retrieved 21 August 2021.
  11. ^ Wadsworth, Christopher P. (1971). Semantics and Pragmatics of the Lambda Calculus (PhD). Oxford University.
  12. ^ "PHP: Passing by Reference - Manual". www.php.net. Retrieved 2021-07-04.
  13. ^ BillWagner. "Passing Parameters - C# Programming Guide". docs.microsoft.com. Retrieved 2021-07-04.
  14. ^ KathleenDollard. "Passing Arguments by Value and by Reference - Visual Basic". docs.microsoft.com. Retrieved 2021-07-04.
  15. ^ Filipek, Bartlomiej. "Stricter Expression Evaluation Order in C++17". C++ Stories. Retrieved 24 August 2021.
  16. ^ Abelson, Harold; Sussman, Gerald Jay (1996). "Normal Order and Applicative Order". Structure and interpretation of computer programs (Second ed.). Cambridge, Mass.: MIT Press. ISBN 0-262-01153-0.
  17. ^ Reese, Richard M. (14 October 2015). Learning Java Functional Programming. Packt Publishing Ltd. p. 106. ISBN 978-1-78528-935-4.
  18. ^ Antani, Ved; Timms, Simon; Mantyla, Dan (31 August 2016). JavaScript: Functional Programming for JavaScript Developers. Packt Publishing Ltd. p. 614. ISBN 978-1-78712-557-5.
  19. ^ Seacord, Robert C. "EXP30-C. Do not depend on the order of evaluation for side effects". SEI CERT C Coding Standard. Carnegie Mellon University. Retrieved 23 August 2021.
  20. ^ Anglade, S.; Lacrampe, J. J.; Queinnec, C. (October 1994). "Semantics of combinations in scheme" (PDF). ACM SIGPLAN Lisp Pointers. VII (4): 15–20. doi:10.1145/382109.382669.
  21. ^ "Why are OCaml function arguments evaluated right-to-left?". OCaml. 30 November 2017.
  22. ^ Jump up to: a b c Tremblay, G. (April 2000). "Lenient evaluation is neither strict nor lazy". Computer Languages. 26 (1): 43–66. CiteSeerX 10.1.1.137.9885. doi:10.1016/S0096-0551(01)00006-6.
  23. ^ George, Lai (March 1987). Efficient evaluation of normal order through strictness information (MSc). University of Utah. p. 10.
  24. ^ Borning, Alan (Autumn 1999). "Applicative vs Normal Order Evaluation in Functional Languages" (PDF). CSE 505: Concepts of Programming Languages. University of Washington. Retrieved 23 August 2021.
  25. ^ Abelson, Harold; Sussman, Gerald Jay (1996). "Normal Order and Applicative Order". Structure and interpretation of computer programs (Second ed.). Cambridge, Mass.: MIT Press. ISBN 0-262-01153-0.
  26. ^ Jump up to: a b Sturm, Oliver (11 April 2011). Functional Programming in C#: Classic Programming Techniques for Modern Projects. John Wiley and Sons. p. 91. ISBN 978-0-470-74458-1.
  27. ^ Marlow, Simon. "Why can't I get a stack trace?". Haskell Implementors Workshop 2012. Retrieved 25 August 2021.
  28. ^ Nilsson, Henrik (1999). "Tracing piece by piece: affordable debugging for lazy functional languages". Proceedings of the fourth ACM SIGPLAN international conference on Functional programming - ICFP '99: 36–47. CiteSeerX 10.1.1.451.6513. doi:10.1145/317636.317782.
  29. ^ "Java is Pass-by-Value, Dammit!". Retrieved 2016-12-24.
  30. ^ Liskov, Barbara; Atkinson, Russ; Bloom, Toby; Moss, Eliot; Schaffert, Craig; Scheifler, Craig; Snyder, Alan (October 1979). "CLU Reference Manual" (PDF). Laboratory for Computer Science. Massachusetts Institute of Technology. Archived from the original (PDF) on 2006-09-22. Retrieved 2011-05-19.
  31. ^ Lundh, Fredrik. "Call By Object". effbot.org. Retrieved 2011-05-19.
  32. ^ "Java is Pass-by-Value, Dammit!". Retrieved 2016-12-24.
  33. ^ CLU Reference Manual (1974), p. 14-15.
  34. ^ Note: in CLU language, "variable" corresponds to "identifier" and "pointer" in modern standard usage, not to the general/usual meaning of variable.
  35. ^ "CA1021: Avoid out parameters". Microsoft.
  36. ^ "RPC: Remote Procedure Call Protocol Specification Version 2". tools.ietf.org. IETF. Retrieved 7 April 2018.
  37. ^ Ennals, Robert; Jones, Simon Peyton (August 2003). "Optimistic Evaluation: a fast evaluation strategy for non-strict programs".


Further reading[]

External links[]

  • The interactive on-line Geometry of Interaction visualiser, implementing a graph-based machine for several common evaluation strategies.
Retrieved from ""