PostsAbout me

Evaluation of function calls in Haskell

(This post is discussed in episode 15 of the Haskell Weekly Podcast.)

Chapter 27 of Haskell Programming from first principles (by Christopher Allen and Julie Moronuki) is about the evaluation system of Haskell, with a focus on non-strictness. In the section Preventing sharing on purpose, they write you want to prevent sharing the result of a function call when it would mean storing some big data just to calculate a small result. Two examples are provided to demonstrate the alternatives. In the first, the result of g _ is not shared but calculated twice:

Prelude> f x = (x 3) + (x 10)
Prelude> g' = \_ -> trace "hi g'" 2
Prelude> f g'
hi g'
hi g'

In the second, the result of g _ is shared, i.e. calculated only once and the result is stored:

Prelude> g = const (trace "hi g" 2)
Prelude> f g
hi g

(Edited to add:) In practice, sharing is usually achieved with a let expression or a where construct.

(Note that this latter is called a “point-free” definition.)

The authors conclude that

functions aren’t shared when there are named arguments but are when the arguments are elided, as in pointfree. So, one way to prevent sharing is adding named arguments.

(Quoted from version 1.0RC4 of the book.)

In this post I analyze the runtime differences between point-free and pointful definitions.

Behind the scenes

As Tom Ellis describes, the definitions of g and f translate to the following (in a close approximation to the “Core” language used during compilation):

f = \x -> let {x3 = x 3; x10 = x 10} in (+) x3 x10
g = let {tg = trace "hi g" 2} in \y -> const tg y
g' = \_ -> trace "hi g'" 2

(Calling f g with these definitions does not result in the same trace in GHCi 8.6.5 as with the original definitions. However, the code has the expected behavior if loaded into GHCi from a source file like that below.)

Two things to point out here. First, every function definition is a lambda. Second, g was turned into a let expression because we can only apply functions to variables or literals (in Core), not to function calls. Edited to add: It would be reasonable to ask why g = const (trace "hi g" 2) doesn’t translate to \y -> let {tg = trace "hi g" 2} in const tg y (similar to f), to which the pragmatic answer is that apparently the order is the following:

  1. not-fully-applied functions are turned into lambdas,
  2. parameters that are function calls are turned into named variables, and
  3. named function arguments from the left-hand side of = are moved to the right as a lambda.

Evaluation with sharing

This is what happens during the evaluation of f g:

ans = f g

ans is a function call, so its evaluation proceeds with substituting g for the argument of f:

ans = let {x3 = g 3; x10 = g 10} in (+) x3 x10

ans is a let expression, so we put the following thunks for x3 and x10 on the heap under some unique name:

-- Heap:
ans_x3 = g 3
ans_x10 = g 10

…and then proceed with evaluating the in part:

ans = (+) ans_x3 ans_x10

During the evaluation of this function call, ans_x3 will be evaluated (or potentially ans_x10 first, or both in parallel). ans_x3 is a function call, so first we evaluate g to a lambda. As g is a let expression, we create a closure for trace "hi g" 2 on the heap, and then continue with the in part of g (\y -> const tg y). This is a lambda now, meaning it’s in weak head normal form, so the heap contents for g is overwritten with that:

-- Heap:
g_tg = trace "hi g" 2
g = \y -> const g_tg y

Back to ans_x3, now the argument 3 is substituted in the definition of g:

ans_x3 = const g_tg 3

This is a function call, with const already a lambda \x _ -> x, so the arguments can now be substituted in the body, leaving us with

-- Heap:
ans_x3 = g_tg  -- (Pointer to the same address as g_tg.)

During the evaluation of g_tg, the magic printout happens (hi g on stdout), and its value is resolved to be 2, so the heap is updated as such:

-- Heap:
g_tg = 2

And ans_x3 is a pointer to the same memory content 2.

Analogously, the evaluation of ans_x10 proceeds as such:

ans_x10 = const g_tg 10
ans_x10 = g_tg
-- let ans_x10 points to the memory location of g_tg:
ans_x10 = 2

Finally, ans = (+) ans_x3 ans_x10, which evaluates to ans = 4.

Evaluation without sharing

In contrast, the evaluation of f g' proceeds as follows:

ans' = f g'
ans' = let {x3 = g' 3; x10 = g' 10} in (+) x3 x10
-- Heap:
ans_x3'  = g' 3
ans_x10' = g' 10
ans' = (+) ans_x3' ans_x10'
ans_x3' = trace "hi g'" 2

Now hi g' is printed, and the heap is updated:

-- Heap:
ans_x3' = 2

When evaluating ans_x10', we again print hi g', and store the result of the trace under a different thunk:

-- Heap:
ans_x10' = 2

Now ans' evaluates to (+) 2 2, i.e. 4.

Attempt at verifying my translated definitions

I attempted to verify what I was saying above about the definitions of f, g, g' in Core, using the -ddump-simpl compiler flag of GHCi, but it didn’t fulfil my expectations.


module Sharing where

import Debug.Trace

f x = (x (3::Int)) + (x 10) :: Int
g   = const (trace "hi g" (2::Int))  -- share
g'  = \_ -> trace "hi g'" (2::Int)   -- don't share
g'' = let {tg = trace "hi g" (2::Int)} in \y -> const tg y  -- share (equivalent to g)

In GHCi:

Prelude> :set -ddump-simpl -dsuppress-all -Wno-missing-signatures
Prelude> :l Sharing
[1 of 1] Compiling Sharing          ( Sharing.hs, interpreted )

==================== Tidy Core ====================
Result size of Tidy Core
  = {terms: 52, types: 39, coercions: 0, joins: 0/0}

f = \ x_a1Fl -> + $fNumInt (x_a1Fl (I# 3#)) (x_a1Fl (I# 10#))
g = \ @ b_a1Gi -> const (trace (unpackCString# "hi g"#) (I# 2#))
g' = \ @ p_a1G6 -> \ _ -> trace (unpackCString# "hi g'"#) (I# 2#)
tg_r1F4 = trace (unpackCString# "hi g"#) (I# 2#)
g'' = \ @ b_a1FJ -> \ y_a1Fn -> const tg_r1F4 y_a1Fn

... and some more stuff

Nonetheless, as a SO answer describes, we can see that a function application in Core is defined as Expr Atom, where Atom is var | Literal. I attempted to install ghc-core but the build failed, so further analysis is put on the shelf.


So, what’s the essential difference between g and g'?

g = const (trace "hi g" 2) is a function application where the argument is a function application, which is treated as a let expression. When you evaluate g (), the auxiliary variable introduced by the let – i.e.,tg = trace "hi g" 2 – is evaluated to a literal and its value is stored on the heap. On subsequent calls, some other argument can be applied to the const tg function, but its first argument tg is already evaluated.

In contrast, g' = \_ -> trace "hi g'" 2 is a lambda, so it is already fully evaluated, and nothing in it can be simplified further. If we apply g' first to the argument (), the expression g' () will evaluate to the body of g' with the unused parameter discarded, i.e. trace "hi g'" 2. If we later evaluate g' [], then it again results in the (same) body after the (dummy) function application. Nowhere during this process did we store the value of trace "hi g'" 2: in particular, we didn’t update the definition of g' to \_ -> 2, simply because that is not the definition of g'. (But could we have updated it? Even though functions are always pure, I think the answer is generally no: sometimes the result of a function is bigger than the definition, and the result is not needed often enough to warrant this speed–memory tradeoff.)

Recall the original wording:

functions aren’t shared when there are named arguments but are when the arguments are elided, as in pointfree.

As we saw, functions themselves are never shared. Rather, if g is a partially applied function whose argument is a function application fun arg, then g is equivalent to a let expression, and after its first evaluation g will change to a lambda with fun arg already evaluated.

As a generally-okay heuristic, point-free definitions allow sharing inner function calls, whereas nothing in a lambda (or a function with all arguments on the left-hand side) is shared.

Further resources

More details on similar behavior are given by Tom Ellis in his talk Haskell programs: how do they run? (free registration required to watch the talk).

The talk of David Luposchainsky (a.k.a. quchen) goes into more depth – down to the Core –, in which he uses his own implementation of the spineless tagless graph reduction machine (STG), to visualize the evaluation of any given Haskell code (link to repo).