A For-Loop of One’s Own

Here’s how to make your own conditionally-executed for-loop with procedurally-generated recursive functions.

You’ll never need it and it’s almost definitely slower than the vanilla for-loop in your language, but it’s a fun exercise.

To make a for-loop, we need a couple things:

  1. Conditional evaluation of passed arguments
  2. A way to run a given body of code repeatedly

Now, you could setup a for-loop without any sort of goto by just copy-pasting the body n times, but that’s less fun.

For conditional evaluation we can either ask the user to wrap their body of code in a lambda-form, or we can use a macro. Every language I’ve seen that has a for-loop automatically conditionally executes the given body of code, so we’ll go with a macro.

The form of the for-loop will be

(loop-for (<index-var> <initial-value>) <predicate> <incrementor>
    <body>)

For example,

(loop-for (i 0) (> i 10) (incf i)
    (print i))

We’ll start by defining a macro with the right args

(defmacro loop-for (start-body predicate-body incrementor-body &body body)
    nil)

CL-USER> (loop-for 1 1 1 (print "dog"))
NIL

As you can see, it ignores everything its passed and returns nil.

Let’s add in a hardcoded example of what we want, ignoring most of the input args for now.

(defmacro loop-for (start-body predicate-body incrementor-body &body body)
  `(labels ((recurse (n bound)
              (unless (>= n bound)
                ,@body
                (recurse (1+ n) bound))))
     (recurse 0 3)))


CL-USER> (loop-for 1 1 1 (print "dog"))

"dog" 
"dog" 
"dog" 
NIL

For non-lispers: Macros don’t evaluate their given args unless explicitly told to. They’re almost like functions but they accept and return expressions. “Labels” lets us create a lexical scope and put functions inside that are only visible to the code inside that labels form, that can be self-referential. We create a function called “recurse”, and set as its body a recursive function whose base case is when n >= the given bound of 3. The ,@body form takes the given “body” of code, in our case (print “dog”), and splices it into the code we wrote. The whole labels form is prefixed with a `, which indicates that we’re writing the code we want the macro to return, to grossly summarise. Within that code, the ,@ form lets us slot in the given body. It can help to, initially, think of writing macros as writing skeletons of code where unevaluated code will slot in, and then by calling them, we can replace the call to the macro with the code the macro returns. It’s like procedural inlining. There’s a whole lot more that macros can do, and this view is inherently limiting, but it’ll do as a basic run-down if you haven’t encountered them before.

Here we’ve hardcoded the starting-index n to be 0, the predicate to be whether n is less than 3, and the incrementor to be adding 1 to n. The only point in this being a macro right now is to spare the user from wrapping their code-body in a lambda-form.

Now let’s setup the other args to be respected.

(defmacro loop-for (start-body predicate-body incrementor-body &body body)
  `(let (,start-body)
    (labels ((recurse ()
               (unless ,predicate-body
                 ,@body
                 ,incrementor-body
                 (recurse))))
      (recurse))))

CL-USER> (loop-for (i 0) (> i 2) (incf i) (print (+ i 1)))

1 
2 
3 
NIL

I’ve slotted in the start-body of (i 0) into a let form, so the index symbol is available for the other bodies as a local variable.

Now just to test whether this really does conditionally execute the given code body, we’ll define the var x as ‘not-run, then pass a for loop with a predicate that will immediately resolve to nil.

CL-USER> (let ((x 'not-run))
           (loop-for (i 3) (> i 1) (incf i) (setf x i))
           x)
NOT-RUN

The predicate failed at the start, so the code-body never ran, and by using a macro we’ve avoided the implicit applicative-order-evaluation.

That’s almost all there is to it. We could call it a day here, but there’s just one issue. If the user defined a function “recurse” themselves, it would be shadowed by the “recurse” function we defined in our labels form.

CL-USER> (defun recurse (x) (print x))
RECURSE

CL-USER> (loop-for (i 0) (> i 2) (incf i) (recurse "dog"))

(LABELS RECURSE) called with invalid number of arguments: 1
   [Condition of type SB-INT:SIMPLE-PROGRAM-ERROR]

As you can see, our globally defined recurse function that takes 1 arg doesn’t run. Instead, the recurse we defined in our labels which accepts no args runs, but we passed it an arg in our code body, thus we get an invalid-number-of-arguments error.

So, do we come up with another name? An obscure name? No symbol we hardcode is guaranteed to be unbound. Normally we could avoid name-collisions with procedurally generated functions by just using lambdas, but we’re doing recursion here, so we need some symbol that we can “goto”, some name for the function we repeatedly jump to.

Luckily, common lisp has #’gensym, a function that returns an unbound symbol, and that’s all we need to guarantee we don’t shadow anything when we run our for-loop. [1] We just need to run it once in a let surrounding the labels, and we’re good-to-go.

(defmacro loop-for (start-body predicate-body incrementor-body &body body)
  (let ((func-name (gensym)))
    `(let (,start-body)
       (labels ((,func-name ()
                  (unless ,predicate-body
                    ,@body
                    ,incrementor-body
                    (,func-name))))
         (,func-name)))))

CL-USER> (loop-for (i 0) (> i 2) (incf i) (print (1+ i)))

1 
2 
3 
NIL

And we can see what the generated code looks like by macroexpanding a quoted call to it.

CL-USER> (macroexpand-1 `(loop-for (i 0) (> i 2) (incf i) (print (1+ i))))
(LET ((I 0))
  (LABELS ((#:G590 ()
             (UNLESS (> I 2) (PRINT (1+ I)) (INCF I) (#:G590))))
    (#:G590)))

For fun, let’s compare speeds between our loop-for and a couple built-in equivalents. [2]

CL-USER> (time (loop-for (i 0) (> i 300000000) (incf i) (sqrt i)))
Evaluation took:
  0.270 seconds of real time
  0.250000 seconds of total run time (0.250000 user, 0.000000 system)
  92.59% CPU
  321,936,264 processor cycles
  0 bytes consed

CL-USER> (time (dotimes (i 300000000) (sqrt i)))
Evaluation took:
  0.273 seconds of real time
  0.265625 seconds of total run time (0.265625 user, 0.000000 system)
  97.44% CPU
  327,006,474 processor cycles
  0 bytes consed

CL-USER> (time (loop for i below 300000000 do (sqrt i)))
Evaluation took:
  0.283 seconds of real time
  0.265625 seconds of total run time (0.265625 user, 0.000000 system)
  93.99% CPU
  338,789,178 processor cycles
  0 bytes consed

Our loop-for takes roughly the same amount of time as the built-in dotimes and a basic built-in loop form, when we’re just grabbing the square-root of our loop index. Ain’t that cool?

Notes

  1. If your language doesn’t have a way to return guaranteed unbound names, a hack would be to generate a UUID.
  2. I’m using the #’time macro here which takes just one sample and doesn’t account for the OS doing other things while the code is running, so the results tend to vary a good deal. #’time isn’t good for much more than a ballpark estimate of how code compares to other code on the same machine, when that machine’s state hasn’t notably changed between comparisons. What you could do to get a less temporally-sensitive comparison is disassemble a lambda wrapping the code you wrote, and compare that. The loop-for we made disassembled into 63 bytes of assembly, while the dotimes and built-in loop form both disassembled into 61 bytes, using Steel Bank Common Lisp 2.0.0.

Leave a Reply

Your email address will not be published. Required fields are marked *