[ planet-factor ]

John Benediktsson: Faster Leap Year?

Last week, Falk Hüffner wrote about making a leap year check in three instructions:

With the following code, we can check whether a year 0 ≤ y ≤ 102499 is a leap year with only about 3 CPU instructions:

bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}

How does this work? The answer is surprisingly complex. This article explains it, mostly to have some fun with bit-twiddling; at the end, I’ll briefly discuss the practical use.

This is how a leap year check is typically implemented:

bool is_leap_year(uint32_t y) {
    if ((y % 4) != 0) return false;
    if ((y % 100) != 0) return true;
    if ((y % 400) == 0) return true;
    return false;
}

It would be fun to see how that works in Factor and compare the relative performance of a simple version against the new super-fast-highly-optimized 3-instruction version. To do that, we can build on the benchmark word to record execution time, calling the code under test repeatedly and returning an average time-per-call in nanoseconds:

TYPED: average-benchmark ( n: fixnum quot -- nanos-per-call )
    over [ '[ _ _ times ] benchmark ] dip /f ; inline

Note: We are forcing the iteration loop above to use a fixnum counter to reduce its overhead, and due to the design of the benchmark words below, we are going to have code blocks with predictable inputs. Testing your program with random inputs is also important to see the impact of CPU features such as caching and branch prediction, ideally across multiple CPU architectures. Performance is also impacted by use of code generation features such as inline and compiler steps such as dead-code elimination. Benchmarking is hard.

Simple implementation

The simple – and typical – implementation can be easily written as:

: leap-year? ( year -- ? )
    dup 100 divisor? 400 4 ? divisor? ;

And in fact, that’s how it is implemented in the standard library.
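
As a quick check in the listener, 2028 (divisible by 4 but not by 100) should come back as a leap year, while 1900 (divisible by 100 but not by 400) should not:

IN: scratchpad 2028 leap-year? .
t

IN: scratchpad 1900 leap-year? .
f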

We can write a quick benchmarking word. This ensures we are using the optimizing compiler and also asserts that the result of the word is as expected:

: bench-leap-year ( n year ? -- nanos )
     '[ _ leap-year? _ assert= ] average-benchmark ;

And then call it one hundred million times, to see how long it takes each call on average:

IN: scratchpad 100,000,000 2028 t bench-leap-year .
10.53904317

Just under 11 nanoseconds, including the loop and the assert…

Fast implementation

The fast implementation suggested by Falk can be written directly as:

: fast-leap-year? ( year -- ? )
    1073750999 * 3221352463 bitand 126976 <= ;
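
As a quick sanity check, it should agree with the simple version for a few years around a century boundary:

IN: scratchpad { 1900 2000 2024 2028 } [
                   [ leap-year? ] [ fast-leap-year? ] bi =
               ] all? .
t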

And then write a benchmarking word:

: bench-fast-leap-year ( n year ? -- nanos )
     '[ _ fast-leap-year? _ assert= ] average-benchmark ;

And see how long it takes:

IN: scratchpad 100,000,000 2028 t bench-fast-leap-year .
4.74783302

Just under 5 nanoseconds…

Faster implementation

Well, Factor generally supports arbitrarily large integers by allowing word-sized fixnum values to implicitly overflow into bignum values. And, as they say, you can write the C programming language in any language.

A faster implementation might check that the input is a fixnum and then use the unchecked fixnum math words that skip overflow detection:

TYPED: faster-leap-year? ( year: fixnum -- ? )
    1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;

And write a benchmark word:

: bench-faster-leap-year ( n year ? -- nanos )
     '[ _ faster-leap-year? _ assert= ] average-benchmark ;

It’s a bit faster:

IN: scratchpad 100,000,000 2028 t bench-faster-leap-year .
3.24267145

Just under 4 nanoseconds…

Fastest implementation

But, to make sure that we use the fewest instructions possible, we can make it slightly less safe by declaring the input to be a fixnum, avoiding the run-time type checks. This could cause issues if it is called with other types on the stack.

: fastest-leap-year? ( year -- ? )
    { fixnum } declare
    1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;

And write a benchmark word:

: bench-fastest-leap-year ( n year ? -- nanos )
     '[ _ fastest-leap-year? _ assert= ] average-benchmark ;

And then you can see it gets quite fast indeed:

IN: scratchpad 100,000,000 2028 t bench-fastest-leap-year .
2.82150476

Just under 3 nanoseconds!

But, is it also just 3 instructions?

IN: scratchpad \ fastest-leap-year? disassemble
000075f0afa19490: 89056a5bd1fe          mov [rip-0x12ea496], eax
000075f0afa19496: 498b06                mov rax, [r14]
000075f0afa19499: 48c1f804              sar rax, 0x4
000075f0afa1949d: 4869c0d7230040        imul rax, rax, 0x400023d7
000075f0afa194a4: bb0ff001c0            mov ebx, 0xc001f00f
000075f0afa194a9: 4821d8                and rax, rbx
000075f0afa194ac: 4881f800f00100        cmp rax, 0x1f000
000075f0afa194b3: b801000000            mov eax, 0x1
000075f0afa194b8: 48bb5c0e388cf0750000  mov rbx, 0x75f08c380e5c
000075f0afa194c2: 480f4ec3              cmovle rax, rbx
000075f0afa194c6: 498906                mov [r14], rax
000075f0afa194c9: 8905315bd1fe          mov [rip-0x12ea4cf], eax
000075f0afa194cf: c3                    ret

Pretty close!

There is an extra instruction near the beginning to untag our fixnum input. Due to the convention around handling booleans in Factor, there are a couple of extra instructions at the end for converting the result into a return value of either t or f.

And it could get even faster if either the assert= was removed, or the code was made inline so the function prologue and epilogue could be elided into the outer scope.
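
For example, the inline variant is just a matter of adding the declaration – a sketch I haven’t benchmarked here:

! Same body as before; inline lets the compiler elide the call overhead
! at each call site.
: fastest-leap-year? ( year -- ? )
    { fixnum } declare
    1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ; inline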

So much fun.

Wed, 21 May 2025 15:00:00

John Benediktsson: Raylib

Raylib is a very neat C library that has become popular as a “simple and easy-to-use library to enjoy videogames programming”. Originally released in 2014, it has seen lots of updates, with the latest version 5.5 representing 11 years of development since the original version 1.0 release.

You can sense the love this library has received from the extensive feature list:

  • NO external dependencies, all required libraries included with raylib
  • Multiplatform: Windows, Linux, MacOS, RPI, Android, HTML5… and more!
  • Written in plain C code (C99) using PascalCase/camelCase notation
  • Hardware accelerated with OpenGL (1.1, 2.1, 3.3, 4.3 or ES 2.0)
  • Unique OpenGL abstraction layer: rlgl
  • Powerful Fonts module (SpriteFonts, BMfonts, TTF, SDF)
  • Multiple texture formats support, including compressed formats (DXT, ETC, ASTC)
  • Full 3d support for 3d Shapes, Models, Billboards, Heightmaps and more!
  • Flexible Materials system, supporting classic maps and PBR maps
  • Animated 3d models supported (skeletal bones animation)
  • Shaders support, including Model shaders and Postprocessing shaders
  • Powerful math module for Vector, Matrix and Quaternion operations: raymath
  • Audio loading and playing with streaming support (WAV, OGG, MP3, FLAC, XM, MOD)
  • VR stereo rendering support with configurable HMD device parameters
  • Huge examples collection with +120 code examples!
  • Bindings to +60 programming languages!
  • Free and open source. Check [LICENSE].

In 2019, we first included support for version 2.5 of raylib. Over the years, we have updated our bindings to include new functions and features of new versions. Our most recent two releases, Factor 0.99 and Factor 0.100, included support for version 4.5. And in the current development cycle, we have updated to version 5.0 and then again to version 5.5.

It is possible to maintain support for multiple versions, but we have chosen for now to just target the most recent stable release as best we can. Generally, the raylib library has been quite stable, with the occasional deprecation or breaking change that has been easy to adjust for.

As a simple demonstration, we can start with the basic window example – an extremely simple program in the spirit of achieving a black triangle moment – to make sure everything works. We can directly translate this example into Factor using our Raylib bindings – which import the C functions using our convention for kebab-case word names.

USE: raylib

: basic-window ( -- )

    800 450 "raylib [core] example - basic window" init-window

    60 set-target-fps

    [ window-should-close ] [

        begin-drawing

        RAYWHITE clear-background

        "Congrats! You created your first window!"
        190 200 20 LIGHTGRAY draw-text

        end-drawing

    ] until close-window ;

And, if you run the example, you’ll end up with a simple window displaying the congratulations text.

Note: if you’re wondering why the background isn’t white, it’s because RAYWHITE is a special version of not-quite-white that matches the Raylib logo.

Of course, your examples can become more complex. For example, Joseph Oziel released a fun game called BitGuessr written in Factor using Raylib. And, given the extensive feature list, your simple program written quickly for a game jam can end up including great audio, images, keyboard, mouse, gamepad, 3d rendering, and other features with relatively little effort!

I’d love to see more demos written in Factor and Raylib. Happy coding!

Tue, 20 May 2025 15:00:00

John Benediktsson: Shuffle Syntax

Some might describe shuffle words as one of the fundamental building blocks of Factor. Others might describe them as a code smell and seek to use dataflow combinators or other higher-level words to reduce code complexity.

Whatever your opinion is, they are useful concepts in a concatenative language. Besides the basic shuffle words – like dup, swap, and rot – we have long had the shuffle vocabulary, which provides some “additional shuffle words”, as well as a syntax word that can perform arbitrary shuffles:

IN: scratchpad USE: shuffle

IN: scratchpad { 1 2 3 4 5 } [
                   shuffle( a b c d e -- d e c a b )
               ] with-datastack .
{ 4 5 3 1 2 }

This would be quite useful, except that it has had a fundamental issue – it is implemented as a macro that curries the stack arguments into an array, and then pulls the arguments back out of the array onto the stack in the requested order.

For example, we can look at a simple swap and a complex shuffle:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[
    2 0 <array> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot R> >R R> >R R> 2 slot
]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    5 0 <array> dup >R >R R> 6 set-slot
    R> >R R> dup >R >R R> 5 set-slot
    R> >R R> dup >R >R R> 4 set-slot
    R> >R R> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot
    R> dup >R >R R> >R R> 2 slot R> dup >R >R R> >R R> 5 slot
    R> dup >R >R R> >R R> 4 slot R> >R R> >R R> 6 slot
]

Not only is this less than efficient, it also turns literal arguments that were on the stack into run-time arguments, potentially causing a Cannot apply 'call' to a run-time computed value error if one of the shuffled arguments is a quotation the caller hopes to call.

This bug was described on our issue tracker and I spent some time recently looking into it.

It turns out that we can use the stack checker to indicate that a shuffle is taking place, and use some "special" machinery to allow the optimizing compiler to generate efficient and correct code for these arbitrary shuffles.

After applying a small fix, we can see that the earlier examples are now quite simple:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[ swap ]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    ( 10791205 10791206 10791207 10791208 10791209 -- 10791206 10791205 10791208 10791207 10791209 )
]

This is available in the latest development version.

Sun, 18 May 2025 15:00:00

John Benediktsson: Even More Brainf*ck

I was distracted a little by some recent explorations building a Brainfuck interpreter in Factor and had a couple of follow-ups to add to the conversation.

First, I realized my initial quick-and-dirty Brainfuck interpreter didn’t support nested loops. Specifically, the logic for beginning or ending a loop just searched for the nearest [ or ] character without considering nesting. This was fixed today so that will no longer be an issue.

Second, despite the Brainfuck compiler implicitly making an AST (abstract syntax tree) for Brainfuck by virtue of generating a quotation, I thought it would be more fun to build one intentionally.

We can model the Brainfuck commands as operations using the following tuples and singletons:

TUPLE: ptr n ;
TUPLE: mem n ;
SINGLETONS: output input debug ;
TUPLE: loop ops ;

Next, we can build a parser using EBNF to convert the textual commands into our Brainfuck AST:

EBNF: ast-brainfuck [=[

inc-ptr  = (">")+     => [[ length ptr boa ]]
dec-ptr  = ("<")+     => [[ length neg ptr boa ]]
inc-mem  = ("+")+     => [[ length mem boa ]]
dec-mem  = ("-")+     => [[ length neg mem boa ]]
output   = "."        => [[ output ]]
input    = ","        => [[ input ]]
debug    = "#"        => [[ debug ]]
space    = [ \t\n\r]+ => [[ f ]]
unknown  = (.)        => [[ "Invalid input" throw ]]

ops   = inc-ptr|dec-ptr|inc-mem|dec-mem|output|input|debug|space
loop  = "[" {loop|ops}+ "]" => [[ second sift loop boa ]]

code  = (loop|ops|unknown)* => [[ sift ]]

]=]

This is interesting, because now we can more easily analyze a piece of Brainfuck code, such as the Hello, World example that I have been frequently using:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " ast-brainfuck .
V{
    T{ mem { n 10 } }
    T{ loop
        { ops
            V{
                T{ ptr { n 1 } }
                T{ mem { n 7 } }
                T{ ptr { n 1 } }
                T{ mem { n 10 } }
                T{ ptr { n 1 } }
                T{ mem { n 3 } }
                T{ ptr { n 1 } }
                T{ mem { n 1 } }
                T{ ptr { n -4 } }
                T{ mem { n -1 } }
            }
        }
    }
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ mem { n 7 } }
    output
    output
    T{ mem { n 3 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n -2 } }
    T{ mem { n 15 } }
    output
    T{ ptr { n 1 } }
    output
    T{ mem { n 3 } }
    output
    T{ mem { n -6 } }
    output
    T{ mem { n -8 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ ptr { n 1 } }
    output
}

And then we can implement those operations against a brainfuck state object by deferring to words from our current implementation:

GENERIC: op ( brainfuck op -- brainfuck )
M: ptr op n>> (>) ;
M: mem op n>> (+) ;
M: output op drop (.) ;
M: input op drop (,) ;
M: debug op drop (#) ;
M: loop op [ get-memory zero? ] swap ops>> '[ _ [ op ] each ] until ;

And now this Brainfuck AST represents a hybrid execution model somewhere between the compiled and interpreted versions:

: hybrid-brainfuck ( code -- )
    [ <brainfuck> ] dip ast-brainfuck [ op ] each drop ;

And see that it works:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " hybrid-brainfuck
Hello World!

We also gain some potential for building code optimization techniques that operate on an AST as a step before actual compilation or execution – for example, coalescing adjacent increment and decrement operations or some other more complex analysis.

That, however, is likely to remain an exercise for the reader!
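
For anyone who wants a head start on that exercise, here is a rough, untested sketch of a coalescing pass over the AST above – it assumes the tuple classes defined earlier are in scope, and the word names combine?, combine, and coalesce are my own. Loops are left untouched rather than rewritten recursively:

USING: accessors classes classes.tuple kernel math sequences ;

! Two operations can be merged if they are both ptr or both mem.
: combine? ( op1 op2 -- ? )
    [ class-of ] bi@ 2dup = [ drop { ptr mem } member? ] [ 2drop f ] if ;

! Merge two operations of the same class by summing their counts.
: combine ( op1 op2 -- op )
    [ [ n>> ] bi@ + ] [ drop class-of ] 2bi boa ;

! Fold the ops into a vector, merging each op into the previous one
! whenever possible.
: coalesce ( ops -- ops' )
    V{ } clone [
        2dup [ ?last ] dip combine?
        [ [ dup pop ] dip combine over push ] [ over push ] if
    ] reduce ;

Since the EBNF rules above already merge unbroken runs of the same character, this mostly matters when whitespace splits a run – like the +++++ +++ in the hello example from the previous post – or when other rewrites create new adjacencies.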

Sat, 17 May 2025 15:00:00

John Benediktsson: More Brainf*ck

Almost 16 years ago, I wrote about implementing the Brainfuck programming language in Factor. It is a curious language, sometimes considered the most famous of the esoteric programming languages.

In any event – and encouraged by a question I was asked recently – I spent some time thinking about the current process of “compiling” Brainfuck into quotations versus how an interpreter might work instead.

As a quick reminder, our current implementation expands a program written in Brainfuck into an equivalent form in Factor, and then allows it to be run:

IN: scratchpad USE: brainfuck

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " run-brainfuck
Hello World!


IN: scratchpad [
                   "
                   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
                   >++.>+.+++++++..+++.>++.<<+++++++++++++++
                   .>.+++.------.--------.>+.>.
                   " run-brainfuck
               ] expand-macros .
[
    <brainfuck> 10 (+)
    [ get-memory zero? ] [
        1 (>) 7 (+) 1 (>) 10 (+) 1 (>) 3 (+) 1 (>) 1 (+) 4 (<)
        1 (-)
    ] until 1 (>) 2 (+) (.) 1 (>) 1 (+) (.) 7 (+) (.) (.) 3 (+)
    (.) 1 (>) 2 (+) (.) 2 (<) 15 (+) (.) 1 (>) (.) 3 (+)
    (.) 6 (-) (.) 8 (-) (.) 1 (>) 1 (+) (.) 1 (>) (.) drop flush
]

Notice the coalescing that it performs to collapse multiple identical operators into a single action.

But, this got me curious about a couple of things:

  1. How could we cleanly write a Factor word that is implemented in Brainfuck?
  2. How could we write a Factor interpreter for Brainfuck, and what benefits would it have?

Building a syntax word

Thankfully, the answer to the first question is simple using parsing words.

SYNTAX: BRAINFUCK:
    scan-new-word ";" parse-tokens concat
    '[ _ run-brainfuck ] ( -- ) define-declared ;

Now, we can define a Hello, World, complete with inline comments:

BRAINFUCK: hello
    +++++ +++               ! Set Cell #0 to 8
    [
        >++++               ! Add 4 to Cell #1; this will always set Cell #1 to 4
        [                   ! as the cell will be cleared by the loop
            >++             ! Add 4*2 to Cell #2
            >+++            ! Add 4*3 to Cell #3
            >+++            ! Add 4*3 to Cell #4
            >+              ! Add 4 to Cell #5
            <<<<-           ! Decrement the loop counter in Cell #1
        ]                   ! Loop till Cell #1 is zero
        >+                  ! Add 1 to Cell #2
        >+                  ! Add 1 to Cell #3
        >-                  ! Subtract 1 from Cell #4
        >>+                 ! Add 1 to Cell #6
        [<]                 ! Move back to the first zero cell you find; this will
                            ! be Cell #1 which was cleared by the previous loop
        <-                  ! Decrement the loop Counter in Cell #0
    ]                       ! Loop till Cell #0 is zero

    ! The result of this is:
    ! Cell No :   0   1   2   3   4   5   6
    ! Contents:   0   0  72 104  88  32   8
    ! Pointer :   ^

    >>.                     ! Cell #2 has value 72 which is 'H'
    >---.                   ! Subtract 3 from Cell #3 to get 101 which is 'e'
    +++++ ++..+++.          ! Likewise for 'llo' from Cell #3
    >>.                     ! Cell #5 is 32 for the space
    <-.                     ! Subtract 1 from Cell #4 for 87 to give a 'W'
    <.                      ! Cell #3 was set to 'o' from the end of 'Hello'
    +++.----- -.----- ---.  ! Cell #3 for 'rl' and 'd'
    >>+.                    ! Add 1 to Cell #5 gives us an exclamation point
    >++.                    ! And finally a newline from Cell #6
    ;

And, it works!

IN: scratchpad hello
Hello World!

Note: we are using ! as a comment character which is the convention in Factor. Some Brainfuck implementations use that character to indicate embedded program inputs.

That’s pretty cool, and a neat example of using the lexer, the parser, and macros.

Building an interpreter

The answer to the second question might be more complex and nuanced, but thankfully we can re-use some of the current implementation to make a quick-and-dirty interpreter:

: end-loop ( str i -- str j/f )
    CHAR: ] swap pick index-from dup [ 1 + ] when ;

: start-loop ( str i -- str j/f )
    1 - CHAR: [ swap pick last-index-from dup [ 1 + ] when ;

: interpret-brainfuck-from ( str i brainfuck -- str next/f brainfuck )
    2over swap ?nth [ 1 + ] 2dip {
        { CHAR: > [ 1 (>) ] }
        { CHAR: < [ 1 (<) ] }
        { CHAR: + [ 1 (+) ] }
        { CHAR: - [ 1 (-) ] }
        { CHAR: . [ (.) ] }
        { CHAR: , [ (,) ] }
        { CHAR: # [ (#) ] }
        { CHAR: [ [ get-memory zero? [ [ end-loop ] dip ] when ] }
        { CHAR: ] [ get-memory zero? [ [ start-loop ] dip ] unless ] }
        { f [ [ drop f ] dip ] }
        [ blank? [ "Invalid input" throw ] unless ]
    } case ;

: interpret-brainfuck ( str -- )
    0 <brainfuck> [ interpret-brainfuck-from over ] loop 3drop ;

And give it a try:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " interpret-brainfuck
Hello World!

It works! But now I’m curious about relative performance. Let’s build a silly benchmark equivalent to cat to redirect the contents of an input-stream to an output-stream. We can compare our compiled run-brainfuck macro to a version that uses the interpreter we made above, and then to a native version implemented using stream-copy*.

: cat1 ( -- ) ",[.,]" run-brainfuck ;

: cat2 ( -- ) ",[.,]" interpret-brainfuck ;

: cat3 ( -- ) input-stream get output-stream get stream-copy* ;

First, we should make sure they all work:

IN: scratchpad "Compiled" [ cat1 ] with-string-reader
Compiled

IN: scratchpad "Interpreted" [ cat2 ] with-string-reader
Interpreted

IN: scratchpad "Factor" [ cat3 ] with-string-reader
Factor

Okay, so it seems to work!

For quick performance testing, let’s compare them outputting to a null stream:

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat1 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.059820291 seconds

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat2 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.13840325 seconds

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat3 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.015008417 seconds

The compiled one is a bit more than 2x faster than the interpreted version, but both are slower than the native version.

Let’s try comparing our “Hello, World” example – where the operator coalescing that the compiled version does might help:

: hello1 ( -- )
   "
   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
   >++.>+.+++++++..+++.>++.<<+++++++++++++++
   .>.+++.------.--------.>+.>.
   " run-brainfuck ;

: hello2 ( -- )
   "
   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
   >++.>+.+++++++..+++.>++.<<+++++++++++++++
   .>.+++.------.--------.>+.>.
   " interpret-brainfuck ;

We can see that the compiled one is now more like 7x faster:

IN: scratchpad [ [ 10,000 [ hello1 ] times ] with-null-writer ] time
Running time: 0.018075292 seconds

IN: scratchpad [ [ 10,000 [ hello2 ] times ] with-null-writer ] time
Running time: 0.133718416 seconds

Obviously, this is somewhat comparing apples and oranges because it’s ignoring compilation time in the comparison, and I didn’t spend any time on optimizing the interpreted version – for example, stripping blanks or validating inputs before doing the interpreter loop – but it’s a useful starting point for understanding tradeoffs.

How long does it currently take to compile?

IN: scratchpad [
                    [
                       gensym [
                           "
                           ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
                           >++.>+.+++++++..+++.>++.<<+++++++++++++++
                           .>.+++.------.--------.>+.>.
                           " run-brainfuck
                       ] ( -- ) define-declared
                   ] with-compilation-unit
               ] time
Running time: 0.02294075 seconds

…a bit more time than it takes to call it 10,000 times. Interesting.

Fri, 16 May 2025 15:00:00

John Benediktsson: Roman Sort

Factor has included support for Roman numerals since at least 2008. Sometimes recent events – such as the election of Pope Leo XIV or Super Bowl LIX – revive interest in how they work and how to work with them computationally.

There was a blog post a few days ago about sorting Roman numerals which pointed out that sorting them alphabetically worked pretty well. Given that we have a pretty good roman vocabulary, I thought we could explore different methods of sorting a sequence of strings representing Roman numerals.

Let’s first try the mostly-but-not-entirely-correct method suggested in the blog post:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad sort .
{ "I" "II" "III" "IV" "IX" "V" "VI" "VII" "VIII" }

Well, that’s almost correct, but the number IX – the number 9 – sorts in the middle, rather than at the end. Could we fix this by using sort-by to convert the string to a number before calling compare to produce a sorted output?

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

That fixes it, but how often do we end up calling the conversion function?

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad SYMBOL: calls

IN: scratchpad [ calls inc roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

IN: scratchpad calls get .
40

Wow, 40 times – that seems like a lot! That is because sort-by recomputes the key for both elements of every comparison, rather than once per element.

Perhaps we could try the Schwartzian transform – also known as decorate-sort-undecorate – at the expense of using intermediate storage by saving the keys for each element, then sorting, and then returning only the values:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ [ roman> ] keep ] map>alist sort-keys values .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

That seems like a lot of code to do a simple thing. Instead, we can use the map-sort abstraction which implements the same method:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ roman> ] map-sort .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

Does this make much of a difference? Let’s compare each method:

: roman-sort1 ( seq -- sorted-seq ) [ roman> ] sort-by ;

: roman-sort2 ( seq -- sorted-seq ) [ roman> ] map-sort ;

We can time sorting a list of 100,000 random Roman numerals for numbers up to 1,000:

IN: scratchpad 100,000 1,000 [1..b] randoms [ >ROMAN ] map

IN: scratchpad [ [ roman-sort1 ] time ]
               [ [ roman-sort2 ] time ] bi
Running time: 4.164076625 seconds
Running time: 0.154227583 seconds

You can see that the decorate-sort-undecorate pattern is quite a bit faster in this case. This is not always true, but generally depends on how much work the key function is doing.

Mon, 12 May 2025 15:00:00

John Benediktsson: Recamán’s Sequence

I love reading John D. Cook’s blog which covers various mathematical concepts, code snippets, and other curious explorations. Today, he celebrated his 5,000th post which is a pretty incredible milestone. In honor of that, let’s implement Recamán’s sequence, which he wrote about yesterday:

I recently ran into Recamán’s sequence. N. J. A. Sloane, the founder of the Online Encyclopedia of Integer Sequences calls Recamán’s sequence one of his favorites. It’s sequence A005132 in the OEIS.

This sequence starts at 0 and the nth number in the sequence is the result of moving forward or backward n steps from the previous number. You are allowed to move backward if the result is positive and a number you haven’t already visited. Otherwise you move forward.

Here’s Python code to generate the first N elements of the sequence.

def recaman(N):
    a = [0]*N
    for n in range(1, N):
        proposal = a[n-1] - n
        if proposal > 0 and proposal not in set(a):
            a[n] = proposal
        else:
            a[n] = a[n-1] + n
    return a

For example, recaman(10) returns

[0, 1, 3, 6, 2, 7, 13, 20, 12, 21]

Direct Implementation

The code for this “subtract if you can, add if you can’t” sequence is not particularly complex, but it is often fun to see what an algorithm looks like when implemented in a different language. So, we’ll try a direct implementation in Factor first and then talk about some variations.

Using local variables, we can keep our version very similar to the original:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth n - :> proposal
        proposal 0 > proposal a member? not and [
            proposal n a set-nth
        ] [
            n 1 - a nth n + n a set-nth
        ] if
    ] each a ;

And, show that it works:

IN: scratchpad 10 recaman .
{ 0 1 3 6 2 7 13 20 12 21 }

Short-Circuit Logic

One subtle difference is that Python’s and operator is short-circuiting, so if the first expression returns False, the second expression is not evaluated. Factor’s and word evaluates both of its arguments first, but we can make that change ourselves:

- proposal 0 > proposal a member? not and [
+ proposal 0 > [ proposal a member? not ] [ f ] if [

Or use short-circuit combinators to do that for us:

- proposal 0 > proposal a member? not and [
+ proposal { [ 0 > ] [ a member? not ] } 1&& [

Reduce Repetitions

We can also reduce some repetition – for example, both branches of the if set the nth value of the array. In addition, we can keep the proposal on the stack instead of naming it as a local variable, and then just replace it when the condition evaluates to f.

Incorporating these changes results in this version:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth n -
        dup { [ 0 > ] [ a member? not ] } 1&& [
            drop n 1 - a nth n +
        ] unless n a set-nth
    ] each a ;

There is still repetition, as each iteration looks up the previous value twice. We could fix that by storing that value in a local variable:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth :> prev
        prev n - dup { [ 0 > ] [ a member? not ] } 1&&
        [ drop prev n + ] unless n a set-nth
    ] each a ;

Sometimes it is simpler to extract the inner logic into its own word and then see how it looks:

:: next-recaman ( prev n a -- next )
    prev n - dup { [ 0 > ] [ a member? not ] } 1&&
    [ drop prev n + ] unless ;

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [
        [ 1 - a nth ] [ a next-recaman ] [ a set-nth ] tri
    ] each a ;

The retrieval of the previous value is unnecessary work on each iteration. We could remove that by keeping the previous value on the stack and using the make vocabulary to accumulate values. We also start from index 0 instead of 1 as a simplification:

:: next-recaman ( prev n -- next )
    prev n - dup { [ 0 > ] [ building get member? not ] } 1&&
    [ drop prev n + ] unless ;

:: recaman ( N -- seq )
    [ 0 N <iota> [ next-recaman dup , ] each drop ] { } make ;
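
And check that it still produces the same first ten values:

IN: scratchpad 10 recaman .
{ 0 1 3 6 2 7 13 20 12 21 }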

Constant-time Lookups

But, there is a big performance issue: checking whether a proposed value has been generated before searches the previous values with a linear scan. It would be much faster to track previously seen numbers in a bit-vector, giving constant-time lookups:

:: next-recaman ( prev n seen -- next )
    prev n - dup { [ 0 > ] [ seen nth not ] } 1&&
    [ drop prev n + ] unless t over seen set-nth ;

:: recaman ( N -- seq )
    0 N dup <bit-vector> '[ _ next-recaman dup ] map-integers nip ;

This allows us to generate, for example, one million values in the sequence in a small amount of time:

IN: scratchpad [ 1,000,000 recaman last . ] time
1057164

Running time: 0.049441708 seconds

Or even calculate the nth value without saving the sequence:

:: nth-recaman ( N -- elt )
    0 N dup <bit-vector> '[ _ next-recaman ] each-integer ;
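
For example, reproducing the earlier result without building the sequence:

IN: scratchpad 1,000,000 nth-recaman .
1057164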

Using Generators

We could even use the generators vocabulary to make an infinite stream that we can sample from:

:: next-recaman ( prev n seen -- next )
    prev n - dup { [ 0 > ] [ seen nth not ] } 1&&
    [ drop prev n + ] unless t over seen set-nth ;

GEN: recaman ( -- gen )
    0 0 ?V{ } clone '[
        [ _ next-recaman dup yield ] [ 1 + ] bi t
    ] loop 2drop ;

And then take some values from it:

IN: scratchpad recaman 10 take .
{ 0 1 3 6 2 7 13 20 12 21 }

It is often pleasantly surprising to explore seemingly simple topics – for example, looking at this algorithm and thinking about different time complexity, data structures, or programming language libraries that might be useful.

I have added this today to the Factor development version.

Tue, 6 May 2025 17:00:00

John Benediktsson: Filtering Errors

There was a discussion today on the Factor Discord server about how one might filter or reject objects in a sequence based on whether a quotation throws an error or not when applied to each item. We’re going to implement that now in Factor and hopefully learn a few things.

First, we need a way to see if a quotation throws an error or not. One way would be to call the quotation and use recover to return a different boolean if an error was thrown. That might look something like this:

: throws? ( quot -- ? )
    '[ @ f ] [ drop t ] recover ; inline

And that kinda works:

IN: scratchpad 10 20 [ + ] throws?

--- Data stack:
30
f

IN: scratchpad 10 "20" [ + ] throws?

--- Data stack:
10
"20"
t

But you can see that in the first, working case, no error was thrown and the stack holds the single output of the quotation plus f, while in the second, failing case it holds the two inputs to the quotation plus a t indicating that an error was thrown.

So, ideally we could modify this so that when the quotation succeeds we drop its outputs, and when it fails we drop its inputs, consistently producing a single boolean output for any quotation it is applied to.

Luckily for us, we can use the drop-outputs and drop-inputs words, which infer the stack effect and handle each case correctly. This changes our implementation slightly:

: throws? ( quot -- ? )
    [ '[ _ drop-outputs f ] ]
    [ '[ drop _ drop-inputs t ] ] bi recover ; inline

And now we can see that it handles both cases nicely:

IN: scratchpad 10 20 [ + ] throws?

--- Data stack:
f

IN: scratchpad 10 "20" [ + ] throws?

--- Data stack:
t

Now, we can combine that with our sequence operations and make these useful words:

: filter-errors ( ... seq quot -- ... subseq )
    '[ _ throws? ] filter ; inline

: reject-errors ( ... seq quot -- ... subseq )
    '[ _ throws? ] reject ; inline

And then see how they work:

IN: scratchpad { t "5" 123 } [ string>number ] filter-errors .
{ t 123 }

IN: scratchpad { t "5" 123 } [ string>number ] reject-errors .
{ "5" }

These words were added recently to the development version of Factor.

Mon, 5 May 2025 15:00:00

planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.
