[ planet-factor ]

John Benediktsson: Shuffle Syntax

Some might describe shuffle words as one of the fundamental building blocks of Factor. Others might describe them as a code smell and seek to use dataflow combinators or other higher-level words to reduce code complexity.

Whatever your opinion is, they are useful concepts in a concatenative language. Besides the basic shuffle words – like dup, swap, rot – we have long had the shuffle vocabulary, which provides some “additional shuffle words” as well as a syntax word that can perform arbitrary shuffles:

IN: scratchpad USE: shuffle

IN: scratchpad { 1 2 3 4 5 } [
                   shuffle( a b c d e -- d e c a b )
               ] with-datastack .
{ 4 5 3 1 2 }

This would be quite useful, except that it has had a fundamental issue – the implementation uses a macro to curry the stack arguments into an array, and then pulls them back out of the array onto the stack in the requested order.

For example, we can look at a simple swap and a complex shuffle:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[
    2 0 <array> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot R> >R R> >R R> 2 slot
]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    5 0 <array> dup >R >R R> 6 set-slot
    R> >R R> dup >R >R R> 5 set-slot
    R> >R R> dup >R >R R> 4 set-slot
    R> >R R> dup >R >R R> 3 set-slot
    R> >R R> dup >R >R R> 2 set-slot
    R> >R R> dup >R >R R> >R R> 3 slot
    R> dup >R >R R> >R R> 2 slot R> dup >R >R R> >R R> 5 slot
    R> dup >R >R R> >R R> 4 slot R> >R R> >R R> 6 slot
]

Not only would this be less than efficient, it would also turn literal arguments that were on the stack into run-time values, potentially causing a Cannot apply 'call' to a run-time computed value error if one of the shuffled arguments is a quotation the caller intends to call.
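To make the failure mode concrete, here is a hypothetical word – my own example, not taken from the issue report – where a literal quotation passes through the shuffle before being called; with the macro-based implementation, compiling a word like this would trip over exactly that error, because the [ + ] literal comes back out of the temporary array as a run-time value:

: add-reordered ( -- x )
    1 2 [ + ] shuffle( a b q -- b a q ) call ;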

This bug was described on our issue tracker and I spent some time recently looking into it.

It turns out that we can use the stack checker to indicate that a shuffle is taking place, and use some "special" machinery to allow the optimizing compiler to generate efficient and correct code for these arbitrary shuffles.

After applying a small fix, we can see that the earlier examples are now quite simple:

IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[ swap ]

IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
    ( 10791205 10791206 10791207 10791208 10791209 -- 10791206 10791205 10791208 10791207 10791209 )
]

This is available in the latest development version.

Sun, 18 May 2025 15:00:00

John Benediktsson: Even More Brainf*ck

I was distracted a little by some recent explorations building a Brainfuck interpreter in Factor and had a couple of follow-ups to add to the conversation.

First, I realized my initial quick-and-dirty Brainfuck interpreter didn’t support nested loops. Specifically, the logic for beginning or ending a loop just searched for the nearest [ or ] character without considering nesting. This was fixed today so that will no longer be an issue.

Second, although the Brainfuck compiler implicitly creates an AST (abstract syntax tree) for Brainfuck by virtue of generating a quotation, I thought it would be more fun to build one intentionally.

We can model the Brainfuck commands as operations using the following tuples and singletons:

TUPLE: ptr n ;
TUPLE: mem n ;
SINGLETONS: output input debug ;
TUPLE: loop ops ;

Next, we can build a parser using EBNF to convert the textual commands into our Brainfuck AST:

EBNF: ast-brainfuck [=[

inc-ptr  = (">")+     => [[ length ptr boa ]]
dec-ptr  = ("<")+     => [[ length neg ptr boa ]]
inc-mem  = ("+")+     => [[ length mem boa ]]
dec-mem  = ("-")+     => [[ length neg mem boa ]]
output   = "."        => [[ output ]]
input    = ","        => [[ input ]]
debug    = "#"        => [[ debug ]]
space    = [ \t\n\r]+ => [[ f ]]
unknown  = (.)        => [[ "Invalid input" throw ]]

ops   = inc-ptr|dec-ptr|inc-mem|dec-mem|output|input|debug|space
loop  = "[" {loop|ops}+ "]" => [[ second sift loop boa ]]

code  = (loop|ops|unknown)* => [[ sift ]]

]=]

This is interesting, because now we can more easily analyze a piece of Brainfuck code, such as the Hello, World example that I have been frequently using:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " ast-brainfuck .
V{
    T{ mem { n 10 } }
    T{ loop
        { ops
            V{
                T{ ptr { n 1 } }
                T{ mem { n 7 } }
                T{ ptr { n 1 } }
                T{ mem { n 10 } }
                T{ ptr { n 1 } }
                T{ mem { n 3 } }
                T{ ptr { n 1 } }
                T{ mem { n 1 } }
                T{ ptr { n -4 } }
                T{ mem { n -1 } }
            }
        }
    }
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ mem { n 7 } }
    output
    output
    T{ mem { n 3 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 2 } }
    output
    T{ ptr { n -2 } }
    T{ mem { n 15 } }
    output
    T{ ptr { n 1 } }
    output
    T{ mem { n 3 } }
    output
    T{ mem { n -6 } }
    output
    T{ mem { n -8 } }
    output
    T{ ptr { n 1 } }
    T{ mem { n 1 } }
    output
    T{ ptr { n 1 } }
    output
}

And then we can implement those operations against a brainfuck state object, by deferring to words from our current implementation:

GENERIC: op ( brainfuck op -- brainfuck )
M: ptr op n>> (>) ;
M: mem op n>> (+) ;
M: output op drop (.) ;
M: input op drop (,) ;
M: debug op drop (#) ;
M: loop op [ get-memory zero? ] swap ops>> '[ _ [ op ] each ] until ;

And now this Brainfuck AST represents a hybrid execution model somewhere between the compiled and interpreted versions:

: hybrid-brainfuck ( code -- )
    [ <brainfuck> ] dip ast-brainfuck [ op ] each drop ;

And see that it works:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " hybrid-brainfuck
Hello World!

We also gain some potential for building code optimization techniques that operate on an AST as a step before actual compilation or execution – for example, coalescing adjacent increment and decrement operations or some other more complex analysis.
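As a taste of the simplest such pass, here is a hedged sketch – my own word definitions, not part of the brainfuck vocabulary – that merges runs of adjacent ptr operations and adjacent mem operations at a single level of the AST (assuming the accessors, classes, classes.tuple, combinators.short-circuit, kernel, and sequences vocabularies are available):

! Two operations can be merged if they are both pointer moves
! or both memory increments.
: combine? ( op1 op2 -- ? )
    { [ [ ptr? ] both? ] [ [ mem? ] both? ] } 2|| ;

! Merge two compatible operations by summing their counts.
: combine ( op1 op2 -- op )
    [ [ n>> ] bi@ + ] [ drop class-of ] 2bi boa ;

! Either fold the incoming op into the last accumulated op, or
! push it unchanged. Loop bodies are left untouched here;
! recursing into them is left as a further exercise.
:: (coalesce) ( accum op -- accum )
    accum ?last :> prev
    prev [ prev op combine? ] [ f ] if
    [ accum pop* prev op combine accum push ]
    [ op accum push ] if accum ;

: coalesce-ops ( ops -- ops' )
    V{ } clone [ (coalesce) ] reduce ;

The EBNF rules above already merge runs of the same character, but a fragment like +-+ still parses into three separate mem operations, which this pass folds into a single T{ mem { n 1 } }.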

Anything more ambitious, however, is likely to remain an exercise for the reader!

Sat, 17 May 2025 15:00:00

John Benediktsson: More Brainf*ck

Almost 16 years ago, I wrote about implementing the Brainfuck programming language in Factor. It is a curious language, sometimes considered one of the most famous esoteric programming languages.

In any event – and encouraged by a question I was asked recently – I spent some time thinking about the current process of “compiling” the Brainfuck into quotations versus how an interpreter might work instead.

As a quick reminder, our current implementation expands a program written in Brainfuck into an equivalent form in Factor, and then allows it to be run:

IN: scratchpad USE: brainfuck

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " run-brainfuck
Hello World!


IN: scratchpad [
                   "
                   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
                   >++.>+.+++++++..+++.>++.<<+++++++++++++++
                   .>.+++.------.--------.>+.>.
                   " run-brainfuck
               ] expand-macros .
[
    <brainfuck> 10 (+)
    [ get-memory zero? ] [
        1 (>) 7 (+) 1 (>) 10 (+) 1 (>) 3 (+) 1 (>) 1 (+) 4 (<)
        1 (-)
    ] until 1 (>) 2 (+) (.) 1 (>) 1 (+) (.) 7 (+) (.) (.) 3 (+)
    (.) 1 (>) 2 (+) (.) 2 (<) 15 (+) (.) 1 (>) (.) 3 (+)
    (.) 6 (-) (.) 8 (-) (.) 1 (>) 1 (+) (.) 1 (>) (.) drop flush
]

Notice the coalescing that it performs to collapse multiple identical operators into a single action.

But, this got me curious about a couple of things:

  1. How could we cleanly write a Factor word that is implemented in Brainfuck?
  2. How could we write a Factor interpreter for Brainfuck, and what benefits would it have?

Building a syntax word

Thankfully, the answer to the first question is simple using parsing words.

SYNTAX: BRAINFUCK:
    scan-new-word ";" parse-tokens concat
    '[ _ run-brainfuck ] ( -- ) define-declared ;

Now, we can define a Hello, World, complete with inline comments:

BRAINFUCK: hello
    +++++ +++               ! Set Cell #0 to 8
    [
        >++++               ! Add 4 to Cell #1; this will always set Cell #1 to 4
        [                   ! as the cell will be cleared by the loop
            >++             ! Add 4*2 to Cell #2
            >+++            ! Add 4*3 to Cell #3
            >+++            ! Add 4*3 to Cell #4
            >+              ! Add 4 to Cell #5
            <<<<-           ! Decrement the loop counter in Cell #1
        ]                   ! Loop till Cell #1 is zero
        >+                  ! Add 1 to Cell #2
        >+                  ! Add 1 to Cell #3
        >-                  ! Subtract 1 from Cell #4
        >>+                 ! Add 1 to Cell #6
        [<]                 ! Move back to the first zero cell you find; this will
                            ! be Cell #1 which was cleared by the previous loop
        <-                  ! Decrement the loop Counter in Cell #0
    ]                       ! Loop till Cell #0 is zero

    ! The result of this is:
    ! Cell No :   0   1   2   3   4   5   6
    ! Contents:   0   0  72 104  88  32   8
    ! Pointer :   ^

    >>.                     ! Cell #2 has value 72 which is 'H'
    >---.                   ! Subtract 3 from Cell #3 to get 101 which is 'e'
    +++++ ++..+++.          ! Likewise for 'llo' from Cell #3
    >>.                     ! Cell #5 is 32 for the space
    <-.                     ! Subtract 1 from Cell #4 for 87 to give a 'W'
    <.                      ! Cell #3 was set to 'o' from the end of 'Hello'
    +++.----- -.----- ---.  ! Cell #3 for 'rl' and 'd'
    >>+.                    ! Add 1 to Cell #5 gives us an exclamation point
    >++.                    ! And finally a newline from Cell #6
    ;

And, it works!

IN: scratchpad hello
Hello World!

Note: we are using ! as a comment character which is the convention in Factor. Some Brainfuck implementations use that character to indicate embedded program inputs.

That’s pretty cool, and a neat example of using the lexer, the parser, and macros.

Building an interpreter

The answer to the second question might be more complex and nuanced, but thankfully we can re-use some of the current implementation to make a quick-and-dirty interpreter:

: end-loop ( str i -- str j/f )
    CHAR: ] swap pick index-from dup [ 1 + ] when ;

: start-loop ( str i -- str j/f )
    1 - CHAR: [ swap pick last-index-from dup [ 1 + ] when ;

: interpret-brainfuck-from ( str i brainfuck -- str next/f brainfuck )
    2over swap ?nth [ 1 + ] 2dip {
        { CHAR: > [ 1 (>) ] }
        { CHAR: < [ 1 (<) ] }
        { CHAR: + [ 1 (+) ] }
        { CHAR: - [ 1 (-) ] }
        { CHAR: . [ (.) ] }
        { CHAR: , [ (,) ] }
        { CHAR: # [ (#) ] }
        { CHAR: [ [ get-memory zero? [ [ end-loop ] dip ] when ] }
        { CHAR: ] [ get-memory zero? [ [ start-loop ] dip ] unless ] }
        { f [ [ drop f ] dip ] }
        [ blank? [ "Invalid input" throw ] unless ]
    } case ;

: interpret-brainfuck ( str -- )
    0 <brainfuck> [ interpret-brainfuck-from over ] loop 3drop ;

And give it a try:

IN: scratchpad "
               ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
               >++.>+.+++++++..+++.>++.<<+++++++++++++++
               .>.+++.------.--------.>+.>.
               " interpret-brainfuck
Hello World!

It works! But now I’m curious about relative performance. Let’s build a silly benchmark equivalent to cat that redirects the contents of an input-stream to an output-stream. We can compare our compiled run-brainfuck macro to a version that uses the interpreter we made above, and then to a native version implemented using stream-copy*.

: cat1 ( -- ) ",[.,]" run-brainfuck ;

: cat2 ( -- ) ",[.,]" interpret-brainfuck ;

: cat3 ( -- ) input-stream get output-stream get stream-copy* ;

First, we should make sure they all work:

IN: scratchpad "Compiled" [ cat1 ] with-string-reader
Compiled

IN: scratchpad "Interpreted" [ cat2 ] with-string-reader
Interpreted

IN: scratchpad "Factor" [ cat3 ] with-string-reader
Factor

Okay, so it seems to work!

For quick performance testing, let’s compare them outputting to a null stream:

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat1 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.059820291 seconds

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat2 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.13840325 seconds

IN: scratchpad [
                   1,000,000 CHAR: a <string> [
                       [ cat3 ] with-null-writer
                   ] with-string-reader
               ] time
Running time: 0.015008417 seconds

The compiled one is a bit more than 2x faster than the interpreted version, but both are slower than the native version.

Let’s try comparing our “Hello, World” example – where the operator coalescing that the compiled version does might help:

: hello1 ( -- )
   "
   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
   >++.>+.+++++++..+++.>++.<<+++++++++++++++
   .>.+++.------.--------.>+.>.
   " run-brainfuck ;

: hello2 ( -- )
   "
   ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
   >++.>+.+++++++..+++.>++.<<+++++++++++++++
   .>.+++.------.--------.>+.>.
   " interpret-brainfuck ;

We can see that the compiled one is now more like 7x faster:

IN: scratchpad [ [ 10,000 [ hello1 ] times ] with-null-writer ] time
Running time: 0.018075292 seconds

IN: scratchpad [ [ 10,000 [ hello2 ] times ] with-null-writer ] time
Running time: 0.133718416 seconds

Obviously, this is somewhat comparing apples and oranges because it’s ignoring compilation time in the comparison, and I didn’t spend any time on optimizing the interpreted version – for example, stripping blanks or validating inputs before doing the interpreter loop – but it’s a useful starting point for understanding tradeoffs.
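For instance, a hypothetical pre-pass – the word name is mine, not something in the brainfuck vocabulary – could strip whitespace and anything else that is not a Brainfuck command before entering the interpreter loop:

! Keep only the characters the interpreter understands.
: clean-brainfuck ( str -- str' )
    [ "><+-.,#[]" member? ] filter ;

Calling clean-brainfuck on the input before interpret-brainfuck would then let each iteration skip straight to a real command.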

How long does it currently take to compile?

IN: scratchpad [
                    [
                       gensym [
                           "
                           ++++++++++[>+++++++>++++++++++>+++>+<<<<-]
                           >++.>+.+++++++..+++.>++.<<+++++++++++++++
                           .>.+++.------.--------.>+.>.
                           " run-brainfuck
                       ] ( -- ) define-declared
                   ] with-compilation-unit
               ] time
Running time: 0.02294075 seconds

…a bit more time than it takes to call it 10,000 times. Interesting.

Fri, 16 May 2025 15:00:00

John Benediktsson: Roman Sort

Factor has included support for Roman numerals since at least 2008. Sometimes recent events – such as the election of Pope Leo XIV or Super Bowl LIX – revive interest in how they work and how to work with them computationally.

There was a blog post a few days ago about sorting Roman numerals, which pointed out that sorting them alphabetically worked pretty well. Given that we have a pretty good roman vocabulary, I thought we could explore different methods of sorting a sequence of strings representing Roman numerals.

Let’s first try the mostly-but-not-entirely-correct method suggested in the blog post:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad sort .
{ "I" "II" "III" "IV" "IX" "V" "VI" "VII" "VIII" }

Well, that’s almost correct, but IX – the number 9 – sorts in the middle rather than at the end. Could we fix this by using sort-by to convert each string to a number before comparing, to produce a correctly sorted output?

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

That fixes it, but how often do we end up calling the conversion function?

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad SYMBOL: calls

IN: scratchpad [ calls inc roman> ] sort-by .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

IN: scratchpad calls get .
40

Wow, 40 times – that seems like a lot!

Perhaps we could try the Schwartzian transform – also known as decorate-sort-undecorate – at the expense of using intermediate storage by saving the keys for each element, then sorting, and then returning only the values:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ [ roman> ] keep ] map>alist sort-keys values .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

That seems like a lot of code to do a simple thing. Instead, we can use the map-sort abstraction which implements the same method:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad [ roman> ] map-sort .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }
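Out of curiosity, we can repeat the call-counting experiment from above, resetting the calls symbol first; since map-sort computes the keys once during the decorate step, we would expect one conversion per element:

IN: scratchpad 10 [1..b) [ >ROMAN ] map

IN: scratchpad 0 calls set

IN: scratchpad [ calls inc roman> ] map-sort .
{ "I" "II" "III" "IV" "V" "VI" "VII" "VIII" "IX" }

IN: scratchpad calls get .
9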

Does this make much of a difference? Let’s compare each method:

: roman-sort1 ( seq -- sorted-seq ) [ roman> ] sort-by ;

: roman-sort2 ( seq -- sorted-seq ) [ roman> ] map-sort ;

We can time sorting a list of 100,000 random Roman numerals under 1,000:

IN: scratchpad 100,000 1,000 [1..b] randoms [ >ROMAN ] map

IN: scratchpad [ [ roman-sort1 ] time ]
               [ [ roman-sort2 ] time ] bi
Running time: 4.164076625 seconds
Running time: 0.154227583 seconds

You can see that the decorate-sort-undecorate pattern is quite a bit faster in this case. This is not always true, but generally depends on how much work the key function is doing.

Mon, 12 May 2025 15:00:00

John Benediktsson: Recamán’s Sequence

I love reading John D. Cook’s blog, which covers various mathematical concepts, code snippets, and other curious explorations. Today, he celebrated his 5,000th post, which is a pretty incredible milestone. In honor of that, let’s implement Recamán’s sequence, which he wrote about yesterday:

I recently ran into Recamán’s sequence. N. J. A. Sloane, the founder of the Online Encyclopedia of Integer Sequences calls Recamán’s sequence one of his favorites. It’s sequence A005132 in the OEIS.

This sequence starts at 0 and the nth number in the sequence is the result of moving forward or backward n steps from the previous number. You are allowed to move backward if the result is positive and a number you haven’t already visited. Otherwise you move forward.

Here’s Python code to generate the first N elements of the sequence.

def recaman(N):
    a = [0]*N
    for n in range(1, N):
        proposal = a[n-1] - n
        if proposal > 0 and proposal not in set(a):
            a[n] = proposal
        else:
            a[n] = a[n-1] + n
    return a

For example, recaman(10) returns

[0, 1, 3, 6, 2, 7, 13, 20, 12, 21]

Direct Implementation

The code for this “subtract if you can, add if you can’t” sequence is not particularly complex, but it is often fun to see what an algorithm looks like when implemented in a different language. So, we’ll try a direct implementation in Factor first and then talk about some variations.

Using local variables, we can keep our version very similar to the original:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth n - :> proposal
        proposal 0 > proposal a member? not and [
            proposal n a set-nth
        ] [
            n 1 - a nth n + n a set-nth
        ] if
    ] each a ;

And, show that it works:

IN: scratchpad 10 recaman .
{ 0 1 3 6 2 7 13 20 12 21 }

Short-Circuit Logic

One subtle difference is that Python’s and operation is short-circuiting, so if the first expression returns False, the second expression is not evaluated. We can make that change ourselves:

- proposal 0 > proposal a member? not and [
+ proposal 0 > [ proposal a member? not ] [ f ] if [

Or use short-circuit combinators to do that for us:

- proposal 0 > proposal a member? not and [
+ proposal { [ 0 > ] [ a member? not ] } 1&& [

Reduce Repetitions

We could also notice some repetition – for example, both branches of the if set the nth value of the array. In addition, we could keep the proposal on the stack instead of naming it as a local variable, and then just replace it when the check evaluates to f.

Incorporating these changes results in this version:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth n -
        dup { [ 0 > ] [ a member? not ] } 1&& [
            drop n 1 - a nth n +
        ] unless n a set-nth
    ] each a ;

There is still repetition as each iteration looks up the previous value twice. Instead, we could fix that by storing that value as a local variable:

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [| n |
        n 1 - a nth :> prev
        prev n - dup { [ 0 > ] [ a member? not ] } 1&&
        [ drop prev n + ] unless n a set-nth
    ] each a ;

Sometimes it is simpler to extract an inner loop and then see how it looks:

:: next-recaman ( prev n a -- next )
    prev n - dup { [ 0 > ] [ a member? not ] } 1&&
    [ drop prev n + ] unless ;

:: recaman ( N -- seq )
    N 0 <array> :> a
    N [1..b) [
        [ 1 - a nth ] [ a next-recaman ] [ a set-nth ] tri
    ] each a ;

Retrieving the previous value from the array is unnecessary work on each iteration. We can avoid it by keeping the previous value on the stack and using the make vocabulary to accumulate values. We also start from index 0 instead of 1 as a simplification:

:: next-recaman ( prev n -- next )
    prev n - dup { [ 0 > ] [ building get member? not ] } 1&&
    [ drop prev n + ] unless ;

:: recaman ( N -- seq )
    [ 0 N <iota> [ next-recaman dup , ] each drop ] { } make ;
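And check that it still produces the same first ten values:

IN: scratchpad 10 recaman .
{ 0 1 3 6 2 7 13 20 12 21 }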

Constant-time Lookups

But, there is a big performance issue: searching the previous values to see whether a proposal has already been generated is a linear-time operation. It would be much faster to use a bit-vector to track the previously seen numbers, giving constant-time lookups:

:: next-recaman ( prev n seen -- next )
    prev n - dup { [ 0 > ] [ seen nth not ] } 1&&
    [ drop prev n + ] unless t over seen set-nth ;

:: recaman ( N -- seq )
    0 N dup <bit-vector> '[ _ next-recaman dup ] map-integers nip ;

This allows us to generate, for example, one million values in the sequence in a small amount of time:

IN: scratchpad [ 1,000,000 recaman last . ] time
1057164

Running time: 0.049441708 seconds

Or even calculate the nth value without saving the sequence:

:: nth-recaman ( N -- elt )
    0 N dup <bit-vector> '[ _ next-recaman ] each-integer ;
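And check that it agrees with the millionth value we generated above:

IN: scratchpad 1,000,000 nth-recaman .
1057164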

Using Generators

We could even use the generators vocabulary to make an infinite stream that we can sample from:

:: next-recaman ( prev n seen -- next )
    prev n - dup { [ 0 > ] [ seen nth not ] } 1&&
    [ drop prev n + ] unless t over seen set-nth ;

GEN: recaman ( -- gen )
    0 0 ?V{ } clone '[
        [ _ next-recaman dup yield ] [ 1 + ] bi t
    ] loop 2drop ;

And then take some values from it:

IN: scratchpad recaman 10 take .
{ 0 1 3 6 2 7 13 20 12 21 }

It is often pleasantly surprising to explore seemingly simple topics – for example, looking at this algorithm and thinking about different time complexity, data structures, or programming language libraries that might be useful.

I have added this today to the Factor development version.

Tue, 6 May 2025 17:00:00

John Benediktsson: Filtering Errors

There was a discussion today on the Factor Discord server about how one might filter or reject objects in a sequence based on whether a quotation throws an error or not when applied to each item. We’re going to implement that now in Factor and hopefully learn a few things.

First, we need a way to see if a quotation throws an error or not. One way would be to call the quotation and use recover to return a different boolean if an error was thrown. That might look something like this:

: throws? ( quot -- ? )
    '[ @ f ] [ drop t ] recover ; inline

And that kinda works:

IN: scratchpad 10 20 [ + ] throws?

--- Data stack:
30
f

IN: scratchpad 10 "20" [ + ] throws?

--- Data stack:
10
"20"
t

But you can see the difference: in the first (working) case, the quotation did not throw an error, so the stack holds its single output along with f; in the second (failing) case, the stack holds the two inputs to the quotation along with a t indicating that an error was thrown.

So, ideally we could modify this so that when the quotation succeeds we drop its outputs, and when it fails we drop its inputs, consistently producing a single boolean output for any quotation it is applied to.

Luckily for us, we can use the drop-outputs and drop-inputs words, which infer the stack effect and handle each case correctly. This changes our implementation slightly:

: throws? ( quot -- ? )
    [ '[ _ drop-outputs f ] ]
    [ '[ drop _ drop-inputs t ] ] bi recover ; inline

And now we can see that it handles both cases nicely:

IN: scratchpad 10 20 [ + ] throws?

--- Data stack:
f

IN: scratchpad 10 "20" [ + ] throws?

--- Data stack:
t

Now, we can combine that with our sequence operations and make these useful words:

: filter-errors ( ... seq quot -- ... subseq )
    '[ _ throws? ] filter ; inline

: reject-errors ( ... seq quot -- ... subseq )
    '[ _ throws? ] reject ; inline

And then see how they work:

IN: scratchpad { t "5" 123 } [ string>number ] filter-errors .
{ t 123 }

IN: scratchpad { t "5" 123 } [ string>number ] reject-errors .
{ "5" }

These words were added recently to the development version of Factor.

Mon, 5 May 2025 15:00:00

John Benediktsson: Human Sorting Improved

Factor has had human-friendly sorting since December 2007. It is unclear if it is related, but Ned Batchelder wrote about “human sorting” around the same time with links to a couple of other blog posts discussing similar topics, so perhaps it was in the zeitgeist of the time.

In any event, Ned recently wrote about “human sorting improved” which deals with the topic of how to sort two strings that are human-equivalent but are different and should probably have an ordering. This was the result of fixing a problem that actually happened in the coverage.py project.

For example, comparing "x1y" and "x001y" with the original algorithm considers them equal, since both produce the same human key: { "x" 1 "y" }.

You can see this in Factor 0.100:

IN: scratchpad USE: sorting.human

IN: scratchpad "x1y" "x001y" human<=> .
+eq+

Ned suggests that instead – if two strings are equal using the human sorting method – they should be compared for lexicographic ordering as a tie-breaker.

I have made that change in the latest development version of Factor so that the behavior is changed:

IN: scratchpad "x1y" "x001y" human<=> .
+gt+
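The sorting words in the vocabulary pick up the new tie-breaking behavior as well – for example, assuming the human-sort word from sorting.human:

IN: scratchpad { "x1y" "x001y" "x10y" } human-sort .
{ "x001y" "x1y" "x10y" }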

Sun, 30 Mar 2025 15:00:00

John Benediktsson: Sports Betting

I have lately been curious about the Zig Programming Language and how it might be useful to Factor. In the process of doing some research, I bumped into a blog post about Converting decimal odds to probabilities with Zig, which shares some Zig source code for calculating sports betting odds:

As sports betting expands across the world, more and more players don’t understand the math behind it. So, the purpose of this article is to clarify how to convert from decimal odds to the probability behind them. The calculations are implemented in Zig, a systems programming language perfect for creating CLI tools because it’s fast and simple.

It is particularly interesting given the growth and popularity of both sports betting platforms like DraftKings and prediction markets like Polymarket and Manifold. You can read some background information in Sports Betting Odds: How They Work and How To Read Them.

Note: I am not encouraging gambling, and in addition you should do your own research on the various platforms as there can be controversy about potential manipulation and transparency of the markets.

I don’t know much about (and do not do any) sports betting, but I thought it would be fun to learn about it and then translate some of these concepts to Factor.

Decimal

The “decimal odds” format is popular in Europe and represents the total return, including the original $1 bet, if you win. These “decimal odds” effectively represent inverse probabilities. That is, if the odds are 4.5, then the probability is about 22% (which is 1/4.5).

Note: a variant of this is “fractional odds”, popular in Britain, which might instead quote those odds as 7/2 or 7-2, describing the amount won ($7) in addition to the original bet ($2).

We can convert decimal odds to probabilities:

: odds>probs ( m -- n ) recip 100 * ;

: probs>odds ( n -- m ) 100 / recip ;
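As a quick sanity check of the 4.5 example above, feeding the odds in as an exact ratio keeps the arithmetic exact – and 22+2/9 is the roughly 22% we expected:

IN: scratchpad 9/2 odds>probs .
22+2/9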

Sometimes the odds are quoted across several outcomes – for example, in many team games there are three outcomes: win, lose, or tie. The implied probabilities should sum to 1, but the quoted odds typically have the bookmaker’s payout built in (so the payout is less than 100%). The example given in the original blogpost uses three outcomes with decimal odds of 1.27, 6.00, and 10.25.

We can calculate the payout for a group of odds:

: compute-payout ( odds -- k )
    [ recip ] map-sum recip ;

And given the example from the blogpost, the payout is about 95%:

IN: scratchpad { 1.27 6.00 10.25 } compute-payout .
0.9509054938365546

Which means, you could convert odds to probabilities using this payout number:

: compute-probs ( odds -- probs )
    dup compute-payout '[ odds>probs _ * ] map ;

This shows the different probabilities of the outcomes, using printf to format the sequence:

IN: scratchpad { 1.27 6.0 10.25 } compute-probs "%[%.2f, %]" printf
{ 74.87, 15.85, 9.28 }

Moneyline

The “moneyline odds” format is more popular in the United States. The original blogpost that I mention above did not go into details, but I was curious, so we will explore how moneyline payouts work. These are quoted as positive (underdogs) or negative (favorites) values: a positive number is the amount won on a $100 bet, while a negative number is the amount you would need to bet to win $100.

Using the NFL example from the Investopedia article:

Let’s say a betting website (also known as an online sportsbook) priced an NFL game between the Pittsburgh Steelers and the Kansas City Chiefs with the following moneyline odds.

Steelers: +585

Chiefs: -760

We can convert moneyline odds to decimal odds:

: moneyline>odds ( moneyline -- odds )
    100 / dup 1 < [ recip neg ] when 1 + ;

: odds>moneyline ( odds -- moneyline )
    1 - dup 1 < [ recip neg ] when 100 * ;
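Plugging in the Steelers and Chiefs moneylines from above gives decimal odds of 6.85 and roughly 1.13 – Factor keeps them as exact ratios since the inputs are integers:

IN: scratchpad 585 moneyline>odds .
6+17/20

IN: scratchpad -760 moneyline>odds .
1+5/38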

And use that to calculate the implied relative probabilities of that game’s winner:

IN: scratchpad { 585 -760 } [ moneyline>odds ] map
               compute-probs "%[%.2f, %]" printf
{ 14.18, 85.82 }

Parlay

There are also “parlay odds”, which are multiple bets linked together as one.

Let’s say you want to bet three heavy favorites on the moneyline because you’re confident each team will win, but not sure if they’ll cover the spread.

So you bet Packers -300 against the Lions, Patriots -200 vs. the Jets, and Eagles -150 at the Washington Commanders.

We can convert these to odds:

IN: scratchpad { -300 -200 -150 } [ moneyline>odds ] map .
{ 1+1/3 1+1/2 1+2/3 }

And write a word to convert it to a parlay outcome:

: compute-parlay ( moneyline -- parlay )
    [ moneyline>odds ] map product odds>moneyline ;

So, this example would have a parlay payout of 233:

IN: scratchpad { -300 -200 -150 } compute-parlay .
233+1/3

And a probability of 30%:

IN: scratchpad 233+1/3 moneyline>odds odds>probs .
30

There are probably aspects of this that I did not cover above, but it’s kinda fun to explore new topics that you don’t know much about and to learn!

Wed, 26 Mar 2025 15:00:00

planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.