Admittedly, I struggle sometimes when I read the word “monoid”. It seems to always remind me of that saying “A Monad is just a Monoid in the Category of Endofunctors”, which is a tongue-twister, requires repeated effort to understand, and is sometimes used in casual conversation when jealously describing the features and capabilities of the Haskell programming language.
In any event, the topic of monoids came up recently on the Factor Discord server. Slava Pestov, the original creator of the Factor programming language, was describing recent work he was doing on some fun mathematical problems:
I’m searching for examples of finitely-presented monoids that cannot be presented by finite complete rewriting systems:
- ⟨a, b | aba=aa, baa=aab⟩ – my first result in this space.
- ⟨a, b | bab=aaa, bbb=bb⟩ – explore the equivalence class of a^8 in this remarkable monoid.
He clarified in the discussion that “the Knuth-Bendix algorithm can solve many cases but not these two, which is how I found them in the first place”.
The second link above – made extra fun because it renders a as an apple emoji and b as another emoji for a more emojiful experience – describes this specific problem in more detail and presents it as a simple game to play. You can see the available rules, the current state, and the next possible states achieved by applying either of the rules, which are bi-directional.
Your pie recipe calls for 10 apples, but you only have 8 apples.
Can you turn your 8 apples into 10 apples with these two magic spells?
- bab ↔ aaa
- bbb ↔ bb
Current state:
aaaaaaaa
Tap to cast a spell:
- babaaaaa
- ababaaaa
- aababaaa
- aaababaa
- aaaababa
- aaaaabab
When exploring things like this, many questions come to mind. For example:
After the link was shared, I must have clicked through about 5000 different state transitions hoping to randomly stumble upon the solution. And, eventually, I recognized that it might be a good idea – or even possibly poetic – to do that exploration using Factor.
Warning: Spoilers ahead!
Let’s start by writing the rules:
CONSTANT: rules {
{ "bab" "aaa" }
{ "bbb" "bb" }
}
For convenience, we will make a sequence containing all the rules – since these are bi-directional and can be applied in either direction – using a literal expression.
CONSTANT: all-rules $[
rules dup [ swap ] assoc-map append
]
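Printing the constant shows both directions of each rule, roughly like this:
IN: scratchpad all-rules .
{
{ "bab" "aaa" }
{ "bbb" "bb" }
{ "aaa" "bab" }
{ "bb" "bbb" }
}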
We can make a word that takes a from node and a rule a -> b, and applies a quotation to each next state produced by applying that rule. Notice that we’re able to use our previous work on finding subsequences:
:: each-move ( from a b quot: ( next -- ) -- )
from dup a subseq-indices [
cut a length tail b glue quot call
] with each ; inline
And then a word that returns all the next states:
: all-moves ( from -- moves )
[ all-rules [ first2 [ , ] each-move ] with each ] { } make ;
It’s often good practice to try each step out during development, so let’s do that and show the first six possible next states match the ones from the game:
IN: scratchpad "aaaaaaaa" all-moves .
{
"babaaaaa"
"ababaaaa"
"aababaaa"
"aaababaa"
"aaaababa"
"aaaaabab"
}
The next state is nice to have, but we’re generally going to be accumulating paths, each of which is a series of states achieved by traversing the graph of all possible states:
:: all-paths% ( path -- )
path last all-rules [
first2 [ path swap suffix , ] each-move
] with each ;
: all-paths ( paths -- paths' )
[ [ all-paths% ] each ] { } make members ;
So, these are the first two steps in traversing the graph. You can see that some of the possible second moves end up circling back to the starting position, which makes sense since the rules are bi-directional and if applied can be un-applied on the next step.
IN: scratchpad { { "aaaaaaaa" } } all-paths dup . nl all-paths .
{
{ "aaaaaaaa" "babaaaaa" }
{ "aaaaaaaa" "ababaaaa" }
{ "aaaaaaaa" "aababaaa" }
{ "aaaaaaaa" "aaababaa" }
{ "aaaaaaaa" "aaaababa" }
{ "aaaaaaaa" "aaaaabab" }
}
{
{ "aaaaaaaa" "babaaaaa" "aaaaaaaa" }
{ "aaaaaaaa" "babaaaaa" "babbabaa" }
{ "aaaaaaaa" "babaaaaa" "babababa" }
{ "aaaaaaaa" "babaaaaa" "babaabab" }
{ "aaaaaaaa" "ababaaaa" "aaaaaaaa" }
{ "aaaaaaaa" "ababaaaa" "ababbaba" }
{ "aaaaaaaa" "ababaaaa" "abababab" }
{ "aaaaaaaa" "aababaaa" "aaaaaaaa" }
{ "aaaaaaaa" "aababaaa" "aababbab" }
{ "aaaaaaaa" "aaababaa" "aaaaaaaa" }
{ "aaaaaaaa" "aaababaa" "babbabaa" }
{ "aaaaaaaa" "aaaababa" "aaaaaaaa" }
{ "aaaaaaaa" "aaaababa" "babababa" }
{ "aaaaaaaa" "aaaababa" "ababbaba" }
{ "aaaaaaaa" "aaaaabab" "aaaaaaaa" }
{ "aaaaaaaa" "aaaaabab" "babaabab" }
{ "aaaaaaaa" "aaaaabab" "abababab" }
{ "aaaaaaaa" "aaaaabab" "aababbab" }
}
Let’s solve for the shortest paths: we keep track of states we’ve previously seen to avoid cycles, and we iterate using breadth-first search until we find any solutions:
:: shortest-paths ( from to -- moves )
HS{ from } clone :> seen
{ { from } } :> stack!
f [
drop
! find all next possibilities
stack all-paths
! reject ones that circle back to visited nodes
[ last seen in? ] reject
! reject any that are over the length of ``to``
to length '[ last length _ > ] reject stack!
! add the newly visited nodes
stack [ last seen adjoin ] each
! stop when we find any solutions
stack [ last to = ] filter dup empty?
] loop ;
Note: we reject any states that are longer than our goal state. This provides a nice way to cull the graph and make the search performance more reasonable. You could also choose not to do that, and exhaustively search into that area. However, while this is not generally a valid approach to solving these types of problems, it is specifically a valid approach to this one.
There are quite a few shortest paths:
IN: scratchpad "aaaaaaaa" "aaaaaaaaaa" shortest-paths length .
560
Each of those contains 16 nodes, which means 15 rules were applied:
IN: scratchpad "aaaaaaaa" "aaaaaaaaaa" shortest-paths first length .
16
But they only go through a seemingly small number of nodes:
IN: scratchpad "aaaaaaaa" "aaaaaaaaaa" shortest-paths concat members length .
43
How many nodes are there in total in the graph? Let’s find out!
:: full-graph ( from to -- seen )
HS{ from } clone :> seen
{ { from } } [
! find all next possibilities
all-paths
! reject any that are over the length of ``to``
to length '[ last length _ > ] reject
! only include ones that visit new nodes
[ last seen ?adjoin ] filter
] until-empty seen ;
We can see that the shortest solutions go through about 15% of the nodes:
IN: scratchpad "aaaaaaaa" "aaaaaaaaaa" full-graph cardinality .
279
We can use our graph traversal approach and Graphviz to visualize where solutions are found, showing how some areas of the graph are quite hard to randomly get out of and back onto a correct path to a solution. We draw the starting node in green, the ending node in blue, and the nodes involved in the shortest paths in gray:
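One way to produce the Graphviz input is to emit DOT text directly from the traversal words above. A minimal sketch of how that might look – write-dot is a hypothetical name, and the actual rendering presumably also applies the colors just described:
:: write-dot ( from to -- )
    "digraph states {" print
    from to full-graph members [| node |
        node all-moves [ length to length <= ] filter
        [ node swap "  \"%s\" -> \"%s\";\n" printf ] each
    ] each
    "}" print ;
The output of "aaaaaaaa" "aaaaaaaaaa" write-dot can then be run through the dot command to produce a drawing of the state graph.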
And that’s kind of interesting, but if we cluster nodes by their depth when first discovered, some other patterns show up:
Such a fun problem!
Phil Eaton issued a HYIBY? – have you implemented bitcask yet? – challenge yesterday. Of course, I immediately realized that I had not, and also that it would be fun to build one in Factor.
Bitcask is described in the original Bitcask paper as a “log-structured hash-table for fast key/value data”, and was developed by Basho Technologies as part of the Riak distributed database. Besides the original paper, various developers over the years have bumped into Bitcask and implemented it in different programming languages. Arpit Bhayani, for example, has a nice blog post describing Bitcask that is worth reading for more background.
At its core, Bitcask describes an append-only storage mechanism for building a key-value database. It consists of one or more data files, each of which has an optional index file to allow faster recovery when initializing the database, and it generally supports GET, PUT, and DELETE operations.
Our data file contains a series of entry records. Each record consists of a key length, value length, key bytes, and value bytes. A simple word provides a way to write these bytes to a file:
: write-entry-bytes ( key value -- )
[ dup length 4 >be write ] bi@ [ write ] bi@ ;
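As a quick sketch of the resulting on-disk layout – assuming with-byte-writer from the io.streams.byte-array vocabulary – writing a one-byte key and a two-byte value produces the two big-endian lengths followed by the raw bytes:
IN: scratchpad binary [ B{ 1 } B{ 2 3 } write-entry-bytes ] with-byte-writer .
B{ 0 0 0 1 0 0 0 2 1 2 3 }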
Then, using the serialize vocabulary we can store Factor objects quite simply:
: write-entry ( key value -- )
[ object>bytes ] bi@ write-entry-bytes ;
We need the ability to store tombstone records which indicate that a key has been deleted from the database. In this case, we choose to store a zero-sized value to indicate that:
: write-tombstone ( key -- )
object>bytes f write-entry-bytes ;
Assuming that a data file has had its seek position moved to the beginning of an entry record, we can read the value that it contains, or return a boolean indicating that it is not found because it was stored as a tombstone:
: read-entry ( -- value/f ? )
4 read be> 4 read be> [
drop f f
] [
[ seek-relative seek-input ]
[ read bytes>object t ] bi*
] if-zero ;
Our index file contains hints that provide a way to recover the record offsets into the data files. These hints consist of a series of index records. Each record consists of a key length, key bytes, and file offset.
We can write our index mapping of keys to offsets:
: write-index ( index -- )
[
[ object>bytes dup length 4 >be write write ]
[ 4 >be write ] bi*
] assoc-each ;
And then read it back into memory:
: read-index ( -- index )
H{ } clone [
4 read [
be> read bytes>object 4 read be>
swap pick set-at t
] [ f ] if*
] loop ;
We want to make the index files optional: we recover the index by first seeking to the last entry that we have in the index, and then iterating across the remaining records in the data file to recover the full index, making sure to delete any items that are subsequently observed to contain tombstone entries:
: recover-index ( index -- index' )
dup values [ maximum seek-absolute seek-input ] unless-empty
[
tell-input 4 read [
be> 4 read be> [ read bytes>object ] dip
[ pick delete-at drop ] [
[ pick set-at ]
[ seek-relative seek-input ] bi*
] if-zero t
] [ drop f ] if*
] loop ;
The associative mapping protocol describes the words that an assoc should support. This type of object provides a mapping of key to value, with ways to add, update, and delete these mappings.
We want our bitcask type to use a single data file, reading and recovering from an index file, and then providing ways to modify – by appending to the data file – the database.
TUPLE: bitcask path index ;
:: <bitcask> ( path -- bitcask )
path dup touch-file
path ".idx" append dup touch-file
binary [ read-index ] with-file-reader
path binary [ recover-index ] with-file-reader
bitcask boa ;
INSTANCE: bitcask assoc
The application should control when and how these index files are persisted:
: save-index ( bitcask -- )
dup path>> ".idx" append binary
[ index>> write-index ] with-file-writer ;
The first operation we support will be set-at, updating the index after writing the entry.
M:: bitcask set-at ( value key bitcask -- )
bitcask path>> binary [
tell-output
key value write-entry
key bitcask index>> set-at
] with-file-appender ;
Next, we support at*, to look up a value by seeking in the data file and reading the entry:
M:: bitcask at* ( key bitcask -- value/f ? )
key bitcask index>> at* [
bitcask path>> binary [
seek-absolute seek-input read-entry
] with-file-reader
] [ drop f f ] if ;
And finally, delete-at removes a key from the index after writing a tombstone:
M:: bitcask delete-at ( key bitcask -- )
key bitcask index>> key? [
bitcask path>> binary [
key write-tombstone
key bitcask index>> delete-at
] with-file-appender
] when ;
The assoc-size of our database is the size of the index:
M: bitcask assoc-size
index>> assoc-size ;
It is helpful to implement >alist to provide a conversion to an association list, although if the database gets quite large, this might be of less practical value:
M:: bitcask >alist ( bitcask -- alist )
bitcask path>> binary [
bitcask index>> [
seek-absolute seek-input read-entry t assert=
] { } assoc-map-as
] with-file-reader ;
And a way to clear-assoc
by writing tombstones and clearing the index:
M:: bitcask clear-assoc ( bitcask -- )
bitcask path>> binary [
bitcask index>>
dup keys [ write-tombstone ] each
clear-assoc
] with-file-appender ;
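Putting the pieces together, a quick session in the listener might look something like this, where test.db is just a scratch file name used for illustration:
IN: scratchpad "test.db" <bitcask>
IN: scratchpad "Alice" "name" pick set-at
IN: scratchpad "name" over at .
"Alice"
IN: scratchpad dup assoc-size .
1
IN: scratchpad dup save-index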
There are some elements desirable in a production database that are not implemented, for example:
This is now available in the development version in the bitcask vocabulary!
One of my first and most memorable graphical programs was implementing John Conway’s Game of Life. At the time, that implementation was written as a Java applet. I’ve revisited it periodically in different programming languages, including several years ago when I started to implement the Game of Life in Factor – something I’ve always wanted to write about.
The Game of Life is a two-dimensional grid of square cells with fairly simple logic. Each cell can be either live or dead. Each cell interacts with its eight neighboring cells, with the following rules determining the next state of the game board:
- Any live cell with two or three live neighbors stays live in the next generation.
- Any dead cell with exactly three live neighbors becomes live in the next generation.
- All other cells die or remain dead.
You can run this in any release since Factor 0.98:
IN: scratchpad "game-of-life" run
And it will look something like this:
Let’s go ahead and build it!
We will model our two-dimensional game board as an array of arrays. And in particular, since each cell has only two states, we will use bit-arrays to reduce the memory requirements by efficiently storing the state, one bit for each cell.
: <grid> ( rows cols -- grid )
'[ _ <bit-array> ] replicate ;
: grid-dim ( grid -- rows cols )
[ length ] [ first length ] bi ;
Making a random grid, which is useful in testing:
: random-grid ( rows cols -- grid )
'[ _ { t f } ?{ } randoms-as ] replicate ;
And a word we can use for debugging, to print a grid out:
: grid. ( grid -- )
[ [ CHAR: # CHAR: . ? ] "" map-as print ] each ;
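For example, a freshly created 3 x 3 grid prints as all dead cells:
IN: scratchpad 3 3 <grid> grid.
...
...
...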
Some implementations choose to make the game boards infinite, but we are instead going to build a wraparound game board. This allows, for example, a glider shape to fly off the bottom right and then re-appear on the top left of the board, which is a lot more fun to watch.
A useful word calculates adjacent indices for a cell – that wrap at a max value of rows or columns:
:: adjacent-indices ( n max -- n-1 n n+1 )
n [ max ] when-zero 1 -
n
n 1 + dup max = [ drop 0 ] when ;
Test it out, showing how it might work in a hypothetical 10 x 10 grid:
! in the middle
IN: scratchpad 3 10 adjacent-indices 3array .
{ 2 3 4 }
! at the start, wrapped around
IN: scratchpad 0 10 adjacent-indices 3array .
{ 9 0 1 }
! at the end, wrapped around
IN: scratchpad 9 10 adjacent-indices 3array .
{ 8 9 0 }
The main game logic requires counting neighbors for each cell. Since each cell can have 8 neighbors, we can store this count in a half-byte – a nibble – which can hold the values [0..15]. In the batteries-included standard library, we have a nibble-arrays vocabulary that makes this easy.
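A freshly allocated nibble array is zero-filled, which is exactly what we want for per-cell counts:
IN: scratchpad 8 <nibble-array> .
N{ 0 0 0 0 0 0 0 0 }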
The simplest implementation would just iterate across the game board, and for each cell that is live, increment the count for the neighboring indices around it:
:: count-neighbors ( grid -- counts )
grid grid-dim :> ( rows cols )
rows [ cols <nibble-array> ] replicate :> neighbors
grid [| row j |
j rows adjacent-indices
[ neighbors nth ] tri@ :> ( above same below )
row [| cell i |
cell [
i cols adjacent-indices
[ [ above [ 1 + ] change-nth ] tri@ ]
[ nip [ same [ 1 + ] change-nth ] bi@ ]
[ [ below [ 1 + ] change-nth ] tri@ ]
3tri
] when
] each-index
] each-index neighbors ;
Then the last piece of game logic we need is to adjust the grid cells according to the rules – making some cells transition from live to dead, and others from dead to live, based on their state and the neighboring counts.
:: next-step ( grid -- )
grid count-neighbors :> neighbors
grid [| row j |
j neighbors nth :> neighbor-row
row [| cell i |
i neighbor-row nth
cell [
2 3 between? i row set-nth
] [
3 = [ t i row set-nth ] when
] if
] each-index
] each-index ;
Before we move on to creating a graphical user interface for the game, let’s try it out in the Factor listener:
! Create a random 10x10 grid
IN: scratchpad 10 10 random-grid
! Print it out
IN: scratchpad dup grid.
#..#..#.##
##....####
..###.####
.##...#..#
.##....###
..###..#.#
...###.#..
.###....##
#...###.##
.##..#.#..
! Compute the neighbors for each cell
IN: scratchpad dup count-neighbors .
{
N{ 5 5 4 1 2 3 4 6 5 5 }
N{ 5 3 4 4 3 4 4 7 7 7 }
N{ 6 5 4 3 1 4 4 6 6 5 }
N{ 5 4 5 5 2 3 3 6 7 4 }
N{ 5 4 5 5 2 2 3 3 5 3 }
N{ 3 3 4 5 4 3 4 3 6 2 }
N{ 3 3 6 6 5 2 3 2 5 3 }
N{ 4 2 3 4 6 5 4 4 4 4 }
N{ 4 5 5 4 3 3 3 4 4 4 }
N{ 5 3 2 3 4 4 5 4 5 6 }
}
! Compute the next generation
IN: scratchpad dup next-step
! Print it out
IN: scratchpad dup grid.
.....#....
.#..#.....
...#......
.....##...
......##.#
##...#.#.#
##...###.#
.##.......
....###...
.###......
It works!
In Factor, one of the ways we can build user interfaces is using gadgets and OpenGL rendering instructions.
We start by modeling our game as a gadget with a grid object, a size that specifies the rendered pixels-per-cell, and a timer to control the speed of repainting new generations.
TUPLE: grid-gadget < gadget grid size timer ;
Our default gadget will have cells that are 20 pixels square, and repaint 10 times per second:
: <grid-gadget> ( grid -- gadget )
grid-gadget new
swap >>grid
20 >>size
dup '[ _ dup grid>> next-step relayout-1 ]
f 1/10 seconds <timer> >>timer ;
Gadgets are grafted onto the render hierarchy, and then later ungrafted when they are removed. We handle that state change by stopping the timer before delegating to the parent to clean up further:
M: grid-gadget ungraft*
[ timer>> stop-timer ] [ call-next-method ] bi ;
The default dimension for our gadget is the grid dimension times the pixel size:
M: grid-gadget pref-dim*
[ grid>> grid-dim swap ] [ size>> '[ _ * ] bi@ 2array ] bi ;
If the grid size changes – for example, by using the mouse scroll wheel to zoom in or out – we can create and store a new grid, keeping the cells that are visible in the same state they were in:
:: update-grid ( gadget -- )
gadget dim>> first2 :> ( w h )
gadget size>> :> size
h w [ size /i ] bi@ :> ( new-rows new-cols )
gadget grid>> :> grid
grid grid-dim :> ( rows cols )
rows new-rows = not cols new-cols = not or [
new-rows new-cols <grid> :> new-grid
rows new-rows min [| j |
cols new-cols min [| i |
i j grid nth nth
i j new-grid nth set-nth
] each-integer
] each-integer
new-grid gadget grid<<
] when ;
We can draw the cells that are live as black squares:
:: draw-cells ( gadget -- )
COLOR: black gl-color
gadget size>> :> size
gadget grid>> [| row j |
row [| cell i |
cell [
i j [ size * ] bi@ 2array { size size } gl-fill-rect
] when
] each-index
] each-index ;
And then draw the gray lines that define the grid of cells:
:: draw-lines ( gadget -- )
gadget size>> :> size
gadget grid>> grid-dim :> ( rows cols )
COLOR: gray gl-color
cols rows [ size * ] bi@ :> ( w h )
rows 1 + [| j |
j size * :> y
{ 0 y } { w y } gl-line
] each-integer
cols 1 + [| i |
i size * :> x
{ x 0 } { x h } gl-line
] each-integer ;
Putting this together, we draw our gadget by updating the grid, drawing the cells, and drawing the lines:
M: grid-gadget draw-gadget*
[ update-grid ] [ draw-cells ] [ draw-lines ] tri ;
And, with the “visual REPL”, you can directly render the grid gadget, to see it work:
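For example, something like this in the listener renders a random board inline, using gadget. to print the gadget into the output:
IN: scratchpad 10 10 random-grid <grid-gadget> gadget.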
We now need to build the interactive parts. Let’s first start by handling a click, to toggle the state of a cell, and storing which state it was toggled to in the last-click variable:
SYMBOL: last-click
:: on-click ( gadget -- )
gadget grid>> :> grid
gadget size>> :> size
grid grid-dim :> ( rows cols )
gadget hand-rel first2 [ size /i ] bi@ :> ( i j )
i 0 cols 1 - between?
j 0 rows 1 - between? and [
i j grid nth
[ not dup last-click set ] change-nth
] when gadget relayout-1 ;
That allows us to build a drag feature where, as we drag, we continue to set cells to either live or dead according to what the first click did:
:: on-drag ( gadget -- )
gadget grid>> :> grid
gadget size>> :> size
grid grid-dim :> ( rows cols )
gadget hand-rel first2 [ size /i ] bi@ :> ( i j )
i 0 cols 1 - between?
j 0 rows 1 - between? and [
last-click get i j
grid nth set-nth
gadget relayout-1
] when ;
We implement a scrolling feature to adjust the size of the rendered cells, clamping the value when it gets too small or too large:
: on-scroll ( gadget -- )
[
scroll-direction get second {
{ [ dup 0 > ] [ -2 ] }
{ [ dup 0 < ] [ 2 ] }
[ 0 ]
} cond nip + 4 30 clamp
] change-size relayout-1 ;
And we store these as "gestures" that are supported by the gadget:
grid-gadget "gestures" [
{
{ T{ button-down { # 1 } } [ on-click ] }
{ T{ drag { # 1 } } [ on-drag ] }
{ mouse-scroll [ on-scroll ] }
} assoc-union
] change-word-prop
The last bit we need is to make the toolbar, which has a few commands we can run:
:: com-play ( gadget -- )
gadget timer>> restart-timer ;
:: com-stop ( gadget -- )
gadget timer>> stop-timer ;
:: com-clear ( gadget -- )
gadget dup grid>> [ clear-bits ] each relayout-1 ;
:: com-random ( gadget -- )
gadget dup grid>> [ [ drop { t f } random ] map! drop ] each relayout-1 ;
:: com-glider ( gadget -- )
gadget dup grid>> :> grid
{ { 2 1 } { 3 2 } { 1 3 } { 2 3 } { 3 3 } }
[ grid nth t -rot set-nth ] assoc-each relayout-1 ;
:: com-step ( gadget -- )
gadget dup grid>> next-step relayout-1 ;
And then store these as the "toolbar" command map:
grid-gadget "toolbar" f {
{ T{ key-down { sym "1" } } com-play }
{ T{ key-down { sym "2" } } com-stop }
{ T{ key-down { sym "3" } } com-clear }
{ T{ key-down { sym "4" } } com-random }
{ T{ key-down { sym "5" } } com-glider }
{ T{ key-down { sym "6" } } com-step }
} define-command-map
And finally, we can wrap the grid gadget with something that makes a toolbar, and creates a main window when launched:
TUPLE: life-gadget < track ;
: <life-gadget> ( -- gadget )
vertical life-gadget new-track
20 20 <grid> <grid-gadget>
[ <toolbar> format-toolbar f track-add ]
[ 1 track-add ] bi ;
M: life-gadget focusable-child* children>> second ;
MAIN-WINDOW: life-window
{ { title "Game of Life" } }
<life-gadget> >>gadgets ;
As with anything, there are probably things we could continue to improve in our UI framework, but one of the biggest missing pieces is examples of working code, which is largely what motivated writing about this today.
Check it out!
And maybe think about how you might adjust it to be an infinite game board, increase performance when computing the next generation, improve the OpenGL rendering logic, persist the game board between launches, or do things like communicate the age of each cell by the color it is rendered with.
Recently, Chris Siebenmann was lamenting the lack of a good command line way to sort IPv6 addresses. This followed a post of his a few years ago about how sort -V can easily sort IPv4 addresses. Since I had some fun talking about sorting Roman numerals recently – and we have an extensive standard library – I thought I’d talk about how you might solve this problem with Factor.
As a reminder, IPv6 uses a 128-bit address space with more theoretical addresses than the older – but still quite commonly used – IPv4 32-bit address space.
The internal network address of your computer is sometimes referred to as localhost or a loopback address, and represented as 127.0.0.1 in IPv4, or ::1 in IPv6. We have an ip-parser vocabulary with words for parsing and manipulating IP addresses as well as IP network strings written in CIDR notation. We can use these words to show how to translate these addresses to their byte representation:
IN: scratchpad "127.0.0.1" parse-ipv4 .
B{ 127 0 0 1 }
IN: scratchpad "::1" parse-ipv6 .
{ 0 0 0 0 0 0 0 1 }
And, we could use that to sort a list of addresses pretty easily:
IN: scratchpad {
"127.0.0.1"
"1.1.1.1"
"8.8.8.8"
"192.168.10.40"
} [ parse-ipv4 ] sort-by .
{
"1.1.1.1"
"8.8.8.8"
"127.0.0.1"
"192.168.10.40"
}
IN: scratchpad {
"2620:0:1cfe:face:b00c::3"
"2001:4860:4860::8844"
"2620:0:ccc::2"
"::1"
"2001:4860:4860::8888"
} [ parse-ipv6 ] sort-by .
{
"::1"
"2001:4860:4860::8844"
"2001:4860:4860::8888"
"2620:0:ccc::2"
"2620:0:1cfe:face:b00c::3"
}
And so, now that some great feedback encouraged us to do command-line eval with auto-use? enabled, we can run this easily as a one-line script:
# make a file full of unsorted IPv6 addresses
$ cat <<EOF > ips.txt
2620:0:1cfe:face:b00c::3
2001:4860:4860::8844
2620:0:ccc::2
::1
2001:4860:4860::8888
EOF
# show that you can parse the file as strings
$ cat ips.txt | ./factor -e="read-lines ."
{
"2620:0:1cfe:face:b00c::3"
"2001:4860:4860::8844"
"2620:0:ccc::2"
"::1"
"2001:4860:4860::8888"
}
# sort and print the sorted output
$ cat ips.txt | ./factor -e="read-lines [ parse-ipv6 ] sort-by [ print ] each"
::1
2001:4860:4860::8844
2001:4860:4860::8888
2620:0:ccc::2
2620:0:1cfe:face:b00c::3
Pretty cool!
Last week, Falk HΓΌffner wrote about making a leap year check in three instructions:
With the following code, we can check whether a year 0 ≤ y ≤ 102499 is a leap year with only about 3 CPU instructions:
bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}
How does this work? The answer is surprisingly complex. This article explains it, mostly to have some fun with bit-twiddling; at the end, I’ll briefly discuss the practical use.
This is how a leap year check is typically implemented:
bool is_leap_year(uint32_t y) {
    if ((y % 4) != 0) return false;
    if ((y % 100) != 0) return true;
    if ((y % 400) == 0) return true;
    return false;
}
It would be fun to see how that works in Factor and compare the relative performance between a simple version and the new super-fast-highly-optimized 3 instruction version. To do that, we can use the benchmark word to record execution time by calling it repeatedly and returning an average time-per-call in nanoseconds:
TYPED: average-benchmark ( n: fixnum quot -- nanos-per-call )
over [ '[ _ _ times ] benchmark ] dip /f ; inline
Note: We are forcing the iteration loop above to be fixnum to reduce its overhead, and due to the design of the benchmark words below, we are going to have code blocks with predictable inputs. Testing your program with random inputs is also important to see the impact of CPU optimizations such as cache and branch predictions, or across multiple CPU architectures. Performance is also impacted by use of code generation features such as inline and compiler steps such as dead-code elimination. Benchmarking is hard.
The simple – and typical – implementation can be easily written as:
: leap-year? ( year -- ? )
dup 100 divisor? 400 4 ? divisor? ;
And in fact, that’s how it is implemented in the standard library.
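A quick sanity check in the listener shows the expected century behavior:
IN: scratchpad 2028 leap-year? .
t
IN: scratchpad 1900 leap-year? .
f
IN: scratchpad 2000 leap-year? .
t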
We can write a quick benchmarking word. This ensures we are using the optimizing compiler and also asserts that the result of the word is as expected:
: bench-leap-year ( n year ? -- nanos )
'[ _ leap-year? _ assert= ] average-benchmark ;
And then call it one hundred million times, to see how long it takes each call on average:
IN: scratchpad 100,000,000 2028 t bench-leap-year .
10.53904317
Just under 11 nanoseconds, including the loop and the assert…
The fast implementation suggested by Falk can be written directly as:
: fast-leap-year? ( year -- ? )
1073750999 * 3221352463 bitand 126976 <= ;
And then write a benchmarking word:
: bench-fast-leap-year ( n year ? -- nanos )
'[ _ fast-leap-year? _ assert= ] average-benchmark ;
And see how long it takes:
IN: scratchpad 100,000,000 2028 t bench-fast-leap-year .
4.74783302
Just under 5 nanoseconds…
Can we go faster? Well, generally Factor supports arbitrarily large integers by allowing integers to implicitly promote from word-sized fixnum to bignum on overflow. And, as they say, you can write the C programming language in any language.
A faster implementation might check the input is a fixnum and then force math without overflow:
TYPED: faster-leap-year? ( year: fixnum -- ? )
1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;
And write a benchmark word:
: bench-faster-leap-year ( n year ? -- nanos )
'[ _ faster-leap-year? _ assert= ] average-benchmark ;
It’s a bit faster:
IN: scratchpad 100,000,000 2028 t bench-faster-leap-year .
3.24267145
Just under 4 nanoseconds…
But, to make sure that we use the fewest instructions possible, we can make it slightly less safe by declaring the input to be a fixnum to avoid the run-time type checks. This could cause issues if it is called with other types on the stack.
: fastest-leap-year? ( year -- ? )
{ fixnum } declare
1073750999 fixnum*fast 3221352463 fixnum-bitand 126976 fixnum<= ;
And write a benchmark word:
: bench-fastest-leap-year ( n year ? -- nanos )
'[ _ fastest-leap-year? _ assert= ] average-benchmark ;
And then you can see it gets quite fast indeed:
IN: scratchpad 100,000,000 2028 t bench-fastest-leap-year .
2.82150476
Just under 3 nanoseconds!
But, is it also just 3 instructions?
IN: scratchpad \ fastest-leap-year? disassemble
000075f0afa19490: 89056a5bd1fe mov [rip-0x12ea496], eax
000075f0afa19496: 498b06 mov rax, [r14]
000075f0afa19499: 48c1f804 sar rax, 0x4
000075f0afa1949d: 4869c0d7230040 imul rax, rax, 0x400023d7
000075f0afa194a4: bb0ff001c0 mov ebx, 0xc001f00f
000075f0afa194a9: 4821d8 and rax, rbx
000075f0afa194ac: 4881f800f00100 cmp rax, 0x1f000
000075f0afa194b3: b801000000 mov eax, 0x1
000075f0afa194b8: 48bb5c0e388cf0750000 mov rbx, 0x75f08c380e5c
000075f0afa194c2: 480f4ec3 cmovle rax, rbx
000075f0afa194c6: 498906 mov [r14], rax
000075f0afa194c9: 8905315bd1fe mov [rip-0x12ea4cf], eax
000075f0afa194cf: c3 ret
Pretty close!
There is an extra instruction near the beginning to untag our fixnum input. Due to the convention around handling booleans in Factor, there are a couple of extra instructions at the end for converting the result into a return value of either t or f.
And it could get even faster if either the assert= was removed, or the code was made inline so the function prologue and epilogue could be elided into the outer scope.
So much fun.
Raylib is a very neat C library that has become popular as a “simple and easy-to-use library to enjoy videogames programming”. Originally released in 2014, it has seen lots of updates, with the latest version 5.5 representing 11 years of improvements since the original version 1.0 release.
You can sense the love this library has received from the extensive feature list:
- NO external dependencies, all required libraries included with raylib
- Multiplatform: Windows, Linux, MacOS, RPI, Android, HTML5… and more!
- Written in plain C code (C99) using PascalCase/camelCase notation
- Hardware accelerated with OpenGL (1.1, 2.1, 3.3, 4.3 or ES 2.0)
- Unique OpenGL abstraction layer: rlgl
- Powerful Fonts module (SpriteFonts, BMfonts, TTF, SDF)
- Multiple texture formats support, including compressed formats (DXT, ETC, ASTC)
- Full 3d support for 3d Shapes, Models, Billboards, Heightmaps and more!
- Flexible Materials system, supporting classic maps and PBR maps
- Animated 3d models supported (skeletal bones animation)
- Shaders support, including Model shaders and Postprocessing shaders
- Powerful math module for Vector, Matrix and Quaternion operations: raymath
- Audio loading and playing with streaming support (WAV, OGG, MP3, FLAC, XM, MOD)
- VR stereo rendering support with configurable HMD device parameters
- Huge examples collection with +120 code examples!
- Bindings to +60 programming languages!
- Free and open source. Check [LICENSE].
In 2019, we first included support for version 2.5 of raylib. Over the years, we have updated our bindings to include new functions and features of new versions. Our most recent two releases, Factor 0.99 and Factor 0.100, included support for version 4.5. And in the current development cycle, we have updated to version 5.0 and then again updated to version 5.5.
It is possible to maintain support for multiple versions, but we have chosen for now to just target the most recent stable release as best we can. Generally, the raylib library has been quite stable, with the occasional deprecation or breaking change that seems to be easy to adjust for.
As a simple demonstration, we can start with the basic window example – an extremely simple program in the spirit of achieving a black triangle moment – to make sure everything works. We can directly translate this example into Factor using our Raylib bindings – which import the C functions using our convention for kebab-case word names.
USE: raylib
: basic-window ( -- )
800 450 "raylib [core] example - basic window" init-window
60 set-target-fps
[ window-should-close ] [
begin-drawing
RAYWHITE clear-background
"Congrats! You created your first window!"
190 200 20 LIGHTGRAY draw-text
end-drawing
] until close-window ;
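To try it yourself, you can run the word directly from the listener once the definitions above are loaded:
IN: scratchpad basic-window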
And, if you run the example – you’ll end up with this:
Note: if you’re wondering why the background isn’t white, it’s because RAYWHITE is a special version of not-quite-white that matches the Raylib logo.
Of course, your examples can become more complex. For example, Joseph Oziel released a fun game called BitGuessr written in Factor using Raylib. And, given the extensive feature list, your simple program written quickly for a game jam might end up including great audio, images, keyboard, mouse, gamepad, 3d rendering, and other features relatively easily!
I’d love to see more demos written in Factor and Raylib. Happy coding!
Some might describe shuffle words as one of the fundamental building blocks of Factor. Others might describe them as a code smell and seek to use dataflow combinators or other higher-level words to reduce code complexity.
Whatever your opinion is, they are useful concepts in a concatenative language.
Besides the basic shuffle words – like dup, swap, rot – we have had the shuffle vocabulary, which provides some "additional shuffle words", for a while, as well as a syntax word that can perform arbitrary shuffles:
IN: scratchpad USE: shuffle
IN: scratchpad { 1 2 3 4 5 } [
shuffle( a b c d e -- d e c a b )
] with-datastack .
{ 4 5 3 1 2 }
This would be quite useful, except that it has had a fundamental issue – the way it is implemented uses a macro to curry the stack arguments into an array, and then pull the stack arguments back out of the array onto the stack in the requested order.
For example, we can look at a simple swap and a complex shuffle:
IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[
2 0 <array> dup >R >R R> 3 set-slot
R> >R R> dup >R >R R> 2 set-slot
R> >R R> dup >R >R R> >R R> 3 slot R> >R R> >R R> 2 slot
]
IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
5 0 <array> dup >R >R R> 6 set-slot
R> >R R> dup >R >R R> 5 set-slot
R> >R R> dup >R >R R> 4 set-slot
R> >R R> dup >R >R R> 3 set-slot
R> >R R> dup >R >R R> 2 set-slot
R> >R R> dup >R >R R> >R R> 3 slot
R> dup >R >R R> >R R> 2 slot R> dup >R >R R> >R R> 5 slot
R> dup >R >R R> >R R> 4 slot R> >R R> >R R> 6 slot
]
And not only would this be less-than-efficient, it would also turn literal arguments that were on the stack into run-time arguments and potentially cause a Cannot apply 'call' to a run-time computed value error if one of the shuffled arguments is a quotation that the caller hopes to use.
This bug was described on our issue tracker and I spent some time recently looking into it.
It turns out that we can use the stack checker to indicate that a shuffle is taking place, and use some "special" machinery to allow the optimizing compiler to generate efficient and correct code for these arbitrary shuffles.
After applying a small fix, we can see that the earlier examples are now quite simple:
IN: scratchpad [ shuffle( x y -- y x ) ] optimized.
[ swap ]
IN: scratchpad [ shuffle( a b c d e -- b a d c e ) ] optimized.
[
( 10791205 10791206 10791207 10791208 10791209 -- 10791206 10791205 10791208 10791207 10791209 )
]
This is available in the latest development version.
I was distracted a little by some recent explorations building a Brainfuck interpreter in Factor and had a couple of follow-ups to add to the conversation.
First, I realized my initial quick-and-dirty Brainfuck interpreter didn’t support nested loops. Specifically, the logic for beginning or ending a loop just searched for the nearest [ or ] character without considering nesting. This was fixed today so that will no longer be an issue.
Second, despite the Brainfuck compiler implicitly making an AST (abstract syntax tree) for Brainfuck by virtue of generating a quotation, I thought it would be more fun to build and generate one intentionally.
We can model the Brainfuck commands as operations using the following tuples and singletons:
TUPLE: ptr n ;
TUPLE: mem n ;
SINGLETONS: output input debug ;
TUPLE: loop ops ;
Next, we can build a parser using EBNF to convert the textual commands into our Brainfuck AST:
EBNF: ast-brainfuck [=[
inc-ptr = (">")+ => [[ length ptr boa ]]
dec-ptr = ("<")+ => [[ length neg ptr boa ]]
inc-mem = ("+")+ => [[ length mem boa ]]
dec-mem = ("-")+ => [[ length neg mem boa ]]
output = "." => [[ output ]]
input = "," => [[ input ]]
debug = "#" => [[ debug ]]
space = [ \t\n\r]+ => [[ f ]]
unknown = (.) => [[ "Invalid input" throw ]]
ops = inc-ptr|dec-ptr|inc-mem|dec-mem|output|input|debug|space
loop = "[" {loop|ops}+ "]" => [[ second sift loop boa ]]
code = (loop|ops|unknown)* => [[ sift ]]
]=]
This is interesting, because now we can more easily analyze a piece of Brainfuck code, such as the Hello, World example that I have been frequently using:
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" ast-brainfuck .
V{
T{ mem { n 10 } }
T{ loop
{ ops
V{
T{ ptr { n 1 } }
T{ mem { n 7 } }
T{ ptr { n 1 } }
T{ mem { n 10 } }
T{ ptr { n 1 } }
T{ mem { n 3 } }
T{ ptr { n 1 } }
T{ mem { n 1 } }
T{ ptr { n -4 } }
T{ mem { n -1 } }
}
}
}
T{ ptr { n 1 } }
T{ mem { n 2 } }
output
T{ ptr { n 1 } }
T{ mem { n 1 } }
output
T{ mem { n 7 } }
output
output
T{ mem { n 3 } }
output
T{ ptr { n 1 } }
T{ mem { n 2 } }
output
T{ ptr { n -2 } }
T{ mem { n 15 } }
output
T{ ptr { n 1 } }
output
T{ mem { n 3 } }
output
T{ mem { n -6 } }
output
T{ mem { n -8 } }
output
T{ ptr { n 1 } }
T{ mem { n 1 } }
output
T{ ptr { n 1 } }
output
}
And then we can implement those operations against a brainfuck state object, by deferring to words from our current implementation:
GENERIC: op ( brainfuck op -- brainfuck )
M: ptr op n>> (>) ;
M: mem op n>> (+) ;
M: output op drop (.) ;
M: input op drop (,) ;
M: debug op drop (#) ;
M: loop op [ get-memory zero? ] swap ops>> '[ _ [ op ] each ] until ;
And now this Brainfuck AST represents a hybrid execution model somewhere between the compiled and interpreted versions:
: hybrid-brainfuck ( code -- )
[ <brainfuck> ] dip ast-brainfuck [ op ] each drop ;
And see that it works:
IN: scratchpad "
++++++++++[>+++++++>++++++++++>+++>+<<<<-]
>++.>+.+++++++..+++.>++.<<+++++++++++++++
.>.+++.------.--------.>+.>.
" hybrid-brainfuck
Hello World!
We also gain some potential for building code optimization techniques that operate on an AST as a step before actual compilation or execution – for example, coalescing adjacent increment and decrement operations or some other more complex analysis.
That, however, is likely to remain an exercise for the reader!
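As a starting point for that exercise, here is a rough sketch of what a coalescing pass over the AST might look like – the words combine?, combine, and coalesce are hypothetical names, not part of the vocabulary above:
! Can two adjacent ops be merged? Only ptr/ptr or mem/mem pairs.
: combine? ( op1 op2 -- ? )
    [ class-of ] bi@ 2dup = [ drop { ptr mem } member? ] [ 2drop f ] if ;
! Merge two ops of the same class by adding their counts.
: combine ( op1 op2 -- op' )
    [ [ n>> ] bi@ + ] [ drop class-of ] 2bi boa ;
:: coalesce ( ops -- ops' )
    V{ } clone :> acc
    ops [| op |
        ! recurse into loop bodies first
        op loop? [ op ops>> coalesce loop boa ] [ op ] if :> op'
        acc ?last op' combine? [
            acc pop op' combine acc push
        ] [ op' acc push ] if
    ] each
    ! drop pointer/memory ops that net out to zero
    acc [ dup ptr? over mem? or [ n>> zero? ] [ drop f ] if ] reject ;
For example, the program +-> parses to mem 1, mem -1, ptr 1, which this pass collapses to just ptr 1.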
planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.