[ planet-factor ]

John Benediktsson: Factor Tutorial

Andrea Ferretti has posted a great tutorial about Factor!

From the announcement on the mailing list:

Factor has a lot of documentation in the listener, but I have
tried to cover some topics that are present in the official
docs, but scattered throughout it, so that they were not clear
to me at the beginning.

These include for instance:

- the central concept is function composition, the stack is more
of a detail
- how simple is to deploy program and scripts
- what tools are there: linter, inspector, unit testing support,
reverse lookup of function uses...
- what model of multithreading and async I/O are used
- how to make use of multiple cores
- in what sense Factor has an object system
and more

Check it out!

Sat, 22 Nov 2014 16:13:00

John Benediktsson: Factor 0.97 now available

"If birds can glide for long periods of time, then... why can’t I?" - Orville Wright

I'm very pleased to announce the release of Factor 0.97!

OS/CPUWindowsMac OS XLinux

Source code: 0.97

This release is brought to you with over 1,400 commits by the following individuals:

Alex Vondrak, Andrew Pennebaker, Benjamin Pollack, Björn Lindqvist, CW Alston, Doug Coleman, Erik Charlebois, Fred Alger, Iskander Sitdikov, John Benediktsson, Jon Harper, Loryn Jenkins, Paul Woolcock, Roc King, Samuel Tardieu, Steven Stewart-Gallus, and @Profpatsch

Besides some bug fixes and library improvements, I want to highlight the following changes:

Some possible backwards compatibility issues:

  • Fixed mask? in math.bitwise to be more correct
  • Fixed bias in Mersenne Twister random number generator
  • Better support for shebang (no longer need a space after #!)
  • io-error now lives in the libc vocabulary
  • sender stubs in cocoa.messages now named by method signature
  • filter now allocates length of seq, not exemplar.
  • Removed make-assoc in favor of explicit get's.

Some of the improvements to FUEL, Factor's emacs mode:

  • Modernize for emacs 24.3
  • Prepare FUEL to be uploaded to MELPA
  • Change font locking and syntax highlighting
  • Make fuel-help work for vocabularies also
  • New minor mode: fuel-autohelp-mode
  • Fix word help to use correct vocabulary using list
  • Variable controlling whether fuel-mode is loaded automatically
  • Fixes to table rendering

What is Factor

Factor is a concatenative, stack-based programming language with high-level features including dynamic types, extensible syntax, macros, and garbage collection. On a practical side, Factor has a full-featured library, supports many different platforms, and has been extensively documented.

The implementation is fully compiled for performance, while still supporting interactive development. Factor applications are portable between all common platforms. Factor can deploy stand-alone applications on all platforms. Full source code for the Factor project is available under a BSD license.

New libraries:

Improved libraries:

Mon, 3 Nov 2014 00:02:00

John Benediktsson: cURL

The cURL project is a command-line tool and library for transferring data using URL syntax supporting many (many!) protocols. I recently contributed a simple wrapper for libcurl to Factor and wanted to show a little bit about how it was made.

We have a download-to word that uses our HTTP client to download resources from the web. I wanted to show how to build a similar word to download resources using libcurl.


We will use the alien vocabulary to interface with the libcurl C library, defining words to initialize, perform a request, and cleanup


FUNCTION: CURL* curl_easy_init ( ) ;

FUNCTION: int curl_easy_perform ( CURL* curl ) ;

FUNCTION: void curl_easy_cleanup ( CURL* curl ) ;

Before we perform the request, we will want to set various options to control what request is made, using function aliases to allow passing different types of values based on the numeric key:

FUNCTION-ALIAS: curl_easy_setopt_long
int curl_easy_setopt ( CURL* curl, int option, long value ) ;

FUNCTION-ALIAS: curl_easy_setopt_string
int curl_easy_setopt ( CURL* curl, int option, c-string value )

FUNCTION-ALIAS: curl_easy_setopt_pointer
int curl_easy_setopt ( CURL* curl, int option, void* value ) ;

TYPEDEF: int64_t curl_off_t

FUNCTION-ALIAS: curl_easy_setopt_curl_off_t
int curl_easy_setopt ( CURL* curl, int option, curl_off_t value ) ;

: curl_easy_setopt ( curl option value -- code )
over enum>number {
{ [ dup 30000 > ] [ drop curl_easy_setopt_curl_off_t ] }
{ [ dup 20000 > ] [ drop curl_easy_setopt_pointer ] }
{ [ dup 10000 > ] [ drop curl_easy_setopt_string ] }
[ drop curl_easy_setopt_long ]
} cond ;


We can then begin to use libcurl in a few simple Factor words that allow us to present a nice interface to the user. Starting with initializing the library, and registering a destructor the cleanup after we are done:

DESTRUCTOR: curl_easy_cleanup

: curl-init ( -- CURL )
curl_easy_init &curl_easy_cleanup ;

Some of the functions produce an error code that we should check.


: check-code ( code -- )
CURLE_OK assert= ;

We can set options using the curl_easy_setopt words we defined earlier:

: curl-set-opt ( CURL key value -- )
curl_easy_setopt check-code ;

Using these we can set file (opening and registering a destructor to close) and URL options:



: curl-set-file ( CURL path -- )
CURLOPT_FILE swap "wb" fopen &fclose curl-set-opt ;

: curl-set-url ( CURL url -- )
CURLOPT_URL swap present curl-set-opt ;

And a word to perform the "curl":

: curl-perform ( CURL -- )
curl_easy_perform check-code ;

Putting all of that together, we can finally download a URL to a specified local file path:

: curl-download-to ( url path -- )
[ swap curl-set-file ]
[ swap curl-set-url ]
[ curl-perform ] tri
] with-destructors ;

Using it is pretty simple:

IN: scratchpad "http://factorcode.org" "/tmp/factor.html"

Thu, 23 Oct 2014 18:52:00

John Benediktsson: Quicksort

Sorting algorithms are a frequent element to computer science education, conversation amongst programmers, and job interviews. There are many different versions with varying tradeoffs of performance and technique.

I noticed that Rosetta Code has a page on Quicksort implementations. I thought it might make a nice example of translating pseudocode to Factor.

simple quicksort

The "simple quicksort algorithm" has the following pseudocode:

function quicksort(array)
less, equal, greater := three empty arrays
if length(array) > 1
pivot := select any element of array
for each x in array
if x < pivot then add x to less
if x = pivot then add x to equal
if x > pivot then add x to greater
array := concatenate(less, equal, greater)

We can copy it verbatim using the ability to have named local variables:

:: quicksort ( seq -- sorted-seq )
seq length 1 > [
V{ } clone :> less
V{ } clone :> equal
V{ } clone :> greater
seq first :> pivot
seq [| x |
x pivot <=> {
{ +lt+ [ x less push ] }
{ +eq+ [ x equal push ] }
{ +gt+ [ x greater push ] }
} case
] each
less quicksort equal greater quicksort 3append
] [ seq ] if ;

Even though local variables can be convenient, we discourage using them if library words or simpler concepts can express the same logic. Noticing that this partitions the sequence, and then joins the parts, we can make it a bit shorter using some available library words:

: quicksort ( seq -- sorted-seq )
dup empty? [
unclip [
'[ _ before? ] partition [ quicksort ] bi@
] keep prefix append
] unless ;

Neither of these is particularly fast, since they involve the creation of a lot of temporary sequences. There is a better (meaning faster and not really more complex) version available.

better quicksort

The "better quicksort algorithm" is an in-place version that uses swaps to move items into a sorted order. It has the following pseudocode:

function quicksort(array)
if length(array) > 1
pivot := select any element of array
left := first index of array
right := last index of array
while left ≤ right
while array[left] < pivot
left := left + 1
while array[right] > pivot
right := right - 1
if left ≤ right
swap array[left] with array[right]
left := left + 1
right := right - 1
quicksort(array from first index to right)
quicksort(array from left to last index)

We can take a similar translation approach to the first example (using some unsafe words to avoid bounds-checking and mutable local variables) to create this version:

:: (quicksort) ( seq from to -- )
from to < [
from to + 2/ seq nth-unsafe :> pivot
from :> left!
to :> right!

[ left right <= ] [
[ left seq nth-unsafe pivot before? ]
[ left 1 + left! ] while
[ right seq nth-unsafe pivot after? ]
[ right 1 - right! ] while
left right <= [
left right seq exchange-unsafe
left 1 + left!
right 1 - right!
] when
] while

seq from right (quicksort)
seq left to (quicksort)
] when ; inline recursive

: quicksort ( seq -- )
0 over length 1 - (quicksort) ;

This is faster, although about 3x slower than our current merge sort algorithm. There are probably ways we could make it faster (one I noticed and filed an issue to track that also makes merge sort faster).

I have committed a version of this in the sorting.quick vocabulary that I hope to use for faster in-place sorting in the standard library.

Wed, 25 Jun 2014 16:59:00

John Benediktsson: World Cup

Many people are watching the FIFA World Cup 2014 that is going on right now in Brazil. A few days ago, someone posted a gist for following the World Cup in six lines of Python 3. Several people tried to improve it, down to four lines, then down to one or two lines of code.

Without worrying too much about lines of code, here is something similar in Factor.

: worldcup. ( -- )
"http://worldcup.sfg.io/matches" http-get nip json>
[ "status" of "completed" = ] filter
[ "home_team" of ] [ "away_team" of ] bi
[ [ "country" of ] [ "goals" of ] bi ] bi@
"%s %s x %s %s\n" printf
] each ;

And if you run it, you'll get something like this:

IN: scratchpad worldcup.
Brazil 3 x Croatia 1
Mexico 1 x Cameroon 0
Spain 1 x Netherlands 5
Chile 3 x Australia 1
Colombia 3 x Greece 0
Ivory Coast 2 x Japan 1
Uruguay 1 x Costa Rica 3
England 1 x Italy 2
Switzerland 2 x Ecuador 1
France 3 x Honduras 0
Argentina 2 x Bosnia and Herzegovina 1
Iran 0 x Nigeria 0
Germany 4 x Portugal 0
Ghana 1 x USA 2
Belgium 2 x Algeria 1
Russia 1 x Korea Republic 1
Brazil 0 x Mexico 0
Cameroon 0 x Croatia 4
Spain 0 x Chile 2
Australia 2 x Netherlands 3
Colombia 2 x Ivory Coast 1
Japan 0 x Greece 0
Uruguay 2 x England 1
Italy 0 x Costa Rica 1
Switzerland 2 x France 5
Honduras 1 x Ecuador 2
Argentina 1 x Iran 0
Nigeria 1 x Bosnia and Herzegovina 0
Germany 2 x Ghana 2

Extra Credit

If we wanted to engineer this a bit more, we could start adding to the example.

First, we could define a tuple class to hold the result of each game. This isn't really necessary, but it can be nice to see all the fields that are available, and to represent it as an object rather than just a hashtable:

TUPLE: game home_team home_team_events home_team_tbd
away_team away_team_events away_team_tbd winner match_number
datetime location status ;

Then we could get all the game results as tuples, using from-slots to convert from an array of hashtable of attributes:

: worldcup ( -- games )
"http://worldcup.sfg.io/matches" http-get nip json>
[ game from-slots ] map ;

Next, having fun with colors, we use character styles to print the winner in bold green text.

CONSTANT: winner-style H{
{ foreground COLOR: MediumSeaGreen }
{ font-style bold }

And then, using more code than is probably necessary, we print out each team, making sure to format the winner using the style we just defined (using locals for convenience):

: game. ( game -- )
[ home_team>> ] [ away_team>> ] [ winner>> ] tri
:> ( home away winner )

home "country" of dup winner =
[ winner-style format ] [ write ] if bl
home "goals" of number>string write

" x " write

away "country" of dup winner =
[ winner-style format ] [ write ] if bl
away "goals" of number>string write nl
] ;

We want to see the completed games, so we can make a word to filter the list of games.

: completed-games ( games -- games' )
[ status>> "completed" = ] filter ;

Finally, putting all this together, we make one word to print out all the completed games:

: worldcup. ( -- )
worldcup completed-games [ game. ] each ;

The code for this is on my GitHub.

Sun, 22 Jun 2014 14:53:00

John Benediktsson: Filename Sanitization

I came across the Zaru project that provides filename sanitization for Ruby. You can learn a bit about filenames reading the article on Wikipedia. I thought it might be a nice feature to implement something like this for Factor.

The rules for sanitization are relatively simple, so I will list and then implement each one:

1. Certain characters generally don't mix well with certain file systems, so we filter them:

: filter-special ( str -- str' )
[ "/\\?*:|\"<>" member? not ] filter ;

2. ASCII control characters (0x00 to 0x1f) are not usually a good idea, either:

: filter-control ( str -- str' )
[ control? not ] filter ;

3. Unicode whitespace is trimmed from the beginning and end of the filename and collapsed to a single space within the filename:

: filter-blanks ( str -- str' )
[ blank? ] split-when harvest " " join ;

4. Certain filenames are reserved on Windows and are filtered (substituting a "file" placeholder name):

: filter-windows-reserved ( str -- str' )
dup >upper {
"CON" "PRN" "AUX" "NUL" "COM1" "COM2" "COM3" "COM4"
"COM5" "COM6" "COM7" "COM8" "COM9" "LPT1" "LPT2" "LPT3"
"LPT4" "LPT5" "LPT6" "LPT7" "LPT8" "LPT9"
} member? [ drop "file" ] when ;

5. Empty filenames are not allowed, replaced instead with file:

: filter-empty ( str -- str' )
[ "file" ] when-empty ;

6. Filenames that begin with only a "dot" character are replaced with file:

: filter-dots ( str -- str' )
dup first CHAR: . = [ "file" prepend ] when ;

Putting it all together, and requiring the filename to be no more than 255 characters:

: sanitize-path ( path -- path' )
255 short head ;

The code for this (and some tests) is on my GitHub.

Fri, 13 Jun 2014 21:03:00

John Benediktsson: Swift Ranges

Looking at the documentation for the Swift programming language recently released by Apple, I noticed they have support for integer ranges, similar to how the range objects work in Factor.

In Swift, you can get a range of the integers 2 through 6 by doing 2...6 and the integers 2 through 5 by doing 2..6. Notice the use of three or two dots to indicate whether the range includes the last number, or not, respectively.

I thought it would be fun to implement a similar syntax for Factor.

First, you can show that:

IN: scratchpad 2 6 [a,b) >array .
{ 2 3 4 5 }

IN: scratchpad 2 6 [a,b] >array .
{ 2 3 4 5 6 }

Similar to how we implemented fat arrows (also known as "pair rockets" or "hash rockets"), we can define the following syntax words:

SYNTAX: .. dup pop scan-object [a,b) suffix! ;

SYNTAX: ... dup pop scan-object [a,b] suffix! ;

And then use them:

IN: scratchpad 2 .. 6 >array .
{ 2 3 4 5 }

IN: scratchpad 2 ... 6 >array .
{ 2 3 4 5 6 }

Wed, 11 Jun 2014 01:11:00

John Benediktsson: Comparing k-NN in Factor

Recently a pair of blog posts compared implementations of a k-nearest neighbour (k-NN) classifier in F# and OCaml. Subsequently an implementation showing performance in Rust got my attention and I thought it might be nice to demonstrate a version in Factor.

The first OCaml version is 30 lines of code and takes 21 seconds on my laptop:

$ sloccount classifyDigits.ml
ml: 30 (100.00%)

$ time ./classifyDigits
Percentage correct:94.400000

real 0m21.292s
user 0m21.152s
sys 0m0.120s

The second OCaml version is 47 lines of code and takes 12 seconds:

$ sloccount classifyDigitsArray.ml
ml: 47 (100.00%)

$ time ./classifyDigitsArray
Percentage correct:94.400000

real 0m12.563s
user 0m12.434s
sys 0m0.120s

Note: I couldn't get the parallel version to run, but would assume it to have the same 2x speedup that the author saw.


It is often useful to start with the simplest possible code before trying to optimize for performance. I decided to parse the training and validation files (containing comma-separated values, the first of which is the label and the subsequent values are observations) into an array of arrays.

: slurp-file ( path -- {pixels,label} )
ascii file-lines rest [
"," split [ string>number ] map unclip 2array
] map ;

: classify ( training pixels -- label )
'[ first _ distance ] infimum-by second ;

: k-nn ( -- )
"~/trainingsample.csv" slurp-file
"~/validationsample.csv" slurp-file
[ [ first2 [ classify ] [ = ] bi* ] with count ]
[ length ] bi / 100.0 * "Percentage correct: %.1f\n" printf ;

You can see that it produces the desired output of 94.4% correct, and takes about 40 seconds on my laptop.

IN: scratchpad gc [ k-nn ] time
Percentage correct: 94.4
Running time: 40.283777984 seconds

Not too bad for 11 lines of simple code, but slower than it could be. Much of the performance penalty in this version is due to the large amount of generic dispatch, which is something we hope to reduce in future versions of Factor.


I noticed all the observed values were in the range [0-255], so thought a simple speedup might be to store them in a byte-array, and instead of using the builtin distance word, make my own that specializes on byte-arrays.

: slurp-file ( path -- {pixels,label} )
ascii file-lines rest [
"," split [ string>number ] B{ } map-as unclip 2array
] map ;

: distance ( x y -- z )
{ byte-array byte-array } declare 0 [ - sq + ] 2reduce ;

: classify ( training pixels -- label )
'[ first _ distance ] infimum-by second ;

: k-nn ( -- )
"~/trainingsample.csv" slurp-file
"~/validationsample.csv" slurp-file
[ [ first2 [ classify ] [ = ] bi* ] with count ]
[ length ] bi / 100.0 * "Percentage correct: %f\n" printf ;

With that simple change, we get 7x faster than our previous version and roughly as fast as the fastest parallel OCaml version!

IN: scratchpad gc [ k-nn ] time
Percentage correct: 94.400000
Running time: 5.708627884 seconds

The Rust version requires a nightly build and I haven't had a chance to test it, but I assume it is a bit faster, and discussions on r/rust, r/programming, and Hacker News show some fast versions in C++ and D as well.

The code for this is in my GitHub.

Tue, 10 Jun 2014 18:23:00


planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.