[ planet-factor ]

John Benediktsson: AnyBar

AnyBar is a macOS status indicator that displays a "colored dot" in the menu bar that can be changed programatically. What it means and when it changes is entirely up to the user.

You can easily install it with Homebrew-cask:

$ brew cask install anybar

The README lists a number of alternative clients in different programming languages. I thought it would be fun to show how to use it from Factor. Since AnyBar responds to AppleScript (and I added support for AppleScript a few years ago), we could do this:

USE: cocoa.apple-script

"tell application \"AnyBar\" to set image name to \"blue\""
run-apple-script

The AnyBar application also listens to a UDP port (default: 1738) and can be instructed to change from a Terminal using a simple echo | nc command:

$ echo -n "blue" | nc -4u -w0 localhost 1738

Using our networking words similarly is pretty simple:

"blue" >byte-array "127.0.0.1" 1738 <inet4> send-once

But if we wanted to get more fancy, we could use symbols to configure which AnyBar instance to send to, with default values to make it easy to use, and resolve-host to lookup hostnames:

SYMBOL: anybar-host
"localhost" anybar-host set-global

SYMBOL: anybar-port
1738 anybar-port set-global

: anybar ( str -- )
ascii encode
anybar-host get resolve-host first
anybar-port get with-port send-once ;

AnyBar is a neat little program!

Mon, 28 Nov 2016 20:58:00

John Benediktsson: Reverse Factorial

A few years ago, I wrote about implementing various factorials using Factor. Recently, I came across a programming challenge to implement a "reverse factorial" function to determine what factorial produces a number, or none if it is not a factorial.

To do this, we examine each factorial in order, checking against the number being tested:

: reverse-factorial ( m -- n )
1 1 [ 2over > ] [ 1 + [ * ] keep ] while [ = ] dip and ;

And some unit tests:

{ 10 } [ 3628800 reverse-factorial ] unit-test
{ 12 } [ 479001600 reverse-factorial ] unit-test
{ 3 } [ 6 reverse-factorial ] unit-test
{ f } [ 18 reverse-factorial ] unit-test

Sun, 27 Nov 2016 06:28:00

John Benediktsson: Gopher Server

A few days ago, I noticed a post about building a Gopher Server in Perl 6. I had already implemented a Gopher Client in Factor, and thought it might be fun to show a simple Gopher Server in Factor in around 50 lines of code.

Using the io.servers vocabulary, we will define a new multi-threaded server that has a directory to serve content from and hostname that it can be accessed at:

TUPLE: gopher-server < threaded-server
{ serving-hostname string }
{ serving-directory string } ;

When a file is requested, it can be streamed back to clients:

: send-file ( path -- )
binary [ [ write ] each-block ] with-file-reader ;

The Gopher protocol is defined in RFC 1436 and lists a few differentiated file types. We use the mime.types vocabulary to return the correct one.

: gopher-type ( entry -- type )
dup directory? [
drop "1"
] [
name>> mime-type {
{ [ dup "text/" head? ] [ drop "0" ] }
{ [ dup "image/gif" = ] [ drop "g" ] }
{ [ dup "image/" head? ] [ drop "I" ] }
[ drop "9" ]
} cond
] if ;

When a directory is requested, we can send a listing of all the sub-directories and files it contains, sending their relative path to the root directory being served so they can be requested properly by the client:

:: send-directory ( server path -- )
path [
[
[ gopher-type ] [ name>> ] bi
dup path prepend-path
server serving-directory>> ?head drop
server serving-hostname>>
server insecure>>
"%s%s\t%s\t%s\t%d\r\n" sprintf utf8 encode write
] each
] with-directory-entries ;

To know which path was requested, we read the line, split on the first tab, carriage return, or newline character we see:

: read-gopher-path ( -- path )
readln [ "\t\r\n" member? ] split1-when drop ;

With all of that built, we can now implement a word to handle a client request:

M: gopher-server handle-client*
dup serving-directory>> read-gopher-path append-path
dup file-info directory? [
send-directory
] [
send-file drop
] if flush ;

Initializing a gopher-server instance and providing a convenience word to start one:

: <gopher-server> ( directory port -- server )
utf8 gopher-server new-threaded-server
"gopher.server" >>name
swap >>insecure
binary >>encoding
"localhost" >>serving-hostname
swap resolve-symlinks >>serving-directory ;

: start-gopher-server ( directory port -- server )
<gopher-server> start-server ;

This is available in the gopher.server vocabulary with a few improvements such as:

  • Support for .gophermap files for alternate results when content is requested.
  • Support for .gopherhead files to print headers above directory listings.
  • Navigation to parent directories using .. links.
  • Display file modified timestamp and file sizes.
  • Improved error handling.

Thu, 27 Oct 2016 21:14:00

John Benediktsson: Cuckoo Filters

A Cuckoo filter is a Bloom filter replacement that allows for space-efficient probabilistic membership checks. Cuckoo filters provide the ability to add and remove items dynamically without significantly degrading space and performance. False positive rates are typically low.

This data structure is explained by Bin Fan, Dave Andersen, Michael Kaminsky, and Michael Mitzenmacher in two papers: Cuckoo Filter: Better Than Bloom and Cuckoo Filter: Practically Better Than Bloom. There is also an implementation in C++ that can be referred to.

The Cuckoo filter is basically a dense hash table that can support high load factors (up to 95%) without degraded performance. Instead of storing objects, we will store a hashed fingerprint.

Buckets

First, we need to create a number of buckets. Each bucket will hold 4 fingerprints. Load factors over 96% will cause us to grow our capacity to the next-power-of-2.

! The number of fingerprints to store in each bucket
CONSTANT: bucket-size 4

! The maximum load factor we allow before growing the capacity
CONSTANT: max-load-factor 0.96

: #buckets ( capacity -- #buckets )
[ bucket-size /i next-power-of-2 ] keep
over / bucket-size / max-load-factor > [ 2 * ] when ;

Making our buckets is then just an array of arrays:

: <cuckoo-buckets> ( capacity -- buckets )
#buckets [ bucket-size f <array> ] replicate ;

Given a fingerprint, we can check if it is in a bucket by calling member?:

: bucket-lookup ( fingerprint bucket -- ? )
member? ;

To insert a fingerprint into the bucket, we find the first empty slot and replace it with the fingerprint. We return a boolean value indicating if we were able to insert it or not:

: bucket-insert ( fingerprint bucket -- ? )
dup [ not ] find drop [ swap set-nth t ] [ 2drop f ] if* ;

To delete a fingerprint, we finding its index (if present) and set it to false.

: bucket-delete ( fingerprint bucket -- ? )
[ f ] 2dip [ index ] keep over [ set-nth t ] [ 3drop f ] if ;

If the bucket is full, we need to be able to swap a fingerprint into the bucket, replacing/removing an existing one:

: bucket-swap ( fingerprint bucket -- fingerprint' )
[ length random ] keep [ swap ] change-nth ;

Hashing

Our hashing strategy will be to generate the SHA-1 hash value for a given byte-array, splitting it into two 32-bit values (a 32-bit fingerprint, and a 32-bit index value). We will also generate an alternate index value as well using a constant from the MurmurHash to mix with the primary index:

: hash-index ( hash -- fingerprint index )
4 over <displaced-alien> [ uint deref ] bi@ ;

: alt-index ( fingerprint index -- alt-index )
[ 0x5bd1e995 w* ] [ bitxor ] bi* ;

: hash-indices ( bytes -- fingerprint index alt-index )
sha1 checksum-bytes hash-index 2dup alt-index ;

Insert/Lookup/Delete

Our Cuckoo filter holds our buckets:

TUPLE: cuckoo-filter buckets ;

: <cuckoo-filter> ( capacity -- cuckoo-filter )
<cuckoo-buckets> cuckoo-filter boa ;

To insert an item into the Cuckoo filter, we calculate its hash-indices and then try inserting it into the bucket specified by the first index, then the bucket specified by the second index. If those buckets are full, we go through a "kickdown" process to move fingerprints from other buckets until we find a bucket that has space, or exceed the maximum number of attempts:

! The maximum number of times we kick down items/displace from
! their buckets
CONSTANT: max-cuckoo-count 500

:: cuckoo-insert ( bytes cuckoo-filter -- ? )
bytes hash-indices :> ( fp! i1 i2 )
cuckoo-filter buckets>> :> buckets
buckets length :> n
{
[ fp i1 n mod buckets nth bucket-insert ]
[ fp i2 n mod buckets nth bucket-insert ]
} 0|| [
t
] [
cuckoo-filter checksum>> :> checksum
2 random zero? i1 i2 ? :> i!
max-cuckoo-count [
drop
fp i n mod buckets nth bucket-swap fp!
fp i alt-index i!

fp i n mod buckets nth bucket-insert
] find-integer >boolean
] if ;

To lookup an item, we calculate the hash-indices and then check the two buckets to see if the fingerprint can be found.

:: cuckoo-lookup ( bytes cuckoo-filter -- ? )
bytes hash-indices :> ( fp i1 i2 )
cuckoo-filter buckets>> :> buckets
buckets length :> n
{
[ fp i1 n mod buckets nth bucket-lookup ]
[ fp i2 n mod buckets nth bucket-lookup ]
} 0|| ;

To delete an item, we calculate the hash-indices and then try and remove it from the first index, or the second index if not found in the first bucket.

:: cuckoo-delete ( bytes cuckoo-filter -- ? )
bytes hash-indices :> ( fp i1 i2 )
cuckoo-filter buckets>> :> buckets
buckets length :> n
{
[ fp i1 n mod buckets nth bucket-delete ]
[ fp i2 n mod buckets nth bucket-delete ]
} 0|| ;

This is available in the cuckoo-filters vocabulary along with some tests, documentation, and a few extra features.

Tue, 9 Aug 2016 00:00:00

John Benediktsson: Backticks

Most languages support running arbitrary commands using something like the Linux system function. Often, this support has both quick-and-easy and full-featured-but-complex versions.

In Python, you can use os.system:

>>> os.system("ls -l")

In Ruby, you can use system as well as "backticks":

irb(main):001:0> system("ls -l")

irb(main):002:0> `ls -l`

Basically, the difference between "system" and "backticks" is:

  • "system" executes a command, returning the exit code of the process.
  • "backticks" executes a command, returning the standard output of the process.

Factor has extensive cross-platform support for launching processes, but I thought it would be fun to show how custom syntax can be created to implement "backticks", capturing and returning standard output from the process:

SYNTAX: `
"`" parse-multiline-string '[
_ utf8 [ contents ] with-process-reader
] append! ;

You can use this in a similar fashion to Ruby or Perl:

IN: scratchpad ` ls -l`
Note: This syntax currently requires a space after the leading backtick. In the future, we have plans for an improved lexer that removes this requirement.

This is available in the backticks vocabulary.

Fri, 15 Jul 2016 16:49:00

John Benediktsson: Clock Angles

Programming Praxis posted about calculating clock angles, specifically to:

Write a program that, given a time as hours and minutes (using a 12-hour clock), calculates the angle between the two hands. For instance, at 2:00 the angle is 60°.

Wikipedia has a page about clock angle problems that we can pull a few test cases from:

{ 0 } [ "12:00" clock-angle ] unit-test
{ 60 } [ "2:00" clock-angle ] unit-test
{ 180 } [ "6:00" clock-angle ] unit-test
{ 18 } [ "5:24" clock-angle ] unit-test
{ 50 } [ "2:20" clock-angle ] unit-test

The hour hand moves 360° in 12 hours and depends on the number of hours and minutes (properly handling midnight and noon to be ):

:: hour° ( hour minutes -- degrees )
hour [ 12 = 0 ] keep ? minutes 60 / + 360/12 * ;

The minute hand moves 360° in 60 minutes:

: minute° ( minutes -- degrees )
360/60 * ;

Using these words, we can calculate the clock angle from a time string:

: clock-angle ( string -- degrees )
":" split1 [ number>string ] bi@
[ hour° ] [ minute° ] bi - abs ;

Wed, 6 Jul 2016 23:37:00

John Benediktsson: left-pad

In the wake of an epic ragequit where Azer Koçulu removed all of his modules from npm (the node.js package manager), there have been so many entertaining discussions and explanations covering what happened.

Today, Programming Praxis posted the leftpad challenge, pointing out that the original solution ran in quadratic time due to it's use of character-by-character string concatenation (but not pointing out that it only works with strings).

First, the original code in Javascript:

function leftpad (str, len, ch) {
str = String(str);
var i = -1;
if (!ch && ch !== 0) ch = ' ';
len = len - str.length;
while (++i < len) {
str = ch + str;
}
return str;
}

Now, a (simpler? faster? more general?) version in Factor:

:: left-pad ( seq n elt -- newseq )
seq n seq length [-] elt <repetition> prepend ;

Using it, you can see it works:

IN: scratchpad "hello" 3 CHAR: h left-pad .
"hello"

IN: scratchpad "hello" 10 CHAR: h left-pad .
"hhhhhhello"

And it even works with other types of sequences:

IN: scratchpad { 1 2 3 } 3 0 left-pad .
{ 1 2 3 }

IN: scratchpad { 1 2 3 } 10 0 left-pad .
{ 0 0 0 0 0 0 0 1 2 3 }

I should also point out that Factor has pad-head that does this in the standard library and node.js has a pad-left module that solves the quadratic time problem (but still only works with strings).

Fri, 25 Mar 2016 16:39:00

John Benediktsson: ISBN

Most books are issued a unique International Standard Book Number (ISBN) number. Often different formats of the same book will have different ISBN numbers. On a print book, you can usually find the ISBN on a barcode on the back cover.

Most countries seem to have a national ISBN registration agency. In some countries this is a free service provided by a government agency. In other countries, this is operated by a commercial entity. In the United States, one company (R.R. Bowker LLC) has an apparent monopoly on issuing ISBN numbers which can cost $125 for one (less if you buy in bulk).

The ISBN is 13 digits long if assigned starting in 2007, and 10 digits long if assigned before 2007. Each ISBN contains a check digit which is used for basic error detection We are going to build a few words in Factor to calculate the check digits and validate ISBNs.

We need to turn an ISBN (which might include spaces or dashes) into numeric digits:

: digits ( str -- digits )
[ digit? ] filter string>digits ;

For ISBN-10, the check digit is the sum of each of 10 digits multiplied by a weight (descending from 10 to 1) modulo 11.

: isbn-10-check ( digits -- n )
0 [ 10 swap - * + ] reduce-index 11 mod ;

For ISBN-13, the check digit is the sum of each of 13 digits multiplied by a weight (alternating between 1 and 3) modulo 10.

: isbn-13-check ( digits -- n )
0 [ even? 1 3 ? * + ] reduce-index 10 mod ;

We can validate an ISBN by grabbing the digits and running either the ISBN-10 or ISBN-13 check and verifying that the result is zero.

: valid-isbn? ( str -- ? )
digits dup length {
{ 10 [ isbn-10-check ] }
{ 13 [ isbn-13-check ] }
} case 0 = ;

The code (and some tests) for this is on my GitHub.

Sat, 19 Sep 2015 21:36:00

Blogroll


planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.

Syndicate