[ planet-factor ]

John Benediktsson: Reverse Vowels

Our task today is to “reverse vowels of a string”. This sounds like (and probably is) a coding interview question as well as a LeetCode problem, a Codewars kata, and the second task in the Perl Weekly Challenge #254.

If you don’t want spoilers, maybe stop reading here!


We are going to use Factor to solve this problem as well as a variant that is a bit more challenging.

Let’s Reverse The Vowels

One of the benefits of the monorepo approach that we have taken to building the extensive Factor standard library is developing higher-level words that solve specific kind of tasks.

One of those is arg-where – currently in the miscellaneous sequences.extras vocabulary – which we can use to find all the indices in a string that contain a vowel?:

IN: scratchpad "hello" [ vowel? ] arg-where .
V{ 1 4 }

We’ll want to group the beginning and ending indices, ignoring the middle index if the number of indices is odd since it would not change:

: split-indices ( indices -- head tail )
    dup length 2/ [ head-slice ] [ tail-slice* ] 2bi ;

We can then build a word to reverse specified indices:

: reverse-indices ( str indices -- str )
    split-indices <reversed> [ pick exchange ] 2each ;

And then use it to reverse the vowels:

: reverse-vowels ( str -- str )
    dup >lower [ vowel? ] arg-where reverse-indices ;

And see how it works:

IN: scratchpad "factor" reverse-vowels .
"foctar"

IN: scratchpad "concatenative" reverse-vowels .
"cencitanetavo"

Pretty cool!

Let’s Reverse The Vowels, Maintain The Case

A somewhat more challenging task is to reverse the vowels, and to swap their letter case.

Let’s start by building a word to swap the case of two letters:

: swap-case ( a b -- a' b' )
    2dup [ letter? ] bi@ 2array {
        { { t f } [ [ ch>upper ] [ ch>lower ] bi* ] }
        { { f t } [ [ ch>lower ] [ ch>upper ] bi* ] }
        [ drop ]
    } case ;

And then another word to exchange two indices, but also swap their case:

: exchange-case ( i j seq -- )
    [ '[ _ nth ] bi@ swap-case ]
    [ '[ _ set-nth ] bi@ ] 3bi ; inline

A word to reverse the indices, but also swap their case:

: reverse-indices-case ( str indices -- str )
    split-indices <reversed> [ pick exchange-case ] 2each ;

And, finally, a word to reverse the vowels, but also swap their case:

: reverse-vowels-case ( str -- str )
    dup >lower [ vowel? ] arg-where reverse-indices-case ;

And then see how it works:

IN: scratchpad "FActor" reverse-vowels-case .
"FOctar"

A pretty fun problem!

Mon, 12 Feb 2024 19:00:00

John Benediktsson: Dragonbox

One of the challenging problems in computer science is to efficiently take a binary representation of a floating-point number and convert it to the “shortest decimal representation” that will roundtrip back to the same floating-point number when it is parsed.

A few days ago, one of the members of the Factor Discord server posted about an issue they were having where three separate floating-point numbers printed as the same decimal value:

IN: scratchpad 0x1.1ffffffffffffp7 .
144.0

IN: scratchpad 0x1.2p7 .
144.0

IN: scratchpad 0x1.2000000000001p7 .
144.0

Well, that’s not ideal!

And you can see that in other languages like Python, they parse properly into three distinct values:

>>> float.fromhex('0x1.1ffffffffffffp7')
143.99999999999997

>>> float.fromhex('0x1.2p7')
144.0

>>> float.fromhex('0x1.2000000000001p7')
144.00000000000003

In the process of investigating this issue, I re-discovered a few algorithms that have been developed to do this. There is a neat project called Drachennest that investigates the relative performance of several of these algorithms and claims:

Grisu3, Ryu, Schubfach, and Dragonbox are optimal, i.e. the output string

  1. rounds back to the input number when read in,
  2. is as short as possible,
  3. is as close to the input number as possible.

Well, it turns out that the “Dragonbox” algorithm is one of the current best and is described in a paper called A New Floating-Point Binary-to-Decimal Conversion Algorithm as well as a fantastic reference implementation of Dragonbox in C++.

I was able to quickly fix the bug by temporarily using a “modern formatting library” called {fmt} that works in C++11 and provides a version of the C++20 function std::format, but thought it would be a good idea to implement the Dragonbox algorithm someday in pure Factor code and filed an issue to track that idea.

Well, one of our awesome contributors, Giftpflanze, jumped in and implemented Dragonbox in Factor – providing a very readable and understandable and nicely concatenative version – and it was merged today!

Not only does this solve the issue of decimal representation of floats, but it provides quite a large speedup to our float-parsing benchmark:

Currently in Factor 0.99:

IN: scratchpad gc [ parse-float-benchmark ] time
Running time: 3.181906583 seconds

And now after the patch:

IN: scratchpad gc [ parse-float-benchmark ] time
Running time: 0.378132792 seconds

Very impressive!

I’m excited to say that this is now available in the development version of Factor.

Mon, 12 Feb 2024 02:30:00

John Benediktsson: Divmods

There’s a discussion on support multiple divisors in divmod() on the Python.

So instead of

minutes, seconds = divmod(t, 60)
hours, minutes = divmod(minutes, 60)
days, hours = divmod(hours, 24)
weeks, days = divmod(days, 7)

you could write:

weeks, days, hours, minutes, seconds = divmod(t, 7, 24, 60, 60)

Sample implementation:

def new_divmod(dividend, *divisors):
    if not divisors:
        raise TypeError('required at least one divisor')
    remainders = []
    for divisor in reversed(divisors):
        dividend, remainder = old_divmod(dividend, divisor)
        remainders.append(remainder)
    return (dividend, *remainders[::-1])

Along with the sample implementation in Python above, the original author provides some thoughts on whether the order of arguments should be reversed or not, and some of the comments in the thread discuss various implementation details and some other use-cases for this approach.

You can see how it might work by trying the code:

>>> new_divmod(1234567, 7, 24, 60, 60)
(2, 0, 6, 56, 7)

Okay, so how might we do this in Factor?

Well, our version of divmod is /mod and we could just run it a few times to get the result:

IN: scratchpad 1234567 60 /mod swap 60 /mod swap 24 /mod swap 7 /mod swap

--- Data stack:
7
56
6
0
2

Alternatively, we could pass the arguments as a sequence and return the result as a sequence:

IN: scratchpad 1234567 { 60 60 24 7 } [ /mod ] map swap suffix

--- Data stack:
{ 7 56 6 0 2 }

Or, perhaps, we could make a macro, taking the input argument as a sequence, but generating code to put the result onto the stack:

MACRO: /mods ( seq -- quot )
    [ '[ _ /mod swap ] ] map concat ;

And then use it:

IN: scratchpad 1234567 { 60 60 24 7 } /mods

--- Data stack:
7
56
6
0
2

Kind of an interesting idea!

Fri, 2 Feb 2024 15:00:00

John Benediktsson: Crontab

Cron might be the latest, greatest, and coolest “next-generation calendar” as well as now a product called Notion Calendar. But in the good ol’ days, cron was instead known as:

The cron command-line utility is a job scheduler on Unix-like operating systems. Users who set up and maintain software environments use cron to schedule jobs (commands or shell scripts), also known as cron jobs, to run periodically at fixed times, dates, or intervals. It typically automates system maintenance or administration—though its general-purpose nature makes it useful for things like downloading files from the Internet and downloading email at regular intervals.

There are implementations of crond – the cron daemon – on most operating systems. Many of them have standardized on a crontab format that looks something like this:

# ┌───────────── minute (0–59)
# │ ┌───────────── hour (0–23)
# │ │ ┌───────────── day of the month (1–31)
# │ │ │ ┌───────────── month (1–12)
# │ │ │ │ ┌───────────── day of the week (0–6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * <command to execute>

At first (and sometimes second and third and fourth) glance, this looks a bit inscrutable, and so websites such as crontab guru pop up to help you unpack and explain when a cronentry is expected to be run.

I thought it would be fun to build a parser for these cronentries in Factor.

Let’s start by defining a cronentry type:

TUPLE: cronentry minutes hours days months days-of-week command ;

For each component, there is a variety of allowed inputs:

  • all values in the range: *
  • list of values: 3,5,7
  • range of values: 10-15
  • step values: 1-20/5
  • random value in range: 10~30

We build a parse-value word that will take an input string, a quot to parse the input, and a seq of possible values, as well as a parse-range word to help with optional starting and ending input values.

:: parse-range ( from/f to/f quot: ( input -- value ) seq -- from to )
    from/f [ seq first ] quot if-empty
    to/f [ seq last ] quot if-empty ; inline

:: parse-value ( input quot: ( input -- value ) seq -- value )
    input {
        { [ dup "*" = ] [ drop seq ] }

        { [ CHAR: , over member? ] [
            "," split [ quot seq parse-value ] map concat ] }

        { [ CHAR: / over member? ] [
            "/" split1 [
                quot seq parse-value dup length 1 =
                [ seq swap first seq index seq length ]
                [ 0 over length ] if 1 -
            ] dip string>number <range> swap nths ] }

        { [ CHAR: - over member? ] [
            "-" split1 quot seq parse-range [a..b] ] }

        { [ CHAR: ~ over member? ] [
            "~" split1 quot seq parse-range [a..b] random 1array ] }

        [ quot call 1array ]
    } cond members sort ; inline recursive

We can then make parse-cronentry to parse the entry description, handling days and months differently to allow their abbreviations to be passed as input (e.g., sun for Sunday or jan for January).

: parse-day ( str -- n )
    [ string>number dup 7 = [ drop 0 ] when ] [
        >lower $[ day-abbreviations3 [ >lower ] map ] index
    ] ?unless ;

: parse-month ( str -- n )
    [ string>number ] [
        >lower $[ month-abbreviations [ >lower ] map ] index
    ] ?unless ;

: parse-cronentry ( entry -- cronentry )
    " " split1 " " split1 " " split1 " " split1 " " split1 {
        [ [ string>number ] T{ range f 0 60 1 } parse-value ]
        [ [ string>number ] T{ range f 0 24 1 } parse-value ]
        [ [ string>number ] T{ range f 1 31 1 } parse-value ]
        [ [ parse-month ] T{ range f 1 12 1 } parse-value ]
        [ [ parse-day ] T{ circular f T{ range f 0 7 1 } 1 } parse-value ]
        [ ]
    } spread cronentry boa ;

We can try using it to see what a parsed cronentry looks like:

IN: scratchpad "20-30/5 5 */5 * * /path/to/command" parse-cronentry .
T{ cronentry
    { minutes { 20 25 30 } }
    { hours { 5 } }
    { days { 1 6 11 16 21 26 31 } }
    { months { 1 2 3 4 5 6 7 8 9 10 11 12 } }
    { days-of-week { 0 1 2 3 4 5 6 } }
    { command "/path/to/command" }
}

Now that we have that working, we can use it to calculate the next-time-after a given timestamp that the cronentry will trigger, applying a waterfall to rollover the timestamp until a valid one is found:

:: (next-time-after) ( cronentry timestamp -- )

    f ! should we keep searching for a matching time

    timestamp month>> :> month
    cronentry months>> [ month >= ] find nip
    dup month = [ drop ] [
        [ cronentry months>> first timestamp 1 +year drop ] unless*
        timestamp 1 >>day 0 >>hour 0 >>minute month<< drop t
    ] if

    timestamp day-of-week :> weekday
    cronentry days-of-week>> [ weekday >= ] find nip [
        cronentry days-of-week>> first 7 +
    ] unless* weekday - :> days-to-weekday

    timestamp day>> :> day
    cronentry days>> [ day >= ] find nip [
        cronentry days>> first timestamp days-in-month +
    ] unless* day - :> days-to-day

    cronentry days-of-week>> length 7 =
    cronentry days>> length 31 = 2array
    {
        { { f t } [ days-to-weekday ] }
        { { t f } [ days-to-day ] }
        [ drop days-to-weekday days-to-day min ]
    } case [
        timestamp 0 >>hour 0 >>minute swap +day 2drop t
    ] unless-zero

    timestamp hour>> :> hour
    cronentry hours>> [ hour >= ] find nip
    dup hour = [ drop ] [
        [ cronentry hours>> first timestamp 1 +day drop ] unless*
        timestamp 0 >>minute hour<< drop t
    ] if

    timestamp minute>> :> minute
    cronentry minutes>> [ minute >= ] find nip
    dup minute = [ drop ] [
        [ cronentry minutes>> first timestamp 1 +hour drop ] unless*
        timestamp minute<< drop t
    ] if

    [ cronentry timestamp (next-time-after) ] when ;

: next-time-after ( cronentry timestamp -- timestamp )
    [ dup cronentry? [ parse-cronentry ] unless ]
    [ 1 minutes time+ 0 >>second ] bi*
    [ (next-time-after) ] keep ;

This is great, because we can find the next time that a cronentry will trigger. For example, if we wanted to specify something to trigger at midnight on every leap day:

IN: scratchpad "0 0 29 2 *" now next-time-after timestamp>rfc822 .
"Thu, 29 Feb 2024 00:00:00 -0800"

Or even, the next several times that the cronentry will trigger:

IN: scratchpad "0 0 29 2 *" now 5 [
                   dupd next-time-after [ timestamp>rfc822 . ] keep
               ] times 2drop
"Thu, 29 Feb 2024 00:00:00 -0800"
"Tue, 29 Feb 2028 00:00:00 -0800"
"Sun, 29 Feb 2032 00:00:00 -0800"
"Fri, 29 Feb 2036 00:00:00 -0800"
"Wed, 29 Feb 2040 00:00:00 -0800"

This is available in the crontab vocabulary including some features such as support for aliases (e.g., @daily and @weekly) and some higher-level words for working with crontabs and cronentries.

Wed, 31 Jan 2024 15:00:00

John Benediktsson: Codewars

Codewars is an online platform for learning programming languages by solving small programming exercises called “kata” and subsequently increasing your degree of proficiency via levels of “kyu”. It has useful features such as extensive unit tests, leaderboards, allies for allowing friendly competition, and discussion boards.

It supports an incredible number of programming languages – albeit some of these are in “beta” status – including Factor!

I wanted to draw attention to the Codewars website and point out that it has newly released support for Factor 0.99 due to great community support and some work on the Codewars test vocabulary that was developed specifically for use with the Codewars system.

It’s pretty fun to complete the katas and then see the solutions that other users have contributed.

Give it a try!

Mon, 29 Jan 2024 15:00:00

John Benediktsson: Special Numbers

Lots of numbers are special in various definitions of specialness. This often forms the basis of different programming challenges. In the case of the most recent Perl Weekly Challenge #252, the problem statement declares that a number is “special” in this way:

You are given an array of integers, @ints.

Write a script to find the sum of the squares of all special elements of the given array.

An element $int[i] of @ints is called special if i divides n, i.e. n % i == 0, where n is the length of the given array. Also the array is 1-indexed for the task.

And it gives two examples, which we can use as test cases later when we solve this in Factor.

Spoiler Alert: This weekly challenge deadline is due in a few days from now (on January 21, 2024 at 23:59). This blog post provides some solutions to this challenge. Please don’t read on if you intend to complete the challenge on your own.

Solution

Let’s find the special indices – which are just the divisors of the length of the input sequence – and then take the elements at those special indices:

: special-numbers ( ints -- ints' )
    [ length divisors 1 v-n ] [ nths ] bi ;

And so, we can solve this problem for both provided examples:

{ 21 } [ { 1 2 3 4 } special-numbers sum-of-squares ] unit-test

{ 63 } [ { 2 7 1 19 18 3 } special-numbers sum-of-squares ] unit-test

And a “script”, if we wanted to take input from the command-line, as requested:

MAIN: [
    [ readln ] [
        split-words harvest [ string>number ] map
        dup special-numbers sum-of-squares
        "%u => %u\n" printf
    ] while*
]

Mon, 15 Jan 2024 15:00:00

John Benediktsson: Building Hangman

Recently, Jon Fincher published an interesting Python tutorial describing steps to build a hangman game for the command line in Python. It provides for a nice demo of different programming language features including taking user input, printing to the screen, storing game state, and performing some logic until the game is completed.

A game in progress might look like this – with hangman as the word being guessed:


      ┌───┐
      │   │
      O   │
     ─┼─  │
    / │   │
      │   │
   └──────┘

Your word is: _ a n _ _ a n
Your guesses: a e r s n

I thought it would be fun to show how to build a similar hangman game in Factor, using similar steps.

Step 1: Set Up the Hangman Project

We need to create the hangman vocabulary to store our work. We can use the scaffold-vocab word to create a new vocabulary. It will prompt for which vocab-root to place the new vocabulary into.

IN: scratchpad USE: tools.scaffold

IN: scratchpad "hangman" scaffold-vocab

And then open the vocab in your favorite text editor:

IN: scratchpad "hangman" edit-vocab

We can start the file with all these imports, which we will be using in the implementation below and two symbols that we will use to hold the state of our game.

USING: combinators.short-circuit io io.encodings.utf8 io.files
kernel make math multiline namespaces random sequences
sequences.interleaved sets sorting unicode ;

IN: hangman

SYMBOL: target-word  ! the word being guessed
SYMBOL: guesses      ! all of the guessed letters

Step 2: Select a Word to Guess

Let’s create a vocab:hangman/words.txt file containing all of the possible word choices. The original tutorial had a list of words that you can reference – you are welcome to copy my file or create your own:

IN: scratchpad USE: http.client

IN: scratchpad "https://raw.githubusercontent.com/mrjbq7/re-factor/master/hangman/words.txt"
               "vocab:hangman/words.txt" download-to

Now we can add this word to read the file into memory and then choose a random line.

: random-word ( -- word )
    "vocab:hangman/words.txt" utf8 file-lines random ;

Step 3: Get and Validate the Player’s Input

The user will use the readln word to provide input, and we will validate it by making sure the line contains a single character that is not already in our guesses:

: valid-guess? ( input -- ? )
    {
        [ length 1 = ]
        [ lower? ]
        [ first guesses get ?adjoin ]
    } 1&& ;

Reading the player input is then just looping until we get a valid guess:

: player-guess ( -- ch )
    f [ dup valid-guess? ] [ drop readln ] do until first ;

Step 4: Display the Guessed Letters and Word

We can display the guessed letters as a sorted, space-separated list:

: spaces ( str -- str' )
    CHAR: \s <interleaved> ;

: guessed-letters ( -- str )
    guesses get members sort spaces ;

And the target word is also space-separated with blanks for letters we have not guessed, or the actual letters if they have been guessed successfully:

: guessed-word ( -- str )
    target-word get guesses get '[
        dup _ in? [ drop CHAR: _ ] unless
    ] map spaces ;

Step 5: Draw the Hanged Man

We first calculate the number of wrong guesses, by set difference between the guesses and our target word:

: #wrong-guesses ( -- n )
    guesses get target-word get diff cardinality ;

Displaying the “hanged man” requires a bit more lines of code that the rest of the program, using the number of wrong guesses to pick which to output:

CONSTANT: HANGED-MAN {
[[
      ┌───┐
      │   │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
     ─┼─  │
      │   │
      │   │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
     ─┼─  │
    / │   │
      │   │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
     ─┼─  │
    / \ │
      │   │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
     ─┼─  │
    / \ │
      │   │
     ─┴─  │
    /     │     │
   └──────┘
]] [[
      ┌───┐
      │   │
      O   │
     ─┼─  │
    / \ │
      │   │
     ─┴─  │
    /   \ │
    │   │ │
   └──────┘
]]
}

: hanged-man. ( -- )
    #wrong-guesses HANGED-MAN nth print ;

Step 6: Figure Out When the Game Is Over

The game is lost when the player has too many wrong guesses:

: lose? ( -- ? )
    #wrong-guesses HANGED-MAN length 1 - >= ;

The game is won when the word has no unknown letters:

: win? ( -- ? )
    target-word get guesses get diff null? ;

And the game is over when it is won or lost:

: game-over? ( -- ? )
    { [ win? ] [ lose? ] } 0|| ;

Step 7: Run the Game Loop

It is frequently useful in Factor to build helper words that, for example, set up some of the state that our program will use and then run a provided quotation:

: with-hangman ( quot -- )
    [
        random-word target-word ,,
        HS{ } clone guesses ,,
    ] H{ } make swap with-variables ; inline

And then we can use that to build and run the game:

: play-hangman ( -- )
    [
        "Welcome to Hangman!" print

        [ game-over? ] [
            hanged-man.
            "Your word is: " write guessed-word print
            "Your guesses: " write guessed-letters print

            nl "What is your guess? " write flush

            player-guess target-word get in?
            "Great guess!" "Sorry, it's not there." ? print
        ] until

        hanged-man.
        lose? "Sorry, you lost!" "Congrats! You did it!" ? print
        "Your word was: " write target-word get print
    ] with-hangman ;

One last thing we can do is set this word as the main entry point of our vocabulary:

MAIN: play-hangman

Next Steps

Well, that’s kind of fun. You can run this in the listener:

IN: scratchpad "hangman" run

Or at the command-line:

$ ./factor -run=hangman

The source code is available on my GitHub.

Tue, 26 Dec 2023 15:00:00

John Benediktsson: JavaScript Arrays

JSON or “JavaScript Object Notation” is widely used as a data format for storing, transmitting, and retrieving data objects. It is language-independent and has parsers in most modern programming languages. Factor is no exception, containing the json vocabulary.

I wanted to go over some wtfjs that relates to JavaScript Arrays and JavaScript Objects, and then see how something similar might work in Factor!

In JavaScript

You can define an array:

const person = ["John", "Doe", 46];

Or you can define an object:

const person = {firstName: "John", lastName: "Doe", age: 46};

You can start with an array and set it’s values by index:

const person = [];
person[0] = "John";
person[1] = "Doe";
person[2] = 46;

You can start with an object and set it’s values by key:

const person = {};
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;

Or, using “dot notation” on an object:

const person = {};
person.firstName = "John";
person.lastName = "Doe";
person.age = 46;

In JavaScript, arrays are indexed by number, and objects are indexed by name. But, you can mix these and create association arrays that can be… both?

> const list = ["A", "B", "C"];
> list.length
3

> list[0]
"A"

> list["0"]
"A"

> list.key = "value";
> list.length
3

> list.key
"value"

> Object.assign({}, list);
{0: "A", 1: "B", 2: "C", key: "value"}

That’s kinda weird.

In Factor

Maybe Factor needs something like that? What if we define a type that has both a sequence and an assoc, and supports both the sequence protocol and the assoc protocol. How might that look?

TUPLE: js-array seq assoc ;

: <js-array> ( -- js-array )
    V{ } clone H{ } clone js-array boa ;

INSTANCE: js-array sequence

CONSULT: sequence-protocol js-array seq>> ;

INSTANCE: js-array assoc

CONSULT: assoc-protocol js-array assoc>> ;

And now we can do something kinda similar:

IN: scratchpad <js-array>
               { "A" "B" "C" } [ suffix! ] each
               "value" "key" pick set-at

IN: scratchpad dup first .
"A"

IN: scratchpad dup length .
3

IN: scratchpad dup members .
V{ "A" "B" "C" }

IN: scratchpad dup >alist .
{ { "key" "value" } }

IN: scratchpad "key" of .
"value"

Well, it doesn’t handle converting string keys so that "0" of would return the same value as first. And it doesn’t handle combining all the number indexed keys and name indexed keys to an alist output like the Object.assign({}, ...) call above. And probably a few other idiosyncrasies of the JavaScript association array that I’m not familiar with…

But do we like this? I dunno yet.

Wed, 22 Nov 2023 15:00:00

Blogroll


planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.

Syndicate