[ planet-factor ]

John Benediktsson: Base256Emoji

While looking into the multibase group of self-identifying base encodings, I discovered Base256Emoji which is an encoding format described by an emoji to use for each byte of an input buffer. This spec is implemented, for example, in both Go and Rust

Despite replacing each byte with a typically 3-byte or 4-byte UTF-8 sequence – which is unusual for byte encodings (they often seek to reduce the length of an input sequence) – there are some nice use cases.

We’re going to implement this in Factor and then discuss some variants.

Implementation

First, we define a 256 item sequence of emojis, one for each byte:

CONSTANT: base256>emoji "๐Ÿš€๐Ÿชโ˜„๐Ÿ›ฐ๐ŸŒŒ๐ŸŒ‘๐ŸŒ’๐ŸŒ“๐ŸŒ”๐ŸŒ•๐ŸŒ–๐ŸŒ—๐ŸŒ˜๐ŸŒ๐ŸŒ๐ŸŒŽ๐Ÿ‰โ˜€๐Ÿ’ป๐Ÿ–ฅ\
๐Ÿ’พ๐Ÿ’ฟ๐Ÿ˜‚โค๐Ÿ˜๐Ÿคฃ๐Ÿ˜Š๐Ÿ™๐Ÿ’•๐Ÿ˜ญ๐Ÿ˜˜๐Ÿ‘๐Ÿ˜…๐Ÿ‘๐Ÿ˜๐Ÿ”ฅ๐Ÿฅฐ๐Ÿ’”๐Ÿ’–๐Ÿ’™๐Ÿ˜ข๐Ÿค”๐Ÿ˜†๐Ÿ™„๐Ÿ’ช๐Ÿ˜‰โ˜บ๐Ÿ‘Œ๐Ÿค—๐Ÿ’œ๐Ÿ˜”๐Ÿ˜Ž๐Ÿ˜‡\
๐ŸŒน๐Ÿคฆ๐ŸŽ‰๐Ÿ’žโœŒโœจ๐Ÿคท๐Ÿ˜ฑ๐Ÿ˜Œ๐ŸŒธ๐Ÿ™Œ๐Ÿ˜‹๐Ÿ’—๐Ÿ’š๐Ÿ˜๐Ÿ’›๐Ÿ™‚๐Ÿ’“๐Ÿคฉ๐Ÿ˜„๐Ÿ˜€๐Ÿ–ค๐Ÿ˜ƒ๐Ÿ’ฏ๐Ÿ™ˆ๐Ÿ‘‡๐ŸŽถ๐Ÿ˜’๐Ÿคญโฃ๐Ÿ˜œ๐Ÿ’‹\
๐Ÿ‘€๐Ÿ˜ช๐Ÿ˜‘๐Ÿ’ฅ๐Ÿ™‹๐Ÿ˜ž๐Ÿ˜ฉ๐Ÿ˜ก๐Ÿคช๐Ÿ‘Š๐Ÿฅณ๐Ÿ˜ฅ๐Ÿคค๐Ÿ‘‰๐Ÿ’ƒ๐Ÿ˜ณโœ‹๐Ÿ˜š๐Ÿ˜๐Ÿ˜ด๐ŸŒŸ๐Ÿ˜ฌ๐Ÿ™ƒ๐Ÿ€๐ŸŒท๐Ÿ˜ป๐Ÿ˜“โญโœ…๐Ÿฅบ๐ŸŒˆ๐Ÿ˜ˆ\
๐Ÿค˜๐Ÿ’ฆโœ”๐Ÿ˜ฃ๐Ÿƒ๐Ÿ’โ˜น๐ŸŽŠ๐Ÿ’˜๐Ÿ˜ โ˜๐Ÿ˜•๐ŸŒบ๐ŸŽ‚๐ŸŒป๐Ÿ˜๐Ÿ–•๐Ÿ’๐Ÿ™Š๐Ÿ˜น๐Ÿ—ฃ๐Ÿ’ซ๐Ÿ’€๐Ÿ‘‘๐ŸŽต๐Ÿคž๐Ÿ˜›๐Ÿ”ด๐Ÿ˜ค๐ŸŒผ๐Ÿ˜ซโšฝ๐Ÿค™\
โ˜•๐Ÿ†๐Ÿคซ๐Ÿ‘ˆ๐Ÿ˜ฎ๐Ÿ™†๐Ÿป๐Ÿƒ๐Ÿถ๐Ÿ’๐Ÿ˜ฒ๐ŸŒฟ๐Ÿงก๐ŸŽโšก๐ŸŒž๐ŸŽˆโŒโœŠ๐Ÿ‘‹๐Ÿ˜ฐ๐Ÿคจ๐Ÿ˜ถ๐Ÿค๐Ÿšถ๐Ÿ’ฐ๐Ÿ“๐Ÿ’ข๐ŸคŸ๐Ÿ™๐Ÿšจ๐Ÿ’จ\
๐Ÿคฌโœˆ๐ŸŽ€๐Ÿบ๐Ÿค“๐Ÿ˜™๐Ÿ’Ÿ๐ŸŒฑ๐Ÿ˜–๐Ÿ‘ถ๐Ÿฅดโ–ถโžกโ“๐Ÿ’Ž๐Ÿ’ธโฌ‡๐Ÿ˜จ๐ŸŒš๐Ÿฆ‹๐Ÿ˜ท๐Ÿ•บโš ๐Ÿ™…๐Ÿ˜Ÿ๐Ÿ˜ต๐Ÿ‘Ž๐Ÿคฒ๐Ÿค ๐Ÿคง๐Ÿ“Œ๐Ÿ”ต๐Ÿ’…๐Ÿง\
๐Ÿพ๐Ÿ’๐Ÿ˜—๐Ÿค‘๐ŸŒŠ๐Ÿคฏ๐Ÿทโ˜Ž๐Ÿ’ง๐Ÿ˜ฏ๐Ÿ’†๐Ÿ‘†๐ŸŽค๐Ÿ™‡๐Ÿ‘โ„๐ŸŒด๐Ÿ’ฃ๐Ÿธ๐Ÿ’Œ๐Ÿ“๐Ÿฅ€๐Ÿคข๐Ÿ‘…๐Ÿ’ก๐Ÿ’ฉ๐Ÿ‘๐Ÿ“ธ๐Ÿ‘ป๐Ÿค๐Ÿคฎ๐ŸŽผ๐Ÿฅต\
๐Ÿšฉ๐ŸŽ๐ŸŠ๐Ÿ‘ผ๐Ÿ’๐Ÿ“ฃ๐Ÿฅ‚"

And then we compute the reverse mapping:

CONSTANT: emoji>base256 $[ base256>emoji H{ } zip-index-as ]

With those two data blocks, we can define words to convert into and out of base256emoji:

: >base256emoji ( bytes -- str )
    [ base256>emoji nth ] "" map-as ;

: base256emoji> ( str -- bytes )
    [ emoji>base256 at ] B{ } map-as ;

You can try it out:

IN: scratchpad "Hello, Factor!" >base256emoji .
"๐Ÿ˜„โœ‹๐Ÿ€๐Ÿ€๐Ÿ˜“๐Ÿ’ช๐Ÿ˜…๐Ÿ’“๐Ÿคค๐Ÿ’ƒ๐Ÿ˜ˆ๐Ÿ˜“๐Ÿฅบ๐Ÿ‘"

IN: scratchpad "๐Ÿ˜„โœ‹๐Ÿ€๐Ÿ€๐Ÿ˜“๐Ÿ’ช๐Ÿ˜…๐Ÿ’“๐Ÿคค๐Ÿ’ƒ๐Ÿ˜ˆ๐Ÿ˜“๐Ÿฅบ๐Ÿ‘" base256emoji> "" like .
"Hello, Factor!"

Use Cases

The most interesting use case for base256emoji seems to be the ERC-7673: Distinguishable base256emoji Addresses proposal for Ethereum. This proposal seeks to “address spoofing attacks [that] have mislead tens of thousands of ether, and countless other tokens” by using visual emoji-based strings – a similar justification to the Drunken Bishop algorithm.

We can try base256emoji out with checksums to yield a similar benefit, displaying 16 emojis instead of a 32-byte hex-string:

IN: scratchpad "resource:license.txt" md5 checksum-bytes
               bytes>hex-string .
"ebb5ab617e3a88ed43f7d247c6466d95"

IN: scratchpad "resource:license.txt" md5 checksum-bytes
               >base256emoji .
"๐Ÿ’Œ๐Ÿ’จ๐Ÿคจ๐Ÿคค๐Ÿ˜ โœจ๐Ÿ˜น๐Ÿฅ€๐Ÿ˜๐ŸŽผ๐Ÿค ๐Ÿคฉโฌ‡๐Ÿ’“๐ŸŒท๐Ÿค™"

Visual Collisions

Given a desire to use visually dissimilar emojis in identities, it would be useful to think about the chosen emoji set and how that might be interpreted in a visual context. Some criticism of this particular group of emojis, which are sometimes magnified by smaller font sizes, might focus on the visual similarity of:

  1. Similar “globe” emojis (๐ŸŒ ๐ŸŒ ๐ŸŒŽ)
  2. Similar “face” emojis (๐Ÿ™‚ ๐Ÿ˜ ๐Ÿ˜‘ ๐Ÿ™)
  3. Similar “kiss” emojis (๐Ÿ˜™ ๐Ÿ˜š ๐Ÿ˜— ๐Ÿ˜˜)
  4. Similar “star” emojis (โญ ๐ŸŒŸ)
  5. Similar “grin” emojis (๐Ÿ˜€ ๐Ÿ˜ƒ ๐Ÿ˜„ ๐Ÿ˜ ๐Ÿ˜† ๐Ÿ˜…)
  6. Similar “heart” emojis (๐Ÿ’” ๐Ÿ’— ๐Ÿ’• ๐Ÿ’– ๐Ÿ’˜ ๐Ÿ’™ ๐Ÿ’œ ๐Ÿ’š ๐Ÿ’› ๐Ÿ–ค)
  7. Similar “hand” emojis (๐Ÿคž ๐Ÿ‘‹ โœ‹ ๐Ÿ‘Š โœŠ ๐Ÿค ๐Ÿคฒ)

It might be desirable to choose emojis not based on community membership or common usage, but on their most dissimilar visual identity. to make it even harder for scammers to deliberately pick lookalike emojis and rely on small text sizes, platform font differences, or user inattention.

A couple of other emoji sets can be found, for example @Equim-chan/base256 (which uses one of my favorite emojis: ๐Ÿ‘พ), npm/base-emoji, or even the KittenMoji crate.

Fun!

This is available on my GitHub.

Sat, 22 Mar 2025 15:00:00

John Benediktsson: RDAP

ICANN recently posted an update on Launching RDAP; Sunsetting WHOIS which got some discussion on Hacker News. Andy Newton, one of the creators of RDAP, has published A Guide to the Registration Data Access Protocol (RDAP), which is a pretty useful resource for understanding how it works. More information comes from the main RDAP website which describes it as:

The Registration Data Access Protocol (RDAP) is the successor to WHOIS. Like WHOIS, RDAP provides access to information about Internet resources (domain names, autonomous system numbers, and IP addresses). Unlike WHOIS, RDAP provides:

  • A machine-readable representation of registration data;
  • Differentiated access;
  • Structured request and response semantics;
  • Internationalisation;
  • Extensibility.

The WHOIS protocol that it replaces is super simple, being described by RFC 3912 in a few paragraphs. And, in fact, you can test it out on the command-line of most computers:

$ echo -e "factorcode.org\r\n" | nc -i 1 whois.cloudflare.com 43
Domain Name: FACTORCODE.ORG
Registry Domain ID: c49c93dee3304f39b081383262d320c6-LROR
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2025-01-15T22:46:54Z
Creation Date: 2005-12-01T04:54:37Z
Registrar Registration Expiration Date: 2025-12-01T04:54:37Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
...

The RDAP protocol must be equally simple, right? Well, not so fast. Instead of a few paragraphs, and simple queries over sockets, you get many RFCs describing it:

These, along with things like the RDAP Extension registry, and the protocol reliance on HTTP/HTTPS, JSON, and JSONPath considerably increase the complexity of RDAP implementations.

Below we are going to start an implementation using Factor.

Bootstrapping

The first concept we have to implement is RDAP Bootstrapping which uses 5 IANA files to redirect searches to the correct upstream RDAP servers.

Type Link
Forward DNS https://data.iana.org/rdap/dns.json
IPv4 Addresses https://data.iana.org/rdap/ipv4.json
IPv6 Addresses https://data.iana.org/rdap/ipv6.json
Autonomous System Numbers https://data.iana.org/rdap/asn.json
Object Tags https://data.iana.org/rdap/object-tags.json

We can abstract these by making a word to convert a type to a URL:

: bootstrap-url ( type -- url )
    "https://data.iana.org/rdap/" ".json" surround ;

We don’t want to retrieve these files all the time, so let’s cache them for 30 days:

INITIALIZED-SYMBOL: bootstrap-cache [ 30 days ]

: bootstrap-get ( type -- data )
    bootstrap-url cache-directory bootstrap-cache get
    download-outdated-into path>json ;

And provide a way to force delete the cached bootstrap files:

CONSTANT: bootstrap-files { "asn" "dns" "ipv4" "ipv6" "object-tags" }

: reset-bootstrap ( -- )
    [ bootstrap-files [ ".json" append ?delete-file ] each ] with-cache-directory ;

Each bootstrap file is described in RFC 9224, but basically we want to extract and manipulate the "services" block, modifying the keys of the assoc for convenient searching:

: parse-services ( data quot: ( key -- key' ) -- services )
    [ "services" of ] dip '[ [ _ map ] dip ] assoc-map ; inline

: search-services ( services quot: ( key -- ? ) -- urls )
    '[ drop _ any? ] assoc-find drop nip ; inline

We can then provide bootstrap data structures that are used for searching. For example, we find the longest subdomain that has an entry in the dns bootstrap list to handle both SLD and TLD:

: dns-bootstrap ( -- services )
    "dns" bootstrap-get "services" of ;

: split-domain ( domain -- domains )
    "." split dup length <iota> [ tail "." join ] with map ;

: domain-endpoints ( domain -- urls )
    split-domain dns-bootstrap [ swap member? ] with search-services ;

You can see that different domain names are directed to different RDAP endpoints:

IN: scratchpad "factorcode.org" domain-endpoints .
{ "https://rdap.publicinterestregistry.org/rdap/" }

IN: scratchpad "google.com" domain-endpoints .
{ "https://rdap.verisign.com/com/v1/" }

Or, find the correct endpoint for a given IPV4 address from the ipv4 bootstrap list:

: ipv4-bootstrap ( -- services )
    "ipv4" bootstrap-get [ >ipv4-network ] parse-services ;

: ipv4-endpoints ( ipv4 -- urls )
    ipv4-aton ipv4-bootstrap [ ipv4-contains? ] with search-services ;

Lookup

The RDAP data is typically available from HTTP or HTTPS web servers, as JSON files, but it uses a custom mime-type application/rdap+json. We can write a simple word to make the request and convert the response:

: accept-rdap ( request -- request )
    "application/rdap+json" "Accept" set-header ;

: rdap-get ( url -- response rdap )
    <get-request> accept-rdap http-request
    dup string? [ utf8 decode ] unless json> ;

And now we can build a word to lookup a domain:

: lookup-domain ( domain -- results )
    [ domain-endpoints random ]
    [ "domain/%s" sprintf derive-url rdap-get nip ] bi ;

Or to lookup an IPV4 address:

: lookup-ipv4 ( ipv4 -- results )
    [ ipv4-endpoints random ]
    [ "ip/%s" sprintf derive-url rdap-get nip ] bi ;

And we can try it out, getting an extensive response:

IN: scratchpad "factorcode.org" lookup-domain

--- Data stack:
LH{ { "rdapConformance" ~array~ } { "notices" ~array~ } { ...

Output

It would be nice to print the output in a more human-readable format. For now, we will just print these as a nested tree of keys and values:

GENERIC: print-rdap-nested ( padding key value -- )

M: linked-assoc print-rdap-nested
    [ over write write ":" print "  " append ] dip
    [ swapd print-rdap-nested ] with assoc-each ;

M: array print-rdap-nested
    [ print-rdap-nested ] 2with each ;

M: object print-rdap-nested
    present [ 2drop ] [ [ ": " [ write ] tri@ ] dip print ] if-empty ;

: print-rdap ( results -- )
    [ "" -rot print-rdap-nested ] assoc-each ;

This could probably be improved a fair bit – for example, the keys could be made more readable, and it doesn’t handle vCard entries very well.

Try it out!

You can try this out, by lookup up a domain name:

IN: scratchpad "factorcode.org" lookup-domain print-rdap
rdapConformance: rdap_level_0
rdapConformance: icann_rdap_response_profile_0
rdapConformance: icann_rdap_technical_implementation_guide_0
ldhName: factorcode.org
unicodeName: factorcode.org
nameservers:
  ldhName: carl.ns.cloudflare.com
  unicodeName: carl.ns.cloudflare.com
  objectClassName: nameserver
  handle: c34bedeccd8e4514b917e9e82a052077-LROR
  status: associated
nameservers:
  ldhName: kay.ns.cloudflare.com
  unicodeName: kay.ns.cloudflare.com
  objectClassName: nameserver
  handle: 7fc12bf413944de088f27f837349a8da-LROR
  status: associated
...

Or, by lookup up an IP address:

IN: scratchpad "1.1.1.1" lookup-ipv4 print-rdap
rdapConformance: history_version_0
rdapConformance: nro_rdap_profile_0
rdapConformance: cidr0
rdapConformance: rdap_level_0
events:
  eventAction: registration
  eventDate: 2011-08-10T23:12:35Z
events:
  eventAction: last changed
  eventDate: 2023-04-26T22:57:58Z
name: APNIC-LABS
status: active
type: ASSIGNED PORTABLE
endAddress: 1.1.1.255
ipVersion: v4
startAddress: 1.1.1.0
objectClassName: ip network
handle: 1.1.1.0 - 1.1.1.255
...

Pretty cool!

This was recently committed to the development version of Factor.

Tue, 18 Mar 2025 15:00:00

John Benediktsson: Two Sum

A few days ago, Ryan Petermen wrote a tweet about switching to Python from C++ in a solution for the Two Sum problem from LeetCode:

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

You can return the answer in any order.

Spoilers below!

They offered this iterative C++ solution:

std::vector<int> twoSum(std::vector<int>& nums, int target) {
    std::unordered_map<int, int> hashMap;
    std::vector<int> result;

    for (int i = 0; i < nums.size(); ++i) {
        int complement = target - nums[i];
        if (hashMap.find(complement) != hashMap.end()) {
            result.push_back(hashMap[complement]);
            result.push_back(i);
            return result;
        }
        hashMap[nums[i]] = i;
    }
    return result;
}

Followed by this improved Python solution:

def two_sum(nums, target):
    hashmap = {}

    for i, num in enumerate(nums):
        complement = target - num
        if complement in hashmap:
            return [hashmap[complement], i]
        hashmap[num] = i

Of course, I was curious what a Factor solution would look like.

Direct translation

We can start by making a mostly direct translation using local variables:

:: two-sum ( nums target -- i j )
    H{ } clone :> hashmap
    nums <enumerated> [
        first2 :> ( i num )
        target num - :> complement
        complement hashmap at [
            i num hashmap set-at f
        ] unless*
    ] map-find ?first ;

And a few test cases to show that it works:

{ 0 1 } [ { 2 7 11 15 } 9 two-sum ] unit-test
{ 1 2 } [ { 3 2 4 } 6 two-sum ] unit-test
{ 0 1 } [ { 3 3 } 6 two-sum ] unit-test

Using map-find-index

Sometimes a higher-level word exists that can simplify logic, for example map-find-index:

:: two-sum ( nums target -- i j )
    H{ } clone :> hashmap
    nums [| num i |
        target num - :> complement
        complement hashmap at [
            i num hashmap set-at f
        ] unless*
    ] map-find-index drop ;

Implicit arguments

In the spirit of concatenative thinking, we could reduce the amount of named variables we have slightly by not naming the complement internal variable:

:: two-sum ( nums target -- i j )
    H{ } clone :> hashmap
    nums [| num i |
        target num - hashmap at [
            i num hashmap set-at f
        ] unless*
    ] map-find-index drop ;

Fried quotations

And that works, but perhaps we could try some alternative syntax features like fried quotations:

: two-sum ( nums target -- i j )
    H{ } clone dup '[
        swap _ over - _ at [ 2nip ] [ _ set-at f ] if*
    ] map-find-index drop ;

Is that better? Perhaps.

How else might you solve this problem?

Can you do it in one-line?

Sat, 22 Feb 2025 15:00:00

John Benediktsson: Random Derangement

I had a lot of fun writing code to compute derangements recently. I thought I was done with that topic until I bumped into a question on StackOverflow asking how to generate a random derangement of a list. Being nerd sniped is a real thing, and so I started looking at solutions.

There’s a paper called “An analysis of a simple algorithm for random derangements” that has an, ahem, simple algorithm. The basic idea is to generate a random permutation of indices, breaking early if the random permutation is obviously not a derangement.

One way to take a random permutation would be to use our permutations virtual sequence:

IN: scratchpad "ABCDEF" <permutations> random .
"FCEBDA" ! is a derangement

IN: scratchpad "ABCDEF" <permutations> random .
"DFBCEA" ! is NOT a derangement

And so you could loop until a derangement of indices is found:

: random-derangement-indices ( n -- seq )
    f swap <iota> <permutations>
    '[ drop _ random dup derangement? not ] loop ;

But, since only 36% or so of permutations are derangements, perhaps it would be faster and better to implement the algorithm from that paper – making our own random permutation of indices and breaking early if obviously not a derangement:

:: random-derangement-indices ( n -- indices )
    n <iota> >array :> seq
    f [
        dup :> v
        n 1 (a..b] [| j |
            j 1 + random :> p
            p v nth j = [ t ] [ j p v exchange f ] if
        ] any? v first zero? or
    ] [ drop seq clone ] do while ;

We can use that to build a random-derangement word:

: random-derangement ( seq -- seq' )
    [ length random-derangement-indices ] [ nths ] bi ;

And then, for example, get a random derangement of the alphabet – of which there are one hundred and forty-eight septillion derangements, give or take – in under a millisecond:

IN: scratchpad "ABCDEFGHIJKLMNOPQRSTUVWXYZ" random-derangement .
"CZFABMSUXRQDEHGYJLTPVOIKWN"

We could check to make sure that we generate all derangments with equal possibility using a simple test case:

IN: scratchpad 1,000,000 [
                   "ABCD" random-derangement
               ] replicate histogram sort-keys .
{
    { "BADC" 111639 }
    { "BCDA" 110734 }
    { "BDAC" 110682 }
    { "CADB" 111123 }
    { "CDAB" 111447 }
    { "CDBA" 111147 }
    { "DABC" 111215 }
    { "DCAB" 111114 }
    { "DCBA" 110899 }
}

Looks good to me!

Mon, 16 Dec 2024 15:00:00

John Benediktsson: Derangements

Derangements, also sometimes known as deranged permutations, are described as:

In combinatorial mathematics, a derangement is a permutation of the elements of a set in which no element appears in its original position. In other words, a derangement is a permutation that has no fixed points.

There is a fun online derangements generator tool that you can use to play with computing the derangements of a sequence as well as calculating the number of derangements for a given sequence size.

As an example, we can use the math.combinatorics vocabulary, to generate all the permutations of the sequence { 0 1 2 }:

IN: scratchpad { 0 1 2 } all-permutations .
{
    { 0 1 2 }
    { 0 2 1 }
    { 1 0 2 }
    { 1 2 0 }
    { 2 0 1 }
    { 2 1 0 }
}

Since a derangement is a permutation that requires each element to be in a different slot, we could write a word to check the permuted indices to see if that is true:

: derangement? ( indices -- ? )
    dup length <iota> [ = ] 2any? not ;

These would be the two derangements of the indices { 0 1 2 }:

IN: scratchpad { 0 1 2 } all-permutations [ derangement? ] filter .
{
    { 1 2 0 }
    { 2 0 1 }
}

The number of derangements is the subfactorial of the length of the sequence:

: subfactorial ( n -- ? )
    [ 1 ] [ factorial 1 + e /i ] if-zero ;

We can build a <derangement-iota> that is a sequence as long as that number:

: <derangement-iota> ( seq -- <iota> )
    length subfactorial <iota> ; inline

And we can build a next-derangement word that calculates the next permutation that is a derangement:

: next-derangement ( seq -- seq )
    [ dup derangement? ] [ next-permutation ] do until ;

We can then build upon some of the code for iterating permutations, designing an internal derangements-quot word that is similar in form to the existing permutations-quot word:

: derangements-quot ( seq quot -- seq quot' )
    [ [ <derangement-iota> ] [ length <iota> >array ] [ ] tri ] dip
    '[ drop _ next-derangement _ nths-unsafe @ ] ; inline

And then use it to build a series of words that can provide iteration across derangements:

: each-derangement ( ... seq quot: ( ... elt -- ... ) -- ... )
    derangements-quot each ; inline

: map-derangements ( ... seq quot: ( ... elt -- ... newelt ) -- ... newseq )
    derangements-quot map ; inline

: filter-derangements ( ... seq quot: ( ... elt -- ... ? ) -- ... newseq )
    selector [ each-derangement ] dip ; inline

: all-derangements ( seq -- seq' )
    [ ] map-derangements ;

: all-derangements? ( ... seq quot: ( ... elt -- ... ? ) -- ... ? )
    derangements-quot all? ; inline

: find-derangement ( ... seq quot: ( ... elt -- ... ? ) -- ... elt/f )
    '[ _ keep and ] derangements-quot map-find drop ; inline

: reduce-derangements ( ... seq identity quot: ( ... prev elt -- ... next ) -- ... result )
    swapd each-derangement ; inline

And, now we can use this to find the nine derangements for "ABCD":

IN: scratchpad "ABCD" all-derangements .
{
    "BADC"
    "BCDA"
    "BDAC"
    "CADB"
    "CDAB"
    "CDBA"
    "DABC"
    "DCAB"
    "DCBA"
}

This is available on my GitHub.

Sun, 8 Dec 2024 15:00:00

John Benediktsson: Zen of Factor

Years ago, I remember reading a blog post called “Why is Groovy is big?” by Slava Pestov, the original author of Factor. In it, he talks about lines of code and sets the stage for how concatenative thinking can lead to properties like conciseness and readability, ending with this:

I tend to think the majority of code people write is overly complicated, full of redundancy, and designed for such flexibility that in practice is not needed at all. I hope one day this trend reverses.

I’ve been thinking a lot recently about 20 years of Factor and I thought it would be fun to write a Zen of Factor. Perhaps inspired by the Zen of Programming book written in 1987 by Geoffrey James, there are a couple of examples I wanted to point to first.

Python

The Python programming language was one of the first languages that I am aware of to get a Zen, contributed at least as far back as February 2002 in a kind of easter egg fashion encrypted by ROT13:

$ python -c "import this" 
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

Zig

Quite a bit more recently, the Zig programming language introduced a Zen through a series of commits starting in August 2017. You can see the current version by running zig zen:

$ zig zen

 * Communicate intent precisely.
 * Edge cases matter.
 * Favor reading code over writing code.
 * Only one obvious way to do things.
 * Runtime crashes are better than bugs.
 * Compile errors are better than runtime crashes.
 * Incremental improvements.
 * Avoid local maximums.
 * Reduce the amount one must remember.
 * Focus on code rather than style.
 * Resource allocation may fail; resource deallocation must succeed.
 * Memory is a resource.
 * Together we serve the users.

Factor

In that spirit, I thought it would be fun to put together some ideas for what a Zen of Factor might look like, incorporating aspects of the language that I enjoy, some thematic elements that might make you more successful learning and developing in it, as well as pointing out the strong community that we have and hope to grow.

Here is the current draft:

The Zen of Factor

The REPL is your playground.
Working code beats perfect theory.
Words are better than paragraphs.
Stack effects tell the story.
Any syntax you need, you can create.
Simple primitives yield powerful combinations.
First make it work, then make it beautiful.
Make it beautiful, then make it fast.
Quick hacks grow into robust solutions.
When in doubt, factor it out.
Every word should do one thing well.
Let the stack guide your logic.
Write less, compose more.
If it works, ship it.
If it doesn't work, fix it.
If you don't like it, change it.
Today's beginner is tomorrow's core developer.
Questions encouraged, PRs welcome.

I really appreciate hearing about programs and problems that developers work on, getting feedback that allows us to iteratively improve, and knowing that every commit leads us towards a better future.

Thank you!

Thu, 5 Dec 2024 15:00:00

John Benediktsson: Watching Code

Factor has a watching words feature that allows you to see the inputs and outputs of a word when it is called. I have wanted a way to define a watched code block that we could use to see the stack before and after some inner part of a word.

It turns out that it’s pretty simple to make this syntax using our existing watch implementation from the tools.annotations vocabulary:

DEFER: WATCH>

SYNTAX: <WATCH
    \ WATCH> parse-until >quotation dup (watch) append! ;

Using this, we can now watch the inner part of a word, for example this word that increments the input by 1:

: foo ( x -- y ) 1 <WATCH + WATCH> ;

And see it used:

IN: scratchpad 10 foo
--- Entering [ + ]
x 10
x 1
--- Leaving [ + ]
x 11

--- Data stack:
11

One thing that I’d like to improve on this someday, is making it so this watch syntax has a prettyprint implementation that allows it to be rendered as it is typed.

I have added this to the latest developer version if you’d like to update and give it a try!

Tue, 3 Dec 2024 15:00:00

John Benediktsson: Listener Font Sizes

Factor contains a REPL – called the Listener – available on the command-line and graphically as part of the UI developer tools. For many users, this is their main interface to programming in the Factor programming language.

We sometimes get requests to better support styling the user interface. This has led to improvements such as support for light and dark themes, adjustable font sizes, and other customizations. There have been a few existing ways to style the UI listener including support for keyboard commands to increase or decrease font sizes. But, until recently this only affected new output or new Listener sessions.

Today, I improved this to make adjusting the font size much more dynamic, using traditional keyboard shortcuts of Ctrl – or Cmd on macOS – combined with + or -:

Give it a try, and please let us know other ways we can make improvements!

Wed, 13 Nov 2024 15:00:00

Blogroll


planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.

Syndicate