ICANN recently posted an update on Launching RDAP; Sunsetting WHOIS which got some discussion on Hacker News. Andy Newton, one of the creators of RDAP, has published A Guide to the Registration Data Access Protocol (RDAP), which is a pretty useful resource for understanding how it works. More information comes from the main RDAP website which describes it as:
The Registration Data Access Protocol (RDAP) is the successor to WHOIS. Like WHOIS, RDAP provides access to information about Internet resources (domain names, autonomous system numbers, and IP addresses). Unlike WHOIS, RDAP provides:
- A machine-readable representation of registration data;
- Differentiated access;
- Structured request and response semantics;
- Internationalisation;
- Extensibility.
The WHOIS protocol that it replaces is super simple, being described by RFC 3912 in a few paragraphs. And, in fact, you can test it out on the command-line of most computers:
$ echo -e "factorcode.org\r\n" | nc -i 1 whois.cloudflare.com 43
Domain Name: FACTORCODE.ORG
Registry Domain ID: c49c93dee3304f39b081383262d320c6-LROR
Registrar WHOIS Server: whois.cloudflare.com
Registrar URL: https://www.cloudflare.com
Updated Date: 2025-01-15T22:46:54Z
Creation Date: 2005-12-01T04:54:37Z
Registrar Registration Expiration Date: 2025-12-01T04:54:37Z
Registrar: Cloudflare, Inc.
Registrar IANA ID: 1910
Domain Status: clienttransferprohibited https://icann.org/epp#clienttransferprohibited
...
The RDAP protocol must be equally simple, right? Well, not so fast. Instead of a few paragraphs, and simple queries over sockets, you get many RFCs describing it:
These, along with things like the RDAP Extension registry, and the protocol reliance on HTTP/HTTPS, JSON, and JSONPath considerably increase the complexity of RDAP implementations.
Below we are going to start an implementation using Factor.
The first concept we have to implement is RDAP Bootstrapping which uses 5 IANA files to redirect searches to the correct upstream RDAP servers.
Type | Link |
---|---|
Forward DNS | https://data.iana.org/rdap/dns.json |
IPv4 Addresses | https://data.iana.org/rdap/ipv4.json |
IPv6 Addresses | https://data.iana.org/rdap/ipv6.json |
Autonomous System Numbers | https://data.iana.org/rdap/asn.json |
Object Tags | https://data.iana.org/rdap/object-tags.json |
We can abstract these by making a word to convert a type to a URL:
: bootstrap-url ( type -- url )
"https://data.iana.org/rdap/" ".json" surround ;
We don’t want to retrieve these files all the time, so let’s cache them for 30 days:
INITIALIZED-SYMBOL: bootstrap-cache [ 30 days ]
: bootstrap-get ( type -- data )
bootstrap-url cache-directory bootstrap-cache get
download-outdated-into path>json ;
And provide a way to force delete the cached bootstrap files:
CONSTANT: bootstrap-files { "asn" "dns" "ipv4" "ipv6" "object-tags" }
: reset-bootstrap ( -- )
[ bootstrap-files [ ".json" append ?delete-file ] each ] with-cache-directory ;
Each bootstrap file is described in RFC
9224,
but basically we want to extract and manipulate the "services"
block,
choosing to flatten the
assoc
for convenient searching:
: parse-services ( data -- services )
[ "services" of [ swap [ ,, ] with each ] assoc-each ] LH{ } make ;
We can then provide bootstrap data structures that are setup for easy searching. For example, we find the longest subdomain that has an entry in the dns bootstrap list:
: dns-bootstrap ( -- services )
"dns" bootstrap-get parse-services ;
: split-domain ( domain -- domains )
"." split dup length <iota> [ tail "." join ] with map ;
: domain-endpoints ( domain -- urls )
split-domain [ dns-bootstrap at ] map-find drop ;
Or, find the correct endpoint for a given IPV4 address from the ipv4 bootstrap list:
: ipv4-bootstrap ( -- services )
"ipv4" bootstrap-get parse-services
[ [ >ipv4-network ] dip ] assoc-map ;
: ipv4-endpoints ( ipv4 -- urls )
ipv4-aton ipv4-bootstrap
[ drop swap ipv4-contains? ] with assoc-find drop nip ;
The RDAP data is typically available from HTTP or HTTPS web servers, as JSON
files, but it uses a custom mime-type application/rdap+json
. We can
write a simple word to make the request and convert the response:
: accept-rdap ( request -- request )
"application/rdap+json" "Accept" set-header ;
: rdap-get ( url -- response rdap )
<get-request> accept-rdap http-request
dup string? [ utf8 decode ] unless json> ;
And now we can build a word to lookup a domain:
: lookup-domain ( domain -- results )
[ domain-endpoints random ]
[ "domain/%s" sprintf derive-url rdap-get nip ] bi ;
Or to lookup an IPV4 address:
: lookup-ipv4 ( ipv4 -- results )
[ ipv4-endpoints random ]
[ "ip/%s" sprintf derive-url rdap-get nip ] bi ;
And we can try it out, getting an extensive response:
IN: scratchpad "factorcode.org" lookup-domain
--- Data stack:
LH{ { "rdapConformance" ~array~ } { "notices" ~array~ } { ...
It would be nice to print the output in a more human-readable format. For now, we will just print these as a nested tree of keys and values:
GENERIC: print-rdap-nested ( padding key value -- )
M: linked-assoc print-rdap-nested
[ over write write ":" print " " append ] dip
[ swapd print-rdap-nested ] with assoc-each ;
M: array print-rdap-nested
[ print-rdap-nested ] 2with each ;
M: object print-rdap-nested
present [ 2drop ] [ [ ": " [ write ] tri@ ] dip print ] if-empty ;
: print-rdap ( results -- )
[ "" -rot print-rdap-nested ] assoc-each ;
This could probably be improved a fair bit – for example, the keys could be made more readable, and it doesn’t handle vCard entries very well.
You can try this out, by lookup up a domain name:
IN: scratchpad "factorcode.org" lookup-domain print-rdap
rdapConformance: rdap_level_0
rdapConformance: icann_rdap_response_profile_0
rdapConformance: icann_rdap_technical_implementation_guide_0
ldhName: factorcode.org
unicodeName: factorcode.org
nameservers:
ldhName: carl.ns.cloudflare.com
unicodeName: carl.ns.cloudflare.com
objectClassName: nameserver
handle: c34bedeccd8e4514b917e9e82a052077-LROR
status: associated
nameservers:
ldhName: kay.ns.cloudflare.com
unicodeName: kay.ns.cloudflare.com
objectClassName: nameserver
handle: 7fc12bf413944de088f27f837349a8da-LROR
status: associated
...
Or, by lookup up an IP address:
IN: scratchpad "1.1.1.1" lookup-ipv4 print-rdap
rdapConformance: history_version_0
rdapConformance: nro_rdap_profile_0
rdapConformance: cidr0
rdapConformance: rdap_level_0
events:
eventAction: registration
eventDate: 2011-08-10T23:12:35Z
events:
eventAction: last changed
eventDate: 2023-04-26T22:57:58Z
name: APNIC-LABS
status: active
type: ASSIGNED PORTABLE
endAddress: 1.1.1.255
ipVersion: v4
startAddress: 1.1.1.0
objectClassName: ip network
handle: 1.1.1.0 - 1.1.1.255
...
Pretty cool!
This was recently committed to the development version of Factor.
A few days ago, Ryan Petermen wrote a tweet about switching to Python from C++ in a solution for the Two Sum problem from LeetCode:
Given an array of integers
nums
and an integertarget
, return indices of the two numbers such that they add up totarget
.You may assume that each input would have exactly one solution, and you may not use the same element twice.
You can return the answer in any order.
Spoilers below!
They offered this iterative C++ solution:
std::vector<int> twoSum(std::vector<int>& nums, int target) {
std::unordered_map<int, int> hashMap;
std::vector<int> result;
for (int i = 0; i < nums.size(); ++i) {
int complement = target - nums[i];
if (hashMap.find(complement) != hashMap.end()) {
result.push_back(hashMap[complement]);
result.push_back(i);
return result;
}
hashMap[nums[i]] = i;
}
return result;
}
Followed by this improved Python solution:
def two_sum(nums, target):
hashmap = {}
for i, num in enumerate(nums):
complement = target - num
if complement in hashmap:
return [hashmap[complement], i]
hashmap[num] = i
Of course, I was curious what a Factor solution would look like.
We can start by making a mostly direct translation using local variables:
:: two-sum ( nums target -- i j )
H{ } clone :> hashmap
nums <enumerated> [
first2 :> ( i num )
target num - :> complement
complement hashmap at [
i num hashmap set-at f
] unless*
] map-find ?first ;
And a few test cases to show that it works:
{ 0 1 } [ { 2 7 11 15 } 9 two-sum ] unit-test
{ 1 2 } [ { 3 2 4 } 6 two-sum ] unit-test
{ 0 1 } [ { 3 3 } 6 two-sum ] unit-test
Sometimes a higher-level word exists that can simplify logic, for example map-find-index:
:: two-sum ( nums target -- i j )
H{ } clone :> hashmap
nums [| num i |
target num - :> complement
complement hashmap at [
i num hashmap set-at f
] unless*
] map-find-index drop ;
In the spirit of concatenative
thinking, we
could reduce the amount of named variables we have slightly by not naming the
complement
internal variable:
:: two-sum ( nums target -- i j )
H{ } clone :> hashmap
nums [| num i |
target num - hashmap at [
i num hashmap set-at f
] unless*
] map-find-index drop ;
And that works, but perhaps we could try some alternative syntax features like fried quotations:
: two-sum ( nums target -- i j )
H{ } clone dup '[
swap _ over - _ at [ 2nip ] [ _ set-at f ] if*
] map-find-index drop ;
Is that better? Perhaps.
How else might you solve this problem?
Can you do it in one-line?
I had a lot of fun writing code to compute derangements recently. I thought I was done with that topic until I bumped into a question on StackOverflow asking how to generate a random derangement of a list. Being nerd sniped is a real thing, and so I started looking at solutions.
There’s a paper called “An analysis of a simple algorithm for random derangements” that has an, ahem, simple algorithm. The basic idea is to generate a random permutation of indices, breaking early if the random permutation is obviously not a derangement.
One way to take a random permutation would be to use our permutations virtual sequence:
IN: scratchpad "ABCDEF" <permutations> random .
"FCEBDA" ! is a derangement
IN: scratchpad "ABCDEF" <permutations> random .
"DFBCEA" ! is NOT a derangement
And so you could loop until a derangement of indices is found:
: random-derangement-indices ( n -- seq )
f swap <iota> <permutations>
'[ drop _ random dup derangement? not ] loop ;
But, since only 36% or so of permutations are derangements, perhaps it would be faster and better to implement the algorithm from that paper – making our own random permutation of indices and breaking early if obviously not a derangement:
:: random-derangement-indices ( n -- indices )
n <iota> >array :> seq
f [
dup :> v
n 1 (a..b] [| j |
j 1 + random :> p
p v nth j = [ t ] [ j p v exchange f ] if
] any? v first zero? or
] [ drop seq clone ] do while ;
We can use that to build a random-derangement
word:
: random-derangement ( seq -- seq' )
[ length random-derangement-indices ] [ nths ] bi ;
And then, for example, get a random derangement of the alphabet – of which there are one hundred and forty-eight septillion derangements, give or take – in under a millisecond:
IN: scratchpad "ABCDEFGHIJKLMNOPQRSTUVWXYZ" random-derangement .
"CZFABMSUXRQDEHGYJLTPVOIKWN"
We could check to make sure that we generate all derangments with equal possibility using a simple test case:
IN: scratchpad 1,000,000 [
"ABCD" random-derangement
] replicate histogram sort-keys .
{
{ "BADC" 111639 }
{ "BCDA" 110734 }
{ "BDAC" 110682 }
{ "CADB" 111123 }
{ "CDAB" 111447 }
{ "CDBA" 111147 }
{ "DABC" 111215 }
{ "DCAB" 111114 }
{ "DCBA" 110899 }
}
Looks good to me!
Derangements, also sometimes known as deranged permutations, are described as:
In combinatorial mathematics, a derangement is a permutation of the elements of a set in which no element appears in its original position. In other words, a derangement is a permutation that has no fixed points.
There is a fun online derangements generator tool that you can use to play with computing the derangements of a sequence as well as calculating the number of derangements for a given sequence size.
As an example, we can use the math.combinatorics
vocabulary,
to generate all the permutations of the sequence { 0 1 2 }
:
IN: scratchpad { 0 1 2 } all-permutations .
{
{ 0 1 2 }
{ 0 2 1 }
{ 1 0 2 }
{ 1 2 0 }
{ 2 0 1 }
{ 2 1 0 }
}
Since a derangement is a permutation that requires each element to be in a different slot, we could write a word to check the permuted indices to see if that is true:
: derangement? ( indices -- ? )
dup length <iota> [ = ] 2any? not ;
These would be the two derangements of the indices { 0 1 2 }
:
IN: scratchpad { 0 1 2 } all-permutations [ derangement? ] filter .
{
{ 1 2 0 }
{ 2 0 1 }
}
The number of derangements is the subfactorial of the length of the sequence:
: subfactorial ( n -- ? )
[ 1 ] [ factorial 1 + e /i ] if-zero ;
We can build a <derangement-iota>
that is a sequence as long as that
number:
: <derangement-iota> ( seq -- <iota> )
length subfactorial <iota> ; inline
And we can build a next-derangement
word that calculates the next
permutation that is a
derangement:
: next-derangement ( seq -- seq )
[ dup derangement? ] [ next-permutation ] do until ;
We can then build upon some of the code for iterating
permutations,
designing an internal derangements-quot
word that is similar in form to the
existing permutations-quot
word:
: derangements-quot ( seq quot -- seq quot' )
[ [ <derangement-iota> ] [ length <iota> >array ] [ ] tri ] dip
'[ drop _ next-derangement _ nths-unsafe @ ] ; inline
And then use it to build a series of words that can provide iteration across derangements:
: each-derangement ( ... seq quot: ( ... elt -- ... ) -- ... )
derangements-quot each ; inline
: map-derangements ( ... seq quot: ( ... elt -- ... newelt ) -- ... newseq )
derangements-quot map ; inline
: filter-derangements ( ... seq quot: ( ... elt -- ... ? ) -- ... newseq )
selector [ each-derangement ] dip ; inline
: all-derangements ( seq -- seq' )
[ ] map-derangements ;
: all-derangements? ( ... seq quot: ( ... elt -- ... ? ) -- ... ? )
derangements-quot all? ; inline
: find-derangement ( ... seq quot: ( ... elt -- ... ? ) -- ... elt/f )
'[ _ keep and ] derangements-quot map-find drop ; inline
: reduce-derangements ( ... seq identity quot: ( ... prev elt -- ... next ) -- ... result )
swapd each-derangement ; inline
And, now we can use this to find the nine derangements for "ABCD"
:
IN: scratchpad "ABCD" all-derangements .
{
"BADC"
"BCDA"
"BDAC"
"CADB"
"CDAB"
"CDBA"
"DABC"
"DCAB"
"DCBA"
}
This is available on my GitHub.
Years ago, I remember reading a blog post called “Why is Groovy is big?” by Slava Pestov, the original author of Factor. In it, he talks about lines of code and sets the stage for how concatenative thinking can lead to properties like conciseness and readability, ending with this:
I tend to think the majority of code people write is overly complicated, full of redundancy, and designed for such flexibility that in practice is not needed at all. I hope one day this trend reverses.
I’ve been thinking a lot recently about 20 years of Factor and I thought it would be fun to write a Zen of Factor. Perhaps inspired by the Zen of Programming book written in 1987 by Geoffrey James, there are a couple of examples I wanted to point to first.
The Python programming language was one of the first languages that I am aware of to get a Zen, contributed at least as far back as February 2002 in a kind of easter egg fashion encrypted by ROT13:
$ python -c "import this"
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Quite a bit more recently, the Zig programming language
introduced a Zen through a series of commits starting in August
2017.
You can see the current version by running zig zen
:
$ zig zen
* Communicate intent precisely.
* Edge cases matter.
* Favor reading code over writing code.
* Only one obvious way to do things.
* Runtime crashes are better than bugs.
* Compile errors are better than runtime crashes.
* Incremental improvements.
* Avoid local maximums.
* Reduce the amount one must remember.
* Focus on code rather than style.
* Resource allocation may fail; resource deallocation must succeed.
* Memory is a resource.
* Together we serve the users.
In that spirit, I thought it would be fun to put together some ideas for what a Zen of Factor might look like, incorporating aspects of the language that I enjoy, some thematic elements that might make you more successful learning and developing in it, as well as pointing out the strong community that we have and hope to grow.
Here is the current draft:
The Zen of Factor
The REPL is your playground.
Working code beats perfect theory.
Words are better than paragraphs.
Stack effects tell the story.
Any syntax you need, you can create.
Simple primitives yield powerful combinations.
First make it work, then make it beautiful.
Make it beautiful, then make it fast.
Quick hacks grow into robust solutions.
When in doubt, factor it out.
Every word should do one thing well.
Let the stack guide your logic.
Write less, compose more.
If it works, ship it.
If it doesn't work, fix it.
If you don't like it, change it.
Today's beginner is tomorrow's core developer.
Questions encouraged, PRs welcome.
I really appreciate hearing about programs and problems that developers work on, getting feedback that allows us to iteratively improve, and knowing that every commit leads us towards a better future.
Thank you!
Factor has a watching words feature that allows you to see the inputs and outputs of a word when it is called. I have wanted a way to define a watched code block that we could use to see the stack before and after some inner part of a word.
It turns out that it’s pretty simple to make this syntax using our existing watch implementation from the tools.annotations vocabulary:
DEFER: WATCH>
SYNTAX: <WATCH
\ WATCH> parse-until >quotation dup (watch) append! ;
Using this, we can now watch the inner part of a word, for example this word that increments the input by 1:
: foo ( x -- y ) 1 <WATCH + WATCH> ;
And see it used:
IN: scratchpad 10 foo
--- Entering [ + ]
x 10
x 1
--- Leaving [ + ]
x 11
--- Data stack:
11
One thing that I’d like to improve on this someday, is making it so this watch syntax has a prettyprint implementation that allows it to be rendered as it is typed.
I have added this to the latest developer version if you’d like to update and give it a try!
Factor contains a REPL – called the Listener – available on the command-line and graphically as part of the UI developer tools. For many users, this is their main interface to programming in the Factor programming language.
We sometimes get requests to better support styling the user interface. This has led to improvements such as support for light and dark themes, adjustable font sizes, and other customizations. There have been a few existing ways to style the UI listener including support for keyboard commands to increase or decrease font sizes. But, until recently this only affected new output or new Listener sessions.
Today, I improved this to make adjusting the font size much more dynamic, using
traditional keyboard shortcuts of Ctrl
– or Cmd
on
macOS – combined with +
or -
:
Give it a try, and please let us know other ways we can make improvements!
Recently, I’ve been inspired by conversations taking place on our Factor Discord server. This sometimes reflects areas of interest from new contributors, curiousity exploring similarities and differences between Factor and other programming languages, or even early learning moments when exploring concatenative languages in general.
Today, someone asked about how to think about “accumulation of values in an array… to find all occurences (their position) of a subseq in a seq”. The solution to this might have this word name and stack effect:
: subseq-indices ( seq subseq -- indices ) ... ;
Before answering, I wanted to make sure they wanted to find overlapping indices vs. non-overlapping indices, and they clarified that they expect it to find this result – allowing overlapping subsequences:
IN: scratchpad "abcabcabc" "abcabc" subseq-indices .
{ 0 3 }
So, now that we have a reasonable specification, how do we think about solving this problem when we are at the same time learning to solve problems in stack languages and trying to see what features of Factor’s standard library would help.
There are a lot of ways to think about this, and I often recommend one of three approaches:
So let’s look at each approach in turn:
The inner logic is going to require something like “take an index to start from and find the next matching subseq index”, which looks an awful lot like subseq-index-from – except you also want to increment the found index afterwards to make sure you are progressing through the sequence.
: next-subseq-index ( index seq subseq -- next-index/f found-index/f )
subseq-index-from [ [ 1 + ] keep ] [ f f ] if* ;
Then you could use it like so in a loop with an accumulator:
: subseq-indices ( seq subseq -- indices )
[ V{ } clone 0 ] 2dip '[
_ _ next-subseq-index dup [ [ pick push ] keep ] when
] loop drop ;
But that feels like we had to work hard to do that, directly using an
accumulator, conditionals, and some stack shuffling. Luckily we have some
higher level words that might help, for example the make
vocabulary
which has an implicit accumulator that we can use ,
or %
to push into:
: subseq-indices ( seq subseq -- indices )
[ 0 ] 2dip '[
[ _ _ next-subseq-index dup [ , ] when* ] loop
] { } make nip ;
Or even using a while* loop, which is less code:
: subseq-indices ( seq subseq -- indices )
[ 0 ] 2dip '[
[ _ _ next-subseq-index ] [ , ] while*
] { } make nip ;
But that feels like a lot too, simpler might be produce:
: subseq-indices ( seq subseq -- indices )
[ 0 ] 2dip '[ _ _ next-subseq-index dup ] [ ] produce 2nip ;
Or using follow, adjusting our start index and increment:
: subseq-indices ( seq subseq -- indices )
[ -1 ] 2dip '[ 1 + _ _ subseq-index-from ] follow rest ;
The outer logic approach would be something like “we need to loop from the start of the sequence, finding the next match, and accumulating it, until we hit some exit condition and then return a result” which you could write in a kind of non-functional stack pseudocode:
: subseq-indices ( seq subseq -- indices )
0 [ find-next-match ] [ accumulate-match ] while ;
Then you have to kind of figure out what goes into those blocks:
: find-next-match ( seq subseq n -- found-index/f )
-rot subseq-index-from ;
And also something like:
: accumulate-match ( accum found-index -- accum next-index )
[ suffix! ] keep 1 + ;
Taking those, and maybe thinking about what items should be on the stack and in what order to reduce stack shuffling, becomes something like:
: subseq-indices ( seq subseq -- indices )
[ V{ } clone 0 ] 2dip
'[ _ _ subseq-index-from ] [ [ suffix! ] keep 1 + ] while* ;
It is true that [ suffix! ] keep 1 +
is also [ suffix! ] [ 1 + ] bi
,
with varying aesthetics and ease of understanding, but sometimes when
learning a new language especially a stack language with
combinators,
it is sometimes easy to start with stack shuffling and then learn about
these forms later to see if they can improve your code.
Instead of those two stack approaches, we could instead use our local variables and write one big word in a manner similar to applicative languages, stepping back and focusing on the result we want:
:: subseq-indices ( seq subseq -- indices )
V{ } clone :> accum
0 :> i!
[ i seq subseq subseq-index-from ]
[ dup accum push 1 + i! ] while*
accum ;
When working on this stuff, it’s nice to remember you can put a B
to
set a
breakpoint in
places to examine the stack at some inner point, or perhaps write a comment
showing the incoming stack and optionally the outgoing stack that a piece of
code is expected to have so that you understand what is happening in the
next few lines:
! the next block of code finds the next index
! ( index seq subseq -- found-index )
! and pushes it into an accumulator
! ( accum found-index -- accum )
This was added to the developer
branch
in the sequences.extras
vocabulary.
We love to hear questions and it’s even better when we can provide answers or guidance for learning and solving problems. Feel free to join our conversations and explore learning Factor!
planet-factor is an Atom/RSS aggregator that collects the contents of Factor-related blogs. It is inspired by Planet Lisp.