Tool of Thought

APL for the Practical Man

"We make software the old-fashioned way, we write it."

A Query in R and Kap

May 8, 2026

Elias Mårtenson posts about a query in R and compares it to his APL-inspired language Kap. Elias of course provides a succinct, APL-style solution in Kap. Here we take a look at the R solution and compare it to FlipDB, a relational database management system written in APL. The problem starts with the following table:

Country  Amount  Discount
USA       2,000        10
USA       3,500        15
USA       3,000        20
Canada      120        12
Canada      180        18
Canada    3,100        21
UK          130        13
UK          160        16
...

Given this, the task is to produce a table of total discounted sales by country, excluding outliers defined as entries in each country with an amount greater than 10 times the median for that particular country. The result we are looking for is:

Country    Total
Australia    540
Brazil       414
Canada       270
France       450
Germany      513
India        648
Italy        567
Japan        621
Spain        594
UK           432
USA        8,455

We thus have a row-dependent where clause. That is, whether or not a row is included depends on other rows in the table. More precisely, whether or not a row is included in a group depends on the other rows in the group. This is the crux of the matter, and what makes the query problematic in some tools.

The R solution is given as:

purchases |>
  group_by(country) |>
  filter(amount <= median(amount) * 10) |>
  summarize(total = sum(amount - discount))

As the original post suggests, this is indeed a nice solution. There is a filter, or where clause, that occurs after grouping. In APL terms, we can visualize the reference to amount being a nested array and median being an aggregate function applied with an each, and scalar extension applying to yield a nested Boolean selection vector. Who knows how it is executed under the covers, but that's how it looks.
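To make that nested-array picture concrete, here is a minimal Python sketch of the same grouped filter. It is only an illustration of the idea, not how R or FlipDB executes the query. The function name `totals_excluding_outliers` is hypothetical, and the data is limited to the USA and Canada rows visible in the table above, since the rest of the table is elided.

```python
from collections import defaultdict
from statistics import median

# (country, amount, discount) rows copied from the visible part of the table.
purchases = [
    ("USA", 2000, 10), ("USA", 3500, 15), ("USA", 3000, 20),
    ("Canada", 120, 12), ("Canada", 180, 18), ("Canada", 3100, 21),
]

def totals_excluding_outliers(rows, factor=10):
    """Group by country, drop rows above factor * group median,
    then sum amount - discount over what is left."""
    groups = defaultdict(list)
    for country, amount, discount in rows:
        groups[country].append((amount, discount))
    totals = {}
    for country, items in groups.items():
        # One median per group; the comparison then runs over the group's rows.
        limit = factor * median(a for a, _ in items)
        totals[country] = sum(a - d for a, d in items if a <= limit)
    return totals

totals = totals_excluding_outliers(purchases)
print(totals)
```

For these rows the sketch gives 8,455 for the USA and 270 for Canada, matching the result table: the 3,100 Canadian row exceeds ten times the Canadian median of 180 and is dropped.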

How is this solved in FlipDB? We define the query as:

GroupBy:
  Name     Expression
  Country  Country
Measures:
  Name     Expression
  Total    sum (Amount - Discount) where Amount <= 10 * median Amount
OrderBy:
  Name     Direction
  Country  Up

(Note that we have to specify the ordering explicitly, whereas it seems to be a default in R. We don't show the OrderBy clause in the queries below in the interest of space.)

This is remarkably similar to the R solution. The difference is that we are filtering in-line, just for one column in the result set, whereas the R solution is filtering the entire table. In-line filtering is useful because we can apply different filters, or no filter at all, for different columns in the result table. For example, it might be useful to display the totals without excluding the outliers side-by-side for comparison which can be done by just adding another measure:

Measures:
  Name      Expression
  Total     sum (Amount - Discount) where Amount <= 10 * median Amount
  TotalAll  sum Amount - Discount

yielding:

Country    Total  TotalAll
Australia    540       540
Brazil       414       414
Canada       270     3,349
France       450       450
Germany      513       513
India        648       648
Italy        567       567
Japan        621       621
Spain        594       594
UK           432       432
USA        8,455     8,455

Here we can see by inspection that only Canada has outliers.

We can, like the R solution, specify a where clause for the query as a whole:

Where:
  Expression
  Amount <= 10 * median by Amount (group Country)
GroupBy:
  Name     Expression
  Country  Country
Measures:
  Name     Expression
  Total    sum Amount - Discount

However, unlike the R solution, this is applied before, and independently of, grouping. Thus we need to specify a grouping in the where clause itself, and then apply the median function to each group using the by operator. The by operator handles the details of applying an aggregate function to grouped data, and then replicating the results to line up with the ungrouped data. The advantage here over the R solution is that we may then group our query by some other column or value than Country. For example, we might group by region, but keep outliers defined within country:
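The behavior described for the by operator, aggregating within groups and then replicating each result back to the rows it came from, can be sketched in a few lines of Python. This is an assumption about the semantics based only on the description above, not FlipDB's implementation; the function name `by` and its argument order are chosen here for illustration.

```python
from statistics import median

def by(aggregate, values, keys):
    """Apply aggregate to the values of each key group, then replicate
    each group's result so it lines up with the ungrouped rows."""
    groups = {}
    for key, value in zip(keys, values):
        groups.setdefault(key, []).append(value)
    per_group = {key: aggregate(vals) for key, vals in groups.items()}
    return [per_group[key] for key in keys]

# The USA and Canada rows visible in the post's table.
countries = ["USA", "USA", "USA", "Canada", "Canada", "Canada"]
amounts = [2000, 3500, 3000, 120, 180, 3100]

medians = by(median, amounts, countries)
# Row-level where clause: one Boolean per row of the ungrouped table.
keep = [a <= 10 * m for a, m in zip(amounts, medians)]
print(medians)
print(keep)
```

Because the result has one element per row, the selection is independent of whatever grouping the rest of the query uses.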

Where:
  Expression
  Amount <= 10 * median by Amount (group Country)
GroupBy:
  Name      Expression
  Americas  Country in 'USA,Canada,Brazil'
  Europe    Country in 'UK,France,Germany,Italy,Spain'
  Asia      Country in 'Australia,Japan,India'
Measures:
  Name      Expression
  Total     sum Amount - Discount

which yields:

Region    Total
Americas  9,139
Europe    2,556
Asia      1,809

What if we want to group by region but also display totals that include the outliers, as we did above? We can't exclude the outlier rows with a where clause, because TotalAll needs them. Nor can we filter in-line with a plain median, because the grouping for the query (region) is no longer the grouping that defines the outliers (country). Rather than applying a where clause, we can pre-compute a column that flags outliers according to their country and then group based on region:

ComputedColumns:
  Name      Expression
  Outlier   Amount > 10 * median by Amount (group Country)
GroupBy:
  Name      Expression
  Americas  Country in 'USA,Canada,Brazil'
  Europe    Country in 'UK,France,Germany,Italy,Spain'
  Asia      Country in 'Australia,Japan,India'
Measures:
  Name      Expression
  Total     sum (Amount - Discount) where not Outlier
  TotalAll  sum Amount - Discount

which yields:

Partition  Total  TotalAll
Americas   9,139    12,218
Europe     2,556     2,556
Asia       1,809     1,809
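The two-step shape of this query, flag first and group afterwards, can be sketched in Python as follows. This is an illustration under assumptions, not FlipDB's execution plan. The names `region_of`, `outlier` and `report` are hypothetical, and only the eight rows visible in the post's table are used, so the figures are smaller than the published ones.

```python
from statistics import median

rows = [  # (country, amount, discount): the rows visible in the post's table
    ("USA", 2000, 10), ("USA", 3500, 15), ("USA", 3000, 20),
    ("Canada", 120, 12), ("Canada", 180, 18), ("Canada", 3100, 21),
    ("UK", 130, 13), ("UK", 160, 16),
]
region_of = {"USA": "Americas", "Canada": "Americas", "UK": "Europe"}

# Step 1: computed column, evaluated per country over the whole table.
amounts_by_country = {}
for country, amount, _ in rows:
    amounts_by_country.setdefault(country, []).append(amount)
limit = {c: 10 * median(v) for c, v in amounts_by_country.items()}
outlier = [amount > limit[country] for country, amount, _ in rows]

# Step 2: group by region; each measure decides whether to use the flag.
report = {}
for (country, amount, discount), is_outlier in zip(rows, outlier):
    measures = report.setdefault(region_of[country], {"Total": 0, "TotalAll": 0})
    measures["TotalAll"] += amount - discount
    if not is_outlier:
        measures["Total"] += amount - discount

print(report)
```

The outlier flag is fixed before any regional grouping happens, which is why the country-level definition survives the change of grouping.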

In FlipDB, computed columns in a query are executed before the where clause. This is useful, as we may want to reference them in the where clause, and if they are row-dependent we may want them computed on the entire table, not just on the rows the where clause includes. However, perhaps in this case, it would be useful if computed columns were to execute after the where clause. We could have two sets of computed columns, one executed before the where clause and one executed after, but that seems a bit of an overkill.

Debouncing

May 3, 2026

Consider a treeview and a corresponding panel on its right, as in a file explorer. If we click on an item in the treeview, a select event is generated which, typically, fires a callback function to update the panel on the right. We can be Olympic mouse champions but we can only move the mouse and click on another item so fast. Generally no matter how adept we are with the mouse, the system will be able to refresh the right-hand panel with no apparent lag.

Now consider using the down cursor key to scroll through the items of the tree. It takes no special skill to scroll through hundreds of items at lightning speed. Your cat can do this stepping on your keyboard. Each keypress yields a select event and a subsequent execution of the callback function to refresh the right-hand panel. If the refresh operation takes any significant time, the response will be sluggish, the user experience bad.

The solution to this problem is known as debouncing. Typically on each keypress a timer is created or reset, and only when the timer goes off is the callback executed. So if we hold down a key the timer is constantly being reset, and only when we release the key and pause for a moment does the timer have an opportunity to go off. Instead of running a hundred times, the callback executes only once, when we pause scrolling.
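As a point of reference, the conventional timer-based approach just described might look like this in Python. It is a generic sketch, not Abacus code; the class name `Debouncer` and the delay values are chosen only for the demonstration.

```python
import threading
import time

class Debouncer:
    """Delay a callback until `delay` seconds pass with no further calls."""

    def __init__(self, callback, delay):
        self.callback = callback
        self.delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def __call__(self, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()   # reset: discard the pending call
            self._timer = threading.Timer(self.delay, self.callback, args)
            self._timer.daemon = True
            self._timer.start()

refreshed = []
on_select = Debouncer(refreshed.append, delay=0.2)

for item in range(20):   # a held-down cursor key: a burst of select events
    on_select(item)
time.sleep(0.6)          # the user pauses; the last timer finally goes off
print(refreshed)
```

Only the last item of the burst reaches the callback, which is the whole point: the expensive refresh runs once, for the item the user actually stopped on.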

As far as we know, no UI frameworks build debouncing into components. Debouncing must be implemented by hand, on a case-by-case basis.

We attempt to build this functionality right into the Abacus TreeView component, and to make it a general solution for any event that needs debouncing.

The first problem we encounter is that a ⎕WC timer will not help us because we are multithreading and ⎕WC objects do not like to interact with different threads.

Fortunately the architecture of Abacus provides another way. Abacus has a wait loop where the document is waiting for a token with ⎕TGET from the main thread, signifying the user has taken some action in the browser. Previously we had not taken advantage of the left argument of ⎕TGET which allows for a timeout. If we want to debounce some event, we can signal to the system that we want a delay. We can set a timeout, exit ⎕TGET when the user pauses, and then handle the event. First, we add a few new system properties to the Document object:

     d.Debounce←⍬
     d.DefaultTimeout←2147483
     d.Timeout←d.DefaultTimeout

These are system properties; the programmer does not set them. The Debounce property tracks whether there is an event being debounced. It is either an empty array, or a two-element array containing the callback function and its argument. The DefaultTimeout property effectively specifies no timeout. The Timeout property will be used by ⎕TGET in the event loop. The TreeView OnSelect property (and potentially any event callback on any component) may be set to either a simple string with the name of the callback, or a namespace containing the callback function and a debounce delay:

    (CallbackFunction:'MyCallBack' ⋄ DebounceDelay:100)

That's all the programmer needs to do.

In the TreeView component, when a cursor key is pressed, we run DebounceSelect as a cover over FireSelect which actually fires the select event:

DebounceSelect←{
     ⍝ ⍺←TreeView
     d←⍺.Document
     l←⍺.OnSelect.DebounceDelay
     l=0:FireSelect ⍺
     d.Timeout←l÷1000
     d.Debounce←(A.FQP'FireSelect')⍺
     0
 }

This checks the value of DebounceDelay. If it is 0, the select event is fired immediately. Otherwise, the document Timeout property is set to the delay, converted from milliseconds to seconds, and the callback and its argument are assigned to the Debounce property, where they await execution when the ⎕TGET times out:

ThreadQueue←{
     ⍝ Queued Event Loop for Session/Document
     r←⍺.Timeout ⎕TGET ⎕TID
     0=≢r:⍺ ∇ ⍵⊣Debounce ⍺
     c←⊃r
     _←LogBrowserEvent c
     _←(⍎c.CurrentTarget⍎'On',c.Event)c
     _←PutDoneToken c
     ⍺ ∇ ⍵
 }

When a timeout occurs, Debounce is called, which executes a pending debounced event if there is one:

Debounce←{
     ⍝ ⍵ ←→ Document
     0=≢⍵.Debounce:0
     f a←⍵.Debounce
     ⍵.Debounce←⍬
     ⍵.Timeout←⍵.DefaultTimeout
     (⍎f)a
 }
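For readers who do not speak APL, the same pattern can be sketched in Python, with a blocking queue read standing in for ⎕TGET. This is a simplified analogue, not a translation of Abacus: every event is treated as a debounced select, and the names `Document`, `fire_pending` and `event_loop`, as well as the `None` sentinel, are hypothetical.

```python
import queue
import threading
import time

NO_TIMEOUT = None   # plays the role of DefaultTimeout: block indefinitely

class Document:
    def __init__(self):
        self.pending = None         # like Debounce: None or (callback, argument)
        self.timeout = NO_TIMEOUT   # like Timeout: passed to the blocking read

def fire_pending(doc):
    """Run the pending debounced callback, if any, and restore the default timeout."""
    if doc.pending is None:
        return
    callback, argument = doc.pending
    doc.pending = None
    doc.timeout = NO_TIMEOUT
    callback(argument)

def event_loop(doc, events, on_select, delay):
    """Wait for events; a timeout means the user paused, so fire what is pending."""
    while True:
        try:
            event = events.get(timeout=doc.timeout)
        except queue.Empty:
            fire_pending(doc)
            continue
        if event is None:           # sentinel used only to end this demo
            return
        doc.pending = (on_select, event)   # overwrites any earlier pending select
        doc.timeout = delay

selected = []
events = queue.Queue()

def user():
    for item in ("a", "b", "c"):    # rapid cursor-key presses
        events.put(item)
    time.sleep(0.5)                 # pause longer than the debounce delay
    events.put(None)

threading.Thread(target=user, daemon=True).start()
event_loop(Document(), events, selected.append, delay=0.1)
print(selected)
```

No separate timer object is needed: the timeout on the read that the loop performs anyway does the job, which is the attraction of the approach.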

What Could Go Wrong

At least two things can go wrong with this technique. A debounced event might not fire when it should, and a debounced event might fire when it should not.

For the first case, consider two treeviews (and associated panels), and a user scrolling through the first treeview. If the debounce delay is long enough, or the user quick enough, the user could tab to the second treeview and start scrolling there. The timeout on the first debounce would not happen, and then the next debounce would overwrite the first debounce, which would thus be lost. The first treeview would not update.

For the second case, again with a debounce delay long enough or a nimble user, the treeview could be deleted by some user action, and then the timeout occurs and the debounced event fires on a nonexistent element.

We can probably code around both of these problems, but as the delay should always be set as short as possible, they are not likely to occur.
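If it ever became necessary, one possible way to code around both problems is to keep one pending slot per component rather than one per document, and to check that the component still exists before firing. The Python sketch below shows the idea only. It is not part of Abacus; the class `PendingDebounces`, the component ids and the `is_alive` check are all hypothetical, and times are passed in explicitly to keep the example deterministic.

```python
class PendingDebounces:
    """Pending debounced callbacks, one slot per component."""

    def __init__(self):
        self._pending = {}   # component id -> (deadline, callback, argument)

    def schedule(self, component_id, callback, argument, now, delay):
        # A new event replaces only the pending event of the same component.
        self._pending[component_id] = (now + delay, callback, argument)

    def fire_due(self, now, is_alive):
        """Fire every entry whose deadline has passed, skipping deleted components."""
        for component_id in list(self._pending):
            deadline, callback, argument = self._pending[component_id]
            if deadline > now:
                continue
            del self._pending[component_id]
            if is_alive(component_id):
                callback(argument)

    def next_timeout(self, now):
        """Seconds until the earliest deadline, or None when nothing is pending."""
        if not self._pending:
            return None
        return max(0.0, min(d for d, _, _ in self._pending.values()) - now)

fired = []
alive = {"tree1", "tree2"}   # tree3 has been deleted by some user action
pending = PendingDebounces()
pending.schedule("tree1", fired.append, "tree1:item7", now=0.00, delay=0.10)
pending.schedule("tree2", fired.append, "tree2:item3", now=0.05, delay=0.10)
pending.schedule("tree3", fired.append, "tree3:item1", now=0.05, delay=0.10)
pending.fire_due(now=0.20, is_alive=alive.__contains__)
print(fired)
```

Here the first treeview's event survives the user moving to the second treeview, and the event for the deleted treeview is silently dropped. The cost is extra bookkeeping in the event loop, which may not be worth it when delays are short.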

Whistling Past the Graveyard

April 30, 2026

The AI hype is at a fever pitch with tech CEOs declaring coding as we know it is done, over, kaput, a solved problem, while everyone from Uncle Bob to Linus Torvalds and DHH is embracing it, albeit in a measured way, with much less fervor than AI CEOs.

What's an APL programmer to think? We are, by definition, contrarians. After all, we use a language that's over 60 years old and fundamentally unchanged over that time. We use strange symbols. We don't write loops. Our opinion of AI will not follow the crowd.

Is there anything more absurd than using natural language to explain a complex problem to a computer? This is what APL is for. Why would we take a concise, powerful, unambiguous notation designed specifically for humans and not for computers, and replace it with a verbose, vague, error-prone means of communication between man and machine? This is one big step backwards. Who in their right mind wants to spend all day having a discussion and argument with a laptop about what to code? We can think of nothing worse than having to essentially talk to the computer for a living.

Perhaps AI is for programmers who use languages designed for computers rather than people.

Boilerplate

We often hear programmers lauding AI because it relieves them of the onerous task of writing boilerplate code. There are two problems here. First, if your language requires boilerplate, that itself is a problem. Second, it is one of the fundamental jobs of a programmer to eliminate duplication and boilerplate. What kind of programmer writes boilerplate code more than once? Not a good or experienced one.

Documentation

Another task that programmers hope to rid themselves of is writing documentation. If documentation is viewed as something to be tacked on to the end of a project after the design and programming is done, then documentation is indeed a chore, and largely a worthless effort too. Documentation is design. Nothing shows the flaws in a design more than explaining it on paper to the person who is going to use it. Documentation, like coding itself, is an essay, an attempt to understand a problem. Writing documentation as you code allows you to write what should be, rather than what is. It allows you to improve the design rather than accept it as given.

Tests

Programmers also hate writing tests, and hope AI will relieve them of this job too. But tests are just documentation written in code rather than natural language. Tests perform the same function as documentation: design.

UI

Then there is the UI, yet another dreaded task of the lofty software engineer. But putting a button on a screen is not hard. Positioning it, making it a nice color, giving it a drop-shadow, or nice hover effect, none of this is hard. What is hard is deciding that the button should exist at all. Or, if it should exist, what screen should it be on. Or what other controls should be near it. Or what the button should do when clicked. The difficulty of UI is in the design, not the implementation.

Craftsmanship

A hand-made software program is not like a pair of hand-made shoes. Hand-made shoes can be sold once and used by one person at a time. Hand-made software can be copied infinitely for zero cost. Software craftsmanship is not like physical world craftsmanship. Better tools never remove the need for software craftsmanship, and in fact do the opposite. Software is design, not manufacturing.

DIY

The AI promoters want everyone to write their own, disposable programs. This is silly. Some people want to build their own computers, or build their own guitars, or build their own cars, or their own APL interpreters. These people are valuable and more power to them. But the vast majority of people want to use a computer, a guitar or car that has been very carefully crafted by people deeply immersed in the specific problem. The same holds true for software. Even programmers want to use software carefully crafted by other programmers.

The Scam

The genius of the AI salesmen is that they have designed a software product that is bug-free by definition. If it is wrong, it just says "Sorry! My mistake. You are absolutely right. Let's fix that!" and then proceeds to get it wrong again. And again. And again. The AI companies want you to buy into this massive, expensive dependency just to write a line of code. AI for coding is a solution to a problem that shouldn't exist.
