Introduction and Terminology

K is a high level system programming language available for Windows, Solaris, Linux, and maybe a few other Unix flavors upon request. It is mixture of APL, a functional programming language, and has many unique features not found elsewhere. The K system is produced by Kx. They produce a very fast and efficient commercial relational database, KDB — written in K. You can get copies of the interactive interpreter free for educational, research, or toy purposes, and there is a non-interactive, runtime-only interpreter that is strictly free.

Contrary to some opinion, K is a real-world language, made for real use and sold as a real product. It is not a toy language. It is not a Turing tarpit language. It is not intentionally obfuscated or made to be intentionally strange, like Befunge or Intercal.

K was originally developed by Arthur Whitney, who was an influential member of the APL community. Before K, Arthur was asked to write Dr. Kenneth Iverson's APL-successor, the J language, but he decided to go his own way. This isn't his first language either; he is also the creator of the Morgan Stanley financial language A+. From 1994 to 1997 the Union Bank of Switzerland purchased exclusive rights for the use of K, but now it is available to all of us.

One of the large draws of K is the extreme programmer productivity it offers, its incredibly fast execution speed, and its very small executable units. One of the proofs of these claims is Brian Kernighan's Experiments with Scripting and User-Interface Languages, in which K easily outperforms other favorite languages. Another example of these claims met is how KDB outperforms Oracle on TPC benchmarks in both query speed and data storage size while essentially being written by one person. Also, K code is very dense, and it is typical to see 100:1 code size reduction when migrating to from C to K. I have heard of almost a 1000:1 reduction when a project moved from Java and SQL to K and KSQL (KDB's query language).

K is an exceptional language for dealing with mathematical analysis, financial prediction, or anything that handles bulk data. I have used it for a document search engine and other computational linguistic tasks. K doesn't work very well when the problem is not vectorizable and scalars are the largest datum to operate on. Often though, algorithms can be written in either method. However, some problems have been almost impossible to vectorize, such as Ackermann's function, and are not good to attempt in K.

K has bindings to other popular language such as C, Java, VisualBasic, and Excel. There has also been work done on bindings to Python and Mozilla's XUL. K's builtin interprocess communication and binary format for objects is very simple and documented so making other systems interact with K is often equally simple.

Even though K is an interpreted language, the source code is somewhat compiled internally, but not into abstract machine code like Java or Python. The interpreter can be invoked interactively or non-interactively, and you can also compile a source file down to a binary file for product distribution, if you wish.

One of the hardest things for many people to get over at first is the way K looks. Even the strongest K enthusiast will freely admit that K tends to look like line noise. I came from a Scheme background and already had my eyes hurt a couple times trying to learn Perl. When my roommate introduced me to K I ridiculed it and couldn't imagine ever wanting to use it. However, after a few sessions, my eyes began to relax. Unlike Perl, the syntax is extremely regular and there is almost no syntactic sugar or special cases. K is even translatable into English; there is a program that will take a K expression and produce its English translation (see http://www.kx.com/a/k/read.k). To allow this translation each operator is given is a short English name and sometimes small combinations of operators are also given common names. This is similar to building a vocabulary in a natural language. You will begin to look at larger segments of code and abstract away the actual details of what happens to the data, while understanding the higher-level concepts and transformations more easily.

To make even stronger connections with linguistics, grammatical terms are used to describe K. Operators are called verbs, and data is called nouns. There are also operators that modify other operators (these will be described later) that are called adverbs. Stringing some nouns, verbs, and adverbs together will produce clauses and sentences. This dialogue has been inherited from APL and is often abandoned for the more commonplace names of operators, functions, and variables.

Comments and Conventions

When small code segments are presented, I will use the conventions of the language and interpreter for comments and display. Forward-slash (/) is special when it comes at the beginning of a line or it has space to the left of it, then it starts a comment that lasts until the end of the line. Some people will also disguise comments as strings, since that will not effect the results of a computation. The interpreter prompts the user with two spaces to signify that it is ready to accept input. It will print all of its output without any leading space. I will do the same, showing interpreter input preceded by two spaces and interpreter output unindented.

K tends to favor simplicity over sugar. One thing that may confused people easly is K has no precedence rules. Everything is parsed from right to left. This makes it easy to see what happens next in a computation, but it also takes a little getting used to. For example, 3*2+1 in K will produce 9, instead of the more usual 7 in other language. This is always the case, except when a set of parentheses is come across. Whatever inside the parentheses is first evaluated then normal evaluation resumes. People will frequently play with the order of their statements to reduce the number of characters in the expression, instead of placing parentheses around it.

Values

There are four simple types of values in K -- integer, floating point, character, and symbol; a special null type for the singular value of _n; and two composite types -- dictionaries and lists. The composite types are containers that others values may be stored and retrieved from. Lists are further classified as either homogenous or heterogeneous, and all homogenous lists carry the type of value they contain around with them. A list where all the simple elements are of the same type are called vectors. K is optimized to handle these homogenous lists and vectors since they appear in programming so frequently.

Lists are represented in different ways. The most general case is surrounded by parentheses with each element separated by a semi-colon. If the list is homogenous the parentheses and semi-colons will be elided and you will simply see the elements separated by a space. If all the elements are characters then the lists will be a string surrounded by double-quotes. To index a list you use brackets ([]) with multiple indices allowed. To index a multidimentional list use a semi-colon to separate the dimensions, but leaving a dimension blank selects all (this the same as using _n as the index).

A symbol is an interned string that the interpreter uses for variable lookup. They are created by the backtick followed by any legal variable name: a letter or underscore followed by any combination of letters, numbers, underscores, and dots -- but no more than two consecutive dots (since they have a special meaning that will re explained later). If you would like the symbol to not follow a legal variable name patters, you may enclose it in double quotes.

Examples:

  (1; 2.3; 4; 5.6)       / heterogeneous list of integers and floats
  1 2 3 4 5              / homogeneous list of integers 
  1.2 3.4 5.6            / homogeneous list of floats 
  "quack"                / homogeneous list of characters 
  (1 2; 3.4 5.6; "meow") / a list of lists 
  (1;2 3;4 5 6)          / a vector of integers 
 
  "abcdefghijlkmnopqrstuvwxyz"[14 8 13 10]
"oink" 
 
  ("qwerty";"poiuy";"asdf";"jhgfdsa")[;3] / slicing or projection 
"ruff" 
 
  `moo
  `"http://www.moo.com"

Verbs

K is a very small language, so it needs to make the most of everything. Unlike APL, It uses only ASCII, so it already starts off with not too many characters to use and abuse. To allow for compact code operators are overloaded with two cases: a monadic (one argument) and diadic (two argument) use. Sometimes these uses are related and sometimes they are not. And each case the operator sometimes has slightly different behavior depending on the type or domain of the arguments.

Examples:

  !4      / enumerate: a list of integers from 0 to x-1 
0 1 2 3 
 
  5!3     / mod: the residue of the left modulus the right 
2 
 
  2!1 2 3 / rotate: spins right back-to-front left number of positions. 
3 1 2 
 
  ,2  / enlist: a one item list containing only 2 
,2 
 
  1,2 / join: forms one list of the left and right argument 
1 2 
 
  |1 2 3 / reverse: reverses a list 
3 2 1 
 
  0|1    / max: the maximum also boolean OR 
1 
 
  &1 2 3 4     / where: returns the number of units specified 
0 1 1 2 2 2 3 3 3 3 
 
  &0 1 1 0 0 1 / where: an important use to get the indices of the 1s 
1 2 5 
 
  0&1          / min: the minimum also boolean AND 
0 
 
  4<5    / less-than: predicate of "is the left smaller than the right" 
1 
  <7 4 9 / grade-up: sorts indices in ascending order 
1 0 2 
 
  =1 0 1 0 0 3 0 1 3 1 / group: groups all indices of same value 
(0 2 7 9 
 1 3 4 6 
 5 8) 
 
  2=0 1 2 3 4          / equals: compares values 
0 0 1 0 0 
 
  ?1 0 1 0 0 3 0 1 3 1  / unique: all unique elements in order seen 
1 0 3 
 
  1 0 1 0 0 3 0 1 3 1?3 / find: the first indice of the right in left 
5 

Variables and Bindings

K is a dynamically, strongly typed language and, variables are not declared, but they come into existence when you assign a value to it. This can be done anywhere, even in the middle of an expression since there is no distinction between statements or expressions. If you try to read a value from a variable that has not yet been assigned to, you will raise an error. There are also no pointers. In true functional style, when you assign to a variable a deep copy of the value is made (K does this lazily, though). Assignment is done via the colon and it is read as "gets" or "is." As a special case, when an assignment is the last thing in an expression, null is returned (this helps prevent cluttering up the display log). You can force the return of a value from an assignment statement by using a case of the monadic colon.

Examples:

  a:"moo"       / a gets the string "moo" 
  b:!10         / b gets enumerate 10 (integer list from 0 to 9) 
  :c:b          / c gets the value of b, but changes to b do not effect c 
0 1 2 3 4 5 6 7 8 9 
 
  :h:(g*2),g:1+2 / h get g times 2 join g, where g gets 1 plus 2 
6 3 
 
  g 
3 

User-defined Functions

Braces ({}) are used to create functions; they are the equivalent of lambda in Lisp. Often they are used then the resulting function is assigned to a variable, but sometimes not. To assist in making the code compact, if a function requires three or fewer arguments, K will allow you to implicitly use x, y, and z as the arguments. If you need more than three arguments you must declare them all. All functions return the value of the last executed statement, even if that statement return null. To call a function you use brackets (just like a list index). If you do not supply an argument when calling a function, it will project.

Examples:

  pyth:{_sqrt(x*x)+y*y} / notice the two implicit arguments 
  pyth[30;40]
50.0 
 
  pyth[3 15;4 20]       / cute huh. 
5 25.0 
 
  dist:{[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2} 
  dist[1;1;4;5]
5.0 
 
  :d:dist[1;1]          / project or curry the first two arguments 
{[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2}[1;1]
 
  d[7;9]
10.0 
 
  :e:dist[1;;4]         / project on first and third argument 
{[x1;y1;x2;y2] _sqrt((x2-x1)^2)+(y2-y1)^2}[1;;4]
 
  e[2;6]
5.0 
 
  inc:1+ 
  inc 8 
9 

System Functions and Variables

After running out of punctuation Arthur made system function. Every symbol beginning with an underscore is reserved for either a system variable or system function. System functions use infix, like their less readable cousins, but like user defined functions they cannot be overloaded with monadic and dyadic cases (in the next version of K this will be changed and users will be able to define infix functions and overload them with n-adic cases).

Examples:

  3_draw 5        / list of 3 random numbers from 0 to 4 
2 2 4 
 
  2_draw 0        / list of 2 random real numbers from 0 to 1 
0.2232866 0.9504653 
 
  4_draw-4        / deal: list of 4 random nonrepeating numbers from 0 to 3 
2 0 1 3 
 
  4 13_draw-52    / deal a deck of cards into four piles 
(29 27 10 0 23 3 28 5 24 16 40 8 22 
 51 20 36 47 18 31 26 11 44 37 38 9 13 
 39 42 34 50 21 6 19 46 48 45 14 43 2 
 33 49 4 25 41 30 35 7 32 17 1 12 15) 
 
  1 3 4 5 7 9_bin 4     / binary search through list returning index 
2 
 
  1 3 4 5 7 9_binl 2 4 6 / binary seach for a list of numbers 
1 2 4 
 
  16_vs 543212    / vector from scalar: changes base to 16 
8 4 9 14 12 
 
  5 3 2_vs 21     / also does variable change of base 
3 1 1 
 
  5 3 2_sv 3 1 1  / scalar from vector: the inverse 
21 
 
  _host`kuro5hin.org        / returns ip address as integer 
-815566008 
 
  256_vs _host`kuro5hin.org / presentation form 
207 99 115 72 

Adverbs

This is where K starts to set itself from apart from most of the common programming languages in use today. You rarely write loops in K (KDB is 100% loop-free), instead you use adverbs. An adverb modifies a function, returning another function, changing the ways it operates over its arguments and what it does with it's return values. Here is a small selection of adverbs' usages (there are other uses that are not covered here). Examples:

  +/1 2 3 4          / plus-over (sum): is similar to 1+2+3+4 
10 
 
  +\1 2 3 4          / plus-scan: returns the intermediate values of +/ 
1 3 6 10 
 
  |/5 3 7 4 2        / max-over: compares all items like 5|3|7|4|2 
7 
 
  ,/(1 2;(3 4;5);6)  / join-over: (1 2),(3 4;5),6 
(1;2;3 4;5;6) 
 
  ,//(1 2;(3 4;5);6) / flatten: explained below 
1 2 3 4 5 6 
 
  3 4_draw'-3 -4     / draw-each: (3_draw-3),(4_draw-4) 
(1 2 0 
 2 0 1 3) 
 
  2_draw/:10 100     / draw-right-each: (2_draw 10),(2_draw 100) 
(7 7 
 45 91) 
Let me break down flatten for you. Join is a function that takes two values. Over modifies join and the result is a function that now takes one argument (a list) and joins all the elements of that list. Looking at the above example, when join was given a nested list, it simple stitched all the elements into a single list, removing one level of depth. However, we would like to remove all level of depth. We want join-over again to remove another level of depth, and join-over again to remove another level of depth, until the list is completely flat. This is what converge will do for us. The second application of over takes this monadic function, join-over, and continually applies it until the result no longer changes. If we didn't want to flatten the topmost list, but instead we had a list of lists to flatten, we would only want to apply flatten to each element of the top-level list. That is what flatten-each (,//') would do. If we wanted to keep the top two levels of depth we would write flatten-each-each (,//'') and this is where things start too look like line noise.

Conditionals

Although rarely used, there are a few conditionals; most often used is the colon. It is similar to cond in Lisp: it takes pairs of arguments and an optional final argument. The first argument of each pair is tested for truth (0 is false, all other integers are true, anything besides an integer is an error). If it is true then the result of evaluating the second of the pair is returned. If it is false then the next pair is tested. If all the pairs have been exhausted, then the final argument is evaluated and the result returned. If there is no final argument and all the conditions are false, then null is returned.

Examples:

  :[0;"true";"false"]
"false" 
 
  s:{:[x>0;"+"; x<0;"-"; "0"]} / returns the sign of x or 0 
  s 4 
"+" 
 
  s -3 
"-" 

Naive Primality Test

I will conclude the basics with a naive primality predicate and try to explain it.
  isprime:{&/x!/:2_!x}  / min over x mod right-each 2 drop enumerate x 
  isprime 14 
0 
 
  isprime 7 
1 
Analyzing from right to left. We create a list of all integers from 0 to x (exclusive), then we remove the first two elements (2_), so we are left with a list from 2 to x (exclusive). Next we determine the residue of x and each of the numers in the list. Finally, we calculate the minimum of the residues. If the number is prime, then the lowest residue will be 1 and be considered true. If the number was composite there will be a 0 residue for some value.
  !14 
0 1 2 3 4 5 6 7 8 9 10 11 12 13 
 
  2_!14 
2 3 4 5 6 7 8 9 10 11 12 13 
 
  14!/:2_!14 
0 2 2 4 2 0 6 5 4 3 2 1 
 
  &/14!/:2_!14 
0 

The K-tree

Here is a quick, small look at one of the unique elements of K, called the K-tree. It falls nicely into the K philosophy of keeping things simple and powerful. The K-tree can be used for modularization of programs, as a variable scoping mechanism, for GUI design, and as a rudimentary object system.

All variables exist somewhere on the K-tree. To reference a variable you separate the name of the variable and each branch name with periods. The root of the tree is an empty symbol. For example, you might have a fully qualified variable named .branch.subbranch.variable.

A branch is really just a dictionary. If I were to assign a dictionary that contained symbols sym and bol to the variable .tr.ee then you would be able to access .tr.ee.sym and .tr.ee.bol. It goes the other way, too. If you were to create the variables .dict.ion and .dict.ary then the variable .dict would be a valid dictionary. This makes the language very reflective since you can now modify variables locations, scopes, and manually manipulate extents.

Whenever you are in the K environment you are always running in a branch of the K-tree. When the K interpreter starts it places you in the .k branch. You can move around the tree with the directory (\d) command. For example \d .tw.ig would create the new branch tw then create a subbranch ig. You can inspect all the variables in a branch with the list variables (\v) command. Often a script will begin with a change directory command and then define all of its variables in that (and maybe a few more) branches. This effectively uses the K-tree as a module system.

  \d .test          / create a new directory off the root 
  \d ement          / create a sub-branch 
  \d                / show the current directory 
.test.ement 
 
  new:`small        / put some values in the directory 
  old:`big 
  \v                / list the contents o 
new old 
 
  \d ^              / back up one directory in the tree 
  \d 
.test 
 
  \v 
ement 
 
  ement             / inspect the value of ement 
.((`new;`small;) 
  (`old;`big;)) 
 
  .test.ement.new   / fully-qualified 
`small 
 
  ement.old         / partially qualified 
`big 
 
  ement`old         / index like an array 
`big 

Other Unique Elements of K

The K philosophy itself has some unique insights on how to treat data and the appropriate way to organize data for efficient processing, however within the language there are some features that you will not find in any other language:

K uses the tree to hold an attribute structure. These attributes are used for documentation strings, GUI representations, for other functionality in K, and for whatever you decide to use them for. Some of the programming environments for K make entensive use of attributes to store information about where functions are defined and other administrative information.

K also has a the concept of dependencies and triggers. They are used for efficient, demand-based calculation of data and executing callback code whenever a value is changed (this is how the timer system in K works). K will keep track of out of date dependencies for you and only do the minimal amount of work necessary to update values.

K has a unique, declarative GUI subsystem that is based around the K tree and triggers. It takes very little effort to put a GUI on your application. K takes the approach that you show variables and changes made to that variable on the screen are immediately reflected in the program without any work by you.

K's interprocess communication (IPC) and network communication systems is also based around callbacks and message handlers. There are simple K primitives that will ship entire K data structures around for you, or you may do it yourself. The goal of the IPC system is like the goal with rest of K, make it fast, simple, powerful, and highly useful.

In a time when programming languages are lacking in originality and are not bringing new ideas to the table, K succeeds where others fail. But K is not just a research language, not appropriate for real-world use. While the community may be small some of the users of the language are very big. With recent implementations by Island ECN and the US government, this looks to only be getting bigger, too. The next version of the language will fix many of the nagging holes and annoyances and take away the line noise factor that has pushed many away.

Here are some closing simple segments of K that are informative on how the K way of approaching a problem may be different:

  cut:{1_'(&x=*x)_ x:" ",x} 
  cut "Scoop ate my spaces" 
("Scoop" 
 "ate" 
 "my" 
 "spaces") 
 
  fibonacci:{x(|+\)\1 1} 
  fibonacci 5 
(1 1 
 2 1 
 3 2 
 5 3 
 8 5 
 13 8) 
 
  first:*:
  residue:{y!x}
  euclid:{first(|residue\)\x,y} / Euclid's Algorithm
  euclid[24;40] 
(24 40 
 16 24 
 8 16 
 0 8)