Introduction

Calcit scripting language.

An interpreter for Calcit snapshots, friendly to hot code swapping.

Calcit is an interpreter built with Rust, and also a JavaScript code emitter. It's inspired mostly by ClojureScript. Calcit-js emits JavaScript in ES Modules syntax.

You can try the Calcit WASM build online for simple snippets.

TODO

Overview

  • Immutable Data

Values and states are represented in different data structures, following the semantics of functional programming. Internally it's im in Rust and a custom finger tree in JavaScript.

  • Lisp (Code is Data)

Calcit-js was designed based on experiences from ClojureScript, with a bunch of builtin macros, and it offers a similar experience. Calcit gets much of its power from macros while keeping its core simple.

  • Indentations

With the bundle_calcit command, Calcit code can be written as an indentation-based language, so you don't have to match parentheses like in Clojure. It also means you need to handle indentation very carefully.

  • Hot code swapping

Calcit was built with hot swapping in mind. Combined with calcit-editor, it watches code changes by default and re-runs the program on updates. For calcit-js, it works with Vite and Webpack to reload, learning from Elm, ClojureScript and React.

  • ES Modules Syntax

To leverage the power of modern browsers with the help of Vite, we need another ClojureScript that emits import/export for Vite. Calcit-js does this! This page is built with Calcit-js as well; open the Console to find out more.

Features from Clojure

Calcit is mostly a ClojureScript dialect. So it should also be considered a Clojure dialect.

There are some significant features Calcit is learning from Clojure:

  • Runtime persistent data by default; you can only simulate states with Refs.
  • Namespaces
  • Hygienic macros (although less powerful)
  • Higher order functions
  • Keywords
  • Compiles to JavaScript, with interop
  • Hot code swapping when code is modified, triggering an on-reload function
  • HUD for JavaScript errors

Also there are some differences:

| Feature | Calcit | Clojure |
| --- | --- | --- |
| Host Language | Rust, and use dylibs for extending | Java/Clojure, import Maven packages |
| Syntax | Indentations / Syntax Tree Editor | Parentheses |
| Persistent data | unbalanced 2-3 Tree, with tricks from FingerTree | HAMT / RRB-tree |
| Package manager | git clone to a folder | Clojars |
| bundle js modules | ES Modules, with ESBuild/Vite | Google Closure Compiler / Webpack |
| operand order | at first | at last |
| Polymorphism | at runtime, slow; e.g. .map ([] 1 2 3) f | at compile time, also supports multi-arities |
| REPL | only at command line: cr --eval "+ 1 2" | a real REPL |
| [] syntax | [] is a built-in function | builtin syntax |
| {} syntax | {} (:a b) is macro, expands to &{} :a :b | builtin syntax |

Also, Calcit is a one-person language, so it has far fewer features than Clojure.

Calcit shares many paradigms I learnt while using ClojureScript. Meanwhile, it's designed to be friendlier to the ES Modules ecosystem.

Indentation-based Syntax

Calcit was designed based on tools from the Cirru Project, which means it is suggested to program with Calcit Editor. The editor emits a file compact.cirru containing the data of the code. That data is written in Cirru EDN, which is like Clojure EDN but based on Cirru Syntax.

For Cirru Syntax, read http://text.cirru.org/, and you may find a live demo at http://repo.cirru.org/parser.coffee/. A normal snippet looks like this:

defn fibo (x)
  if (< x 2) (, 1)
    + (fibo $ - x 1) (fibo $ - x 2)

You can also write code in text files and bundle them into compact.cirru with the command-line tool bundle_calcit.

To run compact.cirru, Calcit internally performs these steps:

  1. parse Cirru Syntax into vectors,
  2. turn Cirru vectors into Cirru EDN, which is a piece of data,
  3. build program data with quoted Calcit data (very similar to EDN, but with more data types),
  4. interpret program data.
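As a rough illustration of step 1 only, here is a minimal Rust sketch that turns a single line of Cirru into nested vectors. It is not Calcit's parser: the names `Cirru`, `parse_line` and `to_lisp` are hypothetical, and the sketch ignores indentation, quoted strings containing spaces, and escapes. It only handles parentheses and `$`, which nests the rest of the current list.

```rust
/// A Cirru node: either a token or a nested list of nodes.
#[derive(Debug, Clone, PartialEq)]
pub enum Cirru {
    Leaf(String),
    List(Vec<Cirru>),
}

/// Splits a line into tokens, treating parentheses as separate tokens.
fn tokenize(line: &str) -> Vec<String> {
    line.replace('(', " ( ")
        .replace(')', " ) ")
        .split_whitespace()
        .map(String::from)
        .collect()
}

/// Reads items until a closing parenthesis or the end of input.
fn parse_items(tokens: &[String], pos: &mut usize) -> Vec<Cirru> {
    let mut items = Vec::new();
    while *pos < tokens.len() {
        match tokens[*pos].as_str() {
            // leave ")" for the caller that opened the list
            ")" => break,
            "(" => {
                *pos += 1;
                let inner = parse_items(tokens, pos);
                *pos += 1; // skip the matching ")"
                items.push(Cirru::List(inner));
            }
            // "$" nests everything up to the end of the current list
            "$" => {
                *pos += 1;
                items.push(Cirru::List(parse_items(tokens, pos)));
            }
            leaf => {
                items.push(Cirru::Leaf(leaf.to_string()));
                *pos += 1;
            }
        }
    }
    items
}

/// Parses one line into a list node (hypothetical helper).
pub fn parse_line(line: &str) -> Cirru {
    let tokens = tokenize(line);
    Cirru::List(parse_items(&tokens, &mut 0))
}

/// Prints a node with explicit parentheses, for inspection.
pub fn to_lisp(node: &Cirru) -> String {
    match node {
        Cirru::Leaf(s) => s.clone(),
        Cirru::List(xs) => {
            let parts: Vec<String> = xs.iter().map(to_lisp).collect();
            format!("({})", parts.join(" "))
        }
    }
}
```

With this sketch, `to_lisp(&parse_line("+ (fibo $ - x 1) (fibo $ - x 2)"))` gives `(+ (fibo (- x 1)) (fibo (- x 2)))`, which shows how `$` stands in for a pair of parentheses.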

Since Cirru itself is a very generic lispy syntax, it may represent various semantics, both for code and for data.

Inside compact.cirru, code is like quoted data inside (quote ...) blocks:

{} (:package |app)
  :configs $ {} (:init-fn |app.main/main!) (:reload-fn |app.main/reload!)

  :entries $ {}
    :prime $ {} (:init-fn |app.main/try-prime) (:reload-fn |app.main/try-prime)
      :modules $ []

  :files $ {}
    |app.main $ {}
      :ns $ quote
        ns app.main $ :require
      :defs $ {}
        |fibo $ quote
          defn fibo (x)
            if (< x 2) (, 1)
              + (fibo $ - x 1) (fibo $ - x 2)

Notice that in Cirru, |s represents the string "s"; Cirru always prefers prefixed syntax. "\"s" also means |s, and the double quote marks exist to provide a context for character escaping.
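The prefix rule can be sketched in Rust as below. This is an assumption-laden illustration rather than Calcit's implementation: `Token` and `read_token` are hypothetical names, and the input is assumed to be a token whose surrounding double quotes and escapes were already resolved by the parser.

```rust
/// Hypothetical classification of an already-unquoted Cirru token.
#[derive(Debug, PartialEq)]
pub enum Token {
    /// A string literal, with the `|` or `"` prefix removed.
    Str(String),
    /// Anything else (symbols, numbers, keywords, ...), left untouched.
    Other(String),
}

/// Treats a leading `|` or `"` as the string marker.
pub fn read_token(raw: &str) -> Token {
    if let Some(rest) = raw.strip_prefix('|') {
        Token::Str(rest.to_string())
    } else if let Some(rest) = raw.strip_prefix('"') {
        Token::Str(rest.to_string())
    } else {
        Token::Other(raw.to_string())
    }
}
```

Under these assumptions, `read_token("|s")` and `read_token("\"s")` both yield `Token::Str("s")`, while `read_token("fibo")` stays a non-string token.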

More about Cirru

A review of Cirru in Chinese:

Installation

You need to install Rust first, then install Calcit with Rust:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

cargo install calcit

Then Calcit is available as a command-line tool:

cr -e "echo |done"

Modules directory

There is no package manager yet; modules need to be managed with git tags.

Configurations inside calcit.cirru and compact.cirru:

:configs $ {}
  :modules $ [] |memof/compact.cirru |lilac/

Paths defined in the :modules field are loaded as files from ~/.config/calcit/modules/, e.g. ~/.config/calcit/modules/memof/compact.cirru.

Modules that end with / are automatically suffixed with compact.cirru, since it's the default filename.
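The lookup rule above could be sketched in Rust like this. The function name `resolve_module` is hypothetical, and the caller is assumed to pass the already-expanded modules directory (the sketch does not expand `~`).

```rust
use std::path::{Path, PathBuf};

/// Resolves a `:modules` entry against the modules directory.
/// Entries ending with `/` get the default filename appended.
pub fn resolve_module(modules_dir: &Path, entry: &str) -> PathBuf {
    let mut file = entry.to_string();
    if file.ends_with('/') {
        file.push_str("compact.cirru");
    }
    modules_dir.join(file)
}
```

For the configuration shown earlier, `lilac/` would resolve to `lilac/compact.cirru` under the modules directory, and `memof/compact.cirru` would be used as written.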

To load modules in CI environments, make use of git clone.

Rust bindings

Calcit in Rust can be extended with dynamic libraries. A demo project can be found at https://github.com/calcit-lang/dylib-workflow

Currently two APIs are supported, based on Cirru EDN data.

First one is a synchronous Edn API with type signature:


use cirru_edn::Edn;

#[no_mangle]
pub fn demo(args: Vec<Edn>) -> Result<Edn, String> {
  // read `args`, do the work, then return Ok(..) or Err(..)
  todo!()
}

The other one is an asynchronous API whose callback can be called multiple times. It relies on the Arc type (not sure yet if we can find a better solution):


use cirru_edn::Edn;
use std::sync::Arc;

#[no_mangle]
pub fn demo(
  args: Vec<Edn>,
  handler: Arc<dyn Fn(Vec<Edn>) -> Result<Edn, String> + Send + Sync + 'static>,
  finish: Box<dyn FnOnce() + Send + Sync + 'static>,
) -> Result<Edn, String> {
  // start the task, call `handler` as results arrive, call `finish` when done
  todo!()
}

In this snippet, the function handler is used as the callback, which can be called multiple times.

The function finish is used to indicate that the task has finished. It can be called once, or not called at all. Internally Calcit tracks a counter to see whether all asynchronous tasks have finished. The process needs to keep running while there are tasks running.

Asynchronous tasks are based on threads, which are currently decoupled from the core features of Calcit. We may need techniques like tokio for better performance in the future; the current solution is still quite naive.
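To make the counter idea concrete, here is a small standalone Rust sketch using only the standard library. It is not Calcit's runtime code: `TaskTracker` and `spawn_demo_task` are hypothetical names, and the handler takes a plain `usize` instead of `Vec<Edn>` so that the example has no external dependencies.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical tracker counting unfinished asynchronous tasks.
#[derive(Clone, Default)]
pub struct TaskTracker {
    pending: Arc<AtomicUsize>,
}

impl TaskTracker {
    /// Registers a task and returns the `finish` callback for it.
    pub fn start(&self) -> Box<dyn FnOnce() + Send + Sync + 'static> {
        self.pending.fetch_add(1, Ordering::SeqCst);
        let pending = self.pending.clone();
        Box::new(move || {
            pending.fetch_sub(1, Ordering::SeqCst);
        })
    }

    /// Number of tasks that have not called `finish` yet.
    pub fn pending(&self) -> usize {
        self.pending.load(Ordering::SeqCst)
    }

    /// Blocks until every registered task has called `finish`.
    pub fn wait_all(&self) {
        while self.pending() > 0 {
            thread::sleep(Duration::from_millis(1));
        }
    }
}

/// Calls `handler` three times on a worker thread, then calls `finish` once.
pub fn spawn_demo_task(
    handler: Arc<dyn Fn(usize) + Send + Sync + 'static>,
    finish: Box<dyn FnOnce() + Send + Sync + 'static>,
) {
    thread::spawn(move || {
        for i in 0..3 {
            handler(i);
        }
        finish();
    });
}
```

A host would call `tracker.start()` before handing `finish` to the task, and `tracker.wait_all()` before exiting. A task that never calls `finish` keeps the counter above zero, which matches the note that the process keeps running while tasks are running.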

Also, to declare the ABI version, we need another function with a specific name so that Calcit can check it before actually calling the library:


#[no_mangle]
pub fn abi_version() -> String {
  String::from("0.0.1")
}

(This feature is not stable enough yet.)

Call in Calcit

Rust code is compiled into dylibs, and then Calcit can call them with:

&call-dylib-edn (get-dylib-path "\"/dylibs/libcalcit_std") "\"read_file" name

The first argument is the file path to the dylib. Multiple arguments are supported:

&call-dylib-edn (get-dylib-path "\"/dylibs/libcalcit_std") "\"add_duration" (nth date 1) n k

Passing a callback function is special: it needs another function, &call-dylib-edn-fn, whose last argument is the callback function:

&call-dylib-edn-fn (get-dylib-path "\"/dylibs/libcalcit_std") "\"set_timeout" t cb

Extensions

Currently there are some early extensions:

Run Calcit

There are several modes to run Calcit.

Eval

cr -e 'println "Hello world"'

which is actually:

cr --eval 'println "Hello world"'

Run program

For a local compact.cirru file, run:

cr

By default, Calcit launches a watcher. If you want to run without a watcher, use:

cr -1

Generating JavaScript

cr --emit-js

Generating IR

cr --emit-ir

Run in Eval mode

Use --eval or -e to eval code from the CLI:

$ cr -e 'echo |demo'
demo
took 0.07ms: nil
$ cr -e 'echo "|spaced string demo"'
spaced string demo
took 0.074ms: nil

You may also run multi-line snippets:

$ cr -e '
-> (range 10)
  map $ fn (x)
    * x x
'
calcit version: 0.5.25
took 0.199ms: ([] 0 1 4 9 16 25 36 49 64 81)

CLI Options

$ cr --help
Calcit Runner 0.5.14
Jon. <jiyinyiyong@gmail.com>
Calcit Runner

USAGE:
    cr [FLAGS] [OPTIONS] [--] [input]

FLAGS:
        --emit-ir        emit EDN representation of program to program-ir.cirru
        --emit-js        emit js rather than interpreting
    -h, --help           Prints help information
    -1, --once           disable watching mode
        --reload-libs    reload libs data during code reload
    -V, --version        Prints version information

OPTIONS:
    -d, --dep <dep>...             add dependency
        --emit-path <emit-path>    emit directory for js, defaults to `js-out/`
        --entry <entry>            overwrite with config entry
    -e, --eval <eval>              eval a snippet
        --init-fn <init-fn>        overwrite `init_fn`
        --reload-fn <reload-fn>    overwrite `reload_fn`
        --watch-dir <watch-dir>    a folder of assets that also being watched

ARGS:
    <input>    entry file path, defaults to compact.cirru [default: compact.cirru]

Hot Swapping

Bundle Mode

By design, a Calcit program is supposed to be written with calcit-editor, and you can try short snippets in eval mode.

If you want to write a larger program without calcit-editor, that's also possible. Find an example in minimal-calcit.

With the bundle_calcit command, Calcit code can be written as an indentation-based language, so you don't have to match parentheses like in Clojure. It also means you need to handle indentation very carefully.

You need to bundle files into a compact.cirru file first, then use the cr command to run it. .compact-inc.cirru will be generated as well to trigger hot code swapping. Just launch these 2 watchers in parallel.

Data

  • Bool
  • Number, which is actually f64 in Rust and Number in JavaScript
  • Keyword
  • String
  • Vector, serving the role as both List and Vector
  • HashMap
  • HashSet
  • Function (maybe; internally there's also a "Proc" type)

Persistent Data

In Rust, Calcit uses rpds for HashMap and HashSet, and a ternary tree for lists.

For Calcit-js, it's all based on ternary-tree.ts, which is my own library. This library is quite naive and you should not count on it for good performance.

Optimizations for vector in Rust

Although named "ternary tree", it's actually an unbalanced 2-3 tree, with tricks learnt from finger trees for better performance on .push_right() and .pop_left().

For example, this is the internal structure of vector (range 14):

When an element 14 is pushed at the right, it is simply added at the right, creating a new path at a shallow branch, which means lower memory costs (compared to deeper branches):

When another new element 15 is pushed at the right, the new element is still placed at a shallow branch. Meanwhile, the previous branch is pushed deeper into the middle branches of the tree:

In this way, pushing new elements at the right side becomes cheaper. These steps can be repeated again and again, and new elements are always handled at shallow branches.

This was the trick learnt from finger trees. The library Calcit uses is not optimal, but should be fast enough for many scripting cases.

Cirru Extensible Data Notation

Data notation based on Cirru. Learnt from Clojure EDN.

EDN data is designed to be transferred across networks as strings. Two functions are involved:

  • parse-cirru-edn
  • format-cirru-edn

Although items of a HashSet and fields of a HashMap have no guaranteed order, they are formatted in a fixed order so that the results are reasonably stable.
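One simple way to get stable output from an unordered map is to sort the entries before formatting. The Rust sketch below illustrates that idea only; sorting by key is an assumption made for the example, not a statement about the order format-cirru-edn actually uses, and `format_map_stable` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Formats a map as one-line Cirru EDN with keys sorted,
/// so the output does not depend on hash iteration order.
pub fn format_map_stable(m: &HashMap<String, i64>) -> String {
    let mut pairs: Vec<(&String, &i64)> = m.iter().collect();
    pairs.sort();
    let mut out = String::from("{}");
    for (k, v) in pairs {
        out.push_str(&format!(" (:{} {})", k, v));
    }
    out
}
```

Regardless of insertion order, a map holding `a`, `b` and `c` is always printed with `:a` first, so formatting the same data twice yields the same string.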

Literals

For literals, if written in text syntax, we need to add do to make sure it's a line:

do nil

for a number:

do 1

for a symbol:

do 's

there's also keyword:

do :k

String escaping

for a string:

do |demo

or wrap with double quotes to support special characters like spaces:

do "|demo string"

or use a single double quote to mark strings:

do "\"demo string"

The escapes \n \t \" \\ are supported.

Data structures:

for a list:

[] 1 2 3

or nested list inside list:

[] 1 2
  [] 3 4

HashSet for unordered elements:

#{} :a :b :c

HashMap:

{}
  :a 1
  :b 2

also can be nested:

{}
  :a 1
  :c $ {}
    :d 3

Also a record:

%{} :A
  :a 1

Quotes

For quoted data, there is a special representation, since it is necessary for compact.cirru usage, where code lives inside a piece of data, marked as:

quote $ def a 1

at runtime, it's represented with tuples:

:: 'quote $ [] |def |a |1

which means you can eval:

$ cr -e "println $ format-cirru-edn $ :: 'quote $ [] |def |a |1"

quote $ def a 1

took 0.027ms: nil

and also:

$ cr -e 'parse-cirru-edn "|quote $ def a 1"'
took 0.011ms: (:: 'quote ([] |def |a |1))

This is not a generic solution, but tuple is a special data structure in Calcit and can be used for marking up different types of data.

Buffers

There's a special syntax for representing buffers in EDN, using pairs of hex digits as u8:

buf 03 55 77 ff 00

which corresponds to:

&buffer 0x03 0x55 0x77 0xff 0x00
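The correspondence between hex pairs and bytes can be sketched in Rust as follows. `parse_buf_items` is a hypothetical helper that receives the items after `buf` as separate tokens; the error messages are made up for the example.

```rust
/// Parses the items after `buf`, e.g. ["03", "55"], into bytes.
/// Each item must be exactly two hexadecimal digits.
pub fn parse_buf_items(items: &[&str]) -> Result<Vec<u8>, String> {
    items
        .iter()
        .map(|s| {
            if s.len() != 2 || !s.chars().all(|c| c.is_ascii_hexdigit()) {
                return Err(format!("expected 2 hex digits, got {:?}", s));
            }
            u8::from_str_radix(s, 16).map_err(|e| format!("invalid hex {:?}: {}", s, e))
        })
        .collect()
}
```

For the example above, the items `03 55 77 ff 00` map to the bytes `0x03 0x55 0x77 0xff 0x00`, and an item such as `3` or `zz` is rejected.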

Comments

Comment expressions start with ;. They evaluate to nothing, but they are not allowed everywhere: at least not at the head of an expression or inside a pair.

Some usages:

[] 1 2 3 (; comment) 4 (; comment)
{}
  ; comment
  :a 1

Also notice that comments must obey Cirru syntax as well. They are comments inside the syntax tree, rather than in the parser.

Features

Still like Clojure.

List

Calcit List is a persistent vector that wraps a ternary tree in Rust, which is a 2-3 tree with optimization tricks from finger trees.

In JavaScript, it's the ternary-tree from older versions, with an extra CalcitSliceList for optimization. CalcitSliceList is fast and cheap in append-only cases, but might be bad for GC in complicated cases.

But overall, it's slower since it's always immutable at the API level.

Usage

Build a list:

[] 1 2 3

consume a list:

let
    xs $ [] 1 2 3 4
    xs2 $ append xs 5
    xs3 $ conj xs 5 6
    xs4 $ prepend xs 0
    xs5 $ slice xs 1 2
    xs6 $ take xs 3

  println $ count xs

  println $ nth xs 0

  println $ get xs 0

  println $ map xs $ fn (x) $ + x 1

  &doseq (x xs) (println x)

Thread macros are often used in transforming lists:

-> (range 10)
  filter $ fn (x) $ > x 5
  map $ fn (x) $ pow x 2

Why not just Vector from rpds?

Vector is fast for operations at the tail. In Clojure, List and Vector serve two different usages. Calcit wants a unified structure to reduce mental overhead.

It is possible to extend foreign data types via FFI, but this has not been done yet.

HashMap

In Rust it's using rpds::HashTrieMap. And in JavaScript, it's built on top of ternary-tree with some tricks for very small dicts.

Usage

{} is a macro, so you can quickly write a map in pairs:

{}
  :a 1
  :b 2

Internally it's turned into a native function call with flat arguments:

&{} :a 1 :b 2

Consume a map:

let
    dict $ {}
      :a 1
      :b 2
  println $ to-pairs dict
  println $ map-kv dict $ fn (k v)
    [] k (inc v)

Macros

Like Clojure, Calcit uses macros to support new syntax. Macros are evaluated during building to expand the syntax tree. A defmacro block returns lists and symbols, as well as literals:

defmacro noted (x0 & xs)
  if (empty? xs) x0
    last xs

A normal way to use macro is to use quasiquote paired with ~x and ~@xs to insert one or a span of items. Also notice that ~x is internally expanded to (~ x), so you can also use (~ x) and (~@ xs) as well:

defmacro if-not (condition true-branch ? false-branch)
  quasiquote $ if ~condition ~false-branch ~true-branch

To create new variables inside macro definitions, use (gensym) or (gensym |name):

defmacro case (item default & patterns)
  &let
    v (gensym |v)
    quasiquote
      &let (~v ~item)
        &case ~v ~default ~@patterns

Calcit was not designed to be identical to Clojure, so there are many differences in details here and there.

Debug Macros

Use macroexpand-all for debugging:

$ cr -e 'println $ format-to-cirru $ macroexpand-all $ quote $ let ((a 1) (b 2)) (+ a b)'

&let (a 1)
  &let (b 2)
    + a b

format-to-cirru and format-to-lisp are 2 custom code formatters:

$ cr -e 'println $ format-to-lisp $ macroexpand-all $ quote $ let ((a 1) (b 2)) (+ a b)'

(&let (a 1) (&let (b 2) (+ a b)))

The syntax macroexpand only expands the syntax tree once:

$ cr -e 'println $ format-to-cirru $ macroexpand $ quote $ let ((a 1) (b 2)) (+ a b)'

&let (a 1)
  let
      b 2
    + a b

JavaScript Interop

To access JavaScript global value:

do js/window.innerWidth

To access property of an object:

.-name obj

To call a method of an object, slightly different from Clojure:

.!setItem js/localStorage |key |value

Note that (.m a p1 p2) calls an internal implementation of polymorphism in Calcit, not a JavaScript method.

To construct an array:

let
    a $ js-array 1 2
  .!push a 3 4
  , a

To construct an object:

js-object
  :a 1
  :b 2

To create new instance from a constructor:

new js/Date

Imports

Calcit loads namespaces from compact.cirru and modules from ~/.config/calcit/modules/. It's using 2 rules:

ns app.demo
  :require
    app.lib :as lib
    app.lib :refer $ f1 f2

By using :as, it's loading a namespace as lib, then access a definition like lib/f1. By using :refer, it's importing the definition.

JavaScript imports

Imports for JavaScript are similar:

ns app.demo
  :require
    app.lib :as lib
    app.lib :refer $ f1 f2

After it compiles, the namespace is eliminated, and ES Modules import syntax is generated:

import * as $calcit from "./calcit.core";
import * as $app_DOT_lib from "app.lib"; // also it will generate `$app_DOT_lib.f1` for `lib/f1`
import { f1, f2 } from "app.lib";
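The naming visible in the generated code (`app.lib` becoming `$app_DOT_lib`) can be reproduced with a tiny Rust sketch. This is inferred from the example output only: `js_namespace_ident` and `js_import_all` are hypothetical names, and the sketch handles nothing beyond dots, so it should not be read as the emitter's full mangling rules.

```rust
/// Turns a namespace like `app.lib` into a JavaScript identifier
/// in the style of the generated code above (assumed rule: `.` -> `_DOT_`).
pub fn js_namespace_ident(ns: &str) -> String {
    format!("${}", ns.replace('.', "_DOT_"))
}

/// Builds the `import * as ...` line for a namespace loaded with `:as`.
pub fn js_import_all(ns: &str) -> String {
    format!("import * as {} from \"{}\";", js_namespace_ident(ns), ns)
}
```

With these helpers, `js_import_all("app.lib")` produces the second import line shown above (without the trailing comment).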

There's an extra :default rule for loading Module.default.

ns app.demo
  :require
    app.lib :as lib
    app.lib :refer $ f1 f2

    |chalk :default chalk

which generates:

// ...

import chalk from "chalk";

Polymorphism

Calcit uses tuples to simulate objects. Inheritance is not supported.

Core idea is inspired by JavaScript and also borrowed from a trick of Haskell since Haskell is simulating OOP with immutable data structures.

Terms

  • "Tuple": a data structure of 2 items, written like (:: a b). In Calcit it's more of a "tagged union".
  • "class": a concept between "JavaScript class" and "JavaScript prototype". It uses a record containing functions to represent the prototype of objects.
  • "objects": Calcit has no "OOP Objects"; there are only tuples simulating objects to support polymorphism.

This makes "tuple" a really special data type in Calcit.

Usage

Define a class:

defrecord! MyNum
  :inc $ fn (self)
    update self 1 inc
  :show $ fn (self)
    str $ &tuple:nth self 1

Notice that self in this context is (:: MyNum 1) rather than a bare literal.

Get an object and call a method:

let
    a $ :: MyNum 1
  println $ .show a

Not to be confused with JavaScript native method function which uses .!method.

Use it with chaining:

-> (:: MyNum 1)
  .inc
  .show
  println

At runtime, a method call checks the first element of the passed tuple and uses it as the prototype, looks up the method name, and then actually calls it. This is roughly the same behavior as in JavaScript, except that JavaScript needs to polyfill this with partial functions.
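The lookup described above can be modelled in a few lines of Rust. This is a simplified sketch, not Calcit's runtime: `Class`, `Tuple`, `invoke` and `my_num_class` are hypothetical names, the payload is fixed to `f64`, and methods return a `String` to keep the types small.

```rust
use std::collections::HashMap;

/// A method receives the whole tuple as `self`, as in the `MyNum` example.
pub type Method = fn(&Tuple<'_>) -> String;

/// The "class": a named record of functions acting as the prototype.
pub struct Class {
    pub name: &'static str,
    pub methods: HashMap<&'static str, Method>,
}

/// The "object": a tuple of a class and a payload, like `(:: MyNum 1)`.
pub struct Tuple<'a> {
    pub class: &'a Class,
    pub value: f64,
}

/// Looks up `method` in the tuple's class and calls it with the tuple.
pub fn invoke(obj: &Tuple<'_>, method: &str) -> Result<String, String> {
    match obj.class.methods.get(method) {
        Some(f) => Ok(f(obj)),
        None => Err(format!("unknown method .{} for {}", method, obj.class.name)),
    }
}

/// Counterpart of `:show` in the `MyNum` record.
fn show(obj: &Tuple<'_>) -> String {
    obj.value.to_string()
}

/// Builds a class with a single `show` method.
pub fn my_num_class() -> Class {
    let mut methods: HashMap<&'static str, Method> = HashMap::new();
    methods.insert("show", show);
    Class { name: "MyNum", methods }
}
```

Calling `invoke(&a, "show")` on a tuple built from `my_num_class()` with value `1.0` returns `Ok("1")`, while an unknown method name returns an error instead of falling back to anything else, since there is no inheritance chain to walk.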

Built-in classes

Many of the core data types inside Calcit are treated like "tagged unions" inside the runtime, with some classes being initialized at program start:

&core-number-class
&core-string-class
&core-set-class
&core-list-class
&core-map-class
&core-record-class
&core-nil-class
&core-fn-class

That's why you can call (.fract 1.1) to run (&number:fract 1.1), since 1.1 is treated like (:: &core-number-class 1.1) when passed to method syntax.

The cost of this syntax is that the related code is always initialized when Calcit runs, even if none of the method syntax is actually called.

Some old materials

  • Dev log(中文) https://github.com/calcit-lang/calcit/discussions/44
  • Dev log in video(中文) https://www.bilibili.com/video/BV1Ky4y137cv

Ecosystem

Libraries:

Frameworks:

Tools: