This document is the primary reference for the Carth programming language. It is updated on a best-effort basis. It should be valid and complete, but it may not be.
2Lexical structure
For details about the syntax that are not covered explicitly in this document or cannot be inferred from the provided examples, please consult the source of the parser directly in Parse.hs.
3Items
3.1Global variable definitions
Global variables are defined at the top-level by using either one
of the two implicitly typed variable define
and function
define
special forms, or their explicitly typed define:
counterparts.
The function definition special forms allows us to name parameters
and deconstruct them with irrefutable patterns. This is essentially
syntactic sugar for binding a variable to a fun-match
expression.
Global variables are practically equivalent to local variables, in all respects but scope.
3.1.1Tail recursion
Since Carth doesn't have any builtin loop construct and rely on tail recursion instead, optimizing tail calls is necessary in order for very long or even infinite recursions to not filling up the stack and crash. In order to make FFI with C relatively trivial, Carth always conforms to the C calling convention. One implication of this is that LLVM can't optimize arbitrary calls in tail positions (as it can with the , but it will still perform the restricted form of tail call optimization called sibling call optimization. This means that if the following conditions are true, Carth guarantees that tail calls are optimized and the stack frame is reused:
- Caller and callee are of the same type.
- Caller and callee are not curried functions. I.e., they only take one argument (which may be of any type, including a tuple).
- Call is in tail position.
If you notice that tail calls are not optimized (i.e. you get a stack overflow / segfault) even when the above conditions are true, please open an issue with a bug report!
3.1.2Examples
;; First form. Implicitly typed variable definition. ;; ;; Bind the global variable ~fst-iv~ to the function that ;; extracts the first value of a pair. No explicit type ;; signature is given - it is statically inferred by the ;; typechecker to be the equivalent of ~(forall (a b) (Fun ;; (Pair a b) a))~ (define fst-iv (fun ((Pair x _)) x)) ;; Second form. Implicitly typed function definition. ;; ;; Again, the fst function, equivalent to the defition ;; above, but using the more convenient function definition ;; syntax. (define (fst-if p) (match p (case (Pair x _) x))) ;; ... and again, together with the even more conventient ;; destructuring syntax (define (fst-ifd (Pair x _)) x) ;; Third form. Explicitly typed variable definition. ;; ;; Explicit type signature is given. The type must properly ;; be a polytype (aka /type scheme/) to be valid and ;; instantiable for any monotype. (define: fst-ev (forall (a b) (Fun (Pair a b) a)) (fun-match (case (Pair x _) x))) ;; Fourth form. Explicitly typed function definition. (define: (fst-ef (Pair x _)) (forall (a b) (Fun (Pair a b) a)) x)
3.2Type definitions
Algebraic datatypes (aka tagged/discriminated unions) are defined
at the top-level with the type
special form. Roughly equivalent
to data
in Haskell and enum
in Rust.
Recursive datatypes must contain a Box
indirection to be
representable in memory with a finite size.
Datatypes may be uninhabited, i.e. defined without any variants,
like (type Void)
. As these types have no variants, they have no
values, and we can't construct them – thus the name
"uninhabited".
3.2.1Examples
;; First form. Monomorphic datatype definition. ;; ;; ~Age~ only has one variant, and as such can be seen as a ;; "wrapper" around ~Int~, restricting its usage. (type Age (Age Int)) ;; Second form. Polymorphic datatype definition. ;; ;; ~List~ has two variants, representing that a list can ;; either be empty, or a pair of a head and a tail. Note ;; that we must have a ~Box~ indirection so that it doesn't ;; have infinite size. (type (List' a) (Cons' a (Box (List' a))) Nil') ;; An uninhabited type that can't be constructed. Useful ;; when you want to employ the type system to make invalid ;; states unrepresentable, or model propositions as types. (type Void')
4Expressions
4.1Literals
- Unit
unit
is the only value inhibiting the typeUnit
, equivalent to()
in Haskell and Rust.- Int
- 64-bit signed integer literal. Example:
42
. - Double
- 64-bit double precision floating point literal. Example:
-13.37
. - String
- UTF-8 string literals. At the moment, generates to static
arrays. Will likely be changed. Example:
"Hello, World!"
,"😄😦🐱"
. - Bool
True
orFalse
.
4.2Let
4.3Type ascription
Type ascriptions are primarily used to:
- increase readability when the type of an expression is not obvious;
- assert at compile-time that an expression is of or can specialize to the given type;
- or specialize the type of a generic expression, restricting its usage.
4.3.1Example
(define (id-int x) (: x Int)) ;; Inferred type of ~id-int~: (Fun Int Int)
4.4Match
Pattern matching. Can match against literals to test for equality, against constructions to deconstruct datatypes, against names to bind a variable to (a substructure of) the matchee.
The literal-types that can be matched against are integers, bools, and strings.
The cases of a match-expression must be exhaustive and non-redundant.
When pattern matching on an uninhabited type, no cases can be given
as the type has no constructors, and the match-expression as a
whole is absurd. Absurdity, like panic
or undefined
, implies
any type, as it's unreachable.
4.4.1Example
(type Foo Bar Baz) (type (Pair' a b) (Pair' a b)) ;; Ok (define (fst' pair) (match pair (case (Pair' a _) a))) ;; Matching on an uninhabited type corresponds to the ;; elimination rule for ⊥ (it implies anything). (define: (absurd void) (forall (a) (Fun Void' a)) (match void)) (define read-binop (fun-match (case "plus" +) (case "times" *) (case s (panic (str-append "Undefined binop " s)))))
;; Error. Redundant pattern. ~Pair _ _~ already covered by ;; previous pattern ~_~ (define (redundant pair) (match pair (case _ 1) (case (Pair' x y) 2))) ;; Error. Inexhaustive pattern. All cases not covered, ;; specifically ~Bar~ (define (inexhaustive foo) (match foo (case Baz 123)))
4.5FunMatch
Syntax sugar for a match
in a lambda. Equivalent to \case
(LambdaCase) in Haskell. (fun-match cases...)
translates to (fun
VAR (match VAR cases...))
where VAR
is a uniquely internally
generated variable that cannot be expressed by the user (which
means it won't shadow any other binding).
4.5.1Example
;; Two versions of `fst`, which returns the first value of a ;; pair ;; ;; using normal `match` (define (fst-nofun p) (match p (case (Pair a _) a))) ;; and using `fun-match` (define fst-fun (fun-match (case (Pair a _) a)))
4.6Constructor
By applying a constructor to some arguments, or just presenting it literally in the case of a nullary constructor, a value of the associated algebraic datatype is produced. Constructors of arity > 0 behave like n-ary functions: curried and the whole shebang.
4.6.1Example
;; The following datatype definition will make available the ;; constructors ~UPUnit~ and ~UPPair~ in the environment. (type UnitOrPair UPUnit (UPPair Int Int)) ;; The ~UPUnit~ constructor is nullary, and will construct a ;; ~UnitOrPair~ just presented literally. (define: upunit UnitOrPair UPUnit) ;; The ~UPPair~ constructor is binary, and takes two ;; arguments to construct a ~UnitOrPair~. It behaves like a ;; function of two ~Int~ arguments, returning a ;; ~UnitOrPair~. (define: uppair'' (Fun Int Int UnitOrPair) UPPair) (define: uppair' (Fun Int UnitOrPair) (UPPair 3)) (define: uppair UnitOrPair (uppair' 5))
5Patterns
Patterns are used to conditionally deconstruct values of algebraic datatypes in pattern-matching contexts.
There are 3 kinds of patterns: nullary constructors, n-ary constructions, and variable bindings.
5.1Example
See Match/Example.
6Literate Carth
Carth has native support for literate programming with Org
mode. Either use Emacs with Babel in Org-mode for an interactive
session, or interpret/compile the file with carth
just like a
normal .carth
file!
6.1Example
Consider a file cool.org
with the following content:
#+TITLE: Literate Programming Rules! Literate programming is just really cool! ~carth~ will assume ~tangle~ = ~yes~ by default, but setting it explicitly won't hurt. #+BEGIN_SRC carth :tangle yes (define (main _) (printInt (id 1337))) #+END_SRC * The ~id~ function ~id~ is the identity function. It returns its argument unchanged. #+BEGIN_SRC carth (define (id x) x) #+END_SRC * How not to use ~id~ Here is an example of how not to use ~id~. Note that this won't compile. We show this in a SRC block to get syntax highlighting etc, but as ~tangle~ is ~no~, this source block will be ignored by carth. #+BEGIN_SRC carth :tangle no (printInt id) #+END_SRC
When compiling this file with carth c cool.org
, the Carth source
will be untangled from the rest of the document. Line numbers are
preserved. The result of the untangling stage will be the
following:
(define (main _) (printInt (id 1337))) (define (id x) x)
And for completeness, the result of interpreting that will be 1337
.