Crafting Interpreters in OCaml - Preface
May 15, 2019
I’ll be writing a series of posts documenting my goals of learning OCaml as well as understanding how interpreters and compilers work. The motivation here is twofold.
- I want to challenge myself and step out of the comfort of usual web development and all the ease of the JS ecosystem
- I want to better understand how programming languages actually work.
To learn OCaml, I’ve worked through a Cornell class’s online notes 1 as well as read much of version 1 of Real World OCaml. When version 2 comes out, I’ll try to update these notes. I’ve also been devouring YouTube videos from Jane Street as well as spending a lot of time in the ReasonML / OCaml Discord
To understand more about interpreters, I’ll be using two primary resources.
By the time this series is done, I hope to have written an interpreter for the Lox language used in Crafting Interpreters using OCaml. I’ll be establishing a few rules in order to throw myself into the deep end.
- Where possible, I will use the OCaml data type system to help ensure exhaustive checking of various states. Variant types are a powerful tool available in OCaml (and Haskell, F# and many other ML-inspired langauges) that can serve as a way of flagging different cases. For example, a chunk of code can be a string, a number, or some keyword. Using these types, I could state that a code chunk is
type Chunk =
| Keyword
| String of string
| Number of int
-
Where possible, I’ll avoid mutation, classes and global state. In chapter 1, Bob uses a Scanner class with fields of
start
andcurrent
, both ints, to track lexeme parsing (aka just parsing out phrases). Rather than this, I’ll be writing a Scanner module and passing around a context object between functions. -
Where possible, I’ll use recursion over looping. In the book, there are times Bob uses
while
loops to pick out phrases. For example, continue parsing out a chunk of code while the next character is not a space, then stop. That code might look something like (in JS).
let start = 0;
let current = 0;
let phrase = "";
const testString = "Hello world!";
while (testString[current] != " ") {
phrase = phrase + testString[current];
current++;
}
// phrase = "Hello"
Rather than a while loop, I’ll be using pattern matching and recursion. In OCaml, that might look something like the following.
type context = {
start: int;
current: int;
phrase: string;
source: string;
}
let ctx = {
start = 0;
current = 0;
phrase = "";
source = "Hello world!";
}
let rec capture_until_space context =
let next_char = (String.get context.source context.current) in
match next_char with
| ' ' -> context
| _ -> capture_until_space {
context with current = context.current + 1;
phrase = (context.phrase ^ Char.escaped next_char);
}
(*
# capture_until_space ctx;;
- : context = {start = 0; current = 5; phrase = "Hello"; source = "Hello world!"}
*)
If I succeed in doing this, I can maintain purity.
I’ll be also attempting to write unit tests along the way using OUnit2. I’ll also attempt to implement a setup with Dune to make it easier to build and follow along. I know the Reason community has tools like Esy as well. If I’m ambitious, I’ll use that as well to get everything set up.
Wish me luck!
I work and live in Brooklyn, NY building software.