jueves, 24 de diciembre de 2015

Lux Tutorial #1: Simple TODO list using Vert.x

Hello, and welcome to the 4th entry in the Lux Devlog, a blog dedicated to the design and implementation of the Lux Programming Language, an upcoming functional language in the Lisp tradition.

This post is going to be a very special one, because it's going to be the first full-fledged Lux tutorial.
It's also going to be a special tutorial, because rather than just showing you some syntax, functions and features of Lux, I'm actually going to show you an actual sample program I made for this occasion in order to illustrate all the features I'll be introducing.

You can also get the source code of the program to play with it and study at your leisure from this repo: https://github.com/LuxLang/tutorial1

OK, so... the 1st thing you need to do is clone that repo, as we're going to dive into the source code.

But first things first. We need to know how to compile the program and it wouldn't be a bad idea to take a peek at the finished product before we start reading the code.

First, make sure you have Leiningen installed, which is the build tool currently being used for Lux development.
If you don't have Leiningen, you can find out how to install it over here: http://leiningen.org/

Got Leiningen? Great!

Now, you need to get the Lux compiler we'll be using.
Download this one from the 0.3.1 release: https://github.com/LuxLang/lux/releases/download/0.3.1/luxc.jar and place it in the tutorial1 directory.

Next, fire up a terminal and head to the tutorial1 directory.
Now, issue this command: lein luxc compile

Once the compilation process is complete, we can run the server.
Type: java -jar target/jvm/program.jar
Now, go to http://localhost:8080/

Play with it all you want.
It's a very simple app.
As you can see, all of the work is being done in the server, with no JavaScript involved. That's because I wanted to focus the tutorial purely on Lux.
Note: There is no persistence for the TODO list, so once the server process is killed, your tasks are going to go away.

By the way, the CSS was borrowed from the TodoMVC project (with some minor alterations). All the credit for the beautiful design goes to them.

Now, before we dig into the code, I think it's a good idea to delineate what we're going to see.
The app contains a small library for generating HTML from Lux (as Text), as well as a small library for generating CSS (also as Text).
Aside from that, it includes some code for working with the Vert.x platform and some code for managing the state of the program (represented as a list, since it's a TODO program).

Also, it would be a good idea to have some editor support when checking out the .lux files, so if you're an Emacs user, I suggest you install lux-mode. You can find it here: https://github.com/LuxLang/lux-mode

Now, without further ado, let's dive into the code!



Before we get into the complicated stuff, let's get our feet wet with something simple... like HTML generation.

Head over to this file and give it a glance.
 (;import lux  
          (lux (data (text #open ("text:" Text/Monoid))  
                     (list #refer #all #open ("" List/Functor List/Fold)))  
               (meta lux  
                     (ast #as ast))))  

The first thing you'll see is this.
The imports section is the first part in any Lux module (Lux modules are just files ending in the .lux extension).

You'll notice that, unlike in many other languages, the name of the module itself is never specified.
That reason is that, to avoid unnecessary repetition, Lux just assumes that the name of the file corresponds to the module itself. Also, the nesting of the directories corresponds to the nesting of modules.

Now, let's analyse this for a moment.
The first thing that's imported is the lux module, which serves as a prelude full of useful functions and (mostly) macros.

After that, I import the lux/data/text and lux/data/list modules, alongside lux/meta/luxlux/meta/syntax and lux/meta/ast.
As you can image, nesting of module names can be done up to arbitrary depths.
If you just import a module, as is the case for lux/meta/syntax, you get all of it's exported definitions as top level definitions on your module.
However, if you give the module extra options, that assumption is no longer made and you must explicitly #refer which definitions you want. In the case of lux/data/list I want everything, so I #refer #all.
I can also give aliases to modules, in cases in which I don't want to import what they have but would like to, instead, refer to every definition in a piecemeal manner, using the alias to avoid having to prefix every definition with the full module name.
An example if this is when I import the lux/meta/ast module. The alias is introduced with the #as option.
Finally, you'll see that there's a weird option called #open. It's used for taking structures and generating top-level definitions out of their members. The text you see before each structure serves to declare a prefix to use when generating the top-level definitions. This is useful in order to avoid potential clashes, or just to better track what came from where.
Thanks to those #open options, we now have access to the map and foldL functions, both working for lists, but also to the text:++ function, for concatenating texts.

There's a bit more to importing than what we've seen here, but we'll leave that for future lessons. For now, let's move on to the rest of this file.
 (deftype #export Html Text)  
 (deftype #export Attributes  
   (List (, Text Text)))  

Here, we're defining what we mean by HTML and HTML attributes. As I previously said, our HTML will be Text, and our attributes will just be KV pairs of Text.
 (def attributes^  
   (Parser (List (, Text AST)))  
   (record^ (*^ (&^ text^ id^))))  
 (defsyntax #export (@attrs [attrs attributes^])  
   (:: Lux/Monad (wrap (@list (` (: Attributes  
                                    (@list (~@ (map (: (-> (, Text AST) AST)  
                                                       (lambda [[key val]]  
                                                         (` [(~ (ast;text key)) (~ val)])))  

Things start getting interesting, as we define our first macro!

To make defining HTML attributes a bit easier, we define the @attrs macro to take the attributes in { record syntax } and generate the list of tuples we need for the action attributes.
It even adds some code to do type-checking to ensure the result is a valid attributes object.

We can also see some pieces of Lux syntax you're probably unfamiliar with, so let's dissect this for a moment.

First, you'll notice I'm defining the macro using defsyntax. There's a defmacro macro for doing this too, but it's a bit more low-level and requires that you handle the AST tokens you receive manually.
To make things easier, defsyntax relies on monadic parsing of the AST tokens.
Here, we're parsing the tokens using the attributes^ parser defined above and storing the result inside the attrs variable.

If you look at the definition of of attributes^, you'll notice that it's build by using parser combinators. That's one of the beautiful things about monadic parsing, you can build bigger parsers by just combining smaller ones.
Here, we're using the record^ combinator, which flattens the elements of records in order to provide the given parser with a list of the flattened KV pairs.
The *^ combinator allows us to parse 0 or more instances of whatever the parser you give it can parse.
Then, the &^ combinator receives two parsers and attempts to run them in sequence, returning a tuple of their outputs.
text^ only succeeds if it encounters a #TextS token, in which case it returns it's content.
id^ is the identity parser and just returns to you the first AST token it finds.
The type of our attributes^ parser correctly represents the parser we build.

Now, back to @attrs!
You'll notice there's some funny bit of syntax involving Lux/Monad (which, by the way, is one of the definitions we imported from lux/meta/lux). The :: macro allows us to use individual members from structures, and in this case we're using wrap (Lux's version of Haskell's return function).

We're wrapping a list of syntax tokens, with just 1 token being generated by the back-quote ` macro. We're generating our attributes KV list and each tuple is being generated by mapping a function over the attributes we parsed. The ast;text function takes care of wrapping the text labels back as AST nodes.
We needed to parse them in the first place to ensure that they were really text nodes and not something silly, like numbers. But now it's time to turn them back into AST nodes.

. . .

Bellow @attrs there are 2 funny-looking macro definitions, but I'll talk about them later, as they'll become necessary at the end of this file.
 (def (concat-with sep elems)  
   (-> Text (List Text) Text)  
   (|> elems  
       (interpose sep)  
       (foldL text:++ "")))  
 (def (attrs->text attrs)  
   (-> Attributes Text)  
   (|> attrs  
       (map (lambda [[key val]] ($ text:++ key "=" "\"" val "\"")))  
       (concat-with " ")))  
 (def #export (node name attrs children)  
   (-> Text Attributes (List Html) Html)  
   ($ text:++ "<" name " " (attrs->text attrs) ">" (concat-with " " children) "</" name ">"))  

We're now at the meat of this, with the actual HTML generation.
The node function takes care of that, receiving the tag-name, it's attributes and any children the tag might have.
The attrs->text and concat-with functions take care of the minutiae of how to generate some of the HTML code.

However, it would be pretty tiresome to have to invoke the node function every time we want to generate some HTML, and using the @attrs macro would also become tiresome after a while.

That's the reason behind the code right at the bottom of this file:
 (do-template [<name>]  
   [(let% [<fun-name> (%tag-func-name% <name>)]  
      (def #export <fun-name>  
        (node (%tag% <name>)))  
      (defsyntax #export (<name> attrs [children (*^ id^)])  
        (let [attrs' (case attrs  
                       [_ (#;RecordS _)]  
                       (` (@attrs (~ attrs)))  
          (:: Lux/Monad (wrap (@list (` (<fun-name> (~ attrs') (@list (~@ children))))))))))]  
   ## Head  
   ## Body  

To make writing HTML in Lux easier, we're going to define some convenience macros for us.
They're going to have the names of commonly used tags and when they receive the attributes, if they're in { record syntax }, the macro will simply wrap them inside an @attrs call to ensure the proper conversion is performed.

But wait, there's more!
Knowing that sometimes we'll want to use regular functions instead of records for generating our HTML, we're also going to be generating functions that do the same thing as our macros.

This is also where I explain what's up with the %tag% and %tag-func-name% macros that were defined earlier.
Since we're only providing the names of the macros to do-template, we need some way to generate the tag from there (which has to be in text form), and also the name of the function.

%tag% turns a symbol into text, and thus takes care of the tag for use.
%tag-func-name%, on the other hand, takes a symbol and appends a ' to the end, thereby turning div into div' and span into span'. These will be our functions.

Also, I'd like to point out something that might have puzzled you.
In the code for generating the macros for our tags, we receive the attrs AST node, but we don't specify how we parse it. It's just a symbol (attrs)!
What's going on is that whenever defsyntax encounters just-a-symbol, it assumes you don't want to do any kind of fanciful parsing and just treats it as is you had written [attrs id^].

Also, you might be puzzled about the weird let% macro up there. If you've never seen it before, or you forgot how it worked, then I'd recommend you read this post before continuing: http://luxlang.blogspot.com/2015/11/custom-pattern-matching-and-advanced.html

We've gone over our first Lux module! Now it's only 8 more to go!
Don't worry, though. Since we've seen a lot already, I won't be explain stuff we've already seen and I'll only talk about snippets introducing new concepts.

With that in mind, let's proceed!


The code is very similar to the one for HTML generation.
CSS is also text and the style for each rule is also a text-based list of KVs, with the style^ parser and @style macro being exact mirrors of attributes^ and @attrs (which is not a coincidence, since I copy-pasted the code).

The rule' function is just like node for HTML, and it even uses the same kinds of helpers.
Since there isn't a variety of rules for CSS in the same way that there are a variety of tags, there was no need to generate several macros. rule will do just fine.

However, there is a major difference between rule and node which is that node gave you back 1 piece of text, whereas rule gives you back a RuleSet, which is a list of rules (which are text).
The css function takes care of joining them together into 1 piece of text (or CSS).

The reason for rule working the way it does is that is changes the ruleset you give it in order to nest rules/styles.
You'll see what I'm talking about once we generate some CSS later on.


Things start getting fun in this module.

 (;import lux  
          (lux (codata io)  
               (data maybe  
                    (text #open ("text:" Text/Eq))  
                    (list #refer #all #open ("" List/Functor))  
                    (number #open ("i:" Int/Show)))  
               (host jvm)  
               (meta lux  
                     (ast #as ast)  
          (.. (html #as &html)  
              (css #as &css)))  

First, we notice something funny with our imports syntax.
What's up with that .. over there? What does it do?
As some of you might imagine, Lux's import syntax supports some measure of relative positioning to about writing superfluous paths in our imports syntax.
Both . and .. are supported.
Note, though, that Lux imports do not support the full syntax of file paths, as they're not file-paths, so trying something clever will probably just end up in a compiler error.
With that said, relative positioning helps in writing shorter code, plus you also get the benefit that if you move modules around, you get to break less paths, as modules can be agnostic of what their parent modules happen to be.
 ## [Host]  
 (jvm-import java.lang.String  
   (getBytes [] [] (Array byte)))  

Here we encounter our first instance of JVM interop.
The jvm-import macro generates types and functions for you, given descriptions of Java classes (or interfaces) and their methods.
As you can see, the specs don't need to be complete and the classes can have type-parameters (which is not the case for String, though).

Here, we're saying that the getBytes method doesn't have any type-parameters of it's own (the 1st tuple), nor does it take any arguments (the 2nd tuple). It returns an array of bytes.
Mind you, Lux doesn't allow you to have variables or definitions of primitive JVM types, but arrays are fair game, as the JVM offers different kinds of arrays, optimized for each case.
Whenever you're working with individual values inside those arrays, some measure of wrapping/unwrapping is performed, so buyer beware!

In this case, jvm-import will generate for us a String definition, for the type, and a String::getBytes definition, for a function masking the method call.
For those of you who might be wondering, it's perfectly possible to do interop without this, but the Lux primitives for doing interop are cumbersome to use, since they map very directly to the JVM bytecode instructions. To avoid all that hassle, it's best to use macros such as jvm-import (which, I forgot to say, comes from the lux/host/jvm module).

. . .

Back the file!
You'll be seeing various type definitions, as we define the basics for doing HTTP communications and request handling, separate from the primitives which Vert.x provides.
We'll later be plugging things together in another module...

The @headers macro works just like @attrs and @style, since HttpHeaders are also just lists of text KVs.

We have some functions for easily generating HTTP responses and them some utility functions.
Nothing too fancy here.
 (do-template [<name> <type> <content-type>]  
   [(def #export (<name> value)  
      (-> <type> HttpResponse)  
      (let [value-bytes (String::getBytes [] value)]  
        {#response-status 200  
         #response-headers (|> empty-headers  
                               (add-header "Content-Length" (i:show (array-length value-bytes)))  
                               (add-header "Content-Type" <content-type>))  
         #response-body value-bytes}))]  
   [html-response &html;Html "text/html"]  
   [css-response  &css;CSS   "text/css"]  

You'll notice, though, that we're using the String::getBytes function for the first time!
The function takes the method arguments as a tuple (in this case empty, since there are none), and the object is being given last.

You'll also notice the array-length macro being used. It's also defined inside the lux/host/jvm module.


It's time to drop the kiddy gloves and get into full-on interop territory!
Let's take a look at some of the new things this module has got to show us!
 (jvm-import io.vertx.core.Vertx  
   (#static vertx [] [] io.vertx.core.Vertx #io)  
   (createHttpServer [] [] io.vertx.core.http.HttpServer #io)  
   (deployVerticle  [] [io.vertx.core.Verticle] void #io))  

Here, you'll notice there's an odd #io tag at the end of our methods.
This is one of the nice services jvm-import can provide for us.
With this, it wraps the type of the return value inside an IO type, so we can better state that we're dealing with a side-effecting or stateful methods.

We can also see that by using #static, we can import static methods into our code.
Unlike normal methods, static methods don't take object instances are arguments, so it's fine to just pass them the arguments tuple.
 (jvm-import io.vertx.core.http.HttpServerResponse  
   (headers [] [] io.vertx.core.MultiMap)  
   (setStatusCode [] [int] io.vertx.core.http.HttpServerResponse)  
   (write [] [io.vertx.core.buffer.Buffer] io.vertx.core.http.HttpServerResponse)  
   (end [] [] void))  

Methods with void return values just give you back [] (unit) as the return value.

 (jvm-import #long (java.util.List e)  
   (size [] [] int)  
   (get [] [int] e))  

If, for whatever reason, we don't want the name of an imported class to get short when importing it, we can just use the #long option to specify that we want it's long name.
In this case, using it's short name would make it clash with Lux's List type, so we'd rather avoid that.
 (def (extract-param entries idx)  
   (-> (java.util.List (Map$Entry Text Text)) Int (, Text Text))  
   (let [entry (java.util.List::get [(_jvm_l2i idx)] entries)]  
     [(Map$Entry::getKey [] entry) (Map$Entry::getValue [] entry)]))  

Sometimes, we need to perform some conversions when doing JVM interop.
Lux's Int type actually maps to Java's longs, rather than ints. In order to invoke java.util.List's get method, we need to turn our long into an int, and that's the job of the _jvm_l2i special form.
There are many others like it for performing simple conversions between primitives.
 (do-template [<name> <method> <type>]  
   [(def (<name> req)  
      (-> HttpServerRequest <type>)  
      (let [entries (|> req (<method> []) (MultiMap::entries []))]  
        (map (extract-param entries)  
             (range Int/Enum 0 (dec (_jvm_i2l (java.util.List::size [] entries)))))))]  
   [get-headers      HttpServerRequest::headers        &;HttpHeaders]  
   [get-query-params HttpServerRequest::params         &;HttpParams]  
   [get-form-params  HttpServerRequest::formAttributes &;HttpParams]  

If you haven't gotten acquainted to it yet, |> is Lux's pipeline macro, and it works the same as Clojure's ->> macro.
 (def (respond! response request)  
   (-> &;HttpResponse HttpServerRequest (IO (,)))  
   (do IO/Monad  
     [#let [(\slots [#&;response-status #&;response-headers #&;response-body]) response]  
      $response (HttpServerRequest::response [] request)  
      #let [_ (HttpServerResponse::setStatusCode [(_jvm_l2i response-status)] $response)  
            mm (foldL (: (-> MultiMap (, Text Text) MultiMap)  
                         (lambda [headers pair] (MultiMap::add pair headers)))  
                      (HttpServerResponse::headers [] $response)  
            _ (HttpServerResponse::write [(Buffer::buffer [response-body])] $response)  
            _ (HttpServerResponse::end [] $response)]]  
     (wrap [])))  

Here, we're getting our first look into how to write monadic code in Lux.
do is our macro for monadic do-notation (implemented inside the lux/control/monad module).
You give it the monad implementation you need and it does it's thing.
#let is an option for making simple lexical binding and \slots helps us easily destructure the response in order to translate it into something Vert.x can understand.

Since we passed the #io option when importing HttpServerRequest::response, it plays well with the IO/Monad.
The methods in HttpServerResponse weren't annotated with #io because I forgot (hehe), but if they had been, they would have also been used directly in the do-notation.
 (def (request$ req body)  
   (-> HttpServerRequest &;HttpBody &;HttpRequest)  
   {#&;request-method (|> req (HttpServerRequest::method []) (Object::toString []) &;HttpMethod$ (? #&;OPTIONS))  
    #&;request-uri  (let [raw-uri (HttpServerRequest::uri [] req)]  
                      (? raw-uri  
                         (do Maybe/Monad  
                           [[uri params] (text;split-with "?" raw-uri)]  
                           (wrap uri))))  
    #&;request-headers (get-headers req)  
    #&;request-params (get-params req)  
    #&;request-body  body})  

You might be wondering what's the deal with the ? thingy over there.
That's just a function from the lux/data/maybe module which does the same thing as Scala's getOrElse method for Option.
 (def (http-handler server)  
   (-> &;RequestHandler (Handler HttpServerRequest))  
   (object java.lang.Object [(io.vertx.core.Handler io.vertx.core.http.HttpServerRequest)]  
     (#override (io.vertx.core.Handler A) handle [] [(vreq A)] void  
                (exec (|> vreq  
                          (HttpServerRequest::setExpectMultipart [true])  
                            [(body-handler (lambda [body']  
                                             (let [body (Buffer::getBytes [] body')]  
                                               (do IO/Monad  
                                                 [#let [request (request$ vreq body)]  
                                                  response (server request)]  
                                                 (respond! response vreq)  

Whoah! What is going on over here!?

In order to handle requests and perform other operations, we must give Handler implementations to Vert.x, so we better start generating some JVM classes!

The object macro helps use generate anonymous classes. Think of it as a mirror of Clojure's proxy macro.
We specify the super class and any interfaces involved. Any of the super-types can be parameterized, as we see in the case of Handler.
The final tuple is for constructor arguments to pass to the super class, but in most cases it's going to be empty.

You can only override inherited methods and can't create new ones inside anonymous class definitions, which is why there's that #override tag over there.
When we override a method, we must specify who it belongs to.
In this case, it's Handler's. We're also specifying that we'll be referring to Handler's type-parameter as A.
Then comes the name of the method, then a tuple with any method-local type-parameters, then a tuple with any arguments to the method (in this case vreq, of type A), and then comes the return type of the method (remember that void stands for []/unit).

Don't worry about vreq being of type A. The compiler matches that to HttpServerRequest, as per the super-interface previously specified.

Finally, the exec macro allows us to execute several expressions and only gives back the value of the final expression, which makes it a great macro for writing side-effecting imperative code.
 (def (verticle$ port server-fun vertx)  
   (-> &;Port &;RequestHandler Vertx Verticle)  
   (object io.vertx.core.AbstractVerticle []  
     (#override io.vertx.core.AbstractVerticle start [] [(start io.vertx.core.Future)] void  
                (exec (run-io (do IO/Monad  
                                [http-server (Vertx::createHttpServer [] vertx)  
                                 _ (HttpServer::requestHandler [(http-handler server-fun)] http-server)]  
                                (HttpServer::listen [(_jvm_l2i port)] http-server)))  
     (#override io.vertx.core.AbstractVerticle stop [] [(stop io.vertx.core.Future)] void #throws [java.lang.Exception]  
                (run-io (print-line "Verticle stopped!")))))  

Here, we see that you can also specify any exceptions the method might #throw, and we also encounter the run-io function, for executing IO actions.


This module only has a tiny type definition for the tasks in our TODO list.


This module is a bit more interesting.
 (jvm-import (java.util.concurrent.CopyOnWriteArrayList e)  
  (<init> [] []))  
 ## [Types]  
 (deftype #export AppState  
  (CopyOnWriteArrayList &;Task))  
 (deftype #export AppData  
  (List (, Int &;Task)))  

We use CopyOnWriteArrayList for storing our task list.

AppData pairs our tasks with ints (their indices within the list) in order to be able to refer to them later on when marking tasks as completed or deleting them.

The rest of the file just contains bookkeeping functions for handling the program's state.
 (def #export (clear-completed state)  
   (-> AppState (IO Bool))  
   (do IO/Monad  
     [task-list (get-task-list state)  
      _ (|> task-list  
            (map% % (: (-> (, Int &;Task) (IO Bool))  
                       (lambda [[idx task]]  
                         (if (completed-task? task)  
                           (delete-task idx state)  
                           (wrap true))))))]  
     (wrap true)))  

This last function is a bit weirder because of all the % in it.
map% is for mapping functions that return monadic values over lists. Think of it as a Lux mirror of Haskell's mapM function.
The loose % next to is is a bit more puzzling though... What is it!?
It's actually just a simple lexical binding made by the do macro. It contains the monad being used.
It's made so it's possible to refer to IO/Monad (or whatever monad is being used) anywhere inside the code without having to type the full name each time.
Pretty convenient!


This module just contains a bunch of paths and other small values that are used in other parts of the codebase.


 ## [Types]  
 (deftype #export DisplayFilter  
   (| #All  
 ## [Structs]  
 (defstruct DisplayFilter/Eq (Eq DisplayFilter)  
   (def (= x y)  
     (case [x y]  
       (\template [<tag>]  
         [[<tag> <tag>] true])  
       ([#All] [#Active] [#Completed])  

If you played with the app on your browser prior to reading the rest of the tutorial, you'll have noticed that you can filter the tasks that you see on screen.

This type helps with that, and we later use it's Eq structure to help figure out which filter the user chose.

. . .

I won't put here the full definition of css because it's too large.
It's like that for a reason. I chose not to refactor it to make something a bit easier to spot and grasp.

Remember that back when discussing CSS generation I said you could nest rules to establish a styling hierarchy?
Well, here you can see how that plays out.
In the CSS being generated, the selectors nest, so you can end up having something like .todo > .header > .new-task-form > .new-task-input
Nice, huh?
Of course, in real code I'd have just refactored every part and separated them, but having them all lumped together helps in better understanding how each style interacts with the rest.

. . .

After that, we get a bunch of functions for generating HTML, using the macros we defined previously in our little HTML DSL module.
Feel free to peruse all you want.


This is it!
It's time to face the final boss!

The handler function (which is also too large to paste here!) contains the logic for dispatching the HTTP requests and figuring out what to do.

Since there are currently no web-frameworks for Lux, I don't have something like Clojure's Ring or Compojure to simplify routing for me, so I just do a very simple manual dispatching of the incoming requests.

There's nothing in this function that hasn't been seen yet (except, perhaps, the \~ macro, but you can read more about it here: http://luxlang.blogspot.com/2015/11/custom-pattern-matching-and-advanced.html)
I'll just leave you two alone for a moment...

Back here so soon?
OK, then let's discuss the final piece of the puzzle:
 (program args  
   (do IO/Monad  
     [app-state &&state;gen-state]  
     (&&server-host;deploy-server &&util;server-port (handler app-state))))  

program is to Lux programs as main is to programs in other languages.
You get a list of strings as your sole input and you must provide an (IO Unit) value as the result.
Here, we're just initializing the program's state to a fresh list and deploying our server with a handler bound to our state.


 (defproject lux/tutorial1 "0.1.0-SNAPSHOT"  
   :plugins [[lux/lein "0.1.0"]]  
   :dependencies [[lux/stdlib "0.3.1"]  
                  [io.vertx/vertx-web "3.0.0"]]  
   :lux/program "tutorial1")  

I cannot end this tutorial without first discussing the glue that allows the Leiningen build tool to compile Lux programs.

The lux/lein plugin makes use of the luxc.jar file you downloaded previously to compile this tutorial.
In the dependencies, we ask for both the Lux standard library and for Vert.x.

We must also specify the name of the module containing the program statement.


This was a long read, but hopefully it was worth your while.
I'm very happy to finally be able to share this tutorial with you and I hope it has helped you understand how to use Lux to build real programs of your own design.

There will be more tutorials in the future (hopefully much shorter than this one), so stay tuned.

And if you have any questions, feel free to post your comments down below.
I'll gladly help in any way I can.

jueves, 12 de noviembre de 2015

Custom pattern-matching and advanced macrology

Hello, and welcome to the 3rd entry in the Lux Devlog, a blog dedicated to the design and implementation of the Lux Programming Language, an upcoming functional language in the Lisp tradition.

Today, I'll talk about 2 really interesting subjects that go hand in hand and that give Lux a lot of its power and expressiveness.

First: Custom pattern-matching

Before explaining the details, I'll talk a bit about the genesis of this feature...

Back in the early days of Lux's design & development, I was thinking about syntax for lists.
Some languages (such as Haskell and JavaScript) offer custom syntax for list or array data-structures since the regular syntax for building data-structures tends to be a bit cumbersome to use, while lists are very commonly used data-structures.

Just consider the difference between writings this:
 [1, 2, 3, 4]  
versus writing this:
 Cons 1 (Cons 2 (Cons 3 (Cons 4 Nil)))  

While designing Lux, I came to the conclusion that adding custom syntax for lists was, in a way, betraying the language. Rather than come up with a quick-and-dirty fix to the problem of lists, I wanted to have consistent and general ways to deal with data-structures, while also having comfortable syntax for lists. In short, I wanted to have my cake and eat it too.

The solution for building lists was pretty easy. In a language with macros, the easiest way to fix these kinds of syntax issues was to write a macro... and that's exactly what I did:

 (@list 1 2 3 4) ## This might surprise some of you, as an earlier version didn't have the @ sign...  

There is also an alternative version, which takes a "tail" as it's last argument

 (@list& 1 2 3 4 (@list 5 6 7 8))  

OK, so that takes care of building lists, but there is still something missing...
There's no point in building data-structures if you can't tear them apart later on. The mechanism for that is pattern-matching. And here is where we encounter our problem...

Macros work well in regular code, but pattern-matching is special. Patterns are not expressions. You're not supposed to evaluate a pattern in order to get something out of it.
However, the syntax for writing patterns turns out to be identical to the syntax for writing (certain kinds of) expressions.

The answer to this question might seem obvious to a lot of you... and it's also obvious to me (in hindsight). But to a prior version of me, many months ago, it wasn't such an obvious thing. I struggled with an answer for weeks and even considered just having custom syntax for lists and give up on the subject...

And then I had my "aha!" moment. If macros give me the syntax to build lists, and that syntax is the same I need for patterns, then why not just generate patterns with macros too. Sounds obvious, right? (In hindsight...)

But there was still the matter of how do I implement it.
Do I traverse the patterns to check for all macros that show up and expand those?
That seems easy, but then I thought of something...

Macros are pretty flexible tools. Building code for making data-structures is just one of the myriad things you can achieve.
But what if my patterns could care about more than just destructuring things.
I can't just expand macros in place whenever I see them, because I'd be assuming every macro is there to generate destructuring code for me.
I need to add more control, and include macros that allow me to do more than just easily decompose data-structures.

And so, the idea for custom pattern-matching was born.

The concept is pretty simple, the pattern-matching macro (case) checks patterns to see if there is a top-level macro invocation. If so, that macro gets involved with both the pattern and the body that is to be executed for that pattern. Whatever comes out of the macro gets substituted for the pattern and the body, and the macro-expansion process is repeated until no more macros are left.

The beautiful thing is that, since the body is also included in the macro call, you can have pattern-matching macros that transform their bodies or that repeat them an arbitrary amount of times. The power unleashed is very impressive, and I have only scratched the surface of what can be achieved.

Now, without further ado, it's time for some demonstrations :D

 (def (to-pairs xs)  
   (All [a] (-> (List a) (List (, a a))))  
   (case xs  
     (\ (@list& x1 x2 xs'))  
     (@list& [x1 x2] (to-pairs xs'))  

The \ macro has the simple task of expanding every macro it finds inside the pattern. It's the simplest of the pattern-matching macros and its use is very common (specially when working with lists).

 (deftype Weekday  
   (| #Sunday  

 (def (weekend? day)  
   (-> Weekday Bool)  
   (case day  
     (\or #Sunday #Saturday)  

The \or macro repeats the body given to it for every pattern you give it. That way, you can reuse the body whenever you have patterns that involve returning the same result.

 ## This is an actual structure from the lux/meta/ast file:  
 (defstruct #export AST/Eq (Eq AST)  
   (def (= x y)  
     (case [x y]  
       (\template [<tag> <struct>]  
        [[[_ (<tag> x')] [_ (<tag> y')]]  
        (:: <struct> (= x' y'))])  
       ([#;BoolS   Bool/Eq]  
        [#;IntS    Int/Eq]  
        [#;RealS   Real/Eq]  
        [#;CharS   Char/Eq]  
        [#;TextS   Text/Eq]  
        [#;SymbolS Ident/Eq]  
        [#;TagS    Ident/Eq])  

       (\template [<tag>]  
        [[[_ (<tag> xs')] [_ (<tag> ys')]]  
        (and (:: Int/Eq (= (size xs') (size ys')))  
             (foldL (lambda [old [x' y']]  
                      (and old (= x' y')))  
               (zip2 xs' ys')))])  

       [[_ (#;RecordS xs')] [_ (#;RecordS ys')]]  
       (and (:: Int/Eq (= (size xs') (size ys')))  
            (foldL (lambda [old [[xl' xr'] [yl' yr']]]  
                     (and old (= xl' yl') (= xr' yr')))  
              (zip2 xs' ys')))  


\template is the pattern-matching sibling to the do-template macro. It allows you to reuse the code of the body, with some minor modifications to make it more suitable to each particular case.
After the \ macro, this is probably the most handy pattern-matching macro to have around.

 ## Please forgive the super-contrived example  
 (deftype MyRecord  
   (& #foo Int  
      #bar Int  
      #baz Text))  
 (def (sum rec)  
   (-> MyRecord Int)  
   (case rec  
     (\slots [#foo #bar])  
     (i:+ foo bar)))  

\slots is Lux's counterpart to Clojure's :keys destructuring syntax.

 ## Again, sorry for the contrived example...  
 (def type-1 "foo")  
 (def type-2 "bar")  
 (def type-3 "baz")  
 (def (process-message message)  
   (-> (, Text Data) (,))  
   (case message  
     (\~ [(~ type-1) data]) (do-something      data)  
     (\~ [(~ type-2) data]) (do-something-else data)  
     (\~ [(~ type-3) data]) (do-another-thing  data)))  

Have you ever wanted to reuse a literal value in a situation that doesn't allow you the use of variables?
That's a bit problematic, as you end up repeating the same literal value over and over again, introducing the risk of bugs, should the value every change.

The \~ macro is there for precisely this purpose. Just tell it what you need inlined and let it work it's magic.
Note: It only works with Bool, Int, Real, Char and Text

Finally, I've got a nice treat for you guys...

Lux favors eager evaluation over lazy evaluation. However, we all know that some times laziness can be useful, and there are even some data-structures that depend on it, such as streams.

Lux offers a type for doing lazy evaluation:

 (deftype #export (Lazy a)  
   (All [b]  
     (-> (-> a b) b)))  

In Lux, Lazy is just like the Cont type in Haskell, except that it's arguments are in the reverse order.
Streams are defined in terms of Lazy:

 (deftype #export (Stream a)  
   (Lazy (, a (Stream a))))  

This means that streams are actually functions.

Now... some of you might think "if streams are functions, that means I can't pattern-match against them".
Well, my friend, you're wrong!

 (def (take-s n xs)  
   (All [a] (-> Int (Stream a) (List a)))  
   (if (i:<= n 0)  
     (case stream  
       (\stream& x xs')  
       (#;Cons x (take-s (dec n) xs')))))  

The \stream& macro modifies the body so that pattern-matching on streams amounts to running the functions appropriately to extract the values.
Thanks to pattern-matching macros, we can actually pattern-match against functions ^_^ .

BTW, I have talked about what macros come by default in the Lux standard library, but I haven't shown how they're implemented.
Just so you can get an idea, here's the implementation for the \ macro:

 (defmacro' #export (\ tokens)  
   (case tokens  
     (#Cons body (#Cons pattern #Nil))  
     (do Lux/Monad  
       [pattern+ (macro-expand-all pattern)]  
       (case pattern+  
         (#Cons pattern' #Nil)  
         (wrap (@list pattern' body))  
         (fail "\\ can only expand to 1 pattern.")))  
     (fail "Wrong syntax for \\")))  

Also, I haven't mentioned something very important.
Even though those macros can only work thanks to the case macro macro-expanding the patterns, there are other macros out there which use case in their implementations, and they can also benefit from pattern-matching macros.

Some of those macros are the let macro, the lambda macro, and the do macro.
That's right, you can use custom pattern-matching against the arguments to your functions, or inside your do-notation, or even in good ol' let forms. How cool is that!?


Second: Inter-operating macros

I know I've talked a lot already, but there's one other topic I want to discuss on this post, and that is how to get macros to inter-operate.

As the custom pattern-matching examples show, you can unlock great power when your macros can work together, rather than separately, and Lux already reaps some of that power outside of the boundaries of pattern-matching.

To get macros to play together, you need to do 2 things:

  1. Some macros must perform macro-expansions on the arguments they receive, in order to process the outputs
  2. There must be some kind of "contract" between the outer macro and the inner macros

The first part is pretty obvious. If not because case does macro-expansion on the patterns it receives, none of the magic would happen.

But the second part is often missed by macro writers in other lisps. Without a common contract, communication becomes impossible and there can be no cooperation.

What's a common contract?

Consider this: whatever your pattern-matching macros generate, some rules must always stand:

  1. They must have an even number of outputs (since you're substituting both the pattern and the body).
  2. For the patterns being generated, they must either be primitive forms suitable for regular pattern-matching, or they must be macro calls to be further expanded.

If any of these 2 rules is broken, the case macro is going to complain about it.

However, this isn't the only common contract macros can have and Lux already has a few macros with their own contracts.

The defsig common contract

You might remember the defsig macro from the last blog post (if not, I advise you to go read it).
What you might not know is that you can actually use other macros inside it's body.

Here's a nice example (from the lux/control/number module):

 (defsig #export (Number n)  
   (do-template [<name>]  
     [(: (-> n n n) <name>)]  
     [+] [-] [*] [/] [%])  
   (do-template [<name>]  
     [(: (-> n n) <name>)]  
     [negate] [signum] [abs])  
   (: (-> Int n)  

The Number signature provides simple math operations.
There are already implementations for Int and Real in the standard library.
And, as you can see, I make liberal use of the do-template macro to reduce boiler-plate.

The reason why this works is simple: every member of the signature must take the form:
(: <type> <name>)

Anything that generates forms of that kind is going to be welcomed. You could even implement and use your own macros in there, provided that they generate that kind of code.

defstruct also has a similar contract...

The defstruct common contract

 (defstruct #export Int/Number (Number Int)  
   (do-template [<name> <op>]  
     [(def (<name> x y) (<op> x y))]  
     [+ _jvm_ladd]  
     [- _jvm_lsub]  
     [* _jvm_lmul]  
     [/ _jvm_ldiv]  
     [% _jvm_lrem])  
   (def (from-int x)  
   (def (negate x)  
     (_jvm_lmul -1 x))  
   (def (abs x)  
     (if (_jvm_llt x 0)  
       (_jvm_lmul -1 x)  

   (def (signum x)  
     (cond (_jvm_leq x 0) 0  
       (_jvm_llt x 0) -1  

       ## else  

In this case, what defstruct is looking for is forms that define things.
Note that, the def macro being used here is the very same one used to define everything else in Lux.

Pretty cool, huh?

Finally, there's one last piece of macro awesomeness I want to share before we call it quits.
I came with it fairly recently, so I haven't settled on a name yet.
For now, let's just call it let%

Before I explain how it works, I'll show the itch it's meant cure:

 (defstruct #export Json/Read (Read Json)  
   (def (read input)  
     (case (:: JsonNull/Read (read input))  
       (#;Some value)  
       (#;Some (#Null value))  
       (case (:: JsonBoolean/Read (read input))  
         (#;Some value)  
         (#;Some (#Boolean value))  
         (case (:: JsonNumber/Read (read input))  
           (#;Some value)  
           (#;Some (#Number value))  
           (case (:: JsonString/Read (read input))  
             (#;Some value)  
             (#;Some (#String value))  
             (case (:: (JsonArray/Read [read]) (read input))  
               (#;Some value)  
               (#;Some (#Array value))  
               (case (:: (JsonObject/Read [read]) (read input))  
                 (#;Some value)  
                 (#;Some (#Object value))  

Do you see that? That train-wreck? That monstrosity?

It's from a JSON library for Lux I'm working on.
The Read signature in Lux is for structures that try to parse something out of text. If they fail, you get #None

As you can see, I'm having to do some testing, to try to figure out what I'm parsing, but the code is ruled by repetitive case expressions where everything is the same, except what parser I'm using and what tag to give to the result.

Surely, there must be a better way of doing it!

First... let's flatten the structure:

 (defstruct #export Json/Read (Read Json)  
   (def (read input)  
     (|> #;None  
         (case (:: (JsonObject/Read [read]) (read input))  
           (#;Some value)  
           (#;Some (#Object value))  
         (case (:: (JsonArray/Read [read]) (read input))  
           (#;Some value)  
           (#;Some (#Array value))  
         (case (:: JsonString/Read (read input))  
           (#;Some value)  
           (#;Some (#String value))  
         (case (:: JsonNumber/Read (read input))  
           (#;Some value)  
           (#;Some (#Number value))  
         (case (:: JsonBoolean/Read (read input))  
           (#;Some value)  
           (#;Some (#Boolean value))  
         (case (:: JsonNull/Read (read input))  
           (#;Some value)  
           (#;Some (#Null value))  

This might not be much, but it's a start.
By using the piping macro |>, I can avoid all the nesting and keep all the tests in the same level.
Now it's even more obvious that every form in there has the same shape, minus the parser and the tag.

Man... wouldn't it be nice of we just had a macro for repeating things, while passing in parameters...

 (defstruct #export Json/Read (Read Json)  
   (def (read input)  
     (|> #;None  
         (do-template [<struct> <tag>]  
           [(case (:: <struct> (read input))  
              (#;Some value)  
              (#;Some (<tag> value))  
           [(JsonObject/Read [read]) #Object]  
           [(JsonArray/Read [read]) #Array]  
           [JsonString/Read     #String]  
           [JsonNumber/Read     #Number]  
           [JsonBoolean/Read     #Boolean]  
           [JsonNull/Read      #Null]))  

do-template seems like a pretty wise choice here, doesn't it?
The problem is that it doesn't play well with |>, as |> doesn't do any kind of macro-expansion of it's member forms prior to piping them.
Because of that, I can't combine |> with do-template; as awesome as that might be.

let% to the rescue

 (defstruct #export Json/Read (Read Json)  
   (def (read input)  
     (let% [<tests> (do-template [<struct> <tag>]  
                      [(case (:: <struct> (read input))  
                         (#;Some value)  
                         (#;Some (<tag> value))  
                      [(JsonObject/Read [read]) #Object]  
                      [(JsonArray/Read [read])  #Array]  
                      [JsonString/Read          #String]  
                      [JsonNumber/Read          #Number]  
                      [JsonBoolean/Read         #Boolean]  
                      [JsonNull/Read            #Null])]  
       (|> #;None  

let% is meant for those situations in which you want to expand certain parts of the bodies of macros prior to expanding outer macros.With let%, you can bind symbols to arbitrary amounts of forms produced by macro expansions, and then they're going to be spliced wherever the symbols appear, further down the line.

With this, I can do all of my parsing while only focusing on the common logic and the details that change.

Note: let% is just a temporary name until I come up with something better.
Do you have any suggestion? Leave it in the comments.
And, yes, I have already considered with-macro-expansions, but it's too damned long...


That's it for today.
I know it was a long post, but hopefully you'll now understand the amazing power that can be unlocked when you write macros that play well together.

See you next time :D