How Does HtDP's Beginner Language Work? ======================================= The textbook "How To Design Programming Languages" uses DrScheme as the programming language environment. One surprising aspect to the HtDP system is its use of language levels --- to someone who has had previous programming experience in other languages, the concept of a language tower can be a novel and enpowering one. These language towers change the core language; these changes aren't just simple renaming of API functions: they can have fundamental effects. As a concrete example, if one is in the Beginner language, then first-class functions are actually _off_! If one tries to write the map function: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (define (map f l) (cond [(empty? l) empty] [else (cons (f (first l)) (map f (rest l)))])) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; then DrScheme will actually flag this as an error, by responding with: function call: expected a defined name or a primitive operation name after an open parenthesis, but found a function argument name and by highlighting the occurrence of 'f' within the function body! This is very cool, because this means that beginners who are starting off learning the language are working with reasonable training wheels --- beginners aren't going to be learning how to use map at first. And if they accidently something Algolish like this: (define (f x y) (+ x y)) f(x y) then rather than get a mysterious error message about applying x inappropriately, they'll get an error message that gently tells the beginner to use prefix notation: f: this is a procedure, so it must be applied to arguments (which requires using a parenthesis before the name) The features that are unique to Beginner language are described in the "PLT Drscheme: Programming Environment Manual", in section 2.6 Languages. I want to dissect how HtDP's language tower works, starting with the Beginner language, because it'll give me a better idea of how everything fits together. It'll probably help me understand macros a bit better as a nice side benefit. Audience ======== Myself. *grin* But I wonder if anyone else will be interested. I think that to understand what in the world is going on, I'll need a somewhat good understanding of syntax-case, since all of this is done through the powerful features of Scheme's hygienic macro system. As a note: I'm doing this study by myself at the moment, so some of the details here might be hideously wrong. Syntax restrictions to disallow application on first-class functions ==================================================================== Let's first see how the Beginner language forces function application to use only "primitives" or the names of user-defined functions. We saw from the introduction above that trying to use anything else (like an input parameter) as a function value gives us the error message: function call: expected a defined name or a primitive operation name after an open parenthesis, but found a function argument name So how does this work? If we look at $PLTHOME/collects/lang/htdp-beginner.ss, we can see the following: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (provide ... (rename beginner-app #%app) ...) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; When beginner language sees an expression like this at the toplevel: (hello world) then this is expanded out into even more primitive syntax, using low-level macros that allow us to hook deep into the guts of the language. [add link to reference manual] Concretely, the above s-expression is expanded out to: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > (syntax-object->datum (expand (syntax (hello world)))) (#%app (#%top . hello) (#%top . world)) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; I'll ignore the #%top macro for the moment. As we can see, there's an additional #%app macro there, which is the low-level macro used to handle procedure applications. This is very cool because we can override what procedure application does by defining our own language. Here is a quick and dirty example that shows the flexibility of this mechanism: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (module debug-app-lang mzscheme (provide (all-from-except mzscheme #%app)) (define-syntax (my-app stx) (syntax-case stx () ((my-app operator operands ...) #'(#%app (begin (printf "I see you are running ~s~%" operator) operator) operands ...)))) (provide (rename my-app #%app))) (module test-lang debug-app-lang (define (factorial x) (if (= x 0) 1 (* x (factorial (- x 1))))) (factorial 3)) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; The module debug-app-lang provides a language that pretty much re-provides all of mzscheme... except for procedure application! In this case, we've just slightly extended every procedure application to first give, at run-time, a debugging output right before we delegate off to the original meaning of application. Here's what we see when we start up the test-lang module: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > (require test-lang) I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # I see you are running # ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; So this shows how one might be able to add some kind of programatic check at run-time to adopt some special behavior or checks. But in fact, the Beginner language goes one step further: it does a syntax check at compile-time, even before any code is executed. The heart of these syntax checks occur in $PLTHOME/collects/lang/private/teach.ss, within the beginner-app/proc macro. (At least, this is true as of this writing, but since we're diving deep into private module territory, anything's up for change in the future... *grin*) Actually, it's only the code that responsible for doing the error messages that lives right there. This beginner-app/proc macro isn't supposed to be reached normally as a beginner unless something is syntactically wrong. There's a big hint about this in the comment above that code: ;; #%app should never happen in beginner. Primitive operations and ;; defined functions turn into syntax that handle application ;; forms. The only vaguely legitimate application would involve a ;; poorly implemented teachpack that exports functions instead of ;; primitive operators. Also, #%app is unavoidable in the REPL. So we really should never hit this macro unless there's a syntax error. So what "procedure applications" don't end up as syntax errors? Ones that are explicitely defined themselves as macros! What's happening is akin to something like this: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (module only-add-lang mzscheme (provide (all-from-except mzscheme #%app +)) (provide (rename my-binary-add +)) (define-syntax (my-binary-add stx) (syntax-case stx () ((_ x y) #'(#%app + x y)))) (provide (rename error-app #%app)) (define-syntax (error-app stx) (syntax-case stx () ((_ rator rands ...) (raise-syntax-error #f "You can't do that on television!" (syntax rator)))))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Here, we have a limited language where the only "application" possible is binary addition! ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; > (module foo only-add-lang (* 3 4)) *: You can't do that on television! in: * > (module foo only-add-lang (+ 3 4)) > (module foo only-add-lang (+ 3 4 5)) +: bad syntax in: (+ 3 4 5) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; This example shows that we can create special forms for a "primitive", and when things fit together right, we delegate things off to the regular application that we're used too. Anything else, though, either hits our error-trapping customized error-app form, or doesn't pattern match against the weak-sauce my-add macro and raises that uninformative "bad syntax in" error message. Also note that these error messages show up at compile-time, which is exactly what we want to see. And this is essentially what Beginner language appears to do. Primitive "applications" are really macros that delegate eventually to mzscheme's standard #%app form, but every other kind of application gets captured as error messages by beginner-app/proc. The macros for each primitive are built by $PLTHOME/collects/lang/htdp-beginner.ss within in-rator-position-only, which is applied to all primitives: ;; procedures: (provide-and-document/wrap procedures in-rator-position-only (all-from beginner: (lib "beginner-funs.ss" "lang" "private") procedures))) and the procedure definition macro defined in $PLTHOME/collects/lang/private/teach.ss called 'beginner-or-intermediate-define/proc' does expand function definitions into macros (after doing a heck of a lot of compile-time checking!) (define (beginner-or-intermediate-define/proc first-order? stx) ... (syntax-case stx () ... [(_ name-seq expr ...) ... (wrap-func-definition ...) ...])) and if we follow the macros from this to wrap-function-definition, and from that to to wrap-function-definitions, then we can finally see a define-syntax in there: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; (define (wrap-func-definitions first-order? kinds names argcs k) (if first-order? (let ([name2s (map (make-name-inventer) names)]) (quasisyntax (begin #,@(map (lambda (name name2 kind argc) #`(define-syntax #,name (make-first-order-function '#,kind #,argc (quote-syntax #,name2) (quote-syntax #%app)))) names name2s kinds argcs) #,(k name2s)))) (k names))) ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; I'm trying not to stare at this too closely because there are way too many details here. *grin* The general idea here is that user-defined functions eventually become wrapped as macros, just as our only-add-lang module language redefined (+ x y) into a macro, and that allows us to avoid hitting the error-trapping error messages in the application macro #%app provided by the Beginner language. Whew! Primitives and procedures can't be used outside the context of application ========================================================================== On the flip side of the previous section is Beginner language's restriction on functions: functions are always being directly used as applications rather than being treated as values. If we try to just evaluate a primitive, we get an instructive error message: [fill me in] This one is a little easier to find. The way this is implemented is through a catch-all case in the macro that matches the macro's use as an identifier. [fill me in] Defined functions need to take at least one operator ==================================================== Allowing operand-positions only =============================== References ========== Fellesien, Matthias: How/Why PLT Combines Teaching with Research PLT DrScheme Programming Environment Manual