How do we use modules?
---

In Python, we can bring up modules, like the math library:

    http://www.python.org/doc/lib/module-math.html

by using the 'import' statement:

###
>>> import math
###

Python finds the 'math' library by searching its module search path
(sys.path, which includes the directories listed in PYTHONPATH). The
import itself binds the name 'math' in our namespace and allows us to
grab at math's functions and variables by grabbing attributes of the
'math' module object. For example:

###
>>> math.e
2.7182818284590451
###

'math' is a first-class value that can be passed around.

If we want to pull all the names out of a module and dump them into
the current namespace, we can use the 'from [module] import *' form:

###
>>> from math import *
>>> e
2.7182818284590451
###

However, this is discouraged in Python programs because we can quickly
munge up toplevel definitions this way if we're not careful.

Module import in Python is very simple, but with some tricky
surprises. If we give a Python program in the current working
directory the same name as a Standard Library module, hilarity ensues,
because Python's module finder retrieves the custom module first.

PLT Scheme has two systems for doing module-like behavior. One of
these appears to be somewhat complex (units), so I'm skipping that for
now. Instead, I'll explore PLT's module system, which is heavily used.

By the way, there is a chapter about PLT Scheme's module system in
here:

    http://www.htus.org/Book/Staging/how-to-use-modules/

PLT Scheme uses the 'require' special form to pull in a module, and
PLT's standard library is called "MzLib". Let's see how we can import
a math library into PLT. We'll import the "math.ss" library from
MzLib:

    http://download.plt-scheme.org/doc/209/html/mzlib/mzlib-Z-H-23.html

;;;
> (require (lib "math.ss" "mzlib"))
;;;

The '(lib "math.ss" "mzlib")' part tells PLT that we'd like to get at
the 'math.ss' module within the MzLib collection.

There are several library collections that come with mzscheme.
Actually, there are a LOT of them.
Here, let's do a quick peek:

;;;
> (define main-collection-path
    (find-executable-path program (build-path 'up "collects")))
> (map path->string (directory-list main-collection-path))
("tex2page" "string-constants" "planet" "mzcom" "drscheme" "framework"
 "srpersist" "sirmail" "slideshow" "syntax-color" "srfi" "algol60"
 "dynext" "slibinit" "make" "plot" "reduction-semantics" "lang"
 "web-server" "openssl" "frtime" "mred" "guibuilder" "htdp" "setup"
 "graphics" "profjWizard" "afm" "defaults" "honu" "icons" "skipper"
 "handin-client" "ssax" "texpict" "profjBoxes" "browser"
 "parser-tools" "waterworld" "games" "honu-module" "eopl" "tests"
 "html" "swindle" "handin-server" "repos-time-stamp" "mrflow"
 "compiler" "embedded-gui" "profj" "stepper" "readline" "hierlist"
 "finish-install" "sgl" "version" "syntax" "preprocessor" "net" "help"
 "htdch" "xelda" "trace" "slatex" "xml" "launcher" "ffi" "mrlib"
 "test-suite" "errortrace" "mzscheme" "mysterx" "mzlib" "info-domain"
 "doc" "mztake" "r6rs" "plai")
;;;

Whew. *grin* Each of the names there stands for a collection, and
MzLib is only one of many library collections. But since MzLib is so
standard, if we leave off the "collection" portion of the module
library declaration, PLT will automatically default to mzlib, so we
can simplify the above to:

;;;
> (require (lib "math.ss"))
;;;

One consequence is that, unlike Python, this form doesn't look at the
current working directory, and doesn't do a PATH-like hunt for
modules. This has the nice property of being explicit, since if we
see:

    (require (lib "math.ss"))

we know that we're referring to the "math.ss" in the standard library,
without exception. I contrast this to the situation in Python, where
newcomers to the language often name their personal modules in ways
that conflict with the names in the Standard Library.
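
To make the Python half of that contrast concrete, here is a minimal
sketch (Python 3) of the shadowing problem. Everything in it is an
assumption made for illustration: the temporary directory stands in
for a newcomer's working directory, and 'colorsys' is used as the
victim only because it is a pure-Python Standard Library module.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical scenario: a newcomer saves a file called colorsys.py,
# which happens to be the name of a Standard Library module.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "colorsys.py"), "w") as f:
    f.write("SHADOWED = True\n")

# Running a script puts its directory at the front of sys.path.
# We simulate that here by inserting the directory ourselves.
sys.path.insert(0, workdir)
sys.modules.pop("colorsys", None)   # forget any earlier import
importlib.invalidate_caches()

import colorsys

# The local file wins: our marker is present, and the real module's
# rgb_to_hsv function is nowhere to be found.
shadowed = hasattr(colorsys, "SHADOWED")
has_real_api = hasattr(colorsys, "rgb_to_hsv")
print(shadowed, has_real_api)

# Undo the simulation so later imports see the real module again.
sys.path.remove(workdir)
sys.modules.pop("colorsys", None)
```

The import statement itself gives no hint about which file it will
pick up, which is exactly the ambiguity that the (lib ...) form
avoids.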

If we did want to import a "math.ss" in the current directory, we'd
leave off the "lib" declaration:

;;;
> (require "math.ss")
;;;

which then looks locally, rather than at the global collection paths.

Let's continue. Once we've done '(require (lib "math.ss"))', we have
access to the internals of the math library. But there's one surprise:
unlike Python, 'math' itself is not a first-class object. By default,
the require form has the same semantics as Python's
"from [module] import *"!

;;;
> e
2.718281828459045
;;;

A "require" pulls a module's names into our namespace. However, we can
fix this by using a "prefix" on the module declaration:

;;;
> (require (prefix math. (lib "math.ss" "mzlib")))
> math.e
2.718281828459045
;;;

And now all the names that we pull out of "math.ss" will have the
prefix "math." attached to them.

As a quick ending note on our first use of modules: I've used a period
in the prefix here just to make it more comfortable to people with
Python experience. But in mzscheme code that I've seen so far, it's
more common to use the colon as an informal namespace separator
instead, like this:

;;;
> (require (prefix math: (lib "math.ss")))
> math:e
2.718281828459045
;;;

so I'll try to follow this convention for the rest of the notes.


######################################################################

How do we create a module?

We can make a module with the module special form:

    (module [name of module] [initial-language] [module body])

There are a few new concepts here:

  o  A module must take in an initial set of toplevel bindings from a
     parent initial-language module. In most modules I've seen so
     far, this means mzscheme. For example:

         (module hello-module mzscheme ...)

     The mzscheme module is built-in:

         http://download.plt-scheme.org/doc/299.100/html/mzscheme/mzscheme-Z-H-5.html#node_sec_5.7

     and since it always exists by default, it's usually the most
     appropriate module to get our initial bindings.
     However, we might instead want to use a slightly richer (or less
     fattening!) initial environment. As a concrete example:

         (module eopl-practice (lib "eopl.ss" "eopl") ...)

     says that eopl-practice will use the initial environment
     provided by the (lib "eopl.ss" "eopl") module. (EOPL provides
     additional stuff like discriminated unions and some simple
     tokenizer/parser support.)

  o  Modules don't automatically export everything. If we make a
     module like this:

         (module simple-module mzscheme
           (define say-hello
             (lambda ()
               (printf "hello world~n"))))

     and then try using it, we might run into a quick shock at first:

         > (module simple-module mzscheme
             (define say-hello
               (lambda ()
                 (printf "hello world~n"))))
         > (require simple-module)
         > say-hello
         reference to undefined identifier: say-hello
         repl-3:1:0: say-hello

     What's going on? By default, modules look completely opaque.
     That is, we have to tell the system what things can be exposed
     to the outside world. Let's fix the module definition by
     "providing" everything that we define:

         > (module simple-module mzscheme
             (provide (all-defined))
             (define say-hello
               (lambda ()
                 (printf "hello world~n"))))
         > (require simple-module)
         > say-hello
         #<procedure:say-hello>
         > (say-hello)
         hello world

     That's better. The PLT module system can be a bit more
     fine-grained: we can refine what our module PROVIDEs:

         http://download.plt-scheme.org/doc/299.100/html/mzscheme/mzscheme-Z-H-5.html#node_sec_5.2

  o  The module system makes sure that symbols never collide by
     erroring out early.
     So something like this will fail:

         > (module broken-module mzscheme
             (define x 42)
             (define x 'forty-two))
         repl-10:3:3: module: duplicate definition for identifier at: x
         in: (define-values (x) (quote forty-two))

     But here's another place where things can break:

         (module hello-module-1 mzscheme
           (provide (all-defined))
           (define say-hello
             (lambda ()
               (printf "hello world~n"))))

         (module hello-module-2 mzscheme
           (provide (all-defined))
           (define say-hello
             (lambda ()
               (printf "hiya world~n"))))

         (module another-broken-module mzscheme
           (provide (all-defined))
           (require hello-module-1 hello-module-2))

     Trying to define another-broken-module will also break, for the
     same reason that broken-module breaks: two modules try to
     provide the same symbols, and they clash!

     To get around name clashes, we can use prefixes:

         (module fixed-module mzscheme
           (provide (all-defined))
           (require (prefix m1: hello-module-1)
                    (prefix m2: hello-module-2))
           (define say-hello-twice
             (lambda ()
               (m1:say-hello)
               (m2:say-hello))))

     This seems to be a good solution, and mimics how we avoid name
     collision in other module systems.

## Talk about how modules interact as files --- compare with how Python treats .py as modules automatically.

## Add comments from other folks on the PLT list

Some significant differences of modules from Python:

  o  Unlike Python, PLT Scheme's module system does not allow module
     variables to be directly munged from the outside. Let's try it:

         > (module test-illegal-mutation mzscheme
             (provide n)
             (define n 42))
         > (require test-illegal-mutation)
         > n
         42
         > (set! n (add1 n))
         repl-5:1:6: set!: cannot mutate module-required variable in: n

     Jens Axel Soegaard sent me a note with the major reasons why
     mzscheme prohibits this. I'd probably butcher what he told me if
     I paraphrased him, so let me instead just copy what he wrote:

     [Jens Axel Soegaard]

         There are two reasons for this design choice:

         1. Users of a module can't break an invariant by setting one
            of the exported variables to something wrong.

         2. When modules are compiled separately, the compiler knows
            that a variable is set! to, if and only if it is set!
            inside the module. That means the compiler can see if a
            variable never changes - and thus use that information in
            the optimization phase.

         E.g. consider

             (module mzscheme
               (provide a)
               (define a 1)
               (display (if (= a 1) 2 3)))

         Since a is not changed inside the module (and also not from
         the outside) it is safe to assume a is constant. Thus the
         reference to a in the if-expression can be replaced with the
         value of a, namely 1.

             (module mzscheme
               (provide a)
               (define a 1)
               (display (if (= 1 1) 2 3)))

         Now the if-expression consists of constant expressions and
         can be folded together:

             (module mzscheme
               (provide a)
               (define a 1)
               (display 2))

  o  Coupled to this is the idea that imported names are actually
     aliases. That is:

         (module some-module mzscheme
           (provide (all-defined))
           (define add
             (lambda ()
               (set! a (+ a 1))))
           (define a 42))

         (require (prefix m: some-module))
         (require (rename some-module myvar a))
         (printf "~s~n" myvar)
         (m:add)
         (printf "~s~n" myvar)

     shows a result that might be slightly surprising. Here, myvar is
     actually an alias for the 'a' in some-module, so the second
     printf sees the incremented value. We can contrast this with
     what happens in Python:

         ## some_module.py
         a = 42
         def add():
             global a
             a = a + 1

         ## in another module
         import some_module as m
         from some_module import a as myvar
         print(myvar)
         m.add()
         print(myvar)

     where myvar is a separate binding that still refers to 42 after
     m.add() runs. So there's a fairly subtle difference here in how
     modules work in PLT Scheme.

## fixme: talk about philosophical differences in MzScheme and Python's module systems and consequences of those differences.

    1. No way to accidentally have symbols collide, at the expense of
       having to explicitly resolve those symbol clashes at compile
       time.

    2. The use of the (lib ...) stuff means that there's no way to
       accidentally import a module in the current working directory
       when we really meant to import a standard library thing.
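
The Python side of the alias comparison above (the some_module
example) can be run as a single self-contained sketch. To avoid
needing a separate file, the module is built in memory with
types.ModuleType; that construction is only a convenience assumed for
this sketch, not something the original notes did. The code is
written for Python 3.

```python
import sys
import types

# Stand-in for some_module.py, built in memory so that this example
# does not depend on a file on disk.
source = """
a = 42

def add():
    global a
    a = a + 1
"""
some_module = types.ModuleType("some_module")
exec(source, some_module.__dict__)
sys.modules["some_module"] = some_module

import some_module as m
from some_module import a as myvar

before = myvar      # value seen through our own name before add()
m.add()             # rebinds 'a' inside some_module only
after = myvar       # our name still points at the old object

print(before, after, m.a)   # prints: 42 42 43
```

'from some_module import a as myvar' copies the current binding into
our namespace, so later rebinding inside some_module is invisible
through myvar and only shows up through the module object as m.a.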