How do we use modules?

---


In Python, we can bring up modules, like the math library:

    http://www.python.org/doc/lib/module-math.html

by using the 'import' statement:

###
>>> import math
###

Python finds the 'math' library by searching its module search path
(sys.path, which includes any directories named in PYTHONPATH).  The
import itself binds the name 'math' in our namespace and allows us to
get at math's functions and variables as attributes of the 'math'
module object.  For example:

###
>>> math.e
2.7182818284590451
###

'math' is a first-class value that can be passed around.  If we want
to pull all the names out of a module and dump them into the current
namespace, we can use the 'from [module] import *' form:

###
>>> from math import *
>>> e
2.7182818284590451
###

However, this is discouraged in Python programs because we can quickly
munge up toplevel definitions this way if we're not careful.
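
To make both points concrete, here's a small sketch in Python 3 syntax.
The helper name describe and the dictionary called namespace are made up
for illustration.  It first passes the module object to a function, and
then shows a star import quietly replacing a name that already existed.
The star import runs through exec() on a throwaway dictionary only so
that the example doesn't disturb our own toplevel:

```python
import math
import types


def describe(module):
    """Return a module's name and whether it has an attribute called 'e'."""
    return module.__name__, hasattr(module, "e")


# A module is an ordinary object, so it can be passed to a function.
assert isinstance(math, types.ModuleType)
info = describe(math)
print(info)

# A namespace that already defines its own 'e' ...
namespace = {"e": "my own e"}
# ... loses it as soon as the star import runs inside that namespace.
exec("from math import *", namespace)
print(namespace["e"] == math.e)
```

Nothing warns us that the old 'e' is gone, which is the kind of munging
that makes the star form risky.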

Module import in Python is very simple, but with some tricky
surprises.  If we name a Python program in the current working
directory with a name in the Standard Library, hilarity ensues,
because Python's module finder retrieves the custom module first.
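
Here's a rough sketch of that surprise, in Python 3 syntax.  The helper
import_shadowed is hypothetical, and I'm assuming a stock CPython where
'colorsys' is an ordinary pure-Python module in the Standard Library.
The temporary directory stands in for the current working directory,
which normally sits at the front of the search path when we run a
script:

```python
import importlib
import os
import sys
import tempfile


def import_shadowed(name, source):
    """Import `name` from a temporary directory placed first on sys.path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        with open(os.path.join(tmpdir, name + ".py"), "w") as handle:
            handle.write(source)
        saved = sys.modules.pop(name, None)  # forget any cached copy
        sys.path.insert(0, tmpdir)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(name)
        finally:
            # Put the interpreter back the way we found it.
            sys.path.remove(tmpdir)
            sys.modules.pop(name, None)
            if saved is not None:
                sys.modules[name] = saved


shadow = import_shadowed("colorsys", "WHO = 'local file'\n")
print(shadow.WHO)
print(hasattr(shadow, "rgb_to_hsv"))
```

The module we get back is our one-line file, not the library's, so the
usual colorsys functions are simply missing.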


PLT Scheme has two systems for doing module-like behavior.  One of
these appears to be somewhat complex (units), so I'm skipping that for
now.  Instead, I'll explore PLT's module system, which is heavily
used.


By the way, there is a chapter about PLT Scheme's module system in
here:

    http://www.htus.org/Book/Staging/how-to-use-modules/


PLT Scheme uses the 'require' special form to pull in a module, and
PLT's standard library is called "MzLib".

Let's see how we can import a math library into PLT.  We'll import the
"math.ss" library from MzLib:

    http://download.plt-scheme.org/doc/209/html/mzlib/mzlib-Z-H-23.html

;;;
> (require (lib "math.ss" "mzlib"))
;;;

The '(lib "math.ss" "mzlib")' part tells PLT that we'd like to get at
the 'math.ss' module within the MzLib collection.


There are several library collections that come with mzscheme.
Actually, there are a LOT of them.  Here, let's do a quick peek:

;;;
> (define main-collection-path
    (find-executable-path program (build-path 'up "collects")))

> (map path->string (directory-list main-collection-path))
("tex2page" "string-constants" "planet" "mzcom" "drscheme" "framework"
 "srpersist" "sirmail" "slideshow" "syntax-color" "srfi" "algol60"
 "dynext" "slibinit" "make" "plot" "reduction-semantics" "lang"
 "web-server" "openssl" "frtime" "mred" "guibuilder" "htdp" "setup"
 "graphics" "profjWizard" "afm" "defaults" "honu" "icons" "skipper"
 "handin-client" "ssax" "texpict" "profjBoxes" "browser" "parser-tools"
 "waterworld" "games" "honu-module" "eopl" "tests" "html" "swindle"
 "handin-server" "repos-time-stamp" "mrflow" "compiler" "embedded-gui"
 "profj" "stepper" "readline" "hierlist" "finish-install" "sgl"
 "version" "syntax" "preprocessor" "net" "help" "htdch" "xelda" "trace"
 "slatex" "xml" "launcher" "ffi" "mrlib" "test-suite" "errortrace"
 "mzscheme" "mysterx" "mzlib" "info-domain" "doc" "mztake"
 "r6rs" "plai")
;;;

Whew.  *grin* 

Each of the names there stands for a collection, and MzLib is only one
of many library collections.  But since MzLib is so standard, if we
leave off the "collection" portion of the module library declaration,
PLT will automatically default to use mzlib, so we can simplify the
above to:

;;;
> (require (lib "math.ss"))
;;;

The repercussions of this are that, unlike Python, this form doesn't
look at the current working directory, and doesn't do a PATH-like hunt
for modules.  This has the nice property of being explicit, since if
we see:

    (require (lib "math.ss"))

we know that we're referring to the "math.ss" in the standard library,
without exception.  I contrast this to the situation in Python, where
newcomers to the language often name their personal modules in ways
that conflict with the names in the Standard Library.


If we did want to import a "math.ss" in the current directory, we'd
leave off the "lib" declaration:

;;;
> (require "math.ss")
;;;

which then looks locally, rather than at the global collection paths.


Let's continue.  Once we've done '(require (lib "math.ss"))', we have
access to the internals of the math library.  But there's one
surprise: unlike Python, 'math' itself is not a first-class object.
By default, the require form has the same semantics as Python's "from
[module] import *"!

;;;
> e
2.718281828459045
;;;

A "require" pulls a module's names into our namespace.  However, we
can fix this by using a "prefix" on the module declaration:

;;;
> (require (prefix math. (lib "math.ss" "mzlib")))
> math.e
2.718281828459045
;;;


And now all the names that we pull out of "math.ss" will have the
prefix "math." attached to them.


As a quick ending note on our first use of modules: I've used a period
in the prefix here just to make it more comfortable to people with
Python experience.  But in mzscheme code that I've seen so far, it's
more common to use the colon as an informal namespace separator
instead, like this:

;;;
> (require (prefix math: (lib "math.ss")))
> math:e
2.718281828459045
;;;

so I'll try to follow this convention for the rest of the notes.
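
For readers coming from the Python side, here's a loose analogy for what
a prefix does, in Python 3 syntax.  The helper name prefixed is made up,
and since Python identifiers can't contain a colon, the prefixed names
live in a dictionary rather than in the real namespace.  This is only a
sketch of the idea, not how PLT implements it:

```python
import math


def prefixed(module, prefix):
    """Map prefix + name to value for every public name in the module."""
    return {
        prefix + name: value
        for name, value in vars(module).items()
        if not name.startswith("_")
    }


names = prefixed(math, "math:")
print(names["math:e"] == math.e)
print("e" in names)
```

Every name arrives with the prefix attached, and the bare 'e' never
shows up, which is the property that keeps imports from stepping on our
own definitions.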


######################################################################
How do we create a module?

We can make a module with the module special form:

    (module [name of module] [initial-language]
        [module body])

There are a few new concepts here:


o A module must take in an initial set of toplevel bindings from a
  parent initial-language module.  In most modules I've seen so far,
  this means mzscheme.  For example:

    (module hello-module mzscheme
       ...)

  The mzscheme module is built-in:

    http://download.plt-scheme.org/doc/299.100/html/mzscheme/mzscheme-Z-H-5.html#node_sec_5.7

  and since it always exists by default, it's usually the most
  appropriate module to get our initial bindings.


  However, we might instead want to use a slightly richer (or less
  fattening!)  initial environment.  As a concrete example:

    (module eopl-practice (lib "eopl.ss" "eopl")
        ...)

  says that eopl-practice will use the initial environment provided by
  the (lib "eopl.ss" "eopl") module.  (EOPL provides additional stuff
  like discriminated unions and some simple tokenizer/parser support.)


o  Modules don't export anything by default.


   If we make a module like this:

       (module simple-module mzscheme
          (define say-hello
             (lambda () (printf "hello world~n"))))


   And if we try using it, we might run into a quick shock at first:

        > (module simple-module mzscheme
             (define say-hello
                (lambda () (printf "hello world~n"))))
        > (require simple-module)
        > say-hello
        reference to undefined identifier: say-hello
        repl-3:1:0: say-hello


    What's going on?

    What's happening is that, by default, modules look completely
    opaque.  That is, we have to tell the system what things can be
    exposed to the outside world.  Let's fix the module definition, by
    "providing" everything that we define:


        > (module simple-module mzscheme
             (provide (all-defined))
             (define say-hello
                (lambda () (printf "hello world~n"))))
        > (require simple-module)
        > say-hello
        #<procedure:say-hello>
        > (say-hello)
        hello world


    That's better.


    The PLT module system can be a bit more fine-grained: we can
    refine what our module PROVIDEs:

        http://download.plt-scheme.org/doc/299.100/html/mzscheme/mzscheme-Z-H-5.html#node_sec_5.2
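
For comparison, the closest thing Python has to 'provide' is the __all__
list, which only controls what a star import picks up.  Here's a sketch
in Python 3 syntax; the module name simple_module and its two functions
are invented, and the module is built from a string just to keep the
example in one file:

```python
import sys
import types

SOURCE = '''
__all__ = ["say_hello"]

def say_hello():
    return "hello world"

def helper():
    return "left out of __all__"
'''

# Build a module object from the source text and register it so that
# import statements can find it.
simple_module = types.ModuleType("simple_module")
exec(SOURCE, simple_module.__dict__)
sys.modules["simple_module"] = simple_module

# Star-import into a scratch namespace and see what arrived.
namespace = {}
exec("from simple_module import *", namespace)
print(sorted(name for name in namespace if not name.startswith("_")))

# The unlisted name is still reachable as an attribute.
print(simple_module.helper())
```

So Python's export list is advisory, while a PLT module really does hide
whatever it doesn't provide.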


o  The module system makes sure that symbols never collide by erroring
   out early.  So something like this will fail:

        > (module broken-module mzscheme
             (define x 42)
             (define x 'forty-two))
        repl-10:3:3: module: duplicate definition for identifier at: x in: (define-values (x) (quote forty-two))


   But here's another place where things can break:

       (module hello-module-1 mzscheme
          (provide (all-defined))
          (define say-hello
             (lambda () (printf "hello world~n"))))


       (module hello-module-2 mzscheme
          (provide (all-defined))
          (define say-hello
             (lambda () (printf "hiya world~n"))))

       (module another-broken-module mzscheme
          (provide (all-defined))
          (require hello-module-1
                   hello-module-2))

   Trying to define another-broken-module will also break for the same
   reason that broken-module breaks: two modules try to provide the
   same symbols, and they clash!

   To get around name clashes, we can use prefixes:
   
       (module fixed-module mzscheme
          (provide (all-defined))
          (require (prefix m1: hello-module-1)
                   (prefix m2: hello-module-2))
          (define say-hello-twice
             (lambda ()
                (m1:say-hello)
                (m2:say-hello))))

   This seems to be a good solution, and mimics how we avoid name
   collision in other module systems.
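
It's worth seeing how the same clash plays out in Python, where nothing
errors out.  This is a sketch in Python 3 syntax; the helper make_module
is hypothetical and fakes the two hello modules in memory so we don't
need separate files:

```python
import sys
import types


def make_module(name, greeting):
    """Create and register a module whose say_hello() returns greeting."""
    module = types.ModuleType(name)
    exec("def say_hello():\n    return %r\n" % greeting, module.__dict__)
    sys.modules[name] = module
    return module


make_module("hello_module_1", "hello world")
make_module("hello_module_2", "hiya world")

# Two star imports of the same name: no error, the second one wins.
clashing = {}
exec("from hello_module_1 import *\nfrom hello_module_2 import *", clashing)
print(clashing["say_hello"]())

# Qualified imports keep both, much like the prefix form.
qualified = {}
exec("import hello_module_1 as m1\nimport hello_module_2 as m2", qualified)
print(qualified["m1"].say_hello(), qualified["m2"].say_hello())
```

The first say-hello is silently lost in the star-import case, whereas
PLT refuses to build the module until we resolve the clash ourselves.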


## Talk about how modules interact as files --- compare with how
   Python treats .py as modules automatically.

## Add comments from other folks on the PLT list


Some significant differences from Python's modules:

o Unlike Python, PLT Scheme's module system does not allow module
  variables to be directly munged from the outside.  Let's try it:

    > (module test-illegal-mutation mzscheme
        (provide n)
        (define n 42))
    > (require test-illegal-mutation)
    > n
    42
    > (set! n (add1 n))
    repl-5:1:6: set!: cannot mutate module-required variable in: n


  Jens Axel Soegaard sent me a note with the major reasons why
  mzscheme prohibits this.  I'd probably butcher what he told me if I
  paraphrased him, so let me instead just copy what he wrote:

  [Jens Axel Soegaard]
  There are two reasons for this design choice:

    1. Users of a module can't break an invariant by setting one of
       the exported variables to something wrong.

    2. When modules are compiled separately, the compiler knows that
       a variable is set! to, if and only if it is set! inside the module.
       That means the compiler can see if a variable never changes - and
       thus use that information in the optimization phase.

       E.g. consider

         (module mzscheme
           (provide a)
           (define a 1)
           (display (if (= a 1) 2 3)))

       Since a is not changed inside the module (and also not from the outside)
       it is safe to assume a is constant. Thus the reference to a in the
       if-expression can be replaced with the value of a, namely 1.

         (module mzscheme
           (provide a)
           (define a 1)
           (display (if (= 1 1) 2 3)))

        Now the if-expression consists of constant expressions and can be folded
        together:

         (module mzscheme
           (provide a)
           (define a 1)
           (display 2))
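
Python, by contrast, allows exactly the kind of outside mutation that
the first reason guards against.  Here's a sketch in Python 3 syntax;
the module name counter and its contents are invented, and the module is
built in memory to keep the example self-contained:

```python
import types

# Stand-in for a module on disk; the name "counter" and its contents are
# invented for this example.
counter = types.ModuleType("counter")
exec(
    "n = 42\n"
    "def half():\n"
    "    return n // 2\n",
    counter.__dict__,
)

first_half = counter.half()
print(first_half)  # 21

# Unlike the set! above, assignment from outside the module is allowed.
counter.n = counter.n + 1
print(counter.n)  # 43

# Nothing stops outside code from breaking what half() relies on.
counter.n = "forty-two"
try:
    counter.half()
    broke_invariant = False
except TypeError:
    broke_invariant = True
print(broke_invariant)  # True
```

Because any importer can rebind n at any time, a Python compiler can't
treat n as a constant the way the folding example above does.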


o Coupled to this is the idea that imported names are actually
  aliases.  That is:

   (module some-module mzscheme
      (provide (all-defined))
      (define add
         (lambda () (set! a (+ a 1))))
      (define a 42))

   (require (prefix m: some-module))
   (require (rename some-module myvar a))
   (printf "~s~n" myvar)
   (m:add)
   (printf "~s~n" myvar)

prints 42 and then 43, a result that might be slightly surprising.
Here, myvar is actually an alias for the 'a' in some-module, so it sees
the update made by m:add.


We can contrast this with what happens in Python:

    ## some_module.py
    a = 42
    def add():
        global a
        a = a + 1


    ## in another module
    import some_module as m
    from some_module import a as myvar
    print(myvar)
    m.add()
    print(myvar)


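which prints 42 both times: the 'from ... import' bound myvar to the
value that 'a' had at import time, so rebinding 'a' inside some_module
doesn't affect myvar.  So there's a fairly subtle difference here in
how modules work in PLT Scheme.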
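
For anyone who wants to run the Python half without creating two files,
here's a single-script sketch in Python 3 syntax.  Building some_module
with types.ModuleType and exec() is just a device to stand in for
some_module.py, and the names before and after are mine:

```python
import sys
import types

# In-memory stand-in for some_module.py.
some_module = types.ModuleType("some_module")
exec(
    "a = 42\n"
    "def add():\n"
    "    global a\n"
    "    a = a + 1\n",
    some_module.__dict__,
)
sys.modules["some_module"] = some_module

import some_module as m
from some_module import a as myvar

before = myvar
m.add()
after = myvar
print(before, after, m.a)
```

myvar stays put while m.a moves on, so in Python the imported name is a
separate binding rather than an alias.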


## fixme: talk about philosophical differences in MzScheme and
   Python's module systems and consequences of those differences.

   1.  No way to accidentally have symbols collide, at the expense of
       having to explicitly resolve those symbol clashes at compile time.

   2.  The use of the (lib ...) stuff means that there's no way to
       accidentally import a module in the current working directory
       when we really meant to import a standard library thing.