1 files changed, 188 insertions, 0 deletions
diff --git a/src/mailman/docs/architecture.rst b/src/mailman/docs/architecture.rst
new file mode 100644
index 000000000..e1d5c4ec5
--- /dev/null
+++ b/src/mailman/docs/architecture.rst
@@ -0,0 +1,188 @@
+=============================
+ Mailman 3 Core architecture
+=============================
+
+This is a brief overview of the internal architecture of the Mailman 3 core
+delivery engine.  You should start here if you want to understand how Mailman
+works at the 1000 foot level.  Another good source of architectural
+information is available in the chapter written by Barry Warsaw for the
+`Architecture of Open Source Applications`_.
+
+
+User model
+==========
+
+Every major component of the system is defined by an interface.  Look through
+``src/mailman/interfaces`` for an understanding of the system components.
+Mailman objects which are stored in the database, are defined by *model*
+classes.  Objects such as *mailing lists*, *users*, *members*, and *addresses*
+are primary objects within the system.
+
+The *mailing list* is the central object which holds all the configuration
+settings for a particular mailing list.  A mailing list is associated with a
+*domain*, and all mailing lists are managed (i.e. created, destroyed, looked
+up) via the *mailing list manager*.
+
+*Users* represent people, and have a *user id* and a *display name*.  Users
+are linked to *addresses* which represent a single email address.  One user
+can be linked to many addresses, but an address is only linked to one user.
+Addresses can be *verified* or *not verified*.  Mailman will deliver email
+only to *verified* addresses.
+
+Users and addresses are managed by the *user manager*.
+
+A *member* is created by linking a *subscriber* to a mailing list.
+Subscribers can be:
+
+* A user, which become members through their *preferred address*.
+* An address, which can be linked or unlinked to a user, but must be verified.
+
+Members also have a *role*, representing regular members, digest members, list
+owners, and list moderators.  Members can even have the *non-member* role
+(i.e. people not yet subscribed to the mailing list) for various moderation
+purposes.
+
+
+Process model
+=============
+
+Messages move around inside the Mailman system by way of *queue* directories
+managed by the *switchboard*.  For example, when a message is first received
+by Mailman, it is moved to the *in* (for "incoming") queue.  During the
+processing of this message, it -or copies of it- may be moved to other queues
+such as the *out* queue (for outgoing email), the *archive* queue (for sending
+to the archivers), the *digest* queue (for composing digests), etc.
+
+A message in a queue is represented by a single file, a ``.pck`` file.  This
+file contains two objects, serialized as `Python pickles`_.  The first object
+is the message being processed, already parsed into a `more efficient internal
+representation`_.  The second object is a metadata dictionary that records
+additional information about the message as it is being processed.
+
+``.pck`` files only exist for messages moving between different system queues.
+There is no ``.pck`` file for messages while they are actively being
+processed.
+
+Each queue directory is associated with a *runner* process which wakes up
+every so often.  When the runner wakes up, it examines all the ``.pck`` files
+in FIFO order, deserializing the message and metadata objects, and processing
+them.  If the message needs further processing in a different queue, it will
+be re-serialized back into a ``.pck`` file.  If not (e.g. because processing
+of the message is complete), then no ``.pck`` file is written.
+
+The Mailman system uses a few other runners which don't process messages in a
+queue.  You can think of these as fairly typical server process, and examples
+include the LMTP server, and the HTTP server for processing REST commands.
+
+All of the runners are managed by a *master watcher* process.  When you type
+``mailman start`` you are actually starting the master.  Based on
+configuration options, the master will start the appropriate runners as
+subprocesses, and it will watch for the clean exiting of these subprocesses
+when ``mailman stop`` is called.
+
+
+Rules and chains
+================
+
+When a message is first received for posting to a mailing list, Mailman
+processes the message to determine whether the message is appropriate for the
+mailing list.  If so, it *accepts* the message and it gets posted.  Mailman
+can *discard* the message so that no further processing occurs.  Mailman can
+also *reject* the message, bouncing it back to the original sender, usually
+with some indication of why the message was rejected.  Or, Mailman can *hold*
+the message for moderator approval.
+
+*Moderation* is the phase of processing that determines which of the above
+four dispositions will occur for the newly posted message.  Moderation does
+not generally change the message, but it may record information in the
+metadata dictionary.  Moderation is performed by the *in* queue runner.
+
+Each step in the moderation phase applies a *rule* to the message and asks
+whether the rule *hits* or *misses*.  Each rule is linked to an *action* which
+is taken if the rule hits (i.e. matches).  If the rule misses (i.e. doesn't
+match), then the next rule is tried.  All of the rule/action links are strung
+together sequentially into a *chain*, and every mailing list has a *start
+chain* where rule processing begins.
+
+Actually, every mailing list has *two* start chains, one for regular postings
+to the mailing list, and another for posting to the owners of the mailing
+list.
+
+To recap: when a message comes into Mailman for posting to a mailing list, the
+incoming runner finds the destination mailing list, determines whether the
+message is for the entire list membership, or the list owners, and retrieves
+the appropriate start chain.  The message is then passed to the chain, where
+each link in the chain first checks to see if its rule matches, and if so, it
+executes the linked action.  This action is usually one of *accept*, *reject*,
+*discard*, and *hold*, but other actions are possible, such as executing a
+function, deferring action, or jumping to another chain.
+
+As you might imagine, you can write new rules, compose them into new chains,
+and configure a mailing list to use your custom chain when processing the
+message during the moderation phase.
+
+
+Pipeline of handlers
+====================
+
+Once a message is accepted for posting to the mailing list, the message is
+usually modified in a number of different ways.  For example, some message
+headers may be added or removed, some MIME parts might be scrubbed, added, or
+rearranged, and various informative headers and footers may be added to the
+message.
+
+The process of preparing the message for the list membership (as well as the
+digests, archivers, and NNTP) falls to the *pipeline of handlers* managed by
+the *pipeline* queue.
+
+The pipeline of handlers is similar to the processing chain, except here, a
+handler can make any modifications to the message it wants, and there is no
+rule decision or action.  The message and metadata simply flow through a
+sequence of handlers arranged in a named pipeline.  Some of the handlers
+modify the message in ways described above, and others copy the message to the
+outgoing, NNTP, archiver, or digester queues.
+
+As with chains, each mailing list has two pipelines, one for posting to the
+list membership, and the other for posting to the list's owners.
+
+Of course, you can define new handlers, compose them into new pipelines, and
+change a mailing list's pipelines.
+
+
+Integration and control
+=======================
+
+Humans and external programs can interact with a running Core system in may
+different ways.  There's an extensive command line interface that provides
+useful options to a system administrator.  For external applications such as
+the Postorius web user interface, and the HyperKitty archiver, the
+`administrative REST API <rest-api>` is the most common way to get information
+into and out of the Core.
+
+**Note**: The REST API is an administrative API and as such it must not be
+exposed to the public internet.  By default, the REST server only listens on
+``localhost``.
+
+Internally, the Python API is extensive and well-documented.  Most objects in
+the system are accessed through the `Zope Component Architecture`_ (ZCA).  If
+your Mailman installation is importable, you can write scripts directly
+against the internal public Python API.
+
+
+Other bits and pieces
+=====================
+
+There are lots of other pieces to the Mailman puzzle, such as the set of core
+functionality (logging, initialization, event handling, etc.), mailing list
+*styles*, the API for integrating external archivers and mail servers.  The
+database layer is an critical piece, and Mailman has an extensive set of
+command line commands, and email commands.
+
+Almost the entire system is documented in these pages, but it maybe be a bit
+of a spelunking effort to find it.  Improvements are welcome!
+
+
+.. _`Architecture of Open Source Applications`: http://www.aosabook.org/en/mailman.html
+.. _`Python pickles`: http://docs.python.org/2/library/pickle.html
+.. _`more efficient internal representation`: https://docs.python.org/3/library/email.html
+.. _`Zope Component Architecture`: https://pypi.python.org/pypi/zope.component