4 files changed, 134 insertions, 10 deletions
diff --git a/README.rst b/README.rst
index 60a0d1f55..e58d18297 100644
--- a/README.rst
+++ b/README.rst
@@ -62,6 +62,7 @@ Table of Contents
     src/mailman/docs/hyperkitty
     src/mailman/docs/contribute
     src/mailman/docs/STYLEGUIDE
+    src/mailman/docs/internationalization
     src/mailman/docs/architecture
     src/mailman/docs/8-miles-high
     src/mailman/docs/NEWS
diff --git a/src/mailman/docs/NEWS.rst b/src/mailman/docs/NEWS.rst
index 3a2398b74..c7bbc6de6 100644
--- a/src/mailman/docs/NEWS.rst
+++ b/src/mailman/docs/NEWS.rst
@@ -164,15 +164,6 @@ Interfaces
  * ``ISubscriptionService`` now supports mass unsubscribes.  Given by Harshit
    Bansal.
 
-Internal
---------
- * Add official support for Python 3.6. (Closes #295)
- * A handful of unused legacy exceptions have been removed.  The redundant
-   ``MailmanException`` has been removed; use ``MailmanError`` everywhere.
- * Drop the use of the ``lazr.smtptest`` library, which is based on the
-   asynchat/asyncore-based smtpd.py stdlib module.  Instead, use the
-   asyncio-based aiosmtpd package.
-
 Message handling
 ----------------
  * New DMARC mitigations have been added.  Given by Mark Sapiro.  (Closes #247)
@@ -273,7 +264,12 @@ REST
 
 Other
 -----
- * The test suite is now Python 3.5 compatible.
+ * Add official support for Python 3.5 and 3.6. (Closes #295)
+ * A handful of unused legacy exceptions have been removed.  The redundant
+   ``MailmanException`` has been removed; use ``MailmanError`` everywhere.
+ * Drop the use of the ``lazr.smtptest`` library, which is based on the
+   asynchat/asyncore-based smtpd.py stdlib module.  Instead, use the
+   asyncio-based `aiosmtpd <http://aiosmtpd.readthedocs.io/>`_ package.
  * Improvements in importing Mailman 2.1 lists, given by Aurélien Bompard.
  * The ``prototype`` archiver is not web accessible so it does not have a
    ``list_url`` or permalink.  Given by Aurélien Bompard.
diff --git a/src/mailman/docs/internationalization.rst b/src/mailman/docs/internationalization.rst
new file mode 100644
index 000000000..da61153fb
--- /dev/null
+++ b/src/mailman/docs/internationalization.rst
@@ -0,0 +1,123 @@
+.. _internationalization:
+
+================================
+ Mailman 3 Internationalization
+================================
+
+Mailman does not yet support IDNA (internationalized domain names, RFC
+5890) or internationalized mailboxes (RFC 6531) in email addresses.
+But *display names* and *descriptions* are fully internationalized in
+Mailman, using Unicode.  Email content is handled by the Python email
+package, which provides robust handling of internationalized content
+conforming to the MIME standard (RFCs 2045-2049 and others).
+
+The encoding of URI components addressing a REST endpoint is Unicode
+UTF-8.  Mailman does not currently handle normalization, and we
+recommend consistently using normal form NFC.  (For some languages
+NFKC is risky, as some users' personal names may be corrupted by this
+normalization.)  Mailman does not check for confusables or check
+repertoire.
+
+
+Introduction to Unicode Concepts
+================================
+
+The Unicode Standard is intended to provide a universal set of
+characters with a single, standard encoding providing an invertible
+mapping of characters to integers (called *code points* in this
+context).
+
+
+Repertoires
+-----------
+
+A set of characters is called a *repertoire*.  Unicode itself is
+intended to provide a universal repertoire sufficient to represent
+all words in all written languages, but a system may handle a
+restricted repertoire and still be considered conformant, as long as
+it does not corrupt characters it does not handle, and does not emit
+non-character code points.
+
+
+Convertibility
+--------------
+
+Unicode is intended to provide a character for each character defined
+in a national character set standard.  The resulting decisions are
+often controversial: Chinese characters are often *unified* with
+Japanese characters that appear somewhat different when displayed,
+while the Cyrillic and Greek equivalents of the Latin character "A"
+are treated as separate characters despite being pronounced the same
+way and being displayed as identical glyphs.  These judgments are
+informed by the notion that a text should *round-trip*.  That is, when
+a text is converted from a national encoding to Unicode, and then back
+to that encoding, the result should be identical to the source text.
+
+
+Normalization
+-------------
+
+For several reasons, Unicode provides for construction of characters
+by appending *composable characters* (such as accents) to *base
+characters* (typically letters).  But since most national standards
+assign a code point to each accented letter, the "round-tripping"
+requirement described above implies that Unicode should provide a code
+point for that accented letter, called a *precomposed character*.
+This means that for most accented characters, there are two or more
+ways to represent them, using various combinations of base characters,
+precomposed characters, and composable characters.
+
+There are also a number of cases where equivalent characters have
+different code points (in a few extreme cases, the same character has
+different code points because the original national standard had
+duplicates).  Such characters are called *compatibility characters*.
+
+The Unicode Standard requires that the composed character sequence be
+treated identically to the precomposed (single) character by all
+text-processing algorithms.  For convenience in matching, an
+application may choose to *normalize* texts.  There are two basic
+normalizations.  The *NFC* normal form requires that all compositions
+to precomposed characters that can be done should be done.  It has the
+advantage that the length of a word in characters is usually the
+number of code points in the word.  The *NFD* normal form requires that
+all precomposed characters be decomposed into a sequence of a base
+character followed by composable characters.  It is useful in contexts
+where fuzzy matches (*i.e.*, ignoring accents) are desired.
+
+Finally, in each of these two forms a compatibility character may be
+replaced by its *compatibility equivalent*; the resulting forms are
+denoted *NFKC* and *NFKD*, respectively.
+
+
+Using Unicode in Mailman
+------------------------
+
+In most cases in Mailman it is highly recommended that input be
+encoded as UTF-8 in NFC format.  Although highly conformant systems
+are becoming more common, there are still many systems that assume
+that one code point is translated to one glyph on display.  On such
+systems NFC will provide a smoother user experience than NFD.  Since
+much of the text data that Mailman handles is user names, and users
+often strongly prefer a particular compatibility character to its
+compatibility equivalent, NFKC (or NFKD) should be avoided.
+
+There are two other considerations in using Unicode in Mailman.  The
+first is the problem of confusables.  *Confusables* are characters
+which are considered different but whose glyphs are indistinguishable,
+such as Latin capital letter A and Greek capital letter Alpha.  The
+second is repertoire: many code points in Unicode are not yet assigned
+characters, or are even defined as non-characters, and thus are not
+part of the repertoire of characters represented by Unicode.
+
+Mailman makes no attempt to detect inappropriate use of confusables or
+non-characters (for example, to redirect users to a domain
+disseminating malware).  The risks at present are vanishingly small
+because the necessary support in the mail system itself is not yet
+widespread, but this is likely to change in the near future.
+
+
+Localization
+============
+
+We have it!  We just don't have proper documentation here yet.
+
diff --git a/src/mailman/rest/docs/basic.rst b/src/mailman/rest/docs/basic.rst
index 1f8084ecd..24b919bb2 100644
--- a/src/mailman/rest/docs/basic.rst
+++ b/src/mailman/rest/docs/basic.rst
@@ -2,6 +2,10 @@
  Basic operation
 =================
 
+The encoding of URI components addressing a REST endpoint is Unicode
+UTF-8.  There is :ref:`more information about internationalization in
+Mailman <internationalization>`.
+
 In order to do anything with the REST API, you need to know its `Basic AUTH`_
 credentials, and the version of the API you wish to speak to.