path: root/Mailman/Queue/Switchboard.py
Commit message | Author | Age | Files | Lines
* Reorganize the qrunner infrastructure. First, the package has been renamed | Barry Warsaw | 2007-09-29 | 1 | -190/+0
  from Mailman.Queue to Mailman.queue (note the case change to be more PEP 8 compliant). The Switchboard and Runner classes have been moved into the package __init__.py and the previous class modules have been removed. The switchboard cache is removed; I don't think it was ultimately buying us much. Now, just import the Switchboard class and instantiate it directly.

  Added an IRunner interface. Renamed the ArchRunner to ArchiveRunner.

  bin/qrunner and bin/mailmanctl are updated accordingly. For the former, it no longer accepts -r=All to run all qrunners. You can still use the short name (e.g. --runner=incoming) to run the built-in queue runners, but this design will eventually allow for plugin qrunners by allowing them to be run specifying the full package path to the class. It also now accepts a leading dot to indicate a qrunner class relative to the Mailman.queue package.
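  The three ways of naming a runner described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the helper name, the short-name table, and the module names inside it are all assumptions made for the example.

```python
# Hypothetical sketch of --runner name resolution (not the committed code).
BASE_PACKAGE = "Mailman.queue"

# Assumed short-name table; the module and class paths are illustrative only.
SHORT_NAMES = {
    "incoming": ".incoming.IncomingRunner",
    "archive": ".archive.ArchiveRunner",
}


def resolve_runner_path(spec):
    """Return a fully qualified class path for a --runner value."""
    # A known short name expands to a path relative to the base package.
    spec = SHORT_NAMES.get(spec, spec)
    # A leading dot means "relative to the Mailman.queue package".
    if spec.startswith("."):
        return BASE_PACKAGE + spec
    # Anything else is taken as a full package path to the class.
    return spec
```

  A plugin runner would then be named by its full path, e.g. resolve_runner_path("my.plugin.MyRunner") returns the string unchanged.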
* Convert the rest of test_runners.py to doctests; even though incomplete, they | Barry Warsaw | 2007-06-28 | 1 | -2/+1
  test everything the old unit tests tested. There are XXX's left in the doctests as reminders to flesh them out.

  Change the NNTP_REWRITE_DUPLICATE_HEADERS to use proper capitalization.

  Revert a change I made in the conversion of the Switchboard class: Switchboard.files is no longer a generator. The Runner implementation is cleaner if this returns a concrete list, so that's what it does now. Update the tests to reflect that.

  The Runner simplifies now too because it no longer needs _open_files() or the _listcache WeakValueDictionary. The standard list manager handles all this now, so just use it directly.

  Also change the way the Runner sets the language context in _onefile(). It still tries to set it to the preferred language of the sender, if the sender is a member of the list. Otherwise it sets it to the list's preferred language, not the system's preferred language. Removed a conditional that can't possibly happen.
* Convert the Switchboard test in test_runners.py to a doctest. Add an | Barry Warsaw | 2007-06-27 | 1 | -69/+62
  ISwitchboard interface and modernize the Python code in the Switchboard.py implementation. The SAVE_MSGS_AS_PICKLES option is removed. Messages are always saved as pickles unless the metadata '_plaintext' key is present, though this should eventually go away too.

  In testall.py, put the entire VAR_PREFIX in a temporary directory. This helps the switchboard tests by not mixing their data with the installation's queue directories. The Configuration object now also ensures that all the queue and log directories exist -- one more step on the road to getting rid of the autoconf mess.
* Merge exp-elixir-branch to trunk. There is enough working to make me feel | bwarsaw | 2007-05-28 | 1 | -3/+11
  confident the Elixir branch is ready to become mainline. Also, fewer branches makes for an easier migration to a dvcs.

  Don't expect much of the old test suite to work, or even for much of the old functionality to work. The changes here are disruptive enough to break higher level parts of Mailman. But that's okay because I am slowly building up a new and improved test suite, which will lead to a functional system again. For now, only the doctests in Mailman/docs (and their related test harnesses) will pass, but they all do pass. Note that Mailman/docs serve as system documentation first and unit tests second. You should be able to read the doctest files to understand the underlying data model.

  Other changes included in this merge:
  - Added the Mailman.ext extension package.
  - zope.interfaces used to describe major components
  - SQLAlchemy/Elixir used as the database model
  - Top level doinstall target renamed to justinstall
  - 3rd-party packages are now installed in pythonlib/lib/python to be more compliant with distutils standards. This allows us to use just --home instead of all the --install-* options.
  - No longer need to include the email package or pysqlite, as Python 2.5 is required (and comes with both packages).
  - munepy package is included, for Python enums
  - IRosterSets are added as a way to manage a collection of IRosters. Roster sets are named so that we can maintain the indirection between mailing lists and rosters, where the two are maintained in different storages.
  - IMailingListRosters: remove_*_roster() -> delete_*_roster()
  - Remove IMember interface.
  - Utils.list_names() -> config.list_manager.names
  - fqdn_listname() takes an optional hostname argument.
  - Added a bunch of new exceptions used throughout the new interfaces.
  - Make LockFile a context manager for use with the 'with' statement.
* A few style fixes based on commit reviews. | bwarsaw | 2007-03-21 | 1 | -1/+3
  Switchboard.py
  - Use listname.encode('utf-8') to produce the necessary 8-bit string, instead of str(listname). Also update the preceding comment.

  senddigests.py
  - Remove an unnecessary import.

  Decorate.py
  - Remove a commented out section of code.
  - Remove some redundant local variables
  - Reorganize the section that's trying to find a usable encoding for the payload of the modified message. I don't think it really hurts much to try duplicate charsets when lcset == mcset, or when either == utf-8. Just go ahead and try them and let them fail. This simplifies the code. Also, try to get just the minimum necessary code under the UnicodeError. I think it's enough to catch the payload.encode() call.
* Fixes for i18n digest to work. | tkikuchi | 2007-03-02 | 1 | -1/+1
  Mailman/Queue/Switchboard.py: listname is returned in unicode. ('\x80' + 'a' is OK, '\x80' + u'a' fails)
  Mailman/Utils.py: Utils.oneline() is extended for returning unicode string.
  Mailman/Digester.py: next_post_number is not used anywhere.
  Mailman/database/listdata.py: Attributes added (esp. for non web u/i)
  Mailman/bin/senddigests.py: Initialization
  Mailman/Handlers/ToDigest.py: Internal string calculation is done in unicode. So, several fixes. StringIO is used because cStringIO doesn't have encoding attribute.
* Clean up file permissions and umask settings. Now we set the umask to 007 | bwarsaw | 2007-01-05 | 1 | -20/+8
  during early initialization so that we're guaranteed to get the right value regardless of the shell umask used to invoke the command line script. While we're at it, we can remove almost all individual umask settings previously in the code, and make file permissions consistently -rw-rw---- (IOW, files are no longer other readable).

  The only subsystem that wasn't changed was the archiver, because it uses its own umask settings to ensure that private archives have the proper permissions. Eventually we'll mess with this, but if it ain't broken...

  Note that check_perms complains about directory permissions, but I think check_perms can be fixed (or perhaps, even removed?!). If we decide to use LMTPRunner and HTTPRunner exclusively then no outside process will be touching our files potentially with the incorrect permissions, umask, owner, or group. If we control all of our own touch points then I think we can lock out 'other'.

  Another open question is whether Utils.set_global_password() can have its umask setting removed. It locks permissions down so even the group can't write to the site password file, but the default umask of 007 might be good enough even for this file.

  Utils.makedirs() now takes an optional mode argument, which defaults to 02775 for backward compatibility. First, the default mode can probably be changed to 02770 (see above). Second, all code that was tweaking the umask in order to do a platform compatible os.mkdir() has now been refactored to use Utils.makedirs().

  Another tricky thing was getting SQLite via SQLAlchemy to create its data/mailman.db file with the proper permissions. From the comment in dbcontext.py:

    # XXX By design of SQLite, database file creation does not honor
    # umask.  See their ticket #1193:
    # http://www.sqlite.org/cvstrac/tktview?tn=1193,31

  More details in that file, but the workaround is to essentially 'touch' the database file if 'sqlite' is the scheme of the SQLAlchemy URL. This little pre-touch sets the right umask honoring permission and won't hurt if the file already exists. SQLite will happily keep the existing permissions, and in fact that ticket referenced above recommends doing things this way.

  In the Mailman.database.initialize(), create a global lock that prevents more than one process from entering this init function at the same time. It's probably not strictly necessary given that I believe all the operations in dbcontext.connect() are multi-processing safe, but it also doesn't seem to hurt and prevents race conditions regardless of the database's own safeguards (or lack thereof).

  Make sure nightly_gzip.py calls initialize().
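  The 'pre-touch' workaround can be sketched like this. It is a minimal illustration in Python 3 on a POSIX system, not the code in dbcontext.py; the function name pretouch() and its umask parameter are assumptions made for the example.

```python
import os


def pretouch(path, umask=0o007):
    """Create `path` if it is missing so that it gets umask-honoring permissions.

    Hypothetical helper: the name and signature are not taken from Mailman.
    """
    old_umask = os.umask(umask)
    try:
        # Append mode creates the file when absent and never truncates an
        # existing one, so an existing database file is left untouched.
        with open(path, "a"):
            pass
    finally:
        # Restore whatever umask the process had before.
        os.umask(old_umask)
```

  With a umask of 007 a newly created file ends up as -rw-rw----, which matches the permissions this commit standardizes on; a file that already exists keeps its current mode.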
* - Switchboard.py Added missing call to create error logger. | msapiro | 2006-07-22 | 1 | -1/+3
* Added robustness to Switchboards and Runners so that if a runner crashes | bwarsaw | 2006-07-16 | 1 | -6/+32
  uncleanly (e.g. segfaults the Python interpreter), messages being processed will not be lost. The vulnerability, ideas, and patches are credited to Richard Barrett and Mark Sapiro. Their original work was modified by Barry for this commit and any bugs are his fault.

  The basic idea is that instead of unlinking a .pck file in dequeue(), the file is renamed to a .bak file. The Switchboard grows a finish() method which then unlinks the .bak file. That class's constructor also grows a 'restore' argument (defaulting to false), which when true moves all .bak files it finds in its hash space to .pck, thereby restoring a file lost while "in flight".

  This relies on the fact that even with multiple qrunners, exactly one process will be responsible for one hash space slice, so it's never possible (under normal operation) for a .bak file to be renamed to .pck by some other process.

  Test cases for both the new Switchboard behavior and the use of that by Runner subclasses have been added.

  There are two things to watch out for, either of which may require some additional changes. There is some small potential to duplicate messages in various queues, if say 'mailmanctl' were improperly started more than once by a site admin. This usually won't happen unless an admin is overly eager with the mailmanctl -s switch, so we can chalk this one up to operator error. I'm not sure what more we can do about that.

  There's also a possibility that if we're processing a message that continually causes the Python interpreter to crash, we could end up duplicating messages endlessly. This is especially troublesome for the Outgoing runner which could conceivably cause a mail flood. I consider this the more critical issue to defend against, probably by adding a numbering scheme to the .bak file names and refusing to restore a .bak file more than say 3 times without human intervention.
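  The rename-then-finish protocol can be sketched as below. This is a deliberately tiny Python 3 model, not the Switchboard class from this commit: the class name TinySwitchboard is an assumption, callers pass the filebase explicitly, and hash-space slicing is left out entirely.

```python
import os
import pickle


class TinySwitchboard:
    """Hypothetical model of the .pck -> .bak -> unlink life cycle."""

    def __init__(self, queue_dir, restore=False):
        self.queue_dir = queue_dir
        os.makedirs(queue_dir, exist_ok=True)
        if restore:
            # Put back anything that was "in flight" when a runner died.
            for name in os.listdir(queue_dir):
                base, ext = os.path.splitext(name)
                if ext == ".bak":
                    os.rename(self._path(base, ".bak"), self._path(base, ".pck"))

    def _path(self, filebase, ext):
        return os.path.join(self.queue_dir, filebase + ext)

    def enqueue(self, filebase, payload):
        with open(self._path(filebase, ".pck"), "wb") as fp:
            pickle.dump(payload, fp)

    def dequeue(self, filebase):
        # Rename instead of unlink, so a crash after this point loses nothing.
        backup = self._path(filebase, ".bak")
        os.rename(self._path(filebase, ".pck"), backup)
        with open(backup, "rb") as fp:
            return pickle.load(fp)

    def finish(self, filebase):
        # Only now is the entry really gone.
        os.remove(self._path(filebase, ".bak"))
```

  A runner would call dequeue(), process the entry, and then call finish(); if the process dies in between, constructing the switchboard with restore=True on the next start makes the entry visible again. That is also why a message that always crashes the interpreter would be retried without limit in this sketch.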
* - Switchboard.py - Closed very tiny holes at the upper ends of queue | msapiro | 2006-07-09 | 1 | -3/+10
  slices that could result in unprocessable queue entries. Improved FIFO processing when two queue entries have the same timestamp.
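  One way to carve the SHA-1 digest space into slices without leaving holes at the upper ends is sketched below. This is an illustration of the arithmetic only, under the assumption that slices are contiguous integer ranges over the 160-bit digest space; the names SHAMAX and slice_bounds() are chosen for the example and the committed code may differ.

```python
# Largest value a 160-bit SHA-1 digest can take when read as an integer.
SHAMAX = 2 ** 160 - 1


def slice_bounds(slice_index, numslices):
    """Return the inclusive (lower, upper) digest range for one slice.

    Hypothetical helper: computing both ends from SHAMAX + 1 makes adjacent
    slices meet exactly, so no digest value falls between two slices.
    """
    lower = ((SHAMAX + 1) * slice_index) // numslices
    upper = ((SHAMAX + 1) * (slice_index + 1)) // numslices - 1
    return lower, upper
```

  Because each upper bound is one less than the next slice's lower bound, every digest from 0 through SHAMAX belongs to exactly one slice, including the very top of the last one.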
* Massive conversion process so that Mailman can be run from a user specified | bwarsaw | 2006-07-08 | 1 | -3/+4
  configuration file. While the full conversion is not yet complete, everything that seems to be required to run mailmanctl, qrunner, rmlist, and newlist has been updated.

  Basically, modules should no longer import mm_cfg, but instead they should import Mailman.configuration.config. The latter is an object that's guaranteed to exist, but not guaranteed to be initialized until some top-level script calls config.load(). The latter should be called with the argument to -C/--config which is a new convention the above scripts have been given.

  In most cases, where mm_cfg.<variable> is used config.<variable> can be used, but the exceptions are where the default value must be available before config.load() is called. Sometimes you can import Mailman.Defaults and get the variable from there, but other times the code has to be changed to work around this limitation. Take each on a case-by-case basis.

  Note that the various directories calculated from VAR_PREFIX, EXEC_PREFIX, and PREFIX are now calculated in config.py, not in Defaults.py. This way a configuration file can override the base directories and everything should work correctly.

  Other changes here include:
  - mailmanctl, qrunner, and update are switched to optparse and $-strings, and changed to the mmshell architecture
  - An etc directory has been added to /usr/local/mailman and a mailman.cfg.sample file is installed there. Sites should now edit an etc/mailman.cfg file to do their configurations, although the mm_cfg file is still honored. The formats of the two files are identical.
  - list_lists is given the -C/--config option
  - Some coding style fixes in bin/update, but not extensive
  - Get rid of nested scope hacks in qrunner.py
  - A start on getting EmailBase tests working (specifically test_message), although not yet complete.
* - Convert all logging to Python's standard logging module. Get rid of all | bwarsaw | 2006-04-17 | 1 | -2/+1
  traces of our crufty old Syslog. Most of this work was purely mechanical, except for:

  1) Initializing the loggers. For this, there's a new module Mailman/loginit.py (yes all modules from now on will use PEP 8 names). We can't call this 'logging.py' because that will interfere with importing the stdlib module of the same name (can you say Python 2.5 and absolute imports?). If you want to write log messages both to the log file and to stderr, pass True to loginit.initialize(). This will turn on propagation of log messages to the parent 'mailman' logger, which is set up to print to stderr. This is how bin/qrunner works when not running as a subprocess of mailmanctl.

  2) The driver script. I had to untwist the StampedLogger stuff and implement the printing of exceptions and such to log/error differently, because standard logging objects don't have a write() method. So we write to a cStringIO and then pass that to the logger.

  3) SMTPDirect.py because of the configurability of the log messages. This required changing SafeDict into a dict subclass (which is better than using UserDicts anyway -- yay Python 2.3!). It's probably still possible to flummox things up if you change the name of the loggers in the SMTP_LOG_* variables in mm_cfg.py. However, the worst you can do is cause output to go to stderr and not go to a log file.

  Note too that all entry points into the Mailman system must call Mailman.loginit.initialize() or the log output will go to stderr (which may occasionally be what you want). Currently all CGIs and qrunners should be working properly.

  I wish I could have tested all code paths that touch the logger, but that's infeasible. I have tested this, but it's possible that there were some mistakes in the translation.

  - Mailman.Bouncers.BounceAPI.Stop is a singleton, but not a class instance any more.
  - True/False code cleanup, PEP 8 import restructuring, whitespace normalization, and copyright year updates, as appropriate.
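  The propagation switch described in point 1 can be sketched with the standard logging module. This is only an illustration of the mechanism, not Mailman/loginit.py: the sub-logger names are assumptions, and the per-log file handlers that the real module would attach are omitted.

```python
import logging
import sys


def initialize(propagate=False, names=("mailman.error", "mailman.qrunner")):
    """Hypothetical stand-in for loginit.initialize().

    The parent 'mailman' logger prints to stderr; each sub-logger forwards
    its records to that parent only when `propagate` is true.
    """
    parent = logging.getLogger("mailman")
    if not parent.handlers:
        parent.addHandler(logging.StreamHandler(sys.stderr))
    for name in names:
        # A real setup would also attach a file handler to each sub-logger.
        logging.getLogger(name).propagate = propagate
    return parent
```

  Calling initialize(True) corresponds to the "log file and stderr" mode mentioned above; initialize() leaves stderr out of it.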
* Now that Python 2.3 is the minimum requirement for Mailman 2.2: | bwarsaw | 2006-04-15 | 1 | -11/+4
  - Remove True/False binding cruft
  - Remove __future__ statements for nested scopes
  - Remove ascii_letters import hack from Utils.py
  - Remove mimetypes.guess_all_extensions import hack from Scrubber.py
  - In Pending.py, set _missing to object() (better than using [])

  Also, update copyright years where appropriate, and re-order imports more to my PEP 8 tastes. Whitespace normalize.
* back porting from 2.1.6 | tkikuchi | 2005-08-28 | 1 | -226/+39
* FSF office has moved. checking in for MAIN branch. | tkikuchi | 2005-08-27 | 1 | -1/+1
* MarshalSwitchboard._ext_write(), ASCIISwitchboard._ext_write(): | bwarsaw | 2003-10-10 | 1 | -6/+2
  Promote SYNC_AFTER_WRITE to a Defaults.py/mm_cfg.py variable after all.
* SYNC_AFTER_WRITE: New flag which controls whether os.fsync() is called | bwarsaw | 2003-09-12 | 1 | -1/+23
  on the file descriptor for data files after they've been written. We always flush the buffer, but sync'ing can be a huge performance hit. Still, some sites might want to enable this for additional data integrity. It's disabled by default.

  enqueue(): Return the newly calculated filebase. This lets the shunt message in Runner.py display the correct (new) filename.
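  The flush-always, sync-optionally behaviour can be sketched as follows. This is a minimal Python 3 illustration; the flag name comes from the commit message, but the helper write_data() and its signature are assumptions made for the example.

```python
import os

# Named after the flag in the commit message; disabled by default.
SYNC_AFTER_WRITE = False


def write_data(path, data, sync=SYNC_AFTER_WRITE):
    """Write bytes to `path`, flushing always and fsync'ing only on request."""
    with open(path, "wb") as fp:
        fp.write(data)
        # Push Python's userspace buffer down to the operating system.
        fp.flush()
        if sync:
            # Ask the OS to commit its own buffers to disk; this is the
            # step that can be slow on a busy queue.
            os.fsync(fp.fileno())
```

  Sites that prefer integrity over throughput would pass sync=True (or flip the flag) and accept the extra latency per file.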
* _Switchboard.__init__(): Fixed slice end-point off-by-one calculation error. | bwarsaw | 2003-04-19 | 1 | -3/+3
* Copyright years | bwarsaw | 2003-04-02 | 1 | -1/+1
* dequeue(): If we can't read a .db file, initialize data to the empty | bwarsaw | 2003-04-02 | 1 | -0/+1
  dictionary, since the following test expects to do a has_key() on the value. Closes SF bug #707608. Backport candidate.
* dequeue(): "rejection-notice" (i.e. the dash) doesn't play nice when | bwarsaw | 2002-11-19 | 1 | -0/+6
  METADATA_FORMAT = METAFMT_ASCII. Because of patch #567288 by Maximillian Dornseif, we have to upgrade the schema of any metadata files (i.e. rejection-notice -> rejection_notice).
* whichq(): New method which returns the queue directory. | bwarsaw | 2002-03-11 | 1 | -0/+3
* dequeue(): Handle the case when the email package throws a | bwarsaw | 2002-01-26 | 1 | -1/+17
  MessageParseError during parsing of the message. Most likely cause is because of bad MIME encapsulation. /That's/ likely caused by the message being a virus <wink>. Disposition depends on QRUNNER_SAVE_BAD_MESSAGES. When true, the message text is saved in qfiles/bad. When false, it is discarded. We always log the error.
* DumperSwitchboard: New class to provide a public interface for reading | bwarsaw | 2001-10-01 | 1 | -0/+10
  .db files, and to allow argument-less construction. This is used by bin/dumpdb.
* Convert from mimelib to email. | bwarsaw | 2001-10-01 | 1 | -3/+3
  Also, use cStringIO directly instead of our own hack-around StringIO module.
* MarshalSwitchboard._ext_read(): If the .db file is of version 2, just | bwarsaw | 2001-08-15 | 1 | -0/+3
  delete the `filebase' key. That should be enough to update it to version 3.
* MarshalSwitchboard._ext_read(): Add a little comment to the | bwarsaw | 2001-07-19 | 1 | -0/+3
  `mysterious' eval() call.
* Only eval() a float's repr() back to a float if the marshal actually | twouters | 2001-07-19 | 1 | -1/+2
  contained it.
* dequeue(): Be a bit more robust about possible exceptions, ensuring | bwarsaw | 2001-07-19 | 1 | -12/+14
  that the msgfp file is definitely closed.
* 'UnimplementedError' -> 'NotImplementedError' | twouters | 2001-07-10 | 1 | -2/+2
* Performance enhancement. In order to save on the time it takes to | bwarsaw | 2001-06-27 | 1 | -18/+35
  parse and generate the plain text message each time it's enqueued or dequeued, we now use a binary cpickle as the message representation file. This is controlled by a module global SAVE_MSGS_AS_PICKLES which defaults to 1. When a message is saved as a pickle, its extension will be .pck instead of .msg. dequeuing will automatically recognize the different message formats and load accordingly.

  Specific changes include:

  enqueue(): If SAVE_MSGS_AS_PICKLES is set and the message metadata does not have a "_plaintext" key, then the message object is stored as a binary cpickle dump for speed. The incoming scripts post, join, leave, etc. will set the _plaintext key because they funnel the text straight from stdin to the .msg file, and the message text is only parsed on the first dequeue from qfiles/in.

  dequeue(): First try to load and unpickle the .pck file, falling back to loading and parsing the .msg text file if the former is missing.
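  The .pck/.msg split can be sketched as below. This is a Python 3 illustration using the standard pickle and email modules rather than cPickle and the Mailman message classes; the function names save_message() and load_message() are assumptions made for the example, and the module-level SAVE_MSGS_AS_PICKLES switch is left out.

```python
import os
import pickle
from email import message_from_string


def save_message(dirpath, filebase, msg, metadata):
    """Store `msg` as plain text if '_plaintext' is set, else as a pickle."""
    if "_plaintext" in metadata:
        # Text funneled straight from stdin: write it out without parsing.
        path = os.path.join(dirpath, filebase + ".msg")
        text = msg if isinstance(msg, str) else msg.as_string()
        with open(path, "w") as fp:
            fp.write(text)
    else:
        # An already parsed message object: pickle it to skip re-parsing.
        path = os.path.join(dirpath, filebase + ".pck")
        with open(path, "wb") as fp:
            pickle.dump(msg, fp, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_message(dirpath, filebase):
    """Try the .pck file first, falling back to parsing the .msg file."""
    try:
        with open(os.path.join(dirpath, filebase + ".pck"), "rb") as fp:
            return pickle.load(fp)
    except FileNotFoundError:
        with open(os.path.join(dirpath, filebase + ".msg")) as fp:
            return message_from_string(fp.read())
```

  Either way the caller gets a message object back, so code downstream of the dequeue does not need to know which representation was on disk.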
* Better syslog() calling conventions. | bwarsaw | 2001-06-27 | 1 | -1/+1
* dequeue(): Since IOError and OSError are both derived from | bwarsaw | 2001-05-18 | 1 | -1/+1
  EnvironmentError, it makes for slightly cleaner and more efficient code to catch the common base class rather than a tuple containing the two derived classes.
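  The same idea in present-day Python is sketched below. Note the assumption: since Python 3.3, IOError and EnvironmentError are plain aliases of OSError, so the single class to catch is OSError; the helper read_or_none() is hypothetical and only illustrates the pattern.

```python
def read_or_none(path):
    """Return the bytes in `path`, or None if it cannot be opened or read."""
    try:
        with open(path, "rb") as fp:
            return fp.read()
    except OSError:
        # One except clause covers what used to be IOError and OSError.
        return None
```

  Under Python 2, which this commit targeted, the equivalent clause would name EnvironmentError instead.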
* This file was never published in yr 2000. | bwarsaw | 2001-05-14 | 1 | -1/+1
* enqueue(): The `received_time' metadata (which is set here once, but | bwarsaw | 2001-05-14 | 1 | -22/+71
  only if it has no prior value), is encoded into the file name so that we can guarantee FIFO order on the processed files. We can't encode the received time in the file attributes because there isn't enough precision (and I suspect that stat'ing all those files will be too much of a disk I/O drain). Instead, the filebase is composed of the string representation of the current time in float seconds, the symbol `+', and the SHA1 hexdigest of a hash of the uniquifying data. This makes it easy and quick to decode received time for FIFO sorting, but retains the "random" digest for bitrange slicing. Note that the received_time metadata value is never changed once it's set so the first part of the filebase will remain unchanged as it moves between queues (while the hexdigest will almost definitely change on each queue move).

  dequeue(): Be more robust about missing .msg or .db files when the other exists (usually, it'll be the .msg file that's missing). Return None for either the msg or data part of the 2-tuple return value, where None means "missing".

  files(): Utilize the new file naming convention to break apart the file name and sort the files in FIFO order, while still retaining the bitrange random hash feature.

  MarshalSwitchboard: All Python versions up to and including Python 2.1 have a bug in the marshal representation of binary floating point numbers. Specifically, it loses precision that Mailman requires. The solution in this class is to have a hardcoded list of known float attributes, convert them to strings via repr() before marshaling the dictionary, and convert them back to floats -- via a safe eval() -- when reading the marshal back from file.
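  The filebase convention and the FIFO sort it enables can be sketched as follows. This is a minimal Python 3 illustration; the function names and the exact bytes fed to SHA-1 are assumptions made for the example, not the code from this commit.

```python
import hashlib
import time


def make_filebase(unique_data, received_time=None):
    """Build '<float seconds>+<sha1 hexdigest>' for a queue entry.

    `unique_data` is whatever bytes make the entry unique (assumed here to
    be supplied by the caller).
    """
    if received_time is None:
        received_time = time.time()
    digest = hashlib.sha1(unique_data).hexdigest()
    return repr(received_time) + "+" + digest


def fifo_order(filebases):
    """Sort filebases by the received time encoded before the '+'."""
    def received(filebase):
        when, _, _digest = filebase.partition("+")
        return float(when)
    return sorted(filebases, key=received)
```

  The digest after the `+` stays available for deciding which slice of the hash space an entry belongs to, while the part before it gives the ordering without any stat() calls.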
* intermediate | bwarsaw | 2001-02-15 | 1 | -0/+230