path: root/Mailman/Archiver
Commit message | Author | Age | Files | Lines
...
* __init__(): Squirrel away the MailList instance so we can set it on | bwarsaw | 2002-12-02 | 1 | -1/+4
  unpickled Articles.
  getArticle(): Do just that.
* setListIfUnset(): Hack to set self._mlist if it isn't already set. | bwarsaw | 2002-12-02 | 1 | -12/+36
  This is needed to fully fix SF bug #644294. Old archived Articles won't have the _mlist attribute but we really want that for better handling of page creation. Unfortunately, only the HyperDatabase knows this value so this is a public method it can call after the fact.
  as_html(): Fix address obscuring in the article pages. If ARCHIVER_OBSCURES_EMAILADDRS is true, then we'll `at-ify' the visible author text and point the mailto: to the list's posting address.
  HyperArchive.__init__(): Pass the maillist to the HyperDatabase constructor.
  write_index_entry(): If ARCHIVER_OBSCURES_EMAILADDRS, `at-ify' the author field in the index. This might catch non-addresses with @'s in them, but so what? For the most part, it'll moderately obscure email addresses.
  __processbody_URLquote(): Futile attempt at code cleaning for a method that I don't think ever gets called. :/
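The `at-ify' step mentioned above could look roughly like the sketch below. The helper name atify() and the " at " replacement text are assumptions made for illustration; the commit message does not show the actual code.

```python
import re


# Hypothetical helper: the name and the default replacement are assumptions.
def atify(text, replacement=" at "):
    """Replace every '@' so that addresses are harder to harvest.

    As the commit message notes, this also rewrites '@' characters that are
    not part of an email address.
    """
    return re.sub("@", replacement, text)
```

For instance, atify("barry@example.org") would produce "barry at example.org", while text without an "@" passes through unchanged.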
* quick_maketext(): Copy the body of the try: from Utils.maketext() for | bwarsaw | 2002-11-18 | 1 | -8/+14
  completeness (this really needs to be refactored).
  Article.as_html(): Set the encoding of the page to the list's preferred encoding, which seems to make the headers and bodies come out okay when mixing languages -- at least as in the minimal testing I've been able to do. Woo boy, I'm really not positive about this change and would love a second opinion. Martin?
* Simplification and fixes for i18n in the archives. This is related to | bwarsaw | 2002-11-18 | 1 | -62/+22
  SF patch #634303, but I think it's better because it actually removes code instead of adding it. We'll see what kind of holes Martin shoots through my optimism. ;)
  Specifically,
  unicode_quote() -> Utils.uquote() everywhere
  quick_maketext(): We can get rid of the `raw' argument since we're never going to wrap html text here. Also, put in a defense for lang=None and mlist=None (although in practice that doesn't happen). Finally -- and this is the key -- pass the interpolated text through Utils.uncanonstr() so that what we get back will be a unicode, with any characters outside the lang's charset html-ified. I think this simplifies matters by always encoding the indices to the charset of the list's preferred language, at the expense of potentially larger pages (which I don't care about).
  html_TOC(): Add a http-equiv meta tag indicating the charset of this page, which should fix display for the table of contents. Note that the various archtoc.html templates need updating but I will check those changes in next.
  HyperArchive.add_article(), .choose_charset(), .update_dirty_archives(): Removed.
* decode_charset(): We no longer need the EncWord module, since | bwarsaw | 2002-11-12 | 1 | -16/+23
  email.Header should be sufficient. This change uses decode_header() from email.Header instead of EncWord.decode(). The EncWord.py module will be removed.
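A minimal sketch of decoding an RFC 2047 header with decode_header(), written for current Python, where the module is spelled email.header rather than email.Header. The function name decode_header_text() and the ASCII fallback for unknown charsets are assumptions for illustration, not the behaviour of decode_charset() itself.

```python
from email.header import decode_header


# Hypothetical helper; decode_charset() in the commit may behave differently.
def decode_header_text(value, fallback="ascii"):
    """Return a header value as text, decoding any encoded-words in it."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or fallback, "replace")
            except LookupError:
                # Unknown charset name: fall back instead of failing.
                chunk = chunk.decode(fallback, "replace")
        parts.append(chunk)
    return "".join(parts)
```

Headers without encoded-words come back from decode_header() as plain strings and are passed through as-is.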
* ArchiveMail(): Get rid of the try/bare-except wrapper around the | bwarsaw | 2002-11-12 | 1 | -23/+15
  mailbox processing. It's good enough for the caller of this code to do the exception handling.
* __getstate__(), __setstate__(): Be slightly more defensive. | bwarsaw | 2002-11-07 | 1 | -6/+14
* as_html(): Add a link to the listinfo page for the mailing list. | bwarsaw | 2002-11-04 | 1 | -0/+2
* Whitespace normalization and pycheckerfication. | bwarsaw | 2002-11-04 | 1 | -113/+111
* Fixes for terrible performance hits after the i18n patches were | bwarsaw | 2002-11-04 | 1 | -22/+79
  applied. The problems were twofold:
  - The _mlist attribute of Article objects was being pickled, causing the *-article pickles to be around 10x too big.
  - The article template files were being opened and re-read for each article.
  Fixes include pickling only a reference to the mailing list (i.e. its internal name), and caching both the MailList objects and the template files.
  Specifically,
  quick_maketext(): Front-end to the template cache, this has the same signature as Utils.maketext(), but it caches as much of the results as possible.
  Article._open_list(): A MailList cache, stolen from Runner.py. Can you say "re-factor"?
  Article.__getstate__(): Don't pickle the _mlist instance, instead, return an __listname key which contains the internal name of the mailing list.
  Article.__setstate__(): Watch for __listname and call _open_list() to get a real MailList object.
  Everywhere: Use quick_maketext() instead of Utils.maketext() to take advantage of the template cache.
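The "pickle only a reference" fix described above can be sketched as follows. The MailList and Article classes here are simplified stand-ins and open_list() is a hypothetical cache function; only the __getstate__()/__setstate__() pattern and the __listname key come from the commit message.

```python
# Simplified stand-ins for Mailman's classes; their attributes are assumptions.
class MailList:
    def __init__(self, name):
        self.name = name
        self.big_state = "x" * 100_000  # stands in for heavy list configuration


_list_cache = {}


def open_list(name):
    """Return a cached MailList, creating it only on first use."""
    if name not in _list_cache:
        _list_cache[name] = MailList(name)
    return _list_cache[name]


class Article:
    def __init__(self, subject, mlist):
        self.subject = subject
        self._mlist = mlist

    def __getstate__(self):
        state = self.__dict__.copy()
        # Drop the heavy object and keep only the list's internal name.
        mlist = state.pop("_mlist", None)
        state["__listname"] = mlist.name if mlist is not None else None
        return state

    def __setstate__(self, state):
        state = dict(state)
        listname = state.pop("__listname", None)
        self.__dict__.update(state)
        # Re-attach a real list object from the cache, if a name was stored.
        self._mlist = open_list(listname) if listname else None
```

The pickle module calls these two hooks automatically, so the stored state carries a short string instead of the whole list object, and every restored Article for the same list shares one cached MailList.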
* Get rid of all the manipulations of the MailList object's lock file. | bwarsaw | 2002-10-19 | 1 | -6/+2
  HyperArch should have nothing to do with it.
* Add a work around for the default small stack size on MacOSX. | bwarsaw | 2002-10-18 | 1 | -0/+19
  Apparently pipermail tickles a well-known bug on OSX related to deeply recursive regular expressions.
* __init__(): Somehow the archiver was unlocking the list at an odd | bwarsaw | 2002-10-15 | 1 | -1/+1
  time. I can't figure out why that's happening but defaulting unlock to false seemed to take care of the problem... for now. :(
* Whitespace normalization and some Pychecker cleanup. | bwarsaw | 2002-10-09 | 2 | -9/+7
* ArchiveMail(): processUnixMailbox() no longer takes an article class | bwarsaw | 2002-10-09 | 1 | -1/+1
  argument.
* _makeArticle(): Override base class method so that we create a | bwarsaw | 2002-10-09 | 1 | -1/+6
  HyperArch.Article instead -- which takes two extra constructor arguments.
* Whitespace normalization and some Pychecker cleanup. | bwarsaw | 2002-10-09 | 1 | -345/+340
* processUnixMailbox(): Small but useful refactoring. Don't pass the | bwarsaw | 2002-10-09 | 1 | -4/+6
  article class as an argument. Instead call out to an override-able method _makeArticle() to return the article instance. This will be overridden in HyperArch.py.
  Bump __version__ while we're at it.
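The refactoring above is the factory-method pattern: the processing loop asks an overridable hook for the article object instead of receiving a class as an argument. The sketch below uses simplified, assumed signatures; only the method names processUnixMailbox() and _makeArticle() and the two extra arguments (lang and mlist) are taken from the log.

```python
class Article:
    def __init__(self, message, sequence):
        self.message = message
        self.sequence = sequence


class Archive:
    """Base archiver: knows how to loop, not which article class to use."""

    def __init__(self):
        self.sequence = 0
        self.articles = []

    def _makeArticle(self, message, sequence):
        # Overridable hook; subclasses return their own article type.
        return Article(message, sequence)

    def processUnixMailbox(self, messages):
        for message in messages:
            self.articles.append(self._makeArticle(message, self.sequence))
            self.sequence += 1


class HyperArticle(Article):
    # Takes two extra constructor arguments, as the log describes.
    def __init__(self, message, sequence, lang, mlist):
        super().__init__(message, sequence)
        self.lang = lang
        self.mlist = mlist


class HyperArchive(Archive):
    def __init__(self, mlist, lang="en"):
        super().__init__()
        self.mlist = mlist
        self.lang = lang

    def _makeArticle(self, message, sequence):
        return HyperArticle(message, sequence, self.lang, self.mlist)
```

With this shape the base loop never needs to know about the extra arguments, which is presumably why the class argument could be dropped.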
* CheckHTMLArchiveDir(): Indentation-o pointed out by Dan Mick. | bwarsaw | 2002-10-08 | 1 | -1/+1
* Integrating SF patch #594771, i18n'ified pipermail. | bwarsaw | 2002-10-08 | 1 | -368/+411
  sizeof(): Grows a lang argument.
  Class Article:
  - grows attributes _lang, _mlist
  - default self.charset to the server language's charset
  - constructor grows a lang and mlist argument
  - new __setstate__() method for backcompat with older (pickled) archives
  - as_html(): use i18n.ctime()
  - in many places, use standard language templates. All inlined tqs templates in this file have been removed.
  - volNameToDesc(): new
  Code cleanup. Whitespace normalization.
* Whitespace normalization. | bwarsaw | 2002-10-08 | 1 | -16/+16
* ArchiveMail(): Remove the corrupt archive log message. I think this | bwarsaw | 2002-10-08 | 1 | -2/+0
  just confuses people and the traceback is enough information.
* processUnixMailbox(): We don't need to catch MessageParseError since | bwarsaw | 2002-08-29 | 1 | -9/+5
  the base class's .next() method will never raise them (its factory masks them). However, the end condition is now that "m is None" instead of m being false.
* as_html(): Patch #546362 by Nicholas Russo to include the Subject: and | bwarsaw | 2002-08-23 | 1 | -0/+2
  In-Reply-To: information in archive mailto urls. Closes bug #443952 reported by Stig Hackvan.
* processUnixMailbox(): When an uncaught exception occurs during | bwarsaw | 2002-08-22 | 1 | -0/+6
  iteration, provide some more useful information in the logs/error file.
* QuoteHyperChars() -> websafe() | bwarsaw | 2002-05-22 | 1 | -2/+2
  Also, use Utils.websafe() consistently throughout, instead of the inconsistent calls to cgi.escape().
* Integrating a patch from Martin v. Loewis related to i18n; quoting: | bwarsaw | 2002-05-04 | 1 | -38/+86
  With this, you should be able to observe the following effects:
  - when reading the mailbox in current mailman, the index will be windows-1257; there will be lots of garbage MIME text
  - when applying my patch, the utf-8 and iso-8859-1 parts of it will become readable. Japanese and Korean text (in the name of two message authors) will remain obscure.
  - when making available the Japanese MIME charset names, the Japanese name will become readable (to those which can read Japanese, that is)
  - when adding the Korean codecs, the Korean name will also become readable
  - in all cases, the subject encoded x-mvl will remain MIME garbage.
* Archiver.InitVars(): Don't write the emptyarchive.html file to | bwarsaw | 2002-05-03 | 1 | -8/+17
  index.html unless there isn't already an index.html file there!
* Article.__init__(): Use the Reply-To: address as the self.email only | bwarsaw | 2002-04-29 | 1 | -4/+6
  as a last resort, if no valid email address could be retrieved from the From: line. Previously it would always defer to the Reply-To: address with unintended side effects. Closes SF bug #224274.
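The From:/Reply-To: precedence described above might be sketched like this. The function name pick_author_email() is hypothetical, and treating "contains an @" as the validity check is an assumption; the commit message does not say how validity is decided.

```python
from email import message_from_string
from email.utils import parseaddr


# Hypothetical helper; the "@" test is an assumed stand-in for "valid address".
def pick_author_email(msg):
    """Prefer the From: address; use Reply-To: only as a last resort."""
    _, addr = parseaddr(msg.get("From", ""))
    if "@" not in addr:
        _, addr = parseaddr(msg.get("Reply-To", ""))
    return addr


msg = message_from_string(
    "From: Alice <alice@example.org>\nReply-To: list@example.org\n\nbody\n"
)
author = pick_author_email(msg)  # the From: address wins here
```

Only when From: yields nothing usable does the Reply-To: address get picked.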
* processUnixMailbox(): Ignore any MessageParseErrors in either the | bwarsaw | 2002-04-03 | 1 | -0/+7
  skipping or processing loops. They usually mean MIME boundary problems, which legit email rarely has.
* Update copyright years. | bwarsaw | 2002-03-16 | 3 | -3/+3
* processUnixMailbox(): Add optional start and end arguments so that we | bwarsaw | 2002-03-16 | 1 | -3/+16
  can fast forward past a bunch of messages in the mailbox. This will be used by bin/arch to process a huge .mbox file in chunks (manually). Not fully tested, but it seems to work.
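One way to fast forward past messages with optional start and end arguments is itertools.islice, shown in the sketch below. The function name process_mailbox(), the handle callback, and the half-open [start, end) semantics are all assumptions; the commit does not state whether end is inclusive.

```python
from itertools import islice


# Hypothetical stand-in for the processing loop; semantics are assumed.
def process_mailbox(messages, handle, start=None, end=None):
    """Feed messages in the half-open range [start, end) to handle().

    Returns the number of messages handled.
    """
    count = 0
    # islice consumes and discards the first `start` items, then stops at `end`.
    for message in islice(messages, start, end):
        handle(message)
        count += 1
    return count
```

Processing a large mailbox in chunks then amounts to repeated calls with consecutive ranges, such as 0 to 1000, then 1000 to 2000.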
* GetBaseArchiveURL(): Marc MERLIN suggests to tack on the trailing | bwarsaw | 2002-03-07 | 1 | -1/+4
  slash to avoid an http redirect and second query. However, just add it if it's not already there.
* Bump copyright years. | bwarsaw | 2002-03-05 | 1 | -1/+1
* Patches to help localize where file system layout decisions are made. | bwarsaw | 2002-03-05 | 2 | -7/+4
  Don't use PUBLIC_ARCHIVE_FILE_DIR and PRIVATE_ARCHIVE_FILE_DIR directly, use the archive_dir() method or the Site module methods instead.
* Remove the import of MailList; this isn't used anywhere in this file. | bwarsaw | 2002-03-01 | 1 | -1/+0
* InitVars(), ArchiveMail(): The start of some cleanups flagged by | bwarsaw | 2002-02-11 | 1 | -4/+0
  Pychecker. Removed some unused local variables.
* InitVars(): We have to wrap each os.mkdir() in its own try/except so | bwarsaw | 2002-01-04 | 1 | -5/+8
  that the failure of one won't prevent the attempt of the other. This fixes a problem reported by Dan Buchmann where upgrades fail if there are pre-MM2.1a4 lists that have not received a posting before the upgrade to MM2.1a4.
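A rough illustration of giving each os.mkdir() its own try/except, so that one directory already existing does not stop the other from being created. The function name make_archive_dirs(), the default mode, and the choice to tolerate only EEXIST are assumptions; the commit message does not say which errors are ignored.

```python
import errno
import os


# Hypothetical helper; only "one try/except per mkdir" comes from the log.
def make_archive_dirs(*paths, mode=0o2775):
    """Create each directory independently of the others."""
    for path in paths:
        try:
            os.mkdir(path, mode)
        except OSError as exc:
            # Assumed policy: an existing directory is fine, anything else is not.
            if exc.errno != errno.EEXIST:
                raise
```

Had both calls shared a single try block, an EEXIST from the first mkdir would have skipped the second one entirely.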
* T.__init__(): Add comment by Marc MERLIN. | bwarsaw | 2002-01-02 | 1 | -1/+4
* Add a hack, based on Marc MERLIN's patch that provides /something/ for | bwarsaw | 2002-01-02 | 1 | -1/+15
  the archive url before the first message has been posted to the list.
  InitVars(): Write an initial index.html file which just states that the archive is currently empty. Template comes from emptyarchive.html.
* GetArchLock(), DropArchLock(): Get rid of posixfile locking, which is | bwarsaw | 2001-12-24 | 1 | -16/+7
  deprecated in Python 2.2, in favor of our good ol' standard LockFile.
* processUnixMailbox(): Add some useful diagnostics which only get | bwarsaw | 2001-11-30 | 1 | -0/+4
  printed in VERBOSE mode (i.e. running bin/arch). Print out a message counter and the Message-ID: of the message being processed.
* write_index_entry(): Slight modification so that the subject and | bwarsaw | 2001-10-28 | 1 | -3/+2
  author line are always html-escaped. This fixes index summaries for messages with html in their Subject: line for instance.
* Tokio Kikuchi says: | bwarsaw | 2001-10-26 | 1 | -9/+8
    Some time ago, someone complained about the pipermail not representing proper charset in the Content-Type header. Here is a patch for the latest CVS (2.1a).
  With some changes by Barry, specifically to get the charset parameter out of the Content-Type: header using email.Message's interface instead of regexp searching. Please double check this for me!
* GetBaseArchiveURL(): PUBLIC_ARCHIVE_URL should be an absolute url | bwarsaw | 2001-10-26 | 1 | -0/+4
  containing a %(hostname)s interpolation string. This gets filled in with the list's hostname, which is calculated from a reverse lookup in VIRTUAL_HOSTS using self.host_name as the key. Failing that, DEFAULT_URL_HOST is used.
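The hostname interpolation above could be sketched as follows. The example values, the function name get_base_archive_url(), and the assumption that VIRTUAL_HOSTS maps URL hosts to email hosts (so the reverse lookup goes from email host back to URL host) are all illustrative guesses rather than details confirmed by the log.

```python
# Assumed example configuration; real values live in the site's settings.
VIRTUAL_HOSTS = {"www.example.org": "lists.example.org"}  # url host -> email host
DEFAULT_URL_HOST = "www.default.example"
PUBLIC_ARCHIVE_URL = "http://%(hostname)s/pipermail"


# Hypothetical free-function version of the method named in the commit.
def get_base_archive_url(host_name):
    """Fill %(hostname)s using a reverse lookup keyed on the list's host_name."""
    reverse = {email_host: url_host for url_host, email_host in VIRTUAL_HOSTS.items()}
    hostname = reverse.get(host_name, DEFAULT_URL_HOST)
    return PUBLIC_ARCHIVE_URL % {"hostname": hostname}
```

A list whose host_name is missing from the mapping falls back to DEFAULT_URL_HOST, matching the "failing that" clause.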
* processUnixMailbox(): Instantiate an ArchiverMailbox instead of a | bwarsaw | 2001-10-24 | 1 | -4/+8
  UnixMailbox. The main difference being we pass in the MailList instance to the constructor, so all the magic of the message scrubber can work.
* De-string-module-ification. | bwarsaw | 2001-10-15 | 1 | -37/+33
  import cPickle as pickle
  Convert to use email package so we're dealing with one kind of Message object.
* De-string-module-ification. | bwarsaw | 2001-10-15 | 1 | -38/+40
  Also, remove pickle import since it doesn't seem to be used anywhere.
* GetBaseArchiveURL(): The semantics of PUBLIC_ARCHIVE_URL are changed | bwarsaw | 2001-10-10 | 1 | -1/+3
  to be a template into which %(listname)s will be interpolated. This makes it possible to set PUBLIC_ARCHIVE_URL to something that's appropriate for (hopefully) any external archiver.
* Use cStringIO directly instead of our own hack-around StringIO | bwarsaw | 2001-10-01 | 1 | -2/+1
  module. Also, we don't need the string module.