| Commit message (Collapse) | Author | Age | Files | Lines |
|
counts) are pretty damned important, we should always log them. __writelog()
now takes an 'important' argument, and writes to the logfile even if it
wasn't requested.
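
A minimal sketch of the 'important' override described above. The class name, the stream argument, and the sample messages are hypothetical; only the idea that important messages bypass the logging switch comes from the commit message.

```python
import io
import time


class LoggingLock:
    """Hypothetical stand-in for the lock class; only the logging path is shown."""

    def __init__(self, logstream, withlogging=False):
        self._logstream = logstream
        self._withlogging = withlogging

    def _writelog(self, msg, important=False):
        # Important messages are written even when logging was not requested.
        if self._withlogging or important:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            self._logstream.write(f"{stamp} {msg}\n")


stream = io.StringIO()
lock = LoggingLock(stream, withlogging=False)
lock._writelog("routine refresh")                        # dropped: logging is off
lock._writelog("unexpected linkcount", important=True)   # written regardless
```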
|
trace when we do logging.
|
two LockFile instances on the same file, both would report that they
owned the lock. This is a bug because unlocking the one that
shouldn't have had the lock will unlock both.
__init__(): To fix this for now we keep a class attribute counter
which is incremented each time a new LockFile instance is created.
This wouldn't work in a multithreaded environment, but we don't have
the problem. ;)
__repr__(): Add a little more information to the repr.
refresh(): When raising the NotLockedError, add a little useful
debugging string as the exception value.
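
One way the per-instance counter could keep two instances apart is sketched below. Folding the counter into the claim-file name is an assumption made for illustration; the commit message only says that a class attribute counter is incremented for each new instance.

```python
import os


class CountedLock:
    """Hypothetical sketch: each instance gets a distinct claim-file name."""

    _counter = 0  # class attribute shared by all instances

    def __init__(self, lockfile):
        CountedLock._counter += 1  # not thread-safe, as the message notes
        self._lockfile = lockfile
        # The pid alone would collide for two instances in one process,
        # so the per-instance counter is folded into the name (assumption).
        self._tmpfname = f"{lockfile}.{os.getpid()}.{CountedLock._counter}"

    def __repr__(self):
        return f"<CountedLock {self._lockfile} claim={self._tmpfname}>"


a = CountedLock("/tmp/example.lock")
b = CountedLock("/tmp/example.lock")
```

With distinct claim names, the two instances no longer both appear to own the same lock.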
|
lock acquisition in mailmanctl. Specifically,
_take_possession(): In the while-loop we need to not only check for
linkcount <> 2, but also that the contents of the lock file are our
magic token and not some other process's. This fixes the race
condition.
_disown(): Set the private __owned flag to false so __del__ won't try
to finalize, and thus unlink the lock files. This fixes the reference
counting problem in mailmanctl when both the failing non-forced lock
and the re-acquired force lock point to the same files.
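
The combined check in _take_possession() might look roughly like the helper below. The function name, the token strings, and the file layout are hypothetical; the sketch only shows that a link count of 2 is not enough unless the file also holds our own token.

```python
import os
import tempfile


def claim_is_ours(lockfile, token):
    """Return True only if lockfile has two links AND holds our token."""
    try:
        nlink = os.stat(lockfile).st_nlink
        with open(lockfile) as fp:
            contents = fp.read().strip()
    except OSError:
        return False
    return nlink == 2 and contents == token


workdir = tempfile.mkdtemp()
lockfile = os.path.join(workdir, "demo.lock")
claim = os.path.join(workdir, "demo.lock.claim")
with open(claim, "w") as fp:
    fp.write("our-token")
os.link(claim, lockfile)  # the global lock file becomes a hard link to the claim

ours = claim_is_ours(lockfile, "our-token")
theirs = claim_is_ours(lockfile, "someone-else")
```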
|
ownership to its forked children. You shouldn't normally use these
methods, but to make mailmanctl maximally convenient, we need to
create a lock in a parent and manage the lock in the child.
_transfer_to(): Called in the parent, this twiddles with lock
internals to set the temp file name to include the pid of the child
(passed in). It also transfers lock claims to the new temp file,
i.e. the child, and removes the claim made by the parent (the previous
winner).
_take_possession(): Called by the child, this resets the temp file
name to include the pid of the child. Until then, the name still holds
the pid of the parent process from before the fork.
__del__(): Don't finalize() the lock if we don't own it (ownership is
implied by creation, and relinquished by _transfer_to()). This way,
when the parent transfers ownership to the child and then exits with
sys.exit(), the act of destroying the lock instance in the parent
won't unlock the lock.
__init__(): Initialize __owned to 1.
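
The ownership flag can be illustrated with the sketch below. It models only the flag: the temp-file renaming done by the real _transfer_to() is omitted, and the class and attribute names are hypothetical.

```python
class OwnedLock:
    """Hypothetical sketch of the ownership flag; file handling is omitted."""

    def __init__(self):
        self._owned = True       # ownership is implied by creation
        self.finalized = False

    def _transfer_to(self, pid):
        # The real method also rewrites the temp file name for `pid`;
        # only the ownership change is modelled here.
        self._owned = False

    def finalize(self):
        self.finalized = True    # stands in for unlocking and unlinking

    def __del__(self):
        if self._owned:
            self.finalize()


parent = OwnedLock()
parent._transfer_to(12345)   # hypothetical child pid
parent.__del__()             # called directly only to observe the effect

keeper = OwnedLock()
keeper.__del__()             # still owned, so it finalizes
```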
|
EnvironmentError, it makes for slightly cleaner and more efficient
code to catch the common base class rather than a tuple containing the
two derived classes.
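
As a sketch of the same idea in current Python: since Python 3.3, IOError and EnvironmentError are both aliases of OSError, so a single except clause still covers both cases. The helper name and the missing-file scenario are hypothetical.

```python
import errno
import os
import tempfile


def read_or_none(path):
    """Return the file contents, or None if the file does not exist."""
    try:
        with open(path) as fp:
            return fp.read()
    except EnvironmentError as e:  # alias of OSError in Python 3
        if e.errno != errno.ENOENT:
            raise
        return None


missing = os.path.join(tempfile.mkdtemp(), "no-such-lock")
result = read_or_none(missing)
```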
|
__write() can fail with an EPERM. It seems to be a very rare
occurrence, only happens from cron, and (only?) on Solaris 2.6. I'm
too nervous to change this for 2.0.
|
it, and in fact our alter-ego owns the file (e.g. the cgi process or
shell account uid if we're mail). The utime() call will fail with an
EPERM which wasn't being caught. Catch it here and return 0 since we
obviously don't own the lock.
TBD: should other calls to __touch() be similarly wrapped?
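
A rough sketch of catching the EPERM from utime() follows. The function name and the injectable `touch` parameter are assumptions added so the failure can be simulated without a second uid; the real method is not shown in this log.

```python
import errno
import os
import tempfile


def owns_lock(path, touch=os.utime):
    """Return 1 if we can touch the file, 0 if utime() fails with EPERM."""
    try:
        touch(path, None)
    except OSError as e:
        if e.errno != errno.EPERM:
            raise
        return 0   # someone else owns the file, so we don't own the lock
    return 1


def denied(path, times):
    # Simulates utime() on a file owned by a different uid.
    raise PermissionError(errno.EPERM, "Operation not permitted")


fd, own_file = tempfile.mkstemp()
os.close(fd)
ours = owns_lock(own_file)
not_ours = owns_lock("/hypothetical/path.lock", touch=denied)
```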
|
global-lock followed by temp-lock. This means it is once again
illogical for the lock linkcount to be anything other than 0 or 2.
Specifically,
lock(): Remove the elif test for linkcount == 1.
unlock(), __break(): First remove the global lock file, followed by
the temp lock file.
|
lock(): It is an expected condition that the global linkfile can have
a linkcount of 1. This can arise due to a race condition between
proc1 that is in the middle of unlocking, and proc2 which is trying to
acquire the lock. In this case, we simply log the condition and try
again later.
Also, if we're already locked, we should not unlink our tmpfname.
unlock(): We don't need to wrap self.locked() in a try/except.
__break(): Updated the comment at the head of the method based on
Harald's observation. Touching the lockfile doesn't eliminate the
race condition, just makes it more unlikely.
Also, we don't need to wrap the self.__read() in a try/except.
Finally, elaborate the unit test suite to use /dev/random if it exists
(i.e. on Linux).
|
mailman-developers mailing list and independently with Thomas Wouters,
with contributions by Harald Meland. This new implementation, with
related other checkins, should improve reliability and performance for
high volume sites.
See the thread
http://www.python.org/pipermail/mailman-developers/2000-May/002140.html
for more details. User visible changes include:
- StaleLockFileError is removed.
- The constructor no longer takes a sleep_interval argument.
- LockFile.steal() has been removed.
- LockFile.get_lifetime() has been added.
- LockFile.refresh() and .unlock() take an additional argument,
`unconditionally' (defaults to 0).
- Cleaner finalization implementation.
This file can also be run as a command line script to exercise unit
testing.
|
message to the logfile. Hopefully, this will help us figure out the
mysterious "command_time_out" errors we're getting from Postfix.
|
self.__writelog().
|
lock. The ctor now takes an optional `withlogging' option which, if
true, turns on logging for this lock. Default is no logging.
|
claimant crashes before cleaning up its hardlink claim.
lock(): There's no reliable way to clean up stale hardlink lockfile
claims programmatically, so the best we can do is signal the error and
let a human fix the problem. This happens when we've stolen the lock,
fail the 2-link test, but happen to be the lockfile winner.
|
Most important: a semantic change. When a lock is acquired, an
additional float value is written to the lock file. This is the
lock's expected lifetime, as an instant at some point in the future
(i.e. time.time() + lifetime). This allows processes to give a clue
to other claimants as to how long the lock holder intends to keep the
lock. This is necessary because the same resource may need to be
locked by short lived and long lived processes (e.g. the archiver).
Without this, the short lived process has no idea when the lock owner
should have given up the lock and could steal it out from under it.
It is possible that a process could continually refresh a lock (see
below) in an infloop, thus causing all other claimants to back off
forever. This is (I believe) much less likely than that a process
crashes, leaving a lock turd around.
Rename this file to LockFile.py (retain flock.py for CVS
archival purposes, but soon all references to this module will be
changed to use LockFile instead of flock). The main class in this
module is also named LockFile.
All the exceptions have been changed to class-based. LockError is the
base exception class. AlreadyCalledLockError is changed to
AlreadyLockedError but NotLockedError and TimeOutError are not
changed.
New public methods set_lifetime() and refresh(). The former sets the
lifetime interval so that the same lock object can be reused with a
different lock lifetime. The latter refreshes the lock lifetime for
an already held lock. A process can use this if it suddenly realizes
it needs more time to complete its work.
"hung_timeout" is renamed to lifetime, because of the changed semantics.
All private attributes have been __renamed.
Docstrings everywhere!
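
The lifetime semantics can be sketched as follows. The on-disk format (a token followed by a float) and both function names are assumptions; the message only states that an expiry instant, time.time() + lifetime, is written to the lock file.

```python
import time


def format_claim(token, lifetime, now=None):
    """Build lock-file contents: the token plus the expiry instant."""
    now = time.time() if now is None else now
    return f"{token} {now + lifetime!r}"


def claim_expired(contents, now=None):
    """True once the holder's advertised lifetime has passed."""
    now = time.time() if now is None else now
    _token, expiry = contents.rsplit(None, 1)
    return now > float(expiry)


claim = format_claim("tok", 15, now=1000.0)      # expires at 1015.0
still_held = not claim_expired(claim, now=1010.0)
stealable = claim_expired(claim, now=1020.0)
```

A refresh() would then amount to rewriting the file with a later expiry instant.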
|
TimeOutError, we want to unlink our tmpfname first, so we don't
potentially leave our linkfile hanging around forever.
The second change adds a `stolen' flag which gets set to 1 when we
determine a link is stale, and should eliminate the occasional assert
errors Ken is seeing. My analysis is included in the comment in the
code, but I still wonder if the logic here isn't still flawed.
|
into the lock file. USE WITH CAUTION. Necessary because in the new
news/mail gating code, the parent acquires the lock, and if
successful, the child will steal it from the parent and then unlock it
when done. This should be safe, with a very small race window, if
any.
|
locking problems. JV please eyeball...
1. Got rid of is_locked attribute. Test for locking makes an explicit
test that a) the tmpfname exists and b) the pid read from the file
is our pid. This means that two instances of FileLock in the same
process that share the same lockfile will always be locked and
unlocked together (probably doesn't occur except in testing).
2. Added an __del__() which unlocks
3. Moved the creation of the basic lockfile to __kickstart() and added
a force argument. When force is true (default is false), the
lockfile is first removed then recreated. This is only used if the
lockfile contains bogus data, such that the winner could not be
determined. In that case, the only way to "break" the lock is to
recreate a new basic lockfile. This could potentially leave
tmpfname turds, but in reality this shouldn't happen very often.
4. Encapsulate reading and writing of the lockfile into __write() and
__read(), so that (most importantly) the umask will be set
correctly. This should fix most of the permission problems
associated with the lockfile by assuring that they are group
writable. A side effect: don't eval() the contents of the
lockfile.
5. lock(): Use the above changes. Added an assert for what I think is
an invariant: that the winner filename better not be equal to
self.tmpfname. Moved the timeout check into an else clause of the
pid comparison and eliminated the check of hung_timeout > 0 (this
is the one I'm least sure about).
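
Point 4 above might be sketched like this. The two-field file format and the function names are hypothetical; the sketch shows only the two ideas named in the message: setting the umask around the write so the file stays group writable, and parsing the contents explicitly instead of calling eval().

```python
import os
import tempfile


def write_claim(path, pid, winner):
    """Write 'pid winner' with a umask that leaves group write enabled."""
    old = os.umask(0o002)
    try:
        with open(path, "w") as fp:
            fp.write(f"{pid} {winner}\n")
    finally:
        os.umask(old)   # always restore the caller's umask


def read_claim(path):
    """Parse the file explicitly; the contents are never eval()'d."""
    with open(path) as fp:
        pid, winner = fp.read().split(None, 1)
    return int(pid), winner.strip()


path = os.path.join(tempfile.mkdtemp(), "demo.lock")
write_claim(path, os.getpid(), "demo.lock.claim")
pid, winner = read_claim(path)
```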
|
Since there are still problems with locking, I added a little better
diagnostic to the one os.unlink() exception I've been getting a lot.
This should be removed when fixed.
|
a lock at the same time. This *may* have been Barry's problem.
It should at least fix it (the os.error one), even if I'm not 100%
sure of how it is happening.
|
and a variable.
|
code that could easily be independent of the Mailman library, I use
the Python std lib naming conventions of all lowercase method names
instead of the mixed-case way in which most of Mailman is done.
|