Zope sessions have had problems over their lifetime. I'm aiming to fix all of them in Zope 2.7.3. Or at least the last one. ;-)
Almost all of these problems have occurred under higher-than-average load or in timing-dependent ways, which makes them hard to reproduce. Various people (mainly Michael Dunstan and me) have been on the trail for the last six to eight months or so. In that time, several important fixes have been made to various pieces of the Zope 2 sessioning machinery and to parts of the machinery it depends on. Here's a roundup of the bugs and when they were (or will be) fixed, for those interested.
v2.7.1 b1
- Ground-up simplification rewrite of Transience.
- KeyErrors were reported as coming from TemporaryStorage's _load method; these were due to a bug in TS's in-band
garbage collection.
- TransientObjects (sessions) may have lost changes because their __setitem__, __delitem__, update, and clear methods
did not signal to the ZODB persistence machinery that they had been changed.
- TemporaryStorage was not usable under a ZEO server.
- Zope's transaction behavior was flawed (standard_error_message was called "between" two transactions). Symptom: KeyError.
- Sessions may have timed out a bit earlier than expected (usually by up to 20 seconds).
- Added a "knob" to transient object containers that lets an admin adjust the "timeout resolution". Setting this
higher makes sessioning do fewer "writes" at the expense of session timeout accuracy.
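The lost-changes item above comes down to a general ZODB rule: mutating a plain (non-persistent) dict held by a persistent object is invisible to the persistence machinery unless the object sets _p_changed itself. Here is a minimal sketch of that pattern. PersistentStandIn and SessionData are hypothetical names made up for illustration; the stand-in only tracks the flag so the example runs without ZODB installed, and none of this is the actual Transience code.

```python
class PersistentStandIn:
    """Stand-in for persistent.Persistent (assumption: only the flag matters here)."""
    _p_changed = False


class SessionData(PersistentStandIn):
    """Hypothetical dict-like session object, not the real TransientObject."""

    def __init__(self):
        self._container = {}  # plain dict: ZODB cannot see changes made inside it

    def __getitem__(self, key):
        return self._container[key]  # reads do not mark the object as changed

    def __setitem__(self, key, value):
        self._container[key] = value
        self._p_changed = True  # tell the persistence machinery to save us

    def __delitem__(self, key):
        del self._container[key]
        self._p_changed = True

    def update(self, other):
        self._container.update(other)
        self._p_changed = True

    def clear(self):
        self._container.clear()
        self._p_changed = True


session = SessionData()
session["cart"] = ["book"]
print(session._p_changed)
```

Without the flag assignments, a write like session["cart"] = ... would change the in-memory dict but the object would never be registered as modified, so the change could vanish at the next cache flush or restart.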
v2.7.1 b2
- onDelete scripts might have been called arbitrarily later than they were supposed to be.
v2.7.2 RC1
- NameErrors could be generated when an onAdd or onDelete script could not be found.
In current CVS (should be out with 2.7.3)
- Zope's transaction behavior was flawed, round II (fix: explicitly abort the transaction if its commit does not succeed;
in some cases, ZODB's Transaction.begin() would not abort all jars involved in the transaction). This could
lead to application-level database inconsistency. Symptom: KeyError.
- SystemErrors were reported as being raised by SessionDataManager apparently due to a very expensive __len__
method being called in the course of testing an object for truth.
- If sessioning activity was idle for longer than 15 * session_timeout_seconds seconds, and two or more threads thereafter
simultaneously attempted to access a page that used sessions, a KeyError could be raised from Transience in
all but one of those threads ("no current bucket" bug).
- Conflict resolution of TransientObjects and Length objects (used in TOC) may be incorrect. Possible symptom: incorrect
session data, insanely wrong length count on the transient object container.
- Banish use of _p_independent until I become smart enough to use it. Possible symptom: KeyError
- Investigate TO.invalidate() for sanity. Symptom: Unknown.
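The __len__ item above is easy to trip over in any Python code: testing an object for truth falls back to its __len__ method when no dedicated truth method is defined, so an innocent-looking "if obj:" can trigger an expensive count. A small sketch of the difference follows. ExpensiveContainer and both lookup functions are hypothetical names for illustration, not SessionDataManager code.

```python
class ExpensiveContainer:
    """Hypothetical container whose length is costly to compute."""

    def __init__(self, items):
        self._items = list(items)
        self.len_calls = 0  # counts how often __len__ was invoked

    def __len__(self):
        self.len_calls += 1  # stands in for an expensive walk over all items
        return len(self._items)


def lookup_truthy(container):
    # Truth test: with no truth method defined, Python calls __len__.
    if container:
        return "found"
    return "missing"


def lookup_identity(container):
    # Identity test: never calls __len__, and an empty container still counts.
    if container is not None:
        return "found"
    return "missing"


c = ExpensiveContainer([])
print(lookup_truthy(c), c.len_calls)
print(lookup_identity(c), c.len_calls)
```

Note that the two tests also disagree on an empty container: the truth test treats it as missing, while the identity test only asks whether an object was returned at all.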
Post-2.7.3
- Don't do in-band garbage collection or replenishment. Implies the inclusion of a scheduler in Zope.