A while ago I put up a proposal here that detailed the basis for a Zope scheduling system. An implementation of the ideas in this proposal is now complete on the chrism-scheduling-branch in the Zope CVS repository.
The implementation has two features: a load indicator and a clock.
The load indicator allows application code to get a rough sense of how "busy" the system is. There is a single load indicator per Zope instance and it is called like so:
from Lifetime import load_avg
load = load_avg(30)
In the above example, load will be a float between 0 and 1
indicating how busy Zope's asyncore-based mainloop has been over the
last 30 seconds. You may replace 30 with any number up to 900, as
the current implementation only keeps 15 minutes worth of statistical
data around (though this can of course be changed). I changed the
control panel UI to show the load average over the last 1, 5 and 15
minutes, which is pretty handy given that there has typically been
no real cross-platform way to get a sense of just how bogged down a
Zope instance is. As an entity unto itself, the load indicator is
very useful, but it ended up being not so useful for the actual
scheduling subsystem. It's just not a good indicator for scheduling
because although it can tell you what has been happening, it can't
predict what will happen next, so there's no way to make use of the
actual load data for scheduling purposes. That said, I think it's
useful and it should go into the HEAD.
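To give a feel for what a windowed load figure like this involves, here is a minimal sketch. It is not the code on the branch: the LoadTracker class, its record() method and the one-sample-per-second assumption are all made up for illustration. It only shows the idea of keeping 15 minutes of samples and averaging the most recent N seconds.

```python
from collections import deque


class LoadTracker:
    """Hypothetical stand-in for a load indicator (not the Lifetime module)."""

    def __init__(self, max_seconds=900):
        # Assumption: one sample per second, so 900 samples is 15 minutes.
        # The deque silently drops the oldest sample once it is full.
        self._samples = deque(maxlen=max_seconds)

    def record(self, busy_fraction):
        # busy_fraction is the share of the last second the mainloop spent
        # doing work; clamp it into the range 0.0 to 1.0.
        self._samples.append(min(1.0, max(0.0, float(busy_fraction))))

    def load_avg(self, seconds):
        # Only as much history as we keep can be averaged.
        if not 1 <= seconds <= self._samples.maxlen:
            raise ValueError("seconds must be between 1 and %d"
                             % self._samples.maxlen)
        recent = list(self._samples)[-seconds:]
        if not recent:
            return 0.0
        return sum(recent) / len(recent)
```

Asking for a longer window than there is data for simply averages whatever samples exist, which seemed like the least surprising behavior for a "rough sense" indicator.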
While the load indicator ended up exactly as I proposed, the clock ended up being a little more and a little less than I had originally proposed. I wound up implementing a "clock server", which is an honest-to-god ZServer server that internally generates faux HTTP requests on a regular (configurable) basis. To make use of a clock server, you may specify in the Zope configuration a <clock-server> section which contains the methodname that is called, the authentication credentials used for requests, and the interval at which the method should be called. More than one clock server can be run on a per-instance basis. This ends up essentially being a built-in way of doing what people have been doing for a long time: setting up wget+cron to poke a Zope method every so often. Here is an example of a <clock-server> directive in the zope config file that calls a method named "/do_stuff" every 60 seconds under the credentials of the administrator user:
<clock-server>
method /do_stuff
period 60
user admin
password password
</clock-server>
This actually ends up working great. If you do check it out, take a look at the Z2.log and see the requests it generates. Had it been around for a while, it would have saved me a lot of hassle on several projects that had requirements for scheduled tasks to run on a regular basis. On those projects, I wound up writing a simple external XML-RPC clock that just called a method every so often, but the external clock is just one more thing to put in the rc scripts of the system, and just another thing that can fail, and it ends up being surprisingly hard to remember to set up during rollout to customers. ;-)
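For the curious, the kind of request a clock like this would synthesize can be pictured with a small sketch. This is an illustration only, not the clock server's code: the build_clock_request function, the use of HTTP/1.0, the Host and User-Agent values, and the assumption that credentials travel as HTTP Basic auth are all guesses made for the example.

```python
import base64


def build_clock_request(method, user=None, password=None, host="localhost"):
    """Build the text of a faux HTTP GET request for a method path.

    Hypothetical helper; the header values are illustrative assumptions.
    """
    lines = [
        "GET %s HTTP/1.0" % method,
        "Host: %s" % host,
        "User-Agent: clock-sketch",
    ]
    if user is not None and password is not None:
        # HTTP Basic auth: base64 of "user:password".
        raw = ("%s:%s" % (user, password)).encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        lines.append("Authorization: Basic %s" % token)
    # A blank line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"
```

Calling build_clock_request("/do_stuff", "admin", "password") every 60 seconds would correspond to the directive shown above.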
In the original proposal, I asserted that the clock service should implement a registration interface, and that other app code could subscribe to the clock service to be notified every so often. This isn't done yet, but it is absolutely necessary. Not all code called by the clock needs or wants to go through the pain of traversing to a Zope method in order to perform its scheduled duties. For instance, the use case that is driving me to write any of this code does not require traversal at all: session garbage collection.
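Since the registration interface doesn't exist yet, the following is only a sketch of the shape it might take. The ClockService name, the subscribe() and tick() signatures, and the choice to pass the current time in explicitly are all assumptions made for illustration.

```python
class ClockService:
    """Hypothetical clock that notifies subscribers directly, with no HTTP."""

    def __init__(self):
        # Each entry is [callback, period, next_due].
        self._subscribers = []

    def subscribe(self, callback, period, now=0.0):
        # The first notification is due one full period after subscribing.
        self._subscribers.append([callback, period, now + period])

    def tick(self, now):
        """Call every subscriber whose period has elapsed; return the count."""
        fired = 0
        for entry in self._subscribers:
            callback, period, next_due = entry
            if now >= next_due:
                callback()
                entry[2] = now + period
                fired += 1
        return fired
```

A session housekeeping routine would then do something like clock.subscribe(collect_stale_sessions, period=60), where collect_stale_sessions is whatever plain Python callable does the work; no REQUEST and no credentials are involved.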
Currently, the Zope sessioning garbage collection happens in-band: when the sessioning machinery is exercised by user app code, it figures out whether it needs to expire "stale" session objects and do other housekeeping duties. Unfortunately, this work happens at precisely the wrong time due to its in-band nature: it happens when the system is definitely otherwise in use. It would be much smarter to do this work when the system is not being used or is only lightly used. Pushing this work off to an out-of-band process would also simplify the Zope sessioning code tremendously. Additionally, because the default sessioning stuff is ZODB-based, the potential for conflicts is very high when the housekeeping happens in-band. There is a subtle bug somewhere in the sessioning code that I think is tickled only when a conflict error occurs and only when session housekeeping is happening. A number of people have reported this to me (the symptom is that sessions "disappear"), and I sadly have not been able to track it down. So rather than bang my head up against the walls of this already-dodgy strategy, I've decided to fix it the brute force way, which also happens to be the right way, by creating a low-level scheduling system that the sessioning code can subscribe to in order to be notified on a regular basis; housekeeping will no longer need to happen during an actual HTTP request. Instead it will happen in a separate thread, and only when the system is under light usage. But I don't need to or want to go through the overhead of generating an HTTP request to call a housekeeping method, because I actually don't want the housekeeping method exposed to anybody on the outside (it's dangerous to do so), and I have no idea how to protect the method from invocation without knowing the right credentials, which could be anything. So, I just want to call the method directly: it doesn't require any auth credentials or a REQUEST or any of the other "Zope stuff" that many other things need.
I don't think it's a reasonable long-term solution to fire up a single separate "maintenance" thread at Zope startup that is responsible for just doing session garbage collection. It's sloppy and short-sighted for one, because there are other kinds of tasks that need to happen on a regular basis as well. And if there are other tasks that need to happen, it's likely that each task will at some point need its own thread, because if it performs an action which blocks, other scheduled tasks of the same nature might not be able to run in a reasonable amount of time. Not providing a general solution for this at the Zope level implies that every specific kind of scheduled task might be responsible for maintaining its own thread (or pool of threads). Since doing threading right is hard, it makes sense to generalize this to make it easier for coders who need to do scheduled tasks every so often.
Now, ZServer is interfaced with Zope through calls to a class named ZServerPublisher. There is a fixed-size pool of ZServerPublisher threads that actually do the computational work implied by a request which makes its way into Zope. These threads are preallocated at Zope startup time, and when a request comes in, ZServer chooses a publisher thread from the pool, blocking if they're all in use until one becomes freed up. Once ZServer has handed the request off to a ZServerPublisher thread, it just goes about its natural business of listening for and responding to requests in its mainloop, forgetting the handoff ever happened. The thread itself actually does the work implied by the request. When it's done, it returns itself to the thread pool's waiting queue.
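The handoff pattern described above can be approximated in a few lines. This is a generic worker-pool sketch, not ZServer's code: the PublisherPool class and its handoff() and wait() methods are invented names, and unlike the description above, pending requests here wait in a queue rather than making the caller block.

```python
import queue
import threading


class PublisherPool:
    """Hypothetical fixed-size pool of worker threads fed from a queue."""

    def __init__(self, size=4):
        self._requests = queue.Queue()
        # Threads are created up front, mirroring preallocation at startup.
        self._threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(size)
        ]
        for thread in self._threads:
            thread.start()

    def _run(self):
        while True:
            handler, args = self._requests.get()
            try:
                handler(*args)
            except Exception:
                # A real server would log this; the worker must survive.
                pass
            finally:
                self._requests.task_done()

    def handoff(self, handler, *args):
        # The caller returns immediately and goes back to its mainloop.
        self._requests.put((handler, args))

    def wait(self):
        # Block until every handed-off request has been processed.
        self._requests.join()
```

The point of the sketch is the division of labor: the mainloop only enqueues, and the worker threads do the actual computation.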
I can imagine a different pool of threads. I'll call the threads which live in this pool "maintenance threads". This pool of threads would also be allocated at Zope startup time. But instead of receiving notifications of pending requests from ZServer, it would receive notifications of pending requests from a singleton scheduler service. The scheduler service has not been written, of course, but it would be a consumer of the clock service. Every so often, the clock would tickle the scheduler, and the scheduler would wake up and check a pending list of tasks. If tasks were waiting to be performed, the scheduler would send each task along to the maintenance thread pool manager. The manager would ensure that each task was provided with its own thread under which it could perform its duties. Pretty simple. The issue with this is that the task-checking needs to happen fast because it would be run in the context of the main thread, blocking other requests until it finishes.
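Here is a rough sketch of how the scheduler and the maintenance pool might fit together. None of this exists yet, so every name is hypothetical: MaintenanceScheduler, add_task() and on_tick() are invented, and Python's ThreadPoolExecutor stands in for the maintenance thread pool manager (it creates its threads lazily rather than at startup, which differs from the description above).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MaintenanceScheduler:
    """Hypothetical scheduler that is tickled by the clock."""

    def __init__(self, pool_size=2):
        # Stand-in for the maintenance thread pool manager.
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self._pending = []
        self._lock = threading.Lock()

    def add_task(self, func, *args):
        with self._lock:
            self._pending.append((func, args))

    def on_tick(self):
        """Hand every pending task to the pool; return their futures.

        This runs in the caller's thread, so it only swaps a list and
        submits work; the tasks themselves run on pool threads.
        """
        with self._lock:
            tasks, self._pending = self._pending, []
        return [self._pool.submit(func, *args) for func, args in tasks]

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Keeping on_tick() down to a list swap and a few submit() calls is the whole trick, since it is the part that would run in the context of the main thread.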
So, given that, I think I should do the following:
None of this is terribly difficult to implement (although it will be time-consuming). One challenge is to make the schedule lookups fast enough to happen in the context of the mainloop.
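One way the lookup could be kept cheap is to store pending tasks in a heap ordered by due time, so the common case of "nothing is due yet" costs a single comparison against the earliest entry. This is just one candidate approach, not a decision: the PendingTasks class and its add() and pop_due() methods are hypothetical.

```python
import heapq
import itertools


class PendingTasks:
    """Hypothetical pending-task list ordered by due time."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so two tasks due at the same time are returned in
        # insertion order and the task objects are never compared.
        self._counter = itertools.count()

    def add(self, due, task):
        heapq.heappush(self._heap, (due, next(self._counter), task))

    def pop_due(self, now):
        """Remove and return every task whose due time has passed."""
        due = []
        # The earliest task is always at index 0, so this loop exits
        # after one comparison when nothing is due.
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

Adding a task costs O(log n) and each due task costs O(log n) to remove, which should be small next to the work the tasks themselves do.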
What would be nice in the context of all this scheduling talk is to be able to suspend a Python thread temporarily while another thread runs and finishes. This could be used to provide "fair" access to system resources based on task priorities (I think the computer science term for it would be "preemptive"). For instance, if the session garbage collection thread is doing its thing, and an HTTP request comes in, it would probably make sense to suspend the garbage collector while the request does its work. This would make the system appear to be more responsive to its actual users. I don't think there is a way to do this, though, using normal Python API functions for threading. Any hackery to try to do such a thing at the OS level wouldn't be a reasonable candidate for inclusion into Zope because it needs to run on such a wide variety of platforms (not all of which I even have, not to mention know anything about). If anyone has ideas about this, I'm all ears, however.
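For what it's worth, something weaker than preemption is possible with the standard threading module if the maintenance task cooperates: it can check a threading.Event between units of work and wait while the event is cleared. The sketch below is only that, a cooperative pause; the PausableTask class and its method names are hypothetical, and a task that blocks in the middle of a unit of work cannot be interrupted this way.

```python
import threading


class PausableTask:
    """Hypothetical task that pauses voluntarily between work items."""

    def __init__(self):
        self._may_run = threading.Event()
        self._may_run.set()  # start in the "allowed to run" state

    def pause(self):
        # Takes effect the next time the task finishes an item.
        self._may_run.clear()

    def resume(self):
        self._may_run.set()

    def run(self, items, process):
        for item in items:
            # Blocks here while paused; returns at once otherwise.
            self._may_run.wait()
            process(item)
```

The request-handling side would call pause() when a request arrives and resume() when it is finished, so how responsive this feels depends entirely on how small each unit of work is.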