I haven't noticed any slowdown due to DirectoryStorage, but our usage patterns are read-heavy and write-light.
For that usage, as far as I can tell, the impact of DirectoryStorage is dwarfed by the impact of ZEO across a network. Once stuff is in the ZEO client cache, it doesn't really matter what the underlying storage is anyway, unless you are doing heavy writes.
But performance is not why we switched to DS. We frequently had FileStorage corruption (some of which was, AFAICT, caused by since-fixed bugs). Nightly runs of fsrefs and fstest frequently mailed me problems that would mysteriously go away after a pack. Many of these were never explained. Frankly, after a couple of years of sporadic hell, i.e. every few months wasting a day searching for causes and solutions to nasty POSKeyErrors, we just had little confidence in the system. Since moving to DS on the same hardware, we have had exactly zero storage issues in the past year and a half. Not one, ever.
Also I don't think repozo existed back then, and I was longing for an incremental backup feature, which DS has out of the box.
However, your other comments are spot-on. If you have a big (several GB) DirectoryStorage and don't pack it for a while, it can take all night to make a full backup, and longer to pack. Even with frequent (weekly in our case) packing, a 3 GB storage takes about an hour to back up on our (somewhat old) hardware.
DS also has some other nice features. The replication feature can be handy. You can't do warm failover, but you can pretty cheaply maintain a ready-to-go backup storage on a remote system and use that for cold failover. I'm not aware of any free way to do that with FileStorage.
cold failover and corruption
Posted by jens at 2005-07-22 04:14 PM
As far as corruption goes, I can only echo Tim Peters, who says he has not seen a real ZODB corruption due to code bugs in a long time. Neither have I, even on the large-scale CMS systems I helped build and maintain at ZC. The only database for which I have seen error messages in the log (with no visible problems on the user side) is zope.org. This should have been solved by moving to a new ZODB when the old zope.org software was migrated to the new one; alas, it wasn't done that way...
I think you can get a pretty rapid failover by using repozo. It's not expensive to run against the database, so it is perfectly viable to run it, e.g., once an hour, so that you never lose more than one hour's worth of changes.
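To make the hourly idea concrete, here is a minimal sketch of a wrapper you could call from cron. The helper names, the paths, and the "full backup in the midnight hour" policy are my own assumptions for illustration; only the repozo options (-B for backup, -r for the repository directory, -f for the Data.fs, -F to force a full backup, -z to gzip) are repozo's own. As far as I know, repozo falls back to a full backup by itself when an incremental one is not possible, so the -F policy is optional.

```python
import subprocess
from datetime import datetime


def repozo_backup_command(data_fs, repo_dir, full=False, repozo="repozo"):
    """Build a repozo backup command line (hypothetical helper).

    -B selects backup mode, -z gzips the backup files,
    -r names the backup repository, -f names the FileStorage file.
    """
    cmd = [repozo, "-B", "-z", "-r", repo_dir, "-f", data_fs]
    if full:
        # -F forces a full backup instead of an incremental one.
        cmd.append("-F")
    return cmd


def wants_full_backup(now):
    """Assumed policy: force a full backup during the midnight hour."""
    return now.hour == 0


def run_hourly_backup(data_fs, repo_dir, now=None):
    """Run one backup; intended to be called once an hour from cron."""
    now = now or datetime.now()
    cmd = repozo_backup_command(data_fs, repo_dir, full=wants_full_backup(now))
    # check=True raises CalledProcessError if repozo exits non-zero.
    subprocess.run(cmd, check=True)
    return cmd
```

Calling run_hourly_backup("/var/zope/Data.fs", "/backup/repozo") from an hourly cron job (both paths are placeholders) would then bound the loss to roughly one hour of changes, and copying the repository directory to a second machine gives you something to restore from with repozo -R.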
As I understand it, the performance of DS depends heavily on the underlying filesystem type and on application access patterns. Anyway, I think DS is great, another good tool in the belt, and should be used in the right situations.