CDBx - CDB Reimplementation for Python
cdbx is a CDB reimplementation for Python.
Supported Python versions are Python 2.7 and Python 3.6+.
Note that everything going in and out of the CDB is bytes. This is more relevant for Python 3. Keys and values passed as (unicode) strings are converted to bytes using the latin-1 codec. Keys and values retrieved from the CDB, however, are always bytes.
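The conversion described above can be sketched in plain Python. The helper name to_cdb_bytes is hypothetical and not part of cdbx; it only mirrors the documented behaviour (bytes pass through, text is encoded as latin-1). Whether cdbx raises the same exception type for text outside the latin-1 range is an assumption.

```python
def to_cdb_bytes(value):
    """Hypothetical helper: mimic the str -> bytes conversion described above."""
    if isinstance(value, bytes):
        # Bytes are stored as they are.
        return value
    # Text is encoded with latin-1, one byte per character.
    return value.encode('latin-1')


print(to_cdb_bytes(u'key') == b'key')
print(to_cdb_bytes(u'caf\xe9') == b'caf\xe9')

try:
    # The euro sign has no latin-1 representation.
    to_cdb_bytes(u'\u20ac')
except UnicodeEncodeError:
    print('not representable in latin-1')
```

If you need characters outside latin-1, one option is to encode keys and values yourself (for example with UTF-8) before adding them, and to decode the bytes you read back with the same codec.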
    # simple file write
    import cdbx

    cdb = cdbx.CDB.make('somefile')
    cdb.add('key', 'value')
    # ...
    cdb = cdb.commit()
    print(cdb['key'])

    # atomic file write (original cdbmake behaviour)
    import os
    import cdbx

    cdb = cdbx.CDB.make('somefile.tmp')
    cdb.add('key', 'value')
    # ...
    cdb.commit().close()
    os.rename('somefile.tmp', 'somefile')
    cdb = cdbx.CDB('somefile')
    print(cdb['key'])

    # CDB within tempfile
    import tempfile
    import cdbx

    cdb = cdbx.CDB.make(tempfile.TemporaryFile(), close=True)
    cdb.add('key', 'value')
    # ...
    cdb = cdb.commit()
    print(cdb['key'])
    cdb.close()  # or going out-of-scope, or del cdb
    # now it's gone
    print(cdb['key'])
    IOError: I/O operation on a closed file
Rationale, Advocacy, Key Features
CDB as a concept is a great idea for various reasons. It’s fast, it’s a simple, portable file format and all operations can be done using one open file descriptor.
There are, of course, other interfaces to CDB (most notably python-cdb, which I have been using for a long time).
cdbx tries to solve a few problems with those for me:
support for Python 3
better interface for CDB creation: a file descriptor is all you need, if you want to work that way. This is important if you want to create a CDB within a single temporary file.
all operations are independent of each other (multiple accesses at the same time are possible). Thanks to the GIL, it should even be thread-safe, although this has not been tested yet.
more natural interface for the main CDB class in general
better error handling (especially with regard to Python) in some places
Since it’s a complete reimplementation, the licensing issues that some people claim to have with public domain software [cdb] do not apply.
There are some features to add and some cleanups to do before a stable release.
There are hashes (MD5, SHA1 and SHA256) of the download packages stored in the digests file. In order to check the integrity of the downloaded file, use a tool like md5sum (or sha1sum, sha256sum accordingly), e.g.:
    $ md5sum -c cdbx-0.2.2.digests
    cdbx-0.2.2.tar.bz2: OK
    cdbx-0.2.2.tar.gz: OK
    cdbx-0.2.2.tar.xz: OK
    cdbx-0.2.2.zip: OK
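If no md5sum-style tool is at hand, the same kind of check can be sketched with Python's hashlib module. The function names file_digest and digest_matches are hypothetical, and the expected hex string is assumed to be copied by hand from the digests file; this sketch does not parse that file.

```python
import hashlib


def file_digest(path, algorithm='sha256', chunk_size=65536):
    """Return the hex digest of the file at path, read in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, 'rb') as fp:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            hasher.update(chunk)
    return hasher.hexdigest()


def digest_matches(path, expected_hex, algorithm='sha256'):
    """Compare the file's digest with an expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

A call might look like digest_matches('cdbx-0.2.2.tar.gz', expected), where expected holds the SHA256 value listed for that archive. Pass algorithm='md5' or algorithm='sha1' to check the other hash types.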
In order to check the integrity of the digest file itself, you can check the PGP signature of that file. The file is signed by André Malo, Key-ID 0x029C942244325167:
    $ gpg --verify cdbx-0.2.2.digests
    gpg: Signature made Sun Jul 31 14:07:52 2022 CEST
    gpg: using RSA key 21B65583FF640D34E8662B6B3DED446369F2EE1A
    gpg: Good signature from "André Malo <email@example.com>"
cdbx is available under the terms and conditions of the “Apache License, Version 2.0.” You’ll find the detailed licensing terms in the root directory of the source distribution package or online at http://www.apache.org/licenses/LICENSE-2.0.