
Packing Boolean Array Needs Go Through Int (numpy 1.8.2)

I'm looking for a more compact way to store booleans. numpy internally needs 8 bits to store one boolean, but np.packbits allows packing them, which is pretty cool. The problem is that

Solution 1:

There's no need to convert your boolean array to the native int dtype (which will be 64 bit on x86_64). You can avoid copying your boolean array by viewing it as np.uint8, which also uses a single byte per element:

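import numpy as np

# db_bool is the boolean array from the question
packed = np.packbits(db_bool.view(np.uint8))

# unpackbits pads to a multiple of 8 bits, so trim before reshaping
# (np.bool was removed in newer numpy releases; the builtin bool works everywhere)
unpacked = np.unpackbits(packed)[:db_bool.size].reshape(db_bool.shape).view(bool)

print(np.all(db_bool == unpacked))
# True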
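To check the no-copy claim, here is a small sketch comparing the view with an integer conversion. The sample `db_bool` is made up for illustration, and `np.shares_memory` needs a reasonably recent numpy (it is not available in 1.8.2).

```python
import numpy as np

# Made-up sample array standing in for the one from the question.
db_bool = np.array([[True, False, True], [False, False, True]])

as_uint8 = db_bool.view(np.uint8)    # reinterprets the same buffer, no copy
as_int64 = db_bool.astype(np.int64)  # allocates a new array, 8 bytes per element

print(np.shares_memory(db_bool, as_uint8))  # True
print(np.shares_memory(db_bool, as_int64))  # False
print(as_uint8.nbytes, as_int64.nbytes)     # 6 48
```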

Also, np.packbits should work directly on boolean arrays in numpy v1.10.0 and newer, as of this commit from over a year ago, so the view is only needed on older versions such as 1.8.2.
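For reference, a minimal sketch of the direct call and of the space it saves, assuming numpy 1.10.0 or newer. The random `db_bool` below is only a stand-in for the array from the question.

```python
import numpy as np

# Hypothetical stand-in for the question's array: 100 x 37 random booleans.
rng = np.random.RandomState(0)
db_bool = rng.rand(100, 37) > 0.5

# Boolean input is accepted directly; with no axis given the array is flattened.
packed = np.packbits(db_bool)

# One byte per element before packing, one bit per element after,
# rounded up to a whole number of bytes (3700 / 8 = 462.5).
print(db_bool.nbytes)  # 3700
print(packed.nbytes)   # 463

# unpackbits returns a multiple of 8 bits, so trim the padding before reshaping.
unpacked = np.unpackbits(packed)[:db_bool.size].reshape(db_bool.shape).astype(bool)
print(np.array_equal(db_bool, unpacked))  # True
```

The original shape is not stored in the packed array, so it has to be kept alongside it for the round trip.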

Solution 2:

Just yesterday, I answered a newcomer's question on how to deal with bits in Python, as compared to C++. After warning that there would be no speed gains, I sketched up a naive "bitarray" that internally uses Python's bytearray objects.

This is in no way fast, but if you are no longer operating on the bits of your array and just want the output, it may be good enough, since you have full control over the conversion in Python code. Otherwise, you can try hinting the static types and running the same code under Cython, in which case you will probably want to use an np array with dtype=int8 instead of a bytearray:

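class BitArray(object):
    def __init__(self, length):
        # One bit per element, rounded up to a whole number of bytes
        self.values = bytearray(b"\x00" * (length // 8 + (1 if length % 8 else 0)))
        self.length = length

    def __setitem__(self, index, value):
        value = int(bool(value)) << (7 - index % 8)
        # Clear only the target bit, then OR in the new value
        mask = 0xff ^ (1 << (7 - index % 8))
        self.values[index // 8] &= mask
        self.values[index // 8] |= value

    def __getitem__(self, index):
        # Raising IndexError past the end lets iteration stop at `length`
        # instead of running through the padding bits of the last byte
        if index >= self.length:
            raise IndexError(index)
        mask = 1 << (7 - index % 8)
        return bool(self.values[index // 8] & mask)

    def __len__(self):
        return self.length

    def __repr__(self):
        return "<{}>".format(", ".join("{:d}".format(value) for value in self))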

This code was originally posted here: Is there a builtin bitset in Python that's similar to the std::bitset from C++?
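If you do go the numpy route, a rough sketch of the same idea on top of an np array could look like the following. The class name `NumpyBitArray` is hypothetical, and it uses dtype=uint8 rather than int8, which is my own assumption to avoid sign issues when touching the top bit. It is still plain Python per element, so it is no faster than the bytearray version.

```python
import numpy as np


class NumpyBitArray(object):
    """Hypothetical variant of the bit array above, backed by a uint8 array."""

    def __init__(self, length):
        self.length = length
        # Round up to a whole number of bytes.
        self.values = np.zeros((length + 7) // 8, dtype=np.uint8)

    def __setitem__(self, index, value):
        mask = 1 << (7 - index % 8)  # most significant bit first
        if value:
            self.values[index // 8] |= mask
        else:
            self.values[index // 8] &= 0xff ^ mask

    def __getitem__(self, index):
        if index >= self.length:
            raise IndexError(index)
        return bool(self.values[index // 8] & (1 << (7 - index % 8)))

    def __len__(self):
        return self.length


bits = NumpyBitArray(10)
bits[1] = True
bits[9] = True
print(bits.values.tolist())  # [64, 64]
# The bit order matches np.packbits' default, so np.unpackbits can read it back.
print(np.unpackbits(bits.values)[:len(bits)].tolist())
# [0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
```

Storing the most significant bit first keeps the buffer interchangeable with the output of np.packbits, so the two approaches can be mixed.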
