Enum vs Flag for bitmasks in Python
Ever wondered what’s behind those funny looking Unix access permission values like
777 ⚠, or
666 😈? Well, they are
octal representations, where each of the three rightmost digits represents a different part of the permissions for owner, group, and others.
Each of these digits is created from the sum of its component bits. As a result, specific bits add to the sum as it is represented by a numeral:
- The read bit adds 4 to its total (in binary
- The write bit adds 2 to its total (in binary
- The execute bit adds 1 to its total (in binary
These values never produce ambiguous combinations and therefore each sum represents a unique set of permissions. Let’s take
600 as an example. The first digit
6 represents the permissions for the owner. In binary it is
110 and the only way to create it is from the sum of component bits above by adding
010 (write access) to
100 (read access). In other words, the owner has permission to read and write the file, but not to execute it.
Using this system for permissions is optimized for data storage. But also turning permissions on (or off) can be done efficiently in what is known as bit-masking.
You see, to turn certain bits on, the
OR operator can be used. This operator follows the rule that
X OR 1 = 1 and
X OR 0 = X. If from the example above we now also want the owner to have executive permission on the file, we simply use
OR with 1 (execute value) on the current permission value:
6 OR 1 = 7, or in binary:
110 OR 001 = 111
We can also turn off certain bits by using the
AND operator, which follows the rule that
X AND 1 = X and
X AND 0 = 0. To turn off a bit we use it with
0. Interestingly we can leave the bit as it was when using
Turning bits on and off is also referred to as “masking on” and “masking off”.
Using this very old technique to store permission parts is very cool! But are there other use-cases where it makes sense? There are still today a ton of them, although they have become unpopular due to much more efficient hardware allowing the use of a set or dictionary to store the representations that a bit would otherwise hold.
Filtering using bit masking
In Bit I needed a way to store the supported features for each of the numerous objects representing network APIs. Supported features could range from understanding bech32 addresses or segwit transactions. Instead of using a dictionary to map each API to its supported set of features, or store a set of features on each API, I chose to use a system similar to the Unix access permission values.
Making use of bit masking allows for fast filtering of APIs depending on specific features. This can be achieved by using a single
AND operator. We just use a filter value
Y that represents the features we want to filter for, and the value
X representing the supported features of some API object. Calculating
X AND Y gives another value
Z. But recall that each bit in
Z will either be 0 if the bit in
Y was 0, or it will be the bit of
Can you see the subtle implication from this? If the n’th bit of our filtering value
Y was 0 then the n’th bit of the result
Z will be 0. There is no loss of information because we didn’t care if the bit value in
X signaled support for this feature or not. But if the n’th bit of
Y was 1 then we also want the n’th bit of
Z to be 1. This is only possible if the corresponding bit in
X was 1, hence signaling support for this feature. We can thus compare the result
Y itself. If
X AND Y == Y, then and only then will the value of
X signal for (at least) all the features we are filtering for with the value
Let’s take a quick example using Unix permission values again. Let’s say we have a list of files with permission values and want to filter them to keep only those that we are allowed to read, which is the value 4 or in binary
100. One of the files may have a permission value of 5, which in binary is
101 and thus corresponds to read and execute permission.
A filter to see if at least the read permission was set on this file would be checked as follows:
101 AND 100 = 100
We see that the result on the last line is exactly the same value as our filter. This is just as we expected and we would therefore keep it.
If a file gives us permission to write and execute, but not to read, it is represented as a 3 or in binary
011. Applying our filter:
011 AND 100 = 000
The result is different from our filter and we therefore immediately know that the filter was not satisfied.
The real beauty comes from applying complex filters with multiple bits turned on. If we would want to filter for e.g. read and execute permissions, the filter would be
101 and just as before we would check if the final
AND result would equal this value.
A> The value
Z resulting from using the
AND operator on a file’s permission and filter value is the intersection of the permissions in the file and the filter.
Bitmask filtering in Python
Using Python 3.6+
Since Python 3.6 a class
Flag exists for bit mask operations.
>>> from enum import IntFlag, auto >>> >>> class Permissions(IntFlag): ... EXECUTE = auto() ... WRITE = auto() ... READ = auto() ... >>> class PermissionSet: ... def __init__(self, *flags): ... self._permission = Permissions(0) # Initiate no permissions ... for flag in flags: ... self._permission |= Permissions[flag.upper()] ... def __contains__(self, item): ... return (self._permission & item._permission) == item._permission
We can use this
PermissionSet class to store the permissions of a file as follows:
>>> class File: ... def __init__(self, name, *, permission): ... self.name = name ... self._permissions = PermissionSet(*permission)
Let’s use it in action!
>>> # A filter: >>> permission_filter = PermissionSet("read", "execute") >>> >>> # The dummy files: >>> files =  >>> files.append(File("file_rw", permission=("read", "write"))) # Should be excluded from filter >>> files.append(File("file_re", permission=("read", "execute"))) # Should be included from filter >>> files.append(File("file_w", permission=("read", "execute"))) # Should be included from filter >>> files.append(File("file_we", permission=("write", "execute"))) # Should be excluded from filter >>> files.append(File("file_rwe", permission=("read", "write", "execute"))) # Should be included from filter >>> >>> # Filter the files: >>> for f in filter(lambda f: permission_filter in f._permissions, files): ... print(f.name) ... 'file_re' 'file_w' 'file_rwe'
Using Python 3.5
However, also before Python 3.6 this could be achieved using the
IntEnum class, however with manual adjustment of the underlying permission values.
>>> from enum import IntEnum >>> >>> class Permissions(IntEnum): ... EXECUTE = 1 << 0 ... WRITE = 1 << 1 ... READ = 1 << 2 ... >>> class PermissionSet: ... def __init__(self, *flags): ... self._permission = 0 # Initiate no permissions ... for flag in flags: ... self._permission |= Permissions[flag.upper()] ... def __contains__(self, item): ... return (self._permission & item._permission) == item._permission
We can then use it in the exact same way we already did above in the example for
What do you think about this technique? Is it elegant? Or over-engineering? Let me know in the comment below.