Understanding Bit masks


Bit masks enable the simultaneous storage and retrieval of multiple values using one variable. This is done by using flags with special properties (numbers that are the powers of 2). It becomes trivial to symbolize membership by checking if the bit at a position is 1 or 0.

How bitmasks works

Masking employs the bitwise OR and AND operations to encode and decode values respectively.

New composite values are created by a bitwise OR of the original composite and the predefined flag. Similarly, the bitwise AND of a composite and a particular flag validates the presence / absence of the flag.

Assuming we start off with decimal values 1,2,4 and 8. The table below shows the corresponding binary values.

DecimalBinary
00000
10001
20010
40100
81000

The nice thing about this table is that the four bits allow you to represent all numbers in the range 0 – 15 via combinations. For example, 5 which is binary 101, can be derived in two ways.

5     -> 1 + 4

or

101 -> 0001 | 0100

       -> 0101

7 which is 111 can also be derived in the same form.

Since any number in a range can be specified using these few ‘base’ numbers, we can use such a set to model things in the real world. For example, let’s say we want to model user permissions for an application.

Assuming the base permissions are read, write and execute, we can map these values to the base numbers to derive the table below:

PermissionsDecimalBinary
None0000
Read1001
Write2010
Execute4100

Users of the application will have various permissions assigned to them (think ACL). Thus a potential model for visitor, reader, writer and admin roles with our base permissions is:

RolePermissionsDecimalBinary
VisitorsNone0000
ReadersRead1001
WritersRead + Write3011
AdminsRead + Write + Execute7111

Noticed a pattern yet?

All the binary values can be obtained by ‘OR-ing’ the base binary values. For example, admins who have read, write and execute permissions have the value obtained when you do a bitwise OR of 1, 2 and 4.

The UNIX model uses the same numbering system. E.g. 777 translates into 111 111 111 which grants owners, groups and others read, write and execute permissions.

Checking Permissions via Bitmasks

Now, the next question is how do you check if a composite value contains a particular flag? Going back to the binary data basics, this means checking if a bit at some position is a 1 or 0.

The bitwise AND operator comes in handy here – it guarantees a 1 when both values sharing the same index are 1 and 0 in all other cases. Thus, ‘AND-ing’ a composite value and the desired flag would reveal the desired outcome. If we get a value greater than zero as the result, then the user has the permission, otherwise a zero means there is no permission.

The admin role has a bitmask value of 111. To check if he really ‘has’ the execute permission we do a bitwise AND of 111 and the execute flag 100. The result is 100 which proves the permission.

More tables! The two tables show how to check for 2 users; one with the read + write + execute (111) permissions and another with the read and execute (101) permissions.

Read + Write + Execute (111)

Permissions111 bitmaskBinaryHas permission?
Read111001Yes: 111 & 001 → 1
Write111010Yes: 111 & 010 → 1
Execute111100Yes: 111 & 100 → 1

Read + Execute (101)

Permissions101 bitmaskBinaryHas permission?
Read101001Yes: 101 & 001 → 1
Write101010No: 101 & 010 → 0
Execute101100Yes: 101 & 100 → 1

See? Such a simple way to check without having to make unnecessary calls to the server.

How to apply Bitmasks

This post shows the permission model however bit masks can be applied a wide variety of scenarios. Possible examples include checking membership, verifying characteristics and representing hierarchies.

A possible use in games could be to verify the various power-ups an actor has and to add new ones rapidly. Eliminates the need to iterate, use the bitmask.

Hierarchical models, where higher members encompass lower ones (e.g. the set of numbers) can also be adequately modeled using bitmaps.

Language support for Bitmasks

Explicit language support is not needed to use bitmasks, the rules to know are:

  • Use increasing powers of 2 – ensures there is only one flag at the right spot
  • Create basic building blocks that are easy to extend and combine
  • Remember to watch out for overflows and use values that are large enough to hold all possible bit values. For example, C‘s uint_8 / unsigned char can only hold 8 different flags, if you need more, you’d have to use a bigger value.

Some languages provide extra support for bit mask operations. For, example, C# provides the enum FlagsAtribute which implies that the enum would be used for bit masking operations.

Teaser

Q: Why not base 10? After all, we could use 10, 100, 1000 etc.

A: The decimal system falls short because you can have ten different numbers in one position. This makes it difficult to represent the binary ON/OFF state (which maps well to 0/1).

15 thoughts on “Understanding Bit masks

  1. Nice explanation. I used to use bitmaps a lot in Pascal back when memory was at a premium (and the CDC computer I was using had 60-bit words to store them in!).

    Sets are a really nice data type to have or to emulate. When you have a collection of Boolean variables/flags, using sets avoids iteration and ordering and lets you get straight to the values.

    Since the article is entry level, you should probably note that UNIX/Linux permissions like 777 are specified in octal (3-bits) so no one will be confused. I don’t generally see octal being used for much these days since 8-bit bytes/octets are used for just about everything and that works better with hexadecimal (4 bits x 2).

    It’s beyond the scope of this article, but the main place I see bit maps (or at least bit masks) is in specifying subnetwork masks like 10.0.0.255 which represents any IP address between 10.0.0.0 and 10.0.0.254. (I’m not sure 255 itself is allowed.) These always make me stop and think what’s going on.

    Like

    1. Hi Joseph, thanks for the feedback : )

      Yes subnetwork masks are a trick case. The last time I had to work with those extensively was in school.

      The bitmask technique has come in handy for me and it makes code much cleaner.

      Like

    2. 255 = Broadcast (Reserved for DHCP and BOOTP)
      0 = Network (Reserved)
      That’s why they are not used.
      To correct your sentence it’s FROM 10.0.0.1 TO 10.0.0.254 or BETWEEN 10.0.0.0 AND 10.0.0.255 when using a 24bit subnetmask (255.255.255.0)

      Like

  2. I feel the real reason for using bit masks are that it’s set operations (in base 2) are achieved in O(1). They are supported internally at the hardware level.

    Thumbs up to the interesting read.

    Liked by 1 person

  3. Hi,
    What if I have to manage 50 different permissions. so it can be 2 raised to 49. so if a person having all permission, it need to be store like 2^0 + 2^1 + …+2^49… I think it’s huge number. Any approach for this?

    Like

  4. Hi,
    What if I have to manage 50 different permissions. so it can be 2 raised to 49. so if a person having all permission, it need to be store like 2^0 + 2^1 + …+2^49… I think it’s huge number. Any approach for this?

    Like

    1. Hi Ruchir,

      2^50 is still a value that can be stored in a long so that should still fit. However, the question is why do you have so many permissions? Can it be broken down to ‘atomic’ pieces that can be combined for larger permissions.

      For example with the basic foundational pieces of read, write and delete; you can define multiple roles.

      Like

      1. But each skill can be represented as a single binary position so if you do a bitmask you can determine if someone has a skill or not.
        E.g. 0001, 0010 can be two different skills; an entity with both skills will have a bitmask of 0011. Would that work?

        Like

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.