MongoDB Bitwise Filtering, Faster Status Querying

What is bitwise filtering? Actually, it is quite old school back to when you are still learning essential programming courses. That was using bits to represent a handful of status and use bits to filter status. Basically, bitwise filtering can be replaced by filter a collection of status iteratively.

Bigwise Filtering in C#

In .NET, you can keep multipule status by using a List or use Enumeration Types as Bit Flags. A quick example looks like as follows,

        [Flags]
        enum Days
        {
            Monday = 0x1,
            Tuesday = 0x2,
            Wednesday = 0x4,
            Thursday = 0x8,
            Friday = 0x16,
            Saturday = 0x32,
            Sunday = 0x64
        }

        class Sport
        {
            public Sport(string name, params Days[] folders)
            {
                Name = name;
                PlayOnList = folders;
                PlayOn = folders
                    .Aggregate((current, messageFolder) => current | messageFolder);
            }

            public string Name { get; set; }
            public Days PlayOn { get; set; }
            public IList PlayOnList { get; set; }
        }

In the source code above, PlayOn accepts Bit Flags and PlayOnList is a typical list of enumeration. To filter a specific sport based on day is as simple as follows.

            var sports = new[]
            {
                new Sport("Football", Days.Wednesday),
                new Sport("Table Tennis", Days.Monday, Days.Wednesday),
                new Sport("Basketball", Days.Tuesday, Days.Thursday),
                new Sport("Surf", Days.Saturday)
            };

            var filter1 = new[] { Days.Wednesday, Days.Monday };
            var filter2 = Days.Wednesday | Days.Monday;

            sports.Where(sport => sport.PlayOnList.All(filter1.Contains));
            sports.Where(sport => (sport.PlayOn & filter2) == filter2);

Two different filtering ways, bitwise and iterating. And their algorithm complexities are O(1) and O(n). To prove this theoretical correctness, I ran a quick benchmark for the code above. The result showed bitwise filtering was roughly 15% faster than list iterating. So, bitwise is a better choice when dealing with multiple, esp. co-existing status.

Bigwise Filtering in MongoDB

In MongoDB, filtering items via its collection operations is fairly easy. The code (based on official mongo C# driver) looks like this,

    // collection is a MongoCollection

    collection.Find(Query.In("PlayOnList", new[] { Days.Wednesday, Days.Monday }));

In terms of bitwise filtering, mongo does not provide direct supoort for that. Having said that, MongoDB integrates an JavaScript interpreter to render itself extremely powerful and flexible. Mongo’s Map/Reduce functionality is based on JavaScript as well. JavaScript is first class citizen in MongoDB.

So, with JavaScript, we can quickly create a script to do the bitwise filtering in MongoDB.

    public string CreateScript(string field, Days filter)
    {
        return String.Format("(this.{0} & {1}) == {1}", field, (int)filter);
    }

Then, use Query.Where() to redesign our query.

    var script = CreateScript("PlayOnList", FolderType.Sent);

    collection.Find(Query.Where(script));

Conclusion

How quick the query could be? I know you care about the efficiency. Basically, with average 3 status, bitwise filtering queries 500% quicker than iteration does.

And that’s not the end yet. It seems MongoDB is using SeaMonkey as its JavaScript interpreter. So there is still a lot of space to optimize on interpreter itself, and with better JavaScript interpreter or VM, the performance of bitwise filtering could be even better.

Testing Environment

  • MongoDB: 1.8
  • mongo-csharp-driver: 1.1
About these ads

3 thoughts on “MongoDB Bitwise Filtering, Faster Status Querying

  1. You can actually do some of the bitwise stuff inside your enum
    Example:

    [Flags]
    public enum MongoRecipientTagFilterAppliesTo
    {
    None=0,
    Directory=1,
    CompanyWall=2,
    OrgWall=4,
    UserWall=8,
    AllWall = CompanyWall | OrgWall | UserWall,
    CompanyAndOrgWall = CompanyWall | OrgWall,
    DirectoryAndCompanyWall = Directory | CompanyWall,
    DirectoryAndCompanyAndOrgWall = Directory | CompanyAndOrgWall,
    All = Directory | AllWall,

    }

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s