Introduction
Boost.Bloom provides the class template boost::bloom::filter
that can be configured to implement a classical Bloom filter as well as
variations discussed in the literature such as block filters, multiblock filters,
and more.
#include <boost/bloom/filter.hpp> #include <cassert> #include <string> int main() { // Bloom filter of strings with 5 bits set per insertion using filter = boost::bloom::filter<std::string, 5>; // create filter with a capacity of 1'000'000 bits filter f(1'000'000); // insert elements (they can't be erased, Bloom filters are insert-only) f.insert("hello"); f.insert("Boost"); //... // elements inserted are always correctly checked as such assert(f.may_contain("hello") == true); // elements not inserted may incorrectly be identified as such with a // false positive rate (FPR) which is a function of the array capacity, // the number of bits set per element and generally how the boost::bloom::filter // was specified if(f.may_contain("bye")) { // likely false //... } }
The different filter variations supported are specified at compile time
as part of the boost::bloom::filter
instantiation definition.
Boost.Bloom has been implemented with a focus on performance;
SIMD technologies such as AVX2, Neon and SSE2 can be leveraged to speed up
operations.
Boost.Bloom is a header-only library. C++11 or later required.
Bloom Filter Primer
A Bloom filter is a probabilistic data structure where inserted elements can be looked up with 100% accuracy, whereas looking up for a non-inserted element may fail with some probability called the filter’s false positive rate or FPR. The tradeoff here is that Bloom filters occupy much less space than traditional non-probabilistic containers (typically, around 8-20 bits per element) for an acceptably low FPR. The greater the filter’s capacity (its size in bits), the lower the resulting FPR.
One prime application of Bloom filters and similar data structures is for the prevention of expensive disk/network accesses when these would fail to retrieve a given piece of information. For instance, suppose we are developing a frontend for a database with access time 10 ms and we know 50% of the requests will not succeed (the record does not exist). Inserting a Bloom filter with a lookup time of 200 ns and a FPR of 0.5% will reduce the average response time of the system from 10 ms to
(10 + 0.0002) × 50.25% + 0.0002 × 49.75% ≅ 5.03 ms,
that is, we get a ×1.99 overall speedup. If the database holds 1 billion records, an in-memory filter with say 8 bits per element will occupy 0.93 GB, which is perfectly realizable.

In general, Bloom filters are useful to prevent/mitigate queries against large data sets when exact retrieval is costly and/or can’t be made in main memory. Applications have been described in the areas of web caching, dictionary compression, network routing and genomics, among others. Broder and Mitzenmacher provide a rather extensive review of use cases with a focus on networking.
Implementation
The implementation of a Bloom filter consists of an array of m bits, initially set to zero. Inserting an element x reduces to selecting k positions pseudorandomly (with the help of k independent hash functions) and setting them to one.

To check if an element y is in the filter, we follow the same procedure and see if the selected bits are all set to one. In the example figure there are two unset bits, which definitely indicates y was not inserted in the filter.

A false positive occurs when the bits checked happen to be all set to one due to other, unrelated insertions. The probability of having a false positive increases as we add more elements to the filter, whereas for a given number n of inserted elements, a filter with greater capacity (larger bit array) will have a lower FPR. The number k of bits set per operation also affects the FPR, albeit in a more complicated way: when the array is sparsely populated, a higher value of k improves (decreases) the FPR, as there are more chances that we hit a non-set bit; however, if k is very high the array will have more and more bits set to one as new elements are inserted, which eventually will reach a point where we lose out to a filter with a lower k and thus a smaller proportions of set bits.

For given values of n and m, the optimum k is the integer closest to
\(k_{\text{opt}}=\displaystyle\frac{m\cdot\ln2}{n}\)
for a minimum FPR of \(1/2^{k_{\text{opt}}} \approx 0.6185^{m/n}\). See the appendix on FPR estimation for more details.
Variations on the Classical Filter
Block Filters
An operation on a Bloom filter involves accessing k different positions in memory, which, for large arrays, results in k CPU cache misses and affects the operation’s performance. A variation on the classical approach called a block filter seeks to minimize cache misses by concentrating all bit setting/checking in a small block of b bits pseudorandomly selected from the entire array. If the block is small enough, it will fit in a CPU cacheline, thus drastically reducing the number of cache misses.

The downside is that the resulting FPR is worse than that of a classical filter for the same values of n, m and k. Intuitively, block filters reduce the uniformity of the distribution of bits in the array, which ultimately hurts their probabilistic performance.

A further variation in this idea is to have operations select k blocks with k' bits set on each. This, again, will have a worse FPR than a classical filter with k·k' bits per operation, but improves on a plain k·k' block filter.

Multiblock Filters
Multiblock filters take block filters' approach further by having bit setting/checking done on a sequence of consecutive blocks of size b, so that each block takes exactly one bit. This still maintains a good cache locality but improves FPR with respect to block filters because bits set to one are more spread out across the array.

Multiblock filters can also be combined with multi-insertion. In general, for the same number of bits per operation and equal values of n and m, a classical Bloom filter will have the better (lower) FPR, followed by multiblock filters and then block filters. Execution speed will roughly go in the reverse order. When considering block/multiblock filters with multi-insertion, the number of available configurations grows quickly and you will need to do some experimenting to locate your preferred point in the (FPR, capacity, speed) tradeoff space.
Tutorial
Filter Definition
A boost::bloom::filter
can be regarded as a bit array divided into buckets that
are selected pseudo-randomly (based on a hash function) upon insertion:
each of the buckets is passed to a subfilter that marks several of its bits according
to some associated strategy.
template< typename T, std::size_t K, typename Subfilter = block<unsigned char, 1>, std::size_t BucketSize = 0, typename Hash = boost::hash<T>, typename Allocator = std::allocator<T> > class filter;
-
T
: Type of the elements inserted. -
K
: Number of buckets marked per insertion. -
Subfilter
: Type of subfilter used. -
BucketSize
: Size in bytes of the buckets. -
Hash
: A hash function forT
. -
Allocator
: An allocator forT
.
Subfilter
The following subfilters can be selected, offering different compromises between performance and false positive rate (FPR). See the Bloom Filter Primer for a general explanation of block and multiblock filters.
block<Block, K'>
Sets K'
bits in an underlying value of the unsigned integral type Block
(e.g. unsigned char
, uint32_t
, uint64_t
). So,
a filter<T, K, block<Block, K'>>
will set K * K'
bits per element.
The tradeoff here is that insertion/lookup will be (much) faster than
with filter<T, K * K'>
while the FPR will be worse (larger).
FPR is better the wider Block
is.
multiblock<Block, K'>
Instead of setting K'
bits in a Block
value, this subfilter sets
one bit on each of the elements of a Block[K']
subarray. This improves FPR
but impacts performance with respect to block<Block, K'>
, among other
things because cacheline boundaries can be crossed when accessing the subarray.
fast_multiblock32<K'>
Statistically equivalent to multiblock<uint32_t, K'>
, but uses
faster SIMD-based algorithms when SSE2, AVX2 or Neon are available.
fast_multiblock64<K'>
Statistically equivalent to multiblock<uint64_t, K'>
, but uses a
faster SIMD-based algorithm when AVX2 is available.
The default configuration with block<unsigned char,1>
corresponds to a
classical Bloom filter setting K
bits per element uniformly
distributed across the array.
BucketSize
When the default value 0 is used, buckets have the same size as the subarrays subfilters operate on (non-overlapping case). Otherwise, bucket size is smaller and subarrays spill over adjacent buckets, which results in an improved (lower) FPR in exchange for a possibly worse performance due to memory unalignment.
Hash
By default, Boost.ContainerHash is used.
Consult this library’s dedicated section
if you need to extend boost::hash
for your own types.
When the provided hash function is of sufficient quality, it is used
as is; otherwise, a bit-mixing post-process is applied to hash values that improves
their statistical properties so that the resulting FPR approaches its
theoretical limit. The hash function is determined to be of high quality
(more precisely, to have the so-called avalanching property) via the
boost::unordered::hash_is_avalanching
trait.
Capacity
The size of the filter’s internal array is specified at construction time:
using filter = boost::bloom::filter<std::string, ...>; filter f(1'000'000); // array of 1'000'000 bits std::cout << f.capacity(); // >= 1'000'000
Note that boost::bloom::filter
default constructor specifies a capacity
of zero, which in general won’t be of much use — the assigned array
is null.
Instead of specifying the array’s capacity directly, we can let the library figure it out based on the number of elements we plan to insert and the desired FPR:
// we'll insert 100'000 elements and want a FPR ~ 1% filter f(100'000, 0.01); // this is equivalent filter f2(filter::capacity_for(100'000, 0.01));
Once a filter is constructed, its array is fixed (for instance, it won’t
grow dynamically as elements are inserted). The only way to change it is
by assignment/swapping from a different filter, or using reset
:
f.reset(2'000'000); // change to 2'000'000 bits and clears the filter f.reset(100'000, 0.005); // equivalent to reset(filter::capacity_for(100'000, 0.005)); f.reset(); // null array (capacity == 0)
Insertion and Lookup
Insertion is done in much the same way as with a traditional container:
f.insert("hello"); f.emplace(100, 'X'); // ~ insert(std::string(100, 'X')) f.insert(data.begin(), data.end());
Of course, in this context "insertion" does not involve any actual storage of elements into the filter, but rather the setting of bits in the internal array based on the hash values of those elements. Lookup goes as follows:
bool b1 = f.may_contain("hello"); // b1 is true since we actually inserted "hello" bool b2 = f.may_contain("bye"); // b2 is most likely false
As its name suggests, may_contain
can return true
even if the
element has not been previously inserted, that is, it may yield false
positives — this is the essence of probabilistic data structures.
fpr_for
provides an estimation of the false positive rate:
// we have inserted 100 elements so far, what's our FPR? std::cout<< filter::fpr_for(100, f.capacity());
Note that in the example we provided the number 100 externally:
boost::bloom::filter
does not keep track of the number of elements
that have been inserted — in other words, it does not have a size
operation.
Once inserted, there is no way to remove a specific element from the filter. We can only clear up the filter entirely:
f.clear(); // sets all the bits in the array to zero
Filter Combination
boost::bloom::filter
s can be combined by doing the OR logical operation
of the bits of their arrays:
filter f2 = ...; ... f |= f2; // f and f2 must have exactly the same capacity
The result is equivalent to a filter "containing" both the elements
of f
and f2
. AND combination, on the other hand, results in a filter
holding the intersection of the elements:
filter f3 = ...; ... f &= f3; // f and f3 must have exactly the same capacity
For AND combination, be aware that the resulting FPR will be in general
worse (higher) than if the filter had been constructed from scratch
by inserting only the commom elements — don’t trust fpr_for
in this
case.
Direct Access to the Array
The contents of the bit array can be accessed directly with the array
member function, which can be leveraged for filter serialization:
filter f1 = ...; ... // save filter std::ofstream out("filter.bin", std::ios::binary); std::size_t c1=f1.capacity(); out.write((const char*) &c1, sizeof(c1)); // save capacity (bits) boost::span<const unsigned char> s1 = f1.array(); out.write((const char*) s1.data(), s1.size()); // save array out.close(); // load filter filter f2; std::ifstream in("filter.bin", std::ios::binary); std::size_t c2; in.read((char*) &c2, sizeof(c2)); f2.reset(c2); // restore capacity boost::span<unsigned char> s2 = f2.array(); in.read((char*) s2.data(), s2.size()); // load array in.close();
Note that array()
is a span over unsigned char
s whereas
capacities are measured in bits, so array.size()
is
capacity() / CHAR_BIT
.
Choosing a Filter Configuration
Boost.Bloom offers a plethora of compile-time and run-time configuration options, so it may be difficult to make a choice. If you’re aiming for a given FPR or have a particular capacity in mind and you’d like to choose the most appropriate filter type, the following chart may come handy.

The chart plots FPR vs. c (capacity / number of elements inserted) for several
boost::bloom::filter
s where K
has been set to its optimum value (minimum FPR)
as shown in the table below.
c = capacity / number of elements inserted | |||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | |
filter<1,block<uint32_t,K>> | 3 | 3 | 3 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
filter<1,block<uint32_t,K>,1> | 2 | 3 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 7 | 7 | 7 | 7 | 7 |
filter<1,block<uint64_t,K>> | 2 | 3 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 6 | 6 | 6 | 6 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
filter<1,block<uint64_t,K>,1> | 2 | 3 | 4 | 4 | 4 | 5 | 6 | 6 | 6 | 7 | 7 | 7 | 7 | 7 | 8 | 8 | 8 | 8 | 8 | 9 | 9 |
filter<1,multiblock<uint32_t,K>> | 3 | 3 | 4 | 5 | 6 | 6 | 8 | 8 | 8 | 8 | 9 | 9 | 9 | 10 | 13 | 13 | 15 | 15 | 15 | 16 | 16 |
filter<1,multiblock<uint32_t,K>,1> | 3 | 3 | 4 | 5 | 6 | 6 | 7 | 7 | 8 | 8 | 9 | 9 | 10 | 10 | 12 | 12 | 14 | 14 | 14 | 14 | 15 |
filter<1,multiblock<uint64_t,K>> | 4 | 4 | 5 | 5 | 6 | 6 | 6 | 7 | 8 | 8 | 10 | 10 | 12 | 13 | 14 | 15 | 15 | 15 | 15 | 16 | 17 |
filter<1,multiblock<uint64_t,K>,1> | 3 | 3 | 4 | 5 | 5 | 6 | 6 | 7 | 9 | 10 | 10 | 11 | 11 | 12 | 12 | 13 | 13 | 13 | 15 | 16 | 16 |
filter<K> | 3 | 4 | 4 | 5 | 5 | 6 | 6 | 8 | 8 | 9 | 10 | 11 | 12 | 13 | 13 | 13 | 14 | 16 | 16 | 16 | 17 |
Let’s see how this can be used by way of an example. Suppose we plan to insert 10M elements and want to keep the FPR at 10−4. The chart gives us five possibilities:
-
filter<K>
→ c ≅ 19 bits per element -
filter<1, multiblock<uint64_t, K>, 1>
→ c ≅ 20 bits per element -
filter<1, multiblock<uint64_t, K>>
→ c ≅ 21 bits per element -
filter<1, multiblock<uint32_t, K>, 1>
→ c ≅ 21.5 bits per element -
filter<1, multiblock<uint32_t, K>>
→ c ≅ 23 bits per element
These options have different tradeoffs in terms of space used and performance. If
we choose filter<1, multiblock<uint32_t, K>, 1>
as a compromise (or better yet,
filter<1, fast_multiblock32<K>, 1>
), the only remaining step is to consult the
value of K
in the table for c = 21 or 22, and we get our final configuration:
using my_filter=filter<std::string, 1, fast_multiblock32<14>, 1>;
The resulting filter can be constructed in any of the following ways:
// 1) calculate the capacity from the value of c we got from the chart my_filter f((std::size_t)(10'000'000 * 21.5)); // 2) let the library calculate the capacity from n and target fpr // expect some deviation from the capacity in 1) my_filter f(10'000'000, 1E-4); // 3) equivalent to 2) my_filter f(my_filter::capacity_for(10'000'000, 1E-4));
Benchmarks
(More results in a dedicated repo.)
The tables show the false positive rate and execution times in nanoseconds per operation
for nine different configurations of boost::bloom::filter<int, ...>
where 10M elements have
been inserted.
-
ins.: Insertion.
-
succ. lkp.: Successful lookup (the element is in the filter).
-
uns. lkp.: Unsuccessful lookup (the element is not in the filter, though lookup may return
true
).
Filters are constructed with a capacity c * N
(bits), so c
is the number of
bits used per element. For each combination of c
and a given filter configuration, we have
selected the optimum value of K
(that yielding the minimum FPR).
Standard release-mode settings are used; AVX2 is indicated for Visual Studio builds
(/arch:AVX2
) and GCC/Clang builds (-mavx2
), which causes
fast_multiblock32
and fast_multiblock64
to use their AVX2 variant.
GCC 14, x64
filter<K> |
filter<1,block<uint64_t,K>> |
filter<1,block<uint64_t,K>,1> |
|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 6 | 2.1566 | 12.21 | 11.06 | 16.94 | 4 | 3.3462 | 4.44 | 4.73 | 4.73 | 5 | 3.0448 | 5.23 | 5.37 | 5.38 |
12 | 9 | 0.3146 | 18.09 | 17.08 | 17.86 | 5 | 1.0310 | 5.02 | 5.07 | 5.15 | 6 | 0.8244 | 6.87 | 6.34 | 6.28 |
16 | 11 | 0.0456 | 28.67 | 29.43 | 17.81 | 6 | 0.4035 | 6.30 | 6.48 | 6.31 | 7 | 0.2885 | 7.43 | 7.29 | 7.57 |
20 | 14 | 0.0066 | 46.54 | 39.91 | 19.26 | 7 | 0.1879 | 10.08 | 10.49 | 9.53 | 8 | 0.1185 | 9.68 | 9.08 | 9.68 |
filter<1,multiblock<uint64_t,K>> |
filter<1,multiblock<uint64_t,K>,1> |
filter<1,fast_multiblock32<K>> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4515 | 5.68 | 6.49 | 6.50 | 5 | 2.3208 | 6.06 | 7.47 | 7.71 | 5 | 2.7234 | 3.37 | 3.80 | 3.75 |
12 | 8 | 0.4244 | 7.39 | 9.45 | 9.36 | 8 | 0.3758 | 8.20 | 10.08 | 10.12 | 8 | 0.5407 | 2.72 | 3.38 | 3.35 |
16 | 11 | 0.0776 | 11.28 | 15.08 | 15.13 | 11 | 0.0641 | 17.90 | 15.65 | 15.55 | 11 | 0.1174 | 6.76 | 6.87 | 4.87 |
20 | 13 | 0.0148 | 14.39 | 20.03 | 18.67 | 14 | 0.0120 | 16.41 | 22.94 | 22.46 | 13 | 0.0277 | 9.38 | 9.60 | 6.48 |
filter<1,fast_multiblock32<K>,1> |
filter<1,fast_multiblock64<K>> |
filter<1,fast_multiblock64<K>,1> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4625 | 3.36 | 3.73 | 3.70 | 5 | 2.4388 | 4.92 | 5.65 | 5.58 | 5 | 2.3198 | 5.03 | 5.49 | 5.57 |
12 | 8 | 0.4428 | 3.35 | 3.69 | 3.67 | 8 | 0.4190 | 3.46 | 4.77 | 4.76 | 8 | 0.3747 | 4.81 | 5.52 | 5.46 |
16 | 11 | 0.0866 | 6.69 | 7.18 | 5.10 | 11 | 0.0781 | 8.63 | 9.82 | 7.79 | 11 | 0.0651 | 9.80 | 9.55 | 7.63 |
20 | 13 | 0.0180 | 9.08 | 9.05 | 7.13 | 13 | 0.0147 | 11.60 | 13.64 | 9.10 | 14 | 0.0112 | 11.29 | 15.12 | 16.84 |
Clang 18, x64
filter<K> |
filter<1,block<uint64_t,K>> |
filter<1,block<uint64_t,K>,1> |
|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 6 | 2.1566 | 10.84 | 10.51 | 16.47 | 4 | 3.3462 | 3.83 | 4.63 | 4.49 | 5 | 3.0448 | 4.49 | 5.19 | 5.19 |
12 | 9 | 0.3146 | 15.69 | 15.37 | 16.96 | 5 | 1.0310 | 4.29 | 5.10 | 4.96 | 6 | 0.8244 | 4.98 | 5.78 | 5.73 |
16 | 11 | 0.0456 | 23.83 | 24.82 | 16.99 | 6 | 0.4035 | 5.46 | 6.31 | 6.13 | 7 | 0.2885 | 6.17 | 7.83 | 7.52 |
20 | 14 | 0.0066 | 42.24 | 39.92 | 20.02 | 7 | 0.1879 | 8.79 | 9.61 | 15.23 | 8 | 0.1185 | 5.61 | 6.20 | 5.94 |
filter<1,multiblock<uint64_t,K>> |
filter<1,multiblock<uint64_t,K>,1> |
filter<1,fast_multiblock32<K>> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4515 | 3.53 | 4.13 | 4.10 | 5 | 2.3208 | 3.57 | 3.95 | 3.95 | 5 | 2.7234 | 3.03 | 3.02 | 3.04 |
12 | 8 | 0.4244 | 3.03 | 3.69 | 3.66 | 8 | 0.3758 | 4.05 | 4.18 | 4.22 | 8 | 0.5407 | 2.47 | 2.55 | 2.55 |
16 | 11 | 0.0776 | 7.07 | 7.79 | 7.99 | 11 | 0.0641 | 7.26 | 8.07 | 8.04 | 11 | 0.1174 | 5.45 | 5.85 | 4.45 |
20 | 13 | 0.0148 | 9.10 | 10.99 | 10.58 | 14 | 0.0120 | 9.62 | 11.68 | 12.15 | 13 | 0.0277 | 7.77 | 8.39 | 7.29 |
filter<1,fast_multiblock32<K>,1> |
filter<1,fast_multiblock64<K>> |
filter<1,fast_multiblock64<K>,1> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4625 | 3.07 | 2.92 | 2.95 | 5 | 2.4388 | 4.18 | 4.73 | 4.71 | 5 | 2.3198 | 4.27 | 4.60 | 4.57 |
12 | 8 | 0.4428 | 2.96 | 2.79 | 2.78 | 8 | 0.4190 | 3.20 | 4.05 | 4.13 | 8 | 0.3747 | 4.33 | 4.53 | 4.66 |
16 | 11 | 0.0866 | 5.54 | 5.62 | 3.92 | 11 | 0.0781 | 6.62 | 7.53 | 5.91 | 11 | 0.0651 | 7.03 | 7.61 | 6.42 |
20 | 13 | 0.0180 | 9.88 | 9.24 | 6.20 | 13 | 0.0147 | 10.04 | 11.53 | 8.07 | 14 | 0.0112 | 10.14 | 11.20 | 7.99 |
Clang 15, ARM64
filter<K> |
filter<1,block<uint64_t,K>> |
filter<1,block<uint64_t,K>,1> |
|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 6 | 2.1566 | 9.56 | 7.92 | 17.75 | 4 | 3.3462 | 1.95 | 3.56 | 3.32 | 5 | 3.0448 | 2.78 | 2.83 | 2.85 |
12 | 9 | 0.3146 | 23.43 | 21.49 | 22.68 | 5 | 1.0310 | 5.86 | 6.51 | 4.65 | 6 | 0.8244 | 5.33 | 5.76 | 5.96 |
16 | 11 | 0.0456 | 40.51 | 32.73 | 22.26 | 6 | 0.4035 | 8.98 | 8.13 | 7.84 | 7 | 0.2885 | 9.18 | 9.25 | 8.74 |
20 | 14 | 0.0066 | 67.35 | 50.68 | 24.76 | 7 | 0.1879 | 9.51 | 10.22 | 9.37 | 8 | 0.1185 | 8.18 | 7.94 | 7.73 |
filter<1,multiblock<uint64_t,K>> |
filter<1,multiblock<uint64_t,K>,1> |
filter<1,fast_multiblock32<K>> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4515 | 3.04 | 2.81 | 3.48 | 5 | 2.3208 | 3.48 | 3.91 | 3.67 | 5 | 2.7234 | 3.06 | 3.46 | 3.47 |
12 | 8 | 0.4244 | 7.57 | 7.39 | 7.99 | 8 | 0.3758 | 6.95 | 8.08 | 9.22 | 8 | 0.5407 | 2.73 | 6.67 | 6.46 |
16 | 11 | 0.0776 | 15.16 | 9.92 | 11.60 | 11 | 0.0641 | 15.35 | 12.67 | 11.48 | 11 | 0.1174 | 10.85 | 10.72 | 7.26 |
20 | 13 | 0.0148 | 17.77 | 17.05 | 18.43 | 14 | 0.0120 | 20.02 | 17.36 | 17.71 | 13 | 0.0277 | 11.06 | 13.68 | 8.15 |
filter<1,fast_multiblock32<K>,1> |
filter<1,fast_multiblock64<K>> |
filter<1,fast_multiblock64<K>,1> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4625 | 3.24 | 4.32 | 3.19 | 5 | 2.4515 | 3.67 | 4.58 | 4.33 | 5 | 2.3208 | 3.24 | 4.29 | 4.17 |
12 | 8 | 0.4428 | 5.93 | 5.95 | 4.54 | 8 | 0.4244 | 7.68 | 8.47 | 9.15 | 8 | 0.3758 | 4.12 | 4.68 | 4.52 |
16 | 11 | 0.0866 | 7.36 | 7.47 | 5.01 | 11 | 0.0776 | 9.48 | 8.73 | 8.70 | 11 | 0.0641 | 9.46 | 8.53 | 8.50 |
20 | 13 | 0.0180 | 9.46 | 10.42 | 5.96 | 13 | 0.0148 | 14.29 | 13.25 | 13.52 | 14 | 0.0120 | 15.82 | 13.63 | 13.47 |
VS 2022, x64
filter<K> |
filter<1,block<uint64_t,K>> |
filter<1,block<uint64_t,K>,1> |
|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 6 | 2.1566 | 14.84 | 13.05 | 17.91 | 4 | 3.3462 | 7.06 | 4.49 | 4.50 | 5 | 3.0448 | 8.11 | 5.74 | 5.85 |
12 | 9 | 0.3146 | 25.18 | 20.59 | 18.58 | 5 | 1.0310 | 9.13 | 5.44 | 5.50 | 6 | 0.8244 | 10.50 | 7.77 | 6.62 |
16 | 11 | 0.0456 | 36.55 | 39.31 | 19.46 | 6 | 0.4035 | 13.40 | 7.31 | 7.28 | 7 | 0.2885 | 12.04 | 8.91 | 14.47 |
20 | 14 | 0.0066 | 83.30 | 83.93 | 24.98 | 7 | 0.1879 | 16.31 | 12.64 | 15.82 | 8 | 0.1185 | 20.81 | 15.83 | 15.73 |
filter<1,multiblock<uint64_t,K>> |
filter<1,multiblock<uint64_t,K>,1> |
filter<1,fast_multiblock32<K>> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4515 | 9.50 | 6.31 | 6.39 | 5 | 2.3208 | 9.61 | 6.45 | 6.43 | 5 | 2.7234 | 3.78 | 4.27 | 4.26 |
12 | 8 | 0.4244 | 15.75 | 10.51 | 11.30 | 8 | 0.3758 | 20.97 | 9.03 | 9.37 | 8 | 0.5407 | 3.52 | 6.14 | 4.50 |
16 | 11 | 0.0776 | 25.58 | 20.31 | 18.44 | 11 | 0.0641 | 27.35 | 15.24 | 19.41 | 11 | 0.1174 | 10.92 | 14.32 | 12.54 |
20 | 13 | 0.0148 | 34.78 | 30.36 | 33.15 | 14 | 0.0120 | 38.87 | 28.78 | 25.22 | 13 | 0.0277 | 14.16 | 19.46 | 13.75 |
filter<1,fast_multiblock32<K>,1> |
filter<1,fast_multiblock64<K>> |
filter<1,fast_multiblock64<K>,1> |
|||||||||||||
c | K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
K | FPR [%] |
ins. | succ. lkp. |
uns. lkp. |
8 | 5 | 2.4625 | 3.67 | 4.18 | 4.23 | 5 | 2.4388 | 5.06 | 6.17 | 5.96 | 5 | 2.3198 | 5.12 | 5.82 | 5.86 |
12 | 8 | 0.4428 | 3.86 | 6.11 | 5.10 | 8 | 0.4190 | 5.78 | 8.72 | 7.16 | 8 | 0.3747 | 7.77 | 7.71 | 6.91 |
16 | 11 | 0.0866 | 6.94 | 8.87 | 8.60 | 11 | 0.0781 | 12.55 | 11.10 | 9.40 | 11 | 0.0651 | 12.32 | 15.23 | 15.45 |
20 | 13 | 0.0180 | 12.22 | 16.96 | 14.46 | 13 | 0.0147 | 18.56 | 24.02 | 18.81 | 14 | 0.0112 | 23.05 | 21.37 | 14.28 |
Reference
<boost/bloom/filter.hpp>
Defines boost::bloom::filter
and associated functions.
namespace boost{ namespace bloom{ template< typename T, std::size_t K, typename Subfilter = block<unsigned char, 1>, std::size_t BucketSize = 0, typename Hash = boost::hash<T>, typename Allocator = std::allocator<T> > class filter; template< typename T, std::size_t K, typename S, std::size_t B, typename H, typename A > bool operator==( const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y); template< typename T, std::size_t K, typename S, std::size_t B, typename H, typename A > bool operator!=( const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y); template< typename T, std::size_t K, typename S, std::size_t B, typename H, typename A > void swap(filter<T, K, S, B, H, A>& x, filter<T, K, S, B, H, A>& y) noexcept(noexcept(x.swap(y))); } // namespace bloom } // namespace boost
Class Template filter
boost::bloom::filter
— A data structure that supports element insertion
and probabilistic lookup, where an element can be determined to be in the filter
with high confidence or else not be with absolute certainty. The probability
that lookup erroneously classifies a non-present element as present is called
the filter’s false positive rate (FPR).
boost::bloom::filter
maintains an internal array of m
bits where m
is the
filter’s capacity. Unlike traditional containers, inserting an
element x
does not store a copy of x
within the filter, but rather results
in a fixed number of bits in the array being set to one, where the positions
of the bits are pseudorandomly produced from the hash value of x
. Lookup
for y
simply checks whether all the bits associated to y
are actually set.
-
For a given filter, the FPR increases as new elements are inserted.
-
For a given number of inserted elements, a filter with higher capacity has a lower FPR.
By convention, we say that a filter is empty if its capacity is zero or all the bits in the internal array are set to zero.
Synopsis
// #include <boost/bloom/filter.hpp> namespace boost{ namespace bloom{ template< typename T, std::size_t K, typename Subfilter = block<unsigned char, 1>, std::size_t BucketSize = 0, typename Hash = boost::hash<T>, typename Allocator = std::allocator<T> > class filter { public: // types and constants using value_type = T; static constexpr std::size_t k = K; using subfilter = Subfilter; static constexpr std::size_t bucket_size = see below; using hasher = Hash; using allocator_type = Allocator; using size_type = std::size_t; using difference_type = std::ptrdiff_t; using reference = value_type&; using const_reference = const value_type&; using pointer = value_type*; using const_pointer = const value_type*; // construct/copy/destroy filter(); explicit filter( size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter( size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type()); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter(const filter& x); filter(filter&& x); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type m, const allocator_type& al); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type n, double fpr, const allocator_type& al); explicit filter(const allocator_type& al); filter(const filter& x, const allocator_type& al); filter(filter&& x, const allocator_type& al); filter( std::initializer_list<value_type> il, size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter( std::initializer_list<value_type> il, size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter(size_type m, const allocator_type& al); filter(size_type n, double fpr, const allocator_type& al); filter( std::initializer_list<value_type> il, size_type m, const allocator_type& al); filter( std::initializer_list<value_type> il, size_type n, double fpr, const allocator_type& al); ~filter(); filter& operator=(const filter& x); filter& operator=(filter&& x) noexcept( std::allocator_traits<Allocator>::is_always_equal::value || std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value); filter& operator=(std::initializer_list<value_type> il); allocator_type get_allocator() const noexcept; // capacity size_type capacity() const noexcept; static size_type capacity_for(size_type n, double fpr); static double fpr_for(size_type n,size_type m) // data access boost::span<unsigned char> array() noexcept; boost::span<const unsigned char> array() const noexcept; // modifiers template<typename... Args> void emplace(Args&&... args); void insert(const value_type& x); template<typename U> void insert(const U& x); template<typename InputIterator> void insert(InputIterator first, InputIterator last); void insert(std::initializer_list<value_type> il); void swap(filter& x) noexcept(std::allocator_traits<Allocator>::is_always_equal::value || std::allocator_traits<Allocator>::propagate_on_container_swap::value); void clear() noexcept; void reset(size_type m = 0); void reset(size_type n, double fpr); filter& operator&=(const filter& x); filter& operator|=(const filter& x); // observers hasher hash_function() const; // lookup bool may_contain(const value_type& x) const; template<typename U> bool may_contain(const U& x) const; }; } // namespace bloom } // namespace boost
Description
Template Parameters
|
The cv-unqualified object type of the elements inserted into the filter. |
|
Number of times the associated subfilter is invoked per element upon insertion or lookup.
|
|
A subfilter type providing the exact algorithm for
bit setting/checking into the filter’s internal array. The subfilter is invoked |
|
Distance in bytes between the initial positions of consecutive subarrays.
If |
|
A Hash type over |
|
An Allocator whose value type is |
Allocation and deallocation of the internal array is done through an internal copy of the
provided allocator. value_type
construction/destruction (which only happens in
emplace
) uses
std::allocator_traits<Allocator>::construct
/destroy
.
If boost::unordered::hash_is_avalanching<Hash>::value
is true
and sizeof(std::size_t) >= 8
,
the hash function is used as-is; otherwise, a bit-mixing post-processing stage
is added to increase the quality of hashing at the expense of extra computational cost.
Types and Constants
static constexpr std::size_t bucket_size;
Equal to BucketSize
if that parameter was specified as distinct from zero.
Otherwise, equal to used-value-size<subfilter>
.
Constructors
Default Constructor
filter();
Constructs an empty filter using hasher()
as the hash function and
allocator_type()
as the allocator.
Preconditions: |
|
Postconditions: |
|
Capacity Constructor
explicit filter( size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter( size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type());
Constructs an empty filter using copies of h
and al
as the hash function and allocator, respectively.
Postconditions: |
|
Iterator Range Constructor
template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type());
Constructs a filter using copies of h
and al
as the hash function and allocator, respectively,
and inserts the values from [first, last)
into it.
Preconditions: |
|
Postconditions: |
|
Copy Constructor
filter(const filter& x);
Constructs a filter using copies of x
's internal array, x.hash_function()
and std::allocator_traits<Allocator>::select_on_container_copy_construction(x.get_allocator())
.
Postconditions: |
|
Move Constructor
filter(filter&& x);
Constructs a filter tranferring x
's internal array to *this
and using
a hash function and allocator move-constructed from x
's hash function
and allocator, respectively.
Postconditions: |
|
Iterator Range Constructor with Allocator
template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type m, const allocator_type& al); template<typename InputIterator> filter( InputIterator first, InputIterator last, size_type n, double fpr, const allocator_type& al);
Allocator Constructor
explicit filter(const allocator_type& al);
Constructs an empty filter using hasher()
as the hash function and
a copy of al
as the allocator.
Preconditions: |
|
Postconditions: |
|
Copy Constructor with Allocator
filter(const filter& x, const allocator_type& al);
Constructs a filter using copies of x
's internal array, x.hash_function()
and al
.
Postconditions: |
|
Move Constructor with Allocator
filter(filter&& x, const allocator_type& al);
Constructs a filter tranferring x
's internal array to *this
if
al == x.get_allocator()
, or using a copy of the array otherwise.
The hash function of the new filter is move-constructed from x
's
hash function and the allocator is a copy of al
.
Postconditions: |
|
Initializer List Constructor
filter( std::initializer_list<value_type> il, size_type m, const hasher& h = hasher(), const allocator_type& al = allocator_type()); filter( std::initializer_list<value_type> il, size_type n, double fpr, const hasher& h = hasher(), const allocator_type& al = allocator_type());
Capacity Constructor with Allocator
filter(size_type m, const allocator_type& al); filter(size_type n, double fpr, const allocator_type& al);
Initializer List Constructor with Allocator
filter( std::initializer_list<value_type> il, size_type m, const allocator_type& al); filter( std::initializer_list<value_type> il, size_type n, double fpr, const allocator_type& al);
Destructor
~filter();
Deallocates the internal array and destructs the internal hash function and allocator.
Assignment
Copy Assignment
filter& operator=(const filter& x);
Let pocca
be std::allocator_traits<Allocator>::propagate_on_container_copy_assignment::value
.
If pocca
, replaces the internal allocator al
with a copy of x.get_allocator()
.
If capacity() != x.capacity()
or pocca && al != x.get_allocator()
, replaces the internal array
with a new one with capacity x.capacity()
.
Copies the values of x
's internal array.
Replaces the internal hash function with a copy of x.hash_function()
.
Preconditions: |
If |
Postconditions: |
|
Returns: |
|
Move Assignment
filter& operator=(filter&& x) noexcept( std::allocator_traits<Allocator>::is_always_equal::value || std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value);
Let pocma
be std::allocator_traits<Allocator>::propagate_on_container_move_assignment::value
.
If pocma
, replaces the internal allocator with a copy of x.get_allocator()
.
If get_allocator() == x.get_allocator()
, transfers x
's internal array to *this
;
otherwise, replaces the internal array with a new one with capacity x.capacity()
and copies the values of x
's internal array.
Replaces the internal hash function with a copy of x.hash_function()
.
Preconditions: |
If |
Postconditions: |
|
Returns: |
|
Initializer List Assignment
filter& operator=(std::initializer_list<value_type> il);
Clears the filter and inserts the values from il
.
Returns: |
|
Capacity
Capacity
size_type capacity() const noexcept;
Postconditions: |
|
Returns: |
The size in bits of the internal array. |
Capacity Estimation
static size_type capacity_for(size_type n, double fpr);
Preconditions: |
|
Postconditions: |
|
Returns: |
An estimation of the capacity required by a |
FPR Estimation
static double fpr_for(size_type n, size_type m);
Postconditions: |
|
Returns: |
An estimation of the resulting false positive rate when
|
Data Access
Array
boost::span<unsigned char> array() noexcept; boost::span<const unsigned char> array() const noexcept;
Postconditions: |
|
Returns: |
A span over the internal array. |
Modifiers
Emplace
template<typename... Args> void emplace(Args&&... args);
Inserts an element constructed from std::forward<Args>(args)...
.
Preconditions: |
|
Insert
void insert(const value_type& x); template<typename U> void insert(const U& x);
If capacity() != 0
, sets to one k * subfilter::k
(not necessarily distinct)
bits of the internal array deterministically selected from the value
hash_function()(x)
.
Postconditions: |
|
Notes: |
The second overload only participates in overload resolution if
|
Insert Iterator Range
template<typename InputIterator> void insert(InputIterator first, InputIterator last);
Equivalent to while(first != last) insert(*first++)
.
Preconditions: |
|
Insert Initializer List
void insert(std::initializer_list<value_type> il);
Equivalent to insert(il.begin(), il.end())
.
Swap
void swap(filter& x) noexcept(std::allocator_traits<Allocator>::is_always_equal::value || std::allocator_traits<Allocator>::propagate_on_container_swap::value);
Let pocs
be std::allocator_traits<Allocator>::propagate_on_container_swap::value
.
Swaps the internal array and hash function with those of x
.
If pocs
, swaps the internal allocator with that of x
.
Reset
void reset(size_type m = 0); void reset(size_type n, double fpr);
First overload: Replaces the internal array if the resulting capacity calculated from m
is not
equal to capacity()
, and clears the filter.
Second overload: Equivalent to reset(capacity_for(n, fpr))
.
Postconditions: |
In general, |
Combine with AND
filter& operator&=(const filter& x);
If capacity() != x.capacity()
, throws a std::invalid_argument
exception;
otherwise, changes the value of each bit in the internal array with the result of
doing a logical AND operation of that bit and the corresponding one in x
.
Returns: |
|
Combine with OR
filter& operator|=(const filter& x);
If capacity() != x.capacity()
, throws an std::invalid_argument
exception;
otherwise, changes the value of each bit in the internal array with the result of
doing a logical OR operation of that bit and the corresponding one in x
.
Returns: |
|
Observers
get_allocator
allocator_type get_allocator() const noexcept;
Returns: |
A copy of the internal allocator. |
Lookup
may_contain
bool may_contain(const value_type& x) const; template<typename U> bool may_contain(const U& x) const;
Returns: |
|
Notes: |
The second overload only participates in overload resolution if
|
Comparison
operator==
template< typename T, std::size_t K, typename S, std::size_t B, typename H, typename A > bool operator==( const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y);
Returns: |
|
operator!=
template< typename T, std::size_t K, typename S, std::size_t B, typename H, typename A > bool operator!=( const filter<T, K, S, B, H, A>& x, const filter<T, K, S, B, H, A>& y);
Returns: |
|
Subfilters
A subfilter implements a specific algorithm for bit setting (insertion) and
bit checking (lookup) for boost::bloom::filter
. Subfilters operate
on portions of the filter’s internal array called subarrays. The
exact width of these subarrays is statically dependent on the subfilter type.
The full interface of a conforming subfilter is not exposed publicly, hence users can’t provide their own subfilters and may only use those natively provided by the library. What follows is the publicly available interface.
Subfilter::k
Result: |
A compile-time |
typename Subfilter::value_type
Result: |
A cv-unqualified, TriviallyCopyable type to which the subfilter projects assigned subarrays. |
Subfilter::used_value_size
Result: |
A compile-time |
Postconditions: |
Greater than zero and not greater than |
Notes: |
Optional. |
used-value-size
template<typename Subfilter> constexpr std::size_t used-value-size; // exposition only
used-value-size<Subfilter>
is Subfilter::used_value_size
if this nested
constant exists, or sizeof(Subfilter::value_type)
otherwise.
The value is the effective size in bytes of the subarrays upon which a
given subfilter operates.
<boost/bloom/block.hpp>
namespace boost{ namespace bloom{ template<typename Block, std::size_t K> struct block; } // namespace bloom } // namespace boost
Class Template block
boost::bloom::block
— A subfilter over an integral type.
Synopsis
// #include <boost/bloom/block.hpp> namespace boost{ namespace bloom{ template<typename Block, std::size_t K> struct block { static constexpr std::size_t k = K; using value_type = Block; // the rest of the interface is not public } // namespace bloom } // namespace boost
Description
Template Parameters
|
An unsigned integral type. |
|
Number of bits set/checked per operation. Must be greater than zero. |
<boost/bloom/multiblock.hpp>
namespace boost{ namespace bloom{ template<typename Block, std::size_t K> struct multiblock; } // namespace bloom } // namespace boost
Class Template multiblock
boost::bloom::multiblock
— A subfilter over an array of an integral type.
Synopsis
// #include <boost/bloom/multiblock.hpp> namespace boost{ namespace bloom{ template<typename Block, std::size_t K> struct multiblock { static constexpr std::size_t k = K; using value_type = Block[k]; // the rest of the interface is not public } // namespace bloom } // namespace boost
Description
Template Parameters
|
An unsigned integral type. |
|
Number of bits set/checked per operation. Must be greater than zero. |
Each of the K
bits set/checked is located in a different element of the
Block[K]
array.
<boost/bloom/fast_multiblock32.hpp>
namespace boost{ namespace bloom{ template<std::size_t K> struct fast_multiblock32; } // namespace bloom } // namespace boost
Class Template fast_multiblock32
boost::bloom::fast_multiblock32
— A faster replacement of
multiblock<std::uint32_t, K>
.
Synopsis
// #include <boost/bloom/fast_multiblock32.hpp> namespace boost{ namespace bloom{ template<std::size_t K> struct fast_multiblock32 { static constexpr std::size_t k = K; using value_type = implementation-defined; // might not be present static constexpr std::size_t used_value_size = implementation-defined; // the rest of the interface is not public } // namespace bloom } // namespace boost
Description
Template Parameters
|
Number of bits set/checked per operation. Must be greater than zero. |
fast_multiblock32<K>
is statistically equivalent to
multiblock<std::uint32_t, K>
, but takes advantage
of selected SIMD technologies, when available at compile time, to perform faster.
Currently supported: AVX2, little-endian Neon, SSE2.
The non-SIMD case falls back to regular multiblock
.
used-value-size<fast_multiblock32<K>>
is
4 * K
.
<boost/bloom/fast_multiblock64.hpp>
namespace boost{ namespace bloom{ template<std::size_t K> struct fast_multiblock64; } // namespace bloom } // namespace boost
Class Template fast_multiblock64
boost::bloom::fast_multiblock64
— A faster replacement of
multiblock<std::uint64_t, K>
.
Synopsis
// #include <boost/bloom/fast_multiblock64.hpp> namespace boost{ namespace bloom{ template<std::size_t K> struct fast_multiblock64 { static constexpr std::size_t k = K; using value_type = implementation-defined; // might not be present static constexpr std::size_t used_value_size = implementation-defined; // the rest of the interface is not public } // namespace bloom } // namespace boost
Description
Template Parameters
|
Number of bits set/checked per operation. Must be greater than zero. |
fast_multiblock64<K>
is statistically equivalent to
multiblock<std::uint64_t, K>
, but takes advantage
of selected SIMD technologies, when available at compile time, to perform faster.
Currently supported: AVX2.
The non-SIMD case falls back to regular multiblock
.
used-value-size<fast_multiblock64<K>>
is
8 * K
.
Appendix A: FPR Estimation
For a classical Bloom filter, the theoretical false positive rate, under some simplifying assumptions, is given by
\(\text{FPR}(n,m,k)=\left(1 - \left(1 - \displaystyle\frac{1}{m}\right)^{kn}\right)^k \approx \left(1 - e^{-kn/m}\right)^k\) for large \(m\),
where \(n\) is the number of elements inserted in the filter, \(m\) its capacity in bits and \(k\) the number of bits set per insertion (see a derivation of this formula). For a given inverse load factor \(c=m/n\), the optimum \(k\) is the integer closest to:
\(k_{\text{opt}}=c\cdot\ln2,\)
yielding a minimum attainable FPR of \(1/2^{k_{\text{opt}}} \approx 0.6185^{c}\).
In the case of filter of the form boost::bloom::filter<T, K, block<Block, K'>>
, we can extend
the approach from Putze et al.
to derive the (approximate but very precise) formula:
\(\text{FPR}_{\text{block}}(n,m,b,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}(i,nbk/m) \cdot \text{FPR}(i,b,k')\right)^{k},\)
where
\(\text{Pois}(i,\lambda)=\displaystyle\frac{\lambda^i e^{-\lambda}}{i!}\)
is the probability mass function of a Poisson distribution
with mean \(\lambda\), and \(b\) is the size of Block
in bits. If we’re using multiblock<Block,K'>
, we have
\(\text{FPR}_\text{multiblock}(n,m,b,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}(i,nbkk'/m) \cdot \text{FPR}(i,b,1)^{k'}\right)^{k}.\)
As we have commented before, in general
\(\text{FPR}_\text{block}(n,m,b,k,k') \geq \text{FPR}_\text{multiblock}(n,m,b,k,k') \geq \text{FPR}(n,m,kk'),\)
that is, block and multiblock filters have worse FPR than the classical filter for the same number of bits set per insertion, but they will be faster. We have the particular case
\(\text{FPR}_{\text{block}}(n,m,b,k,1)=\text{FPR}_{\text{multiblock}}(n,m,b,k,1)=\text{FPR}(n,m,k),\)
which follows simply from the observation that using {block|multiblock}<Block, 1>
behaves exactly as
a classical Bloom filter.
We don’t know of any closed, simple formula for the FPR of block and multiblock filters when
Bucketsize
is not its "natural" size used-value-size<Subfilter>
,
that is, when subfilter subarrays overlap.
We can use the following approximations (\(s\) = BucketSize
in bits):
\(\text{FPR}_{\text{block}}(n,m,b,s,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}\left(i,\frac{n(2b-s)k}{m}\right) \cdot \text{FPR}(i,2b-s,k')\right)^{k},\)
\(\text{FPR}_\text{multiblock}(n,m,b,s,k,k')=\left(\displaystyle\sum_{i=0}^{\infty} \text{Pois}\left(i,\frac{n(2bk'-s)k}{m}\right) \cdot \text{FPR}\left(i,\frac{2bk'-s}{k'},1\right)^{k'}\right)^{k},\)
where the replacement of \(b\) with \(2b-s\) (or \(bk'\) with \(2bk'-s\) for multiblock filters) accounts for the fact that the window of hashing positions affecting a particular bit spreads due to overlapping. Note that the formulas reduce to the non-ovelapping case when \(s\) takes its default value (\(b\) for block, \(bk'\) for multiblock). These approximations are acceptable for low values of \(k'\) but tend to underestimate the actual FPR as \(k'\) grows. In general, the use of overlapping improves (decreases) FPR by a factor ranging from 0.6 to 0.9 for typical filter configurations.
\(\text{FPR}_{\text{block}}(n,m,b,s,k,k')\) and \(\text{FPR}_\text{multiblock}(n,m,b,s,k,k')\)
are the formulas used by the implementation of
boost::filter::fpr_for
.
Appendix B: Implementation Notes
Hash Mixing
This is the bit-mixing post-process we use to improve the statistical properties of the hash function when it doesn’t have the avalanching property:
\(m\leftarrow\text{mulx}(h,C)\),
\(h'\leftarrow\text{high}(m)\text{ xor }\text{low}(m)\),
where \(\text{mulx}\) denotes 128-bit multiplication of two 64-bit factors, \(\text{high}(m)\) and \(\text{low}(m)\) are the high and low 64-bit words of \(m\), respectively, \(C=\lfloor 2^{64}/\varphi \rfloor\) and \(\varphi\) is the golden ratio.
32-bit mode
Internally, we always use 64-bit hash values even if in 32-bit mode, where the user-provided hash function produces 32-bit outputs. To expand a 32-bit hash value to 64 bits, we use the same mixing procedure described above.
Dispensing with Multiple Hash Functions
Direct implementations of a Bloom filter with \(k\) bits per operation require \(k\) different and independent hash functions \(h_i(x)\), which incurs an important performance penalty, particularly if the objects are expensive to hash (e.g. strings). Kirsch and Mitzenmacher show how to relax this requirement down to two different hash functions \(h_1(x)\) and \(h_2(x)\) linearly combined as
\(g_i(x)=h_1(x)+ih_2(x).\)
Without formal justification, we have relaxed this even further to just one
initial hash value \(h_0=h_0(x)\), where new values
\(h_i\) are computed from \(h_{i-1}\)
by means of very cheap mixing schemes. In what follows
\(k\), \(k'\) are the homonym values
in a filter of the form boost::bloom::filter<T, K, {block|multiblock}<Block, K'>>
,
\(b\) is sizeof(Block) * CHAR_BIT
,
and \(r\) is the number of buckets in the filter.
Bucket Location
To produce a location (i.e. a number \(p\) in \([0,r)\)) from \(h_{i-1}\), instead of the straightforward but costly procedure \(p\leftarrow h_{i-1}\bmod r\) we resort to Lemire’s fastrange technique. Moreover, we combine this calculation with the production of \(h_{i}\) from \(h_{i-1}\) as follows:
\(m\leftarrow\text{mulx}(h_{i-1},r),\)
\(p\leftarrow\lfloor m/2^{64} \rfloor=\text{high}(m),\)
\(h_i\leftarrow m \bmod 2^{64}=\text{low}(m).\)
The transformation \(h_{i-1} \rightarrow h_i\) is a simple multiplicative congruential generator over \(2^{64}\). For this MCG to produce long cycles, \(h_0\) must be odd and the multiplicative constant \(r\) must be \(\equiv \pm 3 \text{ (mod 8)}\): to meet these requirements, the implementation adjusts \(h_0\) to \(h_0'\) and \(r\) to \(r'\). This renders the least significant bit of \(h_i\) unsuitable for pseudorandomization (it is always one).
Bit selection
Inside a subfilter, we must produce \(k'\) values from \(h_i\) in the range \([0,b)\) (the positions of the \(k'\) bits). We do this by successively taking \(\log_2b\) bits from \(h_i\) without utilizing the portion containing its least significant bit (which is always one as we have discussed). If we run out of bits (which happens when \(k'> 63/\log_2b\)), we produce a new hash value \(h_{i+1}\) from \(h_{i}\) using the mixing procedure already described.
SIMD algorithms
fast_multiblock32
When using AVX2, we select up to 8 bits at a time by creating
a __m256i
of 32-bit values \((x_0,x_1,...,x_7)\)
where each \(x_i\) is constructed from
a different 5-bit portion of the hash value, and calculating from this
the __m256i
\((2^{x_0},2^{x_1},...,2^{x_7})\)
with _mm256_sllv_epi32
.
If more bits are needed, we generate a new hash value as
described before and repeat.
For little-endian Neon, the algorithm is similar but the computations
are carried out with two uint32x4_t
s in parallel as Neon does not have
256-bit registers.
In the case of SSE2, we don’t have the 128-bit equivalent of
_mm256_sllv_epi32
, so we use the following, mildly interesting
technique: a __m128i
of the form
\(((x_0+127)\cdot 2^{23},(x_1+127)\cdot 2^{23},(x_2+127)\cdot 2^{23},(x_3+127)\cdot 2^{23}),\)
where each \(x_i\) is in \([0,32)\),
can be reinterpret_cast
ed to (i.e., has the same binary representation as)
the __m128
(register of float
s)
\((2^{x_0},2^{x_1},2^{x_2},2^{x_3}),\)
from which our desired __m128i
of shifted 1s can be obtained
with _mm_cvttps_epi32
.
fast_multiblock64
We only provide a SIMD implementation for AVX2 that relies in two
parallel __m256i
s for the generation of up
to 8 64-bit values with shifted 1s. For Neon and SSE2, emulation
through 4 128-bit registers proved slower than non-SIMD multiblock<uint64_t, K>
.
Release Notes
Boost 1.xx
-
Initial release.
Copyright and License
Of this documentation:
-
Copyright © 2025 Joaquín M López Muñoz
Distributed under the Boost Software License, Version 1.0.