public class BloomFilter extends Object implements IBloomFilter, Externalizable
Modifier and Type  Class and Description 

static class 
BloomFilter.BloomFilterCounters
Counters for bloom filter access and notification of false positives.

Modifier and Type  Field and Description 

BloomFilter.BloomFilterCounters 
counters
Counters are not persistent.

Constructor and Description 

BloomFilter()
Deserialization ctor.

BloomFilter(int n,
double p)
Ctor specifies
maxN := n * 2 . 
BloomFilter(int n,
double p,
int maxN) 
Modifier and Type  Method and Description 

boolean 
add(byte[] key)
Adds the key to the filter.

boolean 
contains(byte[] key)
Test the filter for the key a
true return DOES NOT
guarantee that the key has been added to the filter while a
false return guarantees that the key HAS NOT been added to
the filter. 
long 
disable()
Disables the bloom filter associated with the index.

void 
falsePos()
Notify the bloom filter that a false positive was observed for a key that
for which
IBloomFilter.add(byte[]) reported true (the key was
in fact not in the index). 
long 
getAddr()
Address that can be used to read this object from the store.

long 
getBitLength()
The bit length of the filter.

static long 
getBitLength(int hashFunctionCount,
int nentries)
Return the bit length required to provision a filter having the specified
#of hash functions and the target capacity.

static int 
getEntryCountForErrorRate(int k,
long m,
double p)
This returns the #of index entries at which the bloom filter will have
the specified expected error rate.

double 
getErrorRate()
The false positive error rate estimated as
2^d , where
d is the #of hash functions. 
int 
getHashFunctionCount()
The #of hash functions used by the filter.

static int 
getHashFunctionCount(double errorRate)
Return the #of hash functions required to achieve the specified error
rate.

int 
getMaxN()
The #of index entries at which the filter will have reached its maximum
error rate (from the ctor).

int 
getN()
The expected #of index entries (from the ctor).

double 
getP()
The target error rate when there are
getN() index entries (from
the ctor). 
boolean 
isDirty()
Return
true iff the state of the filter has been modified
but not yet written onto the store. 
boolean 
isEnabled()
Return
true unless the bloom filter has been disabled. 
static BloomFilter 
read(IRawStore store,
long addr)
Read a bloom filter record from the store.

void 
readExternal(ObjectInput in)

String 
toString() 
long 
write(IRawStore store)
Writes the bloom filter on the store and clears the
isDirty()
flag. 
void 
writeExternal(ObjectOutput out) 
public transient BloomFilter.BloomFilterCounters counters
public BloomFilter()
public BloomFilter(int n, double p)
maxN := n * 2
.n
 The expected #of index entries.p
 The target error rate.public BloomFilter(int n, double p, int maxN)
n
 The expected #of index entries.p
 The target error rate.maxN
 The #of index entries at which the filter will have reached
its maximum error rate. (The value is normally calculated by
the BloomFilterFactory
.)IllegalArgumentException
 if n is nonpositive.IllegalArgumentException
 unless p lies in (0:1).IllegalArgumentException
 if maxN is LT n.public final int getN()
public final double getP()
getN()
index entries (from
the ctor).
Note: This class does not know the actual false positive error rate.
However, that is tracked by the AbstractBTree.btreeCounters
.
public double getErrorRate()
2^d
, where
d is the #of hash functions. This will be close to but typically
not exactly the same as the value of getP()
specified to the
ctor.
Note: This class does not know the actual false positive error rate.
However, that is tracked by the AbstractBTree.btreeCounters
.
public final int getMaxN()
public final int getHashFunctionCount()
public final long getBitLength()
public static int getHashFunctionCount(double errorRate)
p
is the probability of a false positive and
d
is the #of hash functions, then
p = pow(2, d)and
d = ceil(ln(p) / ln(2))
public static long getBitLength(int hashFunctionCount, int nentries)
bitLength = ceil(nentries * (hashFunctionCount / ln(2)))
hashFunctionCount
 The #of hash functions.nentries
 The target capacity.public static int getEntryCountForErrorRate(int k, long m, double p)
p = (1  (1  1 / m) ˆ kn) ˆ kwhere m is the bit length of the filter, k is the #of hash functions, and n is the #of items that have been inserted into the filter. or approximately
p = (1  e ˆ kn / m) ˆ ksolving for
n
we obtain
n = m ln( 1  pˆ(1/k) ) / k
k
 The #of hash functions.m
 The bit length of the filter.p
 The expected error rate.http://en.wikipedia.org/wiki/Bloom_filter
public boolean add(byte[] key)
IBloomFilter
add
in interface IBloomFilter
key
 The key.true
iff the filter state was unchanged by the
incorporation of this key into the filter.IllegalStateException
 if the filter has been disable()
dpublic boolean contains(byte[] key)
IBloomFilter
true
return DOES NOT
guarantee that the key has been added to the filter while a
false
return guarantees that the key HAS NOT been added to
the filter.contains
in interface IBloomFilter
key
 The key.true
if the filter has either that key or some key
that is hash equivalent to that key using the hashing function
imposed by the filter; false
iff the filter can
guarantee that the key has not been added to the filter.IllegalStateException
 if the filter has been disable()
dpublic final long getAddr()
Note: This is not a persistent property. However the value is set when the record is read from, or written on, the store.
public static BloomFilter read(IRawStore store, long addr)
store
 the store.addr
 the address of the bloom filter record.public final boolean isDirty()
true
iff the state of the filter has been modified
but not yet written onto the store. The filter is presumed clean when
created or when it is read from the store. The filter will remain clean
until add(byte[])
returns true
, indicating that
the state of the filter has been changed.public long write(IRawStore store)
isDirty()
flag.
Note: This also sets the address on addr
as a sideeffect, but
the address is NOT written into the store since it is not available until
after the record has been serialized.
Note: This method DOES NOT test isDirty()
.
store
 The store.IllegalStateException
 if the filter is not dirty.IllegalStateException
 if the filter is not enabled.public final long disable()
Note: This method is invoked by AbstractBTree.insert(byte[], byte[])
when
the #of index entries exceeds the maximum allowed for the bloom filter.
At that point the BTree
is dirty. Checkpoint
will notice
that the bloom filter is disabled will write its address as 0L so the
bloom filter is no longer reachable from the postcheckpoint record.
public final boolean isEnabled()
true
unless the bloom filter has been disabled.
Note: A bloom filter may be disabled is the #of index entries has exceeded the maximum desired error rate for the bloom filter.
public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
addr
is set to 0L
, the
dirty
flag is cleared, and enabled
flag is set. It's
necessary to override serialization otherwise java will default the
enabled
flag to false
when an object is
deserialized  whoops!readExternal
in interface Externalizable
in
 IOException
ClassNotFoundException
public void writeExternal(ObjectOutput out) throws IOException
writeExternal
in interface Externalizable
out
 IOException
public void falsePos()
IBloomFilter
IBloomFilter.add(byte[])
reported true
(the key was
in fact not in the index). This method exists solely for reporting and
tracking the actual error rate of the bloom filter.falsePos
in interface IBloomFilter
Copyright © 2006–2016 SYSTAP, LLC DBA Blazegraph. All rights reserved.