BAM
Abstract Machine for Bottom-Up Evaluation with the Push Method
Public Member Functions | List of all members
set_e_c< T > Class Template Reference

#include <set_e.h>

Inheritance diagram for set_e_c< T >:
rel_c

Public Member Functions

 set_e_c (str_t set_e_name)
 
bool insert (T elem)
 
bool contains (T elem) const
 
int hash_size () const
 
int num_buckets () const
 
int num_overflow () const
 
int size_value (relsize_t relsize) const
 
str_t check () const
 
void dump (str_t headline=STR_NULL) const
 
- Public Member Functions inherited from rel_c
str_t name () const
 
bool mem_err () const
 
int num_rows () const
 
int num_cols () const
 
int bound_args ()
 
int free_args ()
 
str_t impl_name ()
 
int num_pages () const
 
void dump (str_t headline=STR_NULL) const
 

Additional Inherited Members

- Protected Member Functions inherited from rel_c
 rel_c (str_t rel_name, int bound, int free, str_t impl)
 
- Protected Attributes inherited from rel_c
mpool_c MemPool
 
str_t Name
 
bool MemErr
 
int NumRows
 
int BoundArgs
 
int FreeArgs
 
str_t ImplName
 

Detailed Description

template<class T>
class set_e_c< T >

Implementation of a set data structure with an extendible hash table.

This is an implementation of a set data structure, i.e. a relation that is accessed with the binding pattern "bb...b" (all arguments bound). The set data structure mainly offers an

insert

method for inserting a tuple into the relation, and a

contains

method for checking whether a given tuple is contained in the set. The

insert

method also contains an element check: It returns

false

if the tuple cannot be inserted because it is already contained in the set, and

true

if the insertion was successful. There is no method for deleting elements.

This implementation uses an extendible hash table.

Note that any reliable implementation must handle special cases like several tuples returning the same hash value (more than fitting into one bucket) or the hash table cannot be extended further. Our implementation does have a strict upper limit on the hash table size. Under bad circumstances, it uses a chain of overflow buckets. Overflow buckets are used only as a last resort, when the hash table has already grown to its maximal size and no further splitting of the bucket is possible. Therefore, once a hash table entry does contain a chain of overflow buckets, we do not need to program any further splitting of it, but simply add more buckets to the chain. We are out of luck in this case anyway. On the other hand, we do have to split the same bucket iteratively, if a single split did not yield a better distribution of the elements (this should be a very uncommon case, though).

The alternative would have been to split only once per insertion, and use an overflow bucket if it does not work. When the next element is inserted, we try again to split. In this case there could be overflow chains that are removed after another insertion. This is NOT the way we do it here.

Note that in both cases empty buckets are possible if the split does not lead to a redistribution of entries. However, only a logarithmic number of them can be produced in a single insertion. We cannot simply remove the empty bucket and replace it by a null pointer because the information about the local bit depth is important.

Of course, all this are border cases, which occur seldom. However, the code has to handle them.


The documentation for this class was generated from the following files: