BAM
Abstract Machine for Bottom-Up Evaluation with the Push Method
#include <set_e.h>
Public Member Functions
    set_e_c (str_t set_e_name)
    bool insert (T elem)
    bool contains (T elem) const
    int hash_size () const
    int num_buckets () const
    int num_overflow () const
    int size_value (relsize_t relsize) const
    str_t check () const
    void dump (str_t headline=STR_NULL) const

Public Member Functions inherited from rel_c
    str_t name () const
    bool mem_err () const
    int num_rows () const
    int num_cols () const
    int bound_args ()
    int free_args ()
    str_t impl_name ()
    int num_pages () const
    void dump (str_t headline=STR_NULL) const

Additional Inherited Members

Protected Member Functions inherited from rel_c
    rel_c (str_t rel_name, int bound, int free, str_t impl)

Protected Attributes inherited from rel_c
    mpool_c MemPool
    str_t Name
    bool MemErr
    int NumRows
    int BoundArgs
    int FreeArgs
    str_t ImplName
Implementation of a set data structure with an extendible hash table.
This is an implementation of a set data structure, i.e. a relation that is accessed with the binding pattern "bb...b" (all arguments bound). The set data structure mainly offers an insert method for inserting a tuple into the relation, and a contains method for checking whether a given tuple is contained in the set. The insert method also performs an element check: it returns false if the tuple cannot be inserted because it is already contained in the set, and true if the insertion was successful. There is no method for deleting elements.
This implementation uses an extendible hash table.
Note that any reliable implementation must handle special cases, such as several tuples hashing to the same value (more than fit into one bucket), or a hash table that cannot be extended further. Our implementation does have a strict upper limit on the hash table size; under bad circumstances, it uses a chain of overflow buckets. Overflow buckets are used only as a last resort, when the hash table has already grown to its maximal size and no further splitting of the bucket is possible. Therefore, once a hash table entry contains a chain of overflow buckets, we do not need to program any further splitting of it, but simply add more buckets to the chain; we are out of luck in this case anyway. On the other hand, we do have to split the same bucket iteratively if a single split does not yield a better distribution of the elements (this should be a very uncommon case, though).
The alternative would have been to split only once per insertion, and to use an overflow bucket if that does not suffice. When the next element is inserted, we try again to split. With this strategy, overflow chains could exist temporarily and be removed again after a later insertion. This is NOT the way we do it here.
Note that in both cases empty buckets are possible if the split does not lead to a redistribution of entries. However, only a logarithmic number of them can be produced in a single insertion. We cannot simply remove the empty bucket and replace it by a null pointer because the information about the local bit depth is important.
Of course, all of these are border cases, which occur seldom. However, the code has to handle them.