HashTable

class hirola.HashTable(max, dtype, almost_full=(0.9, 'warn'))

The raw core behind a set or a dictionary’s keys.

A hash table resembles a dictionary where its keys are the keys array but the values are an just an enumeration. It’s core API:

  • Use add to add new keys if they’ve not been already added.

  • The keys lists all keys that have been added in the order that they were added.

  • Use get to retrieve indices of keys in keys.

__init__(max, dtype, almost_full=(0.9, 'warn'))[source]
Parameters:
  • max (Number) – An upper bound for the number of keys which can fit in this table. Sets the max attribute.

  • dtype (Union[dtype, generic, type, str, list, tuple]) – The data type for the table’s keys. Sets the dtype attribute.

  • almost_full – The handling of almost full hash tables. Sets the almost_full attribute.

The max parameter is silently normalised to int and clipped to a minimum of 1 if it is less than 1. Zero length tables are not allowed.

add(keys)[source]

Add keys to the table.

Any key which is already in keys is not added again. Returns the index of each key in keys similarly to get.

Raises:
Warns:

exceptions.AlmostFull – If the table becomes nearly full and is configured to warn (set by the almost_full attribute).

Return type:

ndarray

contains(keys)[source]

Check if a key or keys are in the table.

Parameters:

keys – Elements to check for.

Return type:

Union[bool, ndarray]

Returns:

Either true or false for each key in keys.

This function is equivalent to but faster than table.get(keys) != -1. To check only one key you may also use key in table.

copy(usable=True)[source]

Deep copy this table.

Parameters:

usable – If set to false and this table has called destroy, then the destroyed state is propagated to copies.

Return type:

HashTable

Returns:

Another HashTable with the same size, dtype and content.

destroy()[source]

Release a writable version of keys and permanently disable this table.

Return type:

ndarray

Returns:

A writable shallow copy of keys.

Modifying keys would corrupt the hash table’s internal storage and is therefore blocked (analogous to dict’s blocking of list keys) by setting keys.flags.writeable = False. However, if you no longer need this table then it is safe to do as you please with the keys array. Calling this function releases a writable view of keys but blocks you from adding to or getting from this table in the future. If you want both a writable keys array and functional use of this table then use table.keys.copy().

get(keys, default=-1)[source]

Lookup indices of keys in keys.

Parameters:
  • keys – Elements to search for.

  • default – Returned in place of a missing key. May be any object.

Return type:

ndarray

Returns:

The index/indices of keys in this table’s keys. If a key is not there, returns -1 in its place.

Raises:
resize(new_size, in_place=False)[source]

Copy the contents of this table into a new HashTable of a different size.

Parameters:
  • new_size – The new value for max.

  • in_place – If true resize this table. Otherwise make a modified copy.

Return type:

HashTable

Returns:

Either a new hash table with attributes keys and length matching those from this table or this table.

Raises:

ValueError – If requested size is too small (new_size < len(table)).

Note that there is no performance benefit to resizing in place.

Changed in version 0.3.0: Add the in_place option.

property almost_full

The response to an almost full hash table. Hash tables become dramatically slower, the closer they get to being full. Hirola’s default behaviour is to warn if this happens but can be configured to ignore the warning, raise an error or automatically make a new, larger table.

This is an overloaded parameter.

  • almost_full = None:

    Disable the almost full warning entirely.

  • almost_full = (0.8, "warn"):

    Issue a hirola.exceptions.AlmostFull warning if the table reaches 80% full.

  • almost_full = (0.8, "raise"):

    Raise a hirola.exceptions.AlmostFull exception if the table reaches 80% full.

  • almost_full = (0.7, 2):

    Whenever the table reaches 70% full, double the table size.

For reference, Python’s dict grows 8-fold when two thirds full. To mimic this behaviour, set table.almost_full = (2 / 3, 8).

property dtype: dtype

The data type for the table’s keys.

Use a structured numpy.dtype to indicate that several numbers make up a single key.

Examples

  • Individual float keys: HashTable(100, np.float64).

  • Triplets of floats keys: HashTable(100, (np.float64, 3)).

  • Strings of up to 20 characters: HashTable(100, (str, 20)).

  • Mixed types records data: HashTable(100, [("firstname", str, 20), ("lastname", str, 20), ("age", int)]).

property key_size: int

The number of bytes per key.

property keys: ndarray

The unique elements in the table.

Unlike Python’s builtin dict, this keys is a property rather than a method. Keys must be immutable hence this array is a readonly view. See destroy to release the writable version.

property length: int

The number of elements currently in this table. Aliased via len(table).

property max: int

The maximum number of elements allowed in this table.

Adding keys to exceed this maximum will trigger a HashTableFullError. Nearly full tables will add and get much slower.