acres.fastngram package

Package containing a full in-memory implementation of n-gram matching.

Submodules

acres.fastngram.fastngram module

A faster version of n-gram matching that uses dictionaries for speed-up.

class acres.fastngram.fastngram.CenterMap

Bases: object

A map of center words to contexts.

add(center, left_context, right_context, freq)

Add a center n-gram with a context.

Parameters
  • center (str) –

  • left_context (str) –

  • right_context (str) –

  • freq (int) –

Return type

None

Returns

contexts(center)

Find contexts for a given center word.

Parameters

center

Returns
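
As a rough illustration of the idea, and not the package's actual implementation, a center-to-context map can be sketched with nested dictionaries. The class name SketchCenterMap, the accumulation of frequencies on repeated adds, the return shape of contexts() (a list of (left, right, freq) tuples), and the sample n-grams are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SketchCenterMap:
    """Hypothetical stand-in: maps a center n-gram to the contexts it occurs in."""

    def __init__(self) -> None:
        # center -> {(left_context, right_context): frequency}
        self._map: Dict[str, Dict[Tuple[str, str], int]] = defaultdict(dict)

    def add(self, center: str, left_context: str, right_context: str, freq: int) -> None:
        # Assumption: repeated adds of the same context accumulate their frequencies.
        key = (left_context, right_context)
        self._map[center][key] = self._map[center].get(key, 0) + freq

    def contexts(self, center: str) -> List[Tuple[str, str, int]]:
        # .get avoids creating an empty entry for an unknown center.
        found = self._map.get(center, {})
        return [(left, right, freq) for (left, right), freq in found.items()]


# Made-up sample data, for illustration only.
cmap = SketchCenterMap()
cmap.add("EKG", "das", "zeigt", 5)
cmap.add("EKG", "im", "keine", 3)
print(cmap.contexts("EKG"))
```

A dictionary keyed by the center word makes the lookup a single hash access, which is presumably the kind of speed-up the module description refers to.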

class acres.fastngram.fastngram.ContextMap

Bases: object

A map of contexts to center words.

add(center, left_context, right_context, freq)

Add a center n-gram with a context.

Parameters
  • center (str) –

  • left_context (str) –

  • right_context (str) –

  • freq (int) –

Return type

None

Returns

centers(left_context, right_context)

Find center n-grams that happen on a given context.

Parameters
  • left_context

  • right_context

Returns
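
The inverse direction can be sketched the same way: a dictionary keyed by the (left, right) context pair that stores the center n-grams seen in that context. This is a minimal sketch under assumptions, not the package's code; the class name SketchContextMap, the accumulation of frequencies, the dict return value of centers(), and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


class SketchContextMap:
    """Hypothetical stand-in: maps a (left, right) context to its center n-grams."""

    def __init__(self) -> None:
        # (left_context, right_context) -> {center: frequency}
        self._map: Dict[Tuple[str, str], Dict[str, int]] = defaultdict(dict)

    def add(self, center: str, left_context: str, right_context: str, freq: int) -> None:
        # Assumption: repeated adds of the same center accumulate their frequencies.
        bucket = self._map[(left_context, right_context)]
        bucket[center] = bucket.get(center, 0) + freq

    def centers(self, left_context: str, right_context: str) -> Dict[str, int]:
        # Return a copy so callers cannot mutate the internal state.
        return dict(self._map.get((left_context, right_context), {}))


# Made-up sample data, for illustration only.
ctx = SketchContextMap()
ctx.add("Elektrokardiogramm", "das", "zeigt", 4)
ctx.add("EKG", "das", "zeigt", 5)
ctx.add("Elektrokardiogramm", "das", "zeigt", 2)
print(ctx.centers("das", "zeigt"))
```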

acres.fastngram.fastngram.baseline(acronym, left_context='', right_context='')

A baseline method that expands only with unigrams.

Parameters
  • acronym (str) –

  • left_context (str) –

  • right_context (str) –

Return type

Iterator[str]

Returns

acres.fastngram.fastngram.create_map(ngrams, model, partition=0)

Create a search-optimized representation of an n-gram list.

Parameters
  • ngrams

  • model

  • partition

Return type

Union[ContextMap, CenterMap]

Returns

acres.fastngram.fastngram.fastngram(acronym, left_context='', right_context='', min_freq=2, max_rank=100000)

Find an unlimited set of expansion candidates for an acronym given its left and right context. Note that no filtering is done here, except for the partitioning by acronym initial.

Parameters
  • acronym (str) –

  • left_context (str) –

  • right_context (str) –

  • min_freq (int) –

  • max_rank (int) –

Return type

Iterator[str]

Returns
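
One plausible reading of this lookup is sketched below. It is not the package's implementation: the function name sketch_context_lookup, the toy data, and the interpretation of the parameters are assumptions. Here min_freq is taken to drop centers seen fewer than that many times, max_rank to cap how many frequency-ranked centers are inspected, and the initial partitioning to keep only centers that start with the same letter as the acronym.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical toy data: (left_context, right_context) -> {center: frequency}
CONTEXT_TO_CENTERS: Dict[Tuple[str, str], Dict[str, int]] = {
    ("das", "zeigt"): {
        "Labor": 7,
        "Elektrokardiogramm": 4,
        "Echokardiogramm": 3,
        "Echo": 1,
    },
}


def sketch_context_lookup(
    acronym: str,
    left_context: str = "",
    right_context: str = "",
    min_freq: int = 2,
    max_rank: int = 100000,
) -> Iterator[str]:
    """Yield centers seen in the given context, most frequent first."""
    centers = CONTEXT_TO_CENTERS.get((left_context, right_context), {})
    ranked = sorted(centers.items(), key=lambda item: item[1], reverse=True)
    for rank, (center, freq) in enumerate(ranked):
        if rank >= max_rank:
            break  # assumed meaning of max_rank: inspect at most this many entries
        if freq < min_freq:
            break  # entries are sorted, so all remaining ones are rarer still
        if center[:1].lower() != acronym[:1].lower():
            continue  # assumed meaning of the initial partitioning
        yield center


candidates = list(sketch_context_lookup("EKG", "das", "zeigt"))
print(candidates)
```

Returning a generator mirrors the documented Iterator[str] return type and lets a caller stop consuming candidates early.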

acres.fastngram.fastngram.fasttype(acronym, left_context='', right_context='', min_freq=2, max_rank=100000)

Find an unlimited set of expansion candidates given the training contexts of the acronym. Note that no filtering is done here, except for the partitioning by acronym initial.

Parameters
  • acronym (str) –

  • left_context (str) – Not used.

  • right_context (str) – Not used.

  • min_freq (int) –

  • max_rank (int) –

Return type

Iterator[str]

Returns
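
The type-based lookup can be pictured as a two-step traversal: first collect every context in which the acronym was seen during training, then collect the other center n-grams seen in those same contexts. The sketch below illustrates that idea under stated assumptions. The function name sketch_type_lookup, the toy data, the de-duplication, the skipping of the acronym itself, and the first-letter check are all hypothetical choices, and min_freq and max_rank are left out for brevity.

```python
from typing import Dict, Iterator, List, Set, Tuple

Context = Tuple[str, str]

# Hypothetical toy data standing in for the two maps.
CENTER_TO_CONTEXTS: Dict[str, List[Context]] = {
    "EKG": [("das", "zeigt"), ("im", "keine")],
}
CONTEXT_TO_CENTERS: Dict[Context, List[str]] = {
    ("das", "zeigt"): ["EKG", "Elektrokardiogramm", "Labor"],
    ("im", "keine"): ["EKG", "Elektrokardiogramm", "Echokardiogramm"],
}


def sketch_type_lookup(acronym: str) -> Iterator[str]:
    """Yield centers that share at least one training context with the acronym."""
    seen: Set[str] = set()
    for context in CENTER_TO_CONTEXTS.get(acronym, []):
        for center in CONTEXT_TO_CENTERS.get(context, []):
            if center == acronym or center in seen:
                continue  # assumption: skip the acronym itself and duplicates
            if center[:1].lower() != acronym[:1].lower():
                continue  # assumed meaning of the initial partitioning
            seen.add(center)
            yield center


type_candidates = list(sketch_type_lookup("EKG"))
print(type_candidates)
```

Because the contexts come from the stored training data rather than from the call site, the left_context and right_context arguments are not needed in this style of lookup, which matches their "Not used" note above.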