acres.evaluation package

Package containing evaluation modules.

Submodules

acres.evaluation.evaluation module

Benchmark code. It’s the main entry point for comparing strategies using evaluation metrics such as precision, recall, and F1-score.

class acres.evaluation.evaluation.Level(value)[source]

Bases: enum.Enum

Enum that holds acronym-solving levels.

TOKEN = 1
TYPE = 2
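
A minimal usage sketch of the Level enum. The per-occurrence vs. per-distinct-acronym reading in the comment is the usual token/type interpretation of the member names, not a definition taken from this page.

   from acres.evaluation.evaluation import Level

   # Pick an evaluation granularity. TOKEN is commonly read as scoring every
   # acronym occurrence and TYPE as scoring each distinct acronym once
   # (interpretation of the names, not stated on this page).
   level = Level.TOKEN
   print(level.name, level.value)  # TOKEN 1
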
acres.evaluation.evaluation.analyze(contextualized_acronym, true_expansions, strategy, max_tries)[source]

Analyze a given row of the gold standard.

Parameters
  • contextualized_acronym (Acronym) –

  • true_expansions (Set[str]) –

  • strategy (Strategy) –

  • max_tries (int) –

Return type

Dict[str, bool]

Returns

A dictionary with the keys 'found', 'correct', and 'ignored', each mapping to a boolean.
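
A minimal sketch of how the per-row dictionaries returned by analyze() might be aggregated; it does not call acres, the rows are hand-written illustrations, and excluding ignored rows before counting is an assumption.

   # Illustrative per-row results of the documented {'found', 'correct', 'ignored'} form.
   rows = [
       {"found": True, "correct": True, "ignored": False},
       {"found": True, "correct": False, "ignored": False},
       {"found": False, "correct": False, "ignored": True},
   ]

   # Assumption: rows flagged as ignored are excluded before counting.
   scored = [row for row in rows if not row["ignored"]]
   total_found = sum(row["found"] for row in scored)
   total_correct = sum(row["correct"] for row in scored)
   print(total_found, total_correct)  # 2 1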

acres.evaluation.evaluation.do_analysis(topics_file, detection_file, expansion_file, strategy, level, max_tries, lenient)[source]

Analyze a given expansion standard.

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

  • strategy (Strategy) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) –

Return type

Tuple[List[Acronym], List[Acronym], List[Acronym]]

Returns

A tuple with lists containing correct, found, and valid contextualized acronyms
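
A hedged call sketch for do_analysis(): the file names and the strategy member are placeholders, and the import path for Strategy is an assumption (it is defined elsewhere in the acres package, not in this module).

   from acres.evaluation.evaluation import Level, do_analysis
   from acres.resolution.resolution import Strategy  # assumed import path

   # File names and the strategy member are illustrative placeholders.
   correct, found, valid = do_analysis(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
       Strategy.WORD2VEC,  # hypothetical strategy member
       Level.TOKEN,
       max_tries=10,
       lenient=False,
   )
   print(len(correct), len(found), len(valid))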

acres.evaluation.evaluation.evaluate(topics, valid_standard, standard, strategy, level, max_tries, lenient)[source]

Analyze a gold standard with text excerpts centered on an acronym, followed by n valid expansions.

Parameters
  • topics (List[Acronym]) –

  • valid_standard (Set[str]) –

  • standard (Dict[str, Dict[str, int]]) –

  • strategy (Strategy) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) – Whether to consider partial matches (1) as a valid sense.

Return type

Tuple[List[Acronym], List[Acronym], List[Acronym]]

Returns

A tuple with lists containing correct, found, and valid contextualized acronyms
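
A sketch of how the tuple returned by evaluate() can feed the helpers in acres.evaluation.metrics; the three lists below are stand-ins for the real output, and treating the valid list as the recall denominator is an assumption based on the metric descriptions.

   from acres.evaluation import metrics

   # Stand-ins for the returned (correct, found, valid) lists.
   correct, found, valid = ["AP"], ["AP", "EKG"], ["AP", "EKG", "HF"]

   precision = metrics.calculate_precision(len(correct), len(found))  # 1 / 2 = 0.5
   recall = metrics.calculate_recall(len(correct), len(valid))        # 1 / 3 ≈ 0.33
   f1 = metrics.calculate_f1(precision, recall)                       # = 0.4
   print(precision, recall, f1)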

acres.evaluation.evaluation.plot_data(topics_file, detection_file, expansion_file)[source]

Run all strategies using different ranks and lenient approaches and generate a TSV file to be used as input for the plots.R script.

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

Returns
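
A hedged call sketch for plot_data(); the file names are illustrative placeholders, and the function is used for its side effect of writing the TSV consumed by plots.R.

   from acres.evaluation.evaluation import plot_data

   # Illustrative placeholder paths; the TSV for plots.R is written to disk,
   # so no return value is captured here.
   plot_data(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
   )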

acres.evaluation.evaluation.summary(topics_file, detection_file, expansion_file, level, max_tries, lenient)[source]

Save a summary table in TSV format that can be used to run statistical tests (e.g. McNemar's test).

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) –

Returns
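
A hedged call sketch for summary() with placeholder paths; the resulting TSV is meant for statistical tests such as McNemar's test.

   from acres.evaluation.evaluation import Level, summary

   # Illustrative placeholder paths; the summary TSV is written to disk.
   summary(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
       Level.TYPE,
       max_tries=10,
       lenient=True,
   )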

acres.evaluation.evaluation.test_input(true_expansions, possible_expansions, max_tries=10)[source]

Test an acronym + context strings against the model.

Parameters
  • true_expansions (Set[str]) –

  • possible_expansions (List[str]) – An ordered list of possible expansions.

  • max_tries (int) – Maximum number of tries

Return type

bool

Returns

True if one of the true expansions was found among the first max_tries possible expansions.
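
A minimal sketch of calling test_input() with made-up expansion strings; the exact matching rule (e.g. case handling or partial matches) is not specified here, so the result is only indicative.

   from acres.evaluation.evaluation import test_input

   true_expansions = {"Elektrokardiogramm"}  # illustrative true sense
   possible_expansions = [                   # illustrative ranked candidates
       "Echokardiographie",
       "Elektrokardiogramm",
       "Elektroenzephalogramm",
   ]

   # Checks whether a true expansion appears within the first max_tries candidates.
   hit = test_input(true_expansions, possible_expansions, max_tries=2)
   print(hit)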

acres.evaluation.metrics module

Helper functions to calculate evaluation metrics.

acres.evaluation.metrics.calculate_f1(precision, recall)[source]

Calculate the F1-score (harmonic mean of precision and recall).

Parameters
  • precision (float) –

  • recall (float) –

Return type

float

Returns

The F1-score.

acres.evaluation.metrics.calculate_precision(total_correct, total_found)[source]

Calculate precision as the ratio of correct acronyms to the found acronyms.

Parameters
  • total_correct (int) –

  • total_found (int) –

Return type

float

Returns

The precision value.

acres.evaluation.metrics.calculate_recall(total_correct, total_acronyms)[source]

Calculate recall as the ratio of correct acronyms to all acronyms.

Parameters
  • total_correct (int) –

  • total_acronyms (int) –

Return type

float

Returns

The recall value.
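
A small worked example tying the three metric helpers together; the counts are illustrative and the formulas in the comments are the standard definitions the docstrings describe.

   from acres.evaluation import metrics

   total_correct, total_found, total_acronyms = 40, 50, 80

   precision = metrics.calculate_precision(total_correct, total_found)  # 40 / 50 = 0.8
   recall = metrics.calculate_recall(total_correct, total_acronyms)     # 40 / 80 = 0.5
   f1 = metrics.calculate_f1(precision, recall)                         # 2 * 0.8 * 0.5 / 1.3 ≈ 0.615
   print(precision, recall, f1)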