acres.evaluation package

Package containing evaluation modules.

Submodules

acres.evaluation.evaluation module

Benchmark code. It’s the main entry point for comparing strategies using evaluation metrics such as precision, recall, and F1-score.

class acres.evaluation.evaluation.Level(value)[source]

Bases: enum.Enum

Enum that holds acronym-solving levels.

TOKEN = 1
TYPE = 2
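
A minimal usage sketch of the Level enum. The per-occurrence vs. per-distinct-acronym reading in the comment is the usual token/type interpretation of the member names, not a definition taken from this page.

   from acres.evaluation.evaluation import Level

   # Pick an evaluation granularity. TOKEN is commonly read as scoring every
   # acronym occurrence and TYPE as scoring each distinct acronym once
   # (interpretation of the names, not stated on this page).
   level = Level.TOKEN
   print(level.name, level.value)  # TOKEN 1
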
acres.evaluation.evaluation.analyze(contextualized_acronym, true_expansions, strategy, max_tries)[source]

Analyze a given row of the gold standard.

Parameters
  • contextualized_acronym (Acronym) –

  • true_expansions (Set[str]) –

  • strategy (Strategy) –

  • max_tries (int) –

Return type

Dict[str, bool]

Returns

A dictionary with the keys 'found', 'correct', and 'ignored', each mapping to a boolean.
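
A minimal sketch of how the per-row dictionaries returned by analyze() might be aggregated; it does not call acres, the rows are hand-written illustrations, and excluding ignored rows before counting is an assumption.

   # Illustrative per-row results of the documented {'found', 'correct', 'ignored'} form.
   rows = [
       {"found": True, "correct": True, "ignored": False},
       {"found": True, "correct": False, "ignored": False},
       {"found": False, "correct": False, "ignored": True},
   ]

   # Assumption: rows flagged as ignored are excluded before counting.
   scored = [row for row in rows if not row["ignored"]]
   total_found = sum(row["found"] for row in scored)
   total_correct = sum(row["correct"] for row in scored)
   print(total_found, total_correct)  # 2 1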

acres.evaluation.evaluation.do_analysis(topics_file, detection_file, expansion_file, strategy, level, max_tries, lenient)[source]

Analyze a given expansion standard.

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

  • strategy (Strategy) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) –

Return type

Tuple[List[Acronym], List[Acronym], List[Acronym]]

Returns

A tuple with lists containing correct, found, and valid contextualized acronyms
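
A hedged call sketch for do_analysis(): the file names and the strategy member are placeholders, and the import path for Strategy is an assumption (it is defined elsewhere in the acres package, not in this module).

   from acres.evaluation.evaluation import Level, do_analysis
   from acres.resolution.resolution import Strategy  # assumed import path

   # File names and the strategy member are illustrative placeholders.
   correct, found, valid = do_analysis(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
       Strategy.WORD2VEC,  # hypothetical strategy member
       Level.TOKEN,
       max_tries=10,
       lenient=False,
   )
   print(len(correct), len(found), len(valid))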

acres.evaluation.evaluation.evaluate(topics, valid_standard, standard, strategy, level, max_tries, lenient)[source]

Analyze a gold standard with text excerpts centered on an acronym, followed by n valid expansions.

Parameters
  • topics (List[Acronym]) –

  • valid_standard (Set[str]) –

  • standard (Dict[str, Dict[str, int]]) –

  • strategy (Strategy) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) – Whether to consider partial matches (1) as a valid sense.

Return type

Tuple[List[Acronym], List[Acronym], List[Acronym]]

Returns

A tuple with lists containing correct, found, and valid contextualized acronyms
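
A sketch of how the tuple returned by evaluate() can feed the helpers in acres.evaluation.metrics; the three lists below are stand-ins for the real output, and treating the valid list as the recall denominator is an assumption based on the metric descriptions.

   from acres.evaluation import metrics

   # Stand-ins for the returned (correct, found, valid) lists.
   correct, found, valid = ["AP"], ["AP", "EKG"], ["AP", "EKG", "HF"]

   precision = metrics.calculate_precision(len(correct), len(found))  # 1 / 2 = 0.5
   recall = metrics.calculate_recall(len(correct), len(valid))        # 1 / 3 ≈ 0.33
   f1 = metrics.calculate_f1(precision, recall)                       # = 0.4
   print(precision, recall, f1)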

acres.evaluation.evaluation.plot_data(topics_file, detection_file, expansion_file)[source]

Run all strategies using different ranks and lenient approaches and generate a TSV file to be used as input for the plots.R script.

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

Returns
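
A hedged call sketch for plot_data(); the file names are illustrative placeholders, and the function is used for its side effect of writing the TSV consumed by plots.R.

   from acres.evaluation.evaluation import plot_data

   # Illustrative placeholder paths; the TSV for plots.R is written to disk,
   # so no return value is captured here.
   plot_data(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
   )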

acres.evaluation.evaluation.summary(topics_file, detection_file, expansion_file, level, max_tries, lenient)[source]

Save a summary table in TSV format that can be used to run statistical tests (e.g. McNemar's test).

Parameters
  • topics_file (str) –

  • detection_file (str) –

  • expansion_file (str) –

  • level (Level) –

  • max_tries (int) –

  • lenient (bool) –

Returns
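
A hedged call sketch for summary() with placeholder paths; the resulting TSV is meant for statistical tests such as McNemar's test.

   from acres.evaluation.evaluation import Level, summary

   # Illustrative placeholder paths; the summary TSV is written to disk.
   summary(
       "resources/topics.tsv",
       "resources/detection_standard.tsv",
       "resources/expansion_standard.tsv",
       Level.TYPE,
       max_tries=10,
       lenient=True,
   )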

acres.evaluation.evaluation.test_input(true_expansions, possible_expansions, max_tries=10)[source]

Test an acronym + context strings against the model.

Parameters
  • true_expansions (Set[str]) –

  • possible_expansions (List[str]) – An ordered list of possible expansions.

  • max_tries (int) – Maximum number of tries

Return type

bool

Returns

True if one of the true expansions was found among the first max_tries possible expansions.
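
A minimal sketch of calling test_input() with made-up expansion strings; the exact matching rule (e.g. case handling or partial matches) is not specified here, so the result is only indicative.

   from acres.evaluation.evaluation import test_input

   true_expansions = {"Elektrokardiogramm"}  # illustrative true sense
   possible_expansions = [                   # illustrative ranked candidates
       "Echokardiographie",
       "Elektrokardiogramm",
       "Elektroenzephalogramm",
   ]

   # Checks whether a true expansion appears within the first max_tries candidates.
   hit = test_input(true_expansions, possible_expansions, max_tries=2)
   print(hit)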

acres.evaluation.metrics module

Helper functions to calculate evaluation metrics.

acres.evaluation.metrics.calculate_f1(precision, recall)[source]

Calculate the F1-score (harmonic mean of precision and recall).

Parameters
  • precision (float) –

  • recall (float) –

Return type

float

Returns

The F1-score.

acres.evaluation.metrics.calculate_precision(total_correct, total_found)[source]

Calculate precision as the ratio of correct acronyms to the found acronyms.

Parameters
  • total_correct (int) –

  • total_found (int) –

Return type

float

Returns

The precision value.

acres.evaluation.metrics.calculate_recall(total_correct, total_acronyms)[source]

Calculate recall as the ratio of correct acronyms to all acronyms.

Parameters
  • total_correct (int) –

  • total_acronyms (int) –

Return type

float

Returns

The recall value.
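
A small worked example tying the three metric helpers together; the counts are illustrative and the formulas in the comments are the standard definitions the docstrings describe.

   from acres.evaluation import metrics

   total_correct, total_found, total_acronyms = 40, 50, 80

   precision = metrics.calculate_precision(total_correct, total_found)  # 40 / 50 = 0.8
   recall = metrics.calculate_recall(total_correct, total_acronyms)     # 40 / 80 = 0.5
   f1 = metrics.calculate_f1(precision, recall)                         # 2 * 0.8 * 0.5 / 1.3 ≈ 0.615
   print(precision, recall, f1)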