Evaluation
EvaluationManager.py - module for determining the reward
Copyright CUED Dialogue Systems Group 2015 - 2017
See also
CUED Imports/Dependencies:
import ontology.OntologyUtils
import utils.Settings
import utils.ContextLogger
-
class
evaluation.EvaluationManager.
EvaluationManager
The evaluation manager manages the evaluators for all domains. It supports two types of reward: a turn-level reward and a dialogue-level reward. The former is accessed using turnReward() and the latter using finalReward(). You can use either one or both methods for reward computation.
An example where both are used is the traditional reward computation, in which each turn is penalised with a small negative reward (realised with turnReward()) and, at the end, the dialogue is rewarded with a large positive reward based on the overall dialogue (realised with finalReward()).
-
_bootup_domain
(dstring)
Ensures that the respective domain’s evaluator is booted up correctly and resets it.
Parameters: dstring (str) – the domain for which the evaluator should be booted.
Returns: None
-
_load_domains_evaluator
(domainString=None)
Loads and instantiates the respective evaluator as configured in the config file. The new object is added to the internal dictionary.
Default is ‘objective’.
Parameters: domainString (str) – the domain the evaluator will work on. Default is None.
Returns: None
-
finalReward
(domainString, finalInfo)
Computes the final reward for the given domain using finalInfo by delegating to the domain evaluator.
Parameters: - domainString (str) – the domain string unique identifier.
- finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.
Returns: int – the final reward for the given domain.
-
finalRewards
(finalInfo=None)
Computes the final reward via finalReward() for all domains where it has not been computed yet.
Parameters: finalInfo (dict) – parameters necessary for computing the final rewards, e.g., task description or subjective feedback. Default is None.
Returns: dict – mapping of domain to final rewards
-
print_dialog_summary
()
Prints the history of the just completed dialog.
-
print_summary
()
Prints the history over all dialogs run through simulate.
-
restart
()
Restarts all domain evaluators.
-
turnReward
(domainString, turnInfo)
Computes the turn reward for the given domain using turnInfo by delegating to the domain evaluator.
Parameters: - domainString (str) – the domain string unique identifier.
- turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.
Returns: int – the turn reward for the given domain.
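The traditional scheme described for EvaluationManager, a small penalty per turn plus a large reward at the end, with each call delegated to a per-domain evaluator, can be sketched as follows. This is a minimal, self-contained illustration and not the PyDial implementation: `ToyEvaluator`, `ToyManager`, the `success` key in `finalInfo`, the domain names, and the reward values -1 and 20 are all assumptions made for the example.

```python
class ToyEvaluator(object):
    """Hypothetical stand-in for a single-domain evaluator."""

    def __init__(self, domainString):
        self.domainString = domainString
        self.total_reward = 0
        self.final_done = False

    def turnReward(self, turnInfo):
        reward = -1  # assumed per-turn penalty
        self.total_reward += reward
        return reward

    def finalReward(self, finalInfo):
        # 'success' is an assumed key, not a documented one
        reward = 20 if finalInfo and finalInfo.get("success") else 0
        self.total_reward += reward
        self.final_done = True
        return reward


class ToyManager(object):
    """Hypothetical manager that delegates to one evaluator per domain."""

    def __init__(self):
        self.domainEvaluators = {}

    def _evaluator(self, domainString):
        # Create the domain's evaluator on first use
        if domainString not in self.domainEvaluators:
            self.domainEvaluators[domainString] = ToyEvaluator(domainString)
        return self.domainEvaluators[domainString]

    def turnReward(self, domainString, turnInfo):
        return self._evaluator(domainString).turnReward(turnInfo)

    def finalReward(self, domainString, finalInfo):
        return self._evaluator(domainString).finalReward(finalInfo)

    def finalRewards(self, finalInfo=None):
        # Only domains without a final reward so far are evaluated here
        rewards = {}
        for domain, evaluator in self.domainEvaluators.items():
            if not evaluator.final_done:
                rewards[domain] = evaluator.finalReward(finalInfo)
        return rewards


manager = ToyManager()
turn_rewards = [manager.turnReward("DomainA", {}) for _ in range(3)]
manager.turnReward("DomainB", {})
final_a = manager.finalReward("DomainA", {"success": True})
remaining = manager.finalRewards({"success": False})
total_a = manager.domainEvaluators["DomainA"].total_reward
```

In this sketch, `finalRewards()` returns only the domains it has just evaluated; whether the real method also includes previously computed rewards in its mapping is not stated above.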
-
-
class
evaluation.EvaluationManager.
Evaluator
(domainString)
Interface class for a single domain evaluation module. Responsible for recording and calculating the turns, dialogue outcome, and reward for a single dialog. To create your own reward model, derive from this class and, depending on your requirements, override the methods _getTurnReward() and _getFinalReward().
-
_getFinalReward
(finalInfo)
Computes the final reward using finalInfo and sets the dialogue outcome.
Should be overridden by the sub-class if values other than 0 should be returned.
Parameters: finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.
Returns: int – the final reward, default 0.
-
_getTurnReward
(turnInfo)
Computes the turn reward using turnInfo.
Should be overridden by the sub-class if values other than 0 should be returned.
Parameters: turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.
Returns: int – the turn reward, default 0.
-
doTraining
()
Defines whether the currently evaluated dialogue should be used for training.
Should be overridden by the sub-class if values other than True should be returned.
Returns: bool – whether the dialogue should be used for training
-
finalReward
(finalInfo)
Computes the final reward using finalInfo by calling _getFinalReward(). Updates the total reward and the dialogue outcome.
Parameters: finalInfo (dict) – parameters necessary for computing the final reward, e.g., task description or subjective feedback.
Returns: int – the final reward.
-
print_dialog_summary
()
Prints a summary of the current dialogue. Assumes dialogue outcome represents success. For other types, override methods in sub-class.
-
print_summary
()
Prints the summary of a run, i.e., multiple dialogs. Assumes dialogue outcome represents success. For other types, override methods in sub-class.
-
restart
()
Resets the domain evaluator’s internal variables.
Parameters: None
Returns: None
-
turnReward
(turnInfo)
Computes the turn reward using turnInfo by calling _getTurnReward(). Updates the total reward and the number of turns.
Parameters: turnInfo (dict) – parameters necessary for computing the turn reward, e.g., system act or model of the simulated user.
Returns: int – the turn reward.
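Deriving a custom reward model from Evaluator, as suggested in the class description, might look like the sketch below. To keep the example self-contained, it defines a stand-in `Evaluator` base class that only mirrors the documented method names; its bodies are assumptions and not the PyDial source. `TurnPenaltyEvaluator`, the `task_completed` key, and the reward values are likewise hypothetical.

```python
class Evaluator(object):
    """Stand-in mirroring the documented interface (assumed bodies)."""

    def __init__(self, domainString):
        self.domainString = domainString
        self.restart()

    def restart(self):
        # Reset the per-dialogue bookkeeping
        self.total_reward = 0
        self.num_turns = 0
        self.outcome = False

    def turnReward(self, turnInfo):
        reward = self._getTurnReward(turnInfo)
        self.total_reward += reward
        self.num_turns += 1
        return reward

    def finalReward(self, finalInfo):
        reward = self._getFinalReward(finalInfo)
        self.total_reward += reward
        return reward

    def _getTurnReward(self, turnInfo):
        return 0  # documented default

    def _getFinalReward(self, finalInfo):
        return 0  # documented default

    def doTraining(self):
        return True  # documented default


class TurnPenaltyEvaluator(Evaluator):
    """Hypothetical reward model: -1 per turn, +20 on task completion."""

    def _getTurnReward(self, turnInfo):
        return -1

    def _getFinalReward(self, finalInfo):
        # 'task_completed' is an assumed key in finalInfo
        self.outcome = bool(finalInfo and finalInfo.get("task_completed"))
        return 20 if self.outcome else 0

    def doTraining(self):
        # Assumed policy: skip dialogues that ended before any turn
        return self.num_turns > 0


evaluator = TurnPenaltyEvaluator("DomainA")
rewards = [evaluator.turnReward({}) for _ in range(4)]
final = evaluator.finalReward({"task_completed": True})
```

Only the underscore-prefixed hooks and `doTraining()` are overridden here, so the bookkeeping in `turnReward()` and `finalReward()` stays in the base class.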
-
-
class
evaluation.SuccessEvaluator.
ObjectiveSuccessEvaluator
(domainString)
This class provides a reward model based on objective success. For simulated dialogues, the goal of the user simulator is compared with the information the system has provided. For dialogues with a task file, the task is compared to the information the system has provided.
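The comparison between a user goal and the information the system has provided can be illustrated with a small check. This is only a sketch of the general idea under stated assumptions, not the logic of ObjectiveSuccessEvaluator: the function `objective_success`, the split of the goal into constraints and requests, and all slot names and values are hypothetical.

```python
def objective_success(goal_constraints, goal_requests, offered_entity, informed_slots):
    """Return True if the offered entity matches every goal constraint
    and every requested slot has been informed (assumed criterion)."""
    if offered_entity is None:
        return False
    constraints_met = all(
        offered_entity.get(slot) == value
        for slot, value in goal_constraints.items()
    )
    requests_answered = all(slot in informed_slots for slot in goal_requests)
    return constraints_met and requests_answered


# Hypothetical goal and system output
constraints = {"food": "italian", "area": "centre"}
requests = ["phone"]
matching = {"name": "Example Venue", "food": "italian", "area": "centre"}
mismatching = {"name": "Other Venue", "food": "chinese", "area": "centre"}

ok = objective_success(constraints, requests, matching, {"phone"})
wrong_entity = objective_success(constraints, requests, mismatching, {"phone"})
missing_request = objective_success(constraints, requests, matching, set())
```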
-
class
evaluation.SuccessEvaluator.
SubjectiveSuccessEvaluator
(domainString)
This class implements a reward model based on subjective success, which is only possible during voice interaction through the DialogueServer. The subjective feedback is collected and passed on to this class.