Criterions

Criterions compute the loss function given the model and batch, roughly:

    loss = criterion(model, batch)
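The contract above can be sketched with a toy model and criterion. All names below (`toy_model`, `ToyCriterion`, the dict keys) are illustrative stand-ins, not fairseq's real API; the point is the return convention documented for `forward()` below: a loss, a sample size, and a logging dict.

```python
def toy_model(src):
    # Hypothetical model: predicts each input value doubled.
    return [2 * x for x in src]

class ToyCriterion:
    """Summed squared error, packaged like a fairseq criterion."""

    def __call__(self, model, sample, reduce=True):
        preds = model(sample["src"])
        errors = [(p - t) ** 2 for p, t in zip(preds, sample["target"])]
        sample_size = len(errors)  # denominator for the gradient
        loss = sum(errors) if reduce else errors
        logging_output = {"loss": loss, "sample_size": sample_size}
        return loss, sample_size, logging_output

criterion = ToyCriterion()
loss, sample_size, logs = criterion(toy_model, {"src": [1, 2], "target": [2, 5]})
# loss == 1, sample_size == 2
```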
class fairseq.criterions.FairseqCriterion(task)

    static aggregate_logging_outputs(logging_outputs: List[Dict[str, Any]]) → Dict[str, Any]
        Aggregate logging outputs from data parallel training.

    forward(model, sample, reduce=True)
        Compute the loss for the given sample.
        Returns a tuple with three elements:
        1) the loss
        2) the sample size, which is used as the denominator for the gradient
        3) logging outputs to display while training
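A sketch of what an aggregation like aggregate_logging_outputs typically amounts to: each data-parallel worker produces a logging dict, and the aggregate is the key-wise sum. The keys ("loss", "sample_size") are illustrative; real criterions choose their own.

```python
from typing import Any, Dict, List

def aggregate_logging_outputs(logging_outputs: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Key-wise sum across workers; missing keys count as 0.
    keys = {k for log in logging_outputs for k in log}
    return {k: sum(log.get(k, 0) for log in logging_outputs) for k in keys}

aggregated = aggregate_logging_outputs([
    {"loss": 4.0, "sample_size": 2},
    {"loss": 6.0, "sample_size": 3},
])
# aggregated == {"loss": 10.0, "sample_size": 5}
```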
class fairseq.criterions.adaptive_loss.AdaptiveLoss(task, sentence_avg)

    An implementation of the loss function accompanying the adaptive softmax approximation for graphics processing units (GPUs), described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309).

    forward(model, sample, reduce=True)
        Compute the loss for the given sample.
        Returns a tuple with three elements:
        1) the loss
        2) the sample size, which is used as the denominator for the gradient
        3) logging outputs to display while training
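The core idea of the adaptive softmax (independent of fairseq's implementation) is to split the vocabulary by frequency into a small head plus tail clusters: the head softmax scores frequent words directly plus one "gate" score per tail cluster, and a tail word's probability factorises as p(cluster) × p(word | cluster). A minimal pure-Python sketch with made-up logits:

```python
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Head scores: two frequent words plus one gate for tail cluster 1.
head_logits = [2.0, 1.0, 0.5]   # ["the", "a", <tail-cluster-1>]
tail_logits = [1.5, 0.2]        # rare words inside tail cluster 1

head_probs = softmax(head_logits)
tail_probs = softmax(tail_logits)

# Probability of a rare word routes through its cluster's gate.
p_tail_word = head_probs[2] * tail_probs[0]
```

Because most targets are frequent, most training steps only touch the small head matrix, which is what makes the approximation cheap on GPUs.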
class fairseq.criterions.composite_loss.CompositeLoss(args, task)

    A composite loss that, given a list of model outputs and a list of targets, computes the average of the losses over each output-target pair.
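The composite pattern can be sketched as one underlying loss applied to each (output, target) pair and then averaged; `squared_error` here is an arbitrary stand-in for whatever per-pair loss is configured.

```python
def squared_error(output, target):
    return (output - target) ** 2

def composite_loss(outputs, targets, underlying=squared_error):
    # One loss per output-target pair, averaged over the pairs.
    losses = [underlying(o, t) for o, t in zip(outputs, targets)]
    return sum(losses) / len(losses)

avg = composite_loss([1.0, 2.0], [0.0, 4.0])  # (1 + 4) / 2 == 2.5
```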
class fairseq.criterions.cross_entropy.CrossEntropyCriterion(task, sentence_avg)

    forward(model, sample, reduce=True)
        Compute the loss for the given sample.
        Returns a tuple with three elements:
        1) the loss
        2) the sample size, which is used as the denominator for the gradient
        3) logging outputs to display while training
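A toy sketch of token-level cross entropy, and of what a sentence_avg-style flag changes: the loss is the summed negative log-probability of the correct tokens, while the flag switches the sample size (the gradient denominator) between sentence count and token count. The probabilities are made up; fairseq's real forward works on batched tensors.

```python
import math

# Model probabilities assigned to the correct target tokens:
# two sentences, with 2 and 1 tokens respectively.
target_probs = [[0.9, 0.5], [0.25]]

# Cross entropy: summed negative log-likelihood of the targets.
nll = sum(-math.log(p) for sent in target_probs for p in sent)

# sentence_avg toggles the gradient denominator.
sentence_avg = False
sample_size = (len(target_probs) if sentence_avg
               else sum(len(s) for s in target_probs))
# sample_size == 3 (tokens); it would be 2 (sentences) with sentence_avg=True
```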
class fairseq.criterions.label_smoothed_cross_entropy.LabelSmoothedCrossEntropyCriterion(task, sentence_avg, label_smoothing, ignore_prefix_size=0, report_accuracy=False)

    forward(model, sample, reduce=True)
        Compute the loss for the given sample.
        Returns a tuple with three elements:
        1) the loss
        2) the sample size, which is used as the denominator for the gradient
        3) logging outputs to display while training
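One common formulation of label smoothing (a sketch of the idea, not necessarily fairseq's exact normalisation) interpolates the target's negative log-likelihood with the mean negative log-likelihood over the whole vocabulary, so the model is penalised for becoming overconfident in a single token:

```python
import math

def label_smoothed_nll(log_probs, target_idx, eps):
    """(1 - eps) * NLL(target) + eps * mean NLL over the vocabulary."""
    nll = -log_probs[target_idx]
    smooth = -sum(log_probs) / len(log_probs)
    return (1.0 - eps) * nll + eps * smooth

log_probs = [math.log(p) for p in [0.7, 0.2, 0.1]]
plain = label_smoothed_nll(log_probs, 0, eps=0.0)    # ordinary cross entropy
smoothed = label_smoothed_nll(log_probs, 0, eps=0.1) # pulled toward uniform
```

With eps=0 this reduces exactly to the CrossEntropyCriterion behaviour above; a small positive eps (the label_smoothing constructor argument) adds a penalty proportional to how much mass the model withholds from the rest of the vocabulary.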