Criterions

Criterions compute the loss function given the model and batch, roughly:

loss = criterion(model, batch)
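New criterions subclass FairseqCriterion, implement forward(), and may override the add_args() and aggregate_logging_outputs() hooks documented below. The following is a minimal sketch of a criterion computing a summed token-level NLL; the criterion name, the --example-scale flag, and the logging keys are illustrative, not part of fairseq:

import math

import torch.nn.functional as F

from fairseq.criterions import FairseqCriterion, register_criterion


@register_criterion('example_nll')  # hypothetical name
class ExampleNLLCriterion(FairseqCriterion):

    @staticmethod
    def add_args(parser):
        # Any criterion-specific options go here (this flag is illustrative).
        parser.add_argument('--example-scale', default=1.0, type=float,
                            help='scale applied to the loss')

    def forward(self, model, sample, reduce=True):
        net_output = model(**sample['net_input'])
        # Flatten to (num_tokens, vocab) log-probabilities and targets.
        lprobs = model.get_normalized_probs(net_output, log_probs=True)
        lprobs = lprobs.view(-1, lprobs.size(-1))
        target = model.get_targets(sample, net_output).view(-1)
        # self.padding_idx is set by FairseqCriterion from the task dictionary.
        loss = F.nll_loss(lprobs, target, ignore_index=self.padding_idx,
                          reduction='sum' if reduce else 'none')
        sample_size = sample['ntokens']
        logging_output = {
            'loss': loss.sum().item(),
            'ntokens': sample['ntokens'],
            'sample_size': sample_size,
        }
        return loss, sample_size, logging_output

    @staticmethod
    def aggregate_logging_outputs(logging_outputs):
        # Sum the per-worker statistics and report a per-token loss in base 2.
        loss_sum = sum(log.get('loss', 0) for log in logging_outputs)
        sample_size = sum(log.get('sample_size', 0) for log in logging_outputs)
        return {
            'loss': loss_sum / sample_size / math.log(2) if sample_size > 0 else 0.0,
            'sample_size': sample_size,
        }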
class fairseq.criterions.FairseqCriterion(args, task)[source]

static add_args(parser)[source]

Add criterion-specific arguments to the parser.

static aggregate_logging_outputs(logging_outputs)[source]

Aggregate logging outputs from data parallel training.

classmethod build_criterion(args, task)[source]

forward(model, sample, reduce=True)[source]

Compute the loss for the given sample.

Returns a tuple with three elements:
1) the loss
2) the sample size, which is used as the denominator for the gradient
3) logging outputs to display while training

static grad_denom(sample_sizes)[source]

Compute the gradient denominator for a set of sample sizes.
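
In the simplest case this reduces to summing the per-worker sample sizes, so that gradients are averaged over every token (or sentence) seen across workers; a minimal sketch of that behavior:

def grad_denom(sample_sizes):
    # One entry per data parallel worker; the total serves as the
    # denominator when gradients are reduced across workers.
    return sum(sample_sizes)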

class fairseq.criterions.adaptive_loss.AdaptiveLoss(args, task)[source]

This is an implementation of the loss function that accompanies the adaptive softmax approximation for graphics processing units (GPUs), described in the paper “Efficient softmax approximation for GPUs” (http://arxiv.org/abs/1609.04309).
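
AdaptiveLoss is meant to be paired with a model whose output layer is an adaptive softmax. As an illustration of the underlying technique (not of fairseq's implementation), PyTorch's nn.AdaptiveLogSoftmaxWithLoss computes the same kind of clustered softmax loss; the sizes and cutoffs below are made up:

import torch
import torch.nn as nn

hidden_dim, vocab_size = 512, 50000
# The head covers the 1,000 most frequent words plus one token per tail
# cluster; two tail clusters share the remaining vocabulary.
adaptive = nn.AdaptiveLogSoftmaxWithLoss(hidden_dim, vocab_size,
                                         cutoffs=[1000, 10000])
hidden = torch.randn(32, hidden_dim)          # one decoder state per target token
target = torch.randint(0, vocab_size, (32,))  # gold token ids
print(adaptive(hidden, target).loss)          # mean NLL under the approximation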

static aggregate_logging_outputs(logging_outputs)[source]

Aggregate logging outputs from data parallel training.

forward(model, sample, reduce=True)[source]

Compute the loss for the given sample.

Returns a tuple with three elements:
1) the loss
2) the sample size, which is used as the denominator for the gradient
3) logging outputs to display while training

class fairseq.criterions.composite_loss.CompositeLoss(args, task)[source]

This is a composite loss that, given a list of model outputs and a list of targets, computes an average of losses for each output-target pair.
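
A hypothetical sketch of that averaging, where pair_loss stands in for the underlying criterion (see build_underlying_criterion below) applied to one output/target pair:

def composite_loss(pair_loss, net_outputs, targets):
    # Score each aligned output/target pair independently, then average.
    losses = [pair_loss(out, tgt) for out, tgt in zip(net_outputs, targets)]
    return sum(losses) / len(losses)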

static add_args(parser)[source]

Add criterion-specific arguments to the parser.

classmethod build_criterion(args, task)[source]

static build_underlying_criterion(args, task)[source]

class fairseq.criterions.cross_entropy.CrossEntropyCriterion(args, task)[source]

static aggregate_logging_outputs(logging_outputs)[source]

Aggregate logging outputs from data parallel training.

compute_loss(model, net_output, sample, reduce=True)[source]

forward(model, sample, reduce=True)[source]

Compute the loss for the given sample.

Returns a tuple with three elements:
1) the loss
2) the sample size, which is used as the denominator for the gradient
3) logging outputs to display while training
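
A minimal sketch of the computation inside compute_loss, assuming the model provides fairseq's get_normalized_probs and get_targets helpers; padding_idx is passed explicitly here so the snippet stands alone:

import torch.nn.functional as F

def compute_loss(model, net_output, sample, padding_idx, reduce=True):
    # Flatten to (num_tokens, vocab) log-probabilities and targets.
    lprobs = model.get_normalized_probs(net_output, log_probs=True)
    lprobs = lprobs.view(-1, lprobs.size(-1))
    target = model.get_targets(sample, net_output).view(-1)
    # Summed (not averaged) NLL; the trainer later divides by the
    # sample size returned from forward().
    return F.nll_loss(lprobs, target, ignore_index=padding_idx,
                      reduction='sum' if reduce else 'none')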

class fairseq.criterions.label_smoothed_cross_entropy.LabelSmoothedCrossEntropyCriterion(args, task)[source]

static add_args(parser)[source]

Add criterion-specific arguments to the parser.

static aggregate_logging_outputs(logging_outputs)[source]

Aggregate logging outputs from data parallel training.

compute_loss(model, net_output, sample, reduce=True)[source]

forward(model, sample, reduce=True)[source]

Compute the loss for the given sample.

Returns a tuple with three elements:
1) the loss
2) the sample size, which is used as the denominator for the gradient
3) logging outputs to display while training
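
Label smoothing interpolates between the NLL of the gold token and a uniform penalty over the whole vocabulary, with the mixing weight set by fairseq's --label-smoothing option. A self-contained sketch of the loss (padding handling omitted for brevity):

import torch

def label_smoothed_nll_loss(lprobs, target, epsilon):
    # NLL of the gold token at each position.
    nll_loss = -lprobs.gather(dim=-1, index=target.unsqueeze(-1)).squeeze(-1)
    # Mean negative log-probability over the full vocabulary.
    smooth_loss = -lprobs.sum(dim=-1) / lprobs.size(-1)
    # Put (1 - epsilon) weight on the gold token, epsilon spread uniformly.
    loss = (1.0 - epsilon) * nll_loss + epsilon * smooth_loss
    return loss.sum(), nll_loss.sum()

# Example: 5 target tokens over a 100-word vocabulary, epsilon = 0.1.
lprobs = torch.log_softmax(torch.randn(5, 100), dim=-1)
target = torch.randint(0, 100, (5,))
loss, nll = label_smoothed_nll_loss(lprobs, target, 0.1)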