fastNLP.modules.encoder

fastNLP.modules.encoder.char_embedding

class fastNLP.modules.encoder.char_embedding.ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5), initial_method=None)[source]
forward(x)[source]
Parameters: x – [batch_size * sent_length, word_length, char_emb_size]
Returns: [batch_size * sent_length, sum(feature_maps), 1]
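Example (a minimal usage sketch; the batch and word sizes are illustrative, and the character embeddings are assumed to have been looked up beforehand):

import torch
from fastNLP.modules.encoder.char_embedding import ConvCharEmbedding

# 32 sentences of 20 words, each word padded to 12 characters,
# with 50-dimensional character embeddings already looked up
batch_size, sent_length, word_length, char_emb_size = 32, 20, 12, 50
char_embs = torch.randn(batch_size * sent_length, word_length, char_emb_size)

conv_char_emb = ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5))
out = conv_char_emb(char_embs)   # expected shape: [640, sum(feature_maps)=100, 1]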
class fastNLP.modules.encoder.char_embedding.LSTMCharEmbedding(char_emb_size=50, hidden_size=None, initial_method=None)[source]

Character-level word embedding with a single-layer LSTM.
Parameters:
  • char_emb_size – int, the size of the character-level embedding. Default: 50. For example, with 26 characters each embedded as a 50-dimensional vector, the LSTM input_size is 50.
  • hidden_size – int, the number of hidden units. Default: equal to char_emb_size.
forward(x)[source]

Parameters: x – [n_batch * n_word, word_length, char_emb_size]
Returns: [n_batch * n_word, char_emb_size]
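Example (an illustrative sketch; 640 stands for n_batch * n_word):

import torch
from fastNLP.modules.encoder.char_embedding import LSTMCharEmbedding

char_embs = torch.randn(640, 12, 50)                 # [n_batch * n_word, word_length, char_emb_size]
lstm_char_emb = LSTMCharEmbedding(char_emb_size=50)  # hidden_size defaults to char_emb_size
word_reprs = lstm_char_emb(char_embs)                # expected shape: [640, char_emb_size]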

fastNLP.modules.encoder.conv

class fastNLP.modules.encoder.conv.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Basic 1-d convolution module, initialized with xavier_uniform.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
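Example (a minimal sketch; the [batch, seq_len, in_channels] input layout used here is an assumption — check the module's forward for the expected layout):

import torch
from fastNLP.modules.encoder.conv import Conv

conv = Conv(in_channels=100, out_channels=128, kernel_size=3, padding=1, activation='relu')

x = torch.randn(16, 30, 100)   # assumed layout: 16 sequences, 30 tokens, 100 features per token
out = conv(x)                  # 128 output channels per position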

fastNLP.modules.encoder.conv_maxpool

class fastNLP.modules.encoder.conv_maxpool.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Convolution and max-pooling module with multiple kernel sizes.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
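Example (a minimal sketch in the style of a CNN text classifier; the [batch, seq_len, in_channels] layout and the pooled output size are assumptions):

import torch
from fastNLP.modules.encoder.conv_maxpool import ConvMaxpool

conv_pool = ConvMaxpool(in_channels=100, out_channels=50, kernel_sizes=[3, 4, 5])

x = torch.randn(16, 30, 100)     # 16 sequences, 30 tokens, 100 features per token (assumed layout)
sent_repr = conv_pool(x)         # expected: one max-pooled feature vector per sequence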

fastNLP.modules.encoder.embedding

class fastNLP.modules.encoder.embedding.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]

A simple lookup table.

Args:
    nums : the size of the lookup table
    dims : the size of each vector
    padding_idx : pads the tensor with zeros whenever it encounters this index
    sparse : If True, the gradient matrix will be a sparse tensor. In this case, only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
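Example (vocabulary size and dimensions are illustrative):

import torch
from fastNLP.modules.encoder.embedding import Embedding

embed = Embedding(nums=10000, dims=300, padding_idx=0, dropout=0.1)

word_ids = torch.tensor([[4, 17, 256, 0, 0],
                         [9, 2, 31, 88, 0]])   # index 0 is the padding index
word_vecs = embed(word_ids)                    # [2, 5, 300]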

fastNLP.modules.encoder.linear

class fastNLP.modules.encoder.linear.Linear(input_size, output_size, bias=True, initial_method=None)[source]

Linear module.

Args:
    input_size : input size
    output_size : output size
    bias : If True, adds a learnable bias. Default: True
    initial_method : weight initialization method. Default: None

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
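Example (sizes are illustrative):

import torch
from fastNLP.modules.encoder.linear import Linear

proj = Linear(input_size=256, output_size=10)
logits = proj(torch.randn(32, 256))   # [32, 10]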

fastNLP.modules.encoder.lstm

class fastNLP.modules.encoder.lstm.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]

Long Short-Term Memory (LSTM).

Args:
    input_size : input size
    hidden_size : hidden size. Default: 100
    num_layers : number of hidden layers. Default: 1
    dropout : dropout rate. Default: 0.0
    bidirectional : If True, becomes a bidirectional RNN. Default: False

forward(x, h0=None, c0=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
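Example (a minimal sketch; it assumes that with get_hidden=False, the default, only the output sequence is returned):

import torch
from fastNLP.modules.encoder.lstm import LSTM

# batch_first=True by default, so the input is [batch, seq_len, input_size]
encoder = LSTM(input_size=300, hidden_size=128, num_layers=1, bidirectional=True)

x = torch.randn(32, 40, 300)
output = encoder(x)   # [32, 40, 2 * hidden_size] for a bidirectional encoder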

fastNLP.modules.encoder.masked_rnn

class fastNLP.modules.encoder.masked_rnn.MaskedGRU(*args, **kwargs)[source]

Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

\begin{array}{ll}
r_t = \mathrm{sigmoid}(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\
z_t = \mathrm{sigmoid}(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\
n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)}+ b_{hn})) \\
h_t = (1 - z_t) * n_t + z_t * h_{(t-1)} \\
\end{array}

where \(h_t\) is the hidden state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(r_t\), \(z_t\), \(n_t\) are the reset, update, and new gates, respectively.

Args:
    input_size: The number of expected features in the input x
    hidden_size: The number of features in the hidden state h
    num_layers: Number of recurrent layers.
    nonlinearity: The non-linearity to use ['tanh' | 'relu']. Default: 'tanh'
    bias: If False, then the layer does not use bias weights b_ih and b_hh. Default: True
    batch_first: If True, then the input and output tensors are provided as (batch, seq, feature)
    dropout: If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
    bidirectional: If True, becomes a bidirectional RNN. Default: False

Inputs: input, mask, h_0
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
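Example (a minimal sketch; it assumes the keyword arguments above are forwarded to MaskedRNNBase):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedGRU

rnn = MaskedGRU(input_size=100, hidden_size=64, num_layers=1)

seq_len, batch = 20, 8
x = torch.randn(seq_len, batch, 100)

# 0-1 mask marking real positions (1) versus padding (0), shaped (seq_len, batch) as documented
lengths = torch.randint(5, seq_len + 1, (batch,))
mask = (torch.arange(seq_len).unsqueeze(1) < lengths.unsqueeze(0)).float()

output, h_n = rnn(x, mask=mask)   # output: [20, 8, 64], h_n: [1, 8, 64]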
class fastNLP.modules.encoder.masked_rnn.MaskedLSTM(*args, **kwargs)[source]

Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

\begin{array}{ll}
i_t = \mathrm{sigmoid}(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi}) \\
f_t = \mathrm{sigmoid}(W_{if} x_t + b_{if} + W_{hf} h_{(t-1)} + b_{hf}) \\
g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{(t-1)} + b_{hg}) \\
o_t = \mathrm{sigmoid}(W_{io} x_t + b_{io} + W_{ho} h_{(t-1)} + b_{ho}) \\
c_t = f_t * c_{(t-1)} + i_t * g_t \\
h_t = o_t * \tanh(c_t)
\end{array}

where \(h_t\) is the hidden state at time t, \(c_t\) is the cell state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(i_t\), \(f_t\), \(g_t\), \(o_t\) are the input, forget, cell, and output gates, respectively.

Args:
    input_size: The number of expected features in the input x
    hidden_size: The number of features in the hidden state h
    num_layers: Number of recurrent layers.
    bias: If False, then the layer does not use bias weights b_ih and b_hh. Default: True
    batch_first: If True, then the input and output tensors are provided as (batch, seq, feature)
    dropout: If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
    bidirectional: If True, becomes a bidirectional RNN. Default: False

Inputs: input, mask, (h_0, c_0)
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
  • c_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch.
Outputs: output, (h_n, c_n)
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len
  • c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len
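Example (a minimal sketch under the same assumption as for MaskedGRU; the all-ones mask marks every position as valid):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedLSTM

rnn = MaskedLSTM(input_size=100, hidden_size=64, num_layers=1)

x = torch.randn(20, 8, 100)              # (seq_len, batch, input_size)
mask = torch.ones(20, 8)
output, (h_n, c_n) = rnn(x, mask=mask)   # output: [20, 8, 64]; h_n, c_n: [1, 8, 64]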
class fastNLP.modules.encoder.masked_rnn.MaskedRNN(*args, **kwargs)[source]

Applies a multi-layer Elman RNN with a customized non-linearity to an input sequence. For each element in the input sequence, each layer computes the following function:

h_t = \tanh(w_{ih} * x_t + b_{ih}  +  w_{hh} * h_{(t-1)} + b_{hh})

where \(h_t\) is the hidden state at time t, and \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer. If nonlinearity='relu', then ReLU is used instead of tanh.

Args:
    input_size: The number of expected features in the input x
    hidden_size: The number of features in the hidden state h
    num_layers: Number of recurrent layers.
    nonlinearity: The non-linearity to use ['tanh' | 'relu']. Default: 'tanh'
    bias: If False, then the layer does not use bias weights b_ih and b_hh. Default: True
    batch_first: If True, then the input and output tensors are provided as (batch, seq, feature)
    dropout: If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
    bidirectional: If True, becomes a bidirectional RNN. Default: False

Inputs: input, mask, h_0
  • input (seq_len, batch, input_size): tensor containing the features of the input sequence.
  • mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
  • h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
  • output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
  • h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
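Example (a minimal sketch with an explicit initial hidden state; mask=None disables masking):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedRNN

rnn = MaskedRNN(input_size=100, hidden_size=64)

x = torch.randn(20, 8, 100)      # (seq_len, batch, input_size)
h_0 = torch.zeros(1, 8, 64)      # (num_layers * num_directions, batch, hidden_size)
output, h_n = rnn(x, mask=None, hx=h_0)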
class fastNLP.modules.encoder.masked_rnn.MaskedRNNBase(Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, layer_dropout=0, step_dropout=0, bidirectional=False, initial_method=None, **kwargs)[source]
forward(input, mask=None, hx=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

step(input, hx=None, mask=None)[source]

Executes one step forward (only for a uni-directional RNN).

Args:
    input (batch, input_size): input tensor of this step.
    hx (num_layers, batch, hidden_size): the hidden state of the last step.
    mask (batch): the mask tensor of this step.

Returns:
    output (batch, hidden_size): tensor containing the output of this step from the last layer of the RNN.
    hn (num_layers, batch, hidden_size): tensor containing the hidden state of this step.
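Example of step-wise decoding with step() (a sketch; MaskedGRU is used here as a concrete subclass of MaskedRNNBase):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedGRU

rnn = MaskedGRU(input_size=100, hidden_size=64)

batch = 8
hx = None
outputs = []
for t in range(10):
    x_t = torch.randn(batch, 100)     # (batch, input_size) for this step
    mask_t = torch.ones(batch)        # 1 for real tokens, 0 for padding at step t
    out_t, hx = rnn.step(x_t, hx=hx, mask=mask_t)
    outputs.append(out_t)             # each out_t: (batch, hidden_size)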

fastNLP.modules.encoder.transformer

class fastNLP.modules.encoder.transformer.TransformerEncoder(num_layers, **kargs)[source]
class SubLayer(input_size, output_size, key_size, value_size, num_atte)[source]
forward(input, seq_mask)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward(x, seq_mask=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
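Example (a minimal sketch; it assumes the extra keyword arguments are passed through to each SubLayer, and that x uses a [batch, seq_len, input_size] layout with a 0-1 padding mask):

import torch
from fastNLP.modules.encoder.transformer import TransformerEncoder

encoder = TransformerEncoder(num_layers=2, input_size=256, output_size=256,
                             key_size=64, value_size=64, num_atte=4)

x = torch.randn(8, 30, 256)      # assumed layout: [batch, seq_len, input_size]
seq_mask = torch.ones(8, 30)     # assumed 0-1 mask over positions
out = encoder(x, seq_mask)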

fastNLP.modules.encoder.variational_rnn

class fastNLP.modules.encoder.variational_rnn.VarGRU(*args, **kwargs)[source]

Variational Dropout GRU.

class fastNLP.modules.encoder.variational_rnn.VarLSTM(*args, **kwargs)[source]

Variational Dropout LSTM.

class fastNLP.modules.encoder.variational_rnn.VarRNN(*args, **kwargs)[source]

Variational Dropout RNN.

class fastNLP.modules.encoder.variational_rnn.VarRNNBase(mode, Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, input_dropout=0, hidden_dropout=0, bidirectional=False)[source]

Implementation of a variational dropout RNN. See: A Theoretically Grounded Application of Dropout in Recurrent Neural Networks (Yarin Gal and Zoubin Ghahramani, 2016), https://arxiv.org/abs/1512.05287.

forward(input, hx=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
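Example (a minimal sketch; it assumes VarLSTM forwards its arguments to VarRNNBase with an LSTM cell and returns the usual (output, hidden) pair):

import torch
from fastNLP.modules.encoder.variational_rnn import VarLSTM

rnn = VarLSTM(input_size=300, hidden_size=128, num_layers=2,
              input_dropout=0.3, hidden_dropout=0.3, batch_first=True)

x = torch.randn(16, 25, 300)     # [batch, seq_len, input_size] with batch_first=True
output, hidden = rnn(x)          # the same dropout masks are reused at every time step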

class fastNLP.modules.encoder.variational_rnn.VarRnnCellWrapper(cell, hidden_size, input_p, hidden_p)[source]

Wrapper for normal RNN cells that adds support for variational dropout.

forward(input, hidden, mask_x=None, mask_h=None)[source]
Parameters:
  • input – [seq_len, batch_size, input_size]
  • hidden – for LSTM, a tuple (h_0, c_0), each of shape [batch_size, hidden_size]; for other RNNs, h_0 of shape [batch_size, hidden_size]
  • mask_x – [batch_size, input_size] dropout mask for input
  • mask_h – [batch_size, hidden_size] dropout mask for hidden
Returns:
  • output – [seq_len, batch_size, hidden_size]
  • hidden – for LSTM, a tuple (h_n, c_n), each of shape [batch_size, hidden_size]; for other RNNs, h_n of shape [batch_size, hidden_size]
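Example (a minimal sketch wrapping a plain GRU cell; whether the masks should additionally be rescaled by 1/keep_prob is left to the source):

import torch
from torch import nn
from fastNLP.modules.encoder.variational_rnn import VarRnnCellWrapper

input_size, hidden_size, batch_size, seq_len = 100, 64, 8, 20
cell = VarRnnCellWrapper(nn.GRUCell(input_size, hidden_size), hidden_size,
                         input_p=0.25, hidden_p=0.25)

x = torch.randn(seq_len, batch_size, input_size)
h_0 = torch.zeros(batch_size, hidden_size)

# one 0-1 dropout mask per sequence, shared across all time steps (variational dropout)
mask_x = torch.bernoulli(torch.full((batch_size, input_size), 0.75))
mask_h = torch.bernoulli(torch.full((batch_size, hidden_size), 0.75))

output, h_n = cell(x, h_0, mask_x=mask_x, mask_h=mask_h)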

fastNLP.modules.encoder.variational_rnn.flip(input, dims) → Tensor

Reverses the order of an n-D tensor along the given axes in dims.

Args:
    input (Tensor): the input tensor
    dims (a list or tuple): the axes to flip on

Example:

>>> x = torch.arange(8).view(2, 2, 2)
>>> x
tensor([[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]])
>>> torch.flip(x, [0, 1])
tensor([[[ 6,  7],
         [ 4,  5]],

        [[ 2,  3],
         [ 0,  1]]])
class fastNLP.modules.encoder.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]

Long Short-Term Memory (LSTM).

Args:
    input_size : input size
    hidden_size : hidden size. Default: 100
    num_layers : number of hidden layers. Default: 1
    dropout : dropout rate. Default: 0.0
    bidirectional : If True, becomes a bidirectional RNN. Default: False

forward(x, h0=None, c0=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]

A simple lookup table.

Args:
    nums : the size of the lookup table
    dims : the size of each vector
    padding_idx : pads the tensor with zeros whenever it encounters this index
    sparse : If True, the gradient matrix will be a sparse tensor. In this case, only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Linear(input_size, output_size, bias=True, initial_method=None)[source]

Linear module.

Args:
    input_size : input size
    output_size : output size
    bias : If True, adds a learnable bias. Default: True
    initial_method : weight initialization method. Default: None

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Basic 1-d convolution module, initialized with xavier_uniform.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class fastNLP.modules.encoder.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]

Convolution and max-pooling module with multiple kernel sizes.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.