fastNLP.modules.encoder¶
fastNLP.modules.encoder.char_embedding¶
class fastNLP.modules.encoder.char_embedding.ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5), initial_method=None)[source]¶
class fastNLP.modules.encoder.char_embedding.LSTMCharEmbedding(char_emb_size=50, hidden_size=None, initial_method=None)[source]¶
Character-level word embedding computed with a single-layer LSTM.
Parameters:
- char_emb_size – int, the size of the character-level embedding. Default: 50. For example, with 26 characters each embedded as a 50-dim vector, the LSTM input_size is 50.
- hidden_size – int, the number of hidden units. Default: equal to char_emb_size.
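Example (a minimal usage sketch, assuming the documented constructor defaults; the input layout [num_words, max_word_len, char_emb_size] and the output shapes are assumptions, since forward is not documented here):

import torch
from fastNLP.modules.encoder.char_embedding import ConvCharEmbedding, LSTMCharEmbedding

# Assumed input layout: one row per word, each word a sequence of
# character embeddings -> [num_words, max_word_len, char_emb_size].
chars = torch.randn(32, 12, 50)

conv_embed = ConvCharEmbedding(char_emb_size=50, feature_maps=(40, 30, 30), kernels=(3, 4, 5))
lstm_embed = LSTMCharEmbedding(char_emb_size=50, hidden_size=64)

conv_out = conv_embed(chars)  # word vectors built from character n-gram filters
lstm_out = lstm_embed(chars)  # word vectors from the final LSTM hidden state
print(conv_out.shape, lstm_out.shape)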
fastNLP.modules.encoder.conv¶
class fastNLP.modules.encoder.conv.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Basic 1-d convolution module. Weights are initialized with xavier_uniform.
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
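Example (a sketch under the documented signature; the assumed input layout [batch, seq_len, in_channels] is a guess, so check the source before relying on it):

import torch
from fastNLP.modules.encoder.conv import Conv

conv = Conv(in_channels=128, out_channels=64, kernel_size=3, padding=1, activation='relu')

# Assumed input layout: [batch, seq_len, in_channels]; if the module expects the
# raw nn.Conv1d layout [batch, in_channels, seq_len], transpose before the call.
x = torch.randn(16, 20, 128)
y = conv(x)
print(y.shape)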
fastNLP.modules.encoder.conv_maxpool¶
class fastNLP.modules.encoder.conv_maxpool.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Convolution and max-pooling module with multiple kernel sizes.
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
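Example (a sketch only: three kernel sizes over the same input, each max-pooled over the sequence; the input layout and whether out_channels may be a single int rather than a list per kernel size are assumptions, since they are not documented here):

import torch
from fastNLP.modules.encoder.conv_maxpool import ConvMaxpool

cnn_pool = ConvMaxpool(in_channels=128, out_channels=64, kernel_sizes=(3, 4, 5))

x = torch.randn(16, 20, 128)  # assumed layout: [batch, seq_len, in_channels]
sent_vec = cnn_pool(x)        # fixed-size sentence vector after max-pooling over time
print(sent_vec.shape)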
fastNLP.modules.encoder.embedding¶
class fastNLP.modules.encoder.embedding.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]¶
A simple lookup table.
Args:
- nums – the size of the lookup table
- dims – the size of each vector
- padding_idx – pads the tensor with zeros whenever it encounters this index
- sparse – if True, the gradient matrix will be a sparse tensor; in this case only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
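Example (a minimal sketch; the output shape is assumed to follow the usual nn.Embedding convention of one vector per input index):

import torch
from fastNLP.modules.encoder.embedding import Embedding

# 10k-word vocabulary, 300-dim vectors; index 0 is the padding index.
embed = Embedding(nums=10000, dims=300, padding_idx=0, dropout=0.1)

words = torch.tensor([[4, 17, 256, 0, 0],
                      [9, 2, 31, 88, 5]])   # [batch, seq_len] word indices
vectors = embed(words)                      # assumed output: [batch, seq_len, dims]
print(vectors.shape)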
fastNLP.modules.encoder.linear¶
class fastNLP.modules.encoder.linear.Linear(input_size, output_size, bias=True, initial_method=None)[source]¶
Linear module.
Args:
- input_size – input feature size
- output_size – output feature size
- bias – if True, adds a learnable bias
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
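Example (a minimal sketch; the batched input layout is assumed):

import torch
from fastNLP.modules.encoder.linear import Linear

proj = Linear(input_size=256, output_size=10)   # e.g. project encoder states to 10 classes

h = torch.randn(16, 256)      # assumed input: [batch, input_size]
logits = proj(h)              # [batch, output_size]
print(logits.shape)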
fastNLP.modules.encoder.lstm¶
class fastNLP.modules.encoder.lstm.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]¶
Long Short-Term Memory (LSTM) module.
Args:
- input_size – input feature size
- hidden_size – hidden state size. Default: 100
- num_layers – number of hidden layers. Default: 1
- dropout – dropout rate. Default: 0.0
- bidirectional – if True, becomes a bidirectional RNN. Default: False
forward(x, h0=None, c0=None)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
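Example (a sketch under the documented signature; the return value is assumed to be the per-token output when get_hidden is False, and the output shape is an assumption):

import torch
from fastNLP.modules.encoder.lstm import LSTM

encoder = LSTM(input_size=300, hidden_size=100, num_layers=1,
               batch_first=True, bidirectional=True)

x = torch.randn(16, 25, 300)  # batch_first=True, so [batch, seq_len, input_size]
out = encoder(x)              # per-token encodings; with get_hidden=True the final
                              # hidden state is presumably returned as well
print(out.shape)              # assumed: [batch, seq_len, 2 * hidden_size]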
fastNLP.modules.encoder.masked_rnn¶
class fastNLP.modules.encoder.masked_rnn.MaskedGRU(*args, **kwargs)[source]¶
Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

r_t = \mathrm{sigmoid}(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr})
z_t = \mathrm{sigmoid}(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz})
n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)} + b_{hn}))
h_t = (1 - z_t) * n_t + z_t * h_{(t-1)}

where \(h_t\) is the hidden state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(r_t\), \(z_t\), \(n_t\) are the reset, update, and new gates, respectively.
Args:
- input_size – the number of expected features in the input x
- hidden_size – the number of features in the hidden state h
- num_layers – number of recurrent layers
- bias – if False, the layer does not use bias weights b_ih and b_hh. Default: True
- batch_first – if True, the input and output tensors are provided as (batch, seq, feature)
- dropout – if non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional – if True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
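Example (a sketch using the documented input shapes; it assumes the constructor accepts the keyword arguments listed under Args and that forward returns the documented (output, h_n) pair):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedGRU

gru = MaskedGRU(input_size=300, hidden_size=128, num_layers=1)

seq_len, batch = 20, 16
x = torch.randn(seq_len, batch, 300)          # (seq_len, batch, input_size)
lengths = torch.randint(5, seq_len + 1, (batch,))
# 0-1 mask marking real tokens, as documented: (seq_len, batch)
mask = (torch.arange(seq_len).unsqueeze(1) < lengths.unsqueeze(0)).float()

output, h_n = gru(x, mask=mask)               # output: (seq_len, batch, hidden_size)
print(output.shape, h_n.shape)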
class fastNLP.modules.encoder.masked_rnn.MaskedLSTM(*args, **kwargs)[source]¶
Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence. For each element in the input sequence, each layer computes the following function:

i_t = \mathrm{sigmoid}(W_{ii} x_t + b_{ii} + W_{hi} h_{(t-1)} + b_{hi})
f_t = \mathrm{sigmoid}(W_{if} x_t + b_{if} + W_{hf} h_{(t-1)} + b_{hf})
g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{(t-1)} + b_{hg})
o_t = \mathrm{sigmoid}(W_{io} x_t + b_{io} + W_{ho} h_{(t-1)} + b_{ho})
c_t = f_t * c_{(t-1)} + i_t * g_t
h_t = o_t * \tanh(c_t)

where \(h_t\) is the hidden state at time t, \(c_t\) is the cell state at time t, \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer, and \(i_t\), \(f_t\), \(g_t\), \(o_t\) are the input, forget, cell, and output gates, respectively.
Args:
- input_size – the number of expected features in the input x
- hidden_size – the number of features in the hidden state h
- num_layers – number of recurrent layers
- bias – if False, the layer does not use bias weights b_ih and b_hh. Default: True
- batch_first – if True, the input and output tensors are provided as (batch, seq, feature)
- dropout – if non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional – if True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, (h_0, c_0)
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
- c_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial cell state for each element in the batch.
Outputs: output, (h_n, c_n)
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t=seq_len.
- c_n (num_layers * num_directions, batch, hidden_size): tensor containing the cell state for t=seq_len.
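Example (a sketch mirroring the GRU example above; it assumes the constructor accepts the Args keywords and that the LSTM variant returns the documented (output, (h_n, c_n)) pair):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedLSTM

lstm = MaskedLSTM(input_size=300, hidden_size=128)

x = torch.randn(20, 16, 300)               # (seq_len, batch, input_size)
mask = torch.ones(20, 16)                  # all positions valid in this toy case

output, (h_n, c_n) = lstm(x, mask=mask)
print(output.shape, h_n.shape, c_n.shape)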
class fastNLP.modules.encoder.masked_rnn.MaskedRNN(*args, **kwargs)[source]¶
Applies a multi-layer Elman RNN with a customized non-linearity to an input sequence. For each element in the input sequence, each layer computes the following function:

h_t = \tanh(w_{ih} * x_t + b_{ih} + w_{hh} * h_{(t-1)} + b_{hh})

where \(h_t\) is the hidden state at time t, and \(x_t\) is the hidden state of the previous layer at time t or \(input_t\) for the first layer. If nonlinearity='relu', then ReLU is used instead of tanh.
Args:
- input_size – the number of expected features in the input x
- hidden_size – the number of features in the hidden state h
- num_layers – number of recurrent layers
- nonlinearity – the non-linearity to use ['tanh'|'relu']. Default: 'tanh'
- bias – if False, the layer does not use bias weights b_ih and b_hh. Default: True
- batch_first – if True, the input and output tensors are provided as (batch, seq, feature)
- dropout – if non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional – if True, becomes a bidirectional RNN. Default: False
Inputs: input, mask, h_0
- input (seq_len, batch, input_size): tensor containing the features of the input sequence.
- mask (seq_len, batch): 0-1 tensor containing the mask of the input sequence.
- h_0 (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch.
Outputs: output, h_n
- output (seq_len, batch, hidden_size * num_directions): tensor containing the output features (h_k) from the last layer of the RNN, for each k. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.
- h_n (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for k=seq_len.
class fastNLP.modules.encoder.masked_rnn.MaskedRNNBase(Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, layer_dropout=0, step_dropout=0, bidirectional=False, initial_method=None, **kwargs)[source]¶
forward(input, mask=None, hx=None)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
step(input, hx=None, mask=None)[source]¶
Execute one step forward (only for one-directional RNN).
Args:
- input (batch, input_size): input tensor of this step.
- hx (num_layers, batch, hidden_size): the hidden state of the last step.
- mask (batch): the mask tensor of this step.
Returns:
- output (batch, hidden_size): tensor containing the output of this step from the last layer of the RNN.
- hn (num_layers, batch, hidden_size): tensor containing the hidden state of this step.
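Example (a single-step decoding-style loop; step is inherited by the subclasses above and, as documented, only valid for one-directional RNNs. It assumes step returns the documented (output, hn) pair):

import torch
from fastNLP.modules.encoder.masked_rnn import MaskedGRU

gru = MaskedGRU(input_size=300, hidden_size=128)

batch = 16
hx = None
for t in range(10):
    x_t = torch.randn(batch, 300)     # (batch, input_size) for this step
    mask_t = torch.ones(batch)        # (batch,) 0-1 mask for this step
    out_t, hx = gru.step(x_t, hx=hx, mask=mask_t)
    # out_t: (batch, hidden_size); hx: (num_layers, batch, hidden_size)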
fastNLP.modules.encoder.transformer¶
class fastNLP.modules.encoder.transformer.TransformerEncoder(num_layers, **kargs)[source]¶
class SubLayer(input_size, output_size, key_size, value_size, num_atte)[source]¶
forward(input, seq_mask)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
forward(x, seq_mask=None)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
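Example (a heavily hedged sketch: it assumes the **kargs given to TransformerEncoder are forwarded to each SubLayer(input_size, output_size, key_size, value_size, num_atte), and that x uses a [batch, seq_len, input_size] layout with a 0-1 padding mask; none of this is documented on this page):

import torch
from fastNLP.modules.encoder.transformer import TransformerEncoder

encoder = TransformerEncoder(num_layers=2,
                             input_size=256, output_size=256,
                             key_size=64, value_size=64, num_atte=4)

x = torch.randn(16, 30, 256)          # assumed layout: [batch, seq_len, input_size]
seq_mask = torch.ones(16, 30)         # assumed: 1 for real tokens, 0 for padding
out = encoder(x, seq_mask=seq_mask)
print(out.shape)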
fastNLP.modules.encoder.variational_rnn¶
class fastNLP.modules.encoder.variational_rnn.VarGRU(*args, **kwargs)[source]¶
Variational Dropout GRU.
class fastNLP.modules.encoder.variational_rnn.VarLSTM(*args, **kwargs)[source]¶
Variational Dropout LSTM.
class fastNLP.modules.encoder.variational_rnn.VarRNN(*args, **kwargs)[source]¶
Variational Dropout RNN.
class fastNLP.modules.encoder.variational_rnn.VarRNNBase(mode, Cell, input_size, hidden_size, num_layers=1, bias=True, batch_first=False, input_dropout=0, hidden_dropout=0, bidirectional=False)[source]¶
Implementation of a Variational Dropout RNN. Refer to "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks" (Yarin Gal and Zoubin Ghahramani, 2016), https://arxiv.org/abs/1512.05287.
forward(input, hx=None)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
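Example (a sketch for the concrete subclasses above; it assumes VarLSTM(*args, **kwargs) accepts the VarRNNBase arguments other than mode and Cell, which it presumably fixes internally, and that the return follows the usual (output, hidden) RNN convention):

import torch
from fastNLP.modules.encoder.variational_rnn import VarLSTM

lstm = VarLSTM(input_size=300, hidden_size=200, num_layers=2,
               batch_first=True, input_dropout=0.3, hidden_dropout=0.3,
               bidirectional=True)

x = torch.randn(16, 25, 300)          # batch_first=True: [batch, seq_len, input_size]
output, (h_n, c_n) = lstm(x)          # assumed to follow the nn.LSTM return convention
print(output.shape)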
class fastNLP.modules.encoder.variational_rnn.VarRnnCellWrapper(cell, hidden_size, input_p, hidden_p)[source]¶
Wrapper for normal RNN cells that adds support for variational dropout.
forward(input, hidden, mask_x=None, mask_h=None)[source]¶
Parameters:
- input – [seq_len, batch_size, input_size]
- hidden – for LSTM, a tuple (h_0, c_0), each [batch_size, hidden_size]; for other RNNs, h_0, [batch_size, hidden_size]
- mask_x – [batch_size, input_size], dropout mask for the input
- mask_h – [batch_size, hidden_size], dropout mask for the hidden state
Returns:
- output – [seq_len, batch_size, hidden_size]
- hidden – for LSTM, a tuple (h_n, c_n), each [batch_size, hidden_size]; for other RNNs, h_n, [batch_size, hidden_size]
fastNLP.modules.encoder.variational_rnn.flip(input, dims) → Tensor¶
Reverse the order of an n-D tensor along the given axes in dims.
Args:
- input (Tensor): the input tensor
- dims (a list or tuple): axes to flip on
Example:
>>> x = torch.arange(8).view(2, 2, 2)
>>> x
tensor([[[ 0,  1],
         [ 2,  3]],
        [[ 4,  5],
         [ 6,  7]]])
>>> torch.flip(x, [0, 1])
tensor([[[ 6,  7],
         [ 4,  5]],
        [[ 2,  3],
         [ 0,  1]]])
class fastNLP.modules.encoder.LSTM(input_size, hidden_size=100, num_layers=1, dropout=0.0, batch_first=True, bidirectional=False, bias=True, initial_method=None, get_hidden=False)[source]¶
Long Short-Term Memory (LSTM) module.
Args:
- input_size – input feature size
- hidden_size – hidden state size. Default: 100
- num_layers – number of hidden layers. Default: 1
- dropout – dropout rate. Default: 0.0
- bidirectional – if True, becomes a bidirectional RNN. Default: False
forward(x, h0=None, c0=None)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.encoder.Embedding(nums, dims, padding_idx=0, sparse=False, init_emb=None, dropout=0.0)[source]¶
A simple lookup table.
Args:
- nums – the size of the lookup table
- dims – the size of each vector
- padding_idx – pads the tensor with zeros whenever it encounters this index
- sparse – if True, the gradient matrix will be a sparse tensor; in this case only optim.SGD (CUDA and CPU) and optim.Adagrad (CPU) can be used
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.encoder.Linear(input_size, output_size, bias=True, initial_method=None)[source]¶
Linear module.
Args:
- input_size – input feature size
- output_size – output feature size
- bias – if True, adds a learnable bias
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.encoder.Conv(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Basic 1-d convolution module. Weights are initialized with xavier_uniform.
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
class fastNLP.modules.encoder.ConvMaxpool(in_channels, out_channels, kernel_sizes, stride=1, padding=0, dilation=1, groups=1, bias=True, activation='relu', initial_method=None)[source]¶
Convolution and max-pooling module with multiple kernel sizes.
forward(x)[source]¶
Defines the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
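Example (the entries above document the same classes re-exported at the package level, so they can be imported directly from fastNLP.modules.encoder; a minimal sketch):

from fastNLP.modules.encoder import LSTM, Embedding, Linear, Conv, ConvMaxpool

# The package-level names are the same classes documented earlier on this page,
# so the two import paths are interchangeable, e.g.:
embed = Embedding(nums=10000, dims=300)
encoder = LSTM(input_size=300, hidden_size=100)
classifier = Linear(input_size=100, output_size=5)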