fastNLP.api
fastNLP.api.api
class fastNLP.api.api.POS(model_path=None, device='cpu')
    FastNLP API for part-of-speech tagging.
    Parameters:
        - model_path (str) – the path to the model.
        - device (str) – device name such as "cpu" or "cuda:0", in the same notation as PyTorch.
fastNLP.api.converter
fastNLP.api.model_zoo
fastNLP.api.pipeline
fastNLP.api.processor
class fastNLP.api.processor.FullSpaceToHalfSpaceProcessor(field_name, change_alpha=True, change_digit=True, change_punctuation=True, change_space=True)
    Converts full-width characters to half-width, processing the text one character at a time.
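To make the full-width to half-width conversion concrete, here is a minimal standalone sketch. It is not fastNLP's implementation: the helper name `full_to_half` is hypothetical, the flags only borrow their names from the signature above, and the way characters are sorted into letters, digits, and punctuation is an assumption.

```python
def full_to_half(text, change_alpha=True, change_digit=True,
                 change_punctuation=True, change_space=True):
    """Map full-width characters to their ASCII counterparts (illustrative only)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic space maps to an ordinary space.
            out.append(" " if change_space else ch)
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width forms sit at a fixed offset of 0xFEE0 above ASCII.
            half = chr(code - 0xFEE0)
            if half.isdigit():
                convert = change_digit
            elif half.isalpha():
                convert = change_alpha
            else:
                convert = change_punctuation
            out.append(half if convert else ch)
        else:
            out.append(ch)
    return "".join(out)


print(full_to_half("ABC\u3000123!"))
```

Working character by character keeps the conversion independent of tokenization, which matches the "one character at a time" wording of the description.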
class fastNLP.api.processor.Index2WordProcessor(vocab, field_name, new_added_field_name)
    Converts a field of indices in a DataSet back to strings using vocab.
class fastNLP.api.processor.IndexerProcessor(vocab, field_name, new_added_field_name, delete_old_field=False, is_input=True)
    Given a vocabulary, converts the specified field to index form. The field should be a one-dimensional list, for example ['我', '是', xxx].
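The indexing described for IndexerProcessor, and the reverse mapping described for Index2WordProcessor, can be sketched with plain dictionaries. This is only an illustration under stated assumptions: instances are modelled as dicts, the helpers `build_mappings`, `index_field`, and `unindex_field` are hypothetical, and mapping unknown tokens to an `<unk>` entry is an assumption rather than documented behaviour.

```python
def build_mappings(tokens):
    """Build word->index and index->word tables; index 0 is reserved for <unk>."""
    word2idx = {"<unk>": 0}
    for tok in tokens:
        word2idx.setdefault(tok, len(word2idx))
    idx2word = {i: w for w, i in word2idx.items()}
    return word2idx, idx2word


def index_field(instance, word2idx, field_name, new_added_field_name,
                delete_old_field=False):
    """Return a copy of instance with the token list converted to indices."""
    out = dict(instance)
    out[new_added_field_name] = [
        word2idx.get(tok, word2idx["<unk>"]) for tok in instance[field_name]
    ]
    if delete_old_field and new_added_field_name != field_name:
        del out[field_name]
    return out


def unindex_field(instance, idx2word, field_name, new_added_field_name):
    """Return a copy of instance with the index list converted back to strings."""
    out = dict(instance)
    out[new_added_field_name] = [idx2word[i] for i in instance[field_name]]
    return out


word2idx, idx2word = build_mappings(["我", "是", "学生"])
indexed = index_field({"words": ["我", "是", "谁"]}, word2idx, "words", "word_ids")
restored = unindex_field(indexed, idx2word, "word_ids", "words_again")
print(indexed["word_ids"], restored["words_again"])
```

Note that the round trip is lossy for out-of-vocabulary tokens in this sketch: "谁" comes back as `<unk>`.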
class fastNLP.api.processor.Num2TagProcessor(tag, field_name, new_added_field_name=None)
    Replaces the numbers in a sentence with the given tag.
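A rough sketch of number-to-tag replacement on a tokenized sentence follows. The regular expression, the helper name `num_to_tag`, and the choice to replace only tokens that are entirely numeric are all assumptions made for illustration; the pattern fastNLP actually uses may differ.

```python
import re

# Assumed pattern: optional sign, digits, optional decimal part.
NUMBER = re.compile(r"[-+]?\d+(\.\d+)?")


def num_to_tag(tokens, tag):
    """Replace every token that is wholly a number with tag."""
    return [tag if NUMBER.fullmatch(tok) else tok for tok in tokens]


print(num_to_tag(["我", "有", "3", "个", "苹果", "2.5", "kg"], "<num>"))
```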
class fastNLP.api.processor.PreAppendProcessor(data, field_name, new_added_field_name=None)
    Prepends data (which should be a str) to the start of a field. The field must be a list, so the new field is [data] + instance[field_name].
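The expression `[data] + instance[field_name]` can be tried out on a plain dict standing in for an instance. The helper `pre_append` below is hypothetical, and writing back to `field_name` when no new field name is given is an assumption about the default.

```python
def pre_append(instance, data, field_name, new_added_field_name=None):
    """Return a copy of instance with data prepended to the list in field_name."""
    target = new_added_field_name or field_name  # assumed default: overwrite
    out = dict(instance)
    out[target] = [data] + instance[field_name]
    return out


inst = {"words": ["我", "是"]}
print(pre_append(inst, "<bos>", "words", "words_with_bos"))
```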
class fastNLP.api.processor.SeqLenProcessor(field_name, new_added_field_name='seq_lens', is_input=True)
    Adds a sequence-length field derived from an existing field, using the size of that field's first dimension.
class fastNLP.api.processor.SliceProcessor(start, end, step, field_name, new_added_field_name=None)
    Keeps only part of a field. Equivalent to instance[field_name][start:end:step].
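Slicing a field and recording a sequence length both reduce to ordinary Python list operations. The sketch below models an instance as a dict; `slice_field` and `add_seq_len` are hypothetical helpers, and treating the "first dimension" as `len()` of the field is an assumption.

```python
def slice_field(instance, start, end, step, field_name, new_added_field_name=None):
    """Return a copy of instance whose target field is field[start:end:step]."""
    target = new_added_field_name or field_name
    out = dict(instance)
    out[target] = instance[field_name][start:end:step]
    return out


def add_seq_len(instance, field_name, new_added_field_name="seq_lens"):
    """Return a copy of instance with the length of field_name stored alongside it."""
    out = dict(instance)
    out[new_added_field_name] = len(instance[field_name])
    return out


inst = {"chars": list("abcdefg")}
sliced = slice_field(inst, 1, 6, 2, "chars", "picked")
with_len = add_seq_len(sliced, "picked")
print(with_len["picked"], with_len["seq_lens"])
```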
class fastNLP.api.processor.VocabIndexerProcessor(field_name, new_added_filed_name=None, min_freq=1, max_size=None, verbose=0, is_input=True)
    Builds a Vocabulary from the DataSet and converts the field to numeric indices. The indexed field is stored under new_added_filed_name (spelled as in the signature); if that argument is not given, the original field_name is overwritten.
    construct_vocab(*datasets)
        Builds the vocabulary from the DataSets passed in.
        Parameters: datasets – DataSet objects used to build the vocabulary.
        Returns:
    process(*datasets, only_index_dataset=None)
        If no Vocabulary has been built yet, one is built from the DataSets in datasets; if a vocabulary already exists, it is reused. Once the vocabulary is available, both datasets and only_index_dataset are indexed.
        Parameters:
            - datasets – DataSet objects.
            - only_index_dataset – DataSet, or list of DataSet. Its contents are only indexed and are never used to build the vocabulary.
        Returns:
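The build-then-index flow of `construct_vocab` and `process` can be sketched without fastNLP. Everything below is illustrative: `TinyVocabIndexer` is a hypothetical stand-in, datasets are modelled as lists of dicts, the reserved `<pad>` and `<unk>` entries are assumptions, and in this sketch `only_index_dataset` must be a list of datasets.

```python
from collections import Counter


class TinyVocabIndexer:
    """Toy stand-in for the build-then-index flow; not fastNLP code."""

    def __init__(self, field_name, new_added_field_name=None,
                 min_freq=1, max_size=None):
        self.field_name = field_name
        self.target = new_added_field_name or field_name
        self.min_freq = min_freq
        self.max_size = max_size
        self.word2idx = None  # built lazily on the first process() call

    def construct_vocab(self, *datasets):
        counts = Counter(
            tok for ds in datasets for inst in ds for tok in inst[self.field_name]
        )
        self.word2idx = {"<pad>": 0, "<unk>": 1}  # assumed reserved entries
        for word, freq in counts.most_common(self.max_size):
            if freq >= self.min_freq:
                self.word2idx[word] = len(self.word2idx)

    def process(self, *datasets, only_index_dataset=None):
        if self.word2idx is None:
            self.construct_vocab(*datasets)  # only `datasets` feed the vocabulary
        # Index-only datasets are converted but never counted.
        for ds in list(datasets) + list(only_index_dataset or []):
            for inst in ds:
                inst[self.target] = [
                    self.word2idx.get(tok, 1) for tok in inst[self.field_name]
                ]


train = [{"words": ["a", "b", "a"]}, {"words": ["a", "c"]}]
dev = [{"words": ["a", "d"]}]
indexer = TinyVocabIndexer("words", new_added_field_name="word_ids")
indexer.process(train, only_index_dataset=[dev])
print(train[0]["word_ids"], dev[0]["word_ids"])
```

Because `dev` is passed only through `only_index_dataset`, the token "d" never enters the vocabulary and is indexed as unknown, which is the usual reason to keep held-out data out of vocabulary construction.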