fastNLP.api¶

fastNLP.api.api¶

class fastNLP.api.api.POS(model_path=None, device='cpu')[source]¶

FastNLP API for Part-Of-Speech tagging.

Parameters:	model_path (str) – the path to the model.
	device (str) – device name such as “cpu” or “cuda:0”, using the same notation as PyTorch.

predict(content)[source]¶

Parameters:	content – list of list of str. Each string is a token (word).
Returns:	list of list of str. Each string is a tag.

test(file_path)[source]¶

Test performance over the given data set.

Parameters:	file_path (str) – path to the data set to test on.
Returns:	a dictionary of metric values

fastNLP.api.converter¶

fastNLP.api.model_zoo¶

fastNLP.api.pipeline¶

class fastNLP.api.pipeline.Pipeline(processors=None)[source]¶: Pipeline takes a DataSet object as input, runs multiple processors sequentially, and outputs a DataSet object.
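The sequential behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not fastNLP's implementation: a list of dicts stands in for a DataSet, plain functions stand in for processors, and `run_pipeline`, `lowercase_words` and `add_length` are hypothetical names.

```python
# Illustrative stand-in only: plain lists of dicts replace fastNLP's DataSet,
# and plain callables replace its processors.
from typing import Callable, Dict, List

Dataset = List[Dict[str, object]]
Processor = Callable[[Dataset], Dataset]


def run_pipeline(processors: List[Processor], dataset: Dataset) -> Dataset:
    """Apply each processor in order, feeding each output into the next."""
    for processor in processors:
        dataset = processor(dataset)
    return dataset


def lowercase_words(dataset: Dataset) -> Dataset:
    # Hypothetical processor: lowercase every token in the "words" field.
    return [{**ins, "words": [w.lower() for w in ins["words"]]} for ins in dataset]


def add_length(dataset: Dataset) -> Dataset:
    # Hypothetical processor: record the token count in a new field.
    return [{**ins, "seq_lens": len(ins["words"])} for ins in dataset]


result = run_pipeline([lowercase_words, add_length], [{"words": ["Hello", "World"]}])
print(result)
```

The order of the processors matters: each one sees the output of the one before it.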

fastNLP.api.processor¶

class fastNLP.api.processor.FullSpaceToHalfSpaceProcessor(field_name, change_alpha=True, change_digit=True, change_punctuation=True, change_space=True)[source]¶: Converts full-width characters to half-width, processing the text one character at a time.
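A minimal sketch of per-character full-width to half-width conversion follows. It relies on the Unicode layout, where the full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts and U+3000 is the ideographic space. The function `full_to_half` is a hypothetical stand-in, and the way the four flags are interpreted here is an assumption, not the library's confirmed logic.

```python
def full_to_half(text, change_alpha=True, change_digit=True,
                 change_punctuation=True, change_space=True):
    """Convert full-width characters to half-width, one character at a time."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:
            # Ideographic (full-width) space.
            out.append(" " if change_space else ch)
        elif 0xFF01 <= code <= 0xFF5E:
            # Full-width ASCII variants are offset from ASCII by 0xFEE0.
            half = chr(code - 0xFEE0)
            if half.isalpha():
                convert = change_alpha
            elif half.isdigit():
                convert = change_digit
            else:
                convert = change_punctuation
            out.append(half if convert else ch)
        else:
            out.append(ch)
    return "".join(out)


print(full_to_half("ＡＢＣ１２３，　ｏｋ"))
```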

class fastNLP.api.processor.Index2WordProcessor(vocab, field_name, new_added_field_name)[source]¶: Converts a field of indices in a DataSet to strings, using vocab.

class fastNLP.api.processor.IndexerProcessor(vocab, field_name, new_added_field_name, delete_old_field=False, is_input=True)[source]¶

Given a vocabulary, converts the specified field to index form. The field should be a one-dimensional list, for example: ['我', '是', ...]
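The indexing step can be illustrated with a plain dict as the vocabulary. This is a sketch under assumptions: `index_field` is a hypothetical helper, an instance is modelled as a dict, and mapping unknown tokens to a reserved `<unk>` entry is assumed rather than taken from the library.

```python
from typing import Dict


def index_field(instance: dict, vocab: Dict[str, int],
                field_name: str, new_added_field_name: str) -> dict:
    """Map each token of a one-dimensional list field to its vocabulary index."""
    unk = vocab["<unk>"]  # assumption: the vocabulary reserves an unknown-word entry
    indexed = [vocab.get(token, unk) for token in instance[field_name]]
    return {**instance, new_added_field_name: indexed}


vocab = {"<pad>": 0, "<unk>": 1, "我": 2, "是": 3}
indexed_instance = index_field({"words": ["我", "是", "谁"]}, vocab, "words", "word_ids")
print(indexed_instance["word_ids"])
```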

class fastNLP.api.processor.Num2TagProcessor(tag, field_name, new_added_field_name=None)[source]¶: Replaces the numbers in a sentence with a given tag.
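One way to picture the replacement is a regular-expression substitution. The sketch below assumes the sentence is a single string and that a "number" is an optionally signed run of digits with an optional decimal part; the pattern the library actually uses may differ, and `num_to_tag` is a hypothetical name.

```python
import re

# Assumption: a "number" is an optionally signed digit run with an optional decimal part.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def num_to_tag(sentence: str, tag: str) -> str:
    """Replace every number in the sentence with the given tag."""
    return NUMBER.sub(tag, sentence)


tagged = num_to_tag("He paid 12.50 dollars for 3 books", "<num>")
print(tagged)
```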

class fastNLP.api.processor.PreAppendProcessor(data, field_name, new_added_field_name=None)[source]¶

Prepends data (which should be a str) to the start of a field. The field must be a list, so the new field is: [data] + instance[field_name]
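The expression above translates directly into code. In this sketch an instance is a plain dict and `pre_append` is a hypothetical helper; overwriting `field_name` when `new_added_field_name` is None is an assumption.

```python
def pre_append(instance: dict, data: str, field_name: str,
               new_added_field_name=None) -> dict:
    """Return a copy of the instance with data prepended to the list field."""
    # Assumption: with no new field name, the original field is overwritten.
    target = new_added_field_name or field_name
    return {**instance, target: [data] + instance[field_name]}


prefixed = pre_append({"words": ["a", "b"]}, "<bos>", "words")
print(prefixed)
```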

class fastNLP.api.processor.SeqLenProcessor(field_name, new_added_field_name='seq_lens', is_input=True)[source]¶: Adds a sequence-length field derived from an existing field, taking the size of that field's first dimension.
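For a nested field, "first dimension" means the outer length only. A small sketch, with a dict standing in for an instance and `add_seq_len` as a hypothetical helper:

```python
def add_seq_len(instance: dict, field_name: str,
                new_added_field_name: str = "seq_lens") -> dict:
    """Store the length of the field's first (outermost) dimension."""
    return {**instance, new_added_field_name: len(instance[field_name])}


# Three tokens, each with two features: only the outer length (3) is recorded.
with_len = add_seq_len({"feats": [[1, 2], [3, 4], [5, 6]]}, "feats")
print(with_len["seq_lens"])
```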

class fastNLP.api.processor.SliceProcessor(start, end, step, field_name, new_added_field_name=None)[source]¶: Keeps only part of a field; equivalent to instance[field_name][start:end:step]
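The slice expression above behaves like ordinary Python slicing. The sketch below uses a dict as the instance; `slice_field` is a hypothetical helper, and overwriting `field_name` when no new name is given is an assumption.

```python
def slice_field(instance: dict, start, end, step, field_name: str,
                new_added_field_name=None) -> dict:
    """Keep only instance[field_name][start:end:step]."""
    # Assumption: with no new field name, the original field is overwritten.
    target = new_added_field_name or field_name
    return {**instance, target: instance[field_name][start:end:step]}


sliced = slice_field({"words": list("abcdef")}, 1, 5, 2, "words")
print(sliced["words"])
```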

class fastNLP.api.processor.VocabIndexerProcessor(field_name, new_added_filed_name=None, min_freq=1, max_size=None, verbose=0, is_input=True)[source]¶

Builds a Vocabulary from the DataSet and converts the field to numeric indices. The newly indexed field is stored under new_added_filed_name; if that is not provided, the original field_name is overwritten.

construct_vocab(*datasets)[source]¶

Builds the vocabulary from the DataSets passed in.

Parameters:	datasets – DataSet objects used to build the vocabulary
Returns:

process(*datasets, only_index_dataset=None)[source]¶

If no Vocabulary has been built yet, one is built from the DataSets in datasets; if a vocabulary already exists, it is reused. Once the vocabulary is available, both datasets and only_index_dataset are indexed.

Parameters:	datasets – DataSet objects.
	only_index_dataset – DataSet, or list of DataSet. Its contents are only indexed; they are not used to build the vocabulary.
Returns:
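The build-then-index flow of process can be sketched in plain Python. This is not the library's code: `VocabIndexerSketch` is a hypothetical class, a DataSet is modelled as a list of dicts, only_index_dataset is restricted to a list of such datasets, the reserved `<pad>` and `<unk>` entries are assumptions, and the field is always overwritten in place.

```python
from collections import Counter


class VocabIndexerSketch:
    """Hypothetical stand-in: build a vocabulary once, then index datasets with it."""

    def __init__(self, field_name, min_freq=1, max_size=None):
        self.field_name = field_name
        self.min_freq = min_freq
        self.max_size = max_size
        self.vocab = None

    def construct_vocab(self, *datasets):
        counts = Counter(
            token for ds in datasets for ins in ds for token in ins[self.field_name]
        )
        # Keep the most frequent words, capped at max_size and filtered by min_freq.
        kept = [w for w, c in counts.most_common(self.max_size) if c >= self.min_freq]
        self.vocab = {"<pad>": 0, "<unk>": 1}  # assumption: two reserved entries
        for word in kept:
            self.vocab[word] = len(self.vocab)

    def process(self, *datasets, only_index_dataset=None):
        if self.vocab is None:  # reuse an existing vocabulary if there is one
            self.construct_vocab(*datasets)
        unk = self.vocab["<unk>"]
        # Datasets in only_index_dataset are indexed but never counted.
        for ds in list(datasets) + list(only_index_dataset or []):
            for ins in ds:
                ins[self.field_name] = [
                    self.vocab.get(t, unk) for t in ins[self.field_name]
                ]


train = [{"words": ["a", "b", "a"]}, {"words": ["b", "c"]}]
dev = [{"words": ["a", "z", "c"]}]
proc = VocabIndexerSketch("words", min_freq=2)
proc.process(train, only_index_dataset=[dev])
print(train, dev)
```

Here "c" appears only once in the training data and "z" appears only in the index-only set, so both fall back to the unknown-word index.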

set_verbose(verbose)[source]¶

Sets the verbosity of the processor.

Parameters:	verbose – int. 0: print nothing; 1: print vocab information.
Returns:

class fastNLP.api.processor.VocabProcessor(field_name, min_freq=1, max_size=None)[source]¶: Builds a vocabulary from one or more DataSets passed in.