pyconversations.tokenizers

class pyconversations.tokenizers.BaseTokenizer(name)

The abstract Tokenizer class.

abstract tokenize(s)

Splits a string into tokens.
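
As a rough illustration of the abstract-class pattern described above, the sketch below defines a stand-in base class and one concrete subclass. `SketchBaseTokenizer` and `CommaTokenizer` are hypothetical names, not part of pyconversations. The stand-in copies only the documented `(name)` constructor argument and `tokenize(s)` method, and storing the name as an attribute is an assumption.

```python
from abc import ABC, abstractmethod
from typing import List


class SketchBaseTokenizer(ABC):
    """Hypothetical stand-in mirroring the documented BaseTokenizer(name) shape."""

    def __init__(self, name: str) -> None:
        # Assumption: the name is simply kept on the instance.
        self.name = name

    @abstractmethod
    def tokenize(self, s: str) -> List[str]:
        """Split a string into tokens."""


class CommaTokenizer(SketchBaseTokenizer):
    """Example subclass: splits on commas and drops empty pieces."""

    def __init__(self) -> None:
        super().__init__("comma")

    def tokenize(self, s: str) -> List[str]:
        return [part.strip() for part in s.split(",") if part.strip()]


tokens = CommaTokenizer().tokenize("red, green ,blue,")
print(tokens)  # ['red', 'green', 'blue']
```

Because `tokenize` is abstract in this sketch, instantiating the base class directly raises `TypeError`; only subclasses that implement `tokenize` can be created.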

class pyconversations.tokenizers.DefaultTokenizer

A tokenizer that uses Python’s built-in str.split method.

tokenize(s)

Splits a string into tokens.
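
To show what splitting with `str.split` produces, here is a small plain-Python example. It assumes the tokenizer calls `str.split` with no arguments, which the documentation does not confirm. No pyconversations import is used.

```python
text = "Hello,  world!\tHow are you?\n"

# With no arguments, str.split splits on runs of whitespace
# and discards leading and trailing whitespace.
tokens = text.split()
print(tokens)  # ['Hello,', 'world!', 'How', 'are', 'you?']
```

Note that punctuation stays attached to the neighbouring word, so "Hello," and "world!" are each a single token under this scheme.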

class pyconversations.tokenizers.LambdaTokenizer(func)

A tokenizer that wraps a user-supplied function, such as a lambda.

tokenize(s)

Splits a string into tokens.
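
The wrapping idea can be sketched with a small stand-in class. `SketchLambdaTokenizer` is a hypothetical name, not the library's implementation. It assumes that `tokenize(s)` simply delegates to the callable passed as `func` and returns its result.

```python
import re
from typing import Callable, List


class SketchLambdaTokenizer:
    """Hypothetical stand-in: delegates tokenization to a wrapped callable."""

    def __init__(self, func: Callable[[str], List[str]]) -> None:
        self.func = func

    def tokenize(self, s: str) -> List[str]:
        # Assumption: the wrapped function's return value is passed through unchanged.
        return self.func(s)


# Wrap a lambda that lowercases the text and keeps runs of word characters.
word_tokenizer = SketchLambdaTokenizer(lambda s: re.findall(r"\w+", s.lower()))
tokens = word_tokenizer.tokenize("Stay calm, carry on!")
print(tokens)  # ['stay', 'calm', 'carry', 'on']
```

Any callable that takes a string and returns a list of tokens would fit this pattern, so a named function works as well as a lambda.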

class pyconversations.tokenizers.NLTKTokenizer

An NLTK-based tokenizer.

tokenize(s)

Splits a string into tokens.

class pyconversations.tokenizers.PartitionTokenizer(space=True, charset=None)

A custom tokenizer based on Partitioner by Jake Ryland Williams.

tokenize(s)

Splits a string into tokens.