pyconversations.feature_extraction.extractors

PostVectorizer

class pyconversations.feature_extraction.PostVectorizer(normalization=None)

Vectorization engine for social media post featurization

__init__(normalization=None)

Constructor for PostVectorizer

Parameters

normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’
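
The source does not define the three normalization modes. The sketch below shows the conventional meaning of each name applied to a single feature column, assuming the vectorizers follow the usual definitions. The function normalize_column is hypothetical and is not part of pyconversations.

```python
from statistics import fmean, pstdev


def normalize_column(values, mode=None):
    """Rescale one feature column with a conventional normalization scheme.

    Assumed (not confirmed by the source) definitions:
      None       -> values unchanged
      'minmax'   -> (x - min) / (max - min)
      'mean'     -> (x - mean) / (max - min)
      'standard' -> (x - mean) / population standard deviation
    """
    if mode is None:
        return list(values)
    lo, hi = min(values), max(values)
    mu = fmean(values)
    if mode == 'minmax':
        shift, scale = lo, hi - lo
    elif mode == 'mean':
        shift, scale = mu, hi - lo
    elif mode == 'standard':
        shift, scale = mu, pstdev(values)
    else:
        raise ValueError(f"unknown normalization: {mode!r}")
    scale = scale or 1.0  # avoid dividing by zero on a constant column
    return [(x - shift) / scale for x in values]


column = [2.0, 4.0, 6.0]
minmax = normalize_column(column, 'minmax')      # [0.0, 0.5, 1.0]
mean = normalize_column(column, 'mean')          # [-0.5, 0.0, 0.5]
standard = normalize_column(column, 'standard')  # zero mean, unit variance
print(minmax, mean, standard)
```

Whether the library uses the population or the sample standard deviation for ‘standard’ is not stated in the source; the sketch picks the population form.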

fit(xs)

Fits the parameters necessary for normalization and vectorization of posts.

Parameters

xs (List(UniMessage) or List(Conversation) or Conversation)

Returns

PostVectorizer

transform(xs, include_ids=False)

Transforms posts into a collection of vectors. The extraction is performed with or without conversational features, depending on the input provided.

Parameters
  • xs (List(UniMessage) or List(Conversation) or Conversation)

  • include_ids (bool)

Returns

  • np.array – (N, d), where N is the number of posts and d is the number of features

  • dict(Hashable, int) – Optional. Returned only if include_ids=True; maps each UID to its row in the returned array

ConversationVectorizer

class pyconversations.feature_extraction.ConversationVectorizer(normalization=None)

Vectorization engine for social media conversation featurization

__init__(normalization=None)

Constructor for ConversationVectorizer

Parameters

normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’

fit(xs)

Fits the normalization parameters

Parameters

xs (Conversation or List(Conversation))

Returns

ConversationVectorizer
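
Keeping fit separate from transform means normalization parameters can be learned from one set of conversations and reused on another. The sketch below illustrates that pattern with a hypothetical min-max scaler over plain lists; MinMaxSketch is not pyconversations code and makes no claim about how the library stores its parameters.

```python
class MinMaxSketch:
    """Hypothetical illustration of the fit/transform split."""

    def fit(self, rows):
        # Learn the per-feature minimum and range from the fitting data only.
        columns = list(zip(*rows))
        self.lo = [min(c) for c in columns]
        self.span = [(max(c) - min(c)) or 1.0 for c in columns]
        return self  # returning self allows fit(...).transform(...)

    def transform(self, rows):
        # Reuse the stored parameters so unseen data is scaled consistently.
        return [
            [(x - lo) / span for x, lo, span in zip(row, self.lo, self.span)]
            for row in rows
        ]


train = [[0.0, 10.0], [4.0, 30.0]]
held_out = [[2.0, 40.0]]
scaler = MinMaxSketch().fit(train)
print(scaler.transform(held_out))
```

Values outside the fitted range, such as the second feature of the held-out row, fall outside [0, 1] because the parameters come from the fitting data alone.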

transform(xs, include_ids=False)

Returns a set of vectors, one for each supplied conversation.

Parameters
  • xs (Conversation or List(Conversation))

  • include_ids (bool)

Returns

  • np.array

  • dict(Hashable, int) – Optional. Returned only if include_ids=True; maps each UID to its row in the returned array
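
To pair rows of the returned array with labels or metadata, it can help to invert the UID-to-row map. The sketch below does this for any such dict, assuming the row indices run from 0 to N-1 with no gaps; the helper ids_in_row_order is hypothetical and not part of the library.

```python
def ids_in_row_order(uid_to_row):
    """Invert a UID -> row map into a list where position i holds the UID of row i."""
    ordered = [None] * len(uid_to_row)
    for uid, row in uid_to_row.items():
        ordered[row] = uid
    return ordered


# Toy map of the shape returned alongside the array when include_ids=True.
uid_to_row = {"conv-9": 1, "conv-3": 0, "conv-7": 2}
print(ids_in_row_order(uid_to_row))
```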

UserVectorizer

class pyconversations.feature_extraction.UserVectorizer(normalization=None)

Vectorizer for creating user parameter vectors

__init__(normalization=None)

Constructor for UserVectorizer

Parameters

normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’

fit(xs)

Fits normalization parameters

Parameters

xs (Conversation or List(Conversation) or List(UniMessage))

Returns

UserVectorizer

transform(xs, include_ids=False)

Returns a set of user vectors, one for each unique user found in the input.

Parameters
  • xs (Conversation, List(Conversation), or List(UniMessage))

  • include_ids (bool)

Returns

  • np.array

  • dict(Hashable, int) – Optional. Returned only if include_ids=True; maps each UID to its row in the returned array