pyconversations.feature_extraction.extractors
PostVectorizer
- class pyconversations.feature_extraction.PostVectorizer(normalization=None)
Vectorization engine for social media post featurization
- __init__(normalization=None)
Constructor for PostVectorizer
- Parameters
normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’
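The reference names the normalization options but does not give their formulas. The sketch below shows the conventional definitions of min-max, mean, and standard normalization applied to a single feature column. It is an assumption about what the option names mean, not the library's code, and `normalize_column` is a hypothetical helper written only for illustration.

```python
from statistics import mean, pstdev


def normalize_column(values, mode=None):
    """Rescale one feature column; `mode` mirrors the documented option names.

    Hypothetical helper using textbook formulas, not pyconversations code.
    """
    if mode is None:
        return list(values)
    lo, hi, mu = min(values), max(values), mean(values)
    span = hi - lo
    if mode == "minmax":
        # Map the column onto [0, 1].
        return [(v - lo) / span if span else 0.0 for v in values]
    if mode == "mean":
        # Center on the mean, then scale by the range.
        return [(v - mu) / span if span else 0.0 for v in values]
    if mode == "standard":
        # Zero mean and unit (population) standard deviation.
        sd = pstdev(values)
        return [(v - mu) / sd if sd else 0.0 for v in values]
    raise ValueError(f"unknown normalization: {mode!r}")


column = [2.0, 4.0, 6.0]
minmax_scaled = normalize_column(column, "minmax")
mean_scaled = normalize_column(column, "mean")
standard_scaled = normalize_column(column, "standard")
print(minmax_scaled)
print(mean_scaled)
print(standard_scaled)
```

Constant columns are mapped to zeros here to avoid division by zero; the library may handle that case differently.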
- fit(xs)
Fits the parameters necessary for normalization and vectorization of posts.
- Parameters
xs (List(UniMessage) or List(Conversation) or Conversation)
- Returns
PostVectorizer
- transform(xs, include_ids=False)
Transforms posts into a collection of vectors. Extraction is performed with or without conversational features, depending on the input provided.
- Parameters
xs (List(UniMessage) or List(Conversation) or Conversation)
include_ids (bool)
- Returns
np.array – (N, d), where N is the number of posts and d is the number of features
dict(Hashable, int) – Optional. Returned if include_ids=True and creates a map from UID to row in returned array
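When include_ids=True, the second return value makes it possible to find the row belonging to a particular post. The sketch below uses stand-in data shaped like the documented return values (a nested list in place of the np.array); the UIDs, numbers, and the `vector_for` helper are hypothetical.

```python
# Stand-in values shaped like the documented return of
# transform(xs, include_ids=True): an (N, d) matrix and a UID-to-row map.
vectors = [
    [0.1, 0.9],
    [0.4, 0.2],
    [0.7, 0.5],
]
uid_to_row = {"post-a": 0, "post-b": 1, "post-c": 2}


def vector_for(uid, vectors, uid_to_row):
    """Return the feature row for one post UID (hypothetical helper)."""
    return vectors[uid_to_row[uid]]


print(vector_for("post-b", vectors, uid_to_row))
```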
ConversationVectorizer
- class pyconversations.feature_extraction.ConversationVectorizer(normalization=None)
Vectorization engine for social media conversation featurization
- __init__(normalization=None)
Constructor for ConversationVectorizer
- Parameters
normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’
- fit(xs)
Fits the normalization parameters
- Parameters
xs (Conversation or List(Conversation))
- Returns
ConversationVectorizer
- transform(xs, include_ids=False)
Returns a set of vectors, one for each supplied conversation.
- Parameters
xs (Conversation or List(Conversation))
include_ids (bool)
- Returns
np.array
dict(Hashable, int) – Optional. Returned if include_ids=True and creates a map from UID to row in returned array
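The documented interface follows a fit-then-transform pattern, and because fit returns the vectorizer, the two calls can presumably be chained. The toy class below imitates only that call shape. `ToyVectorizer`, its single text-length feature, and the dict-based inputs are all hypothetical and stand in for the real classes and Conversation objects.

```python
class ToyVectorizer:
    """Hypothetical stand-in that mimics the documented call shape only."""

    def __init__(self, normalization=None):
        if normalization not in (None, "minmax"):
            raise ValueError("this toy supports only None or 'minmax'")
        self.normalization = normalization
        self._lo = None
        self._hi = None

    def fit(self, xs):
        # Learn the range of the single toy feature (text length).
        lengths = [len(x["text"]) for x in xs]
        self._lo, self._hi = min(lengths), max(lengths)
        return self  # returning self allows fit(...).transform(...)

    def transform(self, xs, include_ids=False):
        rows, ids = [], {}
        for i, x in enumerate(xs):
            value = float(len(x["text"]))
            if self.normalization == "minmax":
                span = self._hi - self._lo
                value = (value - self._lo) / span if span else 0.0
            rows.append([value])
            ids[x["uid"]] = i
        return (rows, ids) if include_ids else rows


train = [{"uid": "c1", "text": "hi"}, {"uid": "c2", "text": "hello there"}]
held_out = [{"uid": "c3", "text": "hey you"}]

# Fit on one collection, then reuse the fitted parameters on another.
vec = ToyVectorizer(normalization="minmax").fit(train)
rows, ids = vec.transform(held_out, include_ids=True)
print(rows, ids)
```

Fitting on one collection and transforming another keeps the normalization parameters fixed, so vectors from both collections share the same scale.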
UserVectorizer
- class pyconversations.feature_extraction.UserVectorizer(normalization=None)
Vectorizer for creating user parameter vectors
- __init__(normalization=None)
Constructor for UserVectorizer
- Parameters
normalization (None or str) – Can be None, ‘minmax’, ‘mean’, or ‘standard’
- fit(xs)
Fits normalization parameters
- Parameters
xs (Conversation or List(Conversation) or List(UniMessage))
- Returns
UserVectorizer
- transform(xs, include_ids=False)
Returns one vector for each unique user found in the input.
- Parameters
xs (Conversation, List(Conversation), or List(UniMessage))
include_ids (bool)
- Returns
np.array
dict(Hashable, int) – Optional. Returned if include_ids=True and creates a map from UID to row in returned array
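The UID-to-row map can also be inverted to label each row of the returned array, for example when exporting user vectors. A minimal sketch, assuming the map covers every row exactly once; the user IDs are made up.

```python
# Stand-in for the map returned with include_ids=True (hypothetical IDs).
uid_to_row = {"alice": 2, "bob": 0, "carol": 1}

# Build a list whose i-th entry is the UID stored in row i.
row_to_uid = [None] * len(uid_to_row)
for uid, row in uid_to_row.items():
    row_to_uid[row] = uid

print(row_to_uid)  # ['bob', 'carol', 'alice']
```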