pyconversations.message.UniMessage

class pyconversations.message.UniMessage(uid, text='', author=None, created_at=None, reply_to=None, platform=None, lang=None, tags=None, lang_detect=False, tokenizer='partitioner')

The Universal Message class.

This is designed to be the abstract base class that all social media posts and conversation turns inherit from. The only mandatory field is the uid, a unique identifier.

add_reply_to(tid)

Adds a new UID that this message is replying to.

Parameters

tid (UID) – The UID to be added

Returns

None

add_tag(tag)

Adds a new tag to this message.

Parameters

tag (str) – The tag to be added

Returns

None

property author

Returns the author of this message.

Returns

str – Author name/username

property created_at

Returns the datetime associated with this message.

Returns

datetime.datetime – Time of creation of post. Could be None if not available/processed.

classmethod from_json(data)

Given an exported JSON object for a Universal Message, loads the saved data into the fields of a new message object.

Parameters

data (JSON/dict) – The raw message JSON

Returns

Message class – The newly created object, an instance of the inheriting UniMessage subclass

get_mentions()

By default, this simply returns the author of the post (if available) so that it can be anonymized appropriately.

Returns

set(str) – The mentions detected in this message

property lang

Returns the language this post was written in.

Returns

str – Language code of the message text

abstract static parse_datestr(x)

Abstract static method that specifies how to convert the native datetime string into a Python datetime object.

Parameters

x (str) – The raw datetime string
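
As an illustration of what a concrete subclass might supply here, the sketch below parses a timestamp of the form `Wed Oct 10 20:19:24 +0000 2018` with `datetime.strptime`. The function is standalone rather than a method of a real subclass, and the format string is an assumption chosen for the example; each platform's native format may differ.

```python
from datetime import datetime

# Hypothetical format for illustration; not taken from the library.
EXAMPLE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_datestr(x):
    """Convert a raw datetime string into a timezone-aware datetime."""
    return datetime.strptime(x, EXAMPLE_FORMAT)


parsed = parse_datestr("Wed Oct 10 20:19:24 +0000 2018")
print(parsed.isoformat())  # prints: 2018-10-10T20:19:24+00:00
```

Keeping the format string in one constant makes it easy to swap when a platform changes its timestamp layout.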

abstract static parse_raw(raw, lang_detect=False)

Abstract static method that must be implemented by all non-abstract child classes. Concrete implementations should specify how to parse the raw data into this object.

Parameters
  • raw (JSON/dict) – The raw data to be pre-processed.

  • lang_detect (bool) – A boolean which specifies whether language detection should be activated. (Default: False)
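
The two abstract static methods together define what a concrete message class has to provide. The sketch below shows the pattern with a stand-in base class built on `abc`, so it runs without the library installed. `MessageBase`, `ExampleMessage`, and the raw keys `id`, `body`, `user`, and `time` are all hypothetical names invented for this example; a real subclass would inherit from `UniMessage` and follow its platform's actual data layout.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class MessageBase(ABC):
    """Stand-in for UniMessage; mirrors only part of the documented signature."""

    def __init__(self, uid, text="", author=None, created_at=None):
        self.uid = uid
        self.text = text
        self.author = author
        self.created_at = created_at

    @staticmethod
    @abstractmethod
    def parse_datestr(x):
        """Convert the native datetime string into a datetime object."""

    @staticmethod
    @abstractmethod
    def parse_raw(raw, lang_detect=False):
        """Build a message object from raw platform data."""


class ExampleMessage(MessageBase):
    """Hypothetical concrete message for an imaginary platform."""

    @staticmethod
    def parse_datestr(x):
        # Assumes ISO 8601 timestamps such as "2021-03-04T05:06:07".
        return datetime.fromisoformat(x)

    @staticmethod
    def parse_raw(raw, lang_detect=False):
        # lang_detect is accepted to match the signature but ignored in this sketch.
        return ExampleMessage(
            uid=raw["id"],
            text=raw.get("body", ""),
            author=raw.get("user"),
            created_at=ExampleMessage.parse_datestr(raw["time"]),
        )


msg = ExampleMessage.parse_raw(
    {"id": 1, "body": "hello", "user": "alice", "time": "2021-03-04T05:06:07"}
)
print(msg.uid, msg.author, msg.created_at)
```

Because both methods are abstract, instantiating the stand-in base directly raises `TypeError`; only a subclass that implements both can be created.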

property platform

The platform this message was created on.

Returns

str – Platform name

redact(redact_map)

Given a map of terms to their replacements, this function redacts all instances of those terms. It is mainly intended for redacting usernames or user mentions, so as to protect user privacy.

Parameters

redact_map (dict(str, str)) – The map of terms and what they should be replaced with

Returns

None
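
To make the shape of `redact_map` concrete, here is a minimal sketch of how such a map could be applied to a string. The helper `apply_redact_map` is hypothetical and is not the library's implementation; in particular, matching longer terms first is a design choice of this sketch, and the library's actual matching rules may differ.

```python
import re


def apply_redact_map(text, redact_map):
    """Replace every occurrence of each key in redact_map with its value."""
    if not redact_map:
        return text
    # Longest terms first so "alice_b" is not partially matched by "alice".
    terms = sorted(redact_map, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    return pattern.sub(lambda m: redact_map[m.group(0)], text)


redact_map = {"alice": "USER0", "alice_b": "USER1"}
print(apply_redact_map("@alice replied to @alice_b", redact_map))
# prints: @USER0 replied to @USER1
```

Doing all substitutions in a single pass also prevents a replacement value from being redacted again by a later term.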

remove_reply_to(tid)

Removes a UID from the set this message is replying to.

Parameters

tid (UID) – The UID to be removed

remove_tag(tag)

Removes a tag from this message.

Parameters

tag (str) – The tag to remove

Returns

None

property reply_to

Returns the unique identifiers of the messages that are replied to by this message.

Returns

set(UID) – The set of UIDs of the posts this message replies to

property tags

Returns the tags associated with this message.

Returns

set(str) – Set of string tags associated with this message

property text

The text associated with this message.

Returns

str – Message text

to_json()

Exports a Universal Message into a JSON object for storage and later use.

Returns

JSON/dict – The JSON formatted UniMessage for disk storage
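
The export and load methods are meant to be used as a pair. The sketch below shows one way such a round trip could work, using a stand-in class so that it runs without the library. `SketchMessage` and the key names in its dictionary are assumptions made for this example; the JSON layout actually produced by `to_json()` is not specified here and may differ.

```python
import json
from datetime import datetime


class SketchMessage:
    """Stand-in with a subset of the documented fields; not the library class."""

    def __init__(self, uid, text="", author=None, created_at=None,
                 reply_to=None, tags=None):
        self.uid = uid
        self.text = text
        self.author = author
        self.created_at = created_at
        self.reply_to = set(reply_to or ())
        self.tags = set(tags or ())

    def to_json(self):
        # Sets become sorted lists and datetimes become ISO strings,
        # because JSON has no native representation for either.
        return {
            "uid": self.uid,
            "text": self.text,
            "author": self.author,
            "created_at": self.created_at.isoformat() if self.created_at else None,
            "reply_to": sorted(self.reply_to),
            "tags": sorted(self.tags),
        }

    @classmethod
    def from_json(cls, data):
        created = data.get("created_at")
        return cls(
            uid=data["uid"],
            text=data.get("text", ""),
            author=data.get("author"),
            created_at=datetime.fromisoformat(created) if created else None,
            reply_to=data.get("reply_to"),
            tags=data.get("tags"),
        )


original = SketchMessage(2, "hi", "bob", datetime(2021, 1, 1),
                         reply_to={1}, tags={"greeting"})
stored = json.dumps(original.to_json())  # the string that would be written to disk
restored = SketchMessage.from_json(json.loads(stored))
print(restored.to_json() == original.to_json())  # prints: True
```

Comparing the exported dictionaries before and after the round trip is a quick check that no field is lost in storage.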

property tokens

Tokenizes the text of this message.

Returns

list(str) – The tokenized text

property uid

The unique identifier of this object.

Returns

UID – Unique identifier for this message.