pyconversations.message.UniMessage
- class pyconversations.message.UniMessage(uid, text='', author=None, created_at=None, reply_to=None, platform=None, lang=None, tags=None, lang_detect=False, tokenizer='partitioner')
The Universal Message class.
This is designed to be the abstract base object that all social media posts and conversation turns inherit from. The only mandatory field is uid, which must be a unique identifier.
- add_reply_to(tid)
Adds a new UID that this message is replying to.
- Parameters
tid (UID) – The UID to be added
- Returns
None
- add_tag(tag)
Adds a new tag to this message.
- Parameters
tag (str) – The tag to be added
- Returns
None
- property author
Returns the author of this message.
- Returns
str – Author name/username
- property created_at
Returns the datetime associated with this message.
- Returns
datetime.datetime – Time of creation of post. Could be None if not available/processed.
- classmethod from_json(data)
Given an exported JSON object for a Universal Message, loads the saved data into the message's fields.
- Parameters
data (JSON/dict) – The raw message JSON
- Returns
Message class – Created inherited UniMessage object
- get_mentions()
By default, this simply returns the author of the post (if available) so that it can be anonymized appropriately.
- Returns
set(str) – The mentions detected in this message
- property lang
Returns the language this post was written in.
- Returns
str – Language code of the message text
- abstract static parse_datestr(x)
Abstract static method that specifies how to convert the native datetime string into a Python datetime object.
- Parameters
x (str) – The raw datetime string
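As a rough illustration of what a concrete implementation might look like, the sketch below parses a hypothetical timestamp format with the standard library. The format string is an assumption made for this example; each platform's native format will differ, and this is not taken from the library's own subclasses.

```python
from datetime import datetime


def parse_datestr(x):
    """Convert a raw datetime string into a datetime object.

    Assumes a hypothetical native format such as '2021-03-04 15:30:00';
    a real subclass would use its own platform's format instead.
    """
    return datetime.strptime(x, "%Y-%m-%d %H:%M:%S")


parsed = parse_datestr("2021-03-04 15:30:00")
```

In a subclass, the same function body would sit under a `@staticmethod` decorator.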
- abstract static parse_raw(raw, lang_detect=False)
Abstract static method that must be implemented by all non-abstract child classes. Concrete implementations should specify how to parse the raw data into this object.
- Parameters
raw (JSON/dict) – The raw data to be pre-processed.
lang_detect (bool) – A boolean which specifies whether language detection should be activated. (Default: False)
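The pattern of overriding both abstract static methods can be sketched as follows. To keep the example runnable without the library, it uses a stand-in base class (`StandInMessage`, hypothetical) that mirrors only part of the documented constructor. The `ForumPost` class, the raw field names (`id`, `body`, `user`, `posted`), and the choice to return a new instance from `parse_raw` are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class StandInMessage(ABC):
    """Hypothetical stand-in for UniMessage; not the library class."""

    def __init__(self, uid, text='', author=None, created_at=None):
        self.uid = uid
        self.text = text
        self.author = author
        self.created_at = created_at

    @staticmethod
    @abstractmethod
    def parse_datestr(x):
        ...

    @staticmethod
    @abstractmethod
    def parse_raw(raw, lang_detect=False):
        ...


class ForumPost(StandInMessage):
    """Hypothetical concrete message type for an imaginary forum export."""

    @staticmethod
    def parse_datestr(x):
        # Assumed native format for this imaginary platform.
        return datetime.strptime(x, "%Y-%m-%d %H:%M:%S")

    @staticmethod
    def parse_raw(raw, lang_detect=False):
        # lang_detect is accepted to match the interface but ignored here.
        return ForumPost(
            uid=raw["id"],
            text=raw.get("body", ""),
            author=raw.get("user"),
            created_at=ForumPost.parse_datestr(raw["posted"]),
        )


post = ForumPost.parse_raw(
    {"id": 7, "body": "hello", "user": "ana", "posted": "2021-03-04 15:30:00"}
)
```

Because both methods are abstract on the base class, the base itself cannot be instantiated; only subclasses that implement both can.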
- property platform
The platform this message was created on
- Returns
str – Platform name
- redact(redact_map)
Given a map of terms to their replacements, this function redacts all instances of those terms. It is mainly used for redacting usernames or user mentions, so as to protect user privacy.
- Parameters
redact_map (dict(str, str)) – The map of terms and what they should be replaced with
- Returns
None
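To make the shape of redact_map concrete, here is a minimal standalone sketch of map-based redaction on a plain string. It is not the library's implementation: the helper name `redact_text` is hypothetical, and it returns a new string, whereas the documented method returns None. Replacing longer terms first is a design choice of this sketch so that a short username cannot clobber part of a longer one.

```python
def redact_text(text, redact_map):
    """Replace each term in redact_map with its mapped replacement.

    Longer terms are handled first so that '@alice' does not
    partially overwrite '@alice_b'.
    """
    for term in sorted(redact_map, key=len, reverse=True):
        text = text.replace(term, redact_map[term])
    return text


redacted = redact_text(
    "@alice replied to @alice_b",
    {"@alice": "@USER1", "@alice_b": "@USER2"},
)
```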
- remove_reply_to(tid)
Removes a UID from the set this message is replying to.
- Parameters
tid (UID) – The UID to be removed
- remove_tag(tag)
Removes a tag from this message.
- Parameters
tag (str) – The tag to remove
- Returns
None
- property reply_to
Returns the unique identifiers of the messages that are replied to by this message.
- Returns
set(UID) – The set of UIDs of the posts this message replies to
- property tags
Returns the tags associated with this message.
- Returns
set(str) – Set of string tags associated with this message
- property text
The text associated with this message.
- Returns
str – Message text
- to_json()
Exports this Universal Message as a JSON object for storage and later use.
- Returns
JSON/dict – The JSON formatted UniMessage for disk storage
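One practical point when exporting a message: reply_to and tags are documented as sets, and Python's json module cannot serialize sets directly. The sketch below shows one way a round trip could handle this, using plain dictionaries rather than the library class. The helper names and the dictionary keys are assumptions for illustration and may not match the layout that to_json() actually produces.

```python
import json


def message_to_dict(uid, text, reply_to, tags):
    """Build a JSON-safe dict; set-valued fields become sorted lists."""
    return {
        "uid": uid,
        "text": text,
        "reply_to": sorted(reply_to),
        "tags": sorted(tags),
    }


def message_from_dict(data):
    """Restore the set-valued fields from their list form."""
    return {
        "uid": data["uid"],
        "text": data["text"],
        "reply_to": set(data["reply_to"]),
        "tags": set(data["tags"]),
    }


original = message_to_dict(3, "hi", {1, 2}, {"greeting"})
# Serialize to a string and back, as would happen with disk storage.
restored = message_from_dict(json.loads(json.dumps(original)))
```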
- property tokens
Tokenizes the text of this message
- Returns
list(str) – The tokenized text
- property uid
The unique identifier of this object.
- Returns
UID – Unique identifier for this message.