UniMessage Class
A UniMessage is the underlying object that represents a message or post from a social media platform within PyConversations. As of v.0.0.3, there are four concrete implementations of UniMessage:
Tweet (Twitter)
RedditPost (Reddit)
ChanPost (4chan)
FBPost (Facebook)
These capture platform-specific details such as time representation, raw format reading, platform tagging, and user-mention regexes (where applicable: Twitter and Reddit). The remainder of this introduction uses the concrete Tweet class, but the operations described here apply to every class that inherits from UniMessage.
Instantiation
There are three ways to create a UniMessage. First, there is raw instantiation by filling out constructor parameters:
t = Tweet(UID, text='...', author='...',
          created_at=datetime.datetime(...),
          reply_to=[...UIDs of posts replied to...], lang='...')
Only the UID field is required; all other fields default to null values. The parameters shown here are the most common ones; the full documentation lists the rest.
Second, there is re-construction from a re-loaded JSON representation:
t = Tweet.from_json(RAW_JSON)
Finally, some UniMessage implementations provide functions that parse raw JSON data exactly as the platform outputs it:
t = Tweet.parse_raw(RAW, detect_language=True)
If you use these functions, see the documentation page of the specific class for more information.
Properties
A UniMessage has several key properties:
UID - a unique identifier for the post
text - the string message associated with the post
author - the name/ID of the author of the message
created_at - the time the post was created
reply_to - a set of UIDs of the posts that this object references/replies to
tags - a set of tags attached to this post
platform - the string name of the platform associated with this post
lang - the language code of the text of the post as given or detected
tokens - the list of tokens produced by applying the selected tokenizer to the text of this post
The members of reply_to and tags are managed through add_* and remove_* methods:
post.add_tag('test_tag')
post.remove_tag('test_tag')
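The add_*/remove_* pattern can be illustrated with a small stand-in class. This is a minimal sketch, not PyConversations code: SketchPost, add_reply_to, and remove_reply_to are hypothetical names, and the sketch assumes only what the property list above states, namely that tags and reply_to behave as sets.

```python
class SketchPost:
    """Hypothetical stand-in for a UniMessage; not the library's class."""

    def __init__(self, uid, tags=None, reply_to=None):
        self.uid = uid
        # Sets keep members unique, so adding the same tag twice is harmless.
        self.tags = set(tags or ())
        self.reply_to = set(reply_to or ())

    def add_tag(self, tag):
        self.tags.add(tag)

    def remove_tag(self, tag):
        # discard() ignores missing members; the real library may behave differently.
        self.tags.discard(tag)

    def add_reply_to(self, uid):
        self.reply_to.add(uid)

    def remove_reply_to(self, uid):
        self.reply_to.discard(uid)


post = SketchPost(1)
post.add_tag('test_tag')
post.add_tag('test_tag')   # duplicate add leaves a single member
post.add_reply_to(0)
post.remove_tag('test_tag')
```

After these calls the tag set is empty again and reply_to holds the single UID 0. Whether the library raises on removing a missing member is not stated here, so check the full documentation before relying on either behavior.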
User Mentions & Redaction
A set of the authors mentioned (if supported) can be generated with:
post.get_mentions()
This is useful for privacy and redaction purposes when combined with the .redact() function, which takes a mapping from terms to the tokens that should replace them:
term_map = {m: 'USER' for m in post.get_mentions()} # map from user names to redaction token
post.redact(term_map)
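To make the mention-then-redact flow concrete, here is a self-contained sketch that operates on a plain string. It is not the library's implementation: the mention pattern (an '@' followed by word characters, Twitter style) and the whole-word replacement strategy are assumptions, and the free functions get_mentions and redact below are hypothetical analogues of the methods above.

```python
import re

# Assumed Twitter-style mention pattern: '@' followed by word characters.
MENTION_RE = re.compile(r'@(\w+)')


def get_mentions(text):
    """Return the set of user names mentioned in text (without the '@')."""
    return set(MENTION_RE.findall(text))


def redact(text, term_map):
    """Replace each whole-word occurrence of a term with its mapped token."""
    if not term_map:
        return text
    # Try longer terms first so a short name never shadows a longer one.
    terms = sorted(term_map, key=len, reverse=True)
    pattern = re.compile(r'\b(' + '|'.join(map(re.escape, terms)) + r')\b')
    return pattern.sub(lambda m: term_map[m.group(1)], text)


text = 'thanks @alice and @bob_2 for the help'
term_map = {m: 'USER' for m in get_mentions(text)}  # user names -> redaction token
redacted = redact(text, term_map)
```

Here redacted keeps the '@' markers but replaces both user names with USER. Matching on word boundaries is a design choice of this sketch; it prevents a name such as "al" from being redacted inside an unrelated word such as "also".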
Alternate Representation
UniMessage can be exported to a JSON format that can be reloaded later:
post = post.from_json(post.to_json()) # invariance over this operation
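The round-trip property can be sketched with a stand-in class. Everything below is illustrative: SketchPost is hypothetical, and the choices to serialize created_at as an ISO 8601 string and tags as a sorted list are assumptions, not the library's actual JSON layout.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SketchPost:
    """Hypothetical stand-in for a UniMessage; not the library's class."""
    uid: int
    text: str = ''
    created_at: Optional[datetime] = None
    tags: set = field(default_factory=set)

    def to_json(self):
        # JSON has no datetime or set types, so both are converted here.
        return {
            'uid': self.uid,
            'text': self.text,
            'created_at': self.created_at.isoformat() if self.created_at else None,
            'tags': sorted(self.tags),
        }

    @classmethod
    def from_json(cls, data):
        created = data.get('created_at')
        return cls(
            uid=data['uid'],
            text=data.get('text', ''),
            created_at=datetime.fromisoformat(created) if created else None,
            tags=set(data.get('tags', ())),
        )


post = SketchPost(1, text='hello',
                  created_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
                  tags={'a', 'b'})
# Pass through an actual JSON string to show nothing is lost on disk.
restored = SketchPost.from_json(json.loads(json.dumps(post.to_json())))
```

Because the dataclass compares field by field, restored == post holds, which is the invariance the line above refers to.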
Parsing Platform Datetimes
Where applicable, concrete implementations have a static function for parsing platform-specific timestamps into a Python datetime representation:
dt = Tweet.parse_datestr(RAW)
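As an illustration of what such a parser does, the sketch below handles the created_at format used by the Twitter v1.1 API with the standard library alone. The helper name parse_twitter_datestr is hypothetical, and whether Tweet.parse_datestr uses this exact format string internally is an assumption.

```python
from datetime import datetime, timezone

# created_at layout in the Twitter v1.1 API, e.g. 'Wed Oct 10 20:19:24 +0000 2018'.
TWITTER_FORMAT = '%a %b %d %H:%M:%S %z %Y'


def parse_twitter_datestr(raw):
    """Parse a Twitter v1.1 timestamp string into a timezone-aware datetime."""
    return datetime.strptime(raw, TWITTER_FORMAT)


dt = parse_twitter_datestr('Wed Oct 10 20:19:24 +0000 2018')
```

The resulting datetime is timezone-aware (UTC offset +0000), so it can be compared safely with timestamps from other platforms once those are also made timezone-aware. Note that %a and %b are locale-dependent, so this sketch assumes an English locale.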