With the increasing presence of censorship on Chinese social media, it is imperative to give users of platforms such as Sina Weibo a way to share information freely without alerting the censors and surveillance systems on those platforms. The aim of this project is to implement a Real-Time Keyword Aggregator that collects, from various publicly available archives of censored Sina Weibo posts, the keywords that most likely caused those posts to be censored. In this work, we use a distributed computing approach to identify additional candidate keywords in the posts with a TF-IDF based technique. The result of this project will be a large, continuously populated and curated homophone dictionary for keywords currently censored on Sina Weibo.
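To illustrate the TF-IDF idea mentioned above, here is a minimal single-machine sketch. It is not the project's implementation: the function name `tfidf_keywords`, the `top_k` parameter, and the assumption that posts are already segmented into tokens are all hypothetical, and the distributed part of the pipeline is left out entirely.

```python
import math
from collections import Counter


def tfidf_keywords(censored_posts, top_k=3):
    """Rank the terms of each post by TF-IDF (hypothetical helper).

    Assumes every post is already a list of tokens; word segmentation
    of the original text is outside the scope of this sketch.
    """
    n_docs = len(censored_posts)

    # Document frequency: in how many posts each term appears.
    df = Counter()
    for tokens in censored_posts:
        df.update(set(tokens))

    ranked = []
    for tokens in censored_posts:
        tf = Counter(tokens)
        # Score = relative term frequency * log inverse document frequency.
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked.append(sorted(scores, key=lambda t: (-scores[t], t))[:top_k])
    return ranked


# Placeholder tokens stand in for segmented post text.
posts = [["a", "b", "b"], ["a", "c"], ["a", "d", "d", "d"]]
top_terms = tfidf_keywords(posts, top_k=1)
```

In this toy input the token `a` occurs in every post, so its inverse document frequency is zero and it never ranks first. Terms concentrated in a few posts score highest, which is the property that makes them plausible keyword candidates.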
![](https://gvu.gatech.edu/sites/default/files/research_lab_images/lab-comp.social.jpg)
The comp.social lab focuses on the design and analysis of social media. According to their website, they "like puppies, mixed methods and new students (particularly MS)."