SSIX - Social Sentiment Indices powered by X-Scores

The main source of data for the SSIX project is Twitter. It is the only platform still providing free and continuous access to live streams of content, through its official Streaming APIs (for more information, read this article on how to collect data from Twitter).

Although all data retrieved from Twitter must be considered publicly accessible – primarily because users have automatically given consent to the distribution of their data to third parties outside the Twitter platform – the SSIX project pays particular attention to the privacy of the data it collects.

To achieve this, the data ingestion infrastructure of the architecture, provided by 3rdPLACE, anonymizes the data entering the system before storing it. The process consists of removing from the original Twitter object (received in JSON format) all fields that have been identified as potentially sensitive. The SSIX platform does not use these fields for sentiment analysis or for generating the X-Scores.

Here is a list of the fields that are currently discarded:

user.name
user.screen_name
user.description
in_reply_to_screen_name
entities.user_mentions.name
entities.user_mentions.screen_name
user.profile_banner_url
user.profile_image_url
user.profile_image_url_https
user.profile_background_image_url
user.profile_background_image_url_https
user.url
quoted_status.user.name
quoted_status.user.screen_name
retweeted_status.user.name
retweeted_status.user.screen_name
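
The removal step described above can be sketched in Python as follows. This is only a minimal illustration, not the actual 3rdPLACE implementation: the names `DISCARDED_FIELDS`, `_remove_path` and `anonymize` are hypothetical, and the sketch assumes that each tweet has already been parsed from JSON into a dictionary and that `entities.user_mentions` holds a list of objects.

```python
import copy

# Dotted paths of the fields listed above (hypothetical constant name).
DISCARDED_FIELDS = [
    "user.name",
    "user.screen_name",
    "user.description",
    "in_reply_to_screen_name",
    "entities.user_mentions.name",
    "entities.user_mentions.screen_name",
    "user.profile_banner_url",
    "user.profile_image_url",
    "user.profile_image_url_https",
    "user.profile_background_image_url",
    "user.profile_background_image_url_https",
    "user.url",
    "quoted_status.user.name",
    "quoted_status.user.screen_name",
    "retweeted_status.user.name",
    "retweeted_status.user.screen_name",
]


def _remove_path(node, parts):
    """Delete the field addressed by `parts`, descending into lists on the way."""
    if isinstance(node, list):
        # A list (e.g. user_mentions) is handled element by element.
        for item in node:
            _remove_path(item, parts)
        return
    if not isinstance(node, dict):
        return
    head, rest = parts[0], parts[1:]
    if not rest:
        node.pop(head, None)  # a missing field is silently ignored
    elif head in node:
        _remove_path(node[head], rest)


def anonymize(tweet):
    """Return a copy of `tweet` without the discarded fields."""
    cleaned = copy.deepcopy(tweet)  # leave the caller's object untouched
    for path in DISCARDED_FIELDS:
        _remove_path(cleaned, path.split("."))
    return cleaned


if __name__ == "__main__":
    sample = {
        "id": 1,
        "text": "hello",
        "user": {"id": 42, "name": "Jane", "screen_name": "jane"},
        "entities": {"user_mentions": [{"id": 7, "name": "Bob", "screen_name": "bob"}]},
    }
    print(anonymize(sample))
```

Working on a deep copy keeps the incoming object intact, which is convenient for testing. A real ingestion pipeline might instead modify the object in place to avoid the extra copy.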

All the information kept after this process is stored in a secure repository on the Google Cloud Platform, as explained in this previous article.

 

This blog post was written by SSIX partner 3rdPLACE.

 
