Bitcoin, social networks and semantic analysis (part 2)

An unconventional approach such as semantic analysis can help solve an unconventional problem, such as predicting the trend of the exchange rate of cryptocurrencies (Bitcoin in particular) against traditional currencies.

This was the point made by Gian Piero Oggero in his article “Bitcoin, social networks and semantic analysis”, which explained how the standard predictive methods normally adopted in traditional financial markets show signs of inefficiency and produce questionable results in an innovative environment such as Bitcoin.

During the last few months the Bitcoin transaction volume has gone up, generating a lot of interest that has reverberated across the web and in particular on social networks. Starting from this consideration, and taking into account the complexity of the problem, when we talk about predictive analysis here we actually mean “nowcasting”, that is, estimating the present and the very near future rather than making long-range forecasts.

The main difference between Bitcoin trading and traditional trading is its extreme volatility: transactions are not executed in real time, and a transaction can take up to 5 minutes to complete.

It therefore becomes necessary to monitor all conversations with appropriate tools (such as Damantic’s Discover Environment, specifically engineered for Open Source INTelligence activities) in order to identify a possible correlation between the volume (quantity of messages) and the “quality” (positive or negative) of the conversations on the one hand and the Bitcoin price on the other.

Among the preliminary monitoring activities, source selection is crucial, that is, choosing the sources of the data that we are going to process. For each identified source we need to carefully assign a relative score for authority and reliability; in essence, a message from an anonymous individual does not carry the same ranking, in terms of market influence, as a communication posted by a corporation (e.g. Expedia, eBay) or a large news publisher (The Guardian, The New York Times, etc.). At the same time, a group with a large number of “followers” could be a determining factor in the sentiment ranking.
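
The source scoring described above can be pictured as a weighted average of message sentiment. What follows is only a minimal sketch, not Damantic's actual scoring: the source categories, the weight values in `AUTHORITY` and the logarithmic follower boost are all assumptions made for the sake of the example.

```python
import math
from dataclasses import dataclass
from typing import Sequence

# Hypothetical authority weights per source type (illustrative values only).
AUTHORITY = {"individual": 1.0, "corporation": 3.0, "news_publisher": 5.0}


@dataclass
class Message:
    source_type: str   # key into AUTHORITY
    followers: int     # follower count of the posting account
    sentiment: float   # -1.0 (negative) .. +1.0 (positive)


def message_weight(msg: Message) -> float:
    """Authority weight scaled by a logarithmic follower boost (an assumption)."""
    return AUTHORITY[msg.source_type] * (1.0 + math.log10(1 + msg.followers))


def weighted_sentiment(messages: Sequence[Message]) -> float:
    """Weighted mean sentiment; 0.0 when there are no messages."""
    total = sum(message_weight(m) for m in messages)
    if total == 0:
        return 0.0
    return sum(message_weight(m) * m.sentiment for m in messages) / total


# One negative message from an individual, one positive from a publisher.
sample = [
    Message("individual", 9, -1.0),
    Message("news_publisher", 99, 1.0),
]
score = weighted_sentiment(sample)
```

With these made-up weights the publisher's positive message dominates the individual's negative one, so the aggregate score is positive. The logarithm is one way to keep very large follower counts from swamping everything else; other dampening functions would serve equally well.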

During the monitoring session, each conversation is enriched with metadata produced by the semantic software application for NLP (Natural Language Processing). The new information, such as the message category, its main theme and the “mood” obtained from sentiment analysis, is then fed, along with a time series of the Bitcoin price, as input to a variant of the Kalman filter (1), a well-known model used in finance to generate optimized projections.
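
To make the idea concrete, here is a minimal sketch of a scalar Kalman filter in which the sentiment score enters as a control input. The article does not specify which variant is used, so the model in the docstring, the function name `kalman_nowcast` and the parameters `b`, `q` and `r` are assumptions chosen for illustration.

```python
def kalman_nowcast(prices, sentiments, b=1.0, q=1.0, r=4.0):
    """Scalar Kalman filter with sentiment as a control input.

    Assumed model (illustrative only):
        x_t = x_{t-1} + b * s_t + w_t,   w_t ~ N(0, q)
        z_t = x_t + v_t,                 v_t ~ N(0, r)
    where z_t is the observed price and s_t the sentiment score.
    Returns the filtered price estimate after each observation.
    """
    x, p = prices[0], r          # initialise from the first observation
    estimates = [x]
    for z, s in zip(prices[1:], sentiments[1:]):
        # Predict: random walk shifted by the sentiment signal.
        x_pred = x + b * s
        p_pred = p + q
        # Update: blend prediction and observation via the Kalman gain.
        k = p_pred / (p_pred + r)
        x = x_pred + k * (z - x_pred)
        p = (1.0 - k) * p_pred
        estimates.append(x)
    return estimates


# Made-up prices and sentiment scores, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
sentiments = [0.0, 0.5, -0.2, 0.8]
filtered = kalman_nowcast(prices, sentiments)
```

A larger `r` makes the filter trust noisy price observations less, while a larger `b` lets the sentiment signal move the prediction more; in practice both would have to be estimated from data.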

The information obtained from the time series data is integrated with the cross-sectional data, thereby taking advantage of all the information available, which is otherwise split between numeric (quantitative) data and data coming from the analysis of the conversations (qualitative).

For reasons of space we will focus on the general considerations obtained from processing Twitter data. In light of the volume collected, Twitter is once again one of the most important sources for determining users’ mood and perceptions, and it will therefore be considered the main source for Bitcoin trade analysis.
We identified 4 main emerging properties:
  1. a balance between positive and negative tweets is directly correlated with a high volume of transactions;
  2. an imbalance in favour of positive tweets is directly correlated with a higher average price;
  3. when the exchange volume has been (and currently is) high, the emotional value of the tweets is generally positive;
  4. the ratio of positive to negative tweets, together with a stable ratio between the emotional value and the volume of transactions, can signal a speculative “momentum” in the marketplace.
In summary, the properties above show how the “virtual emotions” obtained from tweet analysis can be treated as an additional marker for correctly interpreting fluctuations in the Bitcoin exchange rate (or that of any cryptocurrency, for that matter, as long as a large enough number of conversations is available).
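
One simple way to check relationships like those listed above is to compute a per-day balance of opinions and correlate it with a market series. The sketch below uses a plain Pearson coefficient; the helper names and all the figures are made up for illustration and are not results from the analysis described in this article.

```python
import math


def net_sentiment(positive, negative):
    """Balance of opinions in [-1, 1]; 0.0 when there are no tweets."""
    total = positive + negative
    return (positive - negative) / total if total else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Made-up daily (positive, negative) tweet counts and average prices.
daily_counts = [(120, 80), (150, 50), (90, 110), (200, 40)]
avg_price = [610.0, 640.0, 590.0, 655.0]

balance = [net_sentiment(p, n) for p, n in daily_counts]
r = pearson(balance, avg_price)
```

The same `pearson` helper can be applied to the balance against transaction volume instead of price. A coefficient near +1 or -1 only indicates a linear association over the chosen window; it says nothing about causation.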

In this article we presented some of the main techniques used by Damantic to forecast the trade value of cryptocurrencies: the ratio between the exchange volume and the total number of (positive and negative) opinions obtained from conversation analysis suggests a relationship stronger than mere chance. The explanation seems to lie in the combination of “standard” techniques for building time series models with the possibilities offered by new semantic analysis technologies, sentiment analysis in particular, which supports the view that every cryptocurrency is, in fact, a “social currency”.

(1) For detailed information on the Kalman filter, see R.E. Kalman, “A New Approach to Linear Filtering and Prediction Problems”, Transactions of the ASME – Journal of Basic Engineering, vol. 82, Series D, pp. 35–45, 1960.

* Head of Data Scientist - Damantic