Fake data: How Bitcoin can (not!) predict the future of the Tesla stock price

I was scrolling through my Twitter feed the other day and stumbled upon a tweet by a fellow Icelandic economist, Viðar Ingason:

For those of you who read Icelandic, I do not need to explain Vidar’s comment on the chart (which is not his creation). However, for those who do not read Icelandic (99.994% of the human population), here is a rough translation:

“It is interesting to see how everybody wants to predict the Tesla share price peak. Which is fine and dandy. But this kind of chart provides no information about anything. This is called spurious correlation and it provides no insights at all. I prefer the correlation between avocados and Bitcoin.”

Vidar is 100% correct. This chart is misleading, to say the least. I am not saying anyone intended to mislead with this chart; I do not know the complete history of its creation. The chart is probably either a product of lazy thinking or, more likely, not meant to be taken seriously. Regardless of the intent, it is an excellent example of the dangers of data.

The original chart

The chart in the tweet above plots two series of data over time (time series):

  • Bitcoin price in red, and
  • the stock price of Tesla in green.

The vertical axis is measured in USD. The green Tesla line is the actual stock price of Tesla from around March 2019 until 5 February 2020. The Bitcoin line is the Bitcoin price from July 2017 to January 2019 (the 4th, I think). Below I have recreated the chart.

Figure 1: The chart, recreated

Note: the horizontal axis in the original chart was based on Tesla dates; in this chart I have used the Bitcoin dates (the reason being that I’m still getting the hang of R, and after growing up with Stata I find it infuriating to work with time series in it).

Looking at figure 1, you get an acute feeling that the chart creator is onto something. The blue line, showing the recent Tesla stock price madness, rises at about the same rate as Bitcoin did. That is: the Tesla data and Bitcoin data, as they are presented in this graph, are correlated; when Bitcoin goes up, Tesla goes up.

That is no coincidence: the author of this graph has managed to break every single rule of honest statistics, all at once [1]. In fact, it is such a deceptive chart that it is hard to know where to start pointing out where it goes wrong. I’ll give it a shot.

How did the chart creator get there?

The first thing the author of this chart did was to download data. The Tesla data he downloaded looks, on its own, like this:

Figure 2: Tesla share price

Tesla share price for the entire period in the original graph. I apologise for the switching of colours, but I am still figuring out the annoying aesthetics in ggplot2.

Other than being absolutely bonkers as a valuation of a tiny carmaker with completely unknown future potential, this chart is okay. It simply shows the madness of markets, but it is completely honest.

The chart creator then downloaded the Bitcoin data, which when honestly presented is another testament to the irrationality of crowds and markets:

Figure 3: Bitcoin price

I refuse to call bitcoin prices an exchange rate. As of today it does not function as a currency, but rather as a horse race track for nerds.

We can overlay the two data series from figures 2 and 3 to show how they compare, without any adjustments of the data. Figure 4 below shows that apples-to-apples comparison, which is neither interesting nor insightful.

Figure 4: Bitcoin and Tesla share price, as they are

Bitcoin and Tesla are not very comparable in absolute terms. That is not to say that data of different magnitudes cannot be compared, not at all. But it is still worth plotting the raw series to understand what is being compared.

What the graph creator does next is not at all criminal; it is rather a convenient way to investigate whether the two series of data are correlated. What he does is simple:

  • First, he multiplies the price of Tesla by a factor of (around) 13;
  • then he creates a new vertical axis on the right side, which is a factor 13 multiple of the vertical axis on the left side.

The outcome of that manipulation is the following graph:

Figure 5: Bitcoin and Tesla share price, with two axes

Overlaying the Bitcoin and Tesla prices simply shows two uncorrelated series of data, doing their thing.
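Why is the rescaling step harmless? Multiplying a series by a positive constant changes how it looks on a shared axis, but not how it co-moves with another series. Here is a minimal sketch in Python, using made-up numbers rather than the real prices; the `pearson` helper and the variable names are illustrative assumptions, not taken from the original analysis.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up numbers, for illustration only (not real prices).
bitcoin = [2500.0, 4300.0, 6100.0, 9900.0, 14000.0, 11000.0]
tesla = [310.0, 295.0, 330.0, 350.0, 340.0, 360.0]

FACTOR = 13  # the approximate multiplier mentioned in the text
tesla_scaled = [FACTOR * p for p in tesla]

r_raw = pearson(bitcoin, tesla)
r_scaled = pearson(bitcoin, tesla_scaled)

# The two values agree up to floating-point rounding.
print(round(r_raw, 6), round(r_scaled, 6))
```

Whatever the correlation is before the multiplication, it is the same afterwards. A second axis only changes what the eye sees, which is why this step on its own is not a manipulation.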

And now we are getting into the real manipulation of the data. Pay close attention to the end of the chart above. Notice that in early 2019 a Tesla share cost around $300; shortly after, the stock starts its upward acid trip to ridiculous values. Next, the graph creator moves that bit of the Tesla data back to May 2017, when Bitcoin was flipping out for the first time.

At that point, he could have stopped and said: “Hey, at some point in the past, Bitcoin prices rose at an exponential rate, as Tesla just did in the last 8 months or so.” Had he done that, the chart would have taken the following form:

Figure 6: Tesla stock price moved back 548 trading days, with all the Bitcoin data

I spent quite some time developing my own colour palette to use for these graphs. But then, when I moved the colour syntax inside the aes of ggplot2, it was overwritten. I need to figure this out. Any advice out there?

The chart above is not really good at supporting any hypothesis about the future of Tesla. It says: sure, Bitcoin had a stupid rally, as Tesla is having now, but thereafter Bitcoin did all sorts of nonsense, and it is unlikely that Tesla will follow that pattern.

But he didn’t.

Instead, the chart creator cut the Bitcoin series off some time in January 2019, which gives a really nice correlation, plus a plausible story about the future. A story that says: Tesla is like Bitcoin, it will shoot up and crash shortly after, just like Bitcoin. For dramatic effect, I have reproduced the original chart from the beginning below:

Figure 7: The original chart, recreated, reproduced

Nice but nonsense.
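The shift-and-cut trick described in this section can be sketched in a few lines of Python. The numbers are made up for illustration, the lag of 4 observations stands in for the 548 trading days used on the real data, and the helper names (`pearson`, `shift_back`) are assumptions of this sketch, not the R code behind the figures.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def shift_back(series, lag):
    """Drop the first `lag` observations so the value at t + lag lands at t."""
    return series[lag:]

# Made-up numbers on a shared calendar, for illustration only.
bitcoin = [1.0, 2.0, 4.0, 8.0, 5.0, 4.0, 4.0, 5.0]  # early rally, then noise
tesla = [3.0, 3.0, 3.0, 3.0, 3.0, 6.0, 12.0, 24.0]  # flat, then a late rally

LAG = 4  # hypothetical stand-in for the 548 trading days in the text

# Honest comparison: same dates against same dates.
r_same_dates = pearson(bitcoin, tesla)

# Manipulated comparison: slide Tesla's late rally back onto Bitcoin's early
# one, and cut Bitcoin off where the shifted Tesla data ends.
tesla_shifted = shift_back(tesla, LAG)
bitcoin_cut = bitcoin[:len(tesla_shifted)]
r_shifted = pearson(bitcoin_cut, tesla_shifted)

print(f"same dates: {r_same_dates:.2f}, shifted and cut: {r_shifted:.2f}")
```

On the shared calendar the two toy series are only weakly related. After the shift and the cut, the correlation is essentially perfect, even though nothing about either series has changed.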

What’s wrong with that?

Glad you asked. The answer is: plenty. But the fundamental problem is that if you slice up almost any time series, you are bound to find a short period of exponential growth within it, especially if the data is a stock price, and especially if it is a crazy speculative stock that defies all fundamentals.
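To illustrate the point, here is a minimal sketch that generates a pure random walk (no trend, no signal) and then hunts through it for the stretch that looks most like exponential growth. The seed, the series length, the window of 30 observations and the growth rate of the template are all arbitrary assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(1)  # arbitrary seed, only so the sketch is repeatable

N_STEPS = 2000  # length of the simulated series (assumption)
WINDOW = 30     # length of the stretch we are hunting for (assumption)

# A pure random walk: each step is independent noise.
walk = [100.0]
for _ in range(N_STEPS - 1):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

# The pattern we want to "discover": an exponential curve.
template = [math.exp(0.1 * i) for i in range(WINDOW)]

# Slide the window over the whole walk and keep the best-looking match.
best_r, best_start = -2.0, 0
for start in range(N_STEPS - WINDOW + 1):
    r = pearson(walk[start:start + WINDOW], template)
    if r > best_r:
        best_r, best_start = r, start

print(f"best window starts at {best_start}, correlation {best_r:.3f}")
```

With a couple of thousand candidate windows to choose from, some stretch of pure noise will resemble an exponential rally. Finding one says something about the search, not about the data.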

As an example, in the same Bitcoin data, in 2019, the price suddenly started growing at an exponential rate (see figure 3). The author could equally well have:

  • moved the Tesla data back around 150 days,
  • multiplied it by a factor of 14, and
  • added a secondary axis.

The result of that would have been the following chart:

Figure 8: Another, miraculous, correlation of Tesla and Bitcoin prices

Miraculously, Bitcoin predicts the Tesla stock price twice.

Funnily enough, the relationship with the second Bitcoin rally is at least as strong as with the first one. This goes to show how incredibly easy it is to mine the data for similar patterns when you have time series with certain properties and a sufficient number of observations. Finally, had the chart creator limited the time horizon to Dec-2018 to Feb-2020, he would have ended up with the following chart:

Figure 9: A rally followed by a modest drop and some random nonsense is not as sexy

Another story, same data.

In that case the story would have been: Tesla will come down a bit, but stay strong after this rally. That is not as sexy as a big crash predicted by Bitcoin.
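The same problem can be shown with a toy example: one rally that matches two different stretches of the other series, where each match "predicts" a different future. This is a minimal sketch with made-up numbers; the 0.999 threshold and all names are assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up numbers, for illustration only: a series with two rallies, the
# first followed by a crash, the second by a modest drop.
bitcoin = [1.0, 2.0, 4.0, 8.0, 3.0, 2.0, 2.0,
           2.0, 4.0, 8.0, 16.0, 14.0, 15.0]
tesla_rally = [3.0, 6.0, 12.0, 24.0]

RALLY_LEN = len(tesla_rally)
THRESHOLD = 0.999  # arbitrary cut-off for calling a window a "match"

# For every window that matches the rally, record what the matched series
# did next, as a ratio to the last value inside the window.
implied_next = {}
for start in range(len(bitcoin) - RALLY_LEN):  # leave room for a next value
    window = bitcoin[start:start + RALLY_LEN]
    if pearson(window, tesla_rally) > THRESHOLD:
        implied_next[start] = bitcoin[start + RALLY_LEN] / window[-1]

print(implied_next)
```

Both matches fit the rally perfectly, yet one implies a fall of 62.5% in the next step and the other a fall of 12.5%. Which story gets told depends only on which window the chart maker picks.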


[1] There is a lot of jargon that can be used to describe related issues. The purpose of this post is not to write an academic description of the problem; there are academics out there who do that very well, and much better than I ever could. Interested readers can use DuckDuckGo (or Bing, or Google) to search for jargon such as endogeneity, autocorrelation, unit root and cointegration if they fancy rabbit holes.

[2] In producing this analysis I used RStudio. I am new to that software (I am of the Stata generation), and I am slowly getting the hang of it. I am falling in love with the features, but the syntax is still giving me a hard time. My analysis can be found on my GitHub page: https://github.com/Eikonomics/TeslaBitcoinNonsense.

[3] In this post I give the chart creator a hard time. I do not intend to ascribe any intent to him; the post is written this way to make the point clear.
