I was scrolling through my Twitter feed the other day and stumbled upon a tweet by a fellow Icelandic economist, Viðar Ingason:
Mjög gaman að sjá hvað öllum langar til að spá fyrir um toppinn á Tesla. Allt í lagi með það. En svona graf segir einmitt ekki neitt. Þetta kallast spurious correlation og ekkert hægt að lesa út úr þessu. Annars finnst mér fylgnin á milli Avocado og Bitcoin mun skemmtilegri. pic.twitter.com/pOkaOOJFD9
— Vidar Ingason (@vidaringa) February 4, 2020
For those of you who read Icelandic, I do not need to explain Vidar’s comment on the chart (which is not his creation). However, for those who do not read Icelandic — the 99.994% of the human population — here is a rough translation:
„It is interesting to see how everybody wants to predict the Tesla share price peak. Which is fine and dandy. But this kind of chart provides no information about anything. This is called spurious correlation and it provides no insights at all. I prefer the correlation between avocados and Bitcoin.“
Vidar is 100% correct. This chart is misleading, to say the least. I am not saying anyone intended to mislead anyone with this chart; I do not know the complete history of its creation. The chart is probably either a product of lazy thinking or, more likely, not meant to be taken seriously. Regardless of the intent, it is an excellent example of the dangers of data.
The original chart
The chart in the tweet above plots two series of data over time (time series):
- Bitcoin price in red, and
- the stock price of Tesla in green.
The vertical axis is measured in USD. The green Tesla line is the actual stock price of Tesla from around March 2019 until 5 February 2020. The Bitcoin line shows the Bitcoin price from July 2017 to January 2019 (the 4th, I think). Below I have recreated the chart.
Figure 1: The chart, recreated
Looking at figure 1, you get an acute feeling that the chart creator is onto something. The Tesla line, showing the recent Tesla stock price madness, is rising at about the same rate as Bitcoin did. That is: the Tesla data and the Bitcoin data, as presented in this graph, are correlated; when Bitcoin goes up, Tesla goes up.
That is no coincidence: the author of this graph has managed to break every single rule of honest statistics, all at once. In fact, it is such a deceiving chart that it is hard to know where to start pointing out what goes wrong. I’ll give it a shot.
How did the chart creator get there?
The first thing the author of this chart did was to download data. The Tesla data he downloaded looks, on its own, like this:
Figure 2: Tesla share price
Other than being absolutely bonkers, in terms of the valuation of a tiny carmaker with completely unknown future potential, this chart is okay. It simply shows the madness of markets, but it is completely honest.
The chart creator then downloaded the Bitcoin data, which when honestly presented is another testament to the irrationality of crowds and markets:
Figure 3: Bitcoin price
We can overlay the two data series from figures 2 and 3 to show how they compare, without any adjustment of the data. Figure 4 below shows that apples-to-apples comparison, which is neither interesting nor insightful.
Figure 4: Bitcoin and Tesla share price, as they are
What the graph creator does next is not at all criminal; it is rather a convenient way to investigate whether the two series of data are correlated. What he does is simple:
- First, he multiplies the price of Tesla by a factor of (around) 13;
- then he creates a new vertical axis on the right side, whose values are the left axis divided by that same factor of 13, so Tesla can still be read at its actual dollar price.
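The rescaling step can be sketched in a few lines of Python. This is only an illustration, not the R code behind the figures: the prices are made-up round numbers, and `FACTOR`, `to_left_axis` and `to_right_axis` are names invented for this sketch. It also shows why the step is harmless on its own: multiplying a series by a constant moves it on the chart but does not change how strongly it correlates with anything.

```python
from math import sqrt

FACTOR = 13  # assumed scale factor ("around 13" in the text)


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def to_left_axis(tesla_price, factor=FACTOR):
    """Tesla price expressed in units of the left (Bitcoin) axis."""
    return tesla_price * factor


def to_right_axis(left_axis_value, factor=FACTOR):
    """Right-axis label: undo the scaling so Tesla reads in its own USD."""
    return left_axis_value / factor


# Made-up illustrative prices, NOT real market data.
tesla = [300.0, 320.0, 350.0, 420.0, 560.0, 780.0]
bitcoin = [4000.0, 4300.0, 4500.0, 5600.0, 7100.0, 10400.0]

tesla_scaled = [to_left_axis(p) for p in tesla]

corr_raw = pearson(tesla, bitcoin)
corr_scaled = pearson(tesla_scaled, bitcoin)
print(f"correlation before scaling: {corr_raw:.4f}")
print(f"correlation after scaling:  {corr_scaled:.4f}")
```

The two printed correlations are the same up to floating-point rounding, which is the point: the second axis is a presentation choice, not yet a manipulation.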
The outcome of that manipulation is the following graph:
Figure 5: Bitcoin and Tesla share price, with two axes
And now we are getting into the real manipulation of the data. Pay close attention to the end of the chart above. Notice that in early 2019 a Tesla share cost around $300; shortly after, the stock starts its upward acid trip to ridiculous values. Next, the graph creator moves that bit of the Tesla data back to May 2017, when Bitcoin was flipping out for the first time.
At that point, he could have stopped and said: „hey, at some point in the past, Bitcoin prices rose at an exponential rate, as Tesla just did in the last 8 months, or so“. Had he done that, the chart would have taken the following form:
Figure 6: Tesla stock price moved back 548 trading days, with all the Bitcoin data
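Moving a series back in time is nothing more than re-indexing it. Here is a rough Python sketch, assuming both series are plain lists on one shared trading-day calendar (a simplification, since Bitcoin trades every day and Tesla does not). The helper names `shift_back` and `align` and the toy numbers are made up for illustration; this is not the R code used for the figures.

```python
def shift_back(series, start_index, lag):
    """Pretend a series that starts on day `start_index` started `lag` days earlier.

    Returns a dict mapping each new (shifted) day index to its value.
    """
    if lag > start_index:
        raise ValueError("cannot shift before the start of the calendar")
    return {start_index - lag + i: value for i, value in enumerate(series)}


def align(shifted, reference):
    """Pair each shifted value with the reference value on the same day index."""
    return [(reference[day], value)
            for day, value in sorted(shifted.items())
            if 0 <= day < len(reference)]


# Toy data: a 10-day reference series and a 3-day series that really starts on day 7.
reference = [float(day) for day in range(10)]
late_series = [70.0, 80.0, 90.0]

shifted = shift_back(late_series, start_index=7, lag=5)
pairs = align(shifted, reference)
print(pairs)
```

With `lag=548` and the real price lists, the same re-indexing would place the recent Tesla rally on top of the 2017 Bitcoin rally. Nothing about the data changes except the dates it is drawn against.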
The chart above is not really good at supporting any hypothesis about the future of Tesla. It says: sure, Bitcoin had a stupid rally, as Tesla is having now, but thereafter Bitcoin did all sorts of nonsense, and it is unlikely that Tesla will follow that pattern.
But he didn’t.
Instead, the chart creator cut the Bitcoin series off some time in January 2019, which gives a really nice correlation, plus a plausible story about the future. A story that says: Tesla is like Bitcoin, it will shoot up and shortly after crash, just like Bitcoin did. For dramatic effect, I have reproduced the original chart from the beginning here below:
Figure 7: The original chart, recreated, reproduced
What’s wrong with that?
Good that you asked. The answer is: tons. But the fundamental problem is that if you cut up almost any time series, you are bound to find a short period of exponential growth within it. Especially if the data is a stock price, and especially if it is a crazy speculative stock that defies all fundamentals.
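To make that claim concrete, here is a small Python sketch. It simulates a price with no trend built in (a geometric random walk, my assumption for what a "crazy speculative" price might look like) and then scans every 60-day window for the one that looks most like exponential growth, measured by how well a straight line fits the log price. The function names, the 3% daily volatility and the window length are all arbitrary choices for illustration.

```python
import math
import random


def log_linear_fit(prices):
    """OLS fit of log(price) on time; returns (slope, r_squared)."""
    n = len(prices)
    ts = list(range(n))
    ys = [math.log(p) for p in prices]
    mt, my = sum(ts) / n, sum(ys) / n
    sxx = sum((t - mt) ** 2 for t in ts)
    sxy = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared


def best_growth_window(prices, window):
    """Scan every window; return the rising one that looks most exponential."""
    best = None
    for start in range(len(prices) - window + 1):
        slope, r2 = log_linear_fit(prices[start:start + window])
        if slope > 0 and (best is None or r2 > best[2]):
            best = (start, slope, r2)
    return best  # (start index, daily log-growth, R^2), or None if nothing rises


def random_walk(n, daily_vol=0.03, seed=1):
    """Geometric random walk with zero drift: no trend is built in."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * math.exp(rng.gauss(0.0, daily_vol)))
    return prices


walk = random_walk(1000)
result = best_growth_window(walk, window=60)
if result is not None:
    start, slope, r2 = result
    print(f"best 60-day window starts at day {start}: "
          f"{100 * slope:.2f}% per day, R^2 = {r2:.2f}")
```

Try a few different seeds. The search picks the best of roughly 940 windows, so it tends to come back with something that looks like a rally even though none was built into the data.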
As an example, in the same Bitcoin data, in 2019, the price suddenly started growing at an exponential rate (see figure 3). The author could have equally:
- moved the Tesla data back around 150 days,
- multiplied it by a factor of 14, and
- added a secondary axis.
The result of that would have been the following chart:
Figure 8: Another, miraculous, correlation of Tesla and Bitcoin prices
Funnily enough, the relationship with the second Bitcoin rally is at least as strong as with the first one. Which goes to show how incredibly easy it is, when you have time series with certain properties and a sufficient number of observations, to mine the data for similar patterns. Finally, had the chart creator limited the time horizon to Dec-2018 to Feb-2020, he would have ended up with the following chart:
Figure 9: A rally followed by a modest drop and some random nonsense is not as sexy
In which case the story would have been: Tesla will come down a bit, but stay strong after this rally. Which is not as sexy as a big crash, predicted by Bitcoin.
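The pattern mining described above (slide one series along another until the lines agree) can be automated, which shows how little skill it takes. The Python sketch below is my own illustration, not the procedure the chart creator used. It takes a short "recent rally" series and a long history, both simulated as independent random walks, and searches every offset for the window with the highest correlation. The function names and the simulated data are assumptions.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_matching_window(short, long):
    """Slide `short` along `long`; return (offset, correlation) of the best fit."""
    best_offset, best_corr = None, -2.0
    for offset in range(len(long) - len(short) + 1):
        corr = pearson(short, long[offset:offset + len(short)])
        if corr > best_corr:
            best_offset, best_corr = offset, corr
    return best_offset, best_corr


def random_walk(n, seed):
    """Additive random walk; different seeds give independent paths."""
    rng = random.Random(seed)
    level, path = 100.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


recent_rally = random_walk(120, seed=1)   # stand-in for the recent "Tesla" data
long_history = random_walk(1500, seed=2)  # stand-in for the "Bitcoin" history

offset, corr = best_matching_window(recent_rally, long_history)
print(f"best match at offset {offset}, correlation {corr:.2f}")
```

The two simulated series have nothing to do with each other by construction, so whatever correlation the search reports comes purely from having been allowed to choose the window.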
 There is a lot of jargon that can be used to describe related issues. The purpose of this post is not to write an academic description of the problem; there are academics out there who do that very well, and much better than I ever could. Interested readers can use DuckDuckGo (or Bing, or Google) to search for jargon such as endogeneity, autocorrelation, unit root and cointegration, if they are interested in rabbit holes.
 In producing this analysis I used RStudio. I am new to that software (I am of the Stata generation), and I am slowly getting the hang of it. I am falling in love with the features, but the syntax is still giving me a hard time. My analysis can be found on my GitHub page: https://github.com/Eikonomics/TeslaBitcoinNonsense.
 In this post I give the chart creator a hard time. I do not mean to assign him any intent; the post is written this way to make the point clear.