For an introduction to the topic see this post.
Calling all hackers interested in gliding/paragliding: let's bring modern data mining into the sport! There are plenty of IGC files residing in various corners of the Internet.
Some time ago I experimented with ~50GB of logs scraped from SoaringSpot (years 2002-2015) to plot maps of good/bad thermal regions. The results of my work are here. I believe, however, that more can be discovered from this data set. Hence I've decided to release the thermal data set I've been using.
The data comes as a plain CSV file. One record represents a fragment of a glider track corresponding to circling inside a thermal: essentially the point of entry and the point of exit, plus some extra computed values. Description of the columns:
And the computed values, i.e. values implied by the above:
The data about Sun angle/radiation power was computed using pysolar. Timestamps are standard UNIX timestamps, in UTC.
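To give an idea of how a record might be processed, here is a minimal sketch that reads one row, converts the UNIX timestamp to a UTC datetime, and derives the time spent in the thermal and the average climb rate. The column names (`entry_ts`, `entry_alt`, `exit_ts`, `exit_alt`), the sample row and the assumption that altitudes are in meters are all made up for illustration; check them against the real header of the CSV before using this.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical header and made-up sample row, for illustration only.
# The real file's column names and units may differ.
SAMPLE = """entry_ts,entry_alt,exit_ts,exit_alt
1120305600,850,1120305900,1450
"""


def summarize(row):
    """Derive a few per-thermal values from one CSV record."""
    entry_ts = int(row["entry_ts"])
    exit_ts = int(row["exit_ts"])
    duration = exit_ts - entry_ts  # seconds spent circling
    gain = float(row["exit_alt"]) - float(row["entry_alt"])  # assumed meters
    return {
        # Timestamps are UNIX timestamps in UTC, so attach the UTC zone explicitly.
        "entry_utc": datetime.fromtimestamp(entry_ts, tz=timezone.utc),
        "duration_s": duration,
        "gain_m": gain,
        # Guard against zero-length fragments.
        "climb_ms": gain / duration if duration > 0 else None,
    }


rows = [summarize(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for r in rows:
    print(r["entry_utc"].isoformat(), r["duration_s"], r["climb_ms"])
```

For the real file, replace `io.StringIO(SAMPLE)` with an opened file handle. Passing `tz=timezone.utc` matters here: without it, `datetime.fromtimestamp` would interpret the value in the local time zone of the machine running the script.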
The vast majority of the thermals are in Europe:
Per-continent density maps:
In total, the data set is 235MB compressed (746MB uncompressed) and contains almost 5,000,000 thermals, extracted from 50GB of IGC files. BitTorrent download: