Explore the Strava API and find out with Python
Over the past few years, Strava has become my go-to app to track my cycling and running activities. One of the many cool features of the app is that your friends can give you kudos for your activities. And sometimes, these kudos are just the morale boost you need to keep going.
In 2022, as I was preparing for a climb in the Alps and using Strava more regularly than before, I noticed that some people were more generous than others with their kudos. But I had no precise idea of who they were, and this got me thinking: is there a way to take a deeper look at my stats than what the app offers? Well, there is.
In this article, I use a very common “Extract, Transform, Load” approach to structure the project. I show how I got my activity data from Strava (Extract), computed new insights (Transform) and built the “Kudos Graph” and other visualizations to know who to thank for the support (Load).
Over time, I have found this approach to be a great way to organize similar data projects. One last note before jumping in: everything shown here is reproducible and the code is available on GitHub (link at the end of the article), so you can build your own.
Let’s get coding!
The first thing we need to do is authenticate, that is, get an access token from Strava. This is done with a POST request to the endpoint https://www.strava.com/oauth/token, containing the credentials gathered beforehand (client id, client secret, refresh token and authorization code).
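As a minimal sketch of that request, assuming the refresh-token flow (grant_type set to refresh_token) and using only Python's standard library, the access token could be fetched as follows. The function and parameter names are illustrative, and the opener argument is there only so the network call can be swapped out when testing.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.strava.com/oauth/token"


def get_access_token(client_id, client_secret, refresh_token,
                     opener=urllib.request.urlopen):
    """Exchange a refresh token for a short-lived access token.

    Assumption: the refresh-token grant is used, so the one-time
    authorization code is not needed here.
    """
    payload = {
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }
    # Form-encode the payload and send it as the body of a POST request.
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(TOKEN_URL, data=body, method="POST")
    with opener(request, timeout=30) as response:
        token_data = json.load(response)
    return token_data["access_token"]
```

Calling get_access_token with your own client id, client secret and refresh token returns the string to send as a Bearer token in the requests that follow.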
In this section, we create two functions to:
- Get the list of all the activities of the profile.
Using the access token we previously got and specifying two dates defining the scope of activities we are interested in, we get the list of all activities between these two dates and their main characteristics.
- Get the list of kudoers of a specific activity.
Unfortunately the list of the activities’ kudoers is not contained in the result of the previous request. We need to build the get_kudos function which returns the list of kudoers for a single activity, identified by its activity_id.
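The two functions could be sketched as below, using the Strava v3 endpoints /athlete/activities and /activities/{id}/kudos. This is one possible shape, not necessarily the original code: the helper fetch_json, the fetch parameter and the per_page value of 100 are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

API_BASE = "https://www.strava.com/api/v3"


def fetch_json(url, access_token):
    """GET a URL with the Bearer token and decode the JSON response."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def get_activities(access_token, start, end, fetch=fetch_json, per_page=100):
    """Return all activities between two datetimes, page by page."""
    activities, page = [], 1
    while True:
        query = urllib.parse.urlencode({
            "after": int(start.timestamp()),   # epoch seconds
            "before": int(end.timestamp()),
            "per_page": per_page,              # assumed page size
            "page": page,
        })
        batch = fetch(f"{API_BASE}/athlete/activities?{query}", access_token)
        if not batch:                          # an empty page ends the loop
            return activities
        activities.extend(batch)
        page += 1


def get_kudos(access_token, activity_id, fetch=fetch_json, per_page=100):
    """Return the kudoers of one activity (first page only in this sketch)."""
    query = urllib.parse.urlencode({"per_page": per_page})
    return fetch(f"{API_BASE}/activities/{activity_id}/kudos?{query}",
                 access_token)
```

Note that every page of activities costs one API call, including the final empty page that stops the loop.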
Now that we have the data we wanted, the idea is to keep only what we need and put it in a pandas DataFrame.
The transform function extracts the following data from the list of activities:
- The activity id which is used as a unique identifier for an activity.
- The number of kudos of each activity.
- The list of all kudoers for an activity by leveraging the get_kudos() function in a loop.
- The distance of each activity.
- The time each activity took.
- The type of activity.
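The list above could translate into something like the following sketch. It builds plain row dictionaries, which can then be passed to pandas.DataFrame(rows). The column names and the unit conversions are assumptions. The field names id, kudos_count, distance, moving_time and type follow the Strava activity payload, and get_kudoers stands for any callable that returns the kudoers of an activity id.

```python
def transform(activities, get_kudoers):
    """Keep only the fields needed for the analysis, one row per activity.

    `get_kudoers` is a callable taking an activity id and returning a list
    of athlete dicts (one API call per activity in practice).
    """
    rows = []
    for activity in activities:
        kudoers = get_kudoers(activity["id"])
        rows.append({
            "activity_id": activity["id"],
            "kudos_count": activity.get("kudos_count", 0),
            # Join first and last name into a single display name.
            "kudoers": [
                f"{k.get('firstname', '')} {k.get('lastname', '')}".strip()
                for k in kudoers
            ],
            "distance_km": activity.get("distance", 0.0) / 1000,    # metres -> km
            "moving_time_min": activity.get("moving_time", 0) / 60,  # s -> min
            "type": activity.get("type"),
        })
    return rows
```

Keeping the kudoers as a list per row makes it easy to count supporters later without calling the API again.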
⚠️ There is a limitation in the usage of the Strava API. We are limited to 100 calls every 15 minutes and 1000 calls per day.
In this project, we are calling the API once to get the list of activities, and then once per activity to get the list of kudoers in each one.
This means that if you have 100 activities or more in the considered window, the code as it is will not work (one call for the list plus one call per activity already exceeds 100 calls), and you will need to modify it slightly to comply with the API usage limits.
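One possible modification, sketched here under the assumption that a simple client-side throttle is acceptable, is a sliding-window limiter that pauses before a call would exceed 100 calls in 15 minutes. The class name and its parameters are hypothetical, and the daily limit of 1000 calls is not handled.

```python
import time
from collections import deque


class RateLimiter:
    """Block until another call fits in the window (default 100 per 900 s)."""

    def __init__(self, max_calls=100, period=900.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep
        self._calls = deque()    # timestamps of recent calls, oldest first

    def wait(self):
        """Call this right before each API request."""
        while True:
            now = self._clock()
            # Forget calls that have left the sliding window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self._sleep(self._period - (now - self._calls[0]))
```

Creating one limiter and calling limiter.wait() before every request, including the one that lists the activities, keeps the script within the 15-minute limit at the cost of a longer run time.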
The only thing left to do is to capitalize on the functions we just built and start plotting some interesting things!
In my case, I am considering my 2022 activities up to the time of writing, 24/10/2022.
From our data structure, it is very easy to get a few high-level KPIs for the given period:
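As an illustration of how such KPIs might be computed, here is a small sketch working on row dictionaries with the keys kudos_count, distance_km and kudoers. Those key names and the choice of KPIs are assumptions, not the figures from the original analysis.

```python
def summarize(rows):
    """Compute a few headline numbers from the activity rows."""
    n = len(rows)
    total_kudos = sum(row["kudos_count"] for row in rows)
    return {
        "activities": n,
        "total_kudos": total_kudos,
        "avg_kudos_per_activity": total_kudos / n if n else 0.0,
        "total_distance_km": sum(row["distance_km"] for row in rows),
        # A set removes duplicates across activities.
        "unique_kudoers": len({k for row in rows for k in row["kudoers"]}),
    }
```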
Because we got the sport type for each activity in the previous section, we can also easily investigate whether certain types of activities are more prone to receiving kudos than others. Here is the average number of kudos per activity type:
Even if it did not turn out to be the most popular type of activity, running was the sport where I had the most data points, so this is where I tried to dig a bit more. We can try to understand why one activity would get more kudos than another. Let’s look at the possible correlation between the distance of a run and the number of kudos the activity gets.
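The grouping behind that comparison could look like the sketch below, assuming each row is a dictionary with the keys type and kudos_count (names chosen for illustration).

```python
from collections import defaultdict


def average_kudos_by_type(rows):
    """Return {activity type: average kudos}, highest average first."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        totals[row["type"]] += row["kudos_count"]
        counts[row["type"]] += 1
    averages = {kind: totals[kind] / counts[kind] for kind in totals}
    # Dicts keep insertion order, so sorting here fixes the display order.
    return dict(sorted(averages.items(), key=lambda item: item[1], reverse=True))
```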
It turns out that there seems to be a positive correlation, i.e. the longer the run, the higher the number of kudos, as shown in the graph below.
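To put a number on that relationship, one option is the Pearson correlation coefficient between run distance and kudos count. The sketch below is a plain-Python version under the assumption that rows carry the keys type, distance_km and kudos_count, and that runs are labelled "Run". It says nothing about statistical significance.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def run_distance_kudos_correlation(rows):
    """Correlation between distance and kudos, restricted to runs."""
    runs = [row for row in rows if row["type"] == "Run"]
    return pearson([r["distance_km"] for r in runs],
                   [r["kudos_count"] for r in runs])
```

A value close to 1 indicates that longer runs tend to collect more kudos, while a value near 0 indicates no linear relationship.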
Granted, the statistical significance of this result is debatable given the small number of data points we considered. The only certain conclusion here is that I need to run more.
We could go further in the analysis, looking at the influence of other variables, but I’ll leave that for another article.
Finally, we can plot the “Kudos Graph”, which shows who our top supporters are so we can give them a shout-out.
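The ranking behind such a graph can be obtained by counting how often each name appears across all activities. This is a sketch assuming each row has a kudoers list of display names. The plotting itself is left out, and the function name is illustrative.

```python
from collections import Counter


def top_supporters(rows, n=10):
    """Return the n most frequent kudoers as (name, kudos given) pairs."""
    counts = Counter(name for row in rows for name in row["kudoers"])
    return counts.most_common(n)
```

The resulting pairs can be fed to any plotting library, for example as the labels and heights of a bar chart.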
Of course, some people are more addicted to Strava than others and will give kudos as they scroll down their activity feed, while others will only open the app once in a while and give kudos only to the most recent activities they happen to see.
This graph is in no way about judging people for giving kudos or not. It is simply about illustrating new insights you would see nowhere else, not even in the premium version of the app.
No doubt there is way more we can do with all the data we can get from the Strava API. This was simply a first shot at answering an unusual question and a good exercise to get things going.
If you want to analyse your Strava activities and figure out who your top supporters are, the entire code can be found here:
Thanks for reading all the way to the end of the article!
Feel free to leave a message below, or reach out to me through LinkedIn if you have any questions or remarks!
More to come!
Who’s your Number 1 Supporter on Strava? Republished from Source https://towardsdatascience.com/whos-your-number-1-supporter-on-strava-5a888230f361?source=rss—-7f60cf5620c9—4 via https://towardsdatascience.com/feed