Freeing my Fitbit data

I love my Fitbit, but I want to play with the data myself, not just look at the graphs the fine folks at Fitbit have decided I should see. Fortunately, there’s a great step-by-step guide with a script to download Fitbit data into a Google Drive spreadsheet, which handles interfacing with the Fitbit API. 

I downloaded my data beginning with April 2012 when I started using Fitbit. There were 18 days without any steps recorded (days when I forgot to wear the Fitbit or it ran out of batteries, plus a few days when it broke and I had to have it replaced), which I eliminated from the analysis. Other than those few days, I have a complete record of my activity for the past 21 months, which is pretty cool. 

I used Python to plot the daily step counts along with a time-smoothed rolling average to make it easier to see long-term trends. 


The most obvious features to me are the high-step-count spikes during the summer, when I’m most likely to go adventuring outdoors on the weekends. To get a better sense of the distribution of my steps, I plotted a histogram:


I average about 10,000 – 11,000 steps per day, but there’s a lot of variation. I usually end up recording at least 12,000 steps on days when I bike to work, because I find that each pedal rotation registers as a step. I suspect that the tail on the left side of the histogram (<5000 step days) are mostly days when I was sick or otherwise feeling down, because I generally find it downright difficult to walk fewer than 5000 steps in a day unless I have a cold or similar. 

Finally, I broke the histogram down by days of the week. It got a little messy to look at, so I used gaussian kernel smoothing to turn the histogram into a density plot that’s easier on the eyes: 


Most of the days are similar to each other, but Saturday clearly has the most variation, and the most days with very high step counts. In fact, I walk 15,000+ steps 17% of the time on Saturdays, but less than 10% of the time on other days of the week. And on that note, the dog and I are going for a walk. 

Here’s the code I used to generate the graphs: 

[gist /]

Dental year-in-review

Earlier this year in May, I started tracking my dental hygiene habits by giving myself a sticker on a calendar every time I flossed my teeth. This started because I really hate flossing — there’s nothing inherently fun or rewarding about it, so I wondered if there was anything I could do to incentivize it or make it more fun.

I just kept going with it, and now I have eight months of flossing history to look at. I entered the data into Excel (giving myself a ‘1’ on days where I had a sticker on the calendar and a ‘0’ otherwise), exported it as a CSV, and used python to graph the results broken down by days of the week.

I used python’s matplotlib to make the graph, then I also used RPy to make the graph using R since I’ve never really directly compared graphs made using the two methods. Here’s the matplotlib plot:


and here’s the RPy plot:


I prefer the bars of the RPy plot because they line up nicely without any effort, but I do enjoy how matplotlib turned the y-axis labels horizontal for me. Of course for any more complex plotting I think the advantages of RPy would become much more apparent.

But more importantly: what can I learn about my oral health from this analysis? My flossing habits during the week are quite good, with >80% success rate regardless of the day, but on the weekends I tend to sleep in, fail to follow a standardized morning routine, and consequently end up promising myself I’ll floss at night (hint: it never happens). Don’t tell my dentist.

Here’s the code I used to generate these images: