The most irritating differences between R and Python

Lately I’ve been switching back and forth between R and Python for data analysis, so I’ve been getting used to keeping track of the many differences in syntax and behavior between the two languages. I don’t think I’d argue that either way is right or wrong most of the time; they’re just different due to their different histories and conventions.

In most cases where the languages differ, a valid R expression will produce an error in Python or vice versa, making it easy to tell that something is wrong. What makes these three differences irritating is that they’re all instances where the syntax looks the same but the meaning is different, making it easy to write code that fails silently. This makes it easy to get an incorrect result with no indication that anything is amiss.

 

1. R’s arrays are 1-indexed, while Python’s arrays are 0-indexed.

In R, element 1 is bubbles:

 

Image

 

In Python, element 1 is turtles:

Image

There’s a fascinating history behind the origin of 0-indexed arrays, by the way.

2. In Python, assignment between two names binds both names to the same object, while in R, assignment between two names creates a new object.

In R, if I assign “b = a” and then change b, it doesn’t affect a because R creates a copy:

Image

 

 

In Python, if I assign “b = a” and then change b, the value of a changes too because both names point to the same object. (I used a numpy array here, but the same is true with other data types.)

Image

 

 

To achieve the R-style behavior in Python, use copy() to create a copy:

Image

If you want to achieve the Python behavior in R, too bad, you’re out of luck. As far as I know there’s no way to make two different names point to the same object (although I can’t say I’ve ever desired that behavior in R anyway).

 

3. The colon operator’s end location is included in the results in R, but not included in the results in Python.

In R, when you subset a data frame using the colon operator, the result includes both the start and end values. In Python, when you slice an array using the colon operator, the result includes the start value but not the end value.

Thus, a[1:3] in R returns 3 elements:

Image

but in Python it only returns 2 elements:

Image

 

This makes it easy to end up with off-by-one errorsĀ if you’re not careful (which are, of course, one of the two hard things in computer science).