In R, the operators “|” and “&” indicate the logical operations OR and AND. For example, to test if x equals 1 and y equals 2 we do the following:
> x = 1; y = 2
> (x == 1) & (y == 2)
[1] TRUE
However, if you are used to programming in C you may be tempted to write
#Gives the same answer as above (in this example...)
> (x == 1) && (y == 2)
[1] TRUE
At this point you could be lulled into a false sense of security and believe that they could be used interchangeably. Big mistake.
Let’s consider another example, this time a vector comparison:
> z = 1:6
> (z > 2) & (z < 5)
[1] FALSE FALSE TRUE TRUE FALSE FALSE
> z[(z>2) & (z<5)]
[1] 3 4
but the double “&&” gives
> (z > 2) && (z < 5)
[1] FALSE
> z[(z > 2) && (z < 5)]
integer(0) #Probably not what you want
It’s all gone a bit pear shaped! In fact it could have been worse:
> (z > 0) && (z < 5)
[1] TRUE
> z[(z > 0) && (z < 5)]
[1] 1 2 3 4 5 6
Now you’ve the wrong answer and something that would be very tricky to spot. This is because “&&” returns a single TRUE, and R recycles that value across the whole of z, so every element is selected.
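The recycling can be seen in isolation with a bare TRUE or FALSE as the index. This is a minimal sketch; the names all_kept, none_kept and some_kept are only for illustration.

```r
z <- 1:6

# A single TRUE is recycled to the length of z, so every element is kept
all_kept <- z[TRUE]
print(all_kept)

# A single FALSE is recycled too, so nothing is kept
none_kept <- z[FALSE]
print(none_kept)

# A logical index of length 6 picks elements position by position
some_kept <- z[(z > 2) & (z < 5)]
print(some_kept)
```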
What’s the difference?
Well, from the R help page:
“The longer form evaluates left to right examining only the first element of each vector”
where the longer form refers to “&&”. So
> (z > 2) && (z < 5)
[1] FALSE
is equivalent to:
> (z[1] > 2) & (z[1] < 5)
[1] FALSE
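The same concept applies to the OR operators, “|” and “||”. Note that the output shown here comes from older versions of R; since R 4.3.0, calling “&&” or “||” with an argument of length greater than one is an error rather than a silent use of the first element.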
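When a single TRUE or FALSE really is wanted from a vector comparison (for an if statement, say), one option is to do the comparison element-wise with “&” and then collapse it explicitly with any() or all(). A short sketch, with in_range, any_in_range and all_in_range as illustrative names:

```r
z <- 1:6

# Element-wise comparison: a logical vector of length 6
in_range <- (z > 2) & (z < 5)

# Collapse to a single value, stating the intent explicitly
any_in_range <- any(in_range)  # is at least one element between 2 and 5?
all_in_range <- all(in_range)  # is every element between 2 and 5?

if (any_in_range) {
  cat("Some elements of z lie strictly between 2 and 5\n")
}
```

This keeps the condition at length one, so it behaves the same way whichever version of R is in use.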
As the commenters point out below, another key difference is that, for the longer form,
“Evaluation proceeds only until the result is determined”
This concept is highlighted in the following example:
> f = function(){cat("My name is f\n");return(TRUE)}
> g = function(){cat("My name is g\n");return(FALSE)}
> f() | g()
My name is f
My name is g
[1] TRUE
> f() || g()
My name is f
[1] TRUE
This has two benefits:
- Evaluation will be faster. In the above example, the function g isn’t evaluated (thanks to Andrew Robson and NotMe)
- Also, you can use the double variety to check a property of a data structure before carrying on with your analysis, e.g.
all(!is.na(x)) && mean(x) > 0
(thanks to Pat Burns for this tip)
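The guard pattern from the last tip could be wrapped up along these lines. This is only a sketch: has_positive_mean is a hypothetical helper name, not something from base R.

```r
# Hypothetical helper: TRUE only if x has no missing values
# and its mean is positive
has_positive_mean <- function(x) {
  # Because of "&&", mean(x) is only evaluated when the NA check passes
  all(!is.na(x)) && mean(x) > 0
}

r1 <- has_positive_mean(c(1, 2, 3))   # no NAs, mean is 2
r2 <- has_positive_mean(c(1, NA, 3))  # NA check fails, mean() is skipped
r3 <- has_positive_mean(c(-1, -2))    # no NAs, but mean is -1.5

print(c(r1, r2, r3))
```

Each call returns a single logical value, so the result is safe to use directly in an if statement.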