Chapter 17 Conditional Statements

A programmer is going to the grocery store and his wife tells him, “Buy a gallon of milk, and if there are eggs, buy a dozen.” So the programmer goes, buys everything, and drives back to his house. Upon arrival, his wife angrily asks him, “Why did you get 13 gallons of milk?” The programmer says, “There were eggs!”

17.1 Simple Conditions

Going hand in hand with for loops are what we call conditional statements, or statements that evaluate to TRUE or FALSE. These are really important, especially inside of for loops (and nested for loops too!), because what they allow us to do is operate only on certain observations of data that meet our criteria.

You can think of conditionals like flow charts, where the statement is the node (splitting point), and the operations and commands are what R does down the corresponding path. They’re essentially a bunch of statements that say “If my condition is true, then do this set of operations.”

We’ve been using a few conditional statements all along; we just didn’t realize it. For example, after each iteration of our for loops, R checks the boundaries. If the next index is still within those boundaries, then R goes through the loop again. Otherwise, it exits the loop and goes to the next line of code.

Conditional statements are written a lot like for loops. One way to use a conditional statement is with the ifelse() function. The three arguments you’ll need (in order) are the condition you’d like to check, what to return if the condition is met, and what to return if the condition isn’t met. As a quick vectorized example, let’s say we have a vector x that has the numbers 1 through 10. We want to change all the numbers that are greater than (but not equal to) 6 to just be 6, otherwise we want to keep the value that’s in x. ifelse() is great at this! Take a look.
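Here is a minimal sketch of that ifelse() call, assuming the vector is built with 1:10. The name capped is just a placeholder for this illustration.

```r
# Build the vector of numbers 1 through 10
x <- 1:10

# Condition first, then the value to use when it is TRUE,
# then the value to use when it is FALSE
capped <- ifelse(x > 6, 6, x)
capped
```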

##  [1] 1 2 3 4 5 6 6 6 6 6

Another way to insert a conditional statement is to use if() and put your condition(s) inside the parentheses. Then, to specify what to do when that condition is met, use a set of curly braces ({}) and put the code you’d like to run inside of those braces.
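As a small sketch of that layout, assume a made-up variable called temperature. Both the variable and the cutoff of 30 are invented for this example.

```r
temperature <- 35   # hypothetical value for illustration
is_hot <- FALSE

# The condition goes inside the parentheses,
# and the code to run goes inside the curly braces
if(temperature > 30) {
  is_hot <- TRUE
  print("It's hot outside")
}
```

If temperature were 30 or lower, R would skip everything inside the braces and is_hot would stay FALSE.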

Let’s go back to the example from the last chapter with that multiplication table.

     1   2   3   4   5   6   7   8   9  10  11  12
 1   1   2   3   4   5   6   7   8   9  10  11  12
 2   2   4   6   8  10  12  14  16  18  20  22  24
 3   3   6   9  12  15  18  21  24  27  30  33  36
 4   4   8  12  16  20  24  28  32  36  40  44  48
 5   5  10  15  20  25  30  35  40  45  50  55  60
 6   6  12  18  24  30  36  42  48  54  60  66  72
 7   7  14  21  28  35  42  49  56  63  70  77  84
 8   8  16  24  32  40  48  56  64  72  80  88  96
 9   9  18  27  36  45  54  63  72  81  90  99 108
10  10  20  30  40  50  60  70  80  90 100 110 120
11  11  22  33  44  55  66  77  88  99 110 121 132
12  12  24  36  48  60  72  84  96 108 120 132 144

This time, however, let’s get more creative with our operation. Let’s replace each squared value on the diagonal of the table with its square root. That is, if i and j are the same number, put i in that cell instead of i squared. We’ll have to use a nested loop again, but now we can check the condition and act accordingly.
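One way this could look is sketched below. It assumes the table is stored in a 12 by 12 matrix, and the name mult_table is made up for this example.

```r
# Empty 12 x 12 matrix to hold the table (hypothetical name)
mult_table <- matrix(NA, nrow = 12, ncol = 12)

for(i in 1:12) {
  for(j in 1:12) {
    # Fill in the usual product first
    mult_table[i, j] <- i * j
    # On the diagonal, overwrite the square with its square root
    if(i == j) {
      mult_table[i, j] <- i
    }
  }
}
mult_table
```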

     1   2   3   4   5   6   7   8   9  10  11  12
 1   1   2   3   4   5   6   7   8   9  10  11  12
 2   2   2   6   8  10  12  14  16  18  20  22  24
 3   3   6   3  12  15  18  21  24  27  30  33  36
 4   4   8  12   4  20  24  28  32  36  40  44  48
 5   5  10  15  20   5  30  35  40  45  50  55  60
 6   6  12  18  24  30   6  42  48  54  60  66  72
 7   7  14  21  28  35  42   7  56  63  70  77  84
 8   8  16  24  32  40  48  56   8  72  80  88  96
 9   9  18  27  36  45  54  63  72   9  90  99 108
10  10  20  30  40  50  60  70  80  90  10 110 120
11  11  22  33  44  55  66  77  88  99 110  11 132
12  12  24  36  48  60  72  84  96 108 120 132  12

Cool, it worked flawlessly! Also, make sure you see how the code is styled. The closing braces all line up vertically with what they open, and after each opening brace, we went to the next line and indented one tab. This makes it easy to see what happens where in the code, and allows us to easily edit it as we need to.

Conditional statements are sometimes a little tricky though. For example, let’s say you want to check if the \(i\)th element of a vector called x is NA. You may be tempted to write something like if(x[i] == NA), but this is WRONG! Comparing anything to NA with == gives back NA rather than TRUE or FALSE, so if() has nothing to act on and throws an error. Instead, you want to use the is.na() function. The line should read if(is.na(x[i])).
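To see the difference, here is a short sketch that counts missing values. The vector x and the counter n_missing are invented for this illustration.

```r
x <- c(4, NA, 7)   # hypothetical vector with one missing value

x[2] == NA         # evaluates to NA, not TRUE
is.na(x[2])        # evaluates to TRUE

# Count the missing values one element at a time
n_missing <- 0
for(i in 1:length(x)) {
  if(is.na(x[i])) {
    n_missing <- n_missing + 1
  }
}
n_missing
```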

17.2 Compound Conditions

A compound condition just combines conditions with & (“and”) or | (“or”). You can make conditions as compounded as you want, but be careful that they don’t get too complex. If you’re trying to home in on one particular point in your data, it may be a better idea to just manipulate that one point.

It’s also a good idea to use parentheses extensively when writing your conditions so that there’s no confusion as to which conditions go together. As an example, if you want to execute code when condition a is met and either condition b or condition c is also met, you should write this condition as if(a & (b | c)), where a, b, and c are the conditions that you want to be met.
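A quick sketch shows why the parentheses matter. The names cond_a, cond_b, and cond_c are placeholders for the conditions a, b, and c above, with values picked so the two versions disagree.

```r
cond_a <- FALSE   # hypothetical results of three conditions
cond_b <- FALSE
cond_c <- TRUE

# With parentheses: a must hold, plus at least one of b or c
grouped <- cond_a & (cond_b | cond_c)

# Without parentheses, R evaluates & before |,
# so this is read as (cond_a & cond_b) | cond_c
ungrouped <- cond_a & cond_b | cond_c

grouped     # FALSE
ungrouped   # TRUE
```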

Compounding conditions can help you cover a lot of cases in your data, but the expressions can get confusing quickly. Since conditional statements can be nested, it may be better to check the broader condition first and then nest another conditional statement inside of it.
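As a rough sketch of that nesting idea, assume two made-up variables, attended and score. The broad condition is checked first, and the narrower ones only run inside it.

```r
attended <- TRUE   # hypothetical flag
score <- 87        # hypothetical value
label <- "no label"

if(attended) {
  # Only reached when the broad condition holds
  if(score >= 80) {
    label <- "attended, high score"
  }
  if(score < 80) {
    label <- "attended, lower score"
  }
}
label
```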

One very important thing about conditional statements (and really all code in general) is that they’re evaluated and executed in the exact order in which you specify them. That’s why it’s important to import a dataset before trying to do any kind of manipulation on it, or why you should have a variable declared before you try to use it. Conditional statements are no different. If we write the following pseudocode (code syntax but regular words),

R will start by evaluating condition A, and then move to condition B. If we flipped them, R would evaluate condition B first. Be careful when you specify the conditions! If what you want to do where it says Do something different here depends on something you did where condition A was met, you may not get the result you wanted at the end.
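A runnable sketch of the ordering issue, assuming two made-up conditions that stand in for A and B. The same starting value gives different results depending on which check comes first.

```r
# Condition A first, then condition B
x <- 4
if(x < 5) {        # stands in for condition A
  x <- x + 10
}
if(x > 10) {       # stands in for condition B; sees the updated x
  x <- x * 2
}
x                  # 28

# Same checks, flipped: condition B first, then condition A
y <- 4
if(y > 10) {       # condition B is not met yet
  y <- y * 2
}
if(y < 5) {        # condition A
  y <- y + 10
}
y                  # 14
```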

In this example, it’s also entirely possible we could meet condition A and condition B, but really we’d like to handle those conditions differently. That’s where else if comes in handy.

17.3 else and else if

When we write these conditional statements, we may have more than two possible scenarios that we want to address. This is where else if makes itself useful. This basically says “If the condition we just checked isn’t met, but this new condition is met, do this thing I’m going to tell you to do.” else alone says “If none of the other conditions are met, do this last thing.” We can deploy them (in pseudocode) as follows:
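Here is a runnable sketch of that structure, assuming a made-up variable called temperature and invented cutoffs. R checks the conditions from top to bottom and runs only the first branch that matches.

```r
temperature <- 72   # hypothetical value for illustration

if(temperature > 85) {
  label <- "hot"
} else if(temperature > 60) {
  # Reached only when the first condition was not met
  label <- "mild"
} else {
  # Reached only when none of the conditions above were met
  label <- "cold"
}
label
```

Note that each else sits on the same line as the closing brace before it. At the top level of a script, R needs it there to know the statement isn’t finished.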

17.4 Extra Things for Conditionals

A few quick things to remember:

  • ! is called the negation operator. It’s the R equivalent of the word “not.” That’s why != is “does not equal.” However, it works with functions like is.na() as well. So if you wanted to do some calculations only if the \(i\)th element of x is not NA, you can count on if(!is.na(x[i])) to do the job.

  • %in% is a shortcut way to check if a value is in another vector. If we have a variable called temp, and we want to know if temp is in a vector v1, writing if(temp %in% v1) will do the trick. You can negate this by using ! and enclosing your conditions