Chapter 7 Coding Style

It’s almost time to start applying what you’ve learned, but before we get into writing code, we should take a minute to talk about good coding practices. It’s often said that the best way to break bad habits is to not fall into them in the first place, so we’ll try to build good habits right from the get-go.

The code that you write will be read by R, which ignores extra spaces and indentation and, for all intents and purposes, runs properly as long as the code is syntactically correct. However, you will also be reading through your code, both as you write it and as you debug it. That means it should be as easy for you (or anyone else, for that matter) to read as it is for R. This chapter’s whole purpose is to make that as easy as possible.

7.1 Comments

There’s one important symbol/operator that we left out in chapter 5, and that’s the comment symbol. A comment is just a note for yourself so that you can explain what a block of code does, why you wrote the code a particular way, or really just anything else that you’d like to note at that point in the code. Comments won’t be evaluated by R as commands, so it can even be useful to comment out parts of your code (turn line(s) of code into comments to prevent them from running while saving yourself from retyping them later). To make a comment in R, use #. Everything from the # to the end of that line is ignored by R.
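Here is a minimal sketch of both uses of #: a note to yourself, and a commented-out line. The variable names and values (temp_f, temp_c) are made up for illustration.

```r
# Convert a temperature from Fahrenheit to Celsius (made-up example value)
temp_f <- 98.6
temp_c <- (temp_f - 32) * 5 / 9  # subtract 32 first, then scale by 5/9

# The next line is "commented out": R skips it, but it's easy to bring back
# temp_c <- round(temp_c)

print(temp_c)
```

Notice that a comment can sit on its own line or at the end of a line of code. Either way, R ignores everything after the #.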

Pro tip: You can highlight whole lines of code, then go to the Code menu at the top and select Comment/Uncomment Lines to comment out (or uncomment) sections of code at a time. The shortcut on a Mac is Cmd + Shift + C, and on a PC it’s Ctrl + Shift + C.

Get in the habit of commenting frequently as you code. As we said before, these are ways for you to remember what you did so that when you revisit your code, you remember what your thought process was.

7.2 Spacing

One really good practice is to put a single space before and after any operator you use. While it does lengthen the line of code itself, it makes the code much easier to debug. You may find that you’ve used a = to check a condition when you should have used a ==, or you may see that you only typed one * (multiplication) when you meant to type ** (exponentiation).
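To see the difference, here is the same calculation written twice. The names x, y, and is_fifteen are just placeholders for this sketch.

```r
x <- 3

# Hard to scan: the operators blend into the names and numbers
y<-x*2+x**2

# Easier to scan: one space on each side of every operator
y <- x * 2 + x ** 2

# With spacing, "==" (comparison) is easy to tell apart from "=" (assignment)
is_fifteen <- y == 15
```

Both versions of the y line do exactly the same thing as far as R is concerned. The second is simply easier on your eyes when you’re hunting for a mistake.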

Spacing also refers to spacing lines of code out within your script. Have a line that actually takes up 3 lines? No problem, R can handle that, but it’ll be a pain to read. Just find a good breaking point in the line (usually after a comma) and go to the next line. This lets you see more of your code in an easier-to-read format. You may even wish to put each individual argument of a function on its own line in some cases, and that’s encouraged!

Keeping with our “script-is-an-MS-Word-Doc” analogy from earlier, related lines of code should be grouped together and separated from other groups, just like you’d separate the ideas in an essay into paragraphs. The groups should follow a logical order, and after each group, you should skip a line to signal that the next group is beginning. As we said before, R will ignore empty lines, so there’s no harm in skipping a line to organize your thoughts.
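As a rough sketch of both ideas, the script below breaks a long function call after each comma and uses blank lines to separate three “paragraphs” of code. The data values and the names scores, weights, and final_grade are invented for this example.

```r
# Group 1: set up the data (invented values)
scores <- c(88, 92, 79, 95)
weights <- c(0.2, 0.3, 0.2, 0.3)

# Group 2: compute a summary, one argument per line, breaking after each comma
final_grade <- weighted.mean(
    x = scores,
    w = weights,
    na.rm = TRUE
)

# Group 3: report the result
print(final_grade)
```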

7.3 Indentation

If you were an absolutely perfect code-writer, this part would take care of itself. However, there’s no such thing as a perfect code-writer, so this is worth mentioning. In languages such as Python, indentation. Is. Everything. In R, it’s not as imperative in terms of functionality, but it’s equally imperative in terms of legibility. Once we get to control structures, you’ll be able to see this with much more clarity, but as a rule, any time your code takes up more than one line and you’re working inside of parentheses, braces, or brackets, you should indent your code one tab (four spaces) to the right. Open a new set of parentheses/braces/brackets after you’ve indented once? Indent again! No harm, only help!
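Here is a small sketch of the indent-once-per-level rule. The first half uses nested parentheses. The second half is a preview that uses braces with a for loop and an if statement, which are control structures we haven’t covered yet, so don’t worry about how they work for now. The name rounded_mean and the numbers are made up.

```r
# Indent once inside the outer parentheses, and again inside the inner ones
rounded_mean <- round(
    mean(
        c(2.5, 3.75, 4.125)
    ),
    digits = 1
)

# The same rule applies to braces: each new level moves four spaces right
for (i in 1:3) {
    if (i == 2) {
        print("halfway")
    }
}
```

Each closing parenthesis or brace lines up with the start of the line that opened it, so you can see at a glance where every level begins and ends.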

7.4 Consistency

In an essay, you wouldn’t switch fonts, colors, or page layouts in the middle, would you? You shouldn’t change much in your code scripts either. Your code should be consistent in as many ways as you can find. This comes up a lot with naming and assignment, so those will be where we’ll turn our focus for now.

Name variables the same way every time you name a variable in a script. If you usually use camel case to name your variables, make sure all of your variables (where applicable) are named with camel case. If you use underscores (_) in names, don’t switch to naming things with periods instead. (Example: if you name something my_variable, don’t name another variable my.new.variable.) Names should be unique, concise, and descriptive.
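For instance, a script that sticks to underscores might look like the sketch below. All of the variable names and values are invented, and the inconsistent version is commented out so that it doesn’t run.

```r
# Consistent: every name uses underscores
class_size <- 30
class_average <- 84.5
total_points <- class_size * class_average

# Inconsistent (avoid): three different conventions in one script
# classSize <- 30
# class.average <- 84.5
# total_points <- classSize * class.average
```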

In terms of assignment, pick either = or <- and stick with it. It’s good to realize that they do the same thing when you’re assigning a value to a variable, and it’s even better to practice using both, but within a script, you should stick to just one. That way, anyone who reads it can clearly identify where you’ve named a variable.
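A short sketch of a script that commits to <- for assignment is below. The names width, height, area, and values are placeholders. One thing to keep in mind: inside the parentheses of a function call, = is still used to label arguments, and that use doesn’t count as switching assignment styles.

```r
# Consistent: "<-" for every assignment
width <- 4
height <- 5
area <- width * height

# Inside a function call, "=" names an argument; it does not create a variable
values <- seq(from = 1, to = area, by = 1)
```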

7.5 Simplifying Coding Style

This may seem like a lot of very specific things to keep in mind. Luckily, RStudio has a built-in capability to handle much of this for you and make your life much easier. Simply highlight all of the code that you’d like formatted (it should be all of it), go to the Code menu, and select Reformat Code (Mac: Cmd + Shift + A; PC: Ctrl + Shift + A). Then, go back into the Code menu (with the code still highlighted) and select Reindent Lines (Mac: Cmd + I; PC: Ctrl + I). While this won’t be necessary right away, it’s an excellent, powerful tool to keep in your back pocket for when you do end up needing it.