Really Basic R Tips – Intro.

Aren’t there plenty of basic R tutorials out there? Yes, and they are great. However, I wanted to share a few things that I wish I had understood a bit better when I first started working with R. I am sure they were covered in one of the many books, videos, blog posts, or tutorials that I consumed when beginning to learn R. Many of those resources are much more comprehensive and thorough than this series of posts is going to be. That is really great, and the excruciating detail IS important for a complete understanding and for performing complex tasks. However, anyone coming to R from outside of computer science (which is going to be most people; it definitely was me) is more interested in simply getting stuff done than in reveling in the minutiae of what the computer is doing. That can come later. Once a solid foundation is in place, these details provide the basis for much more powerful and elegant analyses. In the meantime, they can present a real “can’t see the forest for the trees” problem, as the sheer number and precision of them can be overwhelming and make it hard to keep straight just what is applicable and important in any particular situation. This series will hopefully help bring clarity to some of the initial confusion of learning R and, in all likelihood, programming in general.

If you are coming to R from some field other than computer science, there is an important thing to keep in mind: coding is a tool. The word coding might bring to mind a scrawny, pale young man, swaddled in a stretched and stained black hoodie, scoliotically hunched over a keyboard, bathed in flickering green light from the screen inches from his face, typing furiously to produce text filled with arcane strings of symbols that seem to have no real meaning, surrounded by empty, crumpled Cheetos bags and piles of half-empty energy drink cans. Try to forget this nerdy stereotype. For you, code, and in particular R, is simply a tool. In fact, it is a tool that you already have experience with, though at first this may not seem obvious. I am making an assumption about the reader (not usually safe… but let me know if this does not apply to you, as I want to hear more about your life), namely that you have used a calculator.

A calculator is really nothing more or less than a purpose-built computer that is programmed using the standard “language” of arithmetic. Realizing this can be useful in a couple of ways. First, it helps break down the idea that you have to be a computer geek to benefit from being able to code. That is like saying an accountant has to have a fetish for building calculators from scratch to be able to use one to track the daily expenditures of a business. This is self-evidently not true (though the two things are not mutually exclusive). A calculator is simply a tool, and coding in R is simply a somewhat more advanced tool that lets you do MUCH more than a simple calculator.

The other important thing that thinking this way shows us is that the output is *nearly* completely dependent on the input you provide. It may seem like “This stupid script is not doing what it is supposed to”, but usually this really means “This script is not doing what I want it to because what I want it to do is actually different from what I am telling it to do”. This is often the crux of an issue you may be having, and the way through the difficulty is to step back and make sure you understand what you are trying to do, what you are telling the program to do, and what the output that it is giving you really means. The computer is simply a tool, not a wizard that can plumb the depths of your soul and use what it sees there to conjure the fulfillment of your deepest desires. You wouldn’t expect to push “x^2 + 7x + 10 = ” on a calculator and have it tell you that the x-intercepts are -2 and -5. The calculator is capable of helping you determine this, but you have to tell it how to do it in a way that it “understands”, specifically (-7 - sqrt(7^2 - 4*1*10))/(2*1) and then (-7 + sqrt(7^2 - 4*1*10))/(2*1). Try to keep this in mind as you work to learn to program, and realize that, especially at first, what it actually means to learn to code is that you are learning how to provide the correct input to a very powerful calculator.
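To make the calculator comparison concrete, here is a minimal sketch of the same quadratic-formula arithmetic typed into R. The variable names (coef_a, coef_b, coef_c, and so on) are my own choices for illustration; nothing about R requires them, and you could just as well type the two expressions above directly at the R prompt.

```r
# Coefficients of x^2 + 7x + 10 = 0, the example from the text.
# These variable names are illustrative assumptions, not R conventions.
coef_a <- 1
coef_b <- 7
coef_c <- 10

# The part under the square root in the quadratic formula.
discriminant <- coef_b^2 - 4 * coef_a * coef_c

# The two solutions, written the same way you would key them into a calculator.
root_low  <- (-coef_b - sqrt(discriminant)) / (2 * coef_a)
root_high <- (-coef_b + sqrt(discriminant)) / (2 * coef_a)

print(c(root_low, root_high))
# [1] -5 -2
```

R does exactly what it was told here and nothing more: it never "knows" that these numbers are x-intercepts, it only evaluates the arithmetic it was given.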

In the next post, the focus will be on getting set up to work with R and on how to make your workflow easier. This should mean spending less time on fiddly nuts-and-bolts issues and more time getting real work done with R.

*I say nearly because, due to the much more complex nature of computer programs and scripts, you can sometimes get errors from the software itself. However, this is quite rare when using stable releases of common tools to perform common tasks.