Why?

September 17, 2011

UK R Courses – 2012

Filed under: Conferences, R, Teaching — csgillespie @ 1:01 pm

The School of Mathematics & Statistics at Newcastle University (UK) is again running some R courses. In January 2012, we will run:

The courses aren’t aimed at teaching statistics; rather, they aim to go through the fundamental concepts of R programming. Further information is available at the course website. If you have any questions, feel free to contact me: colin.gillespie@newcastle.ac.uk

 

Bespoke courses are also available on request.

January 28, 2011

R books for undergraduate students

Filed under: R, Teaching — csgillespie @ 10:18 pm

In a recent post, I asked for suggestions for introductory R computing books. In particular, I was looking for books that:

  • Assume no prior knowledge of programming.
  • Assume very little knowledge of statistics. For example, no regression.
  • Are cheap, since they are for undergraduate students.

Some of my cons aren’t really downsides as such. Rather, they just indicate that the books aren’t suitable for this particular audience. A prime example is “R in a Nutshell”.

I ended up recommending five books to the first year introductory R class.

Recommended Books

  • A first course in statistical programming with R (Braun & Murdoch)
    • Pros: I quite like this book (hence the reason I put it on my list). It has a nice collection of exercises, it “looks nice” and doesn’t assume knowledge of programming. It also doesn’t assume (or try to teach) any statistics.
    • Cons: When describing for loops and functions the examples aren’t very statistical. For example, it uses Fibonacci sequences in the while loop section and the sieve of Eratosthenes for if statements.
  • An introduction to R (Venables & Smith)
    • Pros: Simple, short and to the point. Free copies available. Money from the book goes to the R project.
    • Cons: More an R reference guide than a textbook.
  • A Beginner’s Guide to R by Zuur.
    • Pros: Assumes no prior knowledge. Proceeds through concepts slowly and carefully.
    • Cons: Proceeds through concepts very slowly and carefully.
  • R in a Nutshell by Adler.
    • I completely agree with the recent review by Robin Wilson: “Very comprehensive and very useful, but not good for a beginner. Great book though – definitely has a place on my bookshelf.”
    • Pros: An excellent reference.
    • Cons: Only suitable for students with a previous computing background.
  • Introduction to Scientific Programming and Simulation Using R by Jones, Maillardet and Robinson.
    • Pros: A nice book that teaches R programming. Similar to the Braun & Murdoch book.
    • Cons: A bit pricey in comparison to the other books.

Books not being recommended

These books were mentioned in the comments of the previous post.

  • The Basics of S-PLUS by Krause & Olson.
    • Most students struggle with R. Introducing a similar, but slightly different language is too sadistic.
  • Software for Data Analysis: Programming with R by Chambers.
    • Assumes some previous statistical knowledge.
  • Bayesian Computation with R by Albert.
    • Not suitable for first year students who haven’t taken any previous statistics courses.
  • R Graphics by Paul Murrell
    • I know graphics are important, but a whole book for an undergraduate student might be too much. I did toy with the idea of recommending this book, but I thought that five recommendations were more than sufficient.
  • ggplot2 by Hadley Wickham.
    • Great book, but our students don’t encounter ggplot2 in their undergraduate course.

Online Resources

  • Introduction to Probability and Statistics by Kerns
    • Suitable for a combined R and statistics course. But I don’t really do much stats in this module.
  • The R Programming wikibook (a work in progress).
    • Will give the students this link.
  • Biological Data Analysis Using R by Rodney J. Dyer. Available under the CC license.
    • Nice resource. Possibly a little big for this course (I know that this is very picky, but I had to draw the line somewhere). Will probably use it for future courses.
  • Hadley Wickham’s devtools wiki (a work in progress).
    • Assumes a good working knowledge of R.
  • The R Inferno by Patrick Burns
    • Good book, but too advanced for students who have never programmed before.
  • Introduction to S programming
    • It’s in French – this may or may not be a good thing depending on your point of view ;)

December 21, 2010

R programming books

Filed under: R — csgillespie @ 3:05 pm

My sabbatical is rapidly coming to an end, and I have to start thinking more and more about teaching. Glancing over my module description for the introductory computational statistics course I teach, I noticed that it’s a bit light on recommended/background reading. In fact, it has only two books:

  • A first course in statistical programming with R (Braun & Murdoch)
    • Pros: I quite like this book (hence the reason I put it on my list). It has a nice collection of exercises, it “looks nice” and doesn’t assume knowledge of programming. It also doesn’t assume (or try to teach) any statistics.
    • Cons: When describing for loops and functions the examples aren’t very statistical. For example, it uses Fibonacci sequences in the while loop section and the sieve of Eratosthenes for if statements.
  • An introduction to R (Venables & Smith)
    • Pros: Simple, short and to the point. Free copies available. Money from the book goes to the R project.
    • Cons: More an R reference guide than a textbook.

What other good R books could I recommend? In particular, I’m looking for books that:

  • Assume no prior knowledge of programming.
  • Assume very little knowledge of statistics. For example, no regression.
  • Don’t try to teach statistics. So no “R with ….” type books.
  • Are cheap!

Suggestions welcome (needed!)

November 23, 2010

R Style Guide

Filed under: R — csgillespie @ 3:51 pm

Each year I have the pleasure (actually it’s quite fun) of teaching R programming to first year mathematics and statistics students. The vast majority of these students have no experience of programming, yet think they are good with computers because they use Facebook!

Debugging students' R scripts

The class has around 100 students, and there are eight practicals. In some of these practicals, the students have to submit code. Although the code is “marked” by a script, the script only detects whether the code is correct. Therefore, I have to go through a lot of R functions by hand to find bugs.
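To illustrate why a marking script of this kind says nothing about style, here is a minimal sketch of what such a check might look like. It is not the actual marking script: the function names `DoubleValue` and `MarkFunction`, the test inputs and the scoring rule are all hypothetical.

```r
# Hypothetical student submission
DoubleValue = function(x) {
  doubled = 2 * x
  return(doubled)
}

# Hypothetical marking helper: run the submitted function on each test
# input and return the proportion of outputs that match the expected answers
MarkFunction = function(student_function, test_inputs, expected_outputs) {
  no_correct = 0
  for (i in seq_along(test_inputs)) {
    result = student_function(test_inputs[[i]])
    if (identical(result, expected_outputs[[i]])) {
      no_correct = no_correct + 1
    }
  }
  mark = no_correct / length(test_inputs)
  return(mark)
}

mark = MarkFunction(DoubleValue, list(1, 2.5, -3), list(2, 5, -6))
```

A check like this only compares outputs, so badly laid out code and tidy code receive the same mark, and a failing submission still has to be read by hand to find the bug.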

  • The first year the course ran, I had no style guide.
    • Result: spaghetti R code.
  • Second year: asked the students to indent their code.
    • In fact, during practicals I refused to debug any R code that hadn’t been indented.
    • Result: nicer looking code and more correct code.
  • This year I intend to introduce an R style guide based loosely on Google’s and Hadley’s guides.
    • One point that’s in my guide, and not (and shouldn’t be) in the above style guides, is that all functions must have one and only one return statement. I tend to follow the single return rule for the majority of my R functions, but do, on occasion, break it. The bible of code styling, Code Complete, recommends that you use returns judiciously.

R Style Guide

This style guide is intended to be very light touch. It’s intended to give students the basis of good programming style, not be a guide for submitting to CRAN.

File names

File names should end in .R and, of course, be meaningful. Files should be stored in a meaningful directory – not your Desktop!

GOOD: predict_ad_revenue.R
BAD: foo.R

Variable & Function Names

Variable names should be lowercase. Use _ to separate words within a name. Strive for concise but meaningful names (this is not easy!).

GOOD: no_of_rolls
BAD: noOfRolls, free

Function names have initial capital letters and are written in CamelCase.

GOOD: CalculateAvgClicks
BAD: calculate_avg_clicks, calculateAvgClicks

If possible, make function names verbs.
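
As a short illustration of the two naming rules used together, here is a sketch. The function `CalculateMeanRoll` and its die-rolling task are made up for this example.

```r
# Hypothetical example: simulate rolling a fair die and average the results.
# Variable names: lowercase, with words separated by underscores.
# Function name: CamelCase with an initial capital, starting with a verb.
CalculateMeanRoll = function(no_of_rolls) {
  rolls = sample(1:6, size = no_of_rolls, replace = TRUE)
  mean_roll = mean(rolls)
  return(mean_roll)
}

set.seed(1)  # fix the random seed so the result is reproducible
mean_roll = CalculateMeanRoll(no_of_rolls = 100)
```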

Curly Braces

An opening curly brace should never go on its own line; a closing curly brace should always go on its own line.

GOOD:
if (x == 5) {
  y = 10
}
RtnX = function(x) {
  return(x)
}
BAD:
RtnX = function(x)
{
  return(x)
}

Functions

Functions must have a single return statement, placed just before the final closing brace.

GOOD:
IsNegative = function(x) {
  if (x < 0) {
    is_neg = TRUE
  } else {
    is_neg = FALSE
  }
  return(is_neg)
}
BAD:
IsNegative = function(x) {
  if (x < 0) {
    return(TRUE)
  } else {
    return(FALSE)
  }
}

Of course, the above function could and should be simplified to:
is_neg = (x < 0)
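
Written out in full, the simplified function might look like the sketch below. It still follows the single return rule, and because the comparison is element-wise it also accepts a vector, whereas the if/else versions are written with a single value in mind.

```r
IsNegative = function(x) {
  is_neg = (x < 0)
  return(is_neg)
}

IsNegative(-3)           # TRUE
IsNegative(c(-1, 0, 2))  # TRUE FALSE FALSE
```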

Commenting guidelines

Comment your code. Entire commented lines should begin with # and one space. Comments should explain the why, not the what.
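
The difference between a "what" comment and a "why" comment can be sketched as follows. The die-rolling code is a made-up example, and the BAD and GOOD labels are placed inside comments so that the block runs as written.

```r
no_of_rolls = 1000

# BAD: repeats what the code already says
# Set rolls to a sample from 1 to 6
rolls = sample(1:6, size = no_of_rolls, replace = TRUE)

# GOOD: explains why the code is written this way
# Sample with replacement, since each roll of the die is independent
rolls = sample(1:6, size = no_of_rolls, replace = TRUE)
```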

What’s missing

I decided against including a section on “spacing”, i.e. placing spaces around all binary operators (=, +, -, etc.). I think spacing may be taking style a bit too far for a first year course.
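
For reference, a spacing rule of this kind would distinguish between the two assignments sketched below. Both lines are valid R and give the same result; the variable names are arbitrary.

```r
x = 3

# Without spaces around binary operators
y=2*x+1

# With spaces around binary operators
z = 2 * x + 1
```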

Comments welcome!
