R Markdown & Bloggin’: Part 1 – Inserting Code

As a data scientist, I find that the vast majority of the useful content I produce just gets stored in a rainy-day folder until I have further need for it. I think it would be more beneficial if I brought some of the functions, processes, and knowledge I have developed over the years out into the daylight. I can think of no better first step than to make bloggin’ as easy as possible, and the easiest way I can think of is to use a tool I know well: R. Thus, the following post contains some example code showing how to “easy button” R Markdown into generating content that looks great on blogs such as r-bloggers.com. Also, I intend to spotlight a number of functions I have developed within my company’s own R package; more on that later. Let’s see how this goes for now.

Setting up R Markdown

The very basics of how to make and “knit” an R Markdown (.Rmd) file won’t be covered here. If you want a quick run-through, RStudio has a great guide to get you started. Either way, you’ll need to install the packages {knitr} and {rmarkdown}, as well as Pandoc. I like to use {installr}’s installr::install.pandoc() function so I don’t have to download Pandoc separately. For now, I am going to assume you know the basics. Generating an html_document seems to be the most straightforward way to produce a flat webpage. There are many other convenient alternatives; however, I am going to “KISS” for now.
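For anyone who wants the setup spelled out, here is a minimal sketch of those steps in R. The helper name missing_packages and the file name my_post.Rmd are placeholders of mine, not part of any package. Note that {installr} only targets Windows; on other systems Pandoc has to come from elsewhere (RStudio ships with its own copy).

```r
# Return the packages from `pkgs` that are not installed yet.
# `missing_packages` is a hypothetical helper, not part of any package.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Only touch the system when run by hand in an interactive session.
if (interactive()) {
  to_install <- missing_packages(c("knitr", "rmarkdown"))
  if (length(to_install) > 0) install.packages(to_install)

  # Pandoc is a separate program; installr can fetch it on Windows.
  if (!rmarkdown::pandoc_available()) installr::install.pandoc()

  # Knit a post to a flat HTML page ("my_post.Rmd" is a placeholder).
  rmarkdown::render("my_post.Rmd", output_format = "html_document")
}
```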

Highlighting Code with R Markdown

One of the first things I tried to sort out was how to make code – not just R code – look better using R Markdown. On a day-to-day basis I have to write – or at least read – code from all over the internet. I therefore try not to shy away from other programming languages, especially if people have developed some really powerful algorithms that don’t have an R equivalent yet. As a group, we chose to use Alex Gorbatchev’s SyntaxHighlighter for our blog posts because we thought it looked the slickest. There were a number of helpful guides around (Salabim, Galili), including a script written exclusively for R by Yihui.

Sourcing in Files

R Markdown can be somewhat confusing with regard to how one “includes” content in the HTML document being generated. I know of 7 methods that allow the userR to source in external files using rmarkdown’s header arguments. The arguments theme and highlight allow the userR to select among a number of preset CSS themes. However, I am not aware of anything stopping someone from just adding their own stylesheet within the ~/rmarkdown/rmd/h/bootstrap-3.3.5/css directory, instead of using the ones listed on the R Markdown GitHub page. Eran Raviv’s blog demonstrates several of the different default theme settings.

Example Rmd header:

output:
  html_document:
    self_contained: false
    css: style.css
    template: template.html
    theme: null
    highlight: null
    includes:
      in_header:    ./html_folder/header.html
      before_body:  working_dir_doc_prefix_logo.html
      after_body:   ./html/doc_suffix.html

My current setup uses two HTML files loaded into the in_header and after_body arguments like so:

output:
  html_document:
    self_contained: false
    keep_md: true
    theme: null
    highlight: null
    pandoc_args: [
      "--id-prefix", "equablog-"
    ]
    includes:
      in_header: ./html/header.html
      after_body: ./html/footer.html

Setting the self_contained argument to false means that we want to preserve references to external resources, like images, in our HTML rather than embedding them. This can be important for web publishing if your syndication service does not accept HTML with images embedded as base64-encoded data.
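The same options can also be passed from an R script instead of the YAML header. The sketch below mirrors the header above using rmarkdown::html_document() and rmarkdown::includes(); the wrapper name render_post is made up for illustration, and the paths assume the ./html folder sits next to the .Rmd file.

```r
# Hypothetical wrapper: render one post with the options from the YAML header.
render_post <- function(input) {
  fmt <- rmarkdown::html_document(
    self_contained = FALSE,  # keep images as external files, not base64
    keep_md = TRUE,          # also keep the intermediate .md file
    theme = NULL,            # no built-in theme
    highlight = NULL,        # no built-in syntax highlighting
    pandoc_args = c("--id-prefix", "equablog-"),
    includes = rmarkdown::includes(
      in_header = "./html/header.html",
      after_body = "./html/footer.html"
    )
  )
  rmarkdown::render(input, output_format = fmt)
}
```

Calling render_post("my_post.Rmd") (a placeholder file name) should then behave like knitting with the header shown earlier.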

header.html & footer.html files

These are copies of the header and footer HTML files I am using for this very blog. The method I chose doesn’t require installing any software. If you want more control and options with regard to syntax compatibility and personal preferences, the software can be obtained from Alex Gorbatchev’s GitHub repo.

header.html

<link href='http://alexgorbatchev.com/pub/sh/current/styles/shCore.css'
rel='stylesheet' type='text/css'/>
<link href="http://alexgorbatchev.com/pub/sh/current/styles/shThemeDefault.css"
rel="stylesheet" type="text/css" />

<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js'
type='text/javascript'></script>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shAutoloader.js'
type='text/javascript'></script>

footer.html

<script type="text/javascript">
    SyntaxHighlighter.autoloader(
      "r  http://equastat.com/wp-content/uploads/2016/04/shBrushR.js",
      "plain  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPlain.js",
      "sql  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushSql.js",
      "js  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJScript.js",
      "html  xml http://alexgorbatchev.com/pub/sh/current/scripts/shBrushXml.js",
      "cpp  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCpp.js",
      "csharp  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCSharp.js",
      "css  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushCss.js",
      "java  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJava.js",
      "php  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPhp.js",
      "py  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPython.js",
      "ruby  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushRuby.js",
      "vb  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushVb.js",
      "perl  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPerl.js",
      "scala  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushScala.js",
      "shell  http://alexgorbatchev.com/pub/sh/current/scripts/shBrushBash.js",
      "ahk  http://users.on.net/~mjneish/syntax/scripts/shBrushAhk.js",
      "ps  http://users.on.net/~mjneish/syntax/scripts/shBrushPowerShell.js"
    );
    SyntaxHighlighter.config.bloggerMode = true;
    SyntaxHighlighter.defaults["toolbar"] = false;
    SyntaxHighlighter.all();
</script>

You might have noticed in the above code section that, for ‘r’, the highlighter script is sourced from our own server; we had to download and host it ourselves, or the code chunk would intermittently malfunction. The big problem I encountered was that if any one of these brushes failed to load, it could cause the autoloader function to fail, and then code highlighting would not work at all. There was a lot of guess-and-check involved in implementing highlighters with this method, and I found that repeatedly checking the browser console for errors on the rendered page was a somewhat tedious task. So it is always good to have a web developer friend on hand to help you through the process. Hopefully, you won’t need the help.

Highlighting Examples

The following are a few examples of highlighted code using SyntaxHighlighter’s default CSS theme. I grabbed a few of these code examples from a site called rosettacode.org. There are lots of other themes you can use; all this entails is downloading the respective .css theme you are interested in.

R
# Using an example vector "arg"
arg <- c(1, 2, 3, 4, 5)
redundant_sum <- function(...) {
  Reduce(sum, as.list(...))
}
redundant_sum(arg)

using <pre class="brush: r">...</pre>

csharp
// Comment
using System.Linq;
int[] arg = { 1, 2, 3, 4, 5 };
int sum = arg.Sum();
int prod = arg.Aggregate((runningProduct, nextFactor) => runningProduct * nextFactor);

using <pre class="brush: csharp">...</pre>

Scala
  val seq = Seq(1, 2, 3, 4, 5)
  val sum = seq.sum

using <pre class="brush: scala">...</pre>

Matlab
  array = [1, 2, 3, 4, 5]
  sum(array, 2)

using <pre class="brush: plain">...</pre>

Bash
  LIST='1 2 3 4 5';
  SUM=0;
  for i in $LIST; do
    SUM=$((SUM + i))
  done;

using <pre class="brush: shell">...</pre>

Python
  # Default
  numbers = [1, 2, 3, 4, 5]
  total = sum(numbers)
  # Using numpy
  from numpy import r_
  numbers = r_[1:6]  # the slice end is exclusive, so 1:6 gives 1..5
  total = numbers.sum()

using <pre class="brush: py">...</pre>

Perl
  my @list = ( 1, 2, 3, 4, 5);
  $sum  += $_ foreach @list;

using <pre class="brush: perl">...</pre>

Ruby
  arr = [1,2,3,4,5]
  p sum = arr.inject(0) { |sum, item| sum + item }

using <pre class="brush: ruby">...</pre>

Java
  Arrays.stream(arg).sum();
  /** or */
  Arrays.stream(arg).reduce(0, (a, b) -> a + b);

using <pre class="brush: java">...</pre>

Some brushes work better than others, and there is no reason why you couldn’t push the peas into the carrots – so to speak – if you wanted to mix and match brushes and code based on your preferences.

One hurdle I did notice, however, was that when code snippets used syntax that looked like raw HTML, expressions containing the less-than character (i.e. < ) were particularly hard for R Markdown to compile clairvoyantly. In such cases, using the HTML character entity &lt; instead sufficed. I am sure there is a better way to handle such issues using R Markdown, in the way it was intended; however, I wanted to avoid learning about curly braces and .css classes for now. 😉
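One way to avoid hand-editing those characters, sketched here under my own assumptions, is to escape the snippet in R before it reaches the page. The helper names escape_html and brush_block are hypothetical and not part of knitr, rmarkdown, or SyntaxHighlighter; the only thing taken from above is the <pre class="brush: ..."> wrapper.

```r
# Replace the characters HTML treats specially with character entities.
# "&" must be handled first so the entities added afterwards stay intact.
escape_html <- function(code) {
  code <- gsub("&", "&amp;", code, fixed = TRUE)
  code <- gsub("<", "&lt;", code, fixed = TRUE)
  gsub(">", "&gt;", code, fixed = TRUE)
}

# Wrap escaped code lines in the <pre> tag that SyntaxHighlighter looks for.
brush_block <- function(code, brush = "r") {
  body <- paste(escape_html(code), collapse = "\n")
  paste0('<pre class="brush: ', brush, '">\n', body, "\n</pre>")
}
```

Inside an .Rmd file this could be called from a chunk with the results = "asis" option, for example cat(brush_block("x <- 1")), so the generated HTML passes through to the page untouched.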

Gist Snippets

Another useful means of displaying code is to use GitHub Gist snippets. For example, the code <script src="https://gist.github.com/1804862.js?file=shBrushR.js"></script> creates a good-looking code snippet.
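If you embed gists often, a tiny helper can build the tag for you. This is only a sketch: gist_embed is a name I made up, and it assumes nothing beyond the URL pattern in the example above.

```r
# Hypothetical helper: build the embed tag for a gist id and an optional file.
gist_embed <- function(id, file = NULL) {
  url <- paste0("https://gist.github.com/", id, ".js")
  if (!is.null(file)) url <- paste0(url, "?file=", file)
  paste0('<script src="', url, '"></script>')
}
```

For instance, gist_embed("1804862", "shBrushR.js") reproduces the tag shown above, and it can be written into a post with cat() from a chunk that uses results = "asis".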

Additionally, if you have too many gist snippets to keep organized, you can use the GistBox app, which connects directly to GitHub.

Note: a lot of this content had to be changed after the recent update to rmarkdown v0.9.5. Luckily, as of April 10th, 2016, all of these tools should be up-to-date.

Thanks for skimming, and I hope this blurb motivates you all to blog about your endeavors more often!

Reino Bruner

Originally Published April 8, 2016

Updated May 6, 2016


Also published on Medium.
