Michael R. Cook Ruby and Golang Developer

Sitemaps with the Roda Framework

There are a number of reasons for wanting to add a sitemap to your site, but perhaps the most important is for Google, and other search engines (SEs) of course! From an SEO viewpoint, Google expects your site to have a sitemap and, thankfully, this is very easy to do with Ruby websites.

In this short tutorial we’ll be implementing an automated sitemap for a site built with the Roda framework. I’ll be building on top of my previous Roda Blog tutorial, which you might find useful to look over before continuing.

Requirements

Rather than having to build the XML by hand, we’ll be using the builder and tilt gems. These will make it much easier to generate valid XML documents. Before continuing, add the following to your Gemfile and then run bundle install.

# ./Gemfile

gem "tilt", "~> 2.0.1"
gem "builder", "~> 3.2.2"

Routing

The tilt gem supports the Builder engine out-of-the-box, so we just specify the builder extension on the sitemap route, and the file will be rendered correctly.

# ./myapp.rb

route do |r|
  r.get "sitemap.xml" do
    @posts = Post.reverse_order
    response["Content-Type"] = "text/xml"
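    # :ext selects the template engine (newer Roda releases may call this
    # option :engine, so check the render plugin docs for your version).
    render("sitemap", ext: "builder")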
  end

  # ...
end

As we’ll be delivering an XML document we’ve created the route to trigger on /sitemap.xml. We also force the “Content-Type” to XML for this route.

The sitemap specification states that a single file can include up to 50,000 URLs (and must be no larger than 50MB uncompressed). If your site will have more than that then you’ll need to split them across multiple sitemaps and list those in a sitemap index file. More information on the protocol can be found at sitemaps.org.
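As a rough illustration of that limit, here is a minimal sketch that slices a list of URLs into groups of at most 50,000. The sitemap_chunks helper and the sitemap-N.xml file names are my own invention for this example; they are not part of Roda or the protocol.

```ruby
# Maximum number of <url> entries the sitemap protocol allows per file.
MAX_URLS_PER_SITEMAP = 50_000

# Splits a list of URLs into groups that each fit in one sitemap file and
# pairs every group with a hypothetical file name (sitemap-1.xml, ...).
def sitemap_chunks(urls, per_file: MAX_URLS_PER_SITEMAP)
  urls.each_slice(per_file).each_with_index.map do |chunk, index|
    ["sitemap-#{index + 1}.xml", chunk]
  end
end

# Pretend the blog has 120,000 posts.
urls = (1..120_000).map { |id| "http://localhost:9292/posts/#{id}" }
chunks = sitemap_chunks(urls)

chunks.each { |name, group| puts "#{name}: #{group.size} URLs" }
# sitemap-1.xml: 50000 URLs
# sitemap-2.xml: 50000 URLs
# sitemap-3.xml: 20000 URLs
```

Each group could then be rendered with its own builder view, one route per file.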

Builder View

Using builder is very straightforward. Let’s see the whole file first:

# ./views/sitemap.builder

xml.instruct!

xml.urlset(
  :"xmlns:xsi"          => "http://www.w3.org/2001/XMLSchema-instance",
  :"xsi:schemaLocation" => "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd",
  :xmlns                => "http://www.sitemaps.org/schemas/sitemap/0.9"
) do
  xml.url do
    xml.loc "http://localhost:9292/"
    xml.changefreq "daily"
    xml.priority "1"
  end

  xml.url do
    xml.loc "http://localhost:9292/about"
    xml.changefreq "monthly"
    xml.priority "0.5"
  end

  @posts.each do |post|
    xml.url do
      xml.loc "http://localhost:9292/posts/#{post.id}"
      xml.lastmod post.updated_at.iso8601
      xml.changefreq "weekly"
      xml.priority "0.7"
    end
  end
end

The first thing to note is the file extension. By using .builder we’re telling tilt that we want this to be processed using the Builder engine.

Sitemaps must be valid, UTF-8 encoded XML, so the first item should be the XML declaration; xml.instruct! generates <?xml version="1.0" encoding="UTF-8"?>.

The root node should be a <urlset> and requires the sitemap namespace attribute (xmlns). The xmlns:xsi and xsi:schemaLocation attributes are optional and are only needed if you want to validate the file against the schema. The basic construct of xml.urlset is how we’ll generate the rest of our xml.url nodes, and this should be self-explanatory from the code above.

It’s difficult to find exact recommendations on what values to give xml.priority, so those above are my own preference. The protocol allows values from “0.0” to “1.0”, and they are only relative to the other pages on your own site. Generally, important pages get a high priority and less important ones a low priority. For my own feeds I usually give the homepage a “1”, general pages like about between “0.2” and “0.6”, while Posts are around “0.7” (at least for a blog site). You’ll have to make your own judgement based on what your site is about, but perhaps the one absolute rule is that you should not set all values to be the same.
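One way to keep those choices in a single place is a small lookup table. This is only a sketch: the PRIORITIES hash, the page kinds, and the priority_for helper are hypothetical names of mine, and the numbers are simply the preferences described above.

```ruby
# Hypothetical mapping from page kind to sitemap priority.
PRIORITIES = {
  home: 1.0,
  page: 0.5,
  post: 0.7
}.freeze

# The protocol treats a missing <priority> as 0.5, so use that as a fallback.
DEFAULT_PRIORITY = 0.5

# Returns the priority for a page kind as a one-decimal string,
# clamped to the 0.0..1.0 range the protocol allows.
def priority_for(kind)
  value = PRIORITIES.fetch(kind, DEFAULT_PRIORITY)
  format("%.1f", value.clamp(0.0, 1.0))
end

puts priority_for(:home) # 1.0
puts priority_for(:post) # 0.7
puts priority_for(:tag)  # 0.5 (unknown kinds fall back to the default)
```

In the view you could then write xml.priority priority_for(:post) instead of repeating the literal value.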

Our builder view should output something like the following:

# http://localhost:9292/sitemap.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://localhost:9292/</loc>
    <changefreq>daily</changefreq>
    <priority>1</priority>
  </url>
  <url>
    <loc>http://localhost:9292/about</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://localhost:9292/posts/1</loc>
    <lastmod>2015-04-17T12:50:11+01:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.7</priority>
  </url>
  <url>
    <loc>http://localhost:9292/posts/2</loc>
    <lastmod>2015-04-17T13:07:21+01:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>
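Note that every <loc> here points at http://localhost:9292, which is only right in development; on a live site the URLs need to use your real host. A minimal sketch of one way to handle that, assuming a SITE_URL environment variable (a name I made up for this example) and a hypothetical absolute_url helper:

```ruby
require "uri"

# SITE_URL is a hypothetical environment variable; fall back to the
# development address used throughout this tutorial.
BASE_URL = ENV.fetch("SITE_URL", "http://localhost:9292")

# Joins a site-relative path onto the base URL and returns a String.
def absolute_url(path, base: BASE_URL)
  URI.join(base, path).to_s
end

puts absolute_url("/about", base: "https://example.com")
# https://example.com/about
puts absolute_url("/posts/1", base: "https://example.com")
# https://example.com/posts/1
```

The view could then call xml.loc absolute_url("/posts/#{post.id}") rather than hard-coding the host.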

After deploying your changes, head over to your Google Search Console account (formerly Google Webmaster Tools) and add the sitemap URL. If Google is being nice to you then you should enjoy all the SEO goodness that will bring.

You can also use this exact same technique to generate an RSS feed for your blog. You can read more about the specification on the RSS Advisory Board website. I may even do a specific RSS tutorial if there is demand for it.

If you have any comments or issues with the above code please do let me know via the comments below, or contact me directly.

