Twitter integration and parsing links with Ruby on Rails

As per usual, it’s been a while since I’ve written anything about Ruby on Rails or RidingResource. A while back, we had an issue with Twitter integration that messed up the homepage. I got some time today to fix things, so I figured I would write a little bit about how RidingResource has integrated Twitter into our homepage and about how parsing links works in Ruby on Rails.

When we were first building RidingResource, we decided it might be cool to have the last few tweets from our RidingResource Twitter account displayed on the home page. It took me a minute, but, like with most things you want to do with Ruby on Rails, there’s a gem/plugin for that. The one we chose happens to be Dancroak’s Twitter Search. The neat thing about this gem is that it allows you to grab things off Twitter very easily, and then use them however you like.

def home
  ## set up twitter client
  @client = TwitterSearch::Client.new 'equine'
 
  ## pull in last 3 tweets
  @tweets = @client.query :q => 'from:ridingresource', :rpp => 3
end

Well, this is all well and good, but one thing I quickly realized when we initially did this is that when we posted a link in a tweet, it is parsed fine if you look at it on Twitter’s website or through other clients, but we were basically just regurgitating the text of the tweet with no markup. When I went to reinstate the Twitter feed on the homepage today, I started looking into ways to parse URLs with Ruby (on Rails) that were already in strings and to display them with the proper hypertext markup. What I found were some neat snippets on DZone that did just that.

After some careful Googling and search-term hackery, I stumbled upon this DZone snippet that discusses how to convert URLs into hyperlinks. This snippet written by James Robertson makes use of a gem, alexrabarts-tld, which does some checking to see if items are actually a real domain TLD. As James found, like with many things Ruby, you can pass the items that come out of a gsub regexp into a block, which enables us to replace the URL in the string with the hypertext for the URL.

Because we were going to use this substitution on every single tweet to check if there were any URLs, I created a nifty little helper function to do just that.

def hyperlink_parser(string)
  return string.gsub(/((\w+\.){1,3}\w+\/\w+[^\s]+)/) {|x| is_tld?(x) ? "<a href='#{x}' class='#{link_class}'>#{x}</a>" : x}
end

One thing I noticed was that if URLs were in the text that had http:// in them, we would match the rest of the URL, hyperlink the FQDN and other parts of the link, but ignore the http://. It looked really funny to have a link that looked like http://www.erikjacobs.com I realized that this was just a “problem” with the original RegExp that Jason had created, so I started to do some sleuthing.

The first thing I did was try to find a RegExp tester. While there are many out there, the one I ended up using was Rubular (conveniently uses Ruby – look at that!), which displays the results of our RegExp search against the text in real time. Some careful googling of selected RegExp and URL terms resulted in yet another snippet from DZone, by Rafael Trindade. Getting closer!

Lastly, since this function might just possibly be used elsewhere, and since I wanted to apply a style at least to the links generated by its use here, I decided to add another argument to the helper method for the link class. The result is the following helper:

def hyperlink_parser(string, link_class="")
  return string.gsub(/(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/) {|x| is_tld?(x) ? "<a href='#{x}' class='#{link_class}'>#{x}</a>" : x}
end

The view essentially just iterates over the tweets that came back from the search, rendering a partial:

<% for tweet in @tweets -%>
  <%= render :partial => "tweet", :object => tweet %>
<% end -%>

And inside the partial we do a few things with the tweet, including linking to the original tweet on Twitter’s site, showing the date, and parsing the text for the URL and returning it:

<li><p><a href="http://www.twitter.com/RidingResource/status/<%= tweet.id %>" target="_blank"><%= tweet.created_at[5,11] %></a> <%= hyperlink_parser(tweet.text, "tweet") %></p></li>

Hopefully some of you will find this useful in your quest to either integrate Twitter into your Ruby on Rails’ projects, or to perhaps parse some things that live inside strings into markup. There is almost surely something that already does this, so I probably re-invented the wheel, but it didn’t take long and it seems to work.