Jun 22

Endpoint Resolver: Getting tinyurl out of the Twitter stream

JavaScript, Tech, Web Services with tags: , , 11 Comments »

Sometimes you can get in the zone just enough to be productive on a plane. On my flight to Mexico City yesterday, I created Endpoint a project that contains a server proxy, JavaScript client, and Greasemonkey Script with a mission. The mission is to take a URL, work out if it is a redirect (via a Location: header), and then return the final endpoint for it.

Why did I do this?

I was brainstorming functionality for a Twitter client with James Strachan (he is working on gtwit) and we talked about how annoying tinyurl / is.gd / snurl / you name it URLs are. They don’t tell you where you are going, and you could get Rick Rolled (if you are lucky) or much much worse.

So, I wanted to create a library, and one client (Greasemonkey) to test it out. Then anyone else could use it too to resolve directly from their Web pages.

How does it work

You load up the JavaScript via script src and then you can call resolve, passing the URL and a callback that will get the result. A few examples:

// Simple version
Endpoint.resolve('http://snurl.com/2luj3', function(url) { 
  alert(url); 
});
 
// Using the original URL to work out if it has changed
Endpoint.resolve(
  document.getElementById('testurl').value, 
  function(url, orig) { 
    alert(url); 
    alert(Endpoint.isRedirecting(url, orig));
  }
);
 
// How it is used in the Twitter Endpoint Resolver
Endpoint.resolve(url, function(resulturl, originalurl) {
  if (!Endpoint.isRedirecting(resulturl, originalurl)) return;
 
  newtext = newtext.replace(originalurl, resulturl, "g");
  jQuery(el).html(newtext);
});

Under the hood, a bunch of stuff is happening. I would love to be able to just use XMLHttpRequest to dynamically hit the URL and look at the headers, but the same-origin policy stops me.

This is why I have the server proxy, which returns a JSONP callback.

When you call resolve(url, callback) the script tag is created on the fly and added to the DOM. The callback function is all handled to allow multiple calls, and then the chain unravels.

Here you can see it all in action, showing how my Twitter stream will go through and the URLs will dynamically change from their tinyurl versions to whereyouaregoing.com:

I wanted to use App Engine to host the server proxy, but unfortunately I can’t work out how to do that yet. You have access to the URLFetch API to access resources from App Engine. Unfortunately for me, one of the features is that it understands redirects and just goes on through to the full resource itself, with no way to get the endpoint from the headers in the response.

It was also interesting to read Steve Gilmor talk about these services all be it in a post that is hard to actually understand ;)

Also, Simon Willison just put up a simple service on App Engine, json-time, that “exposes Python’s pytz timezone library over JSON.” I think that we will see a lot of these types of mini-Web services hosted on App Engine. Taking Python utility and making services from its goodness is an obvious choice.