Jun 22

Endpoint Resolver: Getting tinyurl out of the Twitter stream

Sometimes you can get in the zone just enough to be productive on a plane. On my flight to Mexico City yesterday, I created Endpoint, a project that contains a server proxy, a JavaScript client, and a Greasemonkey script, all with one mission: take a URL, work out whether it is a redirect (via a Location: header), and then return the final endpoint for it.

Why did I do this?

I was brainstorming functionality for a Twitter client with James Strachan (he is working on gtwit) and we talked about how annoying tinyurl / is.gd / snurl / you name it URLs are. They don’t tell you where you are going, and you could get Rick Rolled (if you are lucky) or much much worse.

So I wanted to create a library, plus one client (a Greasemonkey script) to test it out. Anyone else could then use the library to resolve URLs directly from their own Web pages.

How does it work?

You load the JavaScript via a script src tag and then call Endpoint.resolve, passing the URL and a callback that receives the result. A few examples:

// Simple version
Endpoint.resolve('http://snurl.com/2luj3', function(url) { 
  alert(url); 
});
 
// Using the original URL to work out if it has changed
Endpoint.resolve(
  document.getElementById('testurl').value, 
  function(url, orig) { 
    alert(url); 
    alert(Endpoint.isRedirecting(url, orig));
  }
);
 
// How it is used in the Twitter Endpoint Resolver
Endpoint.resolve(url, function(resulturl, originalurl) {
  if (!Endpoint.isRedirecting(resulturl, originalurl)) return;
 
  newtext = newtext.replace(originalurl, resulturl, "g");
  jQuery(el).html(newtext);
});
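
A portability note on that last example: the third "g" argument to replace is a non-standard flags parameter from Firefox's JavaScript engine. That is fine inside Greasemonkey, but other engines ignore it and replace only the first match. Here is a minimal sketch of a portable alternative. replaceAllLiteral is a hypothetical helper name, not part of Endpoint:

```javascript
// Hypothetical helper (not part of Endpoint): replace every literal
// occurrence of `search` in `text` without relying on the non-standard
// third "flags" argument to String.prototype.replace.
function replaceAllLiteral(text, search, replacement) {
  if (search === '') return text; // nothing sensible to replace
  // split/join treats `search` as a plain string, so characters such as
  // "." or "?" in a URL are never interpreted as regex syntax.
  return text.split(search).join(replacement);
}
```

In the resolver callback this would read newtext = replaceAllLiteral(newtext, originalurl, resulturl).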

Under the hood, a bunch of stuff is happening. I would love to be able to just use XMLHttpRequest to dynamically hit the URL and look at the headers, but the same-origin policy stops me.

This is why I have the server proxy, which returns a JSONP callback.
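
To make the proxy's job concrete, here is a rough sketch of what such a server could do. This is not Endpoint's actual server code. The function names, the fetchHead helper, the hop limit, and the exact JSONP argument order are all assumptions for illustration:

```javascript
// Hypothetical sketch of a redirect-resolving proxy core.
// `fetchHead(url)` is an ASSUMED helper that makes ONE request without
// following redirects and resolves to { status, location }.
async function resolveEndpoint(url, fetchHead, maxHops = 10) {
  let current = url;
  for (let hop = 0; hop < maxHops; hop++) {
    const { status, location } = await fetchHead(current);
    const isRedirect = status >= 300 && status < 400 && Boolean(location);
    if (!isRedirect) return current; // reached the final endpoint
    // A Location header may be relative, so resolve it against the current URL.
    current = new URL(location, current).href;
  }
  return current; // hop limit reached (e.g. a redirect loop); return what we have
}

// Wrap the answer as a JSONP body: callbackName("final", "original");
// The (final, original) argument order is an assumption modelled on the
// callback signature used in the client examples.
function toJsonp(callbackName, originalUrl, finalUrl) {
  // Only allow identifier-like names so the response cannot inject script.
  if (!/^[A-Za-z_$][\w$.]*$/.test(callbackName)) {
    throw new Error('invalid callback name');
  }
  return `${callbackName}(${JSON.stringify(finalUrl)}, ${JSON.stringify(originalUrl)});`;
}
```

Passing fetchHead in as a parameter keeps the loop independent of any particular HTTP library. The only requirement is that the underlying client can be told not to follow redirects on its own.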

When you call resolve(url, callback), a script tag is created on the fly and added to the DOM. The callbacks are managed so that multiple calls can be in flight at once, and each one fires as its response comes back.
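
For anyone curious about the mechanics, here is a minimal sketch of that JSONP pattern. It is not Endpoint's real source. JsonpResolver, the proxy address, and the url/callback query parameter names are all made up for illustration:

```javascript
// Hypothetical JSONP client sketch; names and URL format are assumptions.
const JsonpResolver = {
  proxyUrl: 'http://proxy.example/resolve', // placeholder proxy address
  callbacks: {}, // pending callbacks, keyed by a generated name
  counter: 0,

  // Register a one-shot callback under a unique name and return the
  // script URL that asks the proxy to invoke it.
  buildRequest(url, callback) {
    const registry = this.callbacks;
    const name = 'cb' + (this.counter++);
    registry[name] = (...args) => {
      delete registry[name]; // one-shot: clean up before calling back
      callback(...args);
    };
    return this.proxyUrl +
      '?url=' + encodeURIComponent(url) +
      '&callback=' + encodeURIComponent('JsonpResolver.callbacks.' + name);
  },

  // Inject a script tag so the browser fetches and runs the JSONP response.
  resolve(url, callback) {
    const src = this.buildRequest(url, callback);
    if (typeof document === 'undefined') return src; // no DOM: nothing to inject
    const script = document.createElement('script');
    script.src = src;
    document.getElementsByTagName('head')[0].appendChild(script);
    return src;
  }
};
// The proxy's response calls JsonpResolver.callbacks.cbN(...), so the
// object must be reachable by that name from the global scope.
globalThis.JsonpResolver = JsonpResolver;
```

Because every call registers its own uniquely named callback, several resolves can be pending at the same time without stepping on each other.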

Here you can see it all in action, showing how my Twitter stream will go through and the URLs will dynamically change from their tinyurl versions to whereyouaregoing.com:

I wanted to use App Engine to host the server proxy, but I can’t work out how to do that yet. App Engine gives you the URLFetch API for accessing external resources. Unfortunately for me, one of its features is that it understands redirects and goes straight through to the final resource itself, with no way to get the endpoint from the headers in the response.

It was also interesting to read Steve Gillmor talk about these services, albeit in a post that is hard to actually understand ;)

Also, Simon Willison just put up a simple service on App Engine, json-time, that “exposes Python’s pytz timezone library over JSON.” I think that we will see a lot of these types of mini-Web services hosted on App Engine. Taking Python utilities and turning them into services is an obvious choice.

11 Responses to “Endpoint Resolver: Getting tinyurl out of the Twitter stream”

  1. Dan Shaw Says:

    Endpoint Resolver works great, especially on pages where there aren’t too many URLs. Pages such as http://twitter.com/dalmaer, which are full of links, tend to break. This could get annoying. I remember the annoyance of tweet primal screams breaking pages. I never used tinyurls until Twitter and have since become quite comfortable with them (perhaps too comfortable). Injecting an optional expand toggle like they’ve implemented on http://summize.com might be a better approach in the long run.

  2. Michael Mahemoff Says:

    Good work. Only, I would have called it “rick unrolled” :D.

    json-time: Yes, we need more json services, security risks notwithstanding. Time is one I recently looked into while writing a clock gadget, since as we all know, JS doesn’t really do much with timezones.

  3. Michael Mahemoff Says:

    PS It must be said this is a lot more comprehensible and practical than a certain related article that recently appeared on techcrunch!!!

  4. James Strachan Says:

    Lovely! :)

  5. Dan Brook Says:

    “I would love to be able to just use XMLHttpRequest to dynamically hit the URL and look at the headers, but the same-origin policy stops me.”

    It’s a good thing GreaseMonkey provides GM_xmlhttpRequest to get around such a blocker then:

    http://wiki.greasespot.net/GM_xmlhttpRequest

    :)

  6. jeka911 Says:

    Good script, ty. But I’ve changed it a bit. Long links (for example, links to Google Maps) broke the Twitter design, so I’ve replaced:

    newtext = newtext.replace(originalurl, resulturl, "g");

    with:

    newtext = newtext.replace("href=\""+originalurl, "href=\""+resulturl, "g");
    newtext = newtext.replace(originalurl, resulturl.length > 30 ? resulturl.substr(0,30)+"…" : resulturl, "g");

    Now it’s better. Hope this will be useful for somebody.

  7. Sachin Says:

    Nice work… very helpful

  8. dion Says:

    @jeka911

    Nice. I tweaked it again and checked it in. I added a title="full url" so if you hover on the … it shows up right there as well as at the bottom of the browser.

    Cheers,

    Dion

  9. dion Says:

    @Dan Brook,

    Very true. If you just use it for Twitter, that would be a more efficient way to go. This way it works for everything (including GWT). The server side also does recursive gets to clear through multiple redirects.

    Cheers,

    Dion

  10. Iraê Says:

    Got pissed off with tinyurls today and remembered this awesome post.

    But your server can act a little slow sometimes. I re-read the article and saw that App Engine was no good.

    Have you thought of AppJet? (http://appjet.com/)

    This could be a good example of JavaScript on both sides with their awesome service. I’ll try it sometime, if I find the time…

  11. kcu Says:

    Nice work! That minimizes the risk of getting rickrolled ;)
