Endpoint Resolver: Getting tinyurl out of the Twitter stream
Sometimes you can get in the zone just enough to be productive on a plane. On my flight to Mexico City yesterday, I created Endpoint, a project that contains a server proxy, a JavaScript client, and a Greasemonkey script with a mission. The mission is to take a URL, work out if it is a redirect (via a Location: header), and then return the final endpoint for it.
Why did I do this?
I was brainstorming functionality for a Twitter client with James Strachan (he is working on gtwit) and we talked about how annoying tinyurl / is.gd / snurl / you name it URLs are. They don’t tell you where you are going, and you could get Rick Rolled (if you are lucky) or much much worse.
So, I wanted to create a library, and one client (Greasemonkey) to test it out. Then anyone else could use it too to resolve directly from their Web pages.
How does it work?
You load up the JavaScript via a script tag with a src attribute, and then you can call resolve, passing the URL and a callback that will get the result. A few examples:
// Simple version
Endpoint.resolve('http://snurl.com/2luj3', function(url) {
  alert(url);
});

// Using the original URL to work out if it has changed
Endpoint.resolve(
  document.getElementById('testurl').value,
  function(url, orig) {
    alert(url);
    alert(Endpoint.isRedirecting(url, orig));
  }
);

// How it is used in the Twitter Endpoint Resolver
Endpoint.resolve(url, function(resulturl, originalurl) {
  if (!Endpoint.isRedirecting(resulturl, originalurl)) return;
  newtext = newtext.replace(originalurl, resulturl, "g");
  jQuery(el).html(newtext);
});
Under the hood, a bunch of stuff is happening. I would love to be able to just use XMLHttpRequest to dynamically hit the URL and look at the headers, but the same-origin policy stops me. This is why I have the server proxy, which returns a JSONP callback.
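To make the proxy's job concrete, here is a rough sketch of the core logic it would need: follow Location headers until there are none left, then wrap the answer in a JSONP call. This is not the actual Endpoint server code. The names resolveEndpoint, toJsonp, and getLocation are made up for illustration, the HTTP request itself is hidden behind the assumed getLocation helper, and the shape of the JSONP response (an id plus the final URL) is an assumption as well.

```javascript
// Hypothetical sketch of the proxy's core logic; NOT the actual Endpoint server.
// `getLocation(url)` is an assumed helper that makes one request WITHOUT
// following redirects and returns the Location header value, or null.
function resolveEndpoint(url, getLocation, maxHops) {
  var current = url;
  var limit = maxHops || 10; // cap the chain so redirect loops terminate
  for (var hop = 0; hop < limit; hop++) {
    var location = getLocation(current);
    if (!location) return current; // no redirect: this is the endpoint
    // A Location header may be relative, so resolve it against the current URL.
    current = new URL(location, current).href;
  }
  return current; // gave up after `limit` hops
}

// Wrap the result as a JSONP response body. The callback name arrives in the
// query string, so only identifier-like (optionally dotted) names are allowed.
function toJsonp(callbackName, id, finalUrl) {
  if (!/^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/.test(callbackName)) {
    throw new Error('invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(id) + ', ' +
    JSON.stringify(finalUrl) + ');';
}
```

Checking the callback name matters because whatever the proxy returns is executed as script on the calling page.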
When you call resolve(url, callback), a script tag is created on the fly and added to the DOM. The callbacks are tracked so that multiple calls can be in flight at once, and then the chain unravels.
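The client side of that dance can be sketched roughly as follows. This is not the real Endpoint source, just a minimal illustration of the pattern. createResolver, receive, the proxyUrl value, the query parameter names, and the assumption that the proxy answers with callbackName(id, finalUrl) are all invented for the example. The document object is passed in so the sketch can also be exercised outside a browser.

```javascript
// Hypothetical sketch of a JSONP-style resolver; NOT the actual Endpoint source.
// `doc` is the page's `document` (or anything DOM-like), `proxyUrl` is an
// assumed address for the server proxy, and `callbackName` is the global name
// under which `receive` is reachable from the injected script.
function createResolver(doc, proxyUrl, callbackName) {
  var pending = {}; // id -> { callback, original, script }
  var nextId = 0;

  // Assumed to be invoked by the proxy's response: callbackName(id, finalUrl).
  function receive(id, finalUrl) {
    var entry = pending[id];
    if (!entry) return; // unknown id, or already handled
    delete pending[id];
    if (entry.script.parentNode) {
      entry.script.parentNode.removeChild(entry.script); // tidy up the DOM
    }
    entry.callback(finalUrl, entry.original);
  }

  function resolve(url, callback) {
    var id = nextId++;
    var script = doc.createElement('script');
    script.src = proxyUrl +
      '?url=' + encodeURIComponent(url) +
      '&id=' + id +
      '&callback=' + encodeURIComponent(callbackName);
    pending[id] = { callback: callback, original: url, script: script };
    doc.body.appendChild(script); // adding the tag triggers the request
    return id;
  }

  return { resolve: resolve, receive: receive };
}
```

In a page you would do something like window.EndpointSketch = createResolver(document, 'http://proxy.example/resolve', 'EndpointSketch.receive'), where both the global name and the proxy address are placeholders. Keying each request by id is what lets several lookups overlap without their callbacks getting crossed.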
Here you can see it all in action, showing how my Twitter stream will go through and the URLs will dynamically change from their tinyurl versions to whereyouaregoing.com:
I wanted to use App Engine to host the server proxy, but unfortunately I can’t work out how to do that yet. You have access to the URLFetch API to access resources from App Engine. Unfortunately for me, one of its features is that it understands redirects and just follows them through to the full resource itself, with no way to get the endpoint from the headers in the response.
It was also interesting to read Steve Gillmor talking about these services, albeit in a post that is hard to actually understand ;)
Also, Simon Willison just put up a simple service on App Engine, json-time, that “exposes Python’s pytz timezone library over JSON.” I think that we will see a lot of these types of mini-Web services hosted on App Engine. Taking Python utilities and making services from their goodness is an obvious choice.
June 22nd, 2008 at 9:59 am
Endpoint Resolver works great, especially on pages where there aren’t too many URLs. Pages such as http://twitter.com/dalmaer, which is full of links, tend to break. This could get annoying. I remember the annoyance of tweet primal screams breaking pages. I never used tinyurls until Twitter and have since become quite comfortable with them (perhaps too comfortable). Injecting an optional expand toggle like they’ve implemented on http://summize.com might be a better approach in the long run.
June 22nd, 2008 at 3:54 pm
Good work. Only, I would have called it “rick unrolled” :D.
json-time: Yes, we need more json services, security risks notwithstanding. Time is one I recently looked into while writing a clock gadget, since as we all know, JS doesn’t really do much with timezones.
June 22nd, 2008 at 4:03 pm
PS It must be said this is a lot more comprehensible and practical than a certain related article that recently appeared on techcrunch!!!
June 23rd, 2008 at 4:36 am
Lovely! :)
June 23rd, 2008 at 6:12 am
“I would love to be able to just use XMLHttpRequest to dynamically hit the URL and look at the headers, but the same-origin policy stops me.”
It’s a good thing GreaseMonkey provides GM_xmlhttpRequest to get around such a blocker then:
http://wiki.greasespot.net/GM_xmlhttpRequest
:)
June 23rd, 2008 at 2:07 pm
Good script, ty. But I’ve changed it a bit. Long links (for example, links to Google Maps) broke the Twitter design, so I’ve replaced:
newtext = newtext.replace(originalurl, resulturl, "g");
with:
newtext = newtext.replace("href=\"" + originalurl, "href=\"" + resulturl, "g");
newtext = newtext.replace(originalurl, resulturl.length > 30 ? resulturl.substr(0, 30) + "…" : resulturl, "g");
Now it’s better. Hope this will be useful for somebody.
June 24th, 2008 at 4:30 am
Nice work… very helpful
July 6th, 2008 at 11:41 pm
@jeka911
Nice. I tweaked it again and checked it in. I added a title="full url" so if you hover on the … it shows up right there as well as at the bottom of the browser.
Cheers,
Dion
July 6th, 2008 at 11:43 pm
@Dan Brook,
Very true. If you just use it for Twitter, that would be a more efficient way to go. This way it works for everything (including GWT). The server side also does recursive gets to clear through multiple redirects.
Cheers,
Dion
December 9th, 2008 at 9:51 pm
Got pissed off with tinyurls today and remembered this awesome post.
But your server can act a little slow sometimes. I reread the article and saw that App Engine was no good.
Have you thought of AppJet? (http://appjet.com/)
This could be a good example of JavaScript on both sides with their awesome service. I’ll try it sometime, if I find the time…
March 22nd, 2009 at 5:07 pm
Nice work! That minimizes the risk of getting rickrolled ;)