Custom short URLs and WordPress SEO stuff

// May 18th, 2009 // General, Techie

If you use Twitter, chances are you’ve seen or submitted URLs from short URL services like tinyurl.com or bit.ly. Because tweets are limited to so few characters, you simply have to use them to get your point across. The problem with these public services is that most people have no real clue where the link is going to take them. Because of this, some places have started using their own short URL domains; when you see the custom short URL, you know where you’re headed.

So, I wanted to have our own short URL domain for CliqueClack. Domains are cheap, but only if you use one of the primary top-level domains (.com, .net, .us …). If you use domai.nr to find domains, it will even search country-specific domains, many of which can cost hundreds of dollars to register, if you’re allowed to at all! While I would have loved to have cla.ck for our short domain, apparently the Cook Islands are picky about what you do with their domains. So, we settled on clak.us: it’s cheap and it makes sense. Clack Us!

Now, how to translate these short URLs to the posts on our sites. We could spend $100/yr. using awe.sm to do the routing for us, but I thought we could do this on our own. And I was right.

We use WordPress over on CliqueClack, and in WP, at least, each post has an ID #. You can access a post with its ID number rather than the permalink/slug URL (http://www.cliqueclack.com/tv/?p=12345, for example). The post IDs can climb pretty quickly when you’re posting as often as we do and, on top of that, every revision of each post uses up an ID too. For example, on CCTV we’re somewhere around ID #21382, but we have just short of 2000 posts!

So, we can’t realistically access the posts in our short URL domain using the ID # of the post. It could get pretty long, pretty quickly. We could try making random strings for each post and write a plugin that does all of that, puts new entries into the database, etc., but then we’d have to go back through all the old posts too. Plus, it should be easier than that when each post has a unique ID #.

What I decided to do was use the PHP functions hexdec() and dechex() to shorten the post ID to something quite a bit smaller. For example, even if I hit an eight-digit ID like 10000000, in hex that’s 989680, so I dropped off two characters there. For something more realistic for us, like a six-digit ID, that brings you down to five characters: 562816 becomes 89680, which gives http://clak.us/89680. Perfectly acceptable.
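To check the arithmetic above, here is a minimal sketch of the round trip. The function names clak_encode_id() and clak_decode_id() are made up for illustration; only dechex() and hexdec() are PHP built-ins.

```php
<?php
// Hypothetical helpers for illustration (not part of the redirect script).

// Turn a decimal post ID into its shorter hex form.
function clak_encode_id(int $post_id): string {
    return dechex($post_id);
}

// Turn a hex code back into the decimal post ID.
function clak_decode_id(string $code): int {
    return (int) hexdec($code);
}

echo clak_encode_id(10000000), "\n"; // 989680
echo clak_decode_id('89680'), "\n";  // 562816
```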

However, we’re currently not using WPMU, so we’ve got two separate blogs that I’d like clak.us to direct to using IDs, and the same ID can exist on both. How to translate non-unique IDs to the appropriate blog? Sloppy but works: I tack the first character of the blog name onto the front, “t” for TV and “f” for Food. So, with the example above: http://clak.us/t89680. Still not bad, and we’re not even close to six-digit IDs yet anyway.
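Going the other direction (building the link for a given post) could look something like this. It is only a sketch: clak_short_url() and its prefix map are hypothetical names that mirror the “t”/“f” convention described above.

```php
<?php
// Hypothetical helper: build a clak.us link from a blog name and post ID.
function clak_short_url(string $blog, int $post_id): string {
    // Assumed mapping of blog directory to one-letter prefix.
    $prefixes = ['tv' => 't', 'food' => 'f'];
    if (!isset($prefixes[$blog]) || $post_id < 1) {
        throw new InvalidArgumentException('Unknown blog or invalid post ID');
    }
    return 'http://clak.us/' . $prefixes[$blog] . dechex($post_id);
}

echo clak_short_url('tv', 562816), "\n"; // http://clak.us/t89680
```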

Here’s the simple redirect script I use:

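<?php

// Where to append one line per redirect.
$clakus_log = "/path/to/clakus.txt";

// The short code passed in by the rewrite rule, e.g. "t89680".
$path = isset( $_GET['id'] ) ? $_GET['id'] : "";

$blog = "none";

// First character picks the blog; the rest is the hex post ID.
$pre = substr( $path, 0, 1 );
$num = substr( $path, 1 );

switch ($pre) {

 case "t":
 $blog = "tv";
 break;
 case "f":
 $blog = "food";
 break;

}

// hexdec() silently skips non-hex characters, so reject anything that isn't pure hex.
if( !ctype_xdigit( (string) $num ) ) {
 $blog = "none";
}

$hexdec = hexdec( $num );

if( $blog == "none" ) {
 $realpage = "http://www.cliqueclack.com/";
} else {
 // The referer header is optional, so don't assume it's set.
 $referer = isset( $_SERVER['HTTP_REFERER'] ) ? $_SERVER['HTTP_REFERER'] : "";
 $rightnow = date("D M j G:i:s T Y");

 $text = $rightnow . " - " . $blog . " - " . $hexdec . " - ";
 $text .= $referer . "\n";

 // Logging is best-effort: a failed open shouldn't break the redirect.
 $log = fopen( $clakus_log, 'a' );
 if( $log ) {
  fwrite( $log, $text );
  fclose( $log );
 }
 $realpage = "http://www.cliqueclack.com/" . $blog . "/?p=" . $hexdec;
}

header("HTTP/1.1 301 Moved Permanently");
header('Location: ' . $realpage);
exit;

?>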

To get clak.us traffic to route through that script, I have a .htaccess for its landing directory:

RewriteEngine On
RewriteCond %{HTTP_HOST} clak\.us [NC]
RewriteRule ^(.*)$ http://www.cliqueclack.com/clakus.php?id=$1 [R=301,L]

“clakus.php” is the name of the page I gave you the code for above. Simple. Works. The only thing I’d like to work on soon is better logging of how the URLs are being used and who’s clicking on them. Right now the info goes to a simple text file I can check once in a while, which is fine for my needs for now.
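As a first step toward better logging, the text file can at least be tallied. This is a rough sketch, assuming the “date - blog - id - referer” line format the script writes; clak_count_clicks() and the sample lines are made up for illustration.

```php
<?php
// Hypothetical helper: count clicks per "blog/post ID" from log lines.
function clak_count_clicks(array $lines): array {
    $counts = [];
    foreach ($lines as $line) {
        // Strip only the line ending so an empty referer field survives.
        $parts = explode(' - ', rtrim($line, "\r\n"), 4);
        if (count($parts) < 3) {
            continue; // skip blank or malformed lines
        }
        $key = $parts[1] . '/' . $parts[2];
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    arsort($counts); // most-clicked first
    return $counts;
}

// Sample lines invented for this example.
$sample = [
    "Mon May 18 14:03:22 EDT 2009 - tv - 21382 - http://twitter.com/\n",
    "Mon May 18 14:05:10 EDT 2009 - tv - 21382 - \n",
    "Mon May 18 14:07:41 EDT 2009 - food - 1500 - http://example.com/\n",
];
print_r(clak_count_clicks($sample));
```

Against the real log, passing file("/path/to/clakus.txt") in place of $sample should work, since file() returns the lines as an array.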

One change I may make is directly accessing the database to pull the real permalink rather than access via the p=ID method, which would cut down on a redirect. Speaking of…

Lastly, I (so far) highly recommend getting the Permalink Redirect plugin for WordPress to help your SEO/Google juice. Otherwise, if you do what I did above, your redirects will show up as '302' redirects, which Google apparently doesn’t like: it may consider your new tiny URLs to be duplicate content, should it crawl them when they’re linked from elsewhere.

UPDATE: Glad I thought of this sooner rather than later. You can actually make much shorter codes than with hexdec/dechex by using PHP’s base_convert() function instead. I just convert from base 10 (decimal, the post ID) to base 36. Makes for even shorter codes! By the way, to get the code to use for each post, just write a simple plugin that converts the ID of the post to base 36 and places that URL in the post edit list (I placed mine right under the post status). If anyone needs it, I can publish how I did that.
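Here is a small sketch of the base 36 variant for comparison. The helper names clak_encode36() and clak_decode36() are hypothetical; base_convert() is the PHP built-in mentioned above.

```php
<?php
// Hypothetical helpers for illustration.

// Decimal post ID -> base 36 code (digits 0-9, then letters a-z).
function clak_encode36(int $post_id): string {
    return base_convert((string) $post_id, 10, 36);
}

// Base 36 code -> decimal post ID.
function clak_decode36(string $code): int {
    return (int) base_convert($code, 36, 10);
}

echo dechex(21382), "\n";           // 5386 (four characters in hex)
echo clak_encode36(21382), "\n";    // ghy (three characters in base 36)
echo clak_encode36(10000000), "\n"; // 5yc1s
echo clak_decode36('ghy'), "\n";    // 21382
```

Note that the redirect script would then need to decode with base_convert( $num, 36, 10 ) instead of hexdec( $num ). The “t”/“f” prefix still works, since only the first character is treated as the blog marker.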
