Look Ma, I'm Famous!

Featured in The Sentinel

local+news,stoketraffic,the+sentinel,quotation Wednesday, 10th March 2010 at 13:15 No Comments

My quote from a few days ago has been featured in the local newspaper "The Sentinel". You can read the full article about Google Streetview coming to our counties on their website.

This, that, this and that.

Minor Fixes for Tweekly

tweeklyfm,php,twitter,twitter+api Tuesday, 9th March 2010 at 22:34 No Comments

I've just made a few changes to tweekly.fm that fix recently raised issues. There are more coming in the next few days.

  • Better UTF8 implementation
  • Moved issue handling to Google Code
  • Minor typos/spelling
  • Fixed 'Remove my account' page
  • Fixed incorrect weekly totals going out
  • Fixed artist being repeated twice for single artist tweets

Play Nice or Get Blocked.

Really, robots.txt is There for a Reason

robots,web Saturday, 6th March 2010 at 14:31 No Comments

Quite a while back in Internet history, a system called ‘robots.txt’ was introduced to control the content that web crawlers and services are permitted to crawl. The file is a map of which crawlers and bots are allowed to crawl which parts of a site. At tweekly.fm I deal with a lot of user traffic, and the complicated nature of our queries means that it’s quite intensive to generate the user pages.

There are multiple levels of caching, and memcache is there too, but even at peak times we still have to serve pages to users. The most infuriating thing is that during the period each day in which we send tweets out for users, we get bombarded with requests from crawlers and bots. This slows down both our service and our processing, but if the crawlers in question honoured robots.txt as they should, this wouldn’t even be an issue. It wouldn’t even be too bad if the requests were staggered over a period of time.

The worst offenders at first glance are TweetMeme, Radian6, PostRank, TwittUrls, MetaURI, Twingly, Page-Store & Chainn.
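For anyone writing a crawler, honouring robots.txt takes very little code. The sketch below uses Python's standard `urllib.robotparser` module; the rules in `EXAMPLE_ROBOTS_TXT`, the `ExampleBot` name and the helper function names are made up for illustration and are not the real tweekly.fm configuration. Note that `Crawl-delay` is a non-standard extension that only some crawlers respect.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not the actual tweekly.fm robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /user/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this crawler to request the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Seconds to wait between requests: the Crawl-delay if one is set, else a default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default
```

A crawler that calls `may_fetch` before every request and sleeps for `polite_delay` seconds between requests would both stay out of disallowed pages and stagger its traffic.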

Look Ma, I'm Famous!

Tweekly.fm Featured in a Book

tweekly.fm,twitter Friday, 5th March 2010 at 12:01 No Comments

Sometime last week I was talking to Simon about Tweekly.fm and he informed me that Tweekly.fm was going to be featured in a book that's being written by Garin Kilpatrick. Garin’s book will document tools to be used with Twitter and we’re happy that we’re going to be featured. It’s nice to see that the service being provided to end users is a good one and that we’re being recognised as a useful tool.

Caching at Multiple Levels Added

Caching & ADOdb for Tweekly.fm

tweeklyfm,php,twitter,twitter+api,adodb,database Friday, 26th February 2010 at 21:59 No Comments

The best thing about building a project that gets popular is also the worst thing about it: rapid growth. Tweekly.fm is currently gaining 1,500 users per month, most of whom choose to publish on a Sunday. This increases the load and run time of the delivery engine that pushes tweets outbound. The other side of an increased user base is that user pages get more popular. Although this is a good thing overall and shows our popularity, it also has its downsides.

The queries to show simple people and shared by counts are expensive to run. We’re indexing just short of 25,000 user records and running the queries in the original way brought the servers down to a sluggish speed. My response to this was to rewrite the entire user pages and introduce caching on a couple of levels.

The diagram below shows the original connections between our servers and our users. As the diagram shows, there was no caching present through the system. This presented problems under load and ended up with the user pages of the site being temporarily postponed.

Uncached

After a little research and number crunching I came to the conclusion that caching points were needed at both the database server and the website (abstraction layer). Implementing the ADOdb layer underneath the existing database links has proven to be a great success. It’s boosted response times by over 300%. The next diagram shows the new caching points added to the system.

Cached

Point A is the abstraction layer at the site level. At peak times, the queries issued here are cached for an hour. This scales back to 30 minutes at low load times. Point B is the connection between the web server and database server. Queries are cached at the database constantly via the MySQL Query Cache. Finally Point C is the connection to the Twitter API. All calls are cached for 30 minutes where possible, but due to the nature of the API and service in general I’m not currently caching too much here. What is not shown on the diagram is the connection to the last.fm API because this data is automatically cached and updated when needed.
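The idea behind Point A can be sketched in a few lines. This is a minimal in-memory illustration in Python, not the ADOdb code running on the site: the `QueryCache` class, its `get_or_run` method and the `is_peak` flag are all hypothetical names, and only the two cache lifetimes (an hour at peak, 30 minutes at low load) come from the description above.

```python
import time
from typing import Any, Callable, Dict, Tuple

PEAK_TTL_SECONDS = 60 * 60      # one hour at peak times
OFF_PEAK_TTL_SECONDS = 30 * 60  # 30 minutes at low load


class QueryCache:
    """In-memory cache whose entries expire after a load-dependent lifetime."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_run(self, sql: str, run_query: Callable[[str], Any], is_peak: bool) -> Any:
        """Return the cached result for sql, running the query only if it is missing or stale."""
        now = self._clock()
        cached = self._entries.get(sql)
        if cached is not None and cached[0] > now:
            return cached[1]
        result = run_query(sql)
        ttl = PEAK_TTL_SECONDS if is_peak else OFF_PEAK_TTL_SECONDS
        self._entries[sql] = (now + ttl, result)
        return result
```

Keeping results for longer at peak times trades a little freshness for fewer expensive queries exactly when the database is busiest.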

 

Effects of Changes for Users

The effects of these changes will be noticeable by all users because the front-end of the site will be a lot snappier and responses should be quicker too. Hopefully this is just one of many changes that can improve the system for everyone.

Got Traffic News? I'd like You to Share.

Social Traffic Alerting

stoketraffic,stoke,twitter Thursday, 25th February 2010 at 10:14 2 Comments

I spent some time yesterday extending the StokeTraffic platform to include support for eye witness reports provided by users of the service. This now means that the service will include real time eye witness reports of problems and conditions of the roads in and around Stoke on Trent. All users are invited to participate in this extension of the service by using the new feature. If you’re travelling in or around Stoke on Trent and spot a problem then you can alert us by posting a tweet similar to the following:

@stoketraffic rep A500 Extremely busy due to congestion

If your report is accepted it’ll be posted live within a few moments. Please be aware that reports are limited to 100 characters and there are measures in place to prevent abuse of this service. I’m looking for suggestions, feature requests and reports of any issues. Get in contact with me here.
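For the curious, here is a rough sketch of how a report tweet could be checked before it is accepted. It is written in Python for illustration and is not the code running behind StokeTraffic: the `parse_report` function, the `Report` type and the assumed layout `@stoketraffic rep <road> <description>` are my own guesses based on the example tweet, and I have assumed the 100 character limit applies to the text after "rep".

```python
import re
from typing import NamedTuple, Optional

MAX_REPORT_LENGTH = 100  # limit from the post; assumed to cover the text after "rep"

# Assumed layout: "@stoketraffic rep <road> <description>"
_REPORT_PATTERN = re.compile(r"^@stoketraffic\s+rep\s+(\S+)\s+(.+)$", re.IGNORECASE)


class Report(NamedTuple):
    road: str
    description: str


def parse_report(tweet: str) -> Optional[Report]:
    """Return a Report for a well-formed tweet, or None if it should be rejected."""
    match = _REPORT_PATTERN.match(tweet.strip())
    if match is None:
        return None
    road = match.group(1)
    description = " ".join(match.group(2).split())  # collapse runs of whitespace
    if len(f"{road} {description}") > MAX_REPORT_LENGTH:
        return None
    return Report(road=road.upper(), description=description)
```

Anything that does not start with the account name followed by "rep", or that runs over the length limit, is simply rejected. Abuse prevention would be a separate step.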

Bad Robots, Bad.

Tweekly.fm Updates

tweeklyfm,twitter,last+fm Monday, 22nd February 2010 at 03:51 No Comments

I've made a couple of changes to the tweekly.fm service tonight. Firstly, any accounts that have revoked OAuth permission for the app will now be removed in the daily tweet cycle (in turn, lowering our tweet output too). The other change I've made is quite a significant one. When we post out our weekly tweets, search engines and spiders start indexing our site immediately.

This extra traffic along with the usual traffic to user pages pushes our server to the max and that is what results in the 403 errors and server errors. I've now blocked certain spiders and robots from accessing our site. You should notice much more responsiveness when using the site during the time that the system is sending weekly tweets out.
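Blocking by user agent is about the simplest way to do this. The sketch below is a Python illustration of the idea only, not the rules actually in place on tweekly.fm: the fragments in `BLOCKED_AGENT_FRAGMENTS` are invented placeholders, and `is_blocked` and `status_for_request` are hypothetical helper names.

```python
from typing import Iterable

# Hypothetical blocklist; these are placeholders, not the crawlers actually blocked.
BLOCKED_AGENT_FRAGMENTS = ("examplebot", "badspider")


def is_blocked(user_agent: str, blocked: Iterable[str] = BLOCKED_AGENT_FRAGMENTS) -> bool:
    """Return True if the User-Agent header contains any blocked fragment (case-insensitive)."""
    agent = user_agent.lower()
    return any(fragment in agent for fragment in blocked)


def status_for_request(user_agent: str) -> int:
    """HTTP status to answer with: 403 for blocked crawlers, 200 for everyone else."""
    return 403 if is_blocked(user_agent) else 200
```

Rejecting the request this early means a blocked crawler never reaches the expensive user page queries.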

I've also added tweekly.fm to the Wikipedia page of Twitter apps and services.

Sitting around isn't fun.

Complaint to First Group

complaint,first+bus,stoke+on+trent Monday, 22nd February 2010 at 02:21 2 Comments

Dear First:

I’d like to say that the issues raised within this letter have been ongoing for a while and are not a one-off problem. On the 16th February 2010 my partner and I boarded the bus in Hanley to travel back home to Milehouse. We boarded at 15:09 and arrived in Newcastle Bus Station at 15:24. At this point our driver left the bus and closed the door, leaving us sitting on the bus by ourselves.

This is not only a neglect of our safety and your responsibility but a gross inconvenience and waste of our time. As a regular customer I understand that at some points during the day you need to change drivers and this is an accepted fact. What isn’t acceptable however is that a relief driver did not return to the bus until 15:49. At this point, a driver came and asked us to move from the bus that we were on to another bus in the depot. That was a period of 25 minutes that my partner and I had to sit with no indication of the situation. I’d like to know why this happened and what you have done to prevent it happening in the future. I would also like to think that I should be compensated for this event. I’ve attached a copy of one bus ticket from this journey to assist you in tracing the drivers responsible. If I do not receive a sufficient answer from your customer service department then I will be forced to take an alternative route to resolve my issues.

My second issue is with smoking within the enclosed shelters at the bus station. As it is a public place it is illegal for people to smoke there, but they constantly do. Why do you not place signs to signify it is a no smoking zone, and why is this not enforced? As the bus station is run by you, it’s your responsibility to enforce the smoking ban in public places.