The most popular and prolific codebase I’ve created and worked with is the wonderful tweekly.fm, the best way to post Last.fm to Twitter. Each day, the service publishes hundreds of thousands of social media updates to Facebook and Twitter.
The Challenges & Self-DDOSing
The biggest challenge with a system that posts so many updates is that every update contains a link to the user’s profile, which is then visited and spidered. The effect this has on the website and service is massive.
It’s the equivalent of a lower-scale Distributed Denial-of-Service attack.
In late 2013, tweekly.fm had become massively popular. We had users from some of the largest tech companies in the world using our service. Scalability once again became an issue. Towards the end of the year, Last.fm provided us with a beefier hosting solution which eased the load we would be under daily and allowed the service to expand.
To alleviate the DDOS effect, the first few versions of the service would post social updates sequentially, with a predefined sleep between each outbound post. This worked well when there were only a few hundred users onboard, but as the service grew it became impossible to post all social updates within the 24-hour period in which we needed them to go out.
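The sequential approach described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the PHP the service runs on; the names `post_sequentially`, `estimated_hours` and the `post` callback are hypothetical, and the delay value is an assumption, not the one tweekly.fm used.

```python
import time
from typing import Callable, Sequence


def post_sequentially(
    updates: Sequence[str],
    post: Callable[[str], None],
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Post each update in turn, pausing between consecutive posts."""
    sent = 0
    for index, update in enumerate(updates):
        if index > 0:
            # Throttle so the resulting profile visits arrive spread out.
            sleep(delay_seconds)
        post(update)  # hypothetical callback that performs the outbound post
        sent += 1
    return sent


def estimated_hours(
    update_count: int, delay_seconds: float, seconds_per_post: float = 0.0
) -> float:
    """Rough wall-clock time for a full sequential run, in hours."""
    pauses = max(update_count - 1, 0)
    total_seconds = update_count * seconds_per_post + pauses * delay_seconds
    return total_seconds / 3600
```

With an assumed half-second pause, `estimated_hours(200_000, 0.5)` already comes to more than 24 hours before any time spent on the posts themselves, which is why this approach stops working as the user count grows.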
One of the first libraries I used to post Twitter updates was twitter-async by Jaisen Mathai. This introduced me to multi-curl, and from there I discovered rolling-curl. Being able to post multiple tweets at once was a great advancement, but it brought with it the unintended consequence of amplifying the DDOS effect on our servers.
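As a rough illustration of the idea behind multi-curl and rolling-curl (several requests in flight at once, capped at a fixed window), here is a small Python sketch. It uses a standard-library thread pool as a stand-in and is not the twitter-async or rolling-curl API; `post_concurrently`, `post` and `window` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def post_concurrently(
    updates: Sequence[T], post: Callable[[T], R], window: int = 10
) -> List[R]:
    """Run `post` over all updates with at most `window` in flight at once.

    Results come back in the same order as `updates`.
    """
    with ThreadPoolExecutor(max_workers=window) as pool:
        # As soon as one post finishes, the pool starts the next one,
        # which keeps the window full in the same spirit as rolling-curl.
        return list(pool.map(post, updates))
```

The trade-off is the one described above: a larger `window` finishes sooner, but it also means more profile links go live at the same moment, so the resulting visits hit the servers in a tighter burst.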
In January 2013, we moved away from our Last.fm hosted solution to a dedicated server and a dedicated database server. This allowed massive expansion for a short period of time. We quickly ran into more issues with scalability and, more importantly, cost. Nearly 99% of our user base consists of free users who are shown advertising. Server costs were fast overtaking our revenue.
Around October 2013, I discovered iron.io almost by accident. I’d recently begun rewriting tweekly.fm on the excellent Laravel framework. I was testing Laravel 4’s queuing systems and noticed a reference to iron.io. After reading more about the IronMQ product, I came across IronWorker.
The difference IronWorker made for tweekly.fm cannot be overstated. It allows us to create updates, package them up to be sent and then queue them en masse into an IronWorker queue. These are then processed in batches, and an entire day’s updates can be sent out in a matter of minutes.
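The package-and-queue flow can be sketched in a few lines. This is a generic Python illustration, not the IronWorker client: the `enqueue` callback stands in for whatever call hands a payload to the queue, and the batch size and JSON payload shape are assumptions made for the example.

```python
import json
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def enqueue_batches(
    updates: Sequence[dict], batch_size: int, enqueue: Callable[[str], None]
) -> int:
    """Package updates into JSON payloads and hand each one to `enqueue`.

    Returns the number of payloads queued.
    """
    queued = 0
    for batch in chunked(updates, batch_size):
        # Hypothetical payload shape: one JSON document per batch.
        enqueue(json.dumps({"updates": batch}))
        queued += 1
    return queued
```

Because each payload is self-contained, workers can pick batches off the queue and process them in parallel without coordinating with one another.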
Sunday is the busiest day of the week for tweekly.fm. Regularly for a year now, we’ve been pushing out over 200,000 updates. Spread evenly across 24 hours, that’s about 8,333 updates an hour, or roughly 139 a minute. Posting them sequentially would take over 24 hours, and multiple curl calls take around 18 hours; IronWorker takes just over 40 minutes, at a fraction of the cost.
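The arithmetic behind those figures is easy to check. The short Python sketch below uses only the numbers quoted above; the function names are hypothetical, and the 40-minute figure is rounded down from "just over 40 minutes".

```python
from typing import Tuple


def rate_per_hour_and_minute(total_updates: int, window_hours: float) -> Tuple[float, float]:
    """Average updates per hour and per minute over a time window."""
    per_hour = total_updates / window_hours
    return per_hour, per_hour / 60


def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """How many times faster the new run is than the baseline."""
    return baseline_minutes / new_minutes


per_hour, per_minute = rate_per_hour_and_minute(200_000, 24)
ironworker_per_minute = 200_000 / 40           # updates per minute in a 40-minute run
versus_multi_curl = speedup(18 * 60, 40)       # 18 hours compared with 40 minutes
```

On these numbers, a 40-minute run works out to about 5,000 updates a minute, around 27 times faster than the 18-hour multi-curl run.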
I was able to remove one of the servers and save on the hosting cost – this alone reduced our costs by half.
The exceptional service, support and price are worth it alone. Mix that in with the fact that costs were halved, and I’m not too sure how you could look anywhere else when you need to run PHP workers for your large-scale projects.