Black Friday is a very important day for us in Engineering. We see more traffic than on any other day of the year, and it’s a real test of our ability to stay reliable at scale. We started planning months in advance to make sure it would be a quiet night for us, one where we could sit back and watch the graphs with a beer!
Now that we’re out the other side, we’d like to share what we did, what we learnt and how we made sure the night was a success for us and our vendors.
The great thing about predictable big days of load is just that: they’re predictable, with no surprises… in theory. We knew that our platform could comfortably deal with the load it had been under in the past, but Black Friday would be our highest-volume day by far, so we wanted to make sure the platform had been under that level of load in controlled conditions before the big night.
To do this, we used a combination of JMeter and Locust.io to create realistic fake load against the checkout and our order processing systems, so we could stress them up to and beyond the levels we’d expect to see.
We already had New Relic and CloudWatch in place to tell us if we saw any slowdown, so once we had the tests set up we ramped up the load against our production-like test environment and watched closely…
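To give a feel for what "ramping up" can look like, here is a minimal sketch of a stepped load schedule that climbs past an expected peak. It is not our actual test plan: the peak of 1,000 simulated users, the 1.5x overshoot, and the names `Stage`, `build_ramp` and `users_at` are all made up for illustration. It uses only the Python standard library; in Locust, the same idea could be expressed with a custom `LoadTestShape` whose `tick()` method returns the user count for the current moment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    until_s: int  # this stage applies while elapsed time is below this value
    users: int    # target number of simulated users during the stage


def build_ramp(expected_peak: int, overshoot: float, steps: int, step_s: int) -> List[Stage]:
    """Build evenly sized steps that end at expected_peak * overshoot users.

    All inputs are hypothetical; pick values that match your own traffic.
    """
    top = int(expected_peak * overshoot)
    return [
        Stage(until_s=(i + 1) * step_s, users=top * (i + 1) // steps)
        for i in range(steps)
    ]


def users_at(stages: List[Stage], elapsed_s: int) -> Optional[int]:
    """Return the target user count at elapsed_s, or None once the test is over."""
    for stage in stages:
        if elapsed_s < stage.until_s:
            return stage.users
    return None


# Example: five one-minute steps, ending 50% above an assumed peak of 1,000 users.
stages = build_ramp(expected_peak=1000, overshoot=1.5, steps=5, step_s=60)
```

Stepping the load rather than jumping straight to the maximum makes it easier to match a slowdown on the monitoring graphs to the load level that caused it.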
After some scaling of our infrastructure in AWS, we could see that the core Checkout and Order Processing parts of the platform performed well under load beyond what we’d expect on Black Friday, so we looked for any opportunities to optimise further.
Although the platform coped well under extreme load, we saw that there were some reasonably simple changes we could make to increase our order and license processing throughput.
When we’re at our busiest it’s vital that we continue to deliver a great experience to our vendors and their customers, so we focused on reducing the time taken for our Workers to go through their queue of things to process.
After making these changes we re-ran the same test and saw that we’d dramatically improved our ability to process orders from the queue, so we were confident that we’d have a calm and peaceful evening.
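The reasoning behind focusing on Worker throughput can be shown with some back-of-the-envelope arithmetic. The sketch below estimates how long a backlog takes to clear for a given arrival rate and processing rate. Every number in it is invented for illustration and is not a measurement from our platform, and `drain_time_s` is a hypothetical helper rather than anything in our codebase.

```python
import math


def drain_time_s(backlog: int, arrival_per_s: float, workers: int, per_worker_per_s: float) -> float:
    """Estimate the seconds needed to clear a queue backlog.

    Assumes steady arrival and processing rates, which real traffic
    rarely has, so treat the result as a rough guide only.
    """
    capacity = workers * per_worker_per_s  # jobs processed per second in total
    net = capacity - arrival_per_s         # how fast the backlog actually shrinks
    if net <= 0:
        return math.inf                    # workers cannot keep up; the queue never drains
    return backlog / net


# Hypothetical numbers: 6,000 queued jobs, 50 new jobs per second, 10 workers.
before = drain_time_s(6000, arrival_per_s=50, workers=10, per_worker_per_s=8)
after = drain_time_s(6000, arrival_per_s=50, workers=10, per_worker_per_s=12)
print(f"before: {before:.0f}s, after: {after:.0f}s")
```

Because the drain rate depends on the gap between capacity and arrivals, a modest per-Worker improvement can shorten queue time by much more than the improvement itself.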
Our Engineers at Paddle believe strongly that they should support the code they write, so with only a small amount of convincing via free pizza, everyone agreed to stay around on Friday night to watch the platform and address any unexpected issues.
Thanks to all the hard work we put in during the preceding weeks, we were able to sit together and watch our platform handle a huge amount of load without breaking a sweat, delivering a very successful night for our vendors. A great achievement!
By eliminating the predictable problems that come with high-traffic situations, we were able to go into our busiest weekend ever with confidence, and we now know in detail where our Platform flies and where there is room for improvement. Bring on next year…