Last week our product development team learned first-hand why “eating our own dogfood” is a best practice for understanding customer use cases and creating technology that solves real problems.
Crashed without a trace
Monitoring and optimizing AWS deployments requires a robust collection system. So when our watchdog system picked up (in our development environment) that our collectors were crashing once or twice a day, we were a bit concerned. Although we had just added some new features in development, we were confident that the underlying computing resources were sufficient. So we started a debug process to see if there were any error messages. But absolutely nothing came up.
Hey, don’t we have access to Cloudyn?
As a last resort, one of our developers suggested we check our very own Cloudyn reports on the EC2 instances supporting our development environment. So to Cloudyn’s performance monitoring tools we turned, and lo and behold, the reason for the crashes became crystal clear. We had maxed out on both memory and swap.
Our first step in remedying this was to increase swap on this particular machine. However, paging was still occurring far too frequently, and continued crashes showed that this move was insufficient. The Cloudyn report (shown in the screenshot below) confirmed this, so we decided to move to an m1.large instance, which put an end to the crashing. (As a side note, using two m1.mediums instead of one m1.large can be an excellent option where feasible. Our data shows that while pricing for either option is identical, the output is 30% higher with the two m1.mediums.)
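For readers who want to run a similar check by hand, here is a minimal sketch of how memory and swap exhaustion might be detected on a Linux host. This is not how Cloudyn collects its data. The function names (`parse_meminfo`, `memory_pressure`) and the 95% threshold are our own assumptions for illustration, and the code only assumes the standard `/proc/meminfo` text format.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: size in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(info, threshold=0.95):
    """Return memory and swap usage ratios plus an 'exhausted' flag.

    The threshold is a hypothetical cut-off chosen for this example,
    not a value taken from Cloudyn.
    """
    mem_total = info["MemTotal"]
    # Older kernels do not report MemAvailable, so fall back to MemFree.
    mem_available = info.get("MemAvailable", info.get("MemFree", 0))
    swap_total = info.get("SwapTotal", 0)
    swap_free = info.get("SwapFree", 0)

    mem_used = (mem_total - mem_available) / mem_total
    # A host with no swap configured reports a swap usage of 0.0 here.
    swap_used = (swap_total - swap_free) / swap_total if swap_total else 0.0

    # Treat swap as "full" when there is none to fall back on.
    swap_full = swap_total == 0 or swap_used >= threshold
    return {
        "mem_used": mem_used,
        "swap_used": swap_used,
        "exhausted": mem_used >= threshold and swap_full,
    }
```

On a Linux machine you could pass the contents of `/proc/meminfo` to `parse_meminfo` and feed the result to `memory_pressure`. If `exhausted` comes back true around the time of a crash, memory is a likely suspect even when the application logs are empty.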
Be your own customer
I believe the key takeaway for developers and companies is to not just develop and “test” your product. Use it as often as possible in real-life situations. Not only will you be able to pick up on usability and other issues, but you might just discover new value and use cases that you had not thought of before.