In a previous post, we discussed Knewton’s in-house deployment tool, Knewton Crab Stacker (KCS). To summarize, KCS is a command line tool used to make deployment of CloudFormation (one of Amazon’s services) easier. It saves our engineers from banging their heads against their desks when trying to deploy their services.
So what exactly makes KCS such a valuable, can’t-live-without-it tool? In this post, we’ll take a look at some of the many KCS commands that make the lives of a Knewton engineer easier.
Normally, when you want to ssh into an EC2 instance, you have to go through a long and arduous process to find the instance’s public DNS name, then locate your own ssh key for that instance, and then finally type out the command that lets you ssh into the instance. You have to do this every single time you want to ssh. As you may imagine, this gets annoying fast.
To make this whole process simpler, we have a KCS command that does everything for you. All you have to do is specify which stack and target environment you’re trying to ssh into, and then KCS will take care of the rest. It will find the public DNS of the instance, find your ssh key, and finally ssh-es into the box for you. Besides being a huge time saver, my favorite part about this command is that it adds colors to the instance’s terminal. Colors make everything better.
Often while working on a service, we will make modifications to the instance (which we get into by using the awesome KCS ssh command). But when you make modifications, inevitably something gets messed up. No one wants their instance to be messed up, so you have to restart it. This usually involves relaunching the stack the instance is a part of, and twiddling your thumbs while you wait.
Here at Knewton, we like to save time, so we created a command that allows us to essentially restart our instance. We call this command kick.
Underneath the hood, kick gets the address of the instance we want to kick, ssh-es into the instance, and re-runs cfn-init (the command that is first run when the instance is created). This re-downloads the needed resources and configures everything you need using Chef. After kicking an instance, the instance is essentially brand new and ready for more tinkering.
A very common scenario that an engineer comes across is when he’s made a change to a service, has finished testing it locally, and then wants to test it on a real stack. To do this using just CloudFormation, we would have to first upload our new build to S3, then update our stack to use the new build. Updating a stack takes quite a bit of time, anywhere from a couple to ten-plus minutes. That’s a lot of waiting around.
That’s why we invented roundhouse kick. Roundhouse kick does everything you need to update the version of your service without having to relaunch your stack.
Here’s how it works: first, it will upload your build to S3. Next, it will do what we call an in-place update of the stack. Instead of launching new instances as a regular update would do, an in-place update just updates the existing instances. The time saved with the in-place update makes up the majority of the time saved. After updating the stack, KCS will then kick all the instances of the stack, which, in effect, restarts the stack and grabs the new version of the service you uploaded earlier. We like to think we made Chuck Norris proud with roundhouse kick.
“Chuck Norris can upload his new build to S3, update his stack, and kick his stack all at once.”
Grab logs and Failure logs
Sometimes you roundhouse kick your stack too hard and it stops working (there’s no such thing as a soft roundhouse kick). To find out what’s wrong, you have to ssh into the instance and check the logs. But there are many logs. And you’ll probably have forgotten where all of these logs are located.
Don’t worry — KCS has got you covered.
With a simple command, you can get all of the logs from your instance in a nicely bundled tarball. To do this, KCS knows the location of your logs thanks to some coordination with the Chef recipes that set up the logging system. After determining these locations, KCS will then perform an scp command with all the needed arguments to retrieve all the files. Now you can find out why your stack couldn’t handle the roundhouse kick.
What’s Next for KCS?
Even with all the cool commands that KCS has, there’s always room for improvement. People want KCS to run faster, have more features, and be invincible to bugs. When there’s a bug in a new release of KCS (which are unfortunately inevitable), the deployment team gets bombarded with complaints from disgruntled KCS users. We then work to fix everything, and try to get a new release of KCS out. But even when we do release a new KCS, not everyone remembers to upgrade their version and we continue to get complaints. We ask them to check their version and then we find out they aren’t on the latest version. An upgrade then fixes the issue. This is annoying and unnecessary for both KCS users and the deployment team.
To solve this problem, we created KCSServer — the website version of KCS, which has been my baby during my summer internship. Since KCSServer is a website, we don’t have to worry about people having different versions of KCS. We can very easily make changes to KCSServer without having to worry about getting people to install the latest version.
Migrating KCS to a website also provides many other benefits. One of the main issues we wanted to address was the speed of KCS. As a command line tool, KCS is pretty slow. For a command (such as describing a stack), KCS has to make a call to Amazon after determining all the proper credentials, and then once it retrieves the information, it has to output everything in a readable format for the user. With KCSServer, we can make this command much faster by utilizing a cache. A command has to be run once. Then, for all other times the command is run, KCSServer can just retrieve the output from the cache (of course, we update the cache as needed). This reduces the latency of a command from a couple of seconds to milliseconds. Considering that our rapidly-growing team of engineers uses KCS a lot, these seconds saved will quickly become hours, then days of developer time saved.. Another added benefit? With some CSS, we can make KCSServer look a whole lot more pleasant to look at than the dull terminal.
What’s the Take-Away?
Hopefully after reading about how we at Knewton use KCS to maximize our efficiency, you’ll start thinking more about how to eliminate inefficiencies in your own deployment process, or any process for that matter. Hopefully you’ll start asking yourself, “What’s slowing me down at doing my job?” and “What can I do about it?” Then you can go out there and create your own version of KCS. Don’t forget to give it an awesome name.