IT Event Filters, Zendesk Integrations Reduce Incident Pain
The new xMatters release is called Enduro, and the race is on! This blog post explores some of its awesome new features in a bit more detail and puts them into the real(ish)-world context of a leading ice cream delivery service. We’ll highlight IT event management flood control filters, incident notification response options, Zendesk actions in the Flow Designer workflow builder, and more!
Nomad Creamery is the Uber for frozen slab ice cream, combining all the creamy flavors, wonderful toppings, mix-ins, and tasty goodness delivered right to your door. To support the mobile app for ordering, and the logistics and inventory aspects of the ice cream business, Nomad runs a network of backend APIs and services. In keeping with the transformations in the industry, the teams have adopted DevOps practices for building and running these APIs. As the engineering department started to grow, they needed a solution for getting important information into the hands of the people who needed to know, and for enabling those people to quickly drive workflows forward.
The Nomad Creamery IT folks are long-time users of the xMatters services, so it was a natural transition to adopt the xMatters platform for the alerting and integration needs of the DevOps side of the barn. To monitor status and performance of the APIs, the teams decided to run with Runscope, a cloud-based API monitoring and testing tool that can provide detailed information on the performance of the tested APIs.
The Kubernutty Buddies team at Nomad has configured Runscope to monitor the backend APIs used by the delivery app and web interface. Looks like all green across the board... for now.
After running through the xMatters Onboarding course, the team lead for Kubernutty Buddies, Greg Smart, configured the new Runscope to xMatters integration (released in Enduro!) so his team could get handy notifications that include several helpful response options.
The Acknowledge response will, to everyone’s relief, stop the device and group escalations. The other response option is Rerun, which will retrigger the test in Runscope.
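For readers who like to see the logic spelled out, here is a minimal sketch of how the two response options could be dispatched. This is not the code behind the packaged integration: the `Event` class, the `handle_response` function, and the action names are all hypothetical, and we assume (the story doesn't say) that Rerun leaves escalations running.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for a notification event tied to one Runscope test."""
    test_id: str
    escalating: bool = True
    actions: list = field(default_factory=list)


def handle_response(event: Event, response: str) -> Event:
    """Apply a recipient's response option to the event."""
    choice = response.strip().lower()
    if choice == "acknowledge":
        # Acknowledge stops the device and group escalations.
        event.escalating = False
        event.actions.append("stop_escalations")
    elif choice == "rerun":
        # Rerun asks for the test to be triggered again.
        # Assumption: escalations keep running until someone acknowledges.
        event.actions.append(f"rerun_test:{event.test_id}")
    else:
        raise ValueError(f"Unknown response option: {response!r}")
    return event
```

In a real integration the recorded actions would be replaced by calls to the notification platform and to the monitoring tool's trigger mechanism.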
One morning, halfway through his first cup of coffee, Greg was alerted by the sounds coming out of his NOC dashboard.
He remembered the xMatters Enduro release also included a feature to add sounds to the Communications Center for new IT events. As the “critical hit” sounded several times, the whole Runscope dashboard went red. All the APIs started failing. Badly.
Alas, it was The Day DNS Died. APIs and websites all over the net were failing. Fortunately, Runscope and xMatters were still functioning! Normally, such a catastrophic and systemwide failure would also flood all the recipient devices, but the default Event Flood Control kicked in and saved the team from 31 events (and counting)!
Whew. Two cheers for IT event flood control! But that is old news. The Enduro release adds greater control over what triggers a flood, so developers can now fine-tune when the flood controls kick in, right down to the event properties. For the Runscope integration, for example, adding team_name as a criterion would add another level of discernment for what counts as a flood.
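To make the idea concrete, here is a rough sketch of property-based flood filtering. It is an illustration only, not how xMatters implements Event Flood Control: the `FloodFilter` class, the threshold, and the time window are hypothetical, and we assume a rule groups events by the values of the chosen properties and suppresses anything past a limit within a sliding window.

```python
from collections import defaultdict, deque


class FloodFilter:
    """Suppress events once too many with the same property values arrive in a window."""

    def __init__(self, key_properties, max_events, window_seconds):
        self.key_properties = tuple(key_properties)
        self.max_events = max_events
        self.window_seconds = window_seconds
        # Timestamps of recently allowed events, grouped by property values.
        self._recent = defaultdict(deque)

    def allow(self, event: dict, now: float) -> bool:
        """Return True if the event should notify, False if it is part of a flood."""
        key = tuple(event.get(prop) for prop in self.key_properties)
        times = self._recent[key]
        # Forget events that have aged out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) >= self.max_events:
            return False
        times.append(now)
        return True
```

With `key_properties=("integration", "team_name")`, a burst from one team's tests is throttled without silencing another team's first alert. Dropping `team_name` from the key would lump every Runscope event into a single bucket.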
After recovering from the major DNS outage, the social team at Nomad Creamery had a “chat” with the CTO. They asked to be informed of any major updates, as the #somadcreamery hashtag had started trending. Social had been called in to do damage control, and a heads-up would have been helpful. Nomad fans sure are passionate about getting their ice cream, and it’s not fun to trend for failures. The Social team had just moved to Zendesk for task and incident management, so the Kubernuttyiers were stoked to see the new Zendesk actions in the Flow Designer canvas!
Holy cow, not just one, but three new actions: Create a Ticket, Update a Ticket, and Add a Comment. The last of these can be configured to post private or even public comments to the ticket.
After tinkering a bit, Greg built out a couple of useful flows on the canvas for the Runscope notifications.
The first is fired from a response trigger titled Share with Social, which creates a new ticket in Zendesk. The other flow, fired from the event comments trigger, lets the recipient send updates to the Zendesk ticket via comments entered in the mobile app or the xMatters web UI.
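Greg never has to write any of this by hand, but if you are curious what the two flows boil down to, here is a sketch of the request bodies they might send. It assumes the Zendesk Tickets REST API shape as we understand it (a `ticket` object with a nested `comment`, created with a POST to `/api/v2/tickets` and updated with a PUT to `/api/v2/tickets/{id}`); the function names and sample text are hypothetical, and the actual Flow Designer steps may build their requests differently.

```python
import json


def create_ticket_payload(subject: str, body: str) -> dict:
    """Body for the "Share with Social" flow: open a new ticket."""
    return {"ticket": {"subject": subject, "comment": {"body": body}}}


def add_comment_payload(body: str, public: bool = False) -> dict:
    """Body for the event comments flow: append a comment to an existing ticket.

    Comments default to private here; pass public=True to make them visible
    to the ticket's requester.
    """
    return {"ticket": {"comment": {"body": body, "public": public}}}


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(json.dumps(create_ticket_payload(
        "API monitoring alert", "Runscope tests are failing; team is investigating."
    ), indent=2))
    print(json.dumps(add_comment_payload("Root cause looks like DNS."), indent=2))
```

Keeping updates private by default is a design choice in this sketch, so an engineer's in-the-moment notes don't reach customers unless someone opts in.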
The configuration inside the step shows how easy it is for Greg to drag, drop, and click to save time and headaches when services are impacted. He can be left to the task of actually fixing the problem, but also has quick access to pass updates to the social team.
That’s it for this edition of Arcade Tales. We looked at the new packaged Runscope integration and the new rules editor in Event Flood Control, then we saw how simple it is to deliver information across team lines and into Zendesk. And these were just a few of the awesome features our friends in the Great White North have been cooking up. Our latest release is chock-full of cool stuff such as response highlighting in the All Events Report, some sweet User Upload updates, a whole app load of iOS mobile updates, and more. Stay tuned!
Does the Nomad Creamery story resonate with you and how your teams work? Post your feedback in the comments. We’d love to hear how you guys use these features today or plan to in the future! To try xMatters for yourself, race to xMatters Free and use it free forever.