Be The IT Hero With Nagios
Hundreds of thousands of professionals worldwide find Nagios an invaluable tool in their daily work. Our favorite stories are the ones that show how much Nagios has helped an IT admin and their department or company.
Have your own story to tell about how Nagios helped make you an IT Hero?
Reducing Downtime and Increasing Productivity with Nagios
Before our company headquarters implemented Nagios, the IT team was always one step behind trying to fix critical problems in our main processing area (dubbed the “Core”). This area has 23 touch screen monitors powered by thin clients. These screens are crucial for the technicians to easily retrieve and enter donor data. Previously, IT would have no knowledge of a screen being down until someone working in the Core notified us. This all changed with Nagios.
With Nagios monitoring all 23 screens, we can now easily see when a thin client stops responding or loses power, and we can react very quickly. The powerful reporting tools built into Nagios also help us track the overall reliability of the thin clients. Screen downtime has been reduced drastically and Core productivity is up.
Nagios has helped our department drastically, and because we are a non-profit, Nagios seemed like the right direction to go.
Tom Ashworth – NuBlue
Using Strong Infrastructure Monitoring to Help Treat Patients
A few years back I started working as a system and network administrator for a big hospital with around 3,500 employees in the south of the Netherlands. That same year the hospital decided to build an Electronic Patient Document system itself. This was a very big commitment. After years of building, testing, and growing, it proved to be a big success, and our hospital’s patient administration has been paperless since 2008 (the first hospital in the Netherlands to do so!).
The IT department responsible for network and system administration was not big, and the people had to work hard to keep everything running. There was not a lot of time available for research and development; everything was about the now and the problems that had to be solved.
Three other enthusiastic guys and I started to make a plan to do everything we could to improve the reliability of the infrastructure, make it more flexible, and cut costs wherever possible. The goal was to get to the point where we could say that we are a professional data center that could easily compete with a commercial one.
A part of the plan was to use Nagios as a monitoring system. The main reasons we chose Nagios were that Nagios is very flexible, that you have full control over the tool and that you do not have to depend on a supplier and hire expensive consultants. The return on investment is very high. It was important to us that the Open Source community around Nagios is very active.
After we started to use Nagios we got results very quickly. A lot of the simple and time-consuming daily routine tasks were now automated. We started to send text messages for certain critical problems. Known problems that happened daily and needed manual intervention were solved automatically by creating scripts which are run after Nagios detects the problem. A lot of systems in the hospital are connected by interfaces and monitoring these interfaces helped improve the stability and solve problems before the customer noticed that there even was a problem.
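A minimal sketch of what such a self-healing script could look like, written here in Python. It assumes the script is registered as a Nagios event handler and receives the standard $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros as its three arguments. The restart command, the service name and the "act on the third soft attempt" policy are hypothetical illustrations, not the scripts the hospital actually used.

```python
import subprocess
import sys

# Hypothetical fix-up command; replace with whatever repairs the known problem.
FIX_COMMAND = ["systemctl", "restart", "example-interface.service"]


def should_run_fix(state, state_type, attempt, soft_attempt=3):
    """Decide whether the event handler should attempt an automatic fix."""
    if state != "CRITICAL":
        return False  # only act on critical problems
    if state_type == "HARD":
        return True  # Nagios has confirmed the problem after all retries
    # SOFT state: wait until the assumed retry number before intervening
    return state_type == "SOFT" and attempt >= soft_attempt


def handle(argv, run=subprocess.run):
    """Run the fix when the state passed in by Nagios calls for it.

    argv holds the three macro values as strings: state, state type, attempt.
    Returns True when the fix command was invoked.
    """
    state, state_type, attempt = argv[0], argv[1], int(argv[2])
    if should_run_fix(state, state_type, attempt):
        run(FIX_COMMAND, check=False)
        return True
    return False


if __name__ == "__main__" and len(sys.argv) == 4:
    handle(sys.argv[1:])
```

Keeping the decision in its own function makes it possible to test the policy without restarting anything, which matters for scripts that act on production systems unattended.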
We started to use graphs so we could analyze the information we got from our systems. Now we are able to make plans for future storage expansions based on valid figures. We can see at what time a certain server has CPU or memory resource problems and how many users are logged into the databases. We monitor the speed of database access and whether tables need to be expanded. We use the graphs to cut costs of expensive resources like storage and memory for Virtual Servers. No more staring into the darkness, and there is time available for R&D!
Nagios has become an important tool for the IT department of the hospital. It helps us to make sure the doctors and nurses can rely on the automated systems that play such a big role in their daily job. So you can say that Nagios helps us treat our patients!
Roderick Derks – St. Elisabeth Hospital
Solving Problems Before Users Notice Issues
I deployed Nagios for a large retail company where I’m working as a full-time sys admin. Currently there are 53 monitored hosts with 423 services. Before I deployed Nagios there were lots of problems when services would go down and we were informed after the fact by user support, which created a large number of problems.
Now we proactively react to any problem before it affects users. I’m very proud to be a part of such a powerful monitoring system, which is helping us every day. Thank you guys for the great software!
Andrey Mitroshin – Akamit Systems
Power Outages Can’t Stop The National Radio Astronomy Observatory
Power outages happen. They’re a fact of life for IT admins. And when they do, I turn to Nagios to see what’s up, what’s down and how long things have to live.
When a storm knocked out power to our leg of the local substation, our combined monitoring solution of a Sensaphone IMS-400 and Nagios Core alerted me to the power outage. By the time I got my home DSL up again, a quick glance at the Nagios status screen told me one of the UPSes in the router room had already failed. I began the shutdown procedure remotely, relying on Nagios to tell me what was still up. Eventually my remote connection died as the VPN server’s UPS tanked. However, since Nagios sends SMS pages over a separate internet connection, I continued to get status updates on my mobile phone.
I knew from the last glance at my laptop that I had about an hour left on the main UPS. I drove in to the office and discovered 2 other admins had also been alerted and were responding on-site. We ran our shutdown checklist as quickly as possible, with Nagios reporting the status of our datacenter all the while.
When our power came back the next day, we ran the startup checklist, starting with the Nagios server which we used to verify that we hadn’t forgotten anything and that everything came back up smoothly. We had no data loss and we were all heroes!
Thank you, Nagios!
Josh Malone – National Radio Astronomy Observatory
Nagios Helps Empower People, Save Money at ValueClick
I came to ValueClick over 8 years ago to build the network operations organization, which includes monitoring and notification for the company. I was tired of using “enterprise” monitoring solutions which just were not “open” and scalable for a company’s needs. The company started using Netsaint, and quickly moved to Nagios when the upgrade was available.
Nagios has made me a hero within our company, as it’s given us the ability to monitor and send notifications for our real-time ad serving environment, which serves over 5 billion ads per day. We currently have over 15,000 elements within our Nagios configuration, and through automation and great integration with our RT system we have designed and implemented a real-time notification engine for over 30 distinct escalation groups within the company. In addition, thanks to Nagios we are able to empower our more than 300 engineers to develop their own monitoring checks and implement them into the Nagios system, giving us “best of breed” monitoring within the company.
We are most likely one of the largest online service-based companies in the US that does not require a 24X7 on-site team (NOC), thanks to complete automation and “accountable monitoring”. This saves the company money and also puts the alert into the hands of the person responsible for fixing it.
Thank you, Nagios!
Michael Lydon – VP, Network Operations @ ValueClick
Nagios Helps Keep It Cool at Columbus Museum of Art
We employ Nagios in-house because we’re non-profit and the price is right! Nagios has saved the day many times for us. Here are a few examples.
We had servers in a closet due to the lack of a data center. We had a ceiling HVAC unit to cool the room and it would die often. We actually lost a server because of it. Well, we purchased a temperature sensor that works with Nagios and installed it. Within the week the unit failed and we got an alert within 10 minutes because the temperature rose so quickly. We were able to get service for the unit quickly and come up with measures to lower the heat to keep things running. Certainly saved the day!
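For readers curious how a temperature alert like this works, here is a rough Python sketch of the logic behind a Nagios check plugin. It relies only on the standard plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function name, the Celsius thresholds and the way the reading is obtained are all assumptions for illustration; the actual sensor and its plugin are not described in this story.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_temperature(celsius, warn=27.0, crit=32.0):
    """Map a room temperature reading to a Nagios status code and message.

    The thresholds are hypothetical defaults, not values from the museum.
    A reading of None stands for "the sensor did not answer".
    """
    if celsius is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading from sensor"
    if celsius >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif celsius >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after the pipe is performance data, which Nagios can graph.
    message = f"TEMP {label} - {celsius:.1f} C | temp={celsius:.1f};{warn};{crit}"
    return code, message
```

A real plugin would read the sensor, print the message and exit with the returned code; Nagios then turns a WARNING or CRITICAL result into the kind of alert described above.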
We also started monitoring our website with Nagios because it would go down often and we wouldn’t know it. So, with the monitoring in place, we get immediate text messages when it’s down and we can notify our design company to kick it. This has been a life saver.
I don’t think I could do my job without Nagios.
Thomas Deliduka – Columbus Museum of Art
Nagios Empowers Users, Keeps Customers Happy at Sunrise Banks
Prior to having Nagios, if any of our resources were down, we wouldn’t know about it until a user or customer informed us that they were unable to do something. After dealing with several irate end users and customers, we found Nagios. From then on we were notified any time a resource was experiencing an issue, before anyone even noticed. The downside was that, without everyone in the IT Dept being knowledgeable in Linux, Nagios could only be managed by a few people. Then came the release of Nagios XI. It is so easy to use, even the managers can use Nagios now. They are able to add their own devices, alerts, etc. So now they don’t have to rely on the few that know Linux to manage Nagios. That’s how Nagios made me a hero!
Jared Bird – Network Administrator, Sunrise Banks
Nagios Helps AmberWyvern Studios Be Proactive
One of our clients was having difficulty tracking down random network events – disconnects, high latency, improper network usage – across their LAN and WAN. Their IT infrastructure had experienced several upgrades in the 4 1/2 years we have been providing IT support services, including new T1s, new Cisco switches, fiber optic cable run to connect buildings in their CT headquarters, and new domain controllers in CT and in their factory in MA. They wanted, nay, needed a solution that was both cost-effective (especially in this economy) and quick to deploy.
Nagios fit the bill perfectly. Using a custom-built Linux box running Fedora 11, along with NSClient++ and several plug-ins for the Cisco switches, we have provided them unprecedented visibility into their IT infrastructure. Nagios provides a real-time status map of every aspect of their network, both in CT and in MA. Before, we were only able to react to events and anomalies. Now we are proactively monitoring the network.
This yielded results! We found issues in the physical layer of their network: poorly terminated cable drops, several rogue hubs and switches that employees had put in place to share their network port, and poorly configured NICs. Nagios made us a hero in the eyes of our client.
Thank you for creating such an awesome tool.
William Bulman – CEO, AmberWyvern Studios LLC
Nagios Saves New Year’s Eve For Admins at Argentina Economics Ministry
Some admins – including myself – have to work on New Year’s Eve. On December 31st of 2006 things went very wrong with our datacenter. I was at home at 3:00 pm and an SMS arrived. I thought that someone was sending a nice happy New Year message, but I was wrong! Nagios was telling me that the datacenter was going down because the power went out!
Thanks to Nagios I could solve the problem before midnight and celebrate the New Year with my family!
Luis Aparicio, Ministerio de Economia – Argentina