Speakers And Presentations

Speakers and presentations are subject to change at any time.

Ethan Galstad: Founder @ Nagios

Ethan is the creator of Nagios from its earliest days in 1999 under the name “NetSaint”. He currently serves as the President of Nagios Enterprises and is involved in product development, architectural design, world domination strategies, and various management duties that seem to take more time than they should. During his free time he enjoys reading and working on his homestead getaway.

The keynote will cover new developments in Nagios and where things are going in the future. Topics covered will include commercial products, Open Source projects, and community initiatives.

New and Upcoming Nagios Solutions
Ethan will demonstrate new Nagios products and discuss their capabilities and benefits. Upcoming product roadmaps and release schedules will also be announced.

Andrew Widdersheim: Systems Administrator at INetU Managed Hosting

Andrew has been a Systems Administrator at INetU Managed Hosting for the last three years and has worked solely on the Nagios implementation and environment there. As part of his work, he created a one-step build process for a large-scale HA Nagios deployment.

Nagios Is Down And Your Boss Wants To See You
Handling Nagios in a large-scale environment requires the ability to manage releases, scaling issues and failures. The open source community provides many ways to solve these problems. This talk will cover how RPMs, SVN, Pacemaker, DRBD, MySQL, NDO, pnp4nagios, rrdcached, NagioSQL and Nagios Core were used together to provide a system that:

  • Can handle tens of thousands of active service checks
  • Trends performance data
  • Recovers from node failures in under 5 minutes
  • Moves emergency fixes to production through release management in under 2 hours

During this session we will cover overall design issues encountered during implementation and lessons learned in production.

Dave Josephsen: Systems Administrator and Author

2002 Called; They Want Their RRDTool Shell Scripts Back
This will be a short history of visualization options for Nagios, ending with an in-depth introduction of Graphite and some real-world integration advice.

Stop Being Lazy And Write An Event Broker Module Already
This will be a class/instructional talk about writing NEB modules for novice C programmers, or programmers with a passing familiarity with C.

Ludmil Miltchev: Tech @ Nagios

Ludmil is a member of the tech team at Nagios Enterprises. He likes Linux and Open Source, and enjoys testing software and finding bugs.

Bulk Management Of Hosts And Services In Nagios XI
The talk will cover bulk management tools in Nagios XI – including what tools are available and how they can be used. Bulk import/cloning of hosts and services will be covered, along with bulk user management tools.

Yancy Ribbens: Tech @ Nagios

Yancy is a member of the Technical Support Team at Nagios Enterprises. He holds a degree in computer networking, will be finishing his second degree, in mathematics, this year, and plans to pursue further education in the area of Computer Science. While working his way through school, Yancy has gained nearly 5 years of industry experience, providing technical support and contributing to development projects. Yancy is an avid cyclist and former cat3 racer, who enjoys long bike rides and running with his favorite dog, Chinook.

Automatic Deployment And Management Of Windows Agents
The presentation will demonstrate the usage of a new Nagios utility that can automatically deploy monitoring agents to Windows boxes with minimal work on the part of the administrator. This tool can be used for deploying new agents or updating existing agents across multiple machines – in or outside of a domain.

Advanced Windows Monitoring With WMI, Powershell and VBscript
The presentation will cover the basics of writing Nagios plugins in Powershell and VBscript in order to extend the Windows monitoring capabilities of Nagios. The presentation will cover agentless monitoring of Windows machines using WMI.

Scott Wilkerson: Developer @ Nagios

Scott is the Technical Support Manager and Developer at Nagios Enterprises. He has a degree in Computer Programming as well as almost 20 years of experience in the IT industry, 12 in senior management. Away from work he enjoys vacationing with his family, hunting & fishing.

Passive Monitoring Solutions For Remote Networks And Mobile Devices
The presentation will cover the installation, usage, and benefits of NRDS – a cross-platform (Windows, Linux, *NIX, OSX) passive agent for Nagios with automatic update support and support for remote configuration management.

Nicholas Scott: Developer @ Nagios

Nick Scott is a member of the tech team at Nagios Enterprises. He attends the University of Minnesota studying Electrical and Computer Engineering. In his free time he enjoys hacking in Python, taking things out of context and analytical number theory.

Advanced Data Analytics For Nagios
The presentation will cover advanced data analytics – including capacity planning, trend prediction, event correlation – and the tools that can be used for analysis.

Netflow Monitoring and SNMP Trap Management With Nagios
The presentation will cover monitoring Netflow data and SNMP traps using Nagios Network Analyzer and NSTI.

Sam Lansing: Tech @ Nagios

Sam is a member of the tech team at Nagios Enterprises. He has been a geek since he could pick up a screwdriver and open a desktop case, and is currently attending Minneapolis Community and Technical College for a portion of his Software Development degree. Most of his time aside from work is spent with one part studying, one part gaming and one part hanging out with roommates and family.

Automating Windows Application Testing and Problem Resolution With Nagios
The presentation will cover the method for automating application testing on Windows machines, as well as proactive problem resolution of Windows services and applications using event handlers.

Mike Guthrie: Lead Developer @ Nagios

Mike Guthrie is the lead developer at Nagios Enterprises and has developed new features and addons for Nagios Core, Nagios XI, and Nagios Fusion. Mike does the bulk of his programming in PHP and particularly enjoys front-end web development and data visualizations. When he’s not at work he enjoys spending time with his family, being outside, and working on his house.

Nagios XI 2012
This presentation will cover new features and capabilities in the new 2012 release of Nagios XI, including: improved auto-discovery, scheduled reports, bulk management tools, and new components and wizards.

Nagios Fusion 2012
This presentation will cover new features and capabilities in the new 2012 release of Nagios Fusion – a product designed to allow for greater scalability of distributed setups than other methods. The presentation will also cover challenges with other distributed monitoring solutions (such as DNX and ModGearman) and how Fusion helps overcome these problems.

Nathan Broderick: CEO/Founder @ AI Vector

Nathan is the founder and CEO of AI Vector LLC – a company which specializes in system administration and monitoring for companies both large and small. AI Vector has set up monitoring for several companies, including Nu Skin International, to monitor their systems locally and overseas.

Bringing Nagios XI Into Your Business
This presentation will cover the best way to migrate to Nagios XI, as well as many of the benefits that XI brings you over Nagios Core.

Eric Loyd: CEO @ Bitnetix Incorporated

Eric Loyd is the founder and CEO of Bitnetix Incorporated – an IT consulting and systems integration company dedicated to providing highly customizable VOIP telephone systems and expert advice in all areas of technology, IT, and computers to not-for-profit organizations and small businesses. Mr. Loyd has 25 years of IT experience and 10 years in senior management, and the Nagios platform he designed for monitoring the kodak.com server farm in 2004 is still in use today. He is also a musician, photographer, amateur astronomer, and aspiring serial entrepreneur.

Nagios Implementation Case: Eastman Kodak
This presentation will cover a Nagios implementation that was done at Eastman Kodak. It could alternatively be titled “How Predictive Failure Recovery let our support staff get a full night’s sleep.” This talk will demonstrate how Kodak implemented Predictive Failure Recovery to increase the quality of life of its overworked support staff by decreasing the number of on-call pages received in the middle of the night (and day). Topics covered will include:

  • A history of Eastman Kodak’s kodak.com web server infrastructure
  • Why Kodak chose Nagios to monitor kodak.com
  • What the initial hurdles were in this complex server environment
  • How we leveraged SSH to solve remote server issues
  • How we manage Nagios configuration files
  • Using a common event handler
  • Integrating Nagios into Operational Procedures
  • Using SSH to execute checks on remote machines as active checks instead of passive checks
  • Using cfengine, rsync, and makefiles to manage, distribute, and update configuration files
  • How a naming scheme for machine names, service names, and parameters included in those names allows a single event handler to do ‘heavy lifting’ for complex tasks with a simple API
  • Using custom scripts to directly manipulate the Nagios command file, thus implementing a basic Nagios API for other operational procedures to simplify overall system operations and application support

The talk will focus on techniques regarding notification, escalation, and automatic event handler processing that enabled Nagios to solve most of the routine problems that would occur prior to notifying operations staff. This allowed us to dramatically decrease pages at all hours (but especially in the middle of the night) and let our staff be more restful and more productive as a result.
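
As an illustration of the command-file technique listed above, here is a minimal Python sketch (not the scripts used at Kodak). It relies on the Nagios Core external command format, in which each command is a single line of the form [timestamp] COMMAND;arg1;arg2 written to the command file. The host name web01, the service name HTTP, and the helper names are hypothetical.

```python
import time


def format_external_command(name, *args, now=None):
    """Build one Nagios external command line: "[epoch] NAME;arg1;arg2"."""
    timestamp = int(time.time() if now is None else now)
    fields = ";".join([name, *map(str, args)])
    return f"[{timestamp}] {fields}\n"


def submit(command_line, command_file):
    """Write a command line to the Nagios command file (a named pipe).

    command_file is whatever path the command_file directive in
    nagios.cfg points to; it is passed in rather than assumed here.
    """
    with open(command_file, "w") as pipe:
        pipe.write(command_line)


# Example: silence notifications for one service during a deployment.
# "web01" and "HTTP" are hypothetical host and service names.
line = format_external_command("DISABLE_SVC_NOTIFICATIONS", "web01", "HTTP")
```

Wrapping commands like this in small helpers is one way other operational procedures could drive Nagios without knowing the line format.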

Andreas Ericsson: System Designer @ op5

Andreas is one of the Nagios Core developers. He started fiddling with Nagios back in 2003. In his words: “In 2009 I was granted the highest honor available in this community when I became one of the core developers in 2009. Sort of like being knighted in the days of yore 😉 In my spare time I like to work out, play beachvolleyball, kick back with some playstation gaming or read fantasy books. All of it can, and often is, combined with hanging out with my friends or my girlfriend. I work as a system designer and programmer. If I could choose profession though, I’d be a gardener in the summertime, a stunt helicopter pilot when there’s need for that and a programmer during the long, cold swedish winters.”

Nagios Core Worker Processes
This presentation will provide an in-depth look at how Nagios Core worker processes work, the APIs that make them tick and how additional addons can be quickly developed using those APIs.

Redundant and Load-Balanced Monitoring With Merlin
This presentation will provide a technical explanation of how Merlin works and why everyone should want to use it.

John Murphy: Server Engineer @ Kmart Australia

John has worked at Kmart Australia for the last three years as a Server Engineer and is responsible for the deployment and maintenance of the company’s Nagios installation. John is also a current Nagios community MVP and all-around IT enthusiast.

Rational Configuration Design To Prevent Irrational Problem Solving
This session is aimed at discussing configuration and automation design considerations for a manageable and scalable Nagios monitoring solution. Included will be the following topics:

  • Designing your config to reduce administrative overhead.
  • Accounting for change and new integration tasks.
  • When to acknowledge you’ve gone one band-aid fix too far.
  • Using your network infrastructure setup to take config automation shortcuts.

Troy Lea: Senior VAULT Infrastructure Engineer @ Strategic Group

Troy Lea is a self-described jack of all trades. His background is in Microsoft products starting in 1995 with DOS and Windows 3.11. From Troy: I have worked for various IT consulting companies supporting small businesses who needed IT support and have also spent some time working for a dedicated IT department in the engineering sector. In 2006 I implemented a hosted environment using VMware ESX and Windows 2003 / Exchange 2003 based on the “Microsoft Hosted Messaging And Collaboration” solution. These days everyone calls this cloud computing. This is the environment I continue to maintain and upgrade. In 2009 when looking at monitoring products for our hosted environment we came across Nagios XI. This is where I saw how flexible the product was and I started creating wizards for Nagios XI. Since then I have also created documentation, plugins and components for Nagios XI. To an administrator that is new to Nagios XI it can get a bit confusing trying to learn how to configure Nagios. You might download a plugin from the Nagios Exchange, but what is the next step? When I work out how to monitor xyz device in my environment I then am able to turn around and write a Nagios XI wizard for that xyz device. When someone downloads this wizard from the Nagios Exchange all they need to do is step through the wizard and after a few mouse clicks they are monitoring the same xyz device in their environment. This is where my passion lies: in creating configuration wizards, making things easier for other administrators out there! I also enjoy a good game of darts, music festivals and social dynamics.

Custom Wizards, Components and Dashlets in Nagios XI
This session will cover how to customize and write your own configuration wizards, components, and dashlets for Nagios XI. As part of the presentation, Troy will cover several of the XI projects he has created thus far.

Daniel Wittenberg: System Architect @ Insurance Giant

Dan has been a Unix/Linux admin for over 15 years and a Netsaint / Nagios user for almost 10 of those. Having worked as a consultant in many industries has given him a broad range of monitoring expertise. He has written many custom plugins and event brokers and also contributed patches and updates to many others. He is also an avid open source promoter and uses it whenever possible.

Scaling Nagios Core at a Fortune 50 Company
In this presentation we will be covering all aspects of Nagios needed to scale to a large environment with over 35,000 devices and 1.4 million service checks. We will look at hardware, operating system, Nagios Core, plugins, and configurations that you can use in a large scale deployment. We’ll also cover some performance tuning guidelines that you can use to find some of your bottlenecks and where you can look to improve your configuration. Some plugins covered include PNP4Nagios, NSCA, NRPE, DNX, Livestatus / Multisite, Puppet, Splunk, and Cacti.

Nathan Vonnahme: Sr. Software Engineer @ Banner Health

Nathan has been using Nagios since 2006 to monitor highly available IT systems at a hospital in Alaska. He likes hooking heterogeneous systems together to make them greater than the sum of their parts.

Monitoring The User Experience For Availability and Performance
Monitoring the actual user experience can be an important addition to your monitoring coverage. By measuring the availability and performance of your “finished product” you can benchmark the effects of production changes and catch problems that may not be covered by lower-level monitoring. This talk will demonstrate frontend monitoring of legacy Windows apps delivered via Citrix as well as front-end web app monitoring.

Writing Custom Nagios Plugins
Custom Nagios plugins are easy to write and they open up huge possibilities for Nagios. This demo/workshop will walk you through writing your first custom check script. We will start using Perl and the Nagios::Plugin toolkit, but move on to examples in other scripting environments including AutoIt and Node.js. Bring your favorite language and editor!
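
To give a sense of how small a check script can be, here is a minimal sketch in Python (the workshop itself starts with Perl and Nagios::Plugin, so this is only an illustration). It follows the standard plugin conventions: the exit code carries the state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the single output line may carry performance data after a pipe character. The LOAD label, the thresholds and the measured value are hypothetical placeholders.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a state; higher values are worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(name, value, warn, crit):
    """Return (state, output line) with perfdata as label=value;warn;crit."""
    state = evaluate(value, warn, crit)
    line = (f"{name} {LABELS[state]} - value is {value} "
            f"| {name.lower()}={value};{warn};{crit}")
    return state, line


def main():
    # Hypothetical measurement; a real check would query the system here.
    state, line = format_output("LOAD", 0.42, 4.0, 8.0)
    print(line)
    return state

# A real plugin script would end with: sys.exit(main())
```

Nagios reads only the exit code and the printed line, which is why the same pattern carries over to any language that can print and exit.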

Todd Groten: Sr. Network Operations Engineer @ Activision Blizzard

Operations Engineer by day, gamer by night. Todd Groten has been active in the IT industry for over 28 years, with 15 of those in the specialty of system/application monitoring. He was a faithful NetSaint user for a period of time, when he stepped away from the Linux world to become engrossed in the Windows Server monitoring universe, eventually specializing in Systems Center Operations Manager, from its humble NetIQ beginnings, while working at Dell in Round Rock, TX, through all its various iterations to date, working for Microsoft in Redmond, WA. Todd was then asked to jump back into the Linux world and relocated from Redmond back to sunny Southern California to join Activision’s Call of Duty: Elite operations team, as the lead Service Reliability Engineer, where he created the Elite Service monitoring infrastructure from the ground up, using Nagios XI. When not writing plugins, administering Nagios, leading the NOC team or architecting the next piece of CoD: Elite infrastructure, Todd likes to play Modern Warfare 3, geocache, practice studio photography and sometimes even go gold prospecting.

Case Study: Monitoring Call of Duty: Elite …or How To Dynamically Scale Monitoring in the Cloud
This case study covers monitoring Activision’s Call of Duty: Elite, a high-traffic, highly scalable, deep stats-tracking social network and web service for all Call of Duty gamers. It explains how we monitor our dynamically scaled servers in the cloud and details the trials & tribulations of automating the adding & removing of servers in Nagios for that kind of highly fluid environment.

Robert Bolton: Systems Administrator @ University of Utah

For the past eight years Robert V. Bolton has worked as a system administrator, focusing on IT infrastructure monitoring and automation. He specializes in the following Open Source tools: Nagios, Cacti, and Cobbler. Currently he works for the Center for High Performance Computing at the University of Utah as a Linux and network administrator. Robert is also pursuing his bachelor’s degree in computer engineering from the University of Utah.

Custom SNMP OID Creation for System Monitoring
SNMP provides an excellent means of gathering system information. Tools such as Nagios and Cacti use this information for status monitoring and trending. However, some system information is not provided through SNMP. In these situations both Nagios and Cacti have methods to obtain this data, but it is not easily shared between the two applications. This presentation introduces a method that uses Python and the snmp_passpersist module to create a custom branch in the OID tree structure that can be easily accessed by both Nagios and Cacti.

John Sellens: System Administrator @ SYONEX

John Sellens has been a system administrator for over 25 years, and has been teaching tutorials on Nagios and monitoring since 2001. He has implemented Nagios in many different environments over the years, and has created or “improved” more plugins and related tools than he can remember.

Nagios and Another Layer of Indirection
There’s an old saying that “you can solve any problem in computer science with another layer of indirection”. That goes double for Nagios. Because of the modular design of Nagios, it is easy to extend it in many ways, using a number of mechanisms. Many of these mechanisms make use of commands external to the Nagios core (plugins, notifications, etc.) and provide ample opportunity for applying “another level of indirection” for increased functionality. In this talk we will look at ways in which Nagios can be extended through indirection, with examples including tried and true techniques as well as new tools, including:

  • the use of “negate”, “check_snmpexec”, “check_allstorage”, and other plugins
  • “mb_divert” for diverting plugin execution to other servers/locations
  • “tellito” for routing notifications over multiple mechanisms
  • “genoa” for formatting and composing notifications depending on the content

Workshop: “Non-Obvious” Nagios (2 Hours)
The Nagios system and network monitoring engine is very well known, and used in many networks, large and small. Basic installation and initial configuration of Nagios is usually fairly straightforward, but making use of some of the more advanced or obscure aspects of Nagios can make your monitoring much more effective. This class will cover some of the more advanced features and abilities of Nagios and its related tools, which are especially useful in larger or more complex environments, or for higher degrees of automation or integration with other systems. The class will provide an introduction to installing and using Nagios, some of the most popular extensions available, and information on customizing Nagios in your own environment.

Marcel Hecko: Solutions Specialist @ blava.net

Marcel holds a master’s degree in industrial and product design from the Robert Gordon University in Scotland. He was an early adopter of wifi networks in Slovakia and a founding member of one of the first community-driven wifi networks in the world. A life hacker, visionary and community-oriented person, he has developed wifi networks for deprived areas in Scotland, UK and Bratislava, Slovakia. He received an Open Source Excellence Award from the University of Edinburgh for the use of open source technologies in a government environment and for being an active member of the open source community. He was an early member of Slovakia’s first hackerspace, Progressbar. He is currently a freelance consultant for open source, and especially open source monitoring solutions, for the banking sector in Central Europe, including Slovakia, Ukraine and Poland. He is the developer of NagMap, a popular Nagios extension for network visualization.

Importance of Visual Representation of Monitoring Data
NagMap has been a very popular visual extension to Nagios over the last several years. It visualizes data so that it is easily readable and users can make quick decisions without technical digging. This talk asks why we forget about helpdesk staff, why customers are frustrated when they cannot see the actual status of the environment, why NagMap has been more successful than other visualization tools, and what can be improved. The goal is to get Nagios users more interested in extending their installations according to their visual needs, and to stress the importance of visual perception in an operations environment. Seeing is believing.

Jason Cook: Systems Engineer @ Verisign

I have worked as a technologist and various system administration type roles for over 15 years with a focus on web technologies, systems automation, and systems management. My history with Nagios goes back to Netsaint, and my current responsibilities include the design and support of Verisign’s Nagios infrastructure.

Nagios and Mod Gearman In A Large-Scale Environment
An examination of Nagios, Mod Gearman, and Merlin in a large-scale environment. Discussions will include performance tuning, scaling considerations, performance on VMs, and challenges encountered along the way.

Nicolas Brousse: Lead Operations Engineer @ TubeMogul

Nicolas is a Lead Operations Engineer at TubeMogul. He has worked for the past 12 years at many industry-leading French start-ups, from web hosting to online video services, including Multimania, Lycos and Kewego. Nicolas gained experience working with heavy traffic and large user databases. Since joining TubeMogul – a Brand-Focused Video Marketing Company – he has helped grow the infrastructure from 20 servers to over 700 servers and handle over 10 billion HTTP requests per day.

Optimizing your Monitoring and Trending tools for the Cloud
Nowadays most start-ups use cloud solutions. While some will move from a public cloud to a hybrid solution, all have to deal with fast-growing infrastructure. In this presentation, I will go over a few solutions that we implemented at TubeMogul while growing from 20 servers to over 700 servers in 4 years and dealing with over 10 billion HTTP requests a day. With so much information and data every day, it’s hard to get a good read on what really matters and to alert the right person. You will learn how we integrated Nagios with Google Calendar for easy on-call rotation management, how we centralized our Nagios information from 5 different data centers into a common dashboard, and how we produce daily maintenance reports for proactive action.

Luis Contreras: Nagios Community Leader – Dominican Republic

Luis Contreras works as an SAP Basis Netweaver Administrator. He is involved in several open source organizations: he is the Nagios Community Leader for the Dominican Republic, Linux Certification Coordinator for Latinux from Venezuela, and Ambassador for the Dominican Republic to CISL (Congreso Internacional de Software Libre) from Argentina. He has worked as a System Administrator for major companies such as IKEA in the Dominican Republic. Since 2005 he has given talks about Linux, Nagios and Linux security at several universities in his country. He’s also working on founding his own company to offer Linux & Nagios support.

Success Cases: Nagios Implementations in the Dominican Republic
This presentation will cover three success cases of Nagios implementations and how each case helped the business in saving money and being more efficient in solving problems.

Dave Williams: Technical Architect @ Bull Information Systems Ltd

I have worked in the IT industry for over 35 years, mostly as a Software Engineer or Sysadmin on systems from IBM Mainframes to Apple II. Currently designing and implementing technology solutions involving networking / storage / computing resources for numerous clients across Europe.

Experiences with Embedding Nagios on RaspberryPi
Given the increase of interest in micro format systems this presentation will describe an implementation of Nagios on the RaspberryPi – a credit card sized $25 computer. This will also demonstrate the ability to cluster these devices for horizontal scaling of capacity. At least one device will be on show for demonstrations.

Bryan McLellan: Technical Program Manager, Open Source @ Opscode

As one of the original Chef developers, I now manage open source projects for Opscode. I have over a decade of experience in IT and Web Operations, and managed Nagios installations for many of those years.

Configuring Nagios With Chef
Don’t let your Nagios configurations get out of date while you’re busy building out more systems. I’ll show how to use search-based infrastructure with Chef to automatically configure Nagios.

Sheeri Cabral: Database Administrator @ Mozilla

Sheeri K. Cabral has a master’s degree in computer science specializing in databases from Brandeis University and a background in systems administration. Unstoppable as a volunteer and activist since age 14, Cabral founded and organizes the Boston, Massachusetts, USA, MySQL User Group and is the creator and co-host of OurSQLCast: The MySQL Database Community Podcast, available on iTunes. She was the first MySQL Oracle ACE Director, and is the founder (and current treasurer) of Technocation, Inc., a not-for-profit organization providing resources and educational grants for IT professionals. She wrote the MySQL Administrator’s Bible and has been a technical editor for high-profile O’Reilly books such as High Performance MySQL 2nd Edition and CJ Date’s SQL and Relational Theory.

Alerting on MySQL with Nagios
There are many Nagios plugins for MySQL, but the plugin set covered in this session optimizes statistics gathering, so that adding variable checks does not add a linear amount of database stress. This session will walk you through how to use and extend this set of Nagios plugins to more effectively monitor your MySQL instance. There is a whitepaper at http://dev.palominodb.com/docs/nagios_plugins_2011_04.pdf

Alex Solomon: Co-Founder and CEO @ PagerDuty

Alex Solomon is the co-founder and CEO of PagerDuty, the leading SaaS incident management solution which allows IT organizations to add on-call team scheduling and phone/sms/email alerts to most monitoring systems, including Nagios.

Managing Your Heroes: The People Aspect of Monitoring
While solutions like Nagios give IT organizations amazing detail, from monitoring an entire IT infrastructure to immediate notification when system failures occur, it is still up to the people within that organization to respond – YOUR Heroes.

In this session, Alex Solomon, co-founder and CEO of PagerDuty, will examine:

  • The elements that are needed to create a great on-call team
  • Best practices related to scheduling on-call teams to respond to outages
  • The importance of escalated alerts through SMS, email and phone
  • Scheduling of teams within different organizations, time-zones and internationally
  • Integrating 3rd party applications into Nagios via an API to extend and tailor Nagios to fit your needs

Mike Weber: Lead Trainer @ SpiderTools.com

Mike is the lead trainer at CyberMontana Inc. – the company behind SpiderTools.com. The company was established 13 years ago with the goal of providing Linux training for a reasonable cost. The company trains up to 200 people a week using virtual classrooms and live on site training including Nagios Basic and Advanced classes.

Object Inheritance: The Foundation of Nagios Management
This session will cover the basics of how objects are inherited in Nagios including: local vs. inherited variables, chaining, precedence in multiple sources, incomplete object definitions, custom variables, canceling inheritance, additive inheritance and hostgroups as an illustration.

Monitoring Linux with NRPE
The standard for monitoring Linux is NRPE. We will examine basic use of NRPE and how to modify commands and use scripts that can be incorporated into the use of NRPE. We will also look at the various agents to help understand the differences and advantages of using these agents.

High Availability For Nagios XI
When bad stuff happens it is always nice to have a failover option with Nagios. We will look at one option for creating HA. Goals of this project:

  • create master/slave relationship
  • master is normally functioning XI which sends history to the slave
  • slave does not check hosts or services, nor does it send notifications (when master is running)
  • slave monitors master and when master fails slave takes over with current history
  • slave enables host and service checks when it becomes the master and enables notifications
  • slave disables all checks and notifications when master comes online
  • slave updates history of master when master comes online

10 Quick Steps To Disaster With Nagios
We will look at a few topics that can make your life miserable, make you look bad and make you wish you were a Windows administrator. The goal will be not only to point out some of these but also to talk about how to fix or avoid these issues to help in Nagios performance and functionality.

  • inheriting aberrations with objects
  • hoping bad things won’t happen
  • monitoring nonexistent ports on switches
  • ignoring/encouraging system warnings
  • mangling users and contacts
  • encouraging non-accountability for changes
  • abusing Nagios XI wizards
  • disregarding network relationships
  • importing infectious diseases
  • overestimating human intelligence

Anders Haal: Founder and Sr Consultant @ Ingby

Anders Haal is an IT professional with a focus on project management and software development in the area of IT and business operations. He is one of the founders of the consultancy company Ingenjörsbyn AB, a company with a focus on business and software development. Anders is involved in two open source projects, bischeck (www.bischeck.org) and socbox (gforge.ingby.com), both related to surveillance and monitoring with Nagios. Anders holds an MsEE degree from the Royal Institute of Technology in Stockholm, Sweden. In his spare time he likes biking, snowboarding, telemark skiing and golf together with family and friends.

Why Dynamic and Adaptive Thresholds Matter
The presentation will cover why dynamic and adaptive thresholds are key when monitoring something a bit more complex than whether 80% of disk space is used. The presentation will go through a number of use cases where threshold logic needs to be:

  • Based on historical data
  • Aggregated from the performance data and/or state of multiple services
  • Adaptive when the threshold is process related
  • Based on the time of day and other calendar profiles
  • Expressed as a mathematical function based on series of data to calculate average, derivative, sum, etc.

In the presentation we will show how this has been achieved using the Bischeck open source framework, www.bischeck.org, integrated with Nagios.
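
As a generic illustration of a threshold based on historical data and the time of day, the following Python sketch compares a new measurement with the mean and standard deviation of past values from the same hour. It does not use Bischeck's configuration or API; the function names and the sigma multipliers are assumptions chosen for the example.

```python
from datetime import datetime
from statistics import mean, pstdev

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin return codes


def hourly_baseline(history, hour):
    """Mean and standard deviation of past values observed in the given hour of day."""
    values = [v for ts, v in history if ts.hour == hour]
    if not values:
        raise ValueError(f"no history for hour {hour}")
    return mean(values), pstdev(values)


def check(value, when, history, warn_sigma=2.0, crit_sigma=3.0):
    """Classify a measurement by its distance from the baseline for that hour."""
    base, spread = hourly_baseline(history, when.hour)
    deviation = abs(value - base)
    if deviation > crit_sigma * spread:
        return CRITICAL
    if deviation > warn_sigma * spread:
        return WARNING
    return OK
```

Here `history` is a list of `(datetime, value)` pairs. A static threshold would flag the same value at 3 a.m. and at 9 a.m. alike, whereas this check adapts to what is normal for each hour.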

Fernando Honig: System Administrator @ Intel

Fernando is a young, enthusiastic monitoring professional. He has 10 years of Linux system administration and MySQL DBA experience and has worked for EDS and IBM during the last 5 years.

Nagios Distributed Monitoring for Web Applications using WebInject
This presentation will cover how we have integrated Nagios and WebInject into a distributed monitoring infrastructure on AWS EC2. Event handlers trigger extra checks from different locations, with a threshold set for each location and a status fired for each situation. We also call the APIs of other monitoring tools, such as Gomez Networks and Pingdom, and use them as feeds. For notifications we integrated CallWithUS (for VoIP calls) and Clickatell (for SMS alerts).

Kishore Jalleda: Director Of Operations @ IMVU

8+ years of experience running large-scale 24×7 web operations using Nagios for monitoring and alerting. Currently working as “Director of Operations” for IMVU ( http://www.imvu.com/about/ ). Really enjoy playing sports like tennis, cricket, ping pong and volleyball. Love fast cars :).

Nagios in the Agile/Devops/Continuous Deployment World
Running Nagios in an Agile Continuous Deployment (CD/CI) shop like IMVU (see: http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day/) can be different from running it in a more traditional software development shop. The biggest challenge comes from the fact that things change very quickly, break frequently, go silently unnoticed, features add up instantly, failures cascade without much warning, etc. This means one really needs to adapt and tune their monitoring and alerting processes, procedures and infrastructure to meet the demands of being truly agile.

This presentation is aimed at exposing how IMVU uses Nagios as its core monitoring and alerting solution for running and enabling our $50M+ online business. We monitor everything from system level metrics like CPU to application metrics like how fast a counter is expected to increment for a new feature that was rolled out to 1% of customers to business level metrics like revenue, registrations, logins / sec and much much more.

The presentation also shows how, at IMVU, we try to limit Nagios to doing what it does best and decouple the rest for scalability and agility reasons. Finally, it shows how we have been successful in getting other teams (such as engineering, devops, marketing, data, customer service and product) to use, adopt or embrace it in some way, shape or form to make them more self-sufficient and agile.

Alexis Le-Quoc: Co-Founder @ Datadog

I’m the cofounder of Datadog (http://www.datadoghq.com), a monitoring SaaS that aggregates alerts, and more generally IT events and metrics from on-premise apps as well as cloud services. Nagios is one of our most successful integrations and we analyze and detect patterns in our customers’ alerts to help them improve their service quality.

Javascript Meets Nagios: Interactive Data Exploration for Post-Mortems
Nagios does an excellent job of monitoring services, escalating to and alerting people. We as people, on the other hand, have strong cognitive biases that over-represent recent issues (what happened last night, last week). These biases can get in the way of fixing the right things in the long term. In this talk I present a few different visualizations of the Nagios data that have helped practitioners ground their post-incident actions. Some of these visualizations are aimed at finding daily patterns otherwise buried in notification emails. Others survey longer timeframes to gauge the amount of progress in terms of check coverage, check usefulness, full nights of sleep and other useful metrics. All these visualizations are built in JavaScript, on top of open-source graphing libraries. No prior experience is required, besides familiarity with Nagios.

A Deep Dive into Nagios Analytics
As an aggregator of Nagios alerts (among other things) for a growing number of IT organizations, we have a fairly unique vantage point into patterns of how Nagios is used in the field. In our data set we find answers to questions such as:

  • when do alerts tend to happen (time- and day-wise)?
  • what is the typical MTTR?
  • is there a correlation between team size, infrastructure size and Nagios coverage?
  • is there a correlation between infrastructure complexity and alert storms?

This will be a data-driven session that should be of interest to Nagios practitioners of all levels of expertise who might want to gauge their own experience against what we have observed.
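
For readers who want to measure one of these figures on their own data, here is a minimal Python sketch of an MTTR calculation. It is not Datadog's method: the event format (timestamp, service, state) and the function name are assumptions, and recovery time is measured from the first problem state to the next OK state for each service.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(events):
    """Average problem duration; events are (timestamp, service, state), sorted by time."""
    opened = {}     # service -> time the current problem started
    durations = []
    for ts, service, state in events:
        if state != "OK":
            opened.setdefault(service, ts)   # keep the first problem time
        elif service in opened:
            durations.append(ts - opened.pop(service))
    if not durations:
        return None                          # no completed incidents
    return sum(durations, timedelta()) / len(durations)
```

Problems that are still open at the end of the event list are ignored, so the result only reflects incidents that actually recovered.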

Paloma Galan: Nagios Community Leader – Spain

Paloma Giraudo Galan has wide knowledge of testing and monitoring IT systems with open-source solutions. She has worked as a project manager in Madrid for the last 6 years and has experience in pre-sales of monitoring and testing applications. She is finishing the last details of her first start-up application (www.sinbarreras_1870.com), which will allow people with disabilities to describe the places they can go (cinemas, restaurants, etc).

Case Study: Nagios Deployment In Spain
This presentation will cover Nagios success cases in Spain, where Nagios is deployed, and how it is used.

  • Business sectors where Nagios is used
  • Different monitoring tools used in Spain
  • Deployment in banking and journalistic environments

Jared Bird: Network Security Administrator @ Healthcare Organization

Jared Bird currently works during the day maintaining a respectable level of security at a large local healthcare organization in the Minneapolis/St Paul area. He has a passion for everything security related and in his spare time he enjoys breaking things, bending the rules, and developing a plot for world domination.

Nagios: Providing Value Throughout The Organization
This talk will discuss how Nagios can provide value to several areas of an organization, such as security, audit and compliance, in addition to the traditional infrastructure teams. It will also cover ways that Nagios can assist in achieving compliance with standards and regulations such as PCI, SOX and HIPAA.

BoF Sessions

BoF (Birds of a Feather) sessions are informal, ad-hoc sessions where attendees discuss issues that are of interest to them.

NOTE: There will be a signup sheet for BoF sessions at the conference. Feel free to add your own session to the signup sheet and host a BoF session with other attendees to discuss ideas and issues you’re interested in.

BoF Session: Nagios Certification Q&A

We’ll discuss questions relating to Nagios Certification – including the certification tests, requirements, and any other issues you’d like to bring up.

BoF Session: Nagios Partner Programs Q&A

Interested in becoming a Nagios partner? Are you an existing partner that has questions or ideas about our program? Join us at this BoF, where we’ll discuss items relating to our Nagios partner programs – including our reseller program.