Server monitoring

The Web Space Bar

Good afternoon, fellow hosters.
I'd like to hear what you guys use to be notified of server outages, even in the wee morning hours.
I believe that a hosting company should know about an outage before a customer reports it.
With this in mind, we are using Pingdom and an Android app called HostMonitor.

Anything else you guys are using?
 
Nagios, integrated with Pingdom (for off-site checks) and PagerDuty (for SMS/on-call).
 
We use the Icinga fork of Nagios for our primary service monitoring, with notifications to our Jabber server and email. Our monitoring node is hosted in the UK, but in a different data center from our UK hosting operations. We also have another Icinga node at MWEB that monitors the same hosts/services as the UK node.

We also use a tool called Cacti to graph our servers' load, memory, mail queues, mail traffic, spam count etc. This is located at MWEB so that stats from our primary hosting location are collected fast. This is used more for proactive monitoring (spam, mail queues, load etc).

Pingdom is also used for additional ping monitoring. We've been using the Nagios + Pingdom setup for the last 7 years and have recently started using Cacti graphs. This has worked pretty well for us :)
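
For anyone curious how a load check plugs into Nagios or Icinga, here is a minimal Python sketch. It relies on the standard plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function names and the thresholds of 4.0 and 8.0 are illustrative assumptions, not the setup described above.

```python
import os

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(value, warn, crit):
    """Map a measured value onto a plugin state using two thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def check_load(warn=4.0, crit=8.0):
    """Return (exit_code, message) for the 1-minute load average.

    The default thresholds are placeholders; tune them per server.
    """
    try:
        load1, _, _ = os.getloadavg()
    except (AttributeError, OSError):
        # os.getloadavg() is missing on Windows and may fail elsewhere.
        return UNKNOWN, "LOAD UNKNOWN - load average not available"
    code = classify(load1, warn, crit)
    return code, f"LOAD {LABELS[code]} - load1={load1:.2f}"
```

A wrapper script would print the message and exit with the returned code, which is how the monitoring server decides whether to send a notification.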
 
Cheap solution I found: http://tagbeep.com/

We use it to monitor our servers by checking the responses we get when accessing their IPs. Works quite well and costs nothing :)
 
tagbeep.com and uptimerobot.com are very nice free services; both offer Twitter direct messages and email alerts.

Monitoring, however, is not just about ping and HTTP responses, or even TCP connections; it's also about valid data being presented, and about system resources and availability. Ideally you want a monitoring system that can react to SNMP traps as well as do SNMP polls, and that gives you trend analysis on the statistics.
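
To illustrate the "valid data" point, here is a minimal Python sketch that checks the page content as well as the status code. The function names and the expected string are placeholders of my own, not part of any tool mentioned in this thread.

```python
import urllib.error
import urllib.request


def evaluate_response(status, body, expected_text):
    """Return (ok, reason) for an already-fetched HTTP response."""
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_text not in body:
        # The server answered, but not with the content we expect,
        # e.g. an error page served with a 200 status.
        return False, f"expected text {expected_text!r} not found"
    return True, "ok"


def check_url(url, expected_text, timeout=10):
    """Fetch url and verify both the status code and the page content."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_response(resp.status, body, expected_text)
    except urllib.error.HTTPError as exc:
        return False, f"unexpected status {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        return False, f"request failed: {exc}"
```

Calling `check_url("http://example.com/", "Example Domain")` would report a failure if the site is up but serving something other than the expected page, which a plain ping or port check would miss.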
 
I use ServersCheck for my monitoring. It covers roughly 90% of my needs and it's free for up to 40 monitors.

Some of its features:

>>CAPABILITIES<<


Environmental Checks
TEMPERATURE: Monitors the temperature (°C or F) and alerts you when needed. Requires a temperature sensor.
HUMIDITY: Monitors the humidity in the air and alerts you when needed. Requires a humidity sensor.
FLOODING: Informs when an area becomes wet or flooded. Requires a flooding sensor and the .NET framework.
POWERUP: Informs when a power outage occurs. Requires a power sensor and the .NET framework.

Network Related Checks
TCP: verify if a computer responds on a certain port (Eg: web server on port 80)
PING: sends a ping command and retrieves response times.
PINGAVG: send 2 Windows® ping commands and check the average response time. Requires Windows® running in English.
TRACERT: performs a traceroute to check connectivity to a target host. Include IPs and Exclude IPs are supported.
DNS: performs a DNS lookup to verify if the domain name matches the defined IP address.
NTP: check to see if the Network Time Protocol (NTP) server replies to request and sends NTP protocol compliant data

Internet Related Checks
URL: check if a web page contains a predefined string.
URL Image: checks a URL to see if the image is returned correctly.
URL Exists: reads a file or URLs and verifies if the URLs exist.
SSL CERT:
HTTP Status: verifies the HTTP status code returned by the web server.
HTTP Header: verifies if the predefined HTTP Header (or a part of it) can be found in the response from the web server.
DOMAIN EXPIRY:
FTP: make an FTP connection and login to a remote server.
FTP FILE Exists: make an FTP connection and login. Returns an error if the specified file can not be found.
FTP FILE Found: make an FTP connection and login. Verifies if files can be found or not in the specified directory.
NNTP: test a connection to a news server.

Network Traffic Related Checks
TRAFFIC: monitor inbound or outbound network traffic.
BANDWIDTH: monitor bandwidth usage by combining inbound and outbound traffic.

Mail Related Checks
POP3: connection and login to a mailbox (POP3 enabled).
SMTP: send a test mail through an outgoing mail server (SMTP Server)
SMTP->POP3: send a test mail through an outgoing mail server (SMTP Server) and then connect to the defined POP3 server and check if the test email was well received. It can also verify if a predefined string can be found in the email headers or body.

SNMP TRAP Receiver Checks
The following check types are only available in the Premium Edition or Monitoring Appliance. Trial of these features will expire in 21 days.
SNMPTRAP: Receive alerts on SNMP Traps sent by devices to the ServersCheck software.

SNMP GET Checks
SNMP: verifies numeric values returned by SNMP enabled devices.
SNMPSTRING: verifies string values returned by SNMP enabled devices.

Media Server Checks
RTSP: Verifies if an RTSP media stream exists
MMS: Verifies if an MMS (Windows Media Server) stream exists and can be played

VOIP Checks
SIP: Verifies if SIP Server replies to connection request and its response time (to determine Latency and Jitter)

Virtualization
The following check types are only available in the Premium Edition or Monitoring Appliance. Trial of these features will expire in 21 days.
VMWARE-HOSTHEALTH: Monitors key metrics from a VMWare ESX/ESXi (v3+) Host including CPU, memory and DataStores disk space availability

Cloud Computing
The following check types are only available in the Premium Edition or Monitoring Appliance. Trial of these features will expire in 21 days.
AMAZONCLOUDWATCH: Monitor virtual server metrics from Amazon EC2's CloudWatch
WINDOWSAGENT: connects to remote installed ServersCheck Windows Agent to retrieve values like disk space, memory, process, services and WMI.
LINUXHEALTH: This check will verify response time, CPU usage, available memory and available disk space.
LINUXPROCESS: verifies if a process is still running on a remote Linux/Sun Solaris/HP-UX system.

Applications Specific Checks
The following check types are only available in the Premium Edition or Monitoring Appliance. Trial of these features will expire in 21 days.
LOTUS: tries to make a connection to a Notes Server and open a specified Notes Database.
BLACKBERRYSERVICES: Monitors BlackBerry Enterprise Server services
BLACKBERRYTCP: Monitors TCP ports used by BlackBerry Enterprise Server

Windows Based Checks
WINDOWSHEALTH: This check will verify response time, CPU usage, available memory and available disk space.
WMI: monitors any WMI property value on the target host
PERFCOUNT: allows you to monitor values from performance counters on a remote Windows computer.
SERVICES: verifies if a service is still running on a remote host.
PROCESS: checks if a process is running or not.
PROCESSMEM: checks the memory usage of a process.
PROCESSCPU: checks the CPU usage of a process.
EVENTLOG: creates an alert when an error is found in the event log.
REGISTRY: creates an alert when the matching registry entry can not be found.
CPU: tracks the CPU usage and generates an alert when exceeding a threshold.
MEMORY: monitors the available memory.
DRIVESPACE: alerts you when the free space on a drive is too low.
ANTIVIRUS: alerts you when the antivirus is out of date on the remote host. Requires Windows XP SP2 or higher.

Linux/Unix Based Checks
LINUXHEALTH: This check will verify response time, CPU usage, available memory and available disk space.
PROCESS: verifies if a process is still running on a remote Linux/Sun Solaris/HP-UX system.
LINUX DISK: verifies the free disk space on a remote Linux (only) based system.
LINUX MEMORY: verifies the free memory on a remote Linux (only) based system.
LINUX CPU: verifies the CPU usage on a remote Linux (only) based system.
LINUX SNMP: enables you to trap LINUX values such as CPU Load, Free Swap Space, Partition Sizes and more.

Database Checks
ODBC: tests a connection to ODBC compliant databases (MS SQL Server, Access, Informix, Sybase, Oracle...)
ODBCSQL: connects to a database, executes the SQL statement and then verifies the value of a field.
ODBCSQLDURATION: connects to a database, executes the SQL statement and then verifies the query time.
ORACLE: test a connection to an Oracle server
ORACLE SNMP: retrieves key Oracle parameters through SNMP.
MYSQL: test a connection to a MYSQL server.

Hardware Vendor Checks
CISCOWORKS (SNMP): monitors values returned by a Cisco Works Server through SNMP.
DELL OPEN MANAGE (SNMP): monitors values returned by a DELL server running the Open Manage software.

File Checks
FILE: verifies if a file exists or not at the specified location.
FILE CONTENT: raises an alarm if a particular content can not be found in the file.
FILE NEGATIVE CONTENT: raises an alarm if a particular content can be found in the file.
FILE AGE: raises an alarm if the age of a file does not match the defined condition.
FILE SIZE: verifies the size of a file and raises an alert if the conditions are not met.

Power Monitoring
POWERUSAGE: Monitors power usage of a ServersCheck PDU (total and individual outlets)

Security Sensors Checks
AXISSECURITY: Trigger alerts generated by events from AXIS cameras
SECURITY: Monitor security sensors connected to the ServersCheck Security Bus (Motion, Door contact, Smoke, Glass Break ...)

Special Checks
EXTERNAL: enables you to execute custom checks and be alerted on it.
RULES: Alert when one or more of the monitors being watched have failed



https://www.serverscheck.com
 

I wrote my own application to ping and notify me :p
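
For anyone wanting to try the same, a home-grown checker can be quite small. The Python sketch below is only an assumption about how such a tool might look, not the application mentioned above: it uses a TCP connect instead of an ICMP ping (which usually needs elevated privileges), and `StateTracker`, `run_once` and the `notify` callback are hypothetical names.

```python
import socket


def tcp_check(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class StateTracker:
    """Remember the last known state per target and report only changes."""

    def __init__(self, notify):
        self.notify = notify    # callable taking one message string
        self.last_state = {}    # (host, port) -> bool

    def update(self, host, port, is_up):
        key = (host, port)
        previous = self.last_state.get(key)
        self.last_state[key] = is_up
        if previous is None and is_up:
            return  # first check and healthy: nothing to report
        if previous != is_up:
            status = "UP" if is_up else "DOWN"
            self.notify(f"{host}:{port} is {status}")


def run_once(targets, tracker):
    """Check every (host, port) pair once and record the result."""
    for host, port in targets:
        tracker.update(host, port, tcp_check(host, port))
```

Running `run_once` from cron or a loop, with `notify` wired to email or SMS, sends one message when a server goes down and one when it comes back, rather than one per failed check.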
 
A really cool solution is PC Monitor. I use it to monitor my server and other systems from my phone. Supported systems include Windows, Linux and Mac, and almost all devices are supported (Android, iPad, iPhone, WP7, WP8, and even Win8 Metro). You get real-time alerts on just about anything you desire: services, ports, hardware state, network, CPU and disk usage, and so on. You can also browse disks, take active screen and webcam snapshots, and get an alert when a USB device is plugged in or unplugged. From my Lumia 900 I can start and stop systems, chat with active users, issue system commands, check for and install updates, reboot, shut down, wake/sleep a system, etc., and see the state of all my systems. Not free, but very professional.
 