Download: version 1.00 Beta Change log Repository Latest update: 21 May 2021
Requirements:
- PowerShell 2.0 or later
- .NET 3.5 or later
- Windows 7 / Windows Server 2008 or later
A monitoring tool written entirely in PowerShell. It is agent-less and relies on WinRM over SSL. Its functionality is similar to that of Nagios and Ansible, and custom scripts/plugins are supported in much the same way as NRPE plugins.
The ITIL priority matrix defines 5 priority levels, and all of them are implemented in the monitoring.
P1 = Critical
P2 = High
P3 = Medium
P4 = Low
P5 = Planning
All logs are stored locally on the respective remote machine. The dashboard client (also a PS script) is usually run on the end user’s workstation to observe the state of the servers. The client establishes a connection to the remote computers, fetches the logs and shows the alerts, as in the example below.
The initial setup is a quite straightforward process. The tool is agent-less, but the architecture of the monitoring requires configuring WinRM (for a secure connection) and a job in Task Scheduler (to run the scripts every N minutes). The default settings can be changed at any time, before or after the initial setup.
Download the ZIP archive provided above and extract it to a permanent location. Open the folder ‘PSRM-Host’ and run the batch file ‘Setup.bat’. The purpose of this batch file is to bypass the execution policy without modifying it.
To change the installation path, move the folder ‘PSRM-Host’ to the new path and run ‘Setup.bat’ again.
The diagram below shows how it works.
Dashboard client (PSRM-Client)
Go to the folder PSRM-Client and open the file Computers.txt. It contains the connection details of the remote machines you are about to monitor. Each record must contain the following information, in this order.
- IP or hostname of the computer.
- Port number of WinRM HTTPS, appended to the address after a colon. Omit it if the default port 5986 is configured.
- <Domain>\<Username> or <Username>
- Password of that user.
- Task name in Windows Task Scheduler.
Example: The IP address of my server is 192.168.1.12. I kept the default WinRM port 5986, so I don’t have to specify it after the IP address. The username is administrator and its password is ‘P@55w0rd!’. The task name in Windows Task Scheduler is PSRM. In Computers.txt I then have to type the following.
192.168.1.12 administrator P@55w0rd! PSRM
If, for some reason, I change the port to 4433, the line looks like this:
192.168.1.12:4433 administrator P@55w0rd! PSRM
If the computer belongs to a domain or it is a domain controller:
192.168.1.12:4433 dev\administrator P@55w0rd! PSRM
Separate the fields of each record with spaces or tabs.
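The record layout can be made concrete with a short parsing sketch. It is written in Python purely for illustration (the tool itself is PowerShell) and is not the client's actual code. The names `Computer` and `parse_record` are hypothetical, and the sketch assumes that no field contains spaces and that the address is an IPv4 address or a hostname.

```python
from typing import NamedTuple, Optional

DEFAULT_WINRM_HTTPS_PORT = 5986  # default port named in the text


class Computer(NamedTuple):
    host: str
    port: int
    user: str
    password: str
    task: str


def parse_record(line: str) -> Optional[Computer]:
    """Parse one Computers.txt record; return None for a blank line."""
    fields = line.split()  # splits on any run of spaces or tabs
    if not fields:
        return None
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    address, user, password, task = fields
    # The port is optional and follows the host after a colon.
    host, colon, port_text = address.partition(":")
    port = int(port_text) if colon else DEFAULT_WINRM_HTTPS_PORT
    return Computer(host, port, user, password, task)
```

For the three example records above, this yields port 5986 for the first and 4433 for the other two, with `dev\administrator` kept intact as the user name.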
To start the client, run the batch file PSRM-Client.bat (simply double-click it). It will prompt for elevated permissions. If your machine only allows signed PS scripts or script execution is restricted, the batch file bypasses the execution policy and runs PSRM-Client.ps1.
The client script establishes a Remote PS Session to each configured computer and reads its _alert.log file. All events are presented in a human-readable format and give enough information about where the problem is and what its priority is. The dashboard is console-based and looks the same as in the previous picture.
To log on to the machine an alert is coming from, mark the connection field (or double-click it) and press Enter. To log on to multiple computers, hold the Alt key and drag the mouse cursor to mark several connections.
After you press Enter, one or more Remote PS Session windows will appear, depending on your selection. Don’t worry about duplicate entries: a remote connection is established only for unique records. This is how it looks.
How it works:
- The Enter key copies the selection to the clipboard.
- Check whether the Enter key is still held down. It must be held for at least 100 milliseconds.
- Check whether the ‘PSRM-Client’ window is the active (focused) one.
- Check whether the copied text matches any connection details from the dashboard.
If all of the above conditions are met, the script starts a new process and enters a Remote PS Session. To avoid flooding the screen with console windows while the Enter key is still held down, the clipboard is cleared immediately.
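The decision logic of these checks can be sketched as a pure function. This is a Python illustration, not the client's PowerShell code: the function name and parameters are hypothetical, the key state, active window title and clipboard text are passed in as plain values rather than read from Windows, and it assumes each selected line is exactly one connection string.

```python
def sessions_to_open(clipboard_text, enter_held_ms, active_window,
                     known_connections, min_hold_ms=100,
                     window_title="PSRM-Client"):
    """Return the unique connections a remote session should be opened for."""
    # Conditions 2 and 3: Enter held long enough, dashboard window focused.
    if enter_held_ms < min_hold_ms or active_window != window_title:
        return []
    selected = []
    for line in clipboard_text.splitlines():
        candidate = line.strip()
        # Condition 4: the copied text must match a known connection.
        # Duplicates are skipped so each machine gets a single session.
        if candidate in known_connections and candidate not in selected:
            selected.append(candidate)
    return selected
```

After acting on the returned list, the caller would clear the clipboard, so that a still-held Enter key finds nothing to match on the next check.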
The configuration file is called Config.ini and is located in the root folder. It contains very basic settings for the initial setup. If any value is modified after the host setup, the change is applied automatically on the next run of the job in Windows Task Scheduler (every 3 minutes by default). All changes are tracked and logged in the file “Logs\_config.log”. In case of failure, a standard PS error is reported, stating which script(s) threw the error, the line number and the exception message.
Here are the supported settings. Check the comments in the Config.ini file for more details.
- SSL certificate validity period. The default value is 1 month. The certificate is renewed automatically 7 days before its expiry date.
- Name of the inbound firewall rule for the WinRM service.
- WinRM HTTPS port number. The default is 5986. It can be any port number up to 65535.
- Name of the task in Windows Task Scheduler that runs the monitoring scripts.
- How frequently this task runs. The default setting is 3 minutes.
- Log retention policy. The default value is 31 days, which means logs older than 31 days are deleted.
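To illustrate the retention rule, here is a minimal Python sketch. It is not the tool's code, and it assumes that a log's age is taken from the date in its yyyy-MM-dd.log file name; the tool may well use the file's timestamp instead. The function name `expired_logs` is hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_logs(file_names, today, retention_days=31):
    """Return the dated log files that are older than the retention period."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in file_names:
        if not name.endswith(".log"):
            continue
        try:
            log_date = datetime.strptime(name[:-4], "%Y-%m-%d").date()
        except ValueError:
            continue  # not a dated history log, e.g. _alert.log
        # "Older than" is read strictly: a log exactly 31 days old is kept.
        if log_date < cutoff:
            expired.append(name)
    return expired
```

With `today` set to 21 May 2021 and the default of 31 days, a file named 2021-04-19.log is selected, while 2021-04-20.log and _alert.log are left alone.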
The monitoring rules are defined in the Rules.ini file. Check the comments inside for more details. The file specifies which components/elements should be checked and also sets the status of the entire monitoring. Here are the supported key features:
- Set monitoring status: production, development, maintenance.
  - Production – Log all issues and show them in the monitoring client (dashboard).
  - Development – Log all issues but don’t show them in the dashboard.
  - Maintenance – Stop the entire monitoring.
- Automatic services that are not in a running state.
  - Include/exclude services from the monitoring. Wildcards are supported.
  - Include non-automatic services in the monitoring. For example: Windows*
- CPU load
- Process utilization – Includes a group of processes related to a parent one.
- Memory utilization – Physical and Virtual (physical + pagefile).
- Free disk space on local drives.
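The service include/exclude rules with wildcards might work roughly as in the following Python sketch. The actual keys and semantics of Rules.ini are not shown here, so everything is an assumption: the function names are hypothetical, matching is case-insensitive, automatic services are monitored by default, non-automatic ones only when an include pattern matches, and an exclude pattern always wins.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """Case-insensitive wildcard match of a service name against patterns."""
    lowered = name.lower()
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


def is_monitored(name, is_automatic, include=(), exclude=()):
    """Decide whether a service takes part in the 'not running' check."""
    if matches_any(name, exclude):
        return False  # assumed: exclusion takes precedence
    if is_automatic:
        return True   # automatic services are checked by default
    return matches_any(name, include)
```

For example, with `include=["Windows*"]` a manual service named "Windows Update" would be monitored, while a manual service named "Fax" would not.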
Logs are located in the folder Logs. Here are the common files.
- _alert.log – Shows ongoing issues. This log is read by PSRM-Client.
- _config.log – Tracks configuration changes: when and what has been modified.
- _status.log – Status of the monitoring. Also shows errors of the monitoring scripts.
- yyyy-MM-dd.log – History of all alerts with exact date, time and runtime.
False Positive Alerts
This is another way to exclude something from the monitoring. The respective config file is called False-Positive.ini and is located in the root folder.
If something constantly throws alerts but has been classified as a false positive, you don’t want it in the monitoring dashboard.
- Open file _alert.log and copy the line of the false positive alert.
- Open file False-Positive.ini and paste it there.
4|Service|Low: Manual non-delayed service is Stopped: WEPHOSTSVC | Windows Encryption Provider Host Service
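The effect of this file can be pictured as a simple filter. The Python sketch below is only an illustration, not the tool's code: it assumes that suppression works by exact line matching (after trimming whitespace) and that False-Positive.ini holds one alert line per row. The function names are hypothetical.

```python
def load_false_positives(text):
    """Collect the non-empty lines of False-Positive.ini as a set."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def visible_alerts(alert_lines, false_positives):
    """Drop every alert whose full line is listed as a false positive."""
    return [alert for alert in alert_lines
            if alert.strip() not in false_positives]
```

Under this assumption, an alert whose message changes between runs (for example one that embeds a free-space figure) would no longer match the pasted line and would show up again.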
Write custom monitoring scripts
Since the monitoring tool is a very basic one, the need for custom scripts is inevitable. All of them must be located in the folder Scripts. Organize your custom scripts in sub-folders to avoid a messy setup. The folders and scripts can have any names you want. The monitoring looks for .ps1 files recursively in the folder Scripts and executes them in sequence.
Templates showing how to write custom scripts are placed in the folder ‘Templates’. The output must be returned in the following format.
<Priority Number>|<Component Name>|<Output Message>
Here is an example of how the output must look.
1|Storage|Critical: Disk D:\ is used on 100% | 0GB free of 5.09GB disk capacity
A custom script may return multiple alerts. All messages must follow the above format. If everything is fine and no alert is needed, just return OK.
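The following Python sketch shows one way such an output line could be interpreted. It is not the monitoring's actual parser, and the names `Alert` and `parse_output` are hypothetical. The point it illustrates is that the message itself may contain a `|` character, as in the storage example, so only the first two separators can be treated as field boundaries.

```python
from typing import NamedTuple, Optional

# Priority names as listed in the ITIL priority section.
PRIORITY_NAMES = {"1": "Critical", "2": "High", "3": "Medium",
                  "4": "Low", "5": "Planning"}


class Alert(NamedTuple):
    priority: int
    priority_name: str
    component: str
    message: str


def parse_output(line: str) -> Optional[Alert]:
    """Parse one script output line; return None for a plain 'OK'."""
    text = line.strip()
    if text == "OK":
        return None
    parts = [part.strip() for part in text.split("|", 2)]  # at most 3 fields
    if len(parts) != 3 or parts[0] not in PRIORITY_NAMES:
        raise ValueError(f"non-standard output: {text!r}")
    number, component, message = parts
    return Alert(int(number), PRIORITY_NAMES[number], component, message)
```

A line such as "Test Message" has no priority or component and is rejected here with a `ValueError`, which corresponds to what the text calls non-standard output.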
How to use templates
1. Write a script with a single output.
First copy the file Output-Single.ps1 into the folder Scripts. It is a simple file counter which gets the number of files and folders on the C:\ drive.
Replace the string in the variable $component with something relevant. It is the name of the component/application you are going to check, for example “MS-SQL” or “IIS”.
Your actual code goes between the <BEGIN> and </END> tags. In this template, the number of files is assigned to the variable $count.
The next step is to define the desired thresholds in the variables $P1, $P2, $P3, $P4, $P5. ITIL does not prescribe standard thresholds because every organization has different demands, so each one must define reasonable thresholds for its own infrastructure.
In the variable $Message, type what should be returned in case of issues.
The final step is to check the variable $count against these boundaries. If everything is fine, the script returns ‘OK’ and no alert is triggered. Otherwise the defined message is returned, along with the component name and priority.
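The threshold check can be summarised in a few lines. Real custom scripts must be .ps1 files based on the template; the Python below only sketches the logic and is not the template's code. The threshold values are hypothetical, and it assumes that a higher count is worse, so the most severe priority whose threshold is reached wins.

```python
def evaluate(count, thresholds, component, message):
    """Return 'OK' or '<Priority Number>|<Component Name>|<Output Message>'."""
    # thresholds maps a priority number (1 = most severe) to the minimum
    # count that triggers it; priorities are checked from 1 to 5.
    for priority in sorted(thresholds):
        if count >= thresholds[priority]:
            return f"{priority}|{component}|{message}"
    return "OK"


# Hypothetical thresholds, playing the role of $P1..$P5 in the template.
THRESHOLDS = {1: 100000, 2: 50000, 3: 20000, 4: 10000, 5: 5000}
```

With these values, a count of 60000 yields a P2 alert, and a count of 100 yields "OK".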
2. Write a script with multiple outputs.
Copy the file Output-Multiple.ps1 into the folder Scripts. The logic is similar to the previous script, but it returns multiple alerts in case of issues. An ‘OK’ message is returned if a particular check is fine.
3. Handling Errors and Non-Standard Output
The following scripts are for demonstration purposes and show how the monitoring handles errors and non-standard output. Just copy these files into the same folder and check the dashboard’s alerts.
- Output-Error.ps1 – Contains only wrong commands and returns many null objects. There is no meaningful output, only errors. The dashboard shows which script is not working as expected.
- Output-NonStd.ps1 – Shows the message “Test Message” and which script it is coming from. Since the output doesn’t meet the standard, there is no component name or reasonable priority.