The system LEDs appear in the upper right-hand corner and show the status of server health and other critical operations of the flow collection and reporting architecture. The letter P preceding the LEDs indicates that this is the Primary reporting server. In a single-server installation of Scrutinizer, the letter is always P. In a distributed server environment, it can instead be S, for the Secondary reporting server. Each LED’s color and icon indicate its status:
- Operational – Green (checkmark)
- Degraded – Orange (exclamation point)
- Critical – Red (X)
The LED status is based on the most critical level of severity of any item within the detail of the LED modal. All columns within each LED modal are both sortable and searchable.
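The "most critical level wins" rule can be illustrated with a minimal sketch. The `Severity` enum and `led_status` function are hypothetical names used only for illustration, not part of Scrutinizer, and treating an empty modal as Operational is an assumption.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity levels, ordered from least to most critical."""
    OPERATIONAL = 0  # green, checkmark
    DEGRADED = 1     # orange, exclamation point
    CRITICAL = 2     # red, X


def led_status(item_severities):
    """Return the most critical severity among the items in an LED modal.

    Assumption: a modal with no items is reported as OPERATIONAL.
    """
    return max(item_severities, default=Severity.OPERATIONAL)


# One degraded item is enough to turn the whole LED orange.
status = led_status([Severity.OPERATIONAL, Severity.DEGRADED, Severity.OPERATIONAL])
print(status.name)
```

Using an ordered enum keeps the rule to a single `max` call: whichever item is worst determines the LED.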
Scrutinizer Server Health
This LED provides vital server statistical information such as CPU utilization and memory and disk storage availability. The information below is available per Server, with the color change occurring at the threshold levels defined.
All of the entries in the Server Health LED modal link to trend reports except for the Free DB % data entry.
| Statistic | Description | Operational (green) | Degraded (orange) | Critical (red) |
|---|---|---|---|---|
| CPU | CPU Utilization percentage | <=60% | <=85% | >85% |
| Free Memory | Available Free memory | >=2GB | >=1GB | <1GB |
| Free Disk DB* | Available Free Disk space for database | >10% | >4% | >2.5% |
| Free DB% | Percent of disk storage that is still available for database | N/A | N/A | N/A |
| DB Latency | Server to server database latency (in milliseconds) | <250ms | >=250ms | N/A |
| API Latency | Server to server reporting latency (in milliseconds) | <1000ms | >=1000ms | N/A |
| Clock Drift | Server to server clock difference (in seconds) | 0s | <>0s | N/A |
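As a rough illustration of how the thresholds in the table map a measured value to a status, here is a minimal sketch covering the CPU, Free Memory, and DB Latency rows. The function names and the returned status strings are hypothetical; only the threshold values come from the table.

```python
def cpu_status(utilization_pct):
    """Classify CPU utilization using the table's thresholds."""
    if utilization_pct <= 60:
        return "operational"
    if utilization_pct <= 85:
        return "degraded"
    return "critical"


def free_memory_status(free_gb):
    """Classify available free memory (in GB) using the table's thresholds."""
    if free_gb >= 2:
        return "operational"
    if free_gb >= 1:
        return "degraded"
    return "critical"


def db_latency_status(latency_ms):
    """Classify database latency; the table defines no critical level."""
    return "operational" if latency_ms < 250 else "degraded"


print(cpu_status(72), free_memory_status(1.5), db_latency_status(300))
```

Each function checks the green threshold first and falls through to the next level, which mirrors the left-to-right order of the columns.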
Free Disk DB
If disk space drops below 10% of available space (with a minimum of 10GB) and Auto History Trimming is selected in Admin>Settings>Data History, Scrutinizer will automatically start trimming historical data until space available is greater than 10% again.
If disk space hits 2.5% of available space, the collector will stop saving flows. In the event that the collector stops, a utility can be run that expires historical data to free up space. Go to Admin>Settings>Data History, and adjust the current retention settings.
On the server, access the interactive scrut_util prompt with the following command:
At the scrut_util prompt, run:
SCRUTINIZER> expire history
When the above command runs, it looks at the settings in the master data history configuration, then purges historical data based on the current time. After it has completed, run the following command to restart the Plixer Flow Collector service. This will cause the system to begin receiving and processing flows again.
SCRUTINIZER> services plixer_flow_collector start
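The disk-space behavior described under Free Disk DB can be summarized in a small decision sketch. This is only an illustration of the documented thresholds: the function name and return values are hypothetical, "hits 2.5%" is interpreted as "at or below 2.5%", and the 10GB minimum mentioned above is deliberately left out.

```python
def disk_space_action(free_pct, auto_trim_enabled):
    """Return the documented reaction to the percentage of free disk space.

    Assumptions: thresholds are compared as percentages only (the 10GB
    minimum is ignored), and "hits 2.5%" means free_pct <= 2.5.
    """
    if free_pct <= 2.5:
        # The collector stops saving flows until space is freed.
        return "collector_stops_saving_flows"
    if free_pct < 10 and auto_trim_enabled:
        # Historical data is trimmed until free space exceeds 10% again.
        return "trim_history"
    return "no_action"


for pct in (25.0, 8.0, 2.0):
    print(pct, disk_space_action(pct, auto_trim_enabled=True))
```

With Auto History Trimming disabled, the sketch returns `no_action` between 2.5% and 10%, which is the range where retention settings would need to be adjusted manually.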
Scrutinizer Software Health
The following information is available per Server from this LED:
- Role – primary/secondary/collector (secondary and collector used in distributed server environments)
- API – Reporting (web) interface. Up/Green means the web server is running. Down/Red means the web server is down.
- Database – Up/Green means the database is running. Down/Red means the database is down.
- Collector – Up/Green means the collector service is running. Down/Red means the collector service is down.
- Alarms – Up/Green means the Alarms/syslog service is running. Down/Red means that the Alarms service is down.
- Version – current running Scrutinizer version
- Flow Rate – flows per second
- MFSNs – Missed Flow Sequence Numbers (see the Scrutinizer Exporter Health LED for more information on MFSNs)
Scrutinizer Exporter Health
This LED reports on the data collected by the Plixer flow collector service. The flow collector receives data from network devices, processes it, and stores it in the appropriate database tables. The collector is also responsible for rolling raw 1-minute data into 5-minute, 30-minute, 2-hour, 12-hour, 1-day, and 1-week intervals. The collector service currently supports NetFlow v1, v5, v6, v7, v8, and v9, IPFIX, and sFlow v2, v4, and v5, as well as jFlow, cflowd, NetStream, and others.
To run the collector from the command prompt (i.e. CLI) type:
The output from this command is for internal use only.
Information available per exporter:
Collector – IP Address of the server collecting the flows
Exporter – IP Address/name of exporter
Flows – flow collection rate (green=flows/orange=none)
Packets – packet rate (green=packets/orange=none)
MFSN – This LED turns orange if Missed Flow Sequence Numbers exceed 300 in an interval.
If only one or a few of the flow sending devices report high MFSNs, it is likely the network or the flow exporting device that is dropping or skipping flows. If all devices report high MFSNs, it is likely to be the collector that is dropping flows. To improve performance, make sure the server hardware meets the minimum requirements. Visit the Vitals dashboard for trending details of the server.
Max Flow Duration – This LED turns orange if the collector is receiving flows with a total flow duration beyond 60 seconds. Make sure these Cisco or similar commands have been entered on the flow exporting device (e.g. routers or switches):
ip flow-cache timeout active 1
ip flow-cache timeout inactive 15
Learn more about the ip flow-cache timeout commands.
Templates – template count
Last Flow – timestamp of last flow received
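The MFSN troubleshooting guidance above (a few exporters with high MFSNs points to the network or the exporter, all exporters with high MFSNs points to the collector) can be sketched as a simple heuristic. The function name, the input mapping, and the returned labels are hypothetical; only the 300-per-interval threshold comes from the text.

```python
MFSN_THRESHOLD = 300  # per interval, from the MFSN description above


def diagnose_mfsn(mfsn_by_exporter):
    """Suggest where flows are likely being dropped.

    mfsn_by_exporter maps an exporter name or IP to its MFSN count for
    one interval (a hypothetical input format).
    """
    if not mfsn_by_exporter:
        return "no_exporters"
    high = [name for name, count in mfsn_by_exporter.items()
            if count > MFSN_THRESHOLD]
    if not high:
        return "ok"
    if len(high) == len(mfsn_by_exporter):
        # Every exporter is affected, so the collector is the likely cause.
        return "suspect_collector"
    # Only some exporters are affected.
    return "suspect_network_or_exporter"


print(diagnose_mfsn({"10.1.1.1": 12, "10.1.1.2": 950, "10.1.1.3": 0}))
```

With a single exporter the heuristic cannot distinguish the two cases and reports the collector, so the result is best treated as a starting point alongside the Vitals dashboard trends.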