It's time once again for OpenNMS On the Horizon.
Since last time, we worked on documentation (database reports, external auth, the glossary, GraphML, Helm flows, installation, logging, performance data, OIA, poller threads, provisioning, SCV, and the SnmpCollector), CI/CD for Horizon and Horizon Stream, Lombok, Kafka alarm sync, stress-metrics, inventory management in Stream, SNMP OPAQUE types, GRPC, Docker multi-arch support, Topology, Flow Elasticsearch support, Stream persistence, bridge topology, PostgreSQL credential encryption, time-series metric deduplication, Keycloak login, requisition metadata editing, ALEC web UI and training API, and heatmaps.
GitHub Project Updates
Internals, APIs, and Documentation
- Jason continued to flesh out Horizon Stream CI/CD workflows.
- Morteza worked on optimizing some Horizon CircleCI builds.
- Mark Mahacek worked on documentation for external auth, GraphML, performance data, SCV, and the SnmpCollector.
- Chandra worked on documentation for OIA and poller thread consumption.
- I audited the current set of flaky tests in CircleCI and updated the code to match.
- Morteza worked on dynamically generating CircleCI configs based on branch name and other metadata.
- Mark Frazier worked on some Lombok-related changes in Horizon Stream.
- Bonnie did more work on improvements to installation and provisioning documentation.
- Alex fixed a bug where outstanding alarms may not re-sync to Kafka in some situations.
- Freddy did some work on the stress-metrics command.
- Marcel worked on a bunch of documentation relating to logging.
- Freddy wrapped up his fixes to graceful shutdown of Telemetryd.
- Mark Frazier worked on inventory-handling (provisioning) in Stream.
- Dmitri did more work on OPAQUE SNMP data type handling.
- Arthur worked on modularizing some GRPC-related code in Stream.
- I fixed some issues in multi-arch Docker image generation.
- Antonio worked on some fixes to topology smoke tests.
- James fixed an error in error reporting in the flow Elasticsearch repository code.
- Arthur worked on DAO/persistence in Stream.
- Emily worked on adding new stuff to the glossary in the Horizon docs.
- Antonio fixed an issue with bridge topology discovery.
- Andrew worked on some updates to database report documentation.
- Christian worked on encrypting stored PostgreSQL credentials.
- Patrick worked on metric deduplication in the OIA time-series API.
Web, ReST, UI, and Helm
- Alberto worked on updated Helm documentation for the Flow datasource.
- Mike worked on Keycloak login stuff for Stream.
- Yang Li did more work on user management REST endpoints in Stream.
- Scott added support for multi-line text entry in requisition metadata in the web UI.
- Chinh Le did more work on the new Vue-based topology UI.
- Pushkar did more work on a fix for event and alarm search.
- Anya worked on a web UI plugin for ALEC.
- Benjamin worked on a REST endpoint for storing permissions for training submission.
- Christian fixed an issue in the heatmap code.
Contributors
Thanks to the following contributors for committing changes since last OOH:
- Christian Pape
- Patrick Schweizer
- Pushkar Suthar
- Mark Frazier
- Jason Berry
- Emily Marsh
- Morteza Ershad-Manesh
- Anya Rybalova
- Andrew Konstantaras
- Benjamin Janssens
- Jeffrey-David Kapp
- Benjamin Reed
- Arthur Naseef
- Jesse White
- Dustin Frisch
- Antonio Russo
- Mike Rose
- Chandra Gorantla
- Mark Mahacek
- Chinh Le
- Yang Li
- James Hutchinson
- Dmitri Herdt
- Lars Schreiber
- Alberto Ramos
- Alex May
- Freddy Chu
- Scott Theleman
- Marcel Fuhrmann
- Bonnie Robinson
- Dino Yancey
Releases and Roadmap
Horizon 30 Released
Release 30.0.0 is the first in the Horizon 30 series, introducing a number of new features, most notably a preview of a new web UI, and the ability to back up infrastructure device configs.
The codename for Horizon 30.0.0 is Nutria.
Breaking Changes
OpenNMS Plugin/Integration API (OIA) Updated to 1.0.0
The OIA version required has been updated to 1.0.0, following its first stable release. Plugins intended to run in OpenNMS must implement version 1.0.0 (or higher).
New Configuration Management API
A new API has been introduced for accessing and manipulating configuration, including moving configuration from XML files into the database. The initial proof-of-concept implementation converts provisiond-configuration.xml to the new API.
The OpenNMS installer will import your existing provisiond-configuration.xml file on upgrade.
This will happen automatically, but if you rely on programmatically manipulating the Provisiond configuration, you will need to convert your code to use the config management REST API instead.
Docker Images
The Horizon and Sentinel Docker images are now based on a minimal install of Ubuntu, rather than CentOS. Symlinks are provided to match the old paths in /opt, but it’s possible you will run into subtle differences when transitioning.
Collectd Strict Interval
The org.opennms.netmgt.collectd.strictInterval setting now defaults to true.
Previously, Collectd would not reschedule collection for a device until after the previous collection had completed. This meant that if OpenNMS was collecting at a 5-minute interval and a collection took 1 minute, the next collection would start 6 minutes after the previous one was launched.
The new default behavior is to always schedule collection at a predictable interval.
You can change to the previous behavior by creating a property file in $OPENNMS_HOME/etc/opennms.properties.d/ with the contents: org.opennms.netmgt.collectd.strictInterval=false
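To make the difference concrete, here is a small toy model of the two scheduling modes. It is only an illustration of the arithmetic described above, not Collectd's actual scheduler; the function name and parameters are made up for this sketch, and it assumes every collection takes the same amount of time and finishes within one interval.

```python
def start_times(interval_s, duration_s, runs, strict):
    """Return the start time (in seconds) of each collection run.

    strict=True  -> runs start on a fixed grid: 0, interval, 2*interval, ...
    strict=False -> the next run is scheduled one interval after the
                    previous run *finishes*, so the duration adds drift.
    (Hypothetical helper for illustration; not part of OpenNMS.)
    """
    starts = []
    t = 0
    for _ in range(runs):
        starts.append(t)
        t += interval_s if strict else interval_s + duration_s
    return starts


# 5-minute interval, 1 minute per collection, as in the example above
legacy = start_times(300, 60, 4, strict=False)
strict = start_times(300, 60, 4, strict=True)

print([s // 60 for s in legacy])  # [0, 6, 12, 18]
print([s // 60 for s in strict])  # [0, 5, 10, 15]
```

With the old behavior, the effective interval is 6 minutes rather than the configured 5; with strictInterval enabled, the start times stay on the 5-minute grid.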
New Features and Improvements
New UI (Early Access)
Work has begun on creating a new UI with an eye towards making more common workflows easier.
It uses Vue 3 and the Feather Design System. You can try it out by clicking "UI Preview" in the navigation bar of the web UI.
It is also now possible to write OIA plugins that extend the new UI.
Device Config Backup
Initial support has been added for performing configuration backups of infrastructure devices like routers and switches. Backups are performed as part of polling the device, and can be viewed (and triggered) in the web UI.
OIA Supported on Minion and Sentinel
The OpenNMS Plugin API can now be used to extend Minion and Sentinel. A subset of APIs are supported, as appropriate for each platform.
Secure Credentials Vault
You can now validate credentials stored in the SCV with the scv-validate Karaf command.
Additionally, support for encrypted credentials has been extended to more places inside OpenNMS, most notably in Metadata interpolation. Also, a REST API has been added for accessing and updating the SCV.
Flows and Nephron
It is now possible to configure thresholding on flow data.
Polling, Metadata, and Collection
- The XML collector can now treat a collected value as an enumerated value, which lets you convert strings into integers to store as a gauge.
- It is now possible to passively "collect" data from incoming events as time-series data, including those that come from traps or syslog. The eventconf has additional options to configure what data to collect from parameters, including regular-expression matches.
- The BgpSessionMonitor can now be configured to use a custom OID prefix for devices that publish peer tables in a non-standard location.
Additions or updates to graphs and collections have been made for:
- F5 Devices
- Flows
- Node Exporter
- Prometheus
- Windows Exporter
REST API
- Improvements have been made to the criteria querying API to support "Multi-And" and regexp restrictions, allowing for queries involving multiple event parameters, or complex string matching.
Documentation
An unspeakable amount of work has gone into documentation improvements and additions across the board.
Notable additions include:
- Developer documentation for OSGi in OpenNMS, the OpenNMS Plugin API (OIA), the config management API, device config backup APIs, and the Health REST service.
- Operation documentation updates relating to SNMP property extenders, performance data and collection, thresholding, the log file viewer, SCV, and the new UI preview.
- Documentation improvements regarding "housekeeping" and other administrative tasks, alarms, Business Service Monitoring, Passive Status Monitoring, and more.
Important Internal Changes
- Kafka components have been updated to version 3.0.0
- Our embedded Karaf has been updated to version 4.3.6
OpenNMS Plugin API 1.0 Released
The first officially stable release of the OpenNMS Plugin API (a.k.a. OIA) came out last week.
From 1.0 forward, it will follow Semantic Versioning.
Since OIA 0.6, the following changes have occurred:
Improvements and Features
- Guava has been updated to 30.1.1-jre
- updated to be supported on Minion and Sentinel
- extend the new Horizon Vue-based UI through OIA
- key-value store support was added
- Secure Credentials Vault support was added
- event-sourced collection config support was added
Breaking
- requires JDK 11 or higher
- InterfaceToNodeCache API was changed to remove Stream<Integer> getNodeIds(String location, InetAddress ipAddr)
Helm 8 Released
Helm 8 contains updates to the core to use Grafana 8, the start of a move to TypeScript, many optimizations, and a number of new features.
- use an optimized bulk query when fetching string properties from OpenNMS versions that support it
- convert much of the codebase to use native promises rather than angularJS wrappers
- support converting NaN values to 0 when querying flow data
- support swapping ingress and egress on flow data at query time
- support for new filters for node, location, applications, hosts, and conversations
- fixed an issue with missing flow data when ingress and egress are inconsistently available
- many documentation improvements and additions
ALEC 2.0 Released
ALEC (Architecture for Learning Enabled Correlation) version 2 contains a number of updates, most notably some new scoring strategies.
- ALEC now requires JDK 11.
- ALEC now uses OIA 1.0.0, which means it requires Horizon 30 or higher, or the future Meridian 2023 or higher.
- In addition to the existing Set Intersection, Peer, and Matrix scoring strategies, ALEC now supports ARI and AMI strategies as well.
Upcoming July Releases
OpenNMS is on a monthly release schedule, with releases happening on the second Wednesday of the month.
The next OpenNMS release day is July 13th, 2022.
We currently expect updates to Horizon 30 and all supported Meridians.
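If you want to work out future release days yourself, the "second Wednesday of the month" rule is easy to compute. This is just a quick sketch using Python's standard library; the helper name is made up for illustration, and actual release dates can of course still shift.

```python
import calendar
from datetime import date


def second_wednesday(year, month):
    """Return the date of the second Wednesday of the given month."""
    # monthcalendar() returns weeks as lists of day numbers, Monday first;
    # days that fall outside the month are 0.
    weeks = calendar.monthcalendar(year, month)
    wednesdays = [week[calendar.WEDNESDAY] for week in weeks
                  if week[calendar.WEDNESDAY] != 0]
    return date(year, month, wednesdays[1])


print(second_wednesday(2022, 7))  # 2022-07-13
```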
Next Horizon: 31 (Q4 2022)
The next major Horizon release will be Horizon 31.
Since Horizon 30 was just released, there is nothing concrete on the roadmap for Horizon 31 yet.
Stay tuned for details when they come.
Next Meridian: 2023 (Q1 2023)
Meridian 2023 is early in its development cycle, but you can expect it to contain, at the very least, the work that's going into Horizon 30.
Disclaimer
Note that this is just based on current plans; dates, features, and releases can change or slip depending on how development goes.
The statements contained herein may contain certain forward-looking statements relating to The OpenNMS Group that are based on the beliefs of the Group’s management as well as assumptions made by and information currently available to the Group’s management. These forward-looking statements are, by their nature, subject to significant risks and uncertainties.
...We apologize for the excessive disclaimers. Those responsible have been sacked.
Mynd you, møøse bites Kan be pretti nasti...
We apologise again for the fault in the disclaimers. Those responsible for sacking the people who have just been sacked have been sacked.
Until Next Time…
If there’s anything you’d like me to talk about in a future OOH, or you just have a comment or criticism you’d like to share, don’t hesitate to say hi.
- Ben
Resolved Issues Since Last OOH
- HELM-317: Grafana Datasource expressions - Flows
- HELM-325: Document Grafana datasource expressions - Flows
- HELM-327: Performnce DS Query type Attribute selector broken
- HS-35: Add endpoints to REST API server for user management
- HS-56: Create a service for pushing Requisition data to the provision module
- HS-78: UX Writing - Password Reset Emails
- HS-91: SCAN for inventory changes
- HS-98: Tour through provisiond in Horizon
- HS-99: Writing QA Check
- HS-112: Interview identified users for the activity board
- NMS-8861: Admin guide lacks a chapter on logging
- NMS-13484: Document the Grafana Image Renderer plugin's dependencies
- NMS-13574: Migrate External Auth into docs
- NMS-13785: Error responses are not handled correctly when handling ElasticSearch responses
- NMS-13918: Map Pins Missing Since Upgrade
- NMS-14003: Telemetryd does not shut down gracefully
- NMS-14018: SNMP MIB imports to handle OPAQUE data type implementation
- NMS-14059: Add support for pre-authorization via HTTP header (to be used with pre-authentication)
- NMS-14128: DCB: Error reporting needs love
- NMS-14129: DCB: Debug script execution
- NMS-14154: Documentation for OIA changes
- NMS-14230: Create release notes content for H30
- NMS-14231: Create release notes content for OIA, Alec, HELM to release along with H30
- NMS-14255: DCB - Document impact of DCB on poller thread consumption
- NMS-14282: Allow multi-line metadata
- NMS-14299: DCB UI: Hover over Running Config in DCB takes longer
- NMS-14321: Kafka-Producer Alarm Resync Failing Post Entire Kafka Cluster Outage
- NMS-14322: Bridge Topology Discovery Mismatch
- NMS-14326: fix smoke tests: GraphRestServiceIT
- NMS-14343: features/topology: tooltip - PowerGrid (D3/Circle layout)
- NMS-14359: Modify host, zone and requisition name field validation
- NMS-14374: Fix Smoke Test for GraphMLTopologyIT
- NMS-14377: features/topology: contextmenu - PowerGrid (D3/Circle layout)
- NMS-14378: Snmp Link Up does not clear Snmp Link Down