Introduction

To achieve its purpose of correlating user information with network performance data, WiFiMon needs RADIUS and/or DHCP logs to be streamed into Elasticsearch. For that purpose, an ELK cluster was built on five VMs: three configured as Elasticsearch master-eligible and data nodes, one configured as a coordinating node that also hosts Kibana, and one dedicated to Logstash.

...

NOTE

To implement this setup in your environment:

  • Run the commands as root user.
  • Replace the IPs and FQDNs mentioned here with your own.

VMs Specifications

This setup consists of five VMs, each with the specifications shown in the following table:

Property      Value
CPUs          4
Memory        8 GB
Storage       100 GB
Network       1 Gbps
Architecture  x86_64
OS            CentOS 7

Cluster Setup

Setting up an ELK cluster means installing the software packages implementing its components. Some configuration must also be done as a preparation, before starting with the configuration of the cluster itself.

DNS and Roles

The following table shows the DNS configuration and the role each machine plays in the cluster.

...

A cluster node is a node that joins the cluster. In this setup, the cluster nodes are the master-eligible/data nodes and the coordinating node. The pipeline node is not a cluster node, because it doesn’t join the cluster.

Package Installation

A cluster is a collection of nodes.

...

All the packages implementing the cluster's components (elasticsearch, logstash, kibana, filebeat) must be of the same version. This setup uses version 7.8.0.

System Configuration

Each node’s hostname is set to its FQDN, according to the values shown in the VMs DNS table. This value is referenced in the configuration file of Elasticsearch, and is also used in certificates for hostname validation.

...

NOTE

The node wifimon-kibana.example.org is also used for monitoring purposes, and this is the reason port 9200/tcp is open in the firewall.

There's no need to open port 9200/tcp for querying the cluster. Queries can only be issued locally, by running Elasticsearch REST API commands on the cluster node you are currently logged in to. For more information on querying the cluster see Cluster Exploration.

This setup uses firewalld to implement the firewall. On each node, a custom "wifimon" zone is created to hold the specific configuration. On the wifimon-kibana.example.org node, some additional configuration goes into the public zone to allow access for the Kibana platform and the cluster components.

...

NOTE

In the wifimon-components ipset, 10.10.10.111 and 192.168.1.15 are the IPs of the servers where Filebeat agents are installed (see the configuration of wifimon-logstash.example.org below). The IPs of the other components are described in the Streaming Logs Into ELK Cluster section.

On wifimon-node{1,2,3}.example.org:

...

Code Block
firewall-cmd --zone=wifimon --list-ports 
5044/tcp

firewall-cmd --zone=wifimon --list-sources
10.10.10.111 192.168.1.15

SSL/TLS Certificates

Cluster communication is secured with SSL/TLS encryption. The elasticsearch-certutil utility was used to generate a CA certificate, which in turn signs the certificates of the cluster components. This utility comes with the elasticsearch installation; in this case, the one installed on the wifimon-kibana.example.org node was used.

...

For more information on elasticsearch-certutil see its documentation page.

Cluster Configuration

Configuring a cluster means configuring the nodes it consists of, which in turn means defining cluster-general and node-specific settings. Elasticsearch defines these settings in configuration files located under the /etc/elasticsearch directory.

JVM Options

JVM options are defined in the /etc/elasticsearch/jvm.options file. By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum size of 1 GB. The more heap is available, the more memory Elasticsearch can use for caching; however, it is recommended to use no more than 50% of the total memory for the heap.
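
As a rough illustration of the 50% guideline, the sketch below derives the two heap flags from a machine's total memory. The heap_options helper is hypothetical (not part of Elasticsearch), and keeping the minimum and maximum equal simply mirrors the default described above; the 8 GB figure comes from the VM specifications table.

```shell
# heap_options: hypothetical helper that prints the JVM heap flags
# for a heap of half the given total memory (whole GB).
heap_options() {
    total_gb="$1"
    heap_gb=$(( total_gb / 2 ))
    # Minimum and maximum heap are kept equal, as in the default settings.
    printf '%s\n' "-Xms${heap_gb}g" "-Xmx${heap_gb}g"
}

# Each VM in this setup has 8 GB of memory.
heap_options 8
```

For an 8 GB VM this yields -Xms4g and -Xmx4g, which would replace the default 1 GB values in jvm.options.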

...

NOTE

On a running elasticsearch instance:

If the command "systemctl -l status elasticsearch.service" produces the warning:

OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.

then (according to JEP 291) comment out the option "-XX:+UseConcMarkSweepGC" and set the option "-XX:+UseG1GC".

If the file "/var/log/elasticsearch/wifimon_deprecation.log" contains warnings like:

transport.publish_address was printed as [ip:port] instead of [hostname/ip:port]. This format is deprecated and will change to [hostname/ip:port] in a future version. Use -Des.transport.cname_in_publish_address=true to enforce non-deprecated formatting.

then follow the recommendation, that is, set the option "-Des.transport.cname_in_publish_address=true".

Master-Eligible / Data Nodes

In a cluster of many nodes with heavy data traffic, it is recommended to keep the master-eligible and data nodes separate, each dedicated to its own role. In this setup, however, three nodes are configured with both roles.

By default a node is a master-eligible, data, and ingest node, which means (a) it can be elected as master node to control the cluster, (b) it can hold data and perform operations on them, and (c) it is able to filter and enrich a data document before it is indexed. Since this setup has a dedicated pipeline node with filtering and enriching capabilities, the ingest feature is not strictly needed; it has nevertheless been enabled because it is used for monitoring purposes.

NOTE

Elasticsearch keystore should be configured before running this configuration.

...

For more information about the aforementioned settings see Node, Network Settings, Important discovery and cluster formation settings, and Secure a cluster.

Coordinating Node

A coordinating node is a node that has the node.master, node.data, and node.ingest settings set to false. What remains is a node that effectively behaves as a load balancer, routing requests to the appropriate nodes in the cluster.

...

Below is the configuration of wifimon-kibana.example.org as an Elasticsearch coordinating node. It follows the same pattern as the master-eligible/data nodes, but with their functionalities set to false.

NOTE

Elasticsearch keystore should be configured before running this configuration.

...

Code Block
/etc/elasticsearch/elasticsearch.yml
cluster.name: wifimon
node.name: ${HOSTNAME}
node.master: false
node.voting_only: false
node.data: false
node.ingest: false
node.ml: false
cluster.remote.connect: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: wifimon-kibana.example.org
discovery.seed_hosts: [
    "wifimon-node1.example.org",
    "wifimon-node2.example.org",
    "wifimon-node3.example.org"
]
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: full
xpack.security.http.ssl.key: /etc/elasticsearch/certs/kibana.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/kibana.crt
xpack.security.http.ssl.certificate_authorities: /etc/elasticsearch/certs/ca.crt
xpack.security.transport.ssl.key: /etc/elasticsearch/certs/kibana.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/kibana.crt
xpack.security.transport.ssl.certificate_authorities: /etc/elasticsearch/certs/ca.crt
xpack.monitoring.enabled: true
xpack.monitoring.collection.enabled: true

Setup Passwords

Elasticsearch comes with built-in users, each with its own set of privileges. Their passwords are not set initially, so these users cannot be used for authentication until passwords are assigned.

...

For more information, see Built-in users.

Kibana Platform

Kibana is a browser-based interface that allows for searching, viewing, and interacting with the data stored in the cluster. It’s a visualization platform for creating charts, tables, and maps to represent the data. Kibana should be configured on an Elasticsearch node. The configuration of Kibana is done by editing the /etc/kibana/kibana.yml file.

NOTE

Kibana keystore should be configured before running this configuration.

...

For more information on Kibana configuration settings, see Configuring Kibana.

Cluster Exploration

Even though it is possible to explore the cluster by using the Kibana platform, this section is about querying the cluster by using the REST API provided by Elasticsearch. The querying commands are executed on the wifimon-kibana.example.org node, and the user elastic is used for authentication.

...

Start the elasticsearch instance on wifimon-node1.example.org node and query the cluster again. The wifimon-node1.example.org will join the cluster and the status of the cluster will become green, while wifimon-node3.example.org continues to be the master node.

Filebeat Configuration

Filebeat monitors log files for new content, collects log events, and forwards them to Elasticsearch, either directly or via Logstash. Filebeat terminology distinguishes a) the input, which looks in the configured log data locations, b) the harvester, which reads a single log for new content and sends new log data to libbeat, and c) the output, to which libbeat sends the aggregated data. For more information see Filebeat overview.

...

Code Block
/tmp/dhcp_sample_logs
Jun 18 19:15:20 centos dhcpd[11223]: DHCPREQUEST for 192.168.1.200 from a4:c4:94:cd:35:70 (galliumos) via wlp6s0
Jun 18 19:15:20 centos dhcpd[11223]: DHCPACK on 192.168.1.200 to a4:c4:94:cd:35:70 (galliumos) via wlp6s0

File Output

As mentioned above, Filebeat will first be configured to dump the output in a file. The configuration file of Filebeat for each agent is shown below. It configures an input of type log, which is needed to read lines from log files. It also configures the output, which sets the path and the filename to dump the data in, and finally the processors section, which drops some fields Filebeat adds by default and adds the logtype field used in the Logstash beats-pipeline output.

RADIUS Server

The following is the Filebeat configuration on the RADIUS server, which dumps the data in the /tmp/sample_logs_output.json file.

...

The logs are located in the message field. The logtype field holds the radius value, which differentiates these events from the dhcp ones when they arrive at the Logstash pipeline.

DHCP Server

The following is the Filebeat configuration on the DHCP server, which dumps the data in the /tmp/sample_logs_output.json file.

...

The logtype field contains the dhcp value, which differentiates these events from the radius ones when they arrive at the Logstash pipeline.

Filtering Log Events

Apart from adding or dropping named fields, processors can also be used to filter log events when certain criteria are met. For example, to send out only the log events containing the value Eduroam in the NAS-Identifier field, the following configuration could be applied.

...

For more information on configuring processors see Filter and enhance the exported data.

Logstash Output

This section shows how to configure Filebeat’s logstash output to feed the pipeline node.

NOTE

Filebeat keystore should be configured before running this configuration.

...

The hosts setting specifies the node and port where the Logstash service listens for incoming log events. The ${key_passphrase} references the passphrase of filebeat.key stored in the Filebeat keystore. This setup uses mutual SSL/TLS authentication: the client (Filebeat) must present a certificate to the server (Logstash), or the connection won't be established.

...

The above command loads the template from wifimon-kibana.example.org node where elasticsearch is installed. Detailed information is written in the Filebeat log file.

Monitoring

The Kibana platform allows for monitoring the health of Filebeat service. For this to happen, the following configuration must be added in the /etc/filebeat/filebeat.yml file.

NOTE

Filebeat keystore should be configured before running this configuration.

...

The ${beats_system_password} references the password of the beats_system built-in user which is stored in Filebeat keystore.

Logstash Configuration

Logstash is a data collection engine with real-time pipelining capabilities. A Logstash pipeline consists of three elements: the input, the filter, and the output. The input plugins consume data coming from a source, the filter plugins modify the data as specified, and the output plugins send data to a defined destination. In this setup, data comes from Filebeat agents, whose logstash output is configured to feed the Logstash instance on port 5044/tcp.

NOTE

Logstash keystore should be configured before running the configurations provided here.

JVM Options

The JVM Options for Logstash are defined in the /etc/logstash/jvm.options file. The configuration is the same as the one configuring the JVM Options of Elasticsearch.

Logstash Settings

Logstash settings are defined in the /etc/logstash/logstash.yml file, which contains the following:

...

NOTE

If the Logstash logs contain the warning:

[2020-07-22T13:09:07,993][WARN ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] ** WARNING ** Detected UNSAFE options in elasticsearch output configuration!
** WARNING ** You have enabled encryption but DISABLED certificate verification.
** WARNING ** To make sure your data is secure change :ssl_certificate_verification to true

then you can ignore it. According to https://github.com/elastic/logstash/issues/10352 it is a false warning.

Logstash Pipelines

Logstash pipelines are defined in the /etc/logstash/pipelines.yml file, which contains:

...

For each pipeline, an id and a configuration file are defined. The beats-pipeline functions as a gate, receiving logs from both streams (radius and dhcp) and then forwarding them to the proper pipeline.

Beats Pipeline

As mentioned above, the beats-pipeline acts as a receiver and forwarder of log events coming from the RADIUS and DHCP streams. It configures only the input and output elements, with no filter element.

...

The beats plugin configures Logstash to listen on port 5044. It also provides settings for SSL/TLS encryption and forces the peer (Filebeat) to provide a certificate for identification. The output defines which pipeline to forward the data to, based on the value of logtype field sent from Filebeat agent.
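
The routing decision described above can be pictured with a small shell sketch. It is not Logstash configuration: route_event is a hypothetical helper, and the "unrouted" fallback is an assumption, since the behaviour for unknown logtype values isn't stated here. The pipeline names are the ones used in this setup.

```shell
# route_event: hypothetical helper that maps the logtype value sent by
# a Filebeat agent to the name of the pipeline that should receive the event.
route_event() {
    case "$1" in
        radius) echo "radius-pipeline" ;;
        dhcp)   echo "dhcp-pipeline" ;;
        *)      echo "unrouted" ;;   # assumed fallback for unknown values
    esac
}

route_event radius
route_event dhcp
```

The point of the sketch is only that the beats-pipeline makes no change to the event itself; it inspects one field and hands the event on.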

RADIUS Pipeline

The radius-pipeline is configured in the /etc/logstash/conf.d/radius-pipeline.conf file. It receives RADIUS log-events sent from the beats-pipeline.

...

The output defines the stdout plugin which dumps the filtered data in the standard output, allowing for testing a data flow of Filebeat → Logstash → Logstash_STDOUT.

DHCP Pipeline

The dhcp-pipeline is configured in the /etc/logstash/conf.d/dhcp-pipeline.conf file. It receives DHCP log-events sent from the beats-pipeline.

...

The output defines the stdout plugin which dumps the filtered data in the standard output, allowing for testing a data flow of Filebeat → Logstash → Logstash_STDOUT.
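
To give a feel for the kind of field extraction a DHCP filter performs, the following sketch pulls the IP and MAC address out of a DHCPACK line like the one in the sample logs. It uses sed rather than Logstash filter plugins; parse_dhcpack is a hypothetical helper, and the output field names ip and mac are assumptions, not necessarily the ones used by the dhcp-pipeline.

```shell
# parse_dhcpack: hypothetical helper; reads syslog lines on stdin and prints
# "ip=<address> mac=<address>" for every DHCPACK line, ignoring all others.
parse_dhcpack() {
    sed -nE 's/^.*DHCPACK on ([0-9.]+) to ([0-9a-fA-F:]+) .*$/ip=\1 mac=\2/p'
}

# Feed the two sample lines from /tmp/dhcp_sample_logs through the helper.
parse_dhcpack <<'EOF'
Jun 18 19:15:20 centos dhcpd[11223]: DHCPREQUEST for 192.168.1.200 from a4:c4:94:cd:35:70 (galliumos) via wlp6s0
Jun 18 19:15:20 centos dhcpd[11223]: DHCPACK on 192.168.1.200 to a4:c4:94:cd:35:70 (galliumos) via wlp6s0
EOF
```

Only the DHCPACK line produces output; the DHCPREQUEST line is skipped.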

Streaming to STDOUT

With the Filebeat agents configured to feed Logstash, and the Logstash pipelines configured to dump data to STDOUT, it is possible to test the data flow Filebeat → Logstash → Logstash_STDOUT.

...

You may have noticed in the output of radius-pipeline that the value of NAS-IP-Address has been changed from a private IP to 162.13.218.132 (www.geant.org). This was done intentionally in order to see the results of the geoip filter, which returns nothing for private IPs.

Streaming Logs Into Cluster

Until now the streaming of data has been triggered manually by using the sample data. This allowed for testing the configuration of Filebeat and Logstash, and also having a first view of results.

This section is about configuring the components to point to real data files and implementing streaming along the path Filebeat → Logstash → Elasticsearch.

Filebeat Inputs

In the /etc/filebeat/filebeat.yml file under the filebeat.inputs, the paths should now point to the full path in the filesystem where the RADIUS or the DHCP logs are located.

...

Multiple files can be given to paths setting as a list or as a glob-based pattern.

Create User and Role

In order to send log events to the cluster, the user logstash_writer with the role logstash_writer_role must be created. The role assigns the cluster permissions monitor and manage_index_templates, and the privileges write and create_index on the radiuslogs and dhcplogs indices. With these permissions, the logstash_writer user is able to write data into those indices.
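
The role described above could be expressed as a request body along the following lines. This is a sketch built only from the permissions listed in the paragraph; role_body is a hypothetical helper, and the commented curl invocation is an assumption that mirrors the command style used elsewhere in this document.

```shell
# role_body: hypothetical helper that prints a request body for
# logstash_writer_role with the permissions listed above.
role_body() {
    cat <<'EOF'
{
  "cluster": ["monitor", "manage_index_templates"],
  "indices": [
    {
      "names": ["radiuslogs", "dhcplogs"],
      "privileges": ["write", "create_index"]
    }
  ]
}
EOF
}

role_body

# Assumed invocation (reads the body from stdin via -d @-):
# role_body | curl -X POST --cacert /etc/elasticsearch/certs/ca.crt --user elastic \
#   'https://wifimon-kibana.example.org:9200/_security/role/logstash_writer_role?pretty' \
#   -H 'Content-Type: application/json' -d @-
```

Keeping the body in a function makes it easy to inspect before anything is sent to the cluster.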

...

Code Block
set +o history
curl -X POST --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_security/user/logstash_writer?pretty' -H 'Content-Type: application/json' -d'
{
  "username": "logstash_writer",
  "roles": ["logstash_writer_role"],
  "full_name": null,
  "email": null,
  "password": "some-password-goes-here",
  "enabled": true
}
'
set -o history

Logstash Output

On radius-pipeline and dhcp-pipeline configuration files, the output should be configured to send data to Elasticsearch cluster. This is done by configuring the Logstash output elasticsearch plugin.

...

Code Block
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_cat/indices/radiuslogs?v'
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_cat/indices/dhcplogs?v'

ILM Configuration

WiFiMon is not intended to keep the logs forever; they are only needed for a limited period of time. Since new log events keep arriving, old logs should be deleted once that period has passed.

Logs are stored in the radiuslogs and dhcplogs indices. Index lifecycle management is achieved by creating and applying ILM policies, which can trigger actions on indices based on certain criteria. More information about ILM can be found on the ILM Overview page.

Create Policy

This setup deletes an index when it’s one day old. Run the following command on the wifimon-kibana.example.org node to create the wifimon_policy policy.
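
A policy with that behaviour could look roughly like the body below: a single delete phase that becomes active once the index is one day old. This is a minimal sketch under that assumption; policy_body is a hypothetical helper, and the commented curl invocation is likewise assumed, modelled on the other commands in this document.

```shell
# policy_body: hypothetical helper that prints an ILM policy body
# with a single delete phase entered at an index age of one day.
policy_body() {
    cat <<'EOF'
{
  "policy": {
    "phases": {
      "delete": {
        "min_age": "1d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
EOF
}

policy_body

# Assumed invocation (reads the body from stdin via -d @-):
# policy_body | curl -X PUT --cacert /etc/elasticsearch/certs/ca.crt --user elastic \
#   'https://wifimon-kibana.example.org:9200/_ilm/policy/wifimon_policy?pretty' \
#   -H 'Content-Type: application/json' -d @-
```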

...

Code Block
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic "https://wifimon-kibana.example.org:9200/_ilm/policy/wifimon_policy?pretty"

Apply Policy

The policy must be associated with the indices on which it will trigger the configured actions. For this to happen, the policy must be configured in the index template used to create the index.
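
One way to picture the association is a template body that names the policy in its index settings. The sketch below is an assumption-laden illustration: template_body is a hypothetical helper, the index_patterns values are guesses derived from the index names used in this setup, and the commented curl invocation targets the wifimon_template name that is queried in this section.

```shell
# template_body: hypothetical helper that prints an index template body
# attaching wifimon_policy to every index matching the assumed patterns.
template_body() {
    cat <<'EOF'
{
  "index_patterns": ["radiuslogs*", "dhcplogs*"],
  "settings": {
    "index.lifecycle.name": "wifimon_policy"
  }
}
EOF
}

template_body

# Assumed invocation (reads the body from stdin via -d @-):
# template_body | curl -X PUT --cacert /etc/elasticsearch/certs/ca.crt --user elastic \
#   'https://wifimon-kibana.example.org:9200/_template/wifimon_template?pretty' \
#   -H 'Content-Type: application/json' -d @-
```

Indices created after the template exists pick up the policy automatically; indices that already exist are not affected by a template change.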

...

Code Block
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic "https://wifimon-kibana.example.org:9200/_template/wifimon_template?pretty"

Logstash Output

The Logstash elasticsearch output plugin provides settings to control the Index Lifecycle Management. Include the ILM settings on radius-pipeline and dhcp-pipeline configuration files, so that the elasticsearch output plugin becomes:

...

Restart the logstash service to apply the new settings.

Keystores

Rather than hardcoding sensitive information in the configuration files and relying only on filesystem permissions to protect it, it is recommended to make use of the keystores provided by the Elasticsearch components.

Elasticsearch

To configure Elasticsearch keystore run the following commands on each cluster node.

...

Code Block
/usr/share/elasticsearch/bin/elasticsearch-keystore list
keystore.seed
xpack.security.http.ssl.secure_key_passphrase
xpack.security.transport.ssl.secure_key_passphrase

Kibana

To configure Kibana keystore run the following commands on wifimon-kibana.example.org node.

...

Code Block
sudo -u kibana /usr/share/kibana/bin/kibana-keystore list
server.ssl.keyPassphrase
elasticsearch.username
elasticsearch.password

Logstash

To configure Logstash keystore run the following commands on wifimon-logstash.example.org node.

...

Code Block
/usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash/ list
fingerprint_key
logstash_system_password
logstash_writer_password
pkcs8_key_passphrase

Filebeat

To configure Filebeat keystore run the following commands on the servers where Filebeat is installed.

...

Code Block
filebeat keystore list
beats_system_password
key_passphrase

References

The following links were very useful while writing this material and performing the tests mentioned in it.

...