Introduction
To achieve its purpose, correlating user information with network performance data, WiFiMon needs RADIUS and/or DHCP logs to be streamed into Elasticsearch. For that purpose, an ELK cluster was built on five VMs: three configured as Elasticsearch master-eligible and data nodes, one configured as a coordinating node that also hosts Kibana, and one dedicated to Logstash.
...
Note | ||
---|---|---|
| ||
To implement this setup in your environment:
|
VMs Specifications
This setup consists of five VMs, each with the specifications shown in the following table:
Property | Value |
---|---|
CPUs | 4 |
Memory | 8 GB |
Storage | 100 GB |
Network | 1 Gbps |
Architecture | x86_64 |
OS | CentOS 7 |
Cluster Setup
Setting up an ELK cluster means installing the software packages implementing its components. Some configuration must also be done as a preparation, before starting with the configuration of the cluster itself.
The following table shows the DNS configuration and the role each machine plays in the cluster.
...
A cluster node is a node that joins the cluster. In this setup, the cluster nodes are the master-eligible/data nodes and the coordinating node. The pipeline node is not a cluster node because it does not join the cluster.
Package Installation
A cluster is a collection of nodes.
...
All the packages implementing the cluster's components (elasticsearch, logstash, kibana, filebeat) must be of the same version. This setup uses version 7.8.0.
System Configuration
Each node’s hostname is set to its FQDN, according to the values shown in the VMs DNS table. This value is referenced in the configuration file of Elasticsearch, and is also used in certificates for hostname validation.
...
Node | Open Port |
---|---|
wifimon-node{1,2,3}.example.org | 9300/tcp |
wifimon-kibana.example.org | 9200/tcp, 9300/tcp, 5601/tcp |
wifimon-logstash.example.org | 5044/tcp |
Port 9300/tcp is used for internal communication between cluster nodes. Port 5044/tcp is where Logstash listens for beats of log events sent from Filebeat. Port 5601/tcp is used to access Kibana platform from the browser. Port 9200/tcp is used to query the cluster.
Note | ||
---|---|---|
| ||
The node wifimon-kibana.example.org is also used for monitoring purposes, which is why port 9200/tcp is open in its firewall. There is no need to open port 9200/tcp just for querying the cluster; this can be done locally by running Elasticsearch REST API commands on the cluster node you are currently logged in to. For more information on querying the cluster see Cluster Exploration. |
This setup uses firewalld to implement the firewall. On each node a custom "wifimon" zone is created to hold the specific configuration. On the wifimon-kibana.example.org node, some additional configuration goes into the public zone to allow access for the Kibana platform and the cluster components.
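The listings in this section only show the resulting state of each zone. As a rough sketch of how such a zone could be built, the helper below prints (it does not run) the `firewall-cmd` calls for a custom zone. The function name `wifimon_zone_cmds` is hypothetical, and the exact commands used in the original setup are not documented here, so treat this as one possible approach rather than the procedure that was followed.

```shell
#!/usr/bin/env bash
# Print, without executing, the firewall-cmd calls that would create a
# "wifimon" zone with one open port and a list of allowed sources.
# Hypothetical helper for illustration only.
# Usage: wifimon_zone_cmds PORT SOURCE [SOURCE...]
wifimon_zone_cmds() {
  local port="$1"
  shift
  echo "firewall-cmd --permanent --new-zone=wifimon"
  echo "firewall-cmd --permanent --zone=wifimon --add-port=${port}"
  local src
  for src in "$@"; do
    echo "firewall-cmd --permanent --zone=wifimon --add-source=${src}"
  done
  # Permanent changes only take effect after a reload.
  echo "firewall-cmd --reload"
}

# Example using the port and sources listed for wifimon-node2.example.org.
wifimon_zone_cmds 9300/tcp 10.0.0.1/32 10.0.0.3/32 10.0.0.4/32
```

Printing the commands first makes it easy to review them before piping the output to a root shell on the target node.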
On wifimon-kibana.example.org:
Code Block |
---|
firewall-cmd --zone=public --list-ports
5601/tcp
firewall-cmd --zone=public --list-rich-rules
rule family="ipv4" source ipset="wifimon-components" port port="9200" protocol="tcp" accept
firewall-cmd --ipset=wifimon-components --get-entries
10.0.0.5
10.10.10.111
192.168.1.15
firewall-cmd --zone=wifimon --list-ports
9300/tcp
firewall-cmd --zone=wifimon --list-sources
10.0.0.1/32 10.0.0.2/32 10.0.0.3/32 |
On wifimon-node1.example.org:
Code Block |
---|
firewall-cmd --zone=wifimon --list-ports
9300/tcp
firewall-cmd --zone=wifimon --list-sources
10.0.0.2/32 10.0.0.3/32 10.0.0.4/32 |
On wifimon-node2.example.org:
Code Block |
---|
firewall-cmd --zone=wifimon --list-ports
9300/tcp
firewall-cmd --zone=wifimon --list-sources
10.0.0.1/32 10.0.0.3/32 10.0.0.4/32 |
Note | ||
---|---|---|
| ||
In the wifimon-components ipset, |
On wifimon-node3.example.org:
Code Block |
---|
firewall-cmd --zone=wifimon --list-ports
9300/tcp
firewall-cmd --zone=wifimon --list-sources
10.0.0.1/32 10.0.0.2/32 10.0.0.4/32 |
On wifimon-logstash.example.org:
Code Block | |
---|---|
firewall-cmd --zone=wifimon --list-ports
5044/tcp
firewall-cmd --zone=wifimon --list-sources
10.10.10.111/32 192.168.1.15/32 | |
Note | |
NOTE | |
In the configuration of Logstash firewall, |
SSL/TLS Certificates
The cluster communication is secured by configuring SSL/TLS encryption. The elasticsearch-certutil utility was used to generate a CA certificate, which signs the certificates of the cluster components. This utility comes with the Elasticsearch installation; in this case, the one installed on the wifimon-kibana.example.org node was used.
...
For more information on elasticsearch-certutil see its documentation page.
Cluster Configuration
Configuring a cluster means configuring the nodes it consists of, which in turn means defining cluster-general and node-specific settings. Elasticsearch defines these settings in configuration files located under the /etc/elasticsearch directory.
JVM Options
JVM options are defined in the /etc/elasticsearch/jvm.options file. By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum size of 1 GB. The more heap available, the more memory Elasticsearch can use for caching; however, it is recommended to give the heap no more than 50% of the total memory.
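To make the 50% guideline concrete, the following sketch derives the two heap lines for jvm.options from a memory size given in kB. The function name `heap_lines` is hypothetical, and the sketch assumes that minimum and maximum heap are set to the same value, as in the default configuration.

```shell
#!/usr/bin/env bash
# Print the -Xms/-Xmx lines for jvm.options, sized at half of the
# given total memory. Hypothetical helper for illustration only.
# Usage: heap_lines TOTAL_KB
heap_lines() {
  local total_kb="$1"
  local heap_mb=$(( total_kb / 1024 / 2 ))
  printf '%s\n' "-Xms${heap_mb}m" "-Xmx${heap_mb}m"
}

# On a Linux host, MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
  heap_lines "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
fi
```

For a VM with exactly 8 GB of memory, as in the specifications table, `heap_lines 8388608` yields a 4096 MB heap. The value reported by /proc/meminfo is usually slightly below the nominal size, so the computed heap will be slightly smaller as well.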
...
Note | ||
---|---|---|
| ||
On a running elasticsearch instance: If the command "systemctl -l status elasticsearch.service" produces the warning:
then (according to JEP 291) comment out the option "-XX:+UseConcMarkSweepGC" and set the option "-XX:+UseG1GC". If the file "/var/log/elasticsearch/wifimon_deprecation.log" contains warnings like:
then proceed with the recommendation, that is, set the option "-Des.transport.cname_in_publish_address=true". |
Master-Eligible / Data Nodes
In a heavy data traffic cluster of many nodes, it is recommended to have the master-eligible and data nodes separated and dedicated to their own role. In this setup, however, there are three nodes configured as having both functionalities.
By default, a node is a master-eligible, data, and ingest node, which means (a) it can be elected as master node to control the cluster, (b) it can hold data and perform operations on it, and (c) it can filter and enrich a document before it is indexed. With a dedicated pipeline node providing the filtering and enriching capabilities, the ingest feature is not needed for processing logs; it has nevertheless been enabled because it is used for monitoring purposes.
Note | ||
---|---|---|
| ||
Elasticsearch keystore should be configured before running this configuration. |
...
For more information about the aforementioned settings see Node, Network Settings, Important discovery and cluster formation settings, and Secure a cluster.
Coordinating Node
A coordinating node is a node that has the node.master, node.data, and node.ingest settings set to false. What remains is a node that behaves as a load balancer, routing requests to the appropriate nodes in the cluster.
...
Below is the configuration of wifimon-kibana.example.org as an Elasticsearch coordinating node. It follows the same pattern as the master-eligible/data nodes, but with their functionalities set to false.
Note | ||
---|---|---|
| ||
Elasticsearch keystore should be configured before running this configuration. |
...
Code Block | ||
---|---|---|
| ||
cluster.name: wifimon
node.name: ${HOSTNAME}
node.master: false
node.voting_only: false
node.data: false
node.ingest: false
node.ml: false
cluster.remote.connect: false
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: wifimon-kibana.example.org
discovery.seed_hosts: [ "wifimon-node1.example.org", "wifimon-node2.example.org", "wifimon-node3.example.org" ]
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: full
xpack.security.http.ssl.key: /etc/elasticsearch/certs/kibana.key
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/kibana.crt
xpack.security.http.ssl.certificate_authorities: /etc/elasticsearch/certs/ca.crt
xpack.security.transport.ssl.key: /etc/elasticsearch/certs/kibana.key
xpack.security.transport.ssl.certificate: /etc/elasticsearch/certs/kibana.crt
xpack.security.transport.ssl.certificate_authorities: /etc/elasticsearch/certs/ca.crt
xpack.monitoring.enabled: true
xpack.monitoring.collection.enabled: true |
Setup Passwords
Elasticsearch comes with built-in users configured, each of them having a set of privileges but with their passwords not set, and consequently unable to be used for authentication.
...
For more information on Built-in users follow the link.
Kibana Platform
Kibana is a browser-based interface that allows for searching, viewing, and interacting with the data stored in the cluster. It’s a visualization platform for creating charts, tables, and maps to represent the data. Kibana should be configured in an Elasticsearch node. The configuration of Kibana is done by editing the /etc/kibana/kibana.yml file.
Note | ||
---|---|---|
| ||
Kibana keystore should be configured before running this configuration. |
...
For more information on Kibana configuration settings, see Configuring Kibana.
Even though it is possible to explore the cluster by using the Kibana platform, this section is about querying the cluster by using the REST API provided by Elasticsearch. The querying commands are executed in wifimon-kibana.example.org node and the user elastic is used for authentication.
...
Start the elasticsearch instance on wifimon-node1.example.org node and query the cluster again. The wifimon-node1.example.org will join the cluster and the status of the cluster will become green, while wifimon-node3.example.org continues to be the master node.
Filebeat Configuration
Filebeat monitors log files for new content, collects log events, and forwards them to Elasticsearch, either directly or via Logstash. In Filebeat terms one speaks about a) the input which looks in the configured log data locations, b) the harvester which reads a single log for new content and sends new log data to libbeat, and c) the output which aggregates and sends data to the configured output. For more information see Filebeat overview.
...
Code Block | ||
---|---|---|
| ||
Jun 18 19:15:20 centos dhcpd[11223]: DHCPREQUEST for 192.168.1.200 from a4:c4:94:cd:35:70 (galliumos) via wlp6s0
Jun 18 19:15:20 centos dhcpd[11223]: DHCPACK on 192.168.1.200 to a4:c4:94:cd:35:70 (galliumos) via wlp6s0 |
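Before pointing Filebeat at a DHCP log, it can help to confirm that the file really contains lines in this layout. The sketch below is not part of the setup itself: it prints the IP and MAC address of every DHCPACK line, assuming the field positions of the sample lines shown here. The function name `dhcpack_pairs` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "IP MAC" for every DHCPACK line read from stdin.
# Field positions assume the syslog layout of the sample lines:
#   month day time host dhcpd[pid]: DHCPACK on IP to MAC ...
dhcpack_pairs() {
  awk '$6 == "DHCPACK" && $7 == "on" { print $8, $10 }'
}

# Feed the two sample lines; only the DHCPACK line produces output.
dhcpack_pairs <<'EOF'
Jun 18 19:15:20 centos dhcpd[11223]: DHCPREQUEST for 192.168.1.200 from a4:c4:94:cd:35:70 (galliumos) via wlp6s0
Jun 18 19:15:20 centos dhcpd[11223]: DHCPACK on 192.168.1.200 to a4:c4:94:cd:35:70 (galliumos) via wlp6s0
EOF
```

If the command prints nothing for a real log file, the DHCP server probably logs in a different format and the field positions need adjusting.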
File Output
As mentioned above, Filebeat is first configured to dump its output to a file. The Filebeat configuration file for each agent is shown below. It configures an input of type log, which is needed to read lines from log files. It also configures the output, which sets the path and the filename to dump the data in, and finally the processors section, which drops some fields Filebeat adds by default and adds the logtype field used in the Logstash beats-pipeline output.
RADIUS Server
The following is the Filebeat configuration on the RADIUS server, which dumps the data in the /tmp/sample_logs_output.json file.
...
Code Block | ||
---|---|---|
| ||
{"@timestamp":"2020-06-28T13:07:37.183Z","@metadata":{"beat":"filebeat","type":"_doc","version":"7.8.0"},"logtype":"radius","message":"Sun Mar 10 08:16:05 2019\n\tService-Type = Framed-User\n\tNAS-Port-Id = \"wlan2\"\n\tNAS-Port-Type = Wireless-802.11\n\tUser-Name = \"username@example.org\"\n\tAcct-Session-Id = \"82c000cd\"\n\tAcct-Multi-Session-Id = \"CC-2D-E0-9A-EB-A3-88-75-98-6C-31-AA-82-C0-00-00-00-00-00-CD\"\n\tCalling-Station-Id = \"88-75-98-6C-31-AA\"\n\tCalled-Station-Id = \"CC-2D-E0-9A-EB-A3:eduroam\"\n\tAcct-Authentic = RADIUS\n\tAcct-Status-Type = Start\n\tNAS-Identifier = \"Eduroam\"\n\tAcct-Delay-Time = 0\n\tNAS-IP-Address = 192.168.192.111\n\tEvent-Timestamp = \"Mar 8 2019 08:16:05 CET\"\n\tTmpString-9 = \"ai:\"\n\tAcct-Unique-Session-Id = \"e5450a4e16d951436a7c241eaf788f9b\"\n\tRealm = \"example.org\"\n\tTimestamp = 1552029365"} |
The logs are located in the message field. The logtype field holds the radius value, thus differentiating these events from the dhcp ones when arriving at Logstash pipeline.
DHCP Server
The following is the Filebeat configuration on the DHCP server, which dumps the data in the /tmp/sample_logs_output.json file.
...
The logtype field contains the dhcp value, thus differentiating these events from the radius ones, when arriving at Logstash pipeline.
Filtering Log Events
Apart from adding or dropping named fields, processors can also be used to filter log events when certain criteria are met. For example, to send out only the log events containing the value Eduroam in the NAS-Identifier field, the following configuration could be applied.
...
For more information on configuring processors see Filter and enhance the exported data.
Logstash Output
This section shows how to configure Filebeat’s logstash output to feed the pipeline node.
Note | ||
---|---|---|
| ||
Filebeat keystore should be configured before running this configuration. |
...
The hosts setting specifies the node and port where the Logstash service listens for incoming log events. The ${key_passphrase} references the passphrase of filebeat.key, which is stored in the Filebeat keystore. The setup uses mutual SSL/TLS authentication: the client (Filebeat) must present a certificate to the server (Logstash), or the connection is not established.
...
The above command loads the template from wifimon-kibana.example.org node where elasticsearch is installed. Detailed information is written in the Filebeat log file.
Monitoring
The Kibana platform allows for monitoring the health of Filebeat service. For this to happen, the following configuration must be added in the /etc/filebeat/filebeat.yml file.
Note | ||
---|---|---|
| ||
Filebeat keystore should be configured before running this configuration. |
...
The ${beats_system_password} references the password of the beats_system built-in user which is stored in Filebeat keystore.
Logstash Configuration
Logstash is a data collection engine with real-time pipelining capabilities. A Logstash pipeline consists of three elements, the input, filter, and output. The input plugins consume data coming from a source, the filter plugins modify the data as specified, and the output plugins send data to a defined destination. In this setup data comes from Filebeat agents, with their logstash output configured to feed the Logstash instance on port 5044/tcp.
Note | ||
---|---|---|
| ||
Logstash keystore should be configured before running the configurations provided here. |
JVM Options
The JVM Options for Logstash are defined in the /etc/logstash/jvm.options file. The configuration is the same as the one configuring the JVM Options of Elasticsearch.
Logstash Settings
Logstash settings are defined in the /etc/logstash/logstash.yml file, which contains the following:
...
Note | ||
---|---|---|
| ||
If the Logstash logs contain the warning:
then you can ignore it. According to https://github.com/elastic/logstash/issues/10352 it is a false warning. |
Logstash Pipelines
Logstash pipelines are defined in the /etc/logstash/pipelines.yml file, which contains:
...
For each pipeline, an id and a configuration file are defined. The beats-pipeline functions as a gate that receives logs from both streams (radius and dhcp) and forwards them to the proper pipeline.
As mentioned above, the beats-pipeline acts as receiver and forwarder of log events coming from the RADIUS and DHCP streams. It configures only the input and output elements, with no filter element.
...
The beats plugin configures Logstash to listen on port 5044. It also provides settings for SSL/TLS encryption and forces the peer (Filebeat) to provide a certificate for identification. The output defines which pipeline to forward the data to, based on the value of logtype field sent from Filebeat agent.
RADIUS Pipeline
The radius-pipeline is configured in the /etc/logstash/conf.d/radius-pipeline.conf file. It receives RADIUS log-events sent from the beats-pipeline.
...
The output defines the stdout plugin which dumps the filtered data in the standard output, allowing for testing a data flow of Filebeat → Logstash → Logstash_STDOUT.
DHCP Pipeline
The dhcp-pipeline is configured in the /etc/logstash/conf.d/dhcp-pipeline.conf file. It receives DHCP log-events sent from the beats-pipeline.
...
The output defines the stdout plugin which dumps the filtered data in the standard output, allowing for testing a data flow of Filebeat → Logstash → Logstash_STDOUT.
Streaming to STDOUT
With the Filebeat agents configured to feed Logstash, and the Logstash pipelines configured to dump data to STDOUT, it is possible to test the data flow Filebeat → Logstash → Logstash_STDOUT.
...
You may have noticed in the output of radius-pipeline that the value of NAS-IP-Address has been changed from a private IP to 162.13.218.132 (www.geant.org). This was done intentionally in order to see the results of the geoip filter, which returns nothing for private IPs.
Streaming Logs Into Cluster
Until now the streaming of data has been triggered manually by using the sample data. This allowed for testing the configuration of Filebeat and Logstash, and also having a first view of results.
This section configures the components to point to real data files and implements streaming through the path Filebeat → Logstash → Elasticsearch.
Filebeat Inputs
In the /etc/filebeat/filebeat.yml file under the filebeat.inputs, the paths should now point to the full path in the filesystem where the RADIUS or the DHCP logs are located.
...
Multiple files can be given to the paths setting, either as a list or as a glob-based pattern.
Create User and Role
In order to send log events to the cluster, the user logstash_writer with the role logstash_writer_role must be created. The role assigns the cluster permissions of monitor and manage_index_templates, and privileges of write and create_index for radiuslogs and dhcplogs indices. Granted with these permissions, the logstash_writer user is able to write data into the index.
...
Code Block |
---|
set +o history
curl -X POST --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_security/user/logstash_writer?pretty' -H 'Content-Type: application/json' -d'
{
  "username": "logstash_writer",
  "roles": ["logstash_writer_role"],
  "full_name": null,
  "email": null,
  "password": "some-password-goes-here",
  "enabled": true
}
'
set -o history |
Logstash Output
On radius-pipeline and dhcp-pipeline configuration files, the output should be configured to send data to Elasticsearch cluster. This is done by configuring the Logstash output elasticsearch plugin.
...
Code Block |
---|
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_cat/indices/radiuslogs?v'
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic 'https://wifimon-kibana.example.org:9200/_cat/indices/dhcplogs?v' |
ILM Configuration
WiFiMon does not intend to keep the logs forever; they are only needed for a limited period of time. Since new log events keep arriving, the old logs should be deleted once that period has passed.
Logs are stored in the radiuslogs and dhcplogs indices. The index lifecycle management is achieved by creating and applying ILM policies, which can trigger actions upon indexes based on certain criteria. More information about ILM can be found at ILM Overview page.
Create Policy
This setup deletes an index when it is one day old. Run the following command on the wifimon-kibana.example.org node to create the wifimon_policy policy.
...
Code Block |
---|
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic "https://wifimon-kibana.example.org:9200/_ilm/policy/wifimon_policy?pretty" |
Apply Policy
The policy must be associated with the indexes upon which it will trigger the configured actions. For this to happen the policy must be configured in the index template used to create the index.
...
Code Block |
---|
curl -XGET --cacert /etc/elasticsearch/certs/ca.crt --user elastic "https://wifimon-kibana.example.org:9200/_template/wifimon_template?pretty" |
Logstash Output
The Logstash elasticsearch output plugin provides settings to control the Index Lifecycle Management. Include the ILM settings on radius-pipeline and dhcp-pipeline configuration files, so that the elasticsearch output plugin becomes:
...
Restart the logstash service to apply the new settings.
Keystores
Rather than hardcoding sensitive information in the configuration files and protecting it only with filesystem permissions, it is recommended to use the keystores provided by the Elasticsearch components.
To configure Elasticsearch keystore run the following commands on each cluster node.
...
Code Block |
---|
/usr/share/elasticsearch/bin/elasticsearch-keystore list
keystore.seed
xpack.security.http.ssl.secure_key_passphrase
xpack.security.transport.ssl.secure_key_passphrase |
To configure Kibana keystore run the following commands on wifimon-kibana.example.org node.
...
Code Block |
---|
sudo -u kibana /usr/share/kibana/bin/kibana-keystore list
server.ssl.keyPassphrase
elasticsearch.username
elasticsearch.password |
To configure Logstash keystore run the following commands on wifimon-logstash.example.org node.
...
Code Block |
---|
/usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash/ list
fingerprint_key
logstash_system_password
logstash_writer_password
pkcs8_key_passphrase |
To configure Filebeat keystore run the following commands on the servers where Filebeat is installed.
...
Code Block |
---|
filebeat keystore list
beats_system_password
key_passphrase |
References
The following links were very useful while writing this material and performing the tests mentioned in it.
...