ntop is a small, engineering-driven company that is debt-free, has low fixed costs, and has been profitably in business for more than 10 years. We have many customers: most are individuals and small companies, but some key players in networking embed our software in their products. High Performance Network Monitoring Solutions based on Open Source and Commodity Hardware.
nProbe (via its export plugin) supports exporting flows to ElasticSearch. Setting up nProbe for the ElasticSearch export is a breeze: it boils down to specifying the --elastic option. For example, to export NetFlow flows collected on port 2058 (--collector-port 2058) to an ElasticSearch cluster running on localhost port 9200, one can use the following
nProbe will take care of pushing a template to ElasticSearch to have IP fields properly indexed, and will also POST flows in bulk to maximize the performance.
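To make the idea of posting flows in bulk more concrete, here is a minimal Python sketch of how a batch of flows can be serialized into the newline-delimited JSON body that the ElasticSearch _bulk endpoint expects. It is not nProbe's actual code: the index name "nprobe-flows", the build_bulk_body helper and the flow field names are assumptions made for illustration only.

```python
import json

def build_bulk_body(flows, index_name):
    """Serialize flows into the newline-delimited JSON used by the _bulk endpoint.

    Each document is preceded by an action line; the body must end with a newline.
    """
    lines = []
    for flow in flows:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(flow))                               # document line
    return "\n".join(lines) + "\n"

# Hypothetical flows: the field names are illustrative, not taken from the post.
flows = [
    {"IPV4_SRC_ADDR": "192.168.2.10", "IPV4_DST_ADDR": "10.0.0.1", "IN_BYTES": 1200},
    {"IPV4_SRC_ADDR": "192.168.2.11", "IPV4_DST_ADDR": "10.0.0.2", "IN_BYTES": 640},
]
body = build_bulk_body(flows, "nprobe-flows")
print(body)
```

Sending many flows in a single request amortizes the per-request overhead, which is why bulk export performs better than one POST per flow.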
We have recently made several improvements to nProbe’s performance when exporting flows to ElasticSearch (you need the latest dev version of nProbe), so we believe it is time to publish some official numbers.
Performance tests were run on a machine with an Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz and 16GB of RAM, hosting both nProbe and ElasticSearch:
OS: Ubuntu 16.04.6 LTS
nProbe v.8.7.190712 (r6564)
In order to measure the export performance, we’ve pushed NetFlow at increasing rates using pfsend as described in another post and we’ve disabled nProbe internal caches (--disable-cache).
We’ve seen that a single nProbe instance can export at most approximately 45,000 flows per second to ElasticSearch (remember that you can run one instance per core on a multicore system, all sharing the same license). Above that threshold flows are dropped, because the incoming NetFlow cannot be bulk-POSTed fast enough.
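As a rough sizing aid, the Python sketch below estimates how many nProbe instances would be needed for a given flow rate, using the 45,000 flows per second figure measured above. It assumes that throughput scales linearly with the number of instances, which these tests did not measure: the ElasticSearch cluster itself may become the bottleneck first. The instances_needed helper is hypothetical.

```python
import math

# Approximate per-instance maximum measured in the tests described above.
MAX_FLOWS_PER_INSTANCE = 45_000

def instances_needed(flows_per_second, per_instance=MAX_FLOWS_PER_INSTANCE):
    """Return the number of instances required, assuming linear scaling (an assumption)."""
    if flows_per_second <= 0:
        return 0
    return math.ceil(flows_per_second / per_instance)

for rate in (30_000, 45_000, 100_000):
    print(rate, "flows/s ->", instances_needed(rate), "instance(s)")
```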
For the sake of completeness, this is the full nprobe command used in the tests
We’ve also extended nProbe export stats shown when using option -b=1 to accurately report the rates. This allowed us to make the measurements and will also allow you to accurately monitor the performance of nProbe. Note that the drops you are seeing below are normal as we pushed nProbe above its limit to see the maximum successful flow export rate.
The main advantage of exporting directly to ELK instead of going through intermediate tools such as LogStash is that the export is more efficient and there are fewer components to configure. Please also note that you can obtain similar figures when using nProbe to export flows to Kafka using the export plugin.
Continuous packet recorders are devices that capture raw traffic to disk, providing a window into network history. When a network event occurs, they allow you to go back in time and analyse traffic down to the packet level to find the exact network activity that caused the problem.
n2disk is a software application, part of the ntop suite, that captures traffic at high speed (it relies on the PF_RING packet capture framework, which delivers line-rate packet capture up to 100 Gbit/s) and dumps it to disk using the standard PCAP format (used by packet analysis tools such as Wireshark and ntopng). Network traffic is recorded continuously, and the oldest data is overwritten as disk space fills up, in order to provide continuous recording and the best data retention time.
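The "overwrite the oldest data as disk space fills up" policy can be pictured with a small Python sketch. This is only an illustration of the general idea and not how n2disk is implemented: the RollingStore class, the file names and the sizes are all hypothetical.

```python
from collections import deque

class RollingStore:
    """Keep the most recent capture files within a fixed space budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.files = deque()  # (name, size) pairs, oldest first
        self.used = 0

    def add(self, name, size):
        """Register a new file, dropping the oldest ones until it fits.

        Returns the names of the files that were discarded.
        """
        removed = []
        while self.files and self.used + size > self.max_bytes:
            old_name, old_size = self.files.popleft()
            self.used -= old_size
            removed.append(old_name)
        self.files.append((name, size))
        self.used += size
        return removed

store = RollingStore(max_bytes=300)
for name in ("1.pcap", "2.pcap", "3.pcap"):
    store.add(name, 100)
print(store.add("4.pcap", 100))  # the oldest file is discarded to make room
```

The larger the budget, the further back in time the remaining files reach, which is the retention time discussed in this post.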
Besides storing network data to disk, n2disk can also:
Index and organize data in a timeline, so that you can retrieve traffic by searching for packets matching a specific BPF filter in a selected time interval.
Compress data to save disk space (if you compile pcap-based applications on top of PF_RING-aware libpcap, any application compatible with the PCAP format can read compressed pcap files seamlessly).
Filter traffic, up to L7: you can discard traffic matching selected application protocols.
Shunt traffic: you can save disk space by recording only a few packets from the beginning of each communication for selected application protocols (e.g. encrypted or multimedia elephant flows).
Slice packets: you can reduce packet size by cutting packets (e.g. up to the IP or TCP/UDP header).
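Shunting, listed above, can be sketched in a few lines of Python: only the first packets of each communication are kept and the rest are discarded. This is a simplified illustration rather than n2disk's implementation. For brevity it applies the limit to every flow, whereas n2disk applies it to selected application protocols, and the shunt helper and flow keys are hypothetical.

```python
from collections import defaultdict

def shunt(packets, max_packets_per_flow):
    """Keep only the first max_packets_per_flow packets of each flow.

    packets is an iterable of (flow_key, payload) pairs in arrival order.
    """
    seen = defaultdict(int)  # packets seen so far, per flow
    kept = []
    for flow_key, payload in packets:
        if seen[flow_key] < max_packets_per_flow:
            kept.append((flow_key, payload))
        seen[flow_key] += 1
    return kept

packets = [("flow-a", b"1"), ("flow-a", b"2"), ("flow-a", b"3"), ("flow-b", b"1")]
print(shunt(packets, max_packets_per_flow=2))
```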
In a previous post (part 1) we described how to build a 2×10 Gbit continuous packet recorder using n2disk and PF_RING. It is time to update it: a few years have passed, new features have been added, and new capture and storage technologies are available.
Network Adapter: Intel vs FPGAs
All ntop applications, including n2disk, are based on PF_RING and can operate on top of commodity adapters (accelerated Zero-Copy drivers are available for Intel) as well as specialised FPGA adapters such as Napatech, Fiberblaze and others (the full list is available in the PF_RING documentation).
In order to choose the best adapter we need to take into account a few factors, including capture speed, features and price. Intel adapters are cheap and can deliver 10+ Gbps packet capture with 64 byte packets using PF_RING ZC accelerated drivers. FPGA adapters are (more or less, depending on the manufacturer) expensive, but provide, in addition to higher capture speed with small packet sizes, support for useful features such as port aggregation, nanosecond timestamping and traffic filtering. We can summarize this in a small table:
Link Speed / Features Required
Intel (e.g. 82599/X520)
2 x 10 Gbit Aggregation / Nanosecond Timestamp
FPGA (Napatech, Silicom)
FPGA (Napatech, Silicom)
What Storage System Do I Need?
If you need to record 1 Gbps, even a single (fast) HDD is enough to keep up with the traffic throughput. If you need to record 10+ Gbps, you need to increase the I/O throughput by using a RAID system with many drives. At ntop we usually use 2.5″ 10K RPM SAS HDDs for compact systems (e.g. 2U form factor with up to 24 disks), or 3.5″ 7.2K RPM SAS HDDs for modular systems when rack space is not a problem and many units are required to increase data retention (in this case you need a RAID controller able to drive a SAS expander, which can handle hundreds of disks). More space in the storage system translates into a longer retention time and thus the ability to go further back in time to find old data.
The number of drives, the I/O throughput of each drive, and the RAID configuration together determine the final I/O throughput you can achieve. Drive speed depends on the drive type and model, as summarized in the table below:
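The relationship between storage space and retention time is simple arithmetic, sketched below in Python. The estimate is a best-case bound under stated assumptions: traffic is recorded at a constant rate, and compression, index files, RAID parity and filesystem overhead are ignored. The 24 x 1.8 TB array used in the example is hypothetical, as is the retention_hours helper.

```python
def retention_hours(capacity_tb, rate_gbps):
    """Hours of traffic that fit in capacity_tb when recording at rate_gbps.

    Uses decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits per second.
    """
    capacity_bytes = capacity_tb * 1e12
    bytes_per_second = rate_gbps * 1e9 / 8
    return capacity_bytes / bytes_per_second / 3600

# Hypothetical array: 24 disks of 1.8 TB each, recording a sustained 10 Gbps.
capacity_tb = 24 * 1.8
print(f"{retention_hours(capacity_tb, 10):.1f} hours at 10 Gbps")
print(f"{retention_hours(capacity_tb, 1):.1f} hours at 1 Gbps")
```

In practice links are rarely saturated all the time, so real retention is usually longer than this line-rate figure.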
Sustained Sequential I/O
NVMe (PCIe SSD)
In order to record traffic at 10 Gbps, for instance, you need 8-10 SAS HDDs in RAID 0, or 10-12 disks in RAID 50. The RAID controller should have at least 1-2 GB of onboard buffer in order to keep up with 10+ Gbps. Alternatively you can use 3-5 SSDs, or 1-2 NVMe (PCIe SSD) drives. SSDs are usually used when concurrent reads and writes are required under intensive workloads, so that HDD seek time does not jeopardize performance. Please make sure that you select write-intensive flash disks that guarantee great endurance over time.
At 20 Gbps we usually use 16-24 HDDs. At 40-100 Gbps you probably also need multiple controllers, as most controllers can handle up to 35-40 Gbps sustained and the load must be distributed across a few of them. In fact, since version 3.2, n2disk implements multithreaded dump, which means it is able to write to multiple volumes in parallel. This is also useful with NVMe disks: they are directly attached to the PCIe bus and lightning fast, but they cannot be driven by a standard controller, so you can use n2disk to write to many NVMe disks in parallel. We have been able to achieve 140 Gbps of sustained throughput using 8 write-intensive NVMe disks!
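The drive counts quoted above imply a sustained write rate per drive, which the short Python sketch below works out. It assumes the load is spread evenly across drives (as with RAID 0 striping) and uses decimal units; the per_drive_mb_s helper is hypothetical and the results are a back-of-the-envelope reading of the figures in this post, not measured values.

```python
def per_drive_mb_s(link_gbps, drives):
    """Sustained MB/s each drive must absorb if the load is striped evenly."""
    total_mb_s = link_gbps * 1000 / 8  # 10 Gbps is 1250 MB/s
    return total_mb_s / drives

# Drive counts quoted for recording 10 Gbps.
for label, counts in (("SAS HDD", (8, 10)), ("SSD", (3, 5)), ("NVMe", (1, 2))):
    rates = ", ".join(f"{n} drives: {per_drive_mb_s(10, n):.0f} MB/s each" for n in counts)
    print(f"{label}: {rates}")
```

Comparing these numbers with the sustained sequential write rate in a drive's datasheet is a quick way to check whether a planned array has enough headroom.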
What CPU Do I Need?
Choosing the right CPU depends on a few factors.
First of all, the adapter model. Intel adapters transfer packets one by one, putting pressure on the PCIe bus and thus increasing overall system utilisation compared with FPGA adapters such as Napatech or Silicom, which are able to work in “chunk” mode (other NIC vendors, such as Accolade, do not support it yet). FPGA adapters are also able to aggregate traffic in hardware at line rate, whereas with Intel we need to merge packets on the host, and it is hard to scale above 20-25 Mpps in this configuration. A CPU with high frequency (3+ GHz) is required with Intel.
The second factor is traffic indexing. You probably want to index traffic to accelerate traffic extraction, and this requires a few CPU cores. In order to index traffic on the fly at 10 Gbps, 1-2 dedicated cores/threads are required (in addition to the capture and writer threads). At 40 Gbps you probably need 4-6 indexing threads, and at 100 Gbps at least 8-10 threads.
In short, if you need to record 10 Gbps, a cheap Intel Xeon E3 with 4 cores and 3+ GHz is usually enough even with Intel adapters. If you need to record and index 20+ Gbps, you should probably go with something more expensive like an Intel Xeon Scalable (e.g. Xeon Gold 6136) with 12+ cores and 3+ GHz. Pay attention to the core affinity and NUMA as already discussed in the past.
How Much Does It Cost?
Continuous packet recorders on the market are expensive devices because they need fast/expensive storage systems and they are usually part of enterprise-grade solutions designed for high-end customers. At ntop we want to deliver the best technology to everyone at affordable prices, and we recently updated our price list, lowering prices for the n2disk product (please check the shop for more info). Educational and non-profit organizations can use our commercial tools at no cost.
For further information about the n2disk configuration and tuning, please refer to the n2disk documentation.
Getting started with PF_RING can be a bit tricky, as it requires creating a few configuration files in order to set up the service, especially when ZC drivers need to be used.
First of all, it requires installing packages: PF_RING comes with a set of packages for the userspace libraries (pfring), the kernel module (pfring-dkms), and the ZC drivers (<driver model>-zc-dkms). Installing the main package, pfring, is straightforward when following the instructions available at http://packages.ntop.org. However, installing and configuring the right ZC driver package for the actual NIC model available on the target machine can cause some headaches.
In fact, doing the driver configuration manually means (in this example we consider the ixgbe driver):
Creating a configuration file for PF_RING (/etc/pf_ring/pf_ring.conf).
Checking the model of the installed NIC.
Installing the proper dkms driver from the repository (ixgbe-zc-dkms)
Creating a configuration file for the NIC model (/etc/pf_ring/zc/ixgbe/ixgbe.conf), to indicate the number of RSS queues and other driver settings.
Creating a .start file for the NIC model (/etc/pf_ring/zc/ixgbe/ixgbe.start) to indicate that we actually want to load the driver.
Creating a configuration file for the hugepages (/etc/pf_ring/hugepages.conf)
Restarting the service.
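To summarize the files touched by the manual procedure above, here is a small Python sketch that lists them for a given driver. The paths for ixgbe are the ones named in the steps; applying the same layout to other driver names is an assumption of this sketch, and the zc_config_files helper is hypothetical. The sketch only builds the paths and does not create or modify anything.

```python
from pathlib import PurePosixPath

def zc_config_files(driver):
    """Return the configuration files involved in a manual ZC driver setup.

    The layout mirrors the ixgbe example; other drivers are assumed to follow it.
    """
    base = PurePosixPath("/etc/pf_ring")
    return [
        base / "pf_ring.conf",                     # PF_RING kernel module settings
        base / "zc" / driver / f"{driver}.conf",   # RSS queues and driver settings
        base / "zc" / driver / f"{driver}.start",  # marks the driver to be loaded
        base / "hugepages.conf",                   # hugepages settings
    ]

for path in zc_config_files("ixgbe"):
    print(path)
```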
In order to simplify all of this, since PF_RING 7.5 the pfring package includes the pf_ringcfg script, which can automatically install the required driver package and create the full configuration for the PF_RING kernel module and ZC drivers. With this method, configuring and loading the ZC driver for an interface takes just a few steps:
1. Configure the repository as explained at http://packages.ntop.org and install the pfring package which includes the pf_ringcfg script (example for Ubuntu):
apt-get install pfring
Note: you do not need to install any additional package such as pfring-dkms or <driver model>-zc-dkms; pf_ringcfg takes care of that, installing the packages actually required by the configuration.
2. List the interfaces and check the driver model:
Name: em1 Driver: igb [Supported by ZC]
Name: p1p2 Driver: ixgbe [Supported by ZC]
Name: p1p1 Driver: ixgbe [Supported by ZC]
Name: em2 Driver: e1000e [Supported by ZC]
3. Configure and load the driver specifying the driver model and (optionally) the number of RSS queues per interface:
Note: there are corner cases that require particular attention and a custom configuration. For example, if you’re configuring a ZC driver for an adapter that you’re currently using for management, pf_ring does not reload the driver by default, as doing so may break network connectivity. In this case you need to add the --force option when running the pf_ringcfg script, or follow the Manual Configuration section in the PF_RING User’s Guide.
Recently, we introduced the concept of network and container visibility through system introspection and demonstrated its feasibility with an open source library, libebpfflow. In other words, by leveraging certain functionalities of the Linux operating system, we are able to detect, count and measure the network activity taking place on a given host. We have published a paper and also presented the work at FOSDEM 2019, so a detailed discussion falls outside the scope of this post. However, we would like to recall that the information we are able to extract is very rich and is absolutely not limited to mere byte and packet counters. For example we can determine:
All the TCP and UDP network communications, including their peers, ports, and status
TCP counters and also retransmissions, out-of-orders, round-trip times, and other metrics which are useful as a proxy for the communication quality
Users, processes and executables behind every communication (e.g., /bin/bash, executed with root privileges, is contacting a malware IP address)
Container and orchestrator information, when available (e.g., /usr/sbin/dnsmasq is being run inside container dnsmasq which, in turn, belongs to Kubernetes pod kube-dns-6bfbdd666c-jjt75 within namespace kube-system)
By the way, do you know what the really cool innovation behind all of this is? It is that we do not have to look at the packets to get this information! This is why we also love to use the term packetless network visibility, which may seem an oxymoron at first, but actually makes a lot of sense. Indeed, not looking at the packets is not only cool but is also necessary under certain circumstances. Think of multiple containers communicating with each other on the same host. Their packets never leave the system and thus never reach the network, making any mirror or TAP totally useless. In this case, having visibility into inter-container communications requires an introspection-based approach such as the one we have proposed.
Now that we have gone through a brief recap of our technology, it is time to see it in action. To start you need two pieces:
nprobe-agent, a small application which integrates libebpfflow and is responsible for performing system introspection
ntopng, our visualization tool, which receives introspected data from nprobe-agent and visualizes it in a handy GUI
Configuration is straightforward. You can fire nprobe-agent with just a single option which basically tells it the address on which ntopng is listening for introspected data
# nprobe-agent -v --zmq tcp://127.0.0.1:1234c
In this example, we are going to use nprobe-agent and ntopng on the same host, so we can safely use the loopback address 127.0.0.1 to make them communicate. Note, however, that this is not necessary: nprobe-agent and ntopng can run on two physically separate hosts, and you can also run multiple nprobe-agent instances and let them export to the same instance of ntopng.
To collect data from nprobe-agent, ntopng can be started as follows
./ntopng -i tcp://*:1234c -m "192.168.2.0/24"
The -i option specifies the port on which ntopng listens for incoming data (note that the port is 1234, the same one used for nprobe-agent), whereas the -m option specifies a local network of interest.
Once both applications are running, point your browser to the address of ntopng and you will start seeing network communications along with users, processes and container information. Cool, isn’t it?
Combining Network Packets with System Introspection
When you have packets, you can also combine them with data from system introspection. This is straightforward to do. You just have to indicate a packet interface in ntopng as a companion of the interface which is responsible for the collection of system introspection data from nprobe-agent.
For example, assuming an ntopng instance is monitoring the loopback interface lo in addition to receiving data from nprobe-agent as
./ntopng -i tcp://*:1234c -i lo -m "192.168.2.0/24"
We can declare the tcp://*:1234c as the companion of lo from the preferences as
From that point on, system-introspected data arriving at tcp://*:1234c will also be delivered to lo and automatically combined with real packets:
A few months ago at FOSDEM we introduced the concept of network and container visibility through system introspection, and we released an open source library based on eBPF that can be used for this purpose. Based on this technology, we created a lightweight probe, nProbe Agent (formerly known as nProbe mini), able to detect, count and measure all network activities taking place on the host where it is running. Thanks to this agent it is possible to enrich the information that a traditional probe extracts from network packets with system data such as the users and processes responsible for network communications. In fact, this agent is able to extract and export a rich set of information, including:
TCP and UDP network communications (5-tuple, status).
TCP counters, including retransmissions, out-of-order packets, round-trip times read reliably from the Linux kernel without having to mimic them using packets.
The user behind a communication.
The process and executable behind a communication.
Container and orchestrator information (e.g. POD), if any.
For example, nProbe Agent gives you the answer to questions like: who is the user trying to download a file from a malware host? Which process are they running? From which container, if any?
nProbe Agent does all this without even looking at network packets. It implements low-overhead, event-based monitoring mainly based on hooks provided by the operating system, leveraging well-established technologies such as Netlink and eBPF. In particular, eBPF support is implemented by means of the open source libebpfflow library we developed to mask eBPF complexity. This also allows the agent to detect communications between containers on the same host. nProbe Agent is able to export all the extracted information in JSON format over a ZMQ socket or to a Kafka cluster.
As eBPF requires modern Linux kernels, nProbe Agent is available only for Ubuntu 18.04 LTS and CentOS 7 (please upgrade your distro with the latest CentOS packages). If you just need basic system visibility information, there is also libebpflowexport, a fully open-source tool that is also natively supported by ntopng out of the box.
If you are attending this event (we’ll have a booth at InfluxDays), or if you live in London and want to meet us, please contact us or stop by at the event so we can arrange an informal meeting and hear from you. We need feedback from our users so that together we can plan the future of ntop.
This is to announce the stable release of nProbe Cento 1.8. This is a maintenance release in which we have made many reliability fixes and added new options to integrate this tool with the latest ntopng developments. We suggest that all our users update to this new release to benefit from the enhancements.
Added --json-labels option to print labels as keys with JSON
Added --tcp : option to export JSON over TCP
Added --disable-l7-protocol-guess option to disable nDPI protocol guess
Support for ZMQ flows export with/without compression
Keepalive support with ZMQ
Fixed JSON generation and string escape
Fixed export drops when reading from a PCAP file
Fixed wrong detection of misbehaving flows
Fixed pkts and bytes counters in logged flows
Fixed license check when reading from PCAP file
Fixed size endianness in ZMQ export
Fixed ZMQ header version to be compatible with the latest ntopng
Most people think that SSL means safety. While this is not a false statement, you should not take it for granted. In fact, while your web browser warns you when a certain encrypted communication has issues (for instance when the SSL certificates don’t match), you should not assume that SSL = HTTPS, as:
TLS/SSL encryption is becoming (fortunately) pervasive also for non web-based communications.
The web browser can warn you about the main URL, but you should look into the browser developer console for other alerts (most people are unaware of this component).
When TLS/SSL communications are insecure (see below for details) we are in a very bad situation: we believe we have done our best, but in practice SSL is hiding our data without providing safety, as attackers have tools to exploit SSL weaknesses. In the past weeks we have spent quite some time enhancing SSL support in both nDPI and ntopng. This is to make people aware of SSL issues on their network, understand the risks, and implement countermeasures (e.g. update old servers). What we have implemented in the latest ntopng dev version (which will be merged into the next stable release) is SSL handshake dissection for detecting:
Insecure and weak ciphers
Your communication is encrypted (i.e. you will see a lock on the URL bar), but the data you exchange might potentially be decrypted.
Client/server certificate mismatch
You are not talking with the server you want to talk to.
Insecure/obsolete SSL/TLS versions
It’s time to update your device/application.
When an SSL communication does not satisfy all safety criteria, ntopng detects it and triggers an alert. In essence we have implemented a lightweight SSL monitoring console that allows you (without having to install an IDS or similar application) to understand the security risks and fix them before it’s too late.
Below you can find a valid SSL communication: for your convenience we have highlighted the SSL detection fields (on a future blog post we’ll talk more about JA3).
When ntopng detects TLS/SSL issues, it reports them both in the flow
The goal of this post is not to scare the reader, but to increase awareness of network communications and to show how ntopng can be used to understand the risks and implement countermeasures that keep your network safe.
Remember: you should not implement secure communications because you are scared of attackers, but because it’s the right thing to do for preserving your privacy.
The latest ntopng 3.9 dev gives you the possibility to choose whether to send telemetry data back to ntop. We collect and analyze telemetry data to diagnose ntopng issues and make sure it’s functioning properly. In other words, telemetry data help us in finding and fixing certain bugs that may affect certain versions of ntopng.
And don’t worry, we won’t use any data to try and identify you. However, if you want to, you can decide to provide an email address we can use to reach you in case we detect your instance has anomalies.
So which kind of telemetry data is sent? Currently, the only telemetry data sent to ntop are crash reports. That is, when ntopng terminates anomalously, a tiny JSON containing ntopng version, build platform, operating system and startup options is sent to notify us that something went wrong.
At any time you can see the status of the telemetry by visiting the “Telemetry” page accessible from the “Home” menu. There you can see all the details of the data that may be sent to our server, as well as the most recent data that has been sent.
At any time you can grant or revoke the permission to send telemetry data; this is completely up to you. Deciding to send telemetry is a small act, but it has great value for the community, as it fosters the continuous improvement of ntopng. So please visit the “Preferences” and choose to contribute!
One of the most difficult steps in a monitoring deployment is choosing the best point at which to monitor traffic, and the best strategy for observing it. The main options are basically:
Port Mirroring/Network Tap
NetFlow/sFlow Flow Collector
Port Mirroring/Network Tap
Port mirroring (often called span port) and network tap have already been covered in a previous post. They are two techniques used to provide packet access, and they are often the best way to troubleshoot network issues, as packets are often perceived as the ground truth (“packets never lie”). This means that we have “full packet visibility”, because we have visibility of L2 (mirror) or L1 (taps). There are various types of hardware taps, the most complex of which is called a network packet broker. A good introduction to this topic, Tap vs. Span, can be found in this deep dive article. Note that if you are monitoring traffic on a computer you have access to, you can avoid this technique by simply running your monitoring tool on that host: just be aware that you will introduce some extra load on the server, and thus your network communications might be slightly affected by your monitoring activities.
NetFlow/sFlow Flow Collector
In flow collection we have no direct access to packets, with some small differences between protocols. In NetFlow/IPFIX, it is the probe running inside the router that clusters together similar packets according to a 5-tuple key (proto, IP/port src/dst) and computes metrics such as bytes and packets; in a way NetFlow/IPFIX “compresses” (not literally) traffic in order to produce a “digest” of network communications. In sFlow, instead, the probe running inside the network switch emits samples that include “packet samples”: in essence, packets captured on switch ports, cut to a snaplen (usually 128 bytes) and sent to an sFlow collector encapsulated in the sFlow packet format. Compared to packet capture, sFlow gives you no full packet visibility (both in terms of packet length and because you see only a sample of the packets rather than all of them), but on the other hand you have access to additional metadata such as the name of the authenticated user (e.g. via Radius or 802.1X) that generated the traffic. This is very important information that can be very helpful during troubleshooting or security analysis.
In flow collection, ntopng shows you flows collected by nProbe and sent to ntopng via ZMQ. This has the advantage of being able to monitor multiple NetFlow/sFlow/IPFIX exporters and combine all of them into a single ntopng instance. Doing the same with packets would be much more complicated, if doable at all. In this scenario, you need to keep in mind that you can see only prefiltered and presummarized traffic from the device that is sending flows: this means that you can’t have “full packet visibility”, only a summarized version of it.
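The way a NetFlow/IPFIX probe clusters packets into flows can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the packet dictionaries and their field names are hypothetical, and a real probe also handles flow timeouts, TCP flags, directions and many more metrics.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packets by 5-tuple and accumulate packet and byte counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = (pkt["proto"], pkt["src_ip"], pkt["src_port"],
               pkt["dst_ip"], pkt["dst_port"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["length"]
    return dict(flows)

# Hypothetical packets: two belong to the same TCP flow, one is a DNS query.
packets = [
    {"proto": "tcp", "src_ip": "192.168.2.10", "src_port": 51000,
     "dst_ip": "10.0.0.1", "dst_port": 443, "length": 1500},
    {"proto": "tcp", "src_ip": "192.168.2.10", "src_port": 51000,
     "dst_ip": "10.0.0.1", "dst_port": 443, "length": 400},
    {"proto": "udp", "src_ip": "192.168.2.10", "src_port": 53000,
     "dst_ip": "10.0.0.53", "dst_port": 53, "length": 80},
]
for key, counters in aggregate_flows(packets).items():
    print(key, counters)
```

Three packets become two flow records here, which is the “digest” effect described above: the counters survive, while the packet contents do not.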
Which Option is the Best?
It is now time to decide which flow visibility strategy to follow, based on your monitoring expectations. Taps are definitely a good option for packet-oriented people, but keep in mind that mixed scenarios are also possible, where some networks are monitored using packets and others using flows.
Physical or Virtual Monitoring Tools?
People often ask us whether a physical box has advantages over a VM for monitoring traffic. The IT trend is towards virtual, but a physical box can help, particularly in a proof-of-concept scenario. Remember, there is no single “best scenario” to follow. A virtual environment lets you avoid possible hardware problems, but requires a dedicated physical NIC for TAP mode, which isn’t always possible. Hardware can be easier to set up, but it varies from case to case.
Technical requirements depend on what you need to see and collect, but the minimum should be:
Intel CPU with two cores
4 GB of RAM
120 GB of disk space. The choice between SATA and SSD depends on the traffic you need to analyse, but SSD is preferred.
NICs: 1 NIC is enough for Flow Collector mode; at least 2 NICs are needed for TAP visibility.
Linux operating system. ntop builds prepackaged packages for Debian, Ubuntu LTS and CentOS. You can choose the distro you like, but if you ask us, we suggest Ubuntu LTS.
Most ntop software is open source, which for most people means free of licensing fees. However, even in the case of ntopng, we offer premium versions that allow us to keep developing the product. Hence not all components are freely available, so you need to choose the right deployment based on your budget or on the features you need. ntopng can run in community mode: this means that you can capture from the wire all the flows presented to ntopng via tap interfaces, but with limited functions and capabilities. If you want all the features enabled, you need an ntopng Pro or Enterprise license.
Otherwise, if you plan to add or use flow collector mode, remember that you need to buy an nProbe license to collect the flows from your devices and present them to ntopng, preferably a licensed ntopng so that you get, for instance, full integration with other protocols such as SNMP. If you try both scenarios, you will probably adopt an ntopng plus nProbe deployment (check the main features here https://www.ntop.org/products/traffic-analysis/ntop/).