
Testing Docker multi-host network performance

 | August 3, 2016 |  Posted In: Docker, MySQL


In this post, I’ll review Docker multi-host network performance.

In a past post, I tested Docker network performance. The MySQL Server team provided their own results, which are in line with my observations.

For this set of tests, I wanted to focus on Docker networking across multiple hosts, mostly because when we set up a high availability (HA) environment (using Percona XtraDB Cluster, for example) the expectation is that instances run on different hosts.

Another reason for this test is that Docker recently announced the 1.12 release, which supports Swarm Mode. Swarm Mode is quite interesting by itself: with this release, Docker is going deeper into orchestration deployments to compete with Kubernetes and Apache Mesos. I would say Swarm Mode is still rough around the edges (expected for a first release), but I am sure Docker will polish this feature in the next few releases.

Swarm Mode also expects that you run services on different physical hosts, and that services communicate over a Docker network. I wanted to see how much of a performance hit we take when running over a Docker network on multiple hosts.

Network performance is especially important for clustering setups like Percona XtraDB Cluster and MySQL Group Replication (which just put out another Lab release).

For my setup, I used two physical servers connected over a 10Gb network. Both servers have 56 Intel CPU cores.

Sysbench setup: the data fits into memory, and I will only use primary key lookups. Testing over the network gives the worst-case scenario for network round trips, but it also gives good visibility into performance impacts.

The following are options for Docker network:

  • No Docker containers (marked as “direct” in the following results)
  • Docker container uses “host” network (marked as “host”)
  • Docker container uses “bridge” network, where the service port is exposed via port forwarding (marked as “bridge”)
  • Docker container uses “overlay” network; both client and server are started in containers connected via the overlay network (marked as “overlay” in the results). For the “overlay” network it is possible to use third-party plugins with different implementations of the network; the best known are Calico and Weave.

For a multi-host networking setup, only “overlay” (and plugin implementations) are feasible. I used “direct”, “host” and “bridge” only for reference and as a comparison to measure the overhead of overlay implementations.
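For reference, the configurations above can be sketched with Docker CLI commands along these lines. This is only an illustration; the image name percona:5.7 and network name sbnet are my assumptions, not the exact commands used in the benchmark:

```shell
# "host" network: the container shares the host network stack (no NAT)
docker run -d --name=mysql-host --net=host \
    -e MYSQL_ALLOW_EMPTY_PASSWORD=1 percona:5.7

# "bridge" network: the default; the service port is exposed
# via port forwarding
docker run -d --name=mysql-bridge -p 3306:3306 \
    -e MYSQL_ALLOW_EMPTY_PASSWORD=1 percona:5.7

# "overlay" network: create a multi-host network first (requires
# Swarm mode or an external key-value store), then attach both the
# server and the sysbench client containers to it
docker network create --driver overlay sbnet
docker run -d --name=mysql-overlay --net=sbnet \
    -e MYSQL_ALLOW_EMPTY_PASSWORD=1 percona:5.7
```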

The results I observed are:

Client           Server           Throughput, tps   Ratio to “direct-direct”
Direct           Direct           282,780           1.00
Direct           Host             280,622           0.99
Direct           Bridge           250,104           0.88
Bridge           Bridge           235,052           0.83
Overlay          Overlay          120,503           0.43
Calico overlay   Calico overlay   246,202           0.87
Weave overlay    Weave overlay     11,554           0.044
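As a sanity check, the ratio column is just each throughput divided by the “direct-direct” baseline; a quick shell snippet reproduces it (the last value rounds to 0.04 versus the 0.044 shown above):

```shell
# Recompute "Ratio to direct-direct" from the raw tps numbers
baseline=282780
for tps in 282780 280622 250104 235052 120503 246202 11554; do
    awk -v t="$tps" -v b="$baseline" 'BEGIN { printf "%.2f\n", t / b }'
done
```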


  • “Bridge” network added overhead, about 12%, which is in line with my previous benchmark. I wonder, however, if this is Docker overhead or just the Linux implementation of bridge networks. Docker should be using the setup that I described in Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host, and I suspect that the Linux network namespaces and bridges add overhead. I need to do more testing to verify.
  • Native “overlay” Docker network struggled with performance problems. I observed ksoftirqd using 100% of one CPU core, and I have seen similar reports. It seems that network interrupts in the Docker “overlay” network are not distributed properly across multiple CPUs. This is not the case with the “direct” and “bridge” configurations. I believe this is a problem with the Docker “overlay” network (hopefully, it will eventually be fixed).
  • Weave network showed absolutely terrible results. I see a lot of CPU allocated to “weave” containers, so I think there are serious scalability issues in their implementation.
  • The Calico plugin showed the best result for multi-host containers, even better than the “bridge-bridge” network setup.

If you need to use Docker “overlay” network — which is a requirement if you are looking to deploy a multi-host environment or use Docker Swarm mode — I recommend you consider using the Calico network plugin for Docker. Native Docker “overlay” network can be used for prototype or quick testing cases, but at this moment it shows performance problems on high-end hardware.


Vadim Tkachenko

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming helps optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication, 3rd Edition.


  • The Docker 1.12 overlay driver supports enabling encryption for all traffic sent. This option is not on by default; rather, you specify it as a driver option when you create the network. I believe the option is called ‘encrypted’, according to this blog:

    I would be interested in seeing the relative performance of enabling encryption at a low level rather than at the app level.

    Could you rerun your benchmark against an overlay network with encryption enabled?
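For reference, the ‘encrypted’ option mentioned above is passed at network-creation time via --opt; a sketch (the network name secure-net is illustrative):

```shell
# Create an overlay network with the data plane encrypted (IPsec);
# encryption is off by default and enabled per network via --opt
docker network create \
    --driver overlay \
    --opt encrypted \
    secure-net
```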

  • The Docker overlay network uses VXLAN in unicast mode, so most of the work is done inside the kernel. Your kernel version is therefore a crucial piece of information (but I don’t know if improvements have been made since its introduction in the kernel). Also, since it is an encapsulation technology, you’ll get better performance by lowering the MTU and using TCP MSS clamping. Maybe Docker does that for you, but it’s worth a check.

      • I suspect the MTU setting as well. By default the overlay driver uses an MTU of 1450, and hence any traffic with a larger packet size will result in fragmentation and reassembly. In Docker 1.12, the driver MTU can be set per network during network creation via the -o flag, which the overlay driver will use to set up the appropriate MTU for east-west traffic.
        If the measurement is done for north-south traffic (via the default_gwbridge bridge), then the above driver option should also be used to create the default_gwbridge network manually.
        More information can be found under :
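A sketch of the per-network MTU option described above (the option key and network name are my assumptions based on the Docker 1.12 -o driver-option mechanism):

```shell
# Lower the MTU to leave headroom for the VXLAN encapsulation header
docker network create \
    --driver overlay \
    --opt com.docker.network.driver.mtu=1450 \
    overlay-mtu
```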

  • Can we add the following to the matrix?

    CoreOS flannel (used by Kubernetes)

    OVS with VXLAN (where each container is attached to OVS directly)

    OVS attached to a Linux bridge, which is attached to the container.

  • Do you have this benchmark automated? I would love to experiment with my patchset which gives some of the same features, and functions as Docker bridge, but without the same overhead. If you’re interested in trying it out, let me know.

    • I do not have the scripts automated; you need to run something like:

      ./sysbench --test=tests/db/select.lua --oltp_tables_count=20 --oltp_table_size=5000000 --num-threads=100 --mysql-host=$HOST --mysql-user=sbtest --oltp-read-only=on --max-time=300 --max-requests=0 --report-interval=10 --rand-type=uniform --rand-init=on

      Do not forget to prepare (load data) before tests.
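Assuming sysbench 0.5 with the same Lua scripts, the prepare (data load) step would look something like this, matching the table count and size from the run command above:

```shell
# Load 20 tables of 5M rows each before running the read-only test
./sysbench --test=tests/db/select.lua \
    --oltp_tables_count=20 --oltp_table_size=5000000 \
    --mysql-host=$HOST --mysql-user=sbtest \
    prepare
```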

  • This is a good post, thank you! Could you clarify if the test were using the docker swarm mode (v1.12 or +)? I’m interested to see if it supports network plugins like calico for docker swarm mode. Thanks!

  • Hi, I work on Weave Net. I hadn’t seen this page before. Something must have been seriously broken when you ran that test; nobody has ever reported figures that bad. It would be helpful if you could open an issue or otherwise point to your methodology, so we can get that corrected.

  • Weave Net is perfectly able to deliver good performance. You can see that we measured throughput of 7.7GB/sec on EC2.

    Some options you can choose will lower performance (for instance, turning on encryption), and Weave Net has multiple fall-back strategies for when the default VXLAN will not work. But it should always be many times faster than what you reported.

    However, we are unable to test every scenario, and welcome input if you are able to document cases where we can improve.

    • One possibility is that the host machine does not have OVS in the kernel but in user space, which might explain the poor performance.
