Testing Docker multi-host network performance

In this post, I’ll review Docker multi-host network performance.

In a past post, I tested Docker network. The MySQL Server team provided their own results, which are in line with my observations.

For this set of tests, I wanted to focus more on Docker networking using multiple hosts, mostly because when we set up a high availability (HA) environment (using Percona XtraDB Cluster, for example), the expectation is that instances are running on different hosts.

Another reason for this test is that Docker recently announced the 1.12 release, which supports Swarm Mode. Swarm Mode is quite interesting by itself — with this release, Docker targets going deeper on Orchestration deployments in order to compete with Kubernetes and Apache Mesos. I would say Swarm Mode is still rough around the edges (expected for a first release), but I am sure Docker will polish this feature in the next few releases.

Swarm Mode also expects that you run services on different physical hosts, and that services communicate over the Docker network. I wanted to see how much of a performance hit we get when we run over the Docker network on multiple hosts.

Network performance is especially important for clustering setups like Percona XtraDB Cluster and MySQL Group Replication (which just put out another Lab release).

For my setup, I used two physical servers connected over a 10Gb network. Both servers use 56 cores total of Intel CPUs.

Sysbench setup: data fits into memory, and I will only use primary key lookups. Testing over the network gives the worst-case scenario for network round trips, but it also gives good visibility into the performance impact.

The following are options for Docker network:

  • No Docker containers (marked as “direct” in the following results)
  • Docker container uses “host” network (marked as “host”)
  • Docker container uses “bridge” network, where the service port is exposed via port forwarding (marked as “bridge”)
  • Docker container uses “overlay” network, where both client and server are started in containers connected via an overlay network (marked as “overlay” in the results). For the “overlay” network it is possible to use third-party plugins with different network implementations; the best known are Calico and Weave.

For a multi-host networking setup, only “overlay” (and its plugin implementations) is feasible. I used “direct”, “host” and “bridge” only for reference and as a comparison to measure the overhead of the overlay implementations.
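As a rough illustration of how the four options above map onto the Docker command line, here is a minimal dry-run sketch. It only prints commands and does not execute them. The image name, the network name and port 3306 are placeholders I picked for illustration, not the values used in the benchmark. A real run would also need the image's required settings, and an overlay network that standalone containers are able to join.

```shell
#!/bin/sh
# Dry run: prints the docker command lines, does not execute them.
# IMAGE and NET are placeholder names, not taken from the benchmark.
IMAGE="percona/percona-server"
NET="bench-overlay"

# server_cmd MODE -> prints how the MySQL server side might be started
server_cmd() {
    case "$1" in
        direct)  echo "# direct: no container, mysqld runs on the host" ;;
        host)    echo "docker run -d --net=host $IMAGE" ;;
        bridge)  echo "docker run -d -p 3306:3306 $IMAGE" ;;
        overlay) echo "docker run -d --net=$NET $IMAGE" ;;
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# The overlay network has to exist before containers can join it.
echo "docker network create -d overlay $NET"

for mode in direct host bridge overlay; do
    server_cmd "$mode"
done
```

The client side would follow the same pattern, with sysbench started either directly on the second host or in a container attached to the same kind of network.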

The results I observed are:

Client | Server | Throughput, tps | Ratio to “direct-direct”
Calico overlay | Calico overlay | 246202 | 0.87
Weave overlay | Weave overlay | 11554 | 0.044
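The ratio column is simply one throughput divided by another. The direct-direct throughput is not listed in the rows above, so this small sketch only compares the two plugin rows with each other. The helper name `ratio` is my own.

```shell
#!/bin/sh
# ratio TPS BASELINE_TPS -> prints TPS / BASELINE_TPS with three decimals
ratio() {
    awk -v t="$1" -v b="$2" 'BEGIN { printf "%.3f\n", t / b }'
}

# Calico overlay throughput relative to Weave overlay throughput
ratio 246202 11554
```

With the figures from the table, Calico delivered roughly 21 times the throughput of Weave in this test.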


  • “Bridge” network added overhead, about 12%, which is in line with my previous benchmark. I wonder, however, if this is Docker overhead or just the Linux implementation of bridge networks. Docker should be using the setup that I described in Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host, and I suspect that the Linux network namespaces and bridges add overhead. I need to do more testing to verify.
  • Native “overlay” Docker network suffered from performance problems. I observed issues with ksoftirqd using 100% of one CPU core, and I see similar reports. It seems that network interrupts in Docker “overlay” are not distributed properly across multiple CPUs. This is not the case with the “direct” and “bridge” configurations. I believe this is a problem with the Docker “overlay” network (hopefully, it will eventually be fixed).
  • Weave network showed absolutely terrible results. I see a lot of CPU allocated to “weave” containers, so I think there are serious scalability issues in their implementation.
  • Calico plugin showed the best result for multi-host containers, even better than the “bridge-bridge” network setup.
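One way to check whether network softirq work is piling up on a single core, as in the ksoftirqd observation above, is to look at the per-CPU counters in /proc/softirqs. The following is a minimal sketch. It assumes the load shows up in the NET_RX row, and it reads cumulative counters since boot, so for a benchmark you would want to compare two samples taken before and after the run. The function name is my own.

```shell
#!/bin/sh
# net_rx_share [FILE] -> prints which CPU column holds the largest share
# of NET_RX softirqs. FILE defaults to /proc/softirqs.
net_rx_share() {
    awk '$1 == "NET_RX:" {
        total = 0; max = 0; idx = 0
        for (i = 2; i <= NF; i++) {
            total += $i
            if ($i > max) { max = $i; idx = i - 2 }   # column 2 is CPU0
        }
        if (total == 0) { print "no NET_RX softirqs counted"; exit }
        printf "busiest=cpu%d share=%.1f%%\n", idx, 100 * max / total
    }' "${1:-/proc/softirqs}"
}

# Only query the live system when the file is available (Linux).
if [ -r /proc/softirqs ]; then
    net_rx_share
fi
```

A share close to 100% on one CPU would be consistent with receive processing not being spread across cores. A roughly even split would point elsewhere.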

If you need to use Docker “overlay” network — which is a requirement if you are looking to deploy a multi-host environment or use Docker Swarm mode — I recommend you consider using the Calico network plugin for Docker. Native Docker “overlay” network can be used for prototype or quick testing cases, but at this moment it shows performance problems on high-end hardware.



Comments (15)

  • Kevin

    Docker 1.12 overlay driver supports enabling encryption for all traffic sent. This option is not on by default. Rather you specify it as a driver option when you create the network. I believe the option is called ‘encrypted’ according to this blog:


    I would be interested in seeing the relative performance of enabling encryption at a low level rather than at the app level.

    Could you rerun your benchmark against an overlay network with encryption enabled?

    August 3, 2016 at 4:53 pm
  • Vincent Bernat

    Docker overlay network uses VXLAN in unicast mode. Therefore, most of the work is done inside the kernel. Your kernel version is therefore crucial information (but I don’t know if improvements have been made since its introduction in the kernel). Also, being an encapsulation technology, you’ll get better performance by lowering the MTU and using TCP MSS. Maybe Docker does that for you, but it’s worth a check.

    August 4, 2016 at 3:00 am
    • Vadim Tkachenko


      That’s a valid request.
      I am using kernel Linux 4.4.0-31-generic #50-Ubuntu SMP Wed Jul 13 00:07:12 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      from Ubuntu 16.04.1 LTS

      August 4, 2016 at 3:21 am
      • Madhu

        I suspect the MTU setting as well. By default the overlay driver uses 1450, and hence any traffic with a larger packet size will result in fragmentation and reassembly. In Docker 1.12, the driver MTU can be set per network during network creation via the -o com.docker.network.driver.mtu=XXXX flag, which will be used by the overlay driver to set up the appropriate MTU for east-west traffic.
        If the measurement is done for north-south traffic (via the docker_gwbridge bridge), then the above driver option should also be used to create this docker_gwbridge network manually.
        More information can be found under : https://github.com/docker/libnetwork/issues/1021

        August 9, 2016 at 9:32 am
        • Madhu

          Of course, it’s easier to try traffic with size < 1450 to confirm the performance bottleneck.

          August 11, 2016 at 2:35 pm
  • Muayyad Alsadi

    Can we add the following to the matrix

    CoreOS flannel (used by Kubernetes)

    And ovs with vxlan (where each container is attached to ovs directly)

    And ovs attached to linux bridge which is attached to container.

    August 5, 2016 at 9:28 am
  • sargund

    Do you have this benchmark automated? I would love to experiment with my patchset which gives some of the same features, and functions as Docker bridge, but without the same overhead. If you’re interested in trying it out, let me know.

    August 7, 2016 at 8:57 pm
    • Vadim Tkachenko

      I do not have the scripts automated; you need to run something like:

      ./sysbench --test=tests/db/select.lua --oltp_tables_count=20 --oltp_table_size=5000000 --num-threads=100 --mysql-host=$HOST --mysql-user=sbtest --oltp-read-only=on --max-time=300 --max-requests=0 --report-interval=10 --rand-type=uniform --rand-init=on

      Do not forget to prepare (load data) before tests.

      August 8, 2016 at 1:39 pm
  • Alexander Holbreich (@aHolbreich)

    Thank you. Do you have any insight into the latency of small requests? In my case it’s more about them.

    October 11, 2016 at 2:24 pm
  • linsununc

    This is a good post, thank you! Could you clarify whether the tests were using Docker swarm mode (v1.12 or later)? I’m interested to see if it supports network plugins like Calico for Docker swarm mode. Thanks!

    December 12, 2016 at 11:01 am
  • Bryan Boreham

    Hi, I work on Weave Net. Hadn’t seen this page before. Must have been something seriously broken when you did that test; nobody has ever reported figures that bad. It would be helpful if you could open an issue or otherwise point to your methodology, so we can get that corrected.

    January 24, 2017 at 3:57 am
  • Bryan Boreham

    Weave Net is perfectly able to deliver good performance. You can see at https://www.weave.works/weave-docker-networking-performance-fast-data-path/ we measured throughput of 7.7GB/sec at EC2.

    Some options you can choose will lower performance – for instance turning on encryption – and Weave Net has multiple fall-back strategies when the default VXLAN will not work. But it should always be many times faster than what you reported.

    However, we are unable to test every scenario, and welcome input if you are able to document cases where we can improve.

    January 24, 2017 at 10:10 am
    • Kai

      One possibility is that the host machine does not have the OVS in the kernel but in the user space, which might explain the poor performance.

      February 2, 2017 at 11:00 am
      • Bryan Boreham

        Thanks for commenting, but Docker uses the same kernel module as Weave Net does, so it must have been installed.

        February 2, 2017 at 3:05 pm
  • Veit

    I’m also interested in seeing the benchmarks with the overlay network encryption enabled.

    November 19, 2017 at 5:01 am
