This post is a continuation of my Docker series, and examines running Percona XtraDB Cluster nodes with Linux Network Namespaces on the same host.
In this blog I want to look into a lower-level building block: the Linux Network Namespace.
As with cgroups, Docker uses Linux Network Namespaces for resource isolation. I looked into cgroups a year ago, and now I want to understand more about Network Namespaces.
The goal is both to understand a bit more about Docker internals, and to see how we can provide network isolation for different processes within the same host. You might need to isolate processes when running several MySQL or MongoDB instances on the same server (which might come in handy during testing). In my case, I needed to test ProxySQL without Docker.
We can always use different ports for different MySQL instances (such as 3306, 3307, 3308), but it quickly gets complicated.
We could also use IP address aliases on an existing network interface, and use bind=<IP.ADD.RE.SS> for each instance. But since Percona XtraDB Cluster can use three different network ports and channels for communication, this also quickly gets complicated.
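For illustration, a minimal sketch of the bind-address approach (the datadir paths, socket names, and addresses here are my own, not from the post):

```shell
# Build the launch command for one instance bound to its own alias IP.
# Adding the alias itself needs root, e.g.: ip addr add 10.200.10.2/24 dev eth0
# All paths and addresses below are illustrative.
bind_cmd() {
  local n=$1 ip=$2
  echo "mysqld --bind-address=${ip} --datadir=/data/datadir/node${n} --socket=/tmp/node${n}_mysql.sock"
}

bind_cmd 1 10.200.10.2
bind_cmd 2 10.200.10.3
```

This works for a single mysqld port, but once each node also needs Galera's replication and SST ports, keeping the alias/port matrix straight by hand is exactly the complication mentioned above.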
Linux Network Namespaces provide stronger network isolation, which makes them a better fit for Percona XtraDB Cluster nodes. That said, setting up Network Namespaces by hand can be confusing; my recommendation is: if you can use Docker, use Docker instead. It also provides isolation for process IDs and mount points, and takes care of all the script plumbing to create and destroy networks. As you will see in our scripts, we still need to manage the datadir locations ourselves.
Let’s create a network for Percona XtraDB Cluster with Network Namespaces.
I will try to do the following:
- Start four nodes of Percona XtraDB Cluster
- For each node, create separate network namespace so the nodes will be able to allocate network ports 3306, 4567, 4568 without conflicts
- Assign the nodes IP addresses: 10.200.10.2-10.200.10.5
- Create a “bridge interface” for the nodes to communicate, using IP address 10.200.10.1.
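The addressing plan above can be sketched as a quick loop, mirroring the arithmetic used in the setup script later in this post:

```shell
# Node i lives in namespace pxc_ns$i with address 10.200.10.$((i+1));
# 10.200.10.1 is reserved for the bridge interface itself.
for i in 1 2 3 4; do
  ip_addr="10.200.10.$((i+1))"
  echo "node$i -> namespace pxc_ns$i, IP $ip_addr"
done
```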
For reference, I took ideas from this post: Linux Switching – Interconnecting Namespaces
First, we must create the bridge interface on the host:
```shell
BRIDGE=br-pxc
brctl addbr $BRIDGE
brctl stp $BRIDGE off
ip addr add 10.200.10.1/24 dev $BRIDGE
ip link set dev $BRIDGE up
```
Next, we create four namespaces (one per Percona XtraDB Cluster node) using the following logic:
```shell
for i in 1 2 3 4
do
  ip netns add pxc_ns$i
  ip link add pxc-veth$i type veth peer name br-pxc-veth$i
  brctl addif $BRIDGE br-pxc-veth$i
  ip link set pxc-veth$i netns pxc_ns$i
  ip netns exec pxc_ns$i ip addr add 10.200.10.$((i+1))/24 dev pxc-veth$i
  ip netns exec pxc_ns$i ip link set dev pxc-veth$i up
  ip link set dev br-pxc-veth$i up
  ip netns exec pxc_ns$i ip link set lo up
  ip netns exec pxc_ns$i ip route add default via 10.200.10.1
done
```
We see the following interfaces on the host:
```
1153: br-pxc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff
    inet 10.200.10.1/24 scope global br-pxc
       valid_lft forever preferred_lft forever
    inet6 fe80::2ccd:6ff:fe04:c7d5/64 scope link
       valid_lft forever preferred_lft forever
1154: br-pxc-veth1@if1155: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
    link/ether c6:28:2d:23:3b:a4 brd ff:ff:ff:ff:ff:ff link-netnsid 8
    inet6 fe80::c428:2dff:fe23:3ba4/64 scope link
       valid_lft forever preferred_lft forever
1156: br-pxc-veth2@if1157: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
    link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff link-netnsid 12
    inet6 fe80::3032:4cff:fe36:2287/64 scope link
       valid_lft forever preferred_lft forever
1158: br-pxc-veth3@if1159: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
    link/ether 8a:3a:c1:e0:8a:67 brd ff:ff:ff:ff:ff:ff link-netnsid 13
    inet6 fe80::883a:c1ff:fee0:8a67/64 scope link
       valid_lft forever preferred_lft forever
1160: br-pxc-veth4@if1161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000
    link/ether aa:56:7f:41:1d:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 11
    inet6 fe80::a856:7fff:fe41:1dc3/64 scope link
       valid_lft forever preferred_lft forever
```
We also see the following network namespaces:
```
# ip netns
pxc_ns4 (id: 11)
pxc_ns3 (id: 13)
pxc_ns2 (id: 12)
pxc_ns1 (id: 8)
```
After that, we can check the namespace IP address:
```
# ip netns exec pxc_ns3 bash
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
1159: pxc-veth3@if1158: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 4a:ad:be:6a:aa:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.200.10.4/24 scope global pxc-veth3
       valid_lft forever preferred_lft forever
    inet6 fe80::48ad:beff:fe6a:aac6/64 scope link
       valid_lft forever preferred_lft forever
```
To enable communication from inside the network namespace to the external world, we should add some iptables rules, e.g.:
```shell
iptables -t nat -A POSTROUTING -s 10.200.10.0/255.255.255.0 -o enp2s0f0 -j MASQUERADE
iptables -A FORWARD -i enp2s0f0 -o $BRIDGE -j ACCEPT
iptables -A FORWARD -o enp2s0f0 -i $BRIDGE -j ACCEPT
```
where enp2s0f0 is the interface with an external IP address (for some reason, modern Linux distros decided to use names like enp2s0f0 for network interfaces, instead of the good old "eth0").
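Once the rules are in place, we can smoke-test connectivity from inside a namespace. A sketch (the target addresses are illustrative, and the commands require root):

```shell
# Ping the host bridge and an external address from inside a namespace.
# Requires root; 8.8.8.8 is just a convenient external address to test
# that the MASQUERADE rule works.
check_ns() {
  local ns=$1
  ip netns exec "$ns" ping -c 1 10.200.10.1   # host bridge interface
  ip netns exec "$ns" ping -c 1 8.8.8.8       # external world, via NAT
}
# check_ns pxc_ns1
```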
To start a node (or mysqld instance) inside a network namespace, we prefix the command with ip netns exec.
For example, to start the first Percona XtraDB Cluster node in the namespace pxc_ns1, with IP address 10.200.10.2, we use:
```shell
ip netns exec pxc_ns1 mysqld --defaults-file=node.cnf --datadir=/data/datadir/node1 --socket=/tmp/node1_mysql.sock --user=root --wsrep_cluster_name=cluster1
```
To start the following nodes:

```shell
NODE=2
ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1
NODE=3
ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://10.200.10.2" --wsrep_cluster_name=cluster1
etc.
```
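Since each node listens on its own bridge IP, a client on the host can reach any node over TCP. A sketch of a quick health check (the credentials are illustrative):

```shell
# Query a node's Galera cluster size from the host; once all four nodes
# have joined, wsrep_cluster_size should report 4. User/password are
# placeholders for whatever your node.cnf configures.
cluster_size() {
  mysql -h "$1" -P 3306 -u root \
        -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
}
# cluster_size 10.200.10.2
```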
As a result of this procedure, we have four Percona XtraDB Cluster nodes, each running in its own network namespace, without worrying about IP address and port conflicts. We also allocated a dedicated IP range for our cluster.
This procedure isn’t trivial, but it is easy to script. I also think it provides a good understanding of what Docker, LXC, or other containerization technologies do behind the scenes with networks.
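Unlike Docker, nothing cleans up after us, so a teardown sketch is worth having (names match the setup above; requires root):

```shell
# Delete the namespaces -- this also destroys the veth ends inside them,
# and with them their bridge-side peers -- then remove the bridge itself.
cleanup_pxc_net() {
  for i in 1 2 3 4; do
    ip netns delete pxc_ns$i
  done
  ip link set dev br-pxc down
  brctl delbr br-pxc
}
# cleanup_pxc_net
```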