
Using netcat to copy MySQL Database

May 31, 2009 | Posted In: Insight for DBAs


This is mainly a cheat sheet for me to remember. Nothing rocket science.

It often makes sense to use netcat/nc to copy a MySQL database between hosts on a trusted network. It bypasses the encryption overhead of SSH and, depending on configuration, can be significantly faster.

Also note MySQL should be down when you copy the data, unless you’re copying from a snapshot or similar.

So to copy, go to the MySQL data directory on both boxes, e.g. cd /var/lib/mysql . Make sure the target directory is empty. Now on the TARGET server run nc -l 4000 | tar xvf - and on the SOURCE server run tar -cf - . | nc target_ip 4000

Also note – the port you’re using should be open in the firewall.
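The two halves can be dry-run on a single machine by joining them with a plain pipe in place of nc; this sketch (with made-up temp directories standing in for a real datadir) is just a way to verify the tar commands themselves:

```shell
# Sketch: exercise the SOURCE/TARGET tar commands locally, with a plain
# pipe standing in for nc. The mktemp directories and dummy files are
# hypothetical; on real hosts you would cd into /var/lib/mysql instead.
set -e
src=$(mktemp -d)   # stands in for the source data directory
dst=$(mktemp -d)   # stands in for the (empty) target data directory
mkdir "$src/testdb"
echo "dummy table definition" > "$src/testdb/t1.frm"
echo "dummy ibdata"           > "$src/ibdata1"
# SOURCE side: tar -cf - .      TARGET side: tar xvf -
( cd "$src" && tar -cf - . ) | ( cd "$dst" && tar xf - )
diff -r "$src" "$dst"   # exits 0 when both trees match
```

On two real hosts the pipe is replaced by the nc listener/sender pair exactly as above.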

Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.


  • Warning: TCP may introduce errors in your data due to the inadequate CRC in the packet header. Make sure to take an MD5 or SHA1 of the files at both ends before trusting that the data was transferred correctly.
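    One simple way to do that check is a single combined digest over the whole tree; this helper is only a sketch (the tree_md5 name is made up) that prints one MD5 value to compare on both hosts:

```shell
# Sketch: one digest for an entire directory tree. Run it against the
# data directory on both hosts after the copy and compare the output.
tree_md5() (
  cd "$1" || return 1
  find . -type f -print0 | LC_ALL=C sort -z \
    | xargs -0 md5sum | md5sum | awk '{print $1}'
)
```

    The sort keeps the file order stable, so identical trees always hash to the same value.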

  • @Antony I was just reviewing that to make sure, but doesn’t tar have a CRC? With the command lines that Peter posted I reckon it’d be OK.
    In many cases though the firewall hassles are greater than the ssh overhead 😉

  • Thanks Antony,

    As I mentioned, this is usually used on local networks, in which case the risk is relatively low. Though it is a good point.

  • Hi,

    I think you forgot a “-p” in the nc command on the target server. It should probably read:

    nc -l -p 4000 | tar xvf -

  • You need to use ‘-p’ only on Debian-like distros, while on RedHat-like ones you don’t. nc seems to differ slightly across those families.

    Sometimes when you need to limit transfer bandwidth, throttle (http://klicman.org/throttle/) is extremely handy. With this nice little tool you can adjust the transfer speed without stopping it!
    Also bar (http://clpbar.sourceforge.net/) is very nice 🙂

  • Andrew,

    The RedHat/CentOS version of nc works without this option and does not work with it:

    [pz@box scripts]# nc -l -p 4000
    usage: nc [-46DdhklnrStUuvzC] [-i interval] [-p source_port]
    [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_version]
    [-x proxy_address[:port]] [hostname] [port[s]]

  • Why not rsync? You get wire-speed performance, incremental updates (e.g., resuming after interruptions) and optional checksumming. You can also set up a standard rsync config on your database machines, enable “rsync --daemon” at will and disable it afterwards.

    Of course, if you have secure data that’s not encrypted in-DB, then scp is a better bet, or even pre- and post-encryption if ssh is to span high latency links.



  • Ken, sure, rsync is great, but if you have something like 400GB of files and just want them copied ASAP over the local network, the tar|nc way is faster and uses much less CPU, because there is no SSL overhead.

    Besides, if you need to copy a DB snapshot from a production server, then even if you limit the transfer using rsync’s bwlimit option, you can’t change the speed in the middle of the transfer when the machine’s disk I/O is running hot.

    You can limit the disk transfer rate at any moment during the copy process using the simple throttle tool.
    Just put it between pipes like this (I suggest running it in screen):
    tar cf - . | throttle -l /tmp/throttle.ctl -m 50 | bar | nc $destination_host $tcp_port
    and to limit the speed to, for example, 20Mbit:
    throttle -t /tmp/throttle.ctl -m 20

  • Ken,

    Rsync is the way to go if you copy data often and want things set up permanently.
    Remember we do consulting and often just need to do things once…. we also may not have root access in many cases to install and set up things.

  • Couple of things.

    Przemek: rsync does not use SSL. Its data only passes through SSL if you use it in SSH transport mode. Daemon mode is much faster, and will trivially saturate 2xGigE. Rsync allows you to interrupt an rsync and continue where you left off, even for large files. Changing bwlimit is easy in this way.

    Peter: rsync does not require root. But it does require you set up a trivial config file, something like:

    % cat c
    path = /path/to/db
    use chroot = false
    read only = false
    % rsync --daemon --port=1873 --config=c
    remote % rsync -aHPv --port=1873 sourcedata remotehost::db

    (optionally -W if your destination write capacity is significantly lower than network bandwidth)

    Especially in a consulting situation, I’d rather have a four line config file in my toolbelt for a very standard command, instead of relying on A) having root access to install nc or throttle if necessary, which are far less standard, or B) having a stash of nc/throttle binaries for every possible architecture (which even these days is still pretty substantial). And the incremental nature of rsync has other flexibilities; for example, rsyncing a database while it’s running, shutting it down, and rsyncing only changed content (useful mainly in niche, low bandwidth situations).

    Again, just my two cents; I think nc is the best tool for many applications, but for file-based objects rsync is difficult to beat, IMHO.


  • Ken,

    Yes, sure. Most things can be run without root. I meant if you want a standard /etc/init.d installation.

    I typically do not need to throttle bandwidth while copying files – the use case it applies to is getting stuff between boxes as soon as possible.
    In my experience I find nc installed about as frequently as rsync.

    But anyway, thanks for the simple file showing how to use rsync in daemon mode. It is helpful.

  • My Ubuntu 8.04 system didn’t know about bar.

    I’ve always used ‘pv’ for essentially the same function.

    pv – monitor the progress of data through a pipe

    Which appears quite similar.

  • In the utility-copy vein: if you have a source host, a target host on RFC1918 address space, and an intermediate bastion host you have to ssh to in order to REACH the target host (since it doesn’t have an internet-legal IP), and you don’t have a VPN etc., you can pipe tar through a chained ssh via:

    tar cvf - mysql_dir | ssh -A user@bastionhost.example.com "ssh -A user@rfc1918.example.com 'cd /path/to/target/; tar xvf -'"

  • Rsync is great for live replication copies also

    1. rsync with the database running; depending on the last rsync this can take a while
    2. Flush tables/caches, rsync again; it won’t take nearly as long
    3. Shut down the master
    4. rsync one final time; on a 10GB DB this only takes a couple of seconds
    5. Restart the master; downtime is only a few seconds and you have a fresh copy to use for starting a slave back up

    Of course you need to delete the proper log files depending on your setup and how you store your log files.

  • Instead of shutting down the master, just use the LVM snapshot method to create a slave or backup. This way you only need a few seconds of read lock.
    If you don’t use LVM partitions I suggest you start using them (ideally separate ones for data and binlogs).

  • As has even been reported on this blog, the LVM snapshot penalty can be extreme; on an overly busy server (and yes, I understand that’s a problem of its own) starting an LVM snapshot can cause a cascading failure of the system as it struggles to keep up.

  • 1. You do it at off peak time.
    2. IMHO if your system dies only because of LVM snapshot overhead – you’re far too close to your machine’s disk I/O limits.
    3. IMO always the best solution is to have a spare slave used only for backups, some periodic heavy queries, etc.

  • RT: agreed.


    1. You’ll find more and more that these days people using MySQL run HA systems (as HA as MySQL gets anyway) for international users and systems, not just a little PHP site in Dubuque. The “night-time trough” is no longer a luxury for a lot of the client base of MySQL.

    2. LVM snapshot overhead alone can be 20-50%; startups with limited capex don’t want to pay 20-50% more for their storage hardware to support a method of database backup used solely because a working solution isn’t available from the DB or engine vendor (explain /that/ to a VC). Established companies don’t want to budget in 20-50% more for their expensive SAN setups. We’d all love a few thousand spare TPS or a few 100MB/s of spare throughput, but that money should be going to what makes money, not what loses money. This also doesn’t cover the overhead of saturating the disk subsystem with the *actual copy* with tar/nc or rsync. Been there. Done both. Bought the T-shirt.

    3. Absolutely agree; the many other benefits of a slave makes the cost worthwhile.

    LVM is a very useful tool, but the fact that it is often represented as a panacea by the MySQL community (not singling you out) does a disservice to those relying on that information (who then have it blow up in their face). IMHO; apologies for the rant.


  • It is possible to use a less expensive cipher with ssh, which will speed things up significantly.

    For example:
    rsync -e 'ssh -c blowfish' …
    scp -c blowfish …

  • If using Ken’s rsync example and you’re root on both sides, you’ll likely need:

    uid = root
    gid = root

    in the rsync config file to avoid permission problems.


  • You can also copy the data directory to multiple servers at once using tee:

    Each receiver, same as in the article: nc -l 4000 | tar xvf - [or nc -l -p 4000 | tar xvf - ]
    Sender: tar -f - -c . | tee >(nc target_host_1 4000) | nc target_host_2 4000

    You can add as many additional hosts as you want.
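    The fan-out can be checked locally by substituting two extraction directories for the two nc targets; a sketch (bash-only, because of the >() process substitution):

```shell
#!/bin/bash
# Sketch: duplicate one tar stream into two directories with tee and
# process substitution, standing in for the two nc receivers above.
set -e
src=$(mktemp -d); d1=$(mktemp -d); d2=$(mktemp -d)
echo "payload" > "$src/f"
( cd "$src" && tar -cf - . ) \
  | tee >(cd "$d1" && tar xf -) \
  | ( cd "$d2" && tar xf - )
sleep 1   # the >() extractor runs asynchronously; let it finish
```

    Each additional >(nc host port) in the tee arguments adds one more receiver.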


