tcprstat is a free, open-source TCP analysis tool that watches network traffic and computes the delay between requests and responses. From this it derives response-time statistics and prints them out. The output is similar to other Unix -stat tools such as vmstat, iostat, and mpstat. The tool can optionally watch traffic to only a specified port, which makes it practical for timing requests and responses to a single daemon process such as mysqld, httpd, memcached, or any of a variety of other server processes.
The advantages of tcprstat are as follows:
- It is lightweight and unobtrusive. No bulky log files need be written and analyzed.
- Requests and responses are timed with microsecond resolution.
- The output is easy to import into spreadsheets, manipulate with command-line scripts, graph with gnuplot, and so on.
- It is protocol-agnostic, and works well for a large variety of client-server protocols that have a simple request-response model.
tcprstat is related to the tcpstat tool, but it focuses on response time measurements, not on the amount and size of the network traffic. This makes it useful for response time analysis, which is needed for techniques such as Goal-Driven Performance Optimization.
tcprstat development is hosted on Launchpad. We are currently moving downloads and bug tracking from GitHub to Launchpad, so some links in this document may refer to GitHub until that process is complete.
tcprstat is currently in Beta, although we consider it ready for production testing. Several C++ experts have reviewed the code, and it has been observed in high-stress production environments for many weeks with no apparent problems. Please test and review the code, and report any issues or suggestions you have. Please also review our roadmap and contribute your suggestions.
Here is a sample of tcprstat output, generated by watching traffic to a MySQL server on port 3306.
# tcprstat -p 3306 -t 1 -n 5
timestamp count max min avg med stddev 95_max 95_avg 95_std 99_max 99_avg 99_std
1283261499 1870 559009 39 883 153 13306 1267 201 150 6792 323 685
1283261500 1865 25704 29 578 142 2755 889 175 107 23630 333 1331
1283261501 1887 26908 33 583 148 2761 714 176 94 23391 339 1340
1283261502 2015 304965 35 624 151 7204 564 171 79 8615 237 507
1283261503 1650 289087 35 462 146 7133 834 184 120 3565 244 358
The output is generated a line at a time, one line per second, for five seconds. Each line is timestamped, and contains information about the response time of queries that ended that second. The columns include standard aggregations of the response times, in units of microseconds. There are groups of columns for the 95th percentile and 99th percentile, too. The exact meaning of each column is explained later in this manual.
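Because the output is plain whitespace-separated text with a header line, it is straightforward to load into a script. The following is a minimal sketch, not part of tcprstat itself: the function name parse_tcprstat is hypothetical, and it assumes every column holds an integer, as in the sample above.

```python
# Two rows copied from the sample output above.
SAMPLE = """\
timestamp count max min avg med stddev 95_max 95_avg 95_std 99_max 99_avg 99_std
1283261499 1870 559009 39 883 153 13306 1267 201 150 6792 323 685
1283261500 1865 25704 29 578 142 2755 889 175 107 23630 333 1331
"""


def parse_tcprstat(text):
    """Turn tcprstat output (header line plus data lines) into a list of dicts.

    Hypothetical helper: assumes whitespace-separated integer columns.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # skip lines that do not match the header
        rows.append({name: int(value) for name, value in zip(header, fields)})
    return rows


rows = parse_tcprstat(SAMPLE)
for row in rows:
    # With a one-second interval, count is the number of requests per second.
    print(row["timestamp"], row["count"], row["95_max"])
```

From here the rows can be written out as CSV, plotted, or aggregated further.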
Response times are computed by measuring the elapsed time from the last inbound packet to the first outbound packet. Certain types of packets containing only TCP control information are ignored.
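The timing rule can be illustrated with a short sketch. This is not tcprstat's actual code: the packet tuple layout, the connection identifier, and the choice to treat zero-payload packets as "control only" are all assumptions made for the example.

```python
def response_times(packets):
    """Compute response times from a time-ordered packet list.

    packets: iterable of (timestamp_us, conn_id, direction, payload_len),
    where direction is "in" or "out". This layout is hypothetical.
    """
    last_inbound = {}  # conn_id -> timestamp of the most recent inbound packet
    times = []
    for ts, conn, direction, payload_len in packets:
        if payload_len == 0:
            continue  # assumed: packets with no payload carry only TCP control data
        if direction == "in":
            # Later inbound packets overwrite earlier ones, so we keep the last.
            last_inbound[conn] = ts
        elif direction == "out" and conn in last_inbound:
            # Only the first outbound packet after a request is timed.
            times.append(ts - last_inbound.pop(conn))
    return times


example = [
    (0, "a", "in", 10),
    (5, "a", "in", 10),     # last inbound packet of the request
    (7, "a", "out", 0),     # bare ACK, ignored
    (20, "a", "out", 100),  # first outbound packet of the response
    (25, "a", "out", 100),  # rest of the response, not timed
]
print(response_times(example))
```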
Percentiles are computed by sorting the computed response times and ignoring the largest N percent of values. In the example just shown, during the first second the longest response time was 559009 microseconds, whereas 95% of response times were no greater than 1267 microseconds and 99% were no greater than 6792 microseconds. Other types of statistics are computed similarly. For example, the 95th percentile average response time is obtained by taking the average of the values computed, except for the largest 5%.
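A minimal sketch of this percentile calculation follows. The function name is hypothetical, and two details are assumptions rather than documented behavior: the number of values kept is rounded down, and the standard deviation is the population form.

```python
import statistics


def percentile_stats(times_us, pct):
    """Return (max, avg, stddev) of the values left after dropping the
    largest (100 - pct) percent. Hypothetical helper."""
    ordered = sorted(times_us)
    keep = (len(ordered) * pct) // 100  # assumed rounding: floor
    kept = ordered[:keep]
    if not kept:
        return None
    avg = sum(kept) / len(kept)
    return max(kept), avg, statistics.pstdev(kept)  # assumed: population stddev


values = list(range(1, 101))  # 100 made-up response times: 1..100 microseconds
p95 = percentile_stats(values, 95)
p99 = percentile_stats(values, 99)
print(p95[0], p95[1])
print(p99[0], p99[1])
```

With 100 values, the 95th percentile statistics cover the 95 smallest values and the 99th percentile statistics cover the 99 smallest.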
For portability and ease of use, we build a single statically linked binary, which can be downloaded and used as-is on 64-bit platforms. At present there is no installation mechanism, so to use tcprstat, you simply need to:
- Download the statically linked 64-bit binary (version 0.3.1)
- Move it into a directory in your PATH, such as /usr/bin
- Rename it to tcprstat
- Make it executable with chmod +x
There are currently no operating-system-specific packages for tcprstat, although in the future we plan to include it in our APT and YUM repositories. If you need to run tcprstat on 32-bit systems, you will need to build it from source.
tcprstat requires root permissions to execute. If you are not root, you should either become root with su, or execute tcprstat with sudo. The following examples assume you are root, and do not show the use of sudo.
The simplest functionality is to accept all defaults and simply execute the tool. By default, it will measure the TCP traffic for 10 seconds and print out a header, followed by a single line of statistics.
# tcprstat
timestamp count max min avg med stddev 95_max 95_avg 95_std 99_max 99_avg 99_std
1283265068 23892 425546 30 505 161 6240 835 186 102 4179 261 429
Basic, Useful Functionality
In practice, you will usually select a specific port to measure, set the tool to iterate infinitely, and possibly change the default 10-second interval. To accomplish this, use the following options:
- -p <port> selects a port
- -t <secs> sets the measurement interval, in seconds
- -n <iter> specifies the number of iterations to run; 0 means to run forever
For example, to watch Sphinx traffic in one-second intervals forever, execute
# tcprstat -p 3312 -t 1 -n 0
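If you want to drive tcprstat from a script rather than a shell, the same options can be assembled programmatically. The sketch below is illustrative only: build_command and stream_lines are hypothetical names. The flags -p, -t, and -n are the short options listed in the command-line option table in this manual, and the defaults mirror that table.

```python
import subprocess


def build_command(port=None, interval=10, iterations=1, binary="tcprstat"):
    """Build an argument list for tcprstat (hypothetical helper)."""
    cmd = [binary]
    if port is not None:
        cmd += ["-p", str(port)]
    cmd += ["-t", str(interval), "-n", str(iterations)]
    return cmd


def stream_lines(cmd):
    """Yield output lines as they are printed.

    Requires root permissions and tcprstat in PATH; not run in this example.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.terminate()  # stop the tool if the caller stops reading early
        proc.wait()


# Watch port 3312 in one-second intervals, forever.
print(build_command(port=3312, interval=1, iterations=0))
```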
Customizing The Output
tcprstat's output can be changed to include or omit various types of statistics on the traffic it measures. The default output includes columns that are useful for many purposes. The output is specified with %C format codes, where C is a single character. The following table documents the available format codes and their meaning.
|%n||count||y||Count of requests that completed during this iteration|
|%a||avg||y||Average response time|
|%s||sum||y||Sum of response times|
|%x||sqs|| ||Sum of squares of response times|
|%m||min||y||Minimum response time|
|%M||max||y||Maximum response time|
|%h||med||y||Median response time|
|%S||stddev||y||Standard deviation of response times|
|%v||var|| ||Variance of response times|
|%t||elapsed|| ||Seconds elapsed since the first iteration|
|%%|| || ||A literal %|
|\t|| || ||A tab character|
|\n|| || ||A newline character|
|95,99||Adds a prefix||y||A percentile indicator; see later in this section for more|
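One practical use of the count, sum, and sum-of-squares columns (%n, %s, %x) is that they are additive, so several intervals can be merged into one overall average and standard deviation after the fact. The following sketch shows the arithmetic. The function name is hypothetical, and it uses the population variance formula, which is an assumption and may not match how tcprstat computes its own stddev column.

```python
import math


def combine(intervals):
    """Merge per-interval (count, sum, sqs) tuples into (count, mean, stddev).

    Hypothetical helper; variance = sqs/n - mean^2 (population form).
    """
    n = total = sqs = 0
    for count, s, x in intervals:
        n += count
        total += s
        sqs += x
    if n == 0:
        return None
    mean = total / n
    variance = max(sqs / n - mean * mean, 0.0)  # guard against tiny negative rounding
    return n, mean, math.sqrt(variance)


# Made-up data: response times 1 and 2 in one interval, 3 and 4 in the next.
merged = combine([(2, 3, 5), (2, 7, 25)])
print(merged)
```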
You can change the tcprstat output format by customizing the -f option. For example, to print out the number of requests per second to a MySQL server on port 3306, execute the following:
# tcprstat -f '%n\n' -p 3306 -t 1 -n 0
count
2212
2070
...
You can also include a percentile indicator, to compute the statistics over the Nth percentile of the response times. The tool currently supports only the 95th and 99th percentiles. To print out statistics for a given percentile, include the percentile number between the % character and the format code. The default column header will then include the percentile value. The following example prints the maximum, 95th percentile maximum, and 99th percentile maximum response times in microseconds:
# tcprstat -f '%M\t%95M\t%99M\n' -p 3306 -t 1 -n 0
max 95_max 99_max
31221 411 3001
52721 495 2828
12173 507 1513
...
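To make the relationship between format codes and column headers concrete, here is a rough sketch that derives header names from a format string. It is not tcprstat's implementation. The name table is taken from the format code table and the sample output in this manual; %T as "timestamp" and the shortened "std" for percentile standard deviations are inferred from the default format and its header, not from a documented rule.

```python
import re

# Column names per format code, from the table and sample output in this manual.
NAMES = {"T": "timestamp", "n": "count", "a": "avg", "s": "sum", "x": "sqs",
         "m": "min", "M": "max", "h": "med", "S": "stddev", "v": "var",
         "t": "elapsed"}
# Inferred from the sample header, where %95S appears as "95_std".
PCT_NAMES = {"S": "std"}

CODE = re.compile(r"%(?:%|(95|99)?([A-Za-z]))")


def header_for(fmt):
    """Return the list of column names implied by a format string (hypothetical)."""
    cols = []
    for match in CODE.finditer(fmt):
        pct, code = match.group(1), match.group(2)
        if code is None:
            continue  # "%%" is a literal percent sign, not a column
        if code not in NAMES:
            raise ValueError(f"unknown format code %{code}")
        if pct:
            cols.append(f"{pct}_{PCT_NAMES.get(code, NAMES[code])}")
        else:
            cols.append(NAMES[code])
    return cols


# Raw strings keep \t and \n as the two-character sequences typed on the command line.
print(header_for(r"%M\t%95M\t%99M\n"))
```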
Analyzing a pcap File
tcprstat has the ability not only to analyze traffic live as it is captured, but also to analyze traffic from a file created by tcpdump. This makes it possible to gather traffic on one machine and analyze it elsewhere at a different time. To save traffic for later analysis, execute tcpdump with the -w option and specify the file in which to store the resulting traffic. Use the -r option to tcprstat to read traffic from that file and analyze it.
tcprstat normally decides what traffic is incoming (a request) and what traffic is outbound (a response) by finding a list of IP addresses bound to the local network interfaces. However, this will not work correctly when traffic is analyzed from a file that was gathered on a different host. You can use the -l option to specify a list of local IP addresses in this case.
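The role of the local address list can be sketched as follows. This is an illustration of the idea, not tcprstat's code: the function names are hypothetical, and returning None when both or neither endpoint is local is an assumption about an edge case this manual does not describe.

```python
import ipaddress


def parse_local(option_value):
    """Parse a comma-separated address list, as accepted by -l, into a set."""
    return {ipaddress.ip_address(part.strip())
            for part in option_value.split(",") if part.strip()}


def direction(src, dst, local):
    """Classify a packet as "in" (request) or "out" (response)."""
    src_local = ipaddress.ip_address(src) in local
    dst_local = ipaddress.ip_address(dst) in local
    if dst_local and not src_local:
        return "in"
    if src_local and not dst_local:
        return "out"
    return None  # assumed: cannot classify when both or neither side is local


# Made-up addresses of the host where the capture was taken.
local = parse_local("10.0.0.5, 192.168.1.5")
print(direction("10.0.0.9", "10.0.0.5", local))
print(direction("10.0.0.5", "10.0.0.9", local))
```

If the capture host's addresses are not supplied, every packet in this sketch is unclassified, which is why the -l option matters when reading a file gathered elsewhere.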
The following is a complete list of tcprstat's command-line options. You can always find a full list of the available command-line options and brief usage information by giving the --help option.
|Option Name||Short Name||Type||Default Value||Meaning|
|--format||-f||string||"%T\t%n\t%M\t%m\t%a\t%h\t%S\t%95M\t%95a\t%95S\t%99M\t%99a\t%99S\n"||A format string; see above.|
|--[no]header|| ||string||Enabled||If no argument is given, tcprstat auto-generates the header based on --format. If an argument is given, tcprstat uses that as the header instead. If --no-header is used, tcprstat will not print a header.|
|--help|| || || ||Shows program information and usage.|
|--interval||-t||integer||10||The number of seconds tcprstat waits between each successive line of output.|
|--iterations||-n||integer||1||How many iterations tcprstat should execute before exiting; 0 means infinity.|
|--local||-l||string|| ||Accepts a comma-separated list of IP addresses to consider as local IP addresses, instead of getting a list from the operating system.|
|--port||-p||integer|| ||Capture traffic only for the specified TCP port; if none, capture all traffic.|
|--read||-r||string|| ||Read the specified pcap file instead of capturing traffic from the network.|
|--version|| || || ||Shows version information.|
The tcprstat source code is hosted on Launchpad.
Please use Launchpad's issue tracking system to report issues and request features. For general discussion, please use the Percona-Discussion Google Group. You can use the #percona IRC channel on FreeNode to chat with community members.
For commercial support, maintenance packages, or to sponsor features, please contact Percona Sales.
tcprstat was written by Ignacio Nin, based on ideas from Baron Schwartz and other Percona consultants. The code was reviewed by Oleg Tsarev, Sasha Pachev, and other members of the Percona development team.