Emergency

mysql 5.7 insert netflow data (30 000 rcords /per sec)

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • mysql 5.7 insert netflow data (30 000 rcords /per sec)

    Hello Everybody
    i am developing netflow monitor and i am stuck in mysql 5.7 on windows PC.
    The application i am working on only for windows PC
    I have netflow packets (30 000 per sec) and i need to insert them to DB
    I created innodb table with partitions ( 1 partition for 1 day).
    This is table structure
    CREATE TABLE IF NOT EXISTS netflow_1 (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `stimestamp` int(10) unsigned NOT NULL,
    `flow_start_time` int(10) unsigned NOT NULL,
    `input_snmp` SMALLINT(5) unsigned DEFAULT NULL,
    `output_snmp` SMALLINT(5) unsigned DEFAULT NULL,
    `direction` TINYINT(3) unsigned DEFAULT NULL,
    `ipv4_src_addr` int(11) unsigned DEFAULT NULL,
    `ipv4_dst_addr` int(11) unsigned DEFAULT NULL,
    `netflow_bytes` BIGINT(20) unsigned DEFAULT NULL,
    `protocol` TINYINT(3) unsigned DEFAULT NULL,
    `l4_src_port` SMALLINT(5) unsigned DEFAULT NULL,
    `l4_dst_port` SMALLINT(5) unsigned DEFAULT NULL,
    `src_as` SMALLINT(5) unsigned DEFAULT NULL,
    `dst_as` SMALLINT(5) unsigned DEFAULT NULL,
    PRIMARY KEY (`id`,`stimestamp`),
    KEY `idx_stimestamp` (`stimestamp`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    PARTITION BY RANGE (stimestamp) (
    PARTITION p2017_12_01 VALUES LESS THAN (UNIX_TIMESTAMP("2017-12-02"))
    );


    And i use procedure to insert data . It takes me 45-50 sec to insert 1 million records
    How to improve i do not know
    Also how to
    1 - select port, sum(netflow_bytes) as _netflow_sum from netflow_1 where stimestamp > 0 and stimestamp < 4000000 group by port order by _netflow_sum desc limit 0,10;
    2 - select src_as, dst_as, sum(netflow_bytes) as _netflow_sum from netflow_1 where stimestamp > 0 and stimestamp < 4000000 group by concat(src_as, dst_as) desc limit 0 , 10;
    User should be able to select netflow by time range. In GUI it is usual table with pagination and calendar.

    Netflow data will be stored during 30 days. It means the table with have 30 partitions.
    Thanks for any advice

  • #2
    For your insertion performance, I would first try dropping the partition altogether to see if that helps. If that does help, then I would suggest the following (I do this as well in practice)
    • insert one more column, partition_id (int) which is a concat of YYYMM or YYYYMMDD. Have this be your partition column instead of timestamp (still partition on range). As you know, timestamp is 8 bytes, and for the most part unique, so on a range partition, this does hurt, especially on your insert frequency
    1. That select is fine. Make sure port is an index. You can try to select from the specific partition instead of specifying a WHERE clause (if it's for the entire day) SELECT x FROM y PARTITION(p2017_12_01)
    2. Don't concat the GROUP BY, it's not required (it's the same thing as GROUP BY src_as, dst_as). Same suggestion as above for WHERE

    Comment


    • #3
      1 - i use stimestamp (it is int) as a partition column
      2 - thanks foGROUP BY concat(...), i forgot about this
      But i have many options i have to "GROUP BY "
      1 ) (CONCAT(ifnull(n.protocol,''), LEAST(ifnull(n.l4_src_port,0), ifnull(n.l4_dst_port,0))))
      2 ) port
      3 ) ipv4_src_addr
      4 ) ipv4_dst_add
      5 ) concat ( ipv4_src_addr , ipv4_dst_add )
      6 ) src_as
      7 ) dst_as
      8 ) concat ( src_as , dst_as)

      'Make sure port is an index.' In my case it means i should create a lot of indexes. And some of them should be complex ( concat ( ipv4_src_addr , ipv4_dst_add ) )

      About inserting performance
      https://prnt.sc/hjuoms

      Thanks for advice

      Comment

      Working...
      X