Storing IPv6 IPs in MySQL

Intro to IPv6:
ipv6.l.google.com has IPv6 address 2001:4860:0:2001::68
2001:4860:0:2001::68 is short notation for:
2001:4860:0000:2001:0000:0000:0000:0068

IPv6 is 128 bit. There are 2^128 IPv6 IPs. That’s 340 undecillion.

I have a whole lot more detail on what IPv6 is, saving that for a later presentation, let’s jump to something fun. This is what I need your help with. What are the considerations you have to make when developing your applications for IPv6? Here’s one:

Imagine you want to store a list of IPv6’s in a MySQL table. Maybe this is a list of IPs allowed to connect to your application, or maybe it’s a list of IP to server assignments.

IPv6 is 128 bit. This is larger than will fit in any single MySQL numeric data type. (Note: postgres has ip/netmask data types and functions to do calculations on them. It will also listen on the IPv6 network stack, MySQL won’t. But, that’s another story. The challenge for now is, storing IPv6 IPs in MySQL)

One method is to store the IPv6 addresses as a string that represents the IP in short notation:

CREATE TABLE `ipshort` (
`id` bigint(22) unsigned NOT NULL auto_increment,
`bitmask` tinyint(3) unsigned NOT NULL default ‘128’,
`ip` varchar(39) NOT NULL,
PRIMARY KEY (`id`),
KEY `bitmask` (`bitmask`),
KEY `ip` (`ip`),
UNIQUE KEY `bitmaskip` (`bitmask`, `ip`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Notes:

  • The length of “ip” will be up to 39+1. Average length depends on how compressed the IPs are that you insert.
  • Not good if you want to test an IP against the table. In other words, if you want to test if 2001:1::1/128 is covered by any ip/bitmask already stored in the table, you’ll have to select each one out and do a bunch of math on each one.

Next attempt, store IPs as their binary string:

CREATE TABLE `ipbinary` (
`id` bigint(22) unsigned NOT NULL auto_increment,
`binstr` varchar(128) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `binstr` (`binstr`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Notes:

  • The bitmask is assumed by the length of the binstr string.
  • Not space efficient. For example, storing 2001::1 in binary will require all 128 bytes.
  • Maybe you don’t care how much disk space it consumes. You’re only going to store 1 million entries, and 100 to 150 MB is not a concern to you for disk or memory (you want the indexes to fit in memory).
  • This is much better if you want to test an IP against the table. In other words, if you want to test if 2001:1::1/128 is covered by any ip/bitmask already stored in the table, you could do it like this:
    1. convert compare_ip to string that represents the binary conversion
    2. compare_bitmask = compare_bitmask specified ? compare_bitmask : 128
    3. $sql = “SELECT * FROM ipbinary WHERE 1!=1 “;
      for ( $j=1; $j<=$compare_bitmask; $j++)
      {
      $sql .= ” OR binstr=SUBSTR(‘$binip’, 1, $j)”;
      }
      NOTE: yes, creates a big query, but limited to 128 conditions max. Hard for human comprehension, but easy for MySQL to handle. This is a fast query, even on a huge table, as long as your indexes are loaded in memory. Key point, does not require a full table scan to find the matching rows, it’s index friendly.
    4. If you get a row from the query, it’s in the table.

Finally, I created a hybrid, that stores each section of the hex-based IP as a decimal:

CREATE TABLE `ipsmallint` (
`id` bigint(22) unsigned NOT NULL auto_increment,
`bitmask` enum(’16’,’32’,’48’,’64’,’80’,’96’,’112′,’128′) NOT NULL default ‘128’,
`block1` smallint(5) unsigned NOT NULL default ‘0’,
`block2` smallint(5) unsigned NOT NULL default ‘0’,
`block3` smallint(5) unsigned NOT NULL default ‘0’,
`block4` smallint(5) unsigned NOT NULL default ‘0’,
`block5` smallint(5) unsigned NOT NULL default ‘0’,
`block6` smallint(5) unsigned NOT NULL default ‘0’,
`block7` smallint(5) unsigned NOT NULL default ‘0’,
`block8` smallint(5) unsigned NOT NULL default ‘0’,
PRIMARY KEY (`id`),
UNIQUE KEY `block` (`block1`,`block2`,`block3`,`block4`,`block5`,`block6`,`block7`,`block8`),
KEY `bitmask` (`bitmask`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Notes:

  • Only use this method if it’s acceptable to limit your bitmasks to ’16’,’32’,’48’,’64’,’80’,’96’,’112′,’128′
  • A hex address is 8 blocks of hex numbers, like this: FFFF:AAAA:0000:0000:BBBB:BBBB:CCCC:55555. The max value for each block is FFFF or 65535.
  • In MySQL, a smallint is 2 bytes, unsigned, holds a max size of 65535. Magic.
  • To store an IP in this table, you take each block, convert to decimal, and store in the appropriate block. If you want to store FFFF::/16, you put 65535 in block1, the rest get 0.
  • This is a fixed table size. Every row is 8 bytes bigint + 1 byte for the enum + 2 bytes for each smallint. 25 bytes per row. This is the most disk space efficient method I can think of. The ip part of the table is 8*2bytes = 16 bytes. That’s exactly how many bytes are required to store a 128-bit number.
  • You can a similar technique if you want to test an IP against the table. In other words, if you want to test if 2001:1::1/128 is covered by any ip/bitmask already stored in the table, you could do it like this:
  1. Uncompress compare_ip and explode on the you’ll have 8 parts.
  2. compare_bitmask = compare_bitmask specified ? compare_bitmask : 128
  3. $sql = “SELECT * FROM ipsmallint WHERE 1!=1″;
    for ( $j=1; $j<=$bitmask/16; $j++)
    {
    $sql .= ” OR (bitmask = ‘”.($j*16).”‘”;
    for ( $k=1; $k<=$j; $k++)
    $sql .= ” AND block$k = “.hexdec($parts[$k-1]);
    $sql .= “)”;
    }
    NOTE: smaller query, limited to 8 conditions. Again, fast query, even on a big table.
  4. If you get a row from the query, it’s in the table.
  • Limitation to all 3 tables: BIGINT is 8 bytes, or 64-bits. You’re limited to 2^64 entries, half of the IP space available in IPv6. You’ll have serious problems with MySQL long before you get to 2^64 rows anyway, so not much of a concern, just worth a note.

In my experiments, I injected ~800 thousand random IPs into ipsmallint and only ~60 thousand rows into ipbinary. Compare IP time was ~1000 lookups per second against ipsmallint, and about 200 per second against ipbinary. ipbinary consumed 3 times as much disk space per row (51975 rows consumed 4707568 bytes. ipsmallint was 801171 rows consuming 20830446 bytes). In conclusion, I think the long, up to 128 conditions, lookup query really does have a heavy factor. I did an EXPLAIN on one of the long ipbinary queries, it showed:

+—-+————-+———-+——-+—————+——–+———+——+——+————-+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+———-+——-+—————+——–+———+——+——+————-+
| 1 | SIMPLE | ipbinary | range | binstr | binstr | 130 | NULL | 98 | Using where |
+—-+————-+———-+——-+—————+——–+———+——+——+————-+

Doesn’t seem like it should take so long. Maybe I made a mistake, it is late, and I’m tired. OR, maybe the theory about indexes being more efficient in fixed length tables, such as ipsmallint, is true, and more so than I originally thought!

One other interesting topic, AAAA records in a MyDNS system. Is ipshort or ipsmallint method better for disk space efficiency? ipshort is up to 40 bytes (39 + 1 byte to store the length). ipsmallint method is always exactly 16 bytes. Let’s see, the google example 20 bytes + 1 to store the length. Let’s just say that’s the average, 21. We get a 23.8% savings in disk space by using ipsmallint. Another way to express is it requires 31.25% more disk space to use the ipshort method. Of course, there is a trade-off to using ipsmallint – the DNS system now must convert the 8 blocks to hex, assemble the string, and convert to short notation. This requires processing power, and could reduce the efficiency/scalability of the DNS application servers. The question is, by how much, and is it worth the trade off in disk space savings?

DaveK

247 thoughts on “Storing IPv6 IPs in MySQL”

  1. mysql> show create table address;
    +———+———————————————————————————————————————————————————————————————+
    | Table | Create Table |
    +———+———————————————————————————————————————————————————————————————+
    | address | CREATE TABLE `address` (
    `id` int(10) unsigned NOT NULL auto_increment,
    `ip` decimal(39,0) default NULL,
    PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=10 DEFAULT CHARSET=latin1 |
    +———+———————————————————————————————————————————————————————————————+
    1 row in set (0.00 sec)

    mysql> Insert into address set ip= CONV(‘ffff’, 16, 10) * POW(65535, 7) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 6) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 5) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 4) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 3) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 2) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 1) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 0);

    Query OK, 1 row affected, 1 warning (0.07 sec)

    mysql> Insert into address set ip= CONV(‘ffff’, 16, 10) * POW(65535, 7) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 6) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 5) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 4) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 3) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 2) +
    -> CONV(‘ffff’, 16, 10) * POW(65535, 1) +
    ->
    -> CONV(‘fffe’, 16, 10) * POW(65535, 0);
    Query OK, 1 row affected, 1 warning (0.00 sec)

    mysql> select * from address;
    +—-+—————————————–+
    | id | ip |
    +—-+—————————————–+
    | 10 | 340246022585899900000000000000000000000 |
    | 11 | 340246022585899900000000000000000000000 |
    +—-+—————————————–+
    2 rows in set (0.00 sec)

    Question:
    What’s wrong with this picture?
    Answer:
    The calculated values are identical, but they are not from the same IP. In fact the values inserted are not the correct decimal representation. This is due to MySQL losing precision in the POW calculation.

    This is why:

    mysql> select POW(65535, 7);
    +——————–+
    | POW(65535, 7) |
    +——————–+
    | 5.191742286784e+33 |
    +——————–+
    1 row in set (0.00 sec)

    So, DECIMAL(39) will work great for storing, and even does good on math such as where ip > value1 and ip < value2. Just have to do the IPv6 to decimal mathematics outside of MySQL. If someone can show a reliable method for converting an IPv6 IP to a decimal value using MySQL, that would be awesome!

Comments are closed.