IP Networking

Synopsis

#include "swoc/swoc_ip.h"

Usage

This library is for storing and manipulating IP addresses as data. It has no support for actual network operations. The goal is to make handling IP networking data straightforward.

Many of the classes come in triplets: a family independent type along with two family specialized types. In general the generic type acts as a union of the family specific types.

IPEndpoint

swoc::IPEndpoint is a wrapper around sockaddr and related structures that provides a number of utilities such as

  • Constructing instances from strings.

  • Support for IPv4 and IPv6 in a single data type.

  • Family independent access to common elements such as the port.

  • Conversion to and from the more specialized classes.

IPAddr

The classes swoc::IPAddr, swoc::IP4Addr, and swoc::IP6Addr are used to hold IP addresses. IP4Addr and IP6Addr are family specific and hold (respectively) IPv4 and IPv6 addresses. IPAddr acts as a union of these two types along with an IP family specifier that indicates the type of address contained. The type specific classes provide performance and storage benefits when the type of the address is known or forced, while IPAddr provides a generic type useful for interfaces. An IPAddr can be in an invalid state but the family specific classes cannot, as every possible bit pattern is a valid address.

These classes provide support for parsing and formatting IP addresses. The constructor can take a string and, if it is a valid address, will initialize the instance to that address. The downside is that there is no indication of failure other than the instance being initialized to the zero or “any” address. This can be reasonable in situations where those addresses are not valid either. In general, however, the swoc::IPAddr::load() method should be used, which both initializes the instance and indicates whether the input was valid. This is also the expected way to check whether a string is a valid address.

Conversions to and from sockaddr are provided. This is handier with IPAddr as it will conform to the family of the address in the sockaddr. It is usually best to use IPEndpoint instead of a raw sockaddr.

A variety of string formats are supported, primarily for legacy and convenience. The family specific types will only parse addresses of that family. To parse a string as either family use the generic type IPAddr. An IPv4 address can be specified as a single numeric value, which is used primarily to parse “0” as an address (e.g. in the case of specifying a network as “0/0”). IPv4 octets can be specified in decimal (default), octal (leading 0), or hexadecimal (leading “0x”).
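The octet rules can be illustrated without the library. What follows is a minimal standalone sketch, not the swoc parser: the name parse_ip4 and its bool plus output parameter shape are assumptions made for illustration, and it leans on std::strtoul with base 0 to recognize decimal, octal, and hexadecimal components. Text with no dots is treated as a single 32 bit value, so “0” parses as the zero address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of swoc. Parses IPv4 text into a host
// order 32 bit value and returns false if the text is not valid.
bool parse_ip4(std::string const &text, uint32_t &addr) {
  uint32_t parts[4];
  int n         = 0;
  char const *p = text.c_str();
  while (true) {
    // Each component must start with a digit (no sign, no whitespace).
    if (n == 4 || *p < '0' || *p > '9') return false;
    char *end = nullptr;
    // Base 0: leading "0x" is hexadecimal, leading "0" is octal, else decimal.
    unsigned long v = std::strtoul(p, &end, 0);
    if (v > 0xFFFFFFFFul) return false;
    parts[n++] = static_cast<uint32_t>(v);
    if (*end == '\0') break;       // consumed the whole string
    if (*end != '.') return false; // only dots may separate components
    p = end + 1;
  }
  if (n == 1) { // single numeric value, e.g. "0"
    addr = parts[0];
    return true;
  }
  if (n != 4) return false; // this sketch accepts only 1 or 4 components
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i) {
    if (parts[i] > 255) return false; // each octet must fit in 8 bits
    value = (value << 8) | parts[i];
  }
  addr = value;
  return true;
}
```

Returning a bool rather than a sentinel address mirrors the reason load() is preferred over the string constructor: the zero address is a legitimate parse result and cannot double as a failure marker.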

IPSrv

Storage for an address and a port. There is no really good name for this, therefore I used the DNS term for such an object. This consists of the usual triplet of classes, swoc::IP4Srv, swoc::IP6Srv, and swoc::IPSrv. The first two are protocol family specific and the third holds an instance of either an IP4Srv or an IP6Srv. The address and port can be manipulated separately.

IPRange

The classes swoc::IPRange, swoc::IP4Range, and swoc::IP6Range are used to hold ranges of IP addresses. IP4Range and IP6Range are family specific and hold (respectively) ranges of IPv4 and IPv6 addresses. IPRange acts as a union of these two types along with an IP family specifier that indicates the type of address contained. The type specific classes provide performance and storage benefits when the type of the address is known or forced, while IPRange provides a generic type useful for interfaces. Note that an IPRange holds a range of addresses of a single family; it can never hold a range of mixed families. In addition there is the class swoc::IPRangeView, which is a view of a family specific range wrapped in a family agnostic class. Its purpose is to decrease copying by delaying it and therefore, in some cases, avoiding it.

These classes provide support for parsing and formatting ranges of IP addresses. The parsing logic accepts three forms of a range. In all cases the lower and upper limits of the range must be of the same IP address family.

Range

Two addresses, separated by a dash (“-”) character. E.g.

172.26.13.4-172.27.12.9

Network

An address and a CIDR based mask, separated by a slash (“/”) character. E.g.

1337:0:0:ded:BEEF::/48

Singleton

A single IP address, which is interpreted as a range of size 1.

Such a string can be passed to the constructor, which will initialize to the corresponding range if properly formatted, otherwise the range will be default constructed to an invalid range. There is also the swoc::IPRange::load() method which returns a bool to indicate if the parsing was successful. This is the expected mechanism for validating a string as a valid range.

This class has formatting support in “bwf_ip.h”. In addition to all of the formatting supported for sockaddr, the additional extension code ‘c’ can be used to indicate compact range formatting. Compact means a singleton range will be written as just the single address, and if the range is also a network it will be printed in CIDR format.

Range                 Compact
10.1.0.0-10.1.0.127   10.1.0.0/25
10.2.0.1-10.2.0.127   10.2.0.1-10.2.0.127
10.3.0.0-10.3.0.126   10.3.0.0-10.3.0.126
10.4.1.1-10.4.1.1     10.4.1.1
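The compact rule can be restated as a small decision procedure. The following is an IPv4 only sketch written against plain integers rather than the swoc formatting code, which is not shown here; ip4, ip4_text, and compact_text are hypothetical names. A range is treated as a network when its size is a power of two and its lower bound is aligned on that size.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Build a host order IPv4 value from four octets.
constexpr uint32_t ip4(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
  return (a << 24) | (b << 16) | (c << 8) | d;
}

// Dotted decimal text for a host order IPv4 value.
std::string ip4_text(uint32_t a) {
  return std::to_string(a >> 24) + "." + std::to_string((a >> 16) & 0xFF) + "." +
         std::to_string((a >> 8) & 0xFF) + "." + std::to_string(a & 0xFF);
}

// Hypothetical helper: compact text for the inclusive range [lo, hi], lo <= hi.
std::string compact_text(uint32_t lo, uint32_t hi) {
  if (lo == hi) return ip4_text(lo); // singleton: just the address
  uint64_t size = uint64_t(hi) - lo + 1;
  bool is_pow2  = (size & (size - 1)) == 0;
  if (is_pow2 && (lo & (size - 1)) == 0) { // aligned power of two: a network
    int cidr = 32;
    for (uint64_t s = size; s > 1; s >>= 1) --cidr;
    return ip4_text(lo) + "/" + std::to_string(cidr);
  }
  return ip4_text(lo) + "-" + ip4_text(hi); // anything else stays a range
}
```

Applied to the four ranges in the table, this sketch yields the same text as the Compact column: only the first range is an aligned block of 128 addresses, and the two ranges of 127 addresses stay in range form.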

IPNet

Address networks are supported by the usual triplet of classes, swoc::IPNet, swoc::IP4Net, and swoc::IP6Net.

In addition the class swoc::IPMask is used to store a network mask. There are no family specific variants as a mask is really a CIDR based bit count along with utility methods.

String parsing requires a “/” separator with a leading network address followed by either a CIDR count or an address that is a valid network mask. In the latter case the mask address must be of the same family as the network address. A valid network mask must be a sequence of ‘1’ bits followed by all ‘0’ bits.
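The validity rule for a mask given as an address can be checked with two bit operations. This is a standalone IPv4 sketch, not the library's implementation; mask_width is a hypothetical name. It relies on the observation that a valid mask, when inverted, is a run of 0 bits followed by a run of 1 bits, and adding one to such a value shares no bits with it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the CIDR width (0..32) if the mask is a run
// of 1 bits followed only by 0 bits, otherwise -1.
int mask_width(uint32_t mask) {
  uint32_t inv = ~mask; // a valid mask inverts to 0...01...1
  if ((inv & (inv + 1)) != 0) return -1;
  int width = 0;
  for (; mask != 0; mask <<= 1) ++width; // each shift drops one leading 1 bit
  return width;
}
```

For example, 255.255.255.0 (0xFFFFFF00) has width 24, while 255.0.255.0 is rejected because its 1 bits are not contiguous.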

Networks are treated as specialized ranges and can always be converted to an instance of a range class. Conversely a range class can be converted to a sequence of networks. For any range there is exactly one sequence of networks that contains the same addresses and is of minimal length. The range classes support generating this sequence, which results in a sequence of instances of a network class.
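Both directions of this conversion can be sketched for IPv4 with plain integers. The code below is an illustration under stated assumptions, not the swoc implementation: Net4, Range4, network_to_range, and range_to_networks are hypothetical names. The minimal sequence is produced greedily by taking, at each step, the largest network that starts at the current address, is aligned on its own size, and does not extend past the end of the range.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Net4 {
  uint32_t addr; // network address, host order
  int cidr;      // number of leading mask bits, 0..32
};

struct Range4 {
  uint32_t lo, hi; // inclusive bounds, host order
};

// A network is the range from the masked address to that address with
// all host bits set.
Range4 network_to_range(Net4 net) {
  uint32_t mask = net.cidr == 0 ? 0 : ~uint32_t(0) << (32 - net.cidr);
  uint32_t lo   = net.addr & mask;
  return {lo, lo | ~mask};
}

// Minimal sequence of networks covering exactly [lo, hi], lo <= hi.
std::vector<Net4> range_to_networks(uint32_t lo, uint32_t hi) {
  std::vector<Net4> nets;
  uint64_t cur = lo, last = hi; // 64 bit to avoid wrap at 255.255.255.255
  while (cur <= last) {
    int cidr      = 32;
    uint64_t size = 1;
    // Double the block while it stays aligned on cur and inside the range.
    while (cidr > 0 && (cur & (size * 2 - 1)) == 0 && cur + size * 2 - 1 <= last) {
      size *= 2;
      --cidr;
    }
    nets.push_back({static_cast<uint32_t>(cur), cidr});
    cur += size;
  }
  return nets;
}
```

With this sketch the range 10.1.0.0-10.1.0.127 becomes the single network 10.1.0.0/25, while 10.2.0.1-10.2.0.127 needs seven networks, from 10.2.0.1/32 up to 10.2.0.64/26, because its lower bound is odd.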

A mask can be converted to the corresponding address. E.g. the IPv6 address for an 85 bit mask could be generated with

IPMask(85).as_ip6() // yields an IP6Addr instance.
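To make the result concrete, the bytes of such a mask address can be computed by hand. This is a standalone sketch that assumes the usual convention that a mask of width N has its N most significant bits set; ip6_mask_bytes is a hypothetical name and the code does not use the library.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: the 16 bytes, in network order, of an IPv6 mask
// with the given width (0..128).
std::array<uint8_t, 16> ip6_mask_bytes(unsigned width) {
  std::array<uint8_t, 16> bytes{}; // all zero to start
  for (unsigned i = 0; i < 16 && width > 0; ++i) {
    unsigned n = std::min(width, 8u); // mask bits that land in this byte
    // 0xFF00 >> n leaves the top n bits set in the low byte.
    bytes[i] = static_cast<uint8_t>(0xFF00u >> n);
    width -= n;
  }
  return bytes;
}
```

For a width of 85 this gives ten bytes of 0xFF, then 0xF8 for the remaining five bits, then zeros, which corresponds to the address ffff:ffff:ffff:ffff:ffff:f800::.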

Conversions

Most conversions between types should be straightforward, but in some cases there is some indirection in order to avoid bad conversions due to differences in address families.

Conversion from swoc::IPEndpoint to swoc::IPAddr is direct as the latter can be explicitly constructed from the former. For swoc::IP6Addr and swoc::IP4Addr the family must be checked first. The expected way to do this is

if ( auto * sa = ep.ip4() ; sa ) {
   IP4Addr addr(sa);
   // ....
}

Note the ip4() and ip6() methods return a pointer to the appropriate family specific type (sockaddr_in* and sockaddr_in6*) if the family matches and nullptr if not. This is intended to be similar to how dynamic casts are handled when it is not guaranteed the generic type contains an instance of the more specific type.
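The same checked cast idea can be shown on a raw sockaddr using only the POSIX socket headers. This is a sketch of the pattern, not the IPEndpoint implementation; as_ip4 and as_ip6 are hypothetical free functions.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helpers: return the family specific view of the socket
// address if the family matches, nullptr otherwise.
sockaddr_in const *as_ip4(sockaddr const *sa) {
  return (sa && sa->sa_family == AF_INET) ? reinterpret_cast<sockaddr_in const *>(sa) : nullptr;
}

sockaddr_in6 const *as_ip6(sockaddr const *sa) {
  return (sa && sa->sa_family == AF_INET6) ? reinterpret_cast<sockaddr_in6 const *>(sa) : nullptr;
}
```

As with dynamic_cast on a pointer, the caller tests the result before using it, so code for the wrong family is never reached.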

Conversion from address types to socket addresses can be done by constructing an IPEndpoint or, if the socket address structure already exists, using the copy_to methods on the address types, such as swoc::IP4Addr::copy_to(). Note that converting an address type sets only the family and address. Converting a service type also sets the port.
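The practical consequence of setting only the family and address is that a port already present in the structure is left alone. The sketch below shows that behavior with the POSIX structures directly; copy_ip4_to is a hypothetical stand-in and makes no claim about how the library's copy_to is written.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <netinet/in.h>

// Hypothetical stand-in for an address level conversion: writes the family
// and the address (converted to network order) and nothing else.
void copy_ip4_to(uint32_t host_order_addr, sockaddr_in *sa) {
  sa->sin_family      = AF_INET;
  sa->sin_addr.s_addr = htonl(host_order_addr);
}
```

If the structure was prepared with a port before the call, that port is still there afterwards; an address alone carries no port to write.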

IPSpace

The swoc::IPSpace class is designed as a container for ranges of IP addresses. Lookup is done by single addresses. Conceptually, for each possible IP address there is a payload. Populating the container is done by applying a specific payload to a range of addresses. After populating, an IP address can be looked up to find the corresponding payload.

The payload is a template argument to the class, as with standard containers. There is no template argument for the key, as that is always an IP address.

Applying payloads to the space is analogous to painting, with each distinct payload considered a different “color” for an address. There are several methods used to paint the space, depending on the desired effect on existing payloads.

mark

swoc::IPSpace::mark() applies the payload to the range, replacing any existing payload present. This is modeled on the “painter’s algorithm” where the most recent coloring replaces any prior colors in the region.

mark_bulk

swoc::IPSpace::mark_bulk() applies multiple payloads to their ranges using the same logic as mark. This has much better performance than calling mark many times in succession, and results in a much more balanced RBTree structure (for faster lookups) in the case where IP ranges are inserted in ascending order.

fill

swoc::IPSpace::fill() applies the payload to the range but only where there is not already a payload. This is modeled on “backfilling” a background. This is useful for “first match” logic, as whichever payload is first put into the container will remain.

blend

swoc::IPSpace::blend() applies the payload to the range by combining (“blending”) the payloads. The result of blending can be “uncolored” which results in those addresses being removed from the space. This is useful for applying different properties in sequence, where the result is a combination of the properties.
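The difference between mark and fill is easiest to see on a toy model. The sketch below is not swoc::IPSpace: ToySpace is a hypothetical class that stores one entry per small integer “address” instead of ranges, and it uses a single character as the color. It is meant for tiny ranges only.

```cpp
#include <cassert>
#include <map>

// Toy model, not swoc::IPSpace: one map entry per integer "address".
struct ToySpace {
  std::map<unsigned, char> cells;

  // Painter's algorithm: the new color always replaces what was there.
  void mark(unsigned lo, unsigned hi, char color) {
    for (unsigned a = lo; a <= hi; ++a) cells[a] = color;
  }

  // Backfill: emplace does nothing if the address already has a color.
  void fill(unsigned lo, unsigned hi, char color) {
    for (unsigned a = lo; a <= hi; ++a) cells.emplace(a, color);
  }

  // Color at an address, or '.' if the address has no payload.
  char at(unsigned a) const {
    auto spot = cells.find(a);
    return spot == cells.end() ? '.' : spot->second;
  }
};
```

Marking 2-5 with 'R' and then filling 0-7 with 'B' leaves 2-5 as 'R' and colors only the previously empty addresses 'B'. A later mark of 4-6 with 'G' overwrites both earlier colors in that span.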

Blend

Blending is different than marking or filling, as the latter two apply the payload passed to the method. That is, if an address is marked by either method, it is marked with precisely the payload passed to the method. blend is different because it can cause an address to be marked by a payload that was not explicitly passed in to any coloring method. Instead of replacing an existing payload, it enables computing the resulting payload from the existing payload and a value passed to the method.

The swoc::IPSpace::blend() method requires a range and a “blender”, which is a functor that blends a color into a PAYLOAD instance. The signature is

bool blender(PAYLOAD & payload, U const& color)

The type U is the same as the template argument U to the blend method, which must be compatible with the second argument to the blend method. The argument passed as color to the functor is the value passed as the second argument to blend. The point of this indirection is to enable passing values of types distinct from the payload type. In practice blending rarely replaces or updates the entire payload, but only a part of it, and therefore the blend can be simpler and more performant by passing only the data needed for the update, and not an entire payload instance type.

The method is modeled on C++ compound assignment operators. If the blend operation is thought of as the “@” operator, then the blend functor performs lhs @= rhs. That is, lhs is modified to be the combination of lhs and rhs. lhs is always the previous payload already in the space, and rhs is the color argument to the blend method. The internal logic handles copying the payload instances as needed.

The return value of the blender indicates whether the combined result in lhs is a valid payload or not. If valid the method should return true. In general most implementations will return true in all cases. If the method returns false then the address(es) for the combined payload are removed from the container. This allows payloads to be “unblended”, for one payload to cancel out another, or to do selective erasing of ranges.

As an example, consider the case where the payload is a bitmask. It might be reasonable to keep empty bitmasks in the container, but it would also be reasonable to decide the empty bitmask and any address mapped to it should be removed entirely from the container. In such a case, a blender that clears bits in the payloads should return false when the result is the empty bitmask.

Similarly, if the goal is to remove ranges that have a specific payload, then a blender that returns false if lhs matches that specific payload and true if not, should be used.

There is a small implementation wrinkle, however, in dealing with unmapped addresses. The color is not necessarily a PAYLOAD and therefore must be converted into one. This is done by default constructing a PAYLOAD instance and then calling blend on that and the color. If this returns false then unmapped addresses will remain unmapped.
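The rules for the blender's return value and for unmapped addresses can be modeled per address. This is a toy sketch of the behavior described above, not the container's range based implementation; toy_blend, set_bits, and clear_bits are hypothetical names and the payload is shrunk to eight bits to keep the example small.

```cpp
#include <bitset>
#include <cassert>
#include <map>

using PAYLOAD = std::bitset<8>;

// Per address toy model of blend, intended for tiny ranges only.
template <typename U, typename F>
void toy_blend(std::map<unsigned, PAYLOAD> &space, unsigned lo, unsigned hi, U const &color, F &&blender) {
  for (unsigned a = lo; a <= hi; ++a) {
    auto spot = space.find(a);
    if (spot != space.end()) {
      // Mapped address: blend in place, unmap it if the result is invalid.
      if (!blender(spot->second, color)) space.erase(spot);
    } else {
      // Unmapped address: blend into a default constructed payload and
      // keep the result only if the blender accepts it.
      PAYLOAD fresh;
      if (blender(fresh, color)) space.emplace(a, fresh);
    }
  }
}

// Sets bits; the result is always kept.
bool set_bits(PAYLOAD &lhs, PAYLOAD const &rhs) {
  lhs |= rhs;
  return true;
}

// Clears bits; an empty result is reported as invalid.
bool clear_bits(PAYLOAD &lhs, PAYLOAD const &rhs) {
  lhs &= ~rhs;
  return lhs.any();
}
```

With clear_bits, blending over a span that includes unmapped addresses creates nothing there: the default constructed payload is empty, clearing bits leaves it empty, and the false return keeps those addresses unmapped.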

Examples

Blending Bitsets

As an example of blending, consider a mapping of IP addresses to a bit set, each bit representing some independent property of the address (e.g., production, externally accessible, secure, etc.). It might be the case that each of these was in a separate data source. In that case one approach would be to blend each data source into the IPSpace, combining the bits in the blending functor. If std::bitset is used to hold the bits, the declarations could be done as

  // Color each address with a set of bits.
  using PAYLOAD = std::bitset<32>;
  // Declare the IPSpace.
  using Space = swoc::IPSpace<PAYLOAD>;

To do the blending, a blending functor is needed.

  auto blender = [](PAYLOAD &lhs, PAYLOAD const &rhs) -> bool {
    lhs |= rhs;
    return true;
  };

This always returns true because blending in a non-empty set of bits never yields a zero result. A lambda is provided for convenience to do the marking of the example data. It takes a list of example data items, each defined as a range and a set of bits to set.

  using Data = std::tuple<TextView, PAYLOAD>;

The marking logic is

  auto marker = [&](Space &space, swoc::MemSpan<Data> ranges) -> void {
    // Blend each example range and its bit set into the space.
    for (auto &&[text, bits] : ranges) {
      space.blend(IPRange{text}, bits, blender);
    }
  };

Let’s try it out. For the first pass this data will be used.

    {{"100.0.0.0-100.0.0.255", make_bits({0})},
     {"100.0.1.0-100.0.1.255", make_bits({1})},
     {"100.0.2.0-100.0.2.255", make_bits({2})},
     {"100.0.3.0-100.0.3.255", make_bits({3})},
     {"100.0.4.0-100.0.4.255", make_bits({4})},
     {"100.0.5.0-100.0.5.255", make_bits({5})},
     {"100.0.6.0-100.0.6.255", make_bits({6})}}
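The data uses a make_bits helper whose definition does not appear in this text. A plausible definition, stated here as an assumption, sets the listed bit indices in a PAYLOAD. The listings that follow show bit 0 as the leftmost character, which is the reverse of std::bitset::to_string, so the sketch also includes a hypothetical bits_text helper that prints in that order.

```cpp
#include <bitset>
#include <cassert>
#include <initializer_list>
#include <string>

using PAYLOAD = std::bitset<32>;

// Assumed definition of the helper used in the example data: a bit set
// with exactly the listed indices set. Indices must be less than 32.
PAYLOAD make_bits(std::initializer_list<unsigned> indices) {
  PAYLOAD bits;
  for (auto idx : indices) bits.set(idx);
  return bits;
}

// Hypothetical display helper: text with bit 0 first, matching the order
// used in the listings. to_string() puts the highest bit first.
std::string bits_text(PAYLOAD const &bits) {
  std::string text = bits.to_string();
  return std::string(text.rbegin(), text.rend());
}
```

With these definitions, make_bits({0}) prints as a 1 followed by 31 zeros.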

After using this, the space contents are

7 ranges
100.0.0.0-100.0.0.255     : 10000000000000000000000000000000
100.0.1.0-100.0.1.255     : 01000000000000000000000000000000
100.0.2.0-100.0.2.255     : 00100000000000000000000000000000
100.0.3.0-100.0.3.255     : 00010000000000000000000000000000
100.0.4.0-100.0.4.255     : 00001000000000000000000000000000
100.0.5.0-100.0.5.255     : 00000100000000000000000000000000
100.0.6.0-100.0.6.255     : 00000010000000000000000000000000

Those are non-overlapping intervals and therefore are not really blended. Suppose the following ranges are also blended. Note that the first two coincide with the first two of the previous ranges, and the third straddles the “.2” and “.3” ranges.

    {{"100.0.0.0-100.0.0.255", make_bits({31})},
     {"100.0.1.0-100.0.1.255", make_bits({30})},
     {"100.0.2.128-100.0.3.127", make_bits({29})}}

This yields

9 ranges
100.0.0.0-100.0.0.255     : 10000000000000000000000000000001
100.0.1.0-100.0.1.255     : 01000000000000000000000000000010
100.0.2.0-100.0.2.127     : 00100000000000000000000000000000
100.0.2.128-100.0.2.255   : 00100000000000000000000000000100
100.0.3.0-100.0.3.127     : 00010000000000000000000000000100
100.0.3.128-100.0.3.255   : 00010000000000000000000000000000
100.0.4.0-100.0.4.255     : 00001000000000000000000000000000
100.0.5.0-100.0.5.255     : 00000100000000000000000000000000
100.0.6.0-100.0.6.255     : 00000010000000000000000000000000

The additional bits are now present on the other side of the bit set. Note there are now more ranges because the last range overlapped two of the previously existing ranges. Those are split because of the now differing payloads for the new ranges.

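What happens if this range and data is blended into the space? (The data itself is not shown in this text; the result below is consistent with blending bits 2, 3, and 29 into the range 100.0.2.0-100.0.4.255.)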


The result is

6 ranges
100.0.0.0-100.0.0.255     : 10000000000000000000000000000001
100.0.1.0-100.0.1.255     : 01000000000000000000000000000010
100.0.2.0-100.0.3.255     : 00110000000000000000000000000100
100.0.4.0-100.0.4.255     : 00111000000000000000000000000100
100.0.5.0-100.0.5.255     : 00000100000000000000000000000000
100.0.6.0-100.0.6.255     : 00000010000000000000000000000000

Note the “.2” and “.3” ranges have collapsed into a single range with the bits 2,3,29 set. The “.4” range remains separate because it also has bit 4 set, which is distinct.

Blending allows selective erasing. Let’s erase bits 2,3,29 in all ranges. First a blending functor is needed to erase bits instead of setting them.

  auto resetter = [](PAYLOAD &lhs, PAYLOAD const &rhs) -> bool {
    auto mask  = rhs;
    lhs       &= mask.flip();
    return lhs != 0;
  };

Note this returns false if the result of clearing the bits is an empty bitset. Just to be thorough, let’s clear those bits for all IPv4 addresses.

  space.blend(IPRange{"0.0.0.0-255.255.255.255"}, make_bits({2, 3, 29}), resetter);

The result is

5 ranges
100.0.0.0-100.0.0.255     : 10000000000000000000000000000001
100.0.1.0-100.0.1.255     : 01000000000000000000000000000010
100.0.4.0-100.0.4.255     : 00001000000000000000000000000000
100.0.5.0-100.0.5.255     : 00000100000000000000000000000000
100.0.6.0-100.0.6.255     : 00000010000000000000000000000000

The “.2” and “.3” ranges have disappeared, as this bit clearing cleared all the bits in those ranges. The “.4” range remains but is back to its original state, the extra bits having been cleared. The other ranges are unchanged because this operation did not change their payloads. No new ranges have been added because the result of unsetting bits where no bits are set is also the empty bit set. This means the blender returns false, which prevents the range from being created.

As a final note, although the data used here was network based, that is in no way required. This line of code being executed

  space.blend(IPRange{"100.0.2.19-100.0.5.117"}, make_bits({16, 18, 20}), blender);

yields the result

7 ranges
100.0.0.0-100.0.0.255     : 10000000000000000000000000000001
100.0.1.0-100.0.1.255     : 01000000000000000000000000000010
100.0.2.19-100.0.3.255    : 00000000000000001010100000000000
100.0.4.0-100.0.4.255     : 00001000000000001010100000000000
100.0.5.0-100.0.5.117     : 00000100000000001010100000000000
100.0.5.118-100.0.5.255   : 00000100000000000000000000000000
100.0.6.0-100.0.6.255     : 00000010000000000000000000000000

Although the examples up to now have used PAYLOAD as the argument type to the blend method, this is not required in general. The type of the second argument to blend is determined by the second argument to the functor, so that data other than strictly PAYLOAD can be blended into the space. For instance the blending functor might directly take a list of bit indices:

  auto bit_blender = [](PAYLOAD &lhs, std::initializer_list<unsigned> const &rhs) -> bool {
    for (auto idx : rhs)
      lhs[idx] = true;
    return true;
  };

In this case the call to blend must also take a list of bit indices, not a PAYLOAD (e.g. a std::bitset<32>).

  std::initializer_list<unsigned> bit_list = {10, 11};
  space.blend(IPRange{"0.0.0.1-255.255.255.254"}, bit_list, bit_blender);

That call to blend will blend bits 10 and 11 into all IPv4 addresses except the first and last, yielding

10 ranges
0.0.0.1-99.255.255.255    : 00000000001100000000000000000000
100.0.0.0-100.0.0.255     : 10000000001100000000000000000001
100.0.1.0-100.0.1.255     : 01000000001100000000000000000010
100.0.2.0-100.0.2.18      : 00000000001100000000000000000000
100.0.2.19-100.0.3.255    : 00000000001100001010100000000000
100.0.4.0-100.0.4.255     : 00001000001100001010100000000000
100.0.5.0-100.0.5.117     : 00000100001100001010100000000000
100.0.5.118-100.0.5.255   : 00000100001100000000000000000000
100.0.6.0-100.0.6.255     : 00000010001100000000000000000000
100.0.7.0-255.255.255.254 : 00000000001100000000000000000000

History

This is based (loosely) on the IpMap class in Apache Traffic Server, which in turn is based on IP address classes developed by Network Geographics for the Infosecter product. The code in Apache Traffic Server was a much simplified version of the original work. This is a reversion to that richer and more complete set of classes and draws much of its structure directly from the Network Geographics work.

I want to thank Uthira Mohan for being my initial tester and feature requester. In particular, the design of blending is a result of her feature demands.