Tuesday, April 02, 2013

Open Source and Closed Protocols

There isn't a better way to learn a protocol for me than reading the RFC and watching Wireshark captures and decodes. However, it becomes difficult when we're talking about a closed proprietary protocol.

Recently, I've been creating new protocol modules for the Net::Frame Perl module suite. Hot Standby Router Protocol (HSRP) is a Cisco proprietary protocol but is detailed in RFC 2281. However, that's only version 1. Version 2 isn't in an open RFC. Another example is Cisco Discovery Protocol (CDP) which has no RFC for version 1 or version 2. Luckily, there is plenty of information on the Internet.

I couldn't get all the information for all the CDP message types, but I'm about 90% there. A more interesting note was in the CDP header itself. There is a checksum that's described as the "normal Internet checksum". However, when I implemented that in a Net::Frame::Layer::CDP module, it didn't work. What was the problem?

A Google search lead me to the source code for Wireshark and a file called packet-cdp.c. A comment beginning on line 230 explains:

"CDP doesn't adhere to RFC 1071 section 2. (B). It incorrectly assumes checksums are calculated on a big endian platform, therefore i.s.o. padding odd sized data with a zero byte _at the end_ it sets the last big endian _word_ to contain the last network _octet_. This byteswap has to be done on the last octet of network data before feeding it to the Internet checksum routine. CDP checksumming code has a bug in the addition of this last _word_ as a signed number into the long word intermediate checksum. When reducing this long to word size checksum an off-by-one error can be made. This off-by-one error is compensated for in the last _word_ of the network data."

That meant I needed some "massaging" of my payload before sending to my Internet checksum routine. A quick proof of concept proved correct and then some solicitation for simplification produced this:

   [...]

   if (length( $payload )%2) {
      if (substr($payload, -1) ge "\x80") {
         substr $payload, -1, 1, chr(ord(substr $payload, -1) - 1);
         substr $payload, -1, 0, "\xff";
      } else {
         substr $payload, -1, 0, "\0";
      }
   }

   [...]

   $self->checksum(inetChecksum($phpkt));

I also found a very long thread in a mailing list archive that describes how some smart guys found this problem and came up with the Wireshark solution. Thanks for open source - even on closed protocols!

Tuesday, March 05, 2013

Thinking Outside the "Package" for Packets

I responded to a query on a community forum for Perl about creating a DNS update with a spoofed source address. I was ASSURED it was for a contest and the code was what was essential. The contest site recommended scapy, but the poster was looking for a Perl solution. I had the easy answer: Windows Packet Crafter.

The poster went on to detail some code that created the DNS update but when he sent the packet, he couldn't change the source IP address. Obviously. He used Net::DNS as follows:

use strict;
use warnings;
use Net::DNS;

my $update = Net::DNS::Update->new('evil.zz');
$update->push(prerequisite => nxrrset('hacker11.evil.zz. A'));
$update->push(update => rr_add('hacker11.evil.zz. 86400 A 127.0.0.1'));

my $res = Net::DNS::Resolver->new;
$res->nameservers('192.168.200.113');

my $reply = $res->send($update);

The last line sends the update by taking the created DNS data and letting standard socket routines create the Layer 3 header. Windows Packet Crafter (WPC) can create the custom Layer 3 header and with Net::Frame::Layer::DNS (which I wrote), one can easily create the required packet. However, with most of the work done, is it necessary to recode the original script using Net::Frame::Layer::DNS? Turns out ... no.

Because of the layered nature of the Net::Frame suite of modules on which WPC is based, one can easily create any or all layers of a frame with the objects or simply by hand crafting an octet stream. Or ... even using another Perl module that can output the required stream.

For Net::DNS, there is an undocumented sub called make_query_packet() in the Net::DNS::Resolver::Base code that creates a Net::DNS::Packet object on which the data() method can be called to create the necessary octet stream. It may sound complicated, but all it means is replace the last line of code above with:

$dnsdata = $res->make_query_packet($update);

Now, in WPC, you can use the $dnsdata->data call to create the DNS payload in a UDP packet. It looks like the following:

VinsWorldcom@C:\tmp\> wpc.pl -i "Wireless Network Connection"
Welcome to Windows Packet Crafter (WPC)
Copyright (C) Michael Vincent 2012

Wireless Network Connection

wpc> use Net::DNS;
wpc> $update = Net::DNS::Update->new('evil.zz');
wpc> $update->push(prerequisite => nxrrset('hacker11.evil.zz. A'));
wpc> $update->push(update => rr_add('hacker11.evil.zz. 86400 A 127.0.0.1'));
wpc> $res = Net::DNS::Resolver->new;
wpc> $res->nameservers('192.168.200.113');
wpc> $dnsdata = $res->make_query_packet($update);

We've used WPC to enter the original code with the modified last line - instead of send, use the make_query_packet() routine to create the Net::DNS::Packet object. Continuing, we create the packet in WPC:

wpc> $ether = ETHER;
wpc> $ipv4 = IPv4(src=>'1.1.1.1',dst=>'192.168.200.113',protocol=>NF_I+Pv4_PROTOCOL_UDP);
wpc> $udp = UDP(dst=>53,payload=>$dnsdata->data);
wpc> $packet = packet $ether,$ipv4,$udp;

And to be sure before we send it, we can use Net::Frame::Layer::DNS for a nice decode:

wpc> use Net::Frame::Layer::DNS qw(:consts);
wpc> decode $packet;
ETH: dst:55:66:88:78:aa:30  src:c0:c1:c2:08:46:56  type:0x0800
IPv4: version:4  hlen:5  tos:0x00  length:90  id:23417
IPv4: flags:0x00  offset:0  ttl:128  protocol:0x11  checksum:0x53fe
IPv4: src:1.1.1.1  dst:192.168.200.113
UDP: src:50281  dst:53  length:70  checksum:0x5661
DNS: id:5329  qr:0  opcode:5  flags:0x00  rcode:0
DNS: qdCount:1  anCount:1
DNS: nsCount:1  arCount:0
DNS::Question: name:evil.zz
DNS::Question: type:6  class:1
DNS::RR: name:hacker11.[@12(evil.zz)]
DNS::RR: type:1  class:254  ttl:0  rdlength:0
DNS::RR: name:[@25(hacker11.[@12(evil.zz)])]
DNS::RR: type:1  class:1  ttl:86400  rdlength:4
DNS::RR::A: address:127.0.0.1

Job done!

Note this can also be done with other Perl modules like Net::DHCP::Packet with the new() and serialize() calls, for example.

Friday, March 01, 2013

Blog about NOT Logging

A question came across a mailing list I subscribe to about limiting the syslog messages sent from a Cisco router to a syslog server. The question arose since a certain Cisco blade switch has a known bug where it reports the redundant power supply is faulty even though it doesn't have one. The message - sent every 5 minutes - was becoming quite bother to the operations folks since there were 80 such devices all reporting the erroneous error.

The asker had already found the 'logging discriminator ...' command, but couldn't apply it. A quick test in Dynamips and I had the answer for him.

The 'discriminator' option as we applied it looked for a regular expression in the syslog message body and was configured to "drop" the message (not send it to the syslog server). It worked with the following configuration:

logging discriminator NOREPORT msg-body drops "Redundant power supply faulty or in standby mode"
logging host 1.1.1.1 discriminator NOREPORT

Satisfied we had a working fix, it was time for some more investigation.

I've selectively enabled SNMP traps with the 'snmp-server enable traps XXX' commands, but I didn't know it was possible with syslog messages - I never really tried to be honest. In fact, all logging is enabled with a simple command:

logging 192.168.100.254

There are options for which facility or severity to send, but not many options for creative tuning - they're all of certain class or nothing. The 'discriminator' option seemed pretty useful. However ...

The 'discriminator NAME' doesn't work like an access-list where you can add multiple lines. You get one (1) discriminator and you get one (1) time to apply it to the syslog host. So how long can the regular expression be? Not very - as soon as I started to get fancy with the regular expression to block multiple messages, I got errors:

R1(config)#$msg-body drops "((Configured from)|(Interface         ))"
R1(config)#$msg-body drops "((Configured from)|(Interface          ))"
% unmatched ()

With the grouping parenthesis and the logical or vertical bar (pipe), I could only get a maximum of 38 characters. When I tried 39, I started getting the "unmatched" error and looking at the 'show run', my configuration line was truncated at 38 characters:

R1(config)#do sh run | i logg
logging discriminator NOREPORT msg-body drops ((Configured from)|(Interface          )

Notice the last parenthesis is left off (should be two of them). This severely limits the creativity when trying to selectively block syslog messages. There are other alternatives, like 'mnemonic' which will block an entire category of syslog messages by regular expression. So less characters to fit within the 38, but entire classes of messages dropped.

Maybe there's a better way?

Thursday, January 24, 2013

Winsock or Winsuck

In a previous post I talked about updating Netcat to support both IPv4 and IPv6 in a single Windows executable. I added a lot of new features including multicast listener with source specific multicast in IPv4 only.

I added source specific multicast in IPv4 only because the Winsock API did not provide the required structures for IPv6 source specific multicast in a "standard" way. Berkeley sockets did it as you'd expect by adding IPv6 complements for the existing IPv4 functions and structures.

IPv4IPv6
ip_mreq_sourceipv6_mreq_source

Additionally, the address family independent structures and options are also available.

I recently looked at ssmping and found they did source specific multicast for IPv6 on Windows so I decided to look at the source for guidance and revisit my Netcat.

I still irks me that instead of:

    struct ip_mreq_source mreq;
    mreq.imr_multiaddr = *mgroup;

    ...

    mreq.imr_sourceaddr = *rad;

I have to do:

    #define MCAST_JOIN_SOURCE_GROUP         45

    ...

    GROUP_SOURCE_REQ mreq;
    struct sockaddr_in6 g6;
    g6.sin6_addr = *mgroup6;
    memcpy(&mreq.gsr_group, &g6, sizeof(struct sockaddr_in6));

    ...

    g6.sin6_addr = *rad;
    memcpy(&mreq.gsr_source, &g6, sizeof(struct sockaddr_in6));

Moving data back and forth with pointers and addresses, changing the storage structures and those memory copies - argh.

This is why I like Perl. I'm not a programmer by trade. This memory management is hidden from me in Perl and furthermore, Perl code will work across Windows and Linux. Instead in C, I have a bunch of '#ifdef' compiler directives to determine if I'm compiling on Win32 and if I have the proper version. The IPv6 source specific multicast routines aren't available and the address family independent ones are only available on Windows Vista or later (or maybe Windows Server 2008 - the documentation is pretty confusing). Not only do I need separate code for Windows or Linux within those compiler directive branches, I also need legacy and new Windows code if the Windows version is less than Vista.

Ultimately, it works. So I am happy. And since I'm not a "real" programmer and I only do this occasionally, I deal with it - and complain on my blog! Don't even get me started on support for QoS in 'setsockopt()'. How annoying is this for people who program for a living and need to support multiple platforms?

Wednesday, January 23, 2013

Testing DHCPv6: Part 2

In a previous post, I talked about getting a DHCPv6 client on a linux image to use with Qemu and GNS3. That post was mainly focused on documenting the steps to get the DHCPv6 client on the host and ultimately working. I neglected to talk about the DHCPv6 server that I configured on the Cisco router.

Of course, I was just bit by a minor configuration miss that I learned in that exercise and quickly forgot until gently reminded as I watched my DHCPv6 simulation NOT work.

The DHCPv6 server configuration on the router is pretty simple:

ipv6 dhcp pool DHCPv6
 address prefix 2001:DB8:A64:6800::/64
 dns-server 2001:4860:4860::8888
 domain-name dynamips.com

But what I forgot was that SLAAC is mandatory for IPv6 nodes and will work as long as router advertisements are present. So simply setting the M-bit in the router advertisements to force DHCPv6 is not enough. You need to NOT advertise the IPv6 prefix on the interface.

So the interface configuration looks like:

interface FastEthernet2/0
 ipv6 address 2001:DB8:A64:6800::1/64
 ipv6 enable
 ipv6 nd prefix 2001:DB8:A64:6800::/64 no-advertise
 ipv6 nd managed-config-flag
 no ipv6 redirects
 no ipv6 unreachables
 ipv6 dhcp relay destination 2001:DB8:A01:100::1

In the above example, I'm sending the DHCPv6 requests to the DHCPv6 server running on another Cisco router. I've set the M-bit and I've also stopped router advertisements of the prefix with the 'ipv6 nd prefix 2001:DB8:A64:6800::/64 no-advertise' command.

And now it works!

Thursday, December 20, 2012

Multicast IPv6 Anycast RP Design

Like most folks, I heard of multicast but never configured it. That changed back in 2006 when we deployed music on hold over multicast for our Cisco voice infrastructure and also installed VBrick IP television which required multicast. I did the research and created a PIM sparse mode design based on the Cisco Solution Reference Network Design for campus networks.

We used anycast rendezvous points (RP) peered with Multicast Source Discovery Protocol (MSDP). We used a site local multicast scope and properly filtered on the edge. We peered with the multicast RPs in our headquarters site and passed organization local scope groups. Overall, we had a pretty robust, fault tolerant multicast design.

I did more multicast work at my next client through 2011 including dense-mode and simplified multicast forwarding (SMF), custom code for Internet Group Management Protocol (IGMP) joins for multicast on mobile ad-hoc networks (MANET) and lots of other unique designs.

Of course, this was all on IPv4.

Multicast for IPv6 is a different beast and although I haven't seen a requirement to deploy it for a client yet, I wanted to do some testing to get up to speed before the need manifests. I was fortunate to attend the Cisco IPv6 Fundamentals, Design and Deployment class last week. Although much was review for me given my hands-on IPv6 experience over the last 4 years, the multicast module and lab was very informative.

I understood my IPv4 multicast reference design would not map to IPv6. While IPv6 supports anycast RP, there is no MSDP for IPv6. Some (very basic) background:

In a Protocol Independent Multicast Sparse-Mode (PIM-SM) design, multicast sources send their multicast streams to the Rendezvous Point (RP). Multicast listeners are directed to the RP to make the connection and start receiving traffic. To make the design redundant and fault tolerant, we use anycast RP. Anycast is the practice of configuring the same unicast address on multiple devices. Traffic is routed towards the closest instance of the address based on routing protocol metrics. This of course can result in multicast sources sending traffic to one anycast RP instance and multicast listeners being routed to a different anycast RP instance. No connection would be made in this case.

To overcome this issue, we use Multicast Source Discovery Protocol (MSDP) to peer the anycast RPs so they share information about the multicast sources. This way, no matter which instance of the anycast RP a listener connects to, it will be able to establish the stream from the multicast source and start receiving traffic.

With the above explanation, the absence of MSDP in IPv6 is a problem with an anycast RP design. Foregoing anycast RP and just using a single RP does not provide fault tolerance should the RP go down. What to do?

RFC 4610 addresses this issue and Cisco implements it in IOS 15 and other flavors like XE. However, I learned in class that Cisco had another approach they called "Anycast RP with prefix arbitration".

The concept is simple using standard routing rules of longest match prefix. Simply configure the anycast address on multiple devices with different masks. Instead of equal cost multipath routing in redundant environments, all traffic will be routed to the anycast instance with the longest prefix (anycast RP primary). Should that RP go down, all traffic will be routed to the anycast instance with the second longest prefix (anycast RP secondary), and so on (anycast RP tertiary ...). This is very similar to "floating static routes"; static routes with a manually configured admin distance to bring up a BRI interface when the primary frame-relay goes down (remember 1990's).

  • Configure primary anycast RP with longest prefix
  • Configure secondary anycast RP with second longest prefix
  • Configure tertiary anycast RP with third longest prefix
  • And so on ...
  • Advertise the anycast network from each device via routing protocol

This eliminates the need for MSDP to peer the anycast RPs since all traffic - both sources and listeners - will be routed to the same anycast RP instance. Or will it?

Consider an instance where multicast sources and / or listeners are directly connected to the devices which host the primary and secondary anycast RPs. Connected routes override any longest prefix match since they are connected. So there is a possibility where listeners won't be able to find sources. And that's what happened to me when I finished the documented multicast lab and decided to test this design.

The lab was only two routers with a client connected off each, so it's pretty obvious why it failed. I decided to test a more real-world scenario. Consider the following:

The relevant configurations:

C1#show run interface Loopback0
interface Loopback0
 description Anycast RP (Primary)
 no ip address
 ipv6 address 2001:DB8:AFE:FE00::1/120
 ipv6 enable
 ipv6 eigrp 1
end

C2#show run interface Loopback0
interface Loopback0
 description Anycast RP (Secondary)
 no ip address
 ipv6 address 2001:DB8:AFE:FE00::1/119
 ipv6 enable
 ipv6 eigrp 1
end

With anycast RP primary configured on C1 (/120 mask) and the anycast RP secondary configured on C2 (/119 mask), we expect all traffic to be routed to C1. Indeed it is from both D1 and D2:

D1#show ipv6 route 2001:db8:afe:fe00::1
Routing entry for 2001:DB8:AFE:FE00::/120
  Known via "eigrp 1", distance 90, metric 156160, type internal
  Route count is 1/1, share count 0
  Routing paths:
    FE80::C804:11FF:FE44:1C, FastEthernet1/0
      Last updated 00:00:12 ago

D2#show ipv6 route 2001:DB8:AFE:FE00::1
Routing entry for 2001:DB8:AFE:FE00::/120
  Known via "eigrp 1", distance 90, metric 156160, type internal
  Route count is 1/1, share count 0
  Routing paths:
    FE80::C804:11FF:FE44:1D, FastEthernet1/0
      Last updated 00:03:05 ago

"FastEthernet1/0" is the telltale that D1 and D2 will send their traffic for the anycast RP to C1 (see diagram above). When we shutdown the Loopback0 interface (anycast RP primary) on C1, traffic fails over:

D1#show ipv6 route 2001:db8:afe:fe00::1
Routing entry for 2001:DB8:AFE:FE00::/119
  Known via "eigrp 1", distance 90, metric 156160, type internal
  Route count is 1/1, share count 0
  Routing paths:
    FE80::C805:11FF:FE44:1C, FastEthernet1/1
      Last updated 00:03:25 ago

D2#show ipv6 route 2001:DB8:AFE:FE00::1
Routing entry for 2001:DB8:AFE:FE00::/119
  Known via "eigrp 1", distance 90, metric 156160, type internal
  Route count is 1/1, share count 0
  Routing paths:
    FE80::C805:11FF:FE44:1D, FastEthernet1/1
      Last updated 00:03:27 ago

So the solution works! But what routes do C1 and C2 see? When operating in steady state (each anycast RP interface on C1 and C2 is up), all routing doesn't point to C1:

C1#show ipv6 route 2001:DB8:AFE:FE00::1
Routing entry for 2001:DB8:AFE:FE00::1/128
  Known via "connected", distance 0, metric 0, type receive
  Route count is 1/1, share count 0
  Routing paths:
    receive via Loopback0
      Last updated 00:00:28 ago

C2#show ipv6 route 2001:DB8:AFE:FE00::1
Routing entry for 2001:DB8:AFE:FE00::1/128
  Known via "connected", distance 0, metric 0, type receive
  Route count is 1/1, share count 0
  Routing paths:
    receive via Loopback0
      Last updated 00:19:13 ago

Notice C1 and C2 each see the anycast RP address as their local Loopback0 interface. In the case of C1, that's correct. In the case of C2, it's correct (according to routing rules), but not desired (according to our anycast RP with prefix arbitration design). There isn't a way to "fix" it as it isn't broken, it just highlights a design constraint on the "Anycast RP with prefix arbitration" design:

  • Never directly connect multicast sources or listeners to a device acting as an anycast RP

This may not be a problem in the design I tested as most people won't connect end stations to the core layer. But consider collapsed core designs or instances where the RPs may be configured on distribution layer switches that connect servers which are multicast sources.

"Anycast RP with prefix arbitration" is a pretty easy, straight forward design, but like anything new, test first and understand the limitations.

Tuesday, December 18, 2012

IPv6 TFTP from IPv4

No, the title doesn't imply a great new translation technology for that indispensable file transfer protocol TFTP. Instead, this is to highlight an "oversight" - I won't go so far as to call it a "bug" - in Cisco IOS.

I'm testing with version Advanced IP Services 12.4 (24)T on a 7200 series router - just in case that matters.

For many services on Cisco routers and switches, I've been using the "source interface" command to explicitly tell the device what address to source the updates from. Normally, I point it to a loopback interface. This makes looking at logs pretty easy when DNS resolves the loopback address to the device name.

So for example:

ip flow-export source Loopback0
logging source-interface Loopback0
snmp-server trap-source Loopback0
snmp-server source-interface informs Loopback0

In most cases, we'll even use "update-source LoopbackX" for iBGP neighbors.

This makes looking at a Syslog and SNMP Trap aggregator easy. As long as they resolve addresses to names, I see content like:

Router1  Informational  Local7  Interface FastEthernet0/0 up
SwitchA  Emergency      Local6  Power supply 1 down  

Instead of:

10.254.254.1  Informational  Local7  Interface FastEthernet0/0 up
10.254.254.2  Emergency      Local6  Power supply 1 down  

Now that we've shown why this is good practice, I'll also add that we track nightly TFTP backups of configurations in TFTP logs and the same principle applies. So we use the 'ip tftp source-interface Loopback0' command. Notice however that all previous commands don't start with 'ip', the TFTP 'source-interface' command does. Big deal? With IPv6 it turns out ... YES, it is.

Granted the backup routine tested connected to the devices via IPv4 and requested a TFTP backup via SNMP to the IPv4 address of the TFTP server - so we didn't lose a night's worth of backups and wake up to an error log. The benefits of testing first! However, with IPv6 enabled and an IPv6 address on the Loopback0 interface, IPv6 TFTP should work. And in the test, it didn't.

Here's the relevant configuration:

ip tftp source-interface Loopback0

interface Loopback0
 ip address 10.254.254.1 255.255.255.255
 ipv6 address 2001:DB8:AFE:FE00::1/128
 ipv6 enable

interface FastEthernet2/0
 description To TFTP Server
 ip address 192.168.100.1 255.255.255.0
 ipv6 address 2001:DB8:192:168::1/64
 ipv6 enable

The TFTP server in the test lives at:

192.168.100.254
2001:db8:192:168::254

Again, IPv4 TFTP worked as expected. The Loopback0 address (10.254.254.1) shows in the TFTP logs. But with IPv6, something strange happened:

R1#copy run tftp
Address or name of remote host []? 2001:db8:192:168::254
Destination filename [r1-confg]?
.....
%Error opening tftp://2001:db8:192:168::254/r1-confg (Timed out)
R1#

And the resultant TFTP server log shows:

TFTP# crapps.pl -S tftpd -6
Starting MODE       -> TFTP Server
Listening on        -> [::]:69 (udp)
TFTP Root directory -> .

afe:fe01::  62506  WRQ  OCTET  r1-confg  STARTED
afe:fe01::  62506  WRQ  OCTET  r1-confg  STARTED
afe:fe01::  62506  WRQ  OCTET  r1-confg  File './r1-confg' already exists
afe:fe01::  62506  WRQ  OCTET  r1-confg  Timeout occurred on DATA packet 1
...

Who the heck is "afe:fe01::"? My Loopback0 IPv6 address is "2001:db8:afe:fe00::1". True, but my Loopback0 IPv4 address is "10.254.254.1", or in hex used directly as an IPv6 address is "afe:fe01::". I remember a saying about computers doing exactly what you tell them to. The Cisco router is sourcing the TFTP from the 'ip tftp source-interface Loopback0' - 'ip' as in "IPv4".

So is IPv6 TFTP broken? No, you just need to remove the 'source-interface' command:

R1#config term
Enter configuration commands, one per line.  End with CNTL/Z.
R1(config)#no ip tftp source-interface Loopback0
R1(config)#end
R1#copy run tftp
Address or name of remote host []? 2001:db8:192:168::254
Destination filename [r1-confg]?
!!
8774 bytes copied in 5.540 secs (1584 bytes/sec)
R1#

And confirmed on the TFTP server:

TFTP# crapps.pl -S tftpd -6
Starting MODE       -> TFTP Server
Listening on        -> [::]:69 (udp)
TFTP Root directory -> .

2001:db8:192:168::1  52000  WRQ  OCTET  r1-confg  STARTED
2001:db8:192:168::1  52000  WRQ  OCTET  r1-confg  SUCCESS [8774 bytes]

Much better. Of course now the source is the interface of the router that the TFTP traverses, in this case, FastEthernet2/0. This will also be the same for IPv4 TFTP now.

Nightly TFTP backups are one of those automated tasks we set and forget. Sure there are monitors in place to catch changes and email alerts, but how often does something go wrong? Imagine waking up to an error log an no backups. Not then end of the world, but certainly not something you want to see before your first cup of coffee. Test and test again, especially when incorporating IPv6.

 

Copyright © VinsWorld. All Rights Reserved.