Company A


A Lighthearted Overview of a Collection of Technical Implementations Covering Networking, Security, Wireless, Virtualization and Storage.



You Stay Classy San Diego

Time for me to squash a pet peeve of mine:

Referring to any network segment imaginable that has a mask of 255.255.255.0 as a “Class C”.

I was out at Company A this week and we were laying out the subnet design for the new network infrastructure. Company A uses the 10/8 Class A RFC 1918 IP space throughout their internal network. We were planning on a /20 mask for each of the large IDFs, which would allow for 16 /24 networks that we could use for the various VLANs in the closet (data, VoIP, guest, security, management, etc.). As we talked about this, there were several references to these individual “Class C” networks.
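
For anyone who wants to check the /20 math, here is a minimal Python sketch using the standard ipaddress module. The 10.20.16.0/20 block is a made-up example for illustration, not an actual Company A allocation, and the VLAN names are simply the ones listed above.

```python
import ipaddress

# Hypothetical /20 handed to one IDF (not a real Company A allocation).
idf_block = ipaddress.ip_network("10.20.16.0/20")

# Carve the /20 into /24s, one per VLAN in the closet.
vlan_subnets = list(idf_block.subnets(new_prefix=24))
print(len(vlan_subnets), "x /24 networks in", idf_block)

# Pair the first few /24s with the VLAN names mentioned above.
vlan_names = ["data", "voip", "guest", "security", "management"]
for name, subnet in zip(vlan_names, vlan_subnets):
    print(f"{name:<12}{subnet}")
```

Every one of those /24s still starts with 10, which matters for the rest of this post.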

I know what they meant. Many people use this crazy term to describe networks with masks of 255.255.255.0. The problem is that in this case, and many others, they aren’t Class C networks.

RFC 791 defines the method for determining network class as the following:

"There are three formats or classes of internet addresses: in class a, the high order bit is zero, the next 7 bits are the network, and the last 24 bits are the local address; in class b, the high order two bits are one-zero, the next 14 bits are the network and the last 16 bits are the local address; in class c, the high order three bits are one-one-zero, the next 21 bits are the network and the last 8 bits are the local address."

If we look at how a Class A network is defined, the high order bit must be a 0. The image below identifies the high order bit in the IP address 10.20.30.0.

Class A

image

This would indicate that if the first octet is in the range of 0 - 127 (if you go to 128 the high order bit will flip to a 1) we are dealing with a Class A network. Notice how the subnet mask is not part of the equation. Prior to RFC 1518 and RFC 1519 the high order bit and the next 7 bits defined the network, and the remaining bits defined the hosts (or local address). This is where the default mask of 255.0.0.0 for a Class A network is derived.

Class B networks are determined by the high order two bits being 1 and 0, in that order. This covers decimal 128 - 191 as being Class B networks as we can see below. The two high order bits and the next 14 bits define the network, therefore 255.255.0.0 is the default mask.

Class B

image

As you can guess, the Class C network is determined by interrogating the high order three bits and they need to be 1, 1, 0. The decimal range is 192 - 223. There are two more classes but for the purposes of my post I’m not getting into them. You can head here to learn more. Given the first three bits and the next 21 bits are used for the network, the default mask for a Class C network is 255.255.255.0.
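
If you would rather see the decimal ranges and default masks fall out of the bit patterns than take my word for it, here is a small Python sketch. The leading bits and network bit counts come straight from the RFC 791 text quoted earlier; the function and variable names are my own.

```python
# Leading bit pattern and total network bits for each class, per RFC 791.
CLASSES = {
    "A": ("0", 8),     # 1 class bit + 7 network bits
    "B": ("10", 16),   # 2 class bits + 14 network bits
    "C": ("110", 24),  # 3 class bits + 21 network bits
}

def first_octet_range(prefix_bits):
    """Lowest and highest first octet that begin with prefix_bits."""
    free_bits = 8 - len(prefix_bits)
    low = int(prefix_bits + "0" * free_bits, 2)
    high = int(prefix_bits + "1" * free_bits, 2)
    return low, high

def default_mask(network_bits):
    """Dotted-decimal mask with network_bits leading ones."""
    mask = (0xFFFFFFFF << (32 - network_bits)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))

for name, (prefix, network_bits) in CLASSES.items():
    low, high = first_octet_range(prefix)
    print(f"Class {name}: first octet {low} - {high}, default mask {default_mask(network_bits)}")
```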

Class C

image

My point here is that the subnet mask is not relevant when you are determining the class of an IP address or network. Only the first few bits of the first octet of the IP address are used to make that determination. If you were going to use classful subnet masks (who does this?) then you would apply the mask after you derived the class.
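
To drive the point home, here is a minimal Python sketch of that determination. Note that the function (the name is mine) never takes a mask as input, only the address. The sample addresses are arbitrary.

```python
def address_class(ip):
    """Return the class of a dotted-decimal IPv4 address from its leading bits."""
    first_octet = int(ip.split(".")[0])
    if (first_octet & 0b10000000) == 0b00000000:   # leading bit 0
        return "A"
    if (first_octet & 0b11000000) == 0b10000000:   # leading bits 1, 0
        return "B"
    if (first_octet & 0b11100000) == 0b11000000:   # leading bits 1, 1, 0
        return "C"
    return "other"  # the two remaining classes this post does not get into

# Whatever mask you configure on these, the class does not change.
for ip in ("10.20.30.0", "172.16.5.0", "192.168.1.0"):
    print(ip, "is Class", address_class(ip))
```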

With that said, if you have the following network:

10.20.30.0 255.255.255.0

The correct way to describe it would be “a Class A network with a 24-bit subnet mask”.

It is not in any way, shape, or form a Class C network.

Carry on.

12:16 am, by companya

Network Management and Cisco UCS

"I want to see interface utilization and error rates for my ESXi servers running on UCS. Which vEth interface maps to my ESXi server?"

This is the request for help I received from my Network Administrator friends over at Company A.

Company A uses Solarwinds NPM to monitor their network infrastructure, and in the past, when their ESX boxes were just appliance servers, they could monitor the physical ports on the switches where the ESX boxes connected. In a UCS environment you can still monitor the physical ports where the cluster connects to the LAN, but you are monitoring an aggregate of traffic coming from all (or some) of the blades. In-depth explanations of UCS networking have been done before so I will link out here if you need more detail.

Let’s take a look at the internal network architecture of UCS and get a visual understanding of what these guys are looking for and how to get it done.

image

You can see that with UCS there are a lot of physical and logical interfaces between the LAN (at the top of the drawing) and the NICs that the operating system on the blade server owns. This is not a problem. In fact it works quite well and is largely transparent to the network or server administrators. It is still useful to be able to see interface utilization and errors at the blade/OS level, and that is what the four red circles represent in the middle of the diagram.

These interfaces (in this case) are vNICs that have been defined on a Palo mezzanine adapter. These vNICs are managed by UCS (again, transparently) using Network Interface Virtualization or NIV. For each vNIC on a Palo adapter there is a vEth interface created in the Fabric Interconnects that represents the termination of this vNIC.

Unfortunately the vEth interfaces are not identified in UCSM so we have to dig around a bit to get at them. Let’s get started.

The first thing we need to do is enable SNMP access to UCS. Once logged in to UCSM follow along to this page:

Admin Tab -> Filter: Communication Management -> Communication Services -> SNMP

image

Next we will add a Fabric Interconnect (FI) into Solarwinds NPM. You will add each FI individually into NPM as each FI is an independent device that has active network interfaces (UCS is active-active with regard to the FI ports).

Home -> All Nodes -> Manage Nodes -> Add Node

Enter the SNMP community you entered into UCSM earlier, and let NPM walk the FI and look for interfaces. The output will look like this:

image

You can see the three LAN facing interfaces in the output. We are going to put a check mark on those bad boys. You can also see the vEth interfaces with what appears to be random numbers assigned to them - awesome! To which chassis/blade/palo do they belong? Well strangely, this information is available via CDP but not in UCSM! Again, awesome. To find out, we need to shell into the FI and issue a few commands.

UCS-A# scope adapter 1/1/1

Let me explain the 1/1/1 action. The first 1 is the chassis ID in the cluster. The second 1 is the slot ID in said chassis. You will need to determine which blade your Service Profile is associated to in order to get this value. In UCSM:

Servers Tab -> Filter: Service Profiles -> Service Profile -> General Tab -> Properties -> Associated Server

The third 1 is the adapter ID on the blade in said slot. For B200/B230 blades it will be 1. For B250/B440 blades it could be 1 or 2. Next:

UCS-A /chassis/server/adapter # scope host-eth-if 1

The 1 represents the vNIC ID. In my example above I will have 4 vNIC IDs that I will need to look at in order to find the vEth (or VIF) mapping.

UCS-A /chassis/server/adapter/host-eth-if # show vif

VIF:
ID         Fabric ID Transport Tag   Status      Overall Status
---------- --------- --------- ----- ----------- --------------
843        A         Ether     0     Allocated   Active

Money. Now we can appropriately name the interfaces in Solarwinds NPM (or your favorite SNMP NMS) to reflect the blade server interfaces we are monitoring.
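
If you have a lot of vNICs to work through, the copy and paste gets old. Here is a rough Python sketch that pulls the VIF ID out of saved "show vif" output and builds a label you could paste into the NMS. The column layout is assumed from the single sample above, and the label format is purely my own invention, so adjust both to taste.

```python
# Saved output of "show vif" for adapter 1/1/1, vNIC 1 (sample from above).
SHOW_VIF_OUTPUT = """\
VIF:
ID         Fabric ID Transport Tag   Status      Overall Status
---------- --------- --------- ----- ----------- --------------
843        A         Ether     0     Allocated   Active
"""

def parse_show_vif(text):
    """Return one dict per VIF row; header and separator lines are skipped."""
    vifs = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a data row has exactly six columns and a numeric ID.
        if len(fields) == 6 and fields[0].isdigit():
            vif_id, fabric, transport, tag, status, overall = fields
            vifs.append({
                "id": int(vif_id),
                "fabric": fabric,
                "transport": transport,
                "tag": int(tag),
                "status": status,
                "overall": overall,
            })
    return vifs

def nms_label(adapter_path, vnic_id, vif):
    """Hypothetical naming scheme for the interface caption in the NMS."""
    return f"vEth{vif['id']} - adapter {adapter_path} vNIC {vnic_id} (fabric {vif['fabric']})"

vifs = parse_show_vif(SHOW_VIF_OUTPUT)
print(nms_label("1/1/1", 1, vifs[0]))
```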

P.S. If the OS on your blades is ESX/ESXi, you can save time and check the CDP information window in the network configuration screen in vCenter.

01:44 pm, by companya

What is Enterprise Class? D-Link vs. Virtual Connect

This is a bit arbitrary in that I am comparing a 10G blade switch with a 1G stackable access layer switch but I think it illustrates my problem with Virtual Connect fairly well. To me Enterprise Class switching means a strong offering in the following areas:

  • Functionality
  • Performance
  • Reliability
  • Support

When determining the value of a switch I tend to start with functionality. The switch has to offer up useful features that will assist the network engineer in managing and troubleshooting the redundant, secured and complex network that is found in the Enterprise. When I buy a more affordable switch for my home I don’t expect there to be a ton of features - it’s cheap! And for my home! I probably won’t need IGMP snooping in my house. (That is a level of nerd that I have not yet achieved.)

If functionality isn’t there, I don’t even move on to the other three items I bulleted above because it doesn’t matter. Here we will compare the functionality of a “cheap”, “non-Enterprise Class” D-Link switch to that of an HP Virtual Connect Flex-10.

Let’s take a look at the functionality of the D-Link DES-3252P switch:

http://www.dlink.com/products/?pid=633

That is a pretty solid list of features IMO. Let’s take a look at Flex-10:

http://h18004.www1.hp.com/products/quickspecs/13127_div/13127_div.html

So basically you can manage it using SNMP or from a friendly GUI, it supports VLANs and link aggregation (LACP). And it does IGMP snooping. That is about it from a switching functionality standpoint. It may have high performance, great reliability and excellent support but I’m a bit hung up on the massive lack of features for a switch that finds itself in many Enterprise networks.

My question is this: If you were considering your switching options in your HP C7000 Blade Enclosure and D-Link had an offering, would you start by comparing functionality? Would you start by tossing out the D-Link because clearly HP must have a better option?

P.S. I would wager that the HP ProCurve 6120X is a much better alternative in most cases than a VC of any flavor. You will find that its spec sheet is much more similar to the D-Link listed above - and to most “Enterprise Class” switches.

http://h18000.www1.hp.com/products/quickspecs/13422_div/13422_div.pdf
10:25 pm, by companya

EtherChannel - You’re Doing It Wrong.

I had one of those days where you are at the customer site and you review their configuration and you wonder how it could have ever worked. Where the configuration is so wrong that you are convinced that even though the customer called you in a panic the night before with a network outage, there must have been a network outage for the entire time this configuration was in place.

"We didn’t change a thing" and somehow you believe them even though the configuration is so incredibly hosed. Here is what I ran into today while spending time at Company A:

Power outage at night. Not everything came back up. I begin troubleshooting by tracing the wires out of the ESX hosts to the network switches (Cisco 3560s). Once I got it all mapped out I cross referenced the physical NICs with the vNICs in ESX by using the little CDP information dialog.

Hosts and Clusters -> Host -> Configuration -> Networking image

This left me with enough information to create a handy little spreadsheet. Here’s a sample:


Now it’s time to jump in the switch and check it out. sh run… I noticed something right away that caught my eye. Each interface that went to an ESX server, regardless of which ESX server or which vSwitch within an ESX server, had the same line of code:

channel-group 4 mode on
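
All members of a channel group have to land on the same device at the far end, so this is easy to check once you have the port map. Here is a rough Python sketch that flags any channel group whose member ports go to more than one host or vSwitch. The config excerpt, interface names, and port map are made up for illustration; they are not Company A's actual config or my spreadsheet.

```python
import re
from collections import defaultdict

# Hypothetical excerpt of "sh run" (interface names are made up).
RUNNING_CONFIG = """\
interface GigabitEthernet0/1
 channel-group 4 mode on
interface GigabitEthernet0/2
 channel-group 4 mode on
interface GigabitEthernet0/3
 channel-group 4 mode on
"""

# Hypothetical switch port -> (ESX host, vSwitch) map from the cable trace.
PORT_MAP = {
    "GigabitEthernet0/1": ("esx1", "vSwitch0"),
    "GigabitEthernet0/2": ("esx1", "vSwitch1"),
    "GigabitEthernet0/3": ("esx2", "vSwitch0"),
}

def channel_group_members(config):
    """Map each channel-group number to the interfaces configured in it."""
    members = defaultdict(list)
    interface = None
    for line in config.splitlines():
        if line.startswith("interface "):
            interface = line.split(None, 1)[1]
            continue
        match = re.match(r"\s*channel-group (\d+) mode \S+", line)
        if match and interface:
            members[int(match.group(1))].append(interface)
    return dict(members)

def misbundled_groups(config, port_map):
    """Channel groups whose member ports reach more than one host/vSwitch."""
    bad = {}
    for group, ports in channel_group_members(config).items():
        far_ends = {port_map[port] for port in ports if port in port_map}
        if len(far_ends) > 1:
            bad[group] = sorted(far_ends)
    return bad

print(misbundled_groups(RUNNING_CONFIG, PORT_MAP))
```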

Let’s draw this out so we can all appreciate how much it doesn’t work.

image

I enjoy a good EtherChannel into my ESX box just like the next guy (chortle) but this is too much even for me. In fact, how did this ever pass Ethernet frames in a usable way?

In this case we are not going to use any EtherChannels. There aren’t really enough NICs. We will use NIC Teaming in the vSwitch so that we can use each uplink in an active-active mode. ESX does this by default by load-balancing based on virtual port ID (or Virtual Machine basically). This would “pin” a VM to one of the uplink ports on the vSwitch so long as it is up. Let’s take a look:
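
To make the “pinning” idea concrete, here is a toy Python model of it. The modulo rule is an assumption I picked for illustration, not VMware's documented algorithm, and the vmnic names are hypothetical. The point is only that each virtual port sticks to one uplink while that uplink is up, and moves when it goes down.

```python
def pin_uplink(virtual_port_id, uplinks, link_up):
    """Pick one uplink for a virtual port, or None if every uplink is down.

    Assumption: a simple modulo over the uplinks that are currently up.
    """
    active = [uplink for uplink in uplinks if link_up.get(uplink, False)]
    if not active:
        return None
    return active[virtual_port_id % len(active)]

uplinks = ["vmnic0", "vmnic1"]  # hypothetical uplink names

# Both uplinks up: virtual ports are spread across them.
all_up = {"vmnic0": True, "vmnic1": True}
for port_id in range(4):
    print("port", port_id, "->", pin_uplink(port_id, uplinks, all_up))

# vmnic0 fails: every virtual port lands on the surviving uplink.
one_down = {"vmnic0": False, "vmnic1": True}
for port_id in range(4):
    print("port", port_id, "->", pin_uplink(port_id, uplinks, one_down))
```

No EtherChannel on the physical switch is needed for this, because any given VM only ever shows up on one switch port at a time.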

Hosts and Clusters -> Host -> Configuration -> Networking -> vSwitch Properties -> Select vSwitch -> Click Edit -> NIC Teaming Tab image

Now let’s take another look at the setup after we cross connect the vSwitches to separate physical switches and put an EtherChannel between the physical switches.

image

Corrected the EtherChannel situation, added redundancy, load balancing per virtual port ID. Not too shabby. I think 10G with Cisco Nexus 1000v and QoS is a better approach but you deal with what you have in front of you.

11:09 pm, by companya

Autonomous Access Points? Yuck!

So I get a call from Company A today and they just received their 5-pack of Cisco 1142 Access Points and they need to get them online quickly. I asked if they went with the controller based solution I had strongly encouraged them to consider. Negative.

I am told that they will have budget for that very shortly but they needed these 5 APs up in their lab areas in a hurry so they went with a lower cost option. They just needed a quick configuration template to bring up an SSID with WPA PSK. No channel-bonding 802.11n. They don’t care if the BVI is on the same subnet as the wireless guests. Just get it done. This creates more work for them and me overall, but very well.

I ask if the lab areas are near each other - they are. I decide to select channels manually. I’ve had mixed results with the APs doing it on their own. I create a small spreadsheet for them so they know how to modify the template for each AP.

The channel assignments are meaningless without some form of reference. Let’s take a look at my chicken scratch floor plan:

image

The fifth lab is elsewhere in the building so shouldn’t come into play from an RF perspective. As you can see I am keeping the APs on channel one as far apart as I can here. I want to point out that a wireless survey would have been handy to determine how to best place the APs. We may only need one or two APs for the labs instead of four. Not gonna happen at Company A today…
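
For the curious, the “keep the same channel as far apart as possible” idea can be sketched in a few lines of Python. The AP coordinates below are invented (four labs in a row, 10 meters apart) and are not taken from my floor plan. The sketch just brute-forces every assignment of channels 1, 6 and 11 and keeps the one with the widest spacing between APs that share a channel. It is no substitute for a survey.

```python
from itertools import combinations, product
from math import dist

# Hypothetical AP coordinates in meters (not the real floor plan).
AP_POSITIONS = {
    "lab1": (0, 0),
    "lab2": (10, 0),
    "lab3": (20, 0),
    "lab4": (30, 0),
}
CHANNELS = (1, 6, 11)  # the non-overlapping 2.4 GHz channels

def min_cochannel_distance(plan):
    """Smallest distance between two APs sharing a channel (inf if none do)."""
    distances = [
        dist(AP_POSITIONS[a], AP_POSITIONS[b])
        for a, b in combinations(plan, 2)
        if plan[a] == plan[b]
    ]
    return min(distances, default=float("inf"))

def best_plan():
    """Try every assignment and keep the one with the widest co-channel spacing."""
    names = list(AP_POSITIONS)
    candidates = (
        dict(zip(names, combo))
        for combo in product(CHANNELS, repeat=len(names))
    )
    return max(candidates, key=min_cochannel_distance)

plan = best_plan()
print(plan, min_cochannel_distance(plan))
```

With four APs and three channels one channel has to repeat, and in this made-up layout it ends up on the two APs at opposite ends.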

So I create a template. The top few lines have values that are going to change per AP - well the default gateway won’t change but I threw it in here in case they buy another 5-pack for another building… I explain this to them. They are on board.

hostname lab1
interface Dot11Radio0
channel 1
interface BVI1
ip address 10.3.3.51 255.255.255.0
no ip route-cache
no shut
ip default-gateway 10.3.3.1
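
If you would rather not hand-edit those lines five times, here is a small Python sketch that stamps out the per-AP section from a table. Only the lab1 row comes from the template above; the lab2 row is a hypothetical placeholder, so fill in the real values from the spreadsheet.

```python
# Per-AP values. Only lab1 is from the template above; lab2 is hypothetical.
AP_ROWS = [
    {"hostname": "lab1", "channel": 1, "ip": "10.3.3.51"},
    {"hostname": "lab2", "channel": 6, "ip": "10.3.3.52"},  # hypothetical
]

# Same lines as the per-AP section, with the variable parts as fields.
TEMPLATE = """\
hostname {hostname}
interface Dot11Radio0
channel {channel}
interface BVI1
ip address {ip} {mask}
no ip route-cache
no shut
ip default-gateway {gateway}"""

def render(row, mask="255.255.255.0", gateway="10.3.3.1"):
    """Fill the per-AP section for one row of the table."""
    return TEMPLATE.format(mask=mask, gateway=gateway, **row)

for row in AP_ROWS:
    print(render(row))
    print()
```

The gateway is a parameter for the same reason it sits in the top section of the template: it only changes if another batch of APs goes into a different building.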

Now the rest of this can just be slammed into the remaining APs. It will set the SSID and PSK and get the SSIDs on the radio interfaces and turn them up. This is a very simple, very temporary configuration IMO.

no service pad
service timestamps debug datetime msec
service timestamps log datetime msec
service password-encryption
enable secret wishiwascapwap
aaa new-model
aaa session-id common
dot11 ssid CompanyALab
authentication open
authentication key-management wpa
guest-mode
wpa-psk ascii GJD34fdk4fsdDZD2
username admin password whyareyoustillusingtelnet
no username Cisco
bridge irb
interface Dot11Radio0
no ip address
no ip route-cache
encryption mode ciphers aes-ccm
broadcast-key change 60
ssid CompanyALab
station-role root
bridge-group 1
bridge-group 1 subscriber-loop-control
bridge-group 1 block-unknown-source
no bridge-group 1 source-learning
no bridge-group 1 unicast-flooding
bridge-group 1 spanning-disabled
no shut
interface Dot11Radio1
no ip address
no ip route-cache
encryption mode ciphers aes-ccm
broadcast-key change 60
ssid CompanyALab
dfs band 3 block
channel dfs
station-role root
bridge-group 1
bridge-group 1 subscriber-loop-control
bridge-group 1 block-unknown-source
no bridge-group 1 source-learning
no bridge-group 1 unicast-flooding
bridge-group 1 spanning-disabled
no shut
no ip http server
no ip http secure-server
bridge 1 route ip
line con 0
line vty 0 4
password whyareyoustillusingtelnet
line vty 5 15
password whyareyoustillusingtelnet
end
wr me

Yes I am broadcasting the SSID. I prefer wireless clients to be able to connect to an SSID and roam between APs in a reasonable amount of time (instantly). Security of the wireless network is also important to me. Does this seem counterintuitive? Start here to learn why it isn’t.

04:58 pm, by companya