Cisco 8Gbps IPS solution for the Datacenter
By jheary on Mon, 08/13/2007 - 11:18am.
Deploying an IPS solution in a datacenter can be a tricky affair. If not deployed properly, an IPS solution can severely affect the resiliency, performance, and security of a datacenter. Cisco’s IPS solution has been extensively tested to make sure it meets the stringent requirements of today’s complex datacenter (DC) environments. It is highly available, fast (8 Gbps), and scalable. Its ability to complement, rather than clash with, a datacenter designed around Cisco’s routing and switching best practices makes it unique in the industry. Let’s take an in-depth look at this solution.
First, let’s start with a model Cisco-approved datacenter architecture based on Cisco’s best practices. Several SRND (Solution Reference Network Design) guides can be found here: http://www.cisco.com/en/US/netsol/ns656/networking_solutions_design_guid... . The SRND for the DC infrastructure that the IPS solution was tested within can be found here: http://www.cisco.com/application/pdf/en/us/guest/netsol/ns107/c649/ccmig... . Figure 1 shows an example of a Cisco-approved DC infrastructure based on the most popular design choice, the layer 2 looped triangle. In fact, Figure 1 was copied directly from the DC infrastructure design guide.
Figure 1:
cisco IPS image
So, now we want to protect this datacenter with IPS. Ideally, the IPS solution should protect at the VLAN layer, but only for the VLANs that you choose. For example, in Figure 1 above, let’s say we want to implement IPS for all traffic going to and from VLAN 10 but leave VLAN 20 alone. The solution also needs to scale to a typical datacenter with tens of VLANs to choose from: for example, IPS enabled on VLANs 10-40 and not involved for all other VLANs. “Not involved” means that no traffic from those VLANs ever traverses the IPS system.
As you can see from the best-practice design above, the DC infrastructure is highly fault tolerant, scalable, and available. Any IPS solution must maintain these traits in an environment as critical as the datacenter. Asymmetric routing paths, and sometimes asymmetric switching paths, are common in a typical DC environment. For example, a flow might leave the DC via aggregation switch A but return via aggregation switch B. An IPS solution must be able to deal effectively with asymmetric data paths. Finally, the DC example above is built with 10 Gig links, so the IPS solution must be fast and scalable. For better or worse, performance seems to be the most sought-after metric in the IPS world today.
The Cisco IPS solution was architected to meet the above design requirements. Take a look at Figure 2, which inserts the Cisco IPS solution into Figure 1’s Cisco-approved DC infrastructure.
Figure 2:
As you can see, we just broke the 10 Gig EtherChannel between the two aggregation switches and inserted the Cisco IPS solution. The architecture uses a tried-and-true services chassis design that has been in use for a while now with other Cisco products, like the firewall blade and the CSM/ACE load balancer. In fact, this IPS solution (when architected with 6509s instead of 3750Es) lends itself very nicely to integrating with those other solutions.
The solution consists of:
* Two 3750E 10gig capable switches in a stack configuration
* Up to 8 Cisco IPS 4260 Appliances
* Cisco MARS for event correlation and monitoring
* Cisco Security Manager for multi-device IPS configuration management
Each IPS sensor has only one physical Gigabit interface in use. Traffic flows in and out of a sensor on the same physical interface, making it an “IPS inline on a stick” solution. These trunked interfaces are then EtherChanneled together for increased bandwidth and fault tolerance.
In order to best understand the IPS solution, let’s examine the packet flow. Figure 3 shows how a host on subnet 172.16.11.0/24 talks to a host outside the datacenter.
Figure 3: [img]http://www.jheary.com/Cisco_IPS_8gig_flow.jpg[/img]
The host (172.16.11.10) sits on VLAN 10, but its default gateway (172.16.11.1) sits on VLAN 11. Both VLAN 10 and VLAN 11 are in the same IP subnet. It is the IPS sensor that bridges the two VLANs together at layer 2. In order for the host to reach its default gateway, its traffic must pass through and be inspected by the IPS sensor farm.
Remember that up to 8 IPS sensors on a stick are connected to the 3750E switch stack, and the IPS interfaces are put into an EtherChannel bundle on the 3750E. The EtherChannel load-balancing (LB) algorithm on the 3750E switches is used to spread the traffic load among all of the available IPS sensors. It uses a hash of the source and destination IP addresses of a flow to determine which trunk to send that flow down. This means that EtherChannel load balances on a per-flow basis, which is not quite as efficient as per-packet LB but still very efficient. Another important property of EtherChannel LB is that it is sticky: bi-directional traffic for a particular flow is always sent to the same IPS sensor, allowing the sensor to maintain a proper state table.
Now that you understand the solution, let’s take a look at some of its benefits:
High-Availability and resiliency
* This solution is incredibly resilient to failure scenarios, so much so that even the worst single failure results in only a 1-2 second outage. Most single-failure scenarios cause an even shorter outage.
* If one or more IPS sensors go down, the EtherChannel recovers almost instantly and redirects traffic to the remaining sensors. This allows for a sort of hitless software upgrade capability.
* Handles asymmetric traffic paths.
* Software failure and IPS overload detection with fail-open support. The Cisco IPS sensors constantly monitor their IPS engine, interfaces, and CPU utilization. In the event of an IPS engine/software failure, missed packets, or dropped packets, the sensor can automatically fail open and start passing traffic at wire speed without inspection.
* Hardware failover support. The Cisco IPS sensors support Ethernet NIC-based hardware bypass for when the IPS box loses power completely.
* Removing a VLAN or VLANs from IPS inspection is a simple matter of moving the MSFC router placement. In the example above you would move the MSFC from VLAN 11 to VLAN 10. This effectively bypasses the IPS solution.
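To make the EtherChannel behavior behind these bullets a bit more concrete, here is a minimal Python sketch. It is not the actual 3750E hashing algorithm. It assumes a simple XOR of the source and destination addresses followed by a modulo over the links that are still up. The sensor names and the outside host 10.1.1.5 are made up for illustration; only 172.16.11.10 comes from the example above.

```python
import ipaddress
from typing import Sequence


def pick_sensor(src_ip: str, dst_ip: str, sensors: Sequence[str]) -> str:
    """Pick the sensor link for a flow from a symmetric hash of its two IPs.

    Illustrative only: a real switch uses its own hash, not this one.
    """
    if not sensors:
        raise ValueError("no IPS sensors left in the bundle")
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # XOR is commutative, so A->B and B->A produce the same hash value.
    return sensors[(src ^ dst) % len(sensors)]


# Hypothetical bundle of 8 sensors on a stick.
sensors = [f"ips{n}" for n in range(1, 9)]

# Both directions of the flow land on the same sensor (the "sticky" property).
forward = pick_sensor("172.16.11.10", "10.1.1.5", sensors)
reverse = pick_sensor("10.1.1.5", "172.16.11.10", sensors)
print(forward == reverse)  # True

# If that sensor fails, the flow is rehashed over the remaining links.
survivors = [s for s in sensors if s != forward]
print(pick_sensor("172.16.11.10", "10.1.1.5", survivors))
```

Because the hash depends only on the pair of addresses and not on their order, the return path of a flow picks the same sensor as the outbound path, which is what lets each sensor keep a consistent state table. When a link drops out of the bundle, the same function simply runs over a shorter list.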
Performance and Scalability
* Scales to 8 Gbps of real-world traffic inspection. As Cisco releases faster IPS appliances, the scalability will increase accordingly.
* Highly scalable. Start with as few as 2 IPS sensors for 2 Gbps of IPS inspection. Adding sensors increases the IPS performance linearly, up to 8 Gbps today.
* Allows for IPS inspection at the VLAN ingress/egress. VLANs can be selectively chosen for IPS inspection.
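The scaling and VLAN-selection points above can be sketched in a few lines of Python. This is a back-of-the-envelope model, not a sizing tool: it assumes roughly 1 Gbps of inspection per sensor (inferred from "2 sensors for 2 Gbps") and an 8-sensor ceiling, and the `VLAN_PAIRS` table and function names are hypothetical.

```python
import math

PER_SENSOR_GBPS = 1.0  # assumption: inferred from "2 sensors for 2 Gbps"
MAX_SENSORS = 8        # the post's stated limit today

# Hypothetical table: host VLAN -> VLAN the sensors bridge it to.
VLAN_PAIRS = {10: 11}


def inspection_capacity_gbps(sensor_count: int) -> float:
    """Aggregate inspection capacity, assuming linear scaling."""
    if not 0 <= sensor_count <= MAX_SENSORS:
        raise ValueError(f"sensor count must be between 0 and {MAX_SENSORS}")
    return sensor_count * PER_SENSOR_GBPS


def sensors_needed(target_gbps: float) -> int:
    """Smallest number of sensors that covers the target load."""
    needed = math.ceil(target_gbps / PER_SENSOR_GBPS)
    if needed > MAX_SENSORS:
        raise ValueError("target exceeds what 8 sensors can inspect")
    return max(needed, 0)


def is_inspected(host_vlan: int, gateway_vlan: int) -> bool:
    """Traffic crosses the sensors only if the gateway sits on the paired VLAN."""
    return VLAN_PAIRS.get(host_vlan) == gateway_vlan


print(inspection_capacity_gbps(8))  # 8.0
print(sensors_needed(3.5))          # 4
print(is_inspected(10, 11))         # True: gateway on VLAN 11, hosts on VLAN 10
print(is_inspected(10, 10))         # False: gateway moved to VLAN 10, IPS bypassed
```

The last two calls mirror the example in the post: with the gateway on VLAN 11, hosts on VLAN 10 must cross the sensor farm to reach it, and moving the gateway onto VLAN 10 takes the sensors out of the path.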
I consider this IPS solution extremely elegant because of its reliance on time-tested technologies like routing, spanning tree, dot1q trunking, and EtherChannel to do its job. The basic services chassis infrastructure design (albeit with FWSM and CSM/ACE instead of IPS) has also been around for quite some time and is heavily deployed in datacenters throughout the world, which makes adding this IPS solution to an existing services chassis environment a straightforward affair.