Kevin is a Cisco Certified Systems Instructor (CCSI No. 20061) with two CCIEs (#7945), one in Route/Switch and one in Collaboration. Kevin produces video courses and writes books for Cisco Press/Pearson IT Certification. Also, he owns and operates Kevin Wallace Training, LLC, a provider of self-paced training materials that simplify computer networking.
The Open Shortest Path First (OSPF) dynamic routing protocol is one of the most beloved inventions in all of networking, widely adopted as the Interior Gateway Protocol (IGP) of choice for many networks. In this blog series, you'll be introduced first to the basic concepts of OSPF and learn about its various message types and neighbor formation.
An Overview of OSPF
Where does OSPF get its interesting name? It comes from the fact that OSPF uses Dijkstra's algorithm, also known as the shortest path first (SPF) algorithm. OSPF was developed to calculate the shortest path through a network based on the cost of the route. This cost value is derived from the bandwidth values of the links in the path. OSPF therefore calculates route cost from link-cost parameters, which you can control by manipulating the cost calculation formula.
As a link state routing protocol, OSPF maintains a link state database. This is a form of network topology map. Every OSPF router on the network maintains this link state database and uses it as its map to network destinations. Included in this link state database for each prefix is the OSPF cost value. Remember, the OSPF algorithm allows every router to calculate the cost of the routes to any given reachable destination.
A router interface running OSPF will advertise its link cost to its OSPF neighbors. This prefix and cost information is cascaded through the network as OSPF routers advertise the information they receive from one OSPF neighbor to all other OSPF neighbor routers. This process of flooding link state information through the OSPF network is known as synchronization. Based on this information, all routers running OSPF continuously update their link state databases with information about the network topology and adjust their routing tables.
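To make the bandwidth-to-cost relationship concrete, here is a minimal Python sketch. It assumes the common Cisco IOS default of a 100 Mbps reference bandwidth divided by the interface bandwidth, with a floor of 1. The function names and the sample link speeds are illustrative only and do not come from the topology in this post.

```python
# Sketch of the OSPF cost calculation, assuming a 100 Mbps reference bandwidth.
DEFAULT_REFERENCE_BW_BPS = 100_000_000  # assumed default; adjustable on real routers


def ospf_cost(interface_bw_bps: int, reference_bw_bps: int = DEFAULT_REFERENCE_BW_BPS) -> int:
    """Return the cost of one link: reference bandwidth / interface bandwidth."""
    if interface_bw_bps <= 0:
        raise ValueError("interface bandwidth must be positive")
    # The result is truncated to an integer and can never be lower than 1.
    return max(1, reference_bw_bps // interface_bw_bps)


def path_cost(link_bandwidths_bps: list[int]) -> int:
    """Return the total cost of a path as the sum of its outgoing link costs."""
    return sum(ospf_cost(bw) for bw in link_bandwidths_bps)


if __name__ == "__main__":
    for name, bw in [("T1", 1_544_000), ("Ethernet", 10_000_000),
                     ("FastEthernet", 100_000_000), ("GigabitEthernet", 1_000_000_000)]:
        print(f"{name}: cost {ospf_cost(bw)}")
```

Notice that with this assumed default, FastEthernet and GigabitEthernet both come out at a cost of 1, which is one reason you might want to adjust the formula's reference value on faster networks.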
We like to refer to OSPF as a hierarchical routing protocol. This is because an OSPF network can be subdivided into routing areas to simplify administration and optimize traffic and resource utilization. We identify areas by 32-bit numbers. We can express these in decimal or in the same dotted decimal notation used for IPv4 addresses.
By definition, Area 0 (or 0.0.0.0) represents the core or backbone area of an OSPF network. Any other areas you might create (Area 10, Area 20, etc.) must be connected to the backbone Area 0. If your areas are not connected as described, you must engage in a workaround procedure such as a Virtual Link (described later in this blog series). You maintain connections between your areas with an OSPF router known as an area border router (ABR). An ABR maintains separate link-state databases for each area it serves and maintains summarized routes for all areas in the network.
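Because an area ID is just a 32-bit number, the two notations are interchangeable. The following short Python sketch (the helper names are hypothetical) uses the standard ipaddress module to convert between them.

```python
import ipaddress


def area_to_dotted(area_id: int) -> str:
    """Render a 32-bit OSPF area ID in dotted decimal notation."""
    return str(ipaddress.IPv4Address(area_id))


def area_to_decimal(dotted: str) -> int:
    """Convert a dotted decimal OSPF area ID back to a plain integer."""
    return int(ipaddress.IPv4Address(dotted))


if __name__ == "__main__":
    print(area_to_dotted(0))           # the backbone area
    print(area_to_dotted(10))
    print(area_to_decimal("0.0.1.0"))
```

So Area 10 and Area 0.0.0.10 are the same area, written two different ways.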
Neighbor and Adjacency Formation
Because OSPF is a complex routing protocol (unlike a much simpler protocol such as RIP), it uses different neighbor states in its operation. You should know and understand these different states, and not just as an academic exercise. These states can become critical in your OSPF support and troubleshooting efforts. For example, you might have a router “stuck” in one of these states, and that information can prove critical in fixing the problem.
Here are the states and information you should know about each of them:
Down - This is the first (or initial) OSPF neighbor state. It means that no information (hellos) has been received from a neighbor, but hello packets can still be sent.
During the fully adjacent neighbor state (the Full state), if a router doesn't receive a hello packet from a neighbor within the RouterDeadInterval time, or if the manually configured neighbor is being removed from the configuration, then the neighbor state changes from Full to Down. Remember, the RouterDeadInterval time is 4 times the HelloInterval by default for OSPF.
Attempt - This state is only valid for manually configured neighbors in a Non-Broadcast Multiaccess (NBMA) environment. In this state, the router sends unicast hello packets every poll interval to the neighbor from which hellos have not been received within the dead interval.
Init - This state specifies that the router has received a hello packet from its neighbor, but the receiving router's ID was not included in the hello packet. When a router receives a hello packet from a neighbor, it should list the sender's router ID in its hello packet as an acknowledgment that it received a valid hello packet.
2-Way - This state designates that bi-directional communication has been established between two routers. Bi-directional means that each router has seen the other's hello packet.
This state is attained when the router receiving the hello packet sees its own Router ID within the received hello packet's neighbor field. In this state, a router decides whether to become adjacent to this neighbor. On broadcast media and non-broadcast multiaccess networks, a router becomes fully adjacent only with the designated router (DR) and the backup designated router (BDR); it stays in the 2-way state with all other neighbors. On Point-to-Point and Point-to-Multipoint networks, a router becomes fully adjacent with all connected routers.
NOTE: At the end of this state, the DR and BDR for broadcast and non-broadcast multi-access networks are elected.
Exstart - Once the DR and BDR are elected, the actual process of exchanging link state information can start between the routers and their DR and BDR.
Exchange - In the exchange state, OSPF routers exchange database descriptor (DBD) packets. Database descriptors contain link-state advertisement (LSA) headers only and describe the contents of the entire link-state database. Routers also send link-state request packets and link-state update packets (which contain the entire LSA) in this state. The contents of the DBD received are compared to the information contained in the router's link-state database to check if new or more current link-state information is available with the neighbor.
Loading - In this state, the actual exchange of link state information occurs. Based on the information provided by the DBDs, routers send link-state request packets. The neighbor then provides the requested link-state information in link-state update packets. During the adjacency, if a router receives an outdated or missing LSA, it requests that LSA by sending a link-state request packet. All link-state update packets are acknowledged.
Full - In this state, routers are fully adjacent with each other. All the router and network LSAs are exchanged and the routers' databases are fully synchronized.
Remember that the full state is the normal state for an OSPF router. If a router is stuck in another state, it is an indication that there are problems in forming adjacencies.
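For troubleshooting, it can help to think of the states above as an ordered progression with only two healthy resting points. The Python sketch below models that idea. It is a simplification based on the descriptions in this post, the names are hypothetical, and it is not how any router actually implements the state machine.

```python
from enum import IntEnum


class NeighborState(IntEnum):
    """OSPF neighbor states in the order an adjacency normally progresses."""
    DOWN = 0
    ATTEMPT = 1   # only for manually configured NBMA neighbors
    INIT = 2
    TWO_WAY = 3
    EXSTART = 4
    EXCHANGE = 5
    LOADING = 6
    FULL = 7


MULTIACCESS_TYPES = {"broadcast", "nbma"}


def is_expected_resting_state(state: NeighborState, network_type: str,
                              either_side_is_dr_or_bdr: bool) -> bool:
    """Return True if a neighbor sitting in this state is normal, not stuck."""
    if state is NeighborState.FULL:
        return True
    if state is NeighborState.TWO_WAY:
        # 2-Way is normal only between two routers that are neither DR nor BDR
        # on a broadcast or NBMA segment.
        return network_type in MULTIACCESS_TYPES and not either_side_is_dr_or_bdr
    return False


if __name__ == "__main__":
    print(is_expected_resting_state(NeighborState.TWO_WAY, "broadcast", False))
    print(is_expected_resting_state(NeighborState.EXSTART, "point-to-point", False))
```

A neighbor that lingers in any state for which this check returns False is a good place to start your troubleshooting.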
Although OSPF is a much more complex IGP than something like RIP, its configuration is surprisingly simple. In fact, there are currently two options for its configuration at the command line. You can configure OSPF using a network statement under the routing process (much like other routing protocols), or you can configure the interfaces to run OSPF in interface configuration mode. Examples 1 and 2 demonstrate the two configuration approaches using the topology shown in Figure 1.
Figure 1: A Sample OSPF Topology
Example 1: Configuring OSPF Using Network Statements
Notice in these configurations that we identify the local OSPF process on the router using a locally significant process ID value. In the example, the process ID of 1 is chosen for all the routers. Also notice that the network statement must contain a wildcard mask that indicates the significant bits in the network value that precedes it in the command. On the ATL2 Area Border Router, a 32-bit wildcard mask of 0.0.0.0 is used to place one interface in Area 0 and another in Area 1.
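If wildcard masks are new to you, the following Python sketch shows the matching logic: a 0 bit in the wildcard mask means "this bit must match," and a 1 bit means "don't care." The function name and the sample addresses are hypothetical and are not taken from Figure 1.

```python
import ipaddress


def matches_network_statement(interface_ip: str, network: str, wildcard: str) -> bool:
    """Return True if an interface address is selected by a network statement."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    wc = int(ipaddress.IPv4Address(wildcard))
    # Invert the wildcard mask to get the bits that must match exactly.
    care_bits = ~wc & 0xFFFFFFFF
    return (ip & care_bits) == (net & care_bits)


if __name__ == "__main__":
    # A 0.0.0.0 wildcard mask matches exactly one interface address.
    print(matches_network_statement("10.23.23.2", "10.23.23.2", "0.0.0.0"))
    print(matches_network_statement("10.23.23.3", "10.23.23.2", "0.0.0.0"))
    # A 0.0.0.255 wildcard mask matches any address in the /24.
    print(matches_network_statement("10.23.23.3", "10.23.23.0", "0.0.0.255"))
```

This is why the all-zeros wildcard mask is so handy on an ABR: it lets you place each interface into its own area with no ambiguity.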
Example 2: Configuring OSPF Using Interface-Level Commands
Example 3: Verifying OSPF
Notice the power of the show ip ospf neighbor command for verifying the neighbor relationships and their state. On ORL, we examine the routing table to ensure we are learning the OSPF Inter Area route of 10.12.12.0/24.
That's going to wrap up Part #1 of our OSPF series. Coming up in Part #2, we'll take a look at the Designated Router and Backup Designated Router election process. Take good care.
One of the big announcements this week at Cisco Live was the launch of their new DevNet certification track. Cisco CEO Chuck Robbins reiterated the fact that knowledgeable engineers are always going to be in demand. Contrary to what many believe, network automation and A.I. integration are not designed as a replacement for those skills. Rather, these advancements make it possible to manage numerous network devices and their services through software. For large-scale networks, the use of APIs for automation is the way of the future.
The launch of this new certification track is aimed at joining the skills of software developers with network professionals, with the goal of accelerating the progress of network automation in organizations throughout the world.
Here's a breakdown of the current DevNet certification offerings:
This entry-level certification is accessible to those who are "early-in-career" developers, and also experienced network engineers. Recommended experience is one or more years of developing and/or maintaining applications built on Cisco platforms. Hands-on programming experience is recommended as well, specifically the Python language.
This is ideal for those who have three to five years of experience with application development, operations, security, or infrastructure. You also have the option to choose a particular focus area under the umbrella of Software Specialist or Automation Specialist, with options listed below.
Software Specialist options:
Automation Specialist Options:
Data Center Automation
Service Provider Automation
This certification requires one core exam, plus one concentration exam. This is recommended for developers who have at least three to five years of experience designing and implementing applications built on Cisco platforms with experience in Python. This also includes experienced engineers who want to learn more about software and automation.
Currently, the expert certification offering is not available. However, this is listed as a "Future Offering" on the DevNet Certification website.
By now I'm sure you've heard of the sweeping changes Cisco is making to their certification tracks, which were announced at Cisco Live on Monday, June 10, 2019. I covered the CCNA exam changes in a previous post, so here I'll specifically address updates to the CCNP track.
First, if you've already started working toward any current CCNP certification - keep going! You have until February 24, 2020 to complete your certification, and in the new program, you'll receive credit for work you've already completed.
Let's begin by looking at the current list of CCNP certifications, set to expire next February:
CCNP Routing and Switching
CCNP Data Center
CCNP Service Provider
Now, here are the new CCNP certifications that will be rolling out:
CCNP Data Center
CCNP Service Provider
Cisco Certified DevNet Professional
You may notice the absence of CCNP Routing and Switching, CCNP Wireless, and CCDP. These will all be retired. In their place, Cisco will offer multiple paths to achieving the new CCNP Enterprise certification, along with new Specialist certifications based on which of the three tracks you choose to go down. To further clarify this, Cisco has provided a migration tool for their professional exam tracks, which you can access here: CCNP Migration Tool.
The important thing to make clear is that if you pass the full exam path for any CCNP track before the February deadline, you will be granted the equivalent certification under the new program. For example, if you pass the current CCNP SWITCH, CCNP ROUTE, and CCNP TSHOOT before February 24, you will receive the new CCNP Enterprise certification, plus the appropriate Specialist certifications (see the migration tool for more info on those) as outlined by Cisco.
Now, if you are starting fresh under the new CCNP program, each CCNP certification requires only two exams: one core exam and one concentration exam of your choice. To see what this looks like for your particular concentration, visit Cisco's Professional Level certification page here: Cisco Professional Certification Updates.
One last interesting thing of note is that the core exams in each technology track also serve as qualifying exams for CCIE lab exams. This means there will be no more written CCIE exams necessary before the lab attempt. So for example, let's say you have a current CCNP in Routing and Switching. After the February deadline, you will be granted the CCNP Enterprise certification, along with the CCNP Certified Specialist - Enterprise Core certification. The CCNP Certified Specialist - Enterprise Core (300-401) is a prerequisite for the new CCIE Enterprise Infrastructure, in place of a written exam. For more about the expert level certification updates, check here: Cisco Expert Certification Updates.
It's a lot to get your head around, but all of these changes look like a step in the right direction. Cisco's goal was to streamline the certification process and make things more accessible with fewer tests necessary for completing specific concentrations, and it certainly seems like they've done that. Stay tuned for more about this big announcement.
On Monday June 10, 2019 Cisco announced an unprecedented revamp of their certification program. This post dives into one of the major updates, the new CCNA certification. (We'll have a future blog post with updates on the CCNP changes.)
First, if you’re currently preparing for your CCNA R/S (or any other CCNA for that matter), don’t panic. You have until February 24, 2020 to complete your certification, at which time you’ll be given the new CCNA certification, plus a “badge” indicating your area of specialization (based on which CCNA you earned). So, Cisco recommends you “keep going” if you’re working towards any CCNA certification.
Even if you’re just thinking about going after a CCNA cert, personally, I would do it now before the February deadline hits.
However, just having a current CCENT certification won't help. You'll need a full CCNA to be granted the new CCNA certification. So, if you do just have your CCENT, you'll need to pass the ICND2 exam, to be grandfathered into the new CCNA certification.
However, if you do want to challenge yourself with the new composite CCNA exam (or maybe you’re just curious about what’s on it), let’s break it down.
You can download a comprehensive list of topics by clicking HERE.
At a super high level, the topic categories break down like this:
Network Fundamentals: 20 percent
Network Access: 20 percent
IP Connectivity: 25 percent
IP Services: 10 percent
Security Fundamentals: 15 percent
Automation and Programmability: 10 percent
Now, let’s delve into each of those categories:
Network Fundamentals:
Here, you’ll need to know some basics, such as the role of routers and switches in a network, the different types of topologies out there, copper/fiber cabling information, troubleshooting cabling issues, comparing TCP and UDP, IPv4 and IPv6 addressing, verifying IP address parameters in various operating systems, wireless basics, and switch operations.
Network Access:
In this area, you’ll need to know about VLANs, trunking, EtherChannel, and wireless networking concepts.
IP Connectivity:
Then, you’ll review how routing works, focusing on static routes and OSPFv2. (Note the absence of RIP, EIGRP, and BGP.) Also, you’ll need to know some theory of HSRP, VRRP, and GLBP.
IP Services:
In this category, you’ll explore various IP services such as NTP, DHCP, SNMP, and QoS.
Security Fundamentals:
This section introduces you to security fundamentals along with a discussion of some specific security mechanisms, including VPNs, Layer 2 security mechanisms, and wireless network security.
Automation and Programmability:
This is the big one! Moving away from the CLI, this category gets into controller-based SDN, Cisco DNA Center, REST APIs, Puppet, Chef, and JSON-encoded data.
I’ll definitely be coming out with training for the new CCNA, but I’ve got to emphasize: you’ve still got 7 months to earn the current CCNA R/S, which is what I would personally do.
To accelerate your learning, whether you decide to pursue a current CCNA track or start studying for the new CCNA, one thing you need is a solid understanding of networking fundamentals. And, since Cisco dramatically changed the state of play this morning, I want to help you master the fundamentals.
Specifically, I’m going to be doing a 3-day LIVE and online training course for FREE called CCNA Foundations, beginning June 25th. Check out the details here: https://kwtrain.com/ccna-foundations
Stay tuned for more information on this morning’s big announcement.
Cisco just announced their certification program is getting a MAJOR update. Here’s what you need to know:
A new CCNA exam will be released on February 24, 2020.
This new CCNA certification will REPLACE the following certs:
CCNA Cyber Ops
CCNA Data Center
CCNA Routing and Switching
CCNA Service Provider
If you complete any current CCNA/CCDA certification before Feb. 24, 2020, you’ll get the new CCNA certification, and a “training badge,” representing the technology area in which you received your CCNA/CCDA.
New lifetime tenure for CCIE certification maintained for 20 years continuously
What we're doing to help:
Cisco recommends that if you're currently studying for your CCNA certification, you should "keep going." Unfortunately, the CCNA program has evolved so much over the years, it now misses many of the fundamental technologies for people just getting into the industry. So, I'm going to be doing a FREE ONLINE COURSE - LIVE, starting on June 25th.
In this post, we're going to take a look at how we can work with BGP in IPv6.
BGP in IPv6
You will recall from this chapter that BGP was constructed to support many different protocols and NLRI right from its creation. As a result, we have robust support for such technologies as IPv6, MPLS VPNs, and more.
You will also relish the fact that once you master the basics of BGP that we have covered, working with BGP in IPv6 is far more similar than it is different!
BGP is so remarkably flexible, as discussed earlier in this chapter, you can use IPv4 as the “carrier” protocol for IPv6 NLRI. In this case, we consider IPv6 the “passenger” protocol. Let’s examine this configuration first and let’s use two simple routers as shown in Figure 1.
Figure 1: A Simple Topology for IPv6 BGP
Example 1 shows the configuration and verification of such a network. Notice how this configuration requires the setting of the appropriate IPv6 next hop address for the prefix advertisement. This is not needed when using IPv6 as both the carrier and the passenger protocol.
Example 1: IPv4 Carrying IPv6 NLRI
Example 2 shows the verification of this configuration on ATL2. Note that because EUI-64 is in effect on the loopback interface of ATL, you would need to copy the full IPv6 address from that interface in order to perform the ping test.
Example 2: Verifying the IPv4/IPv6 BGP Configuration
As you might guess, a much cleaner configuration is to use IPv6 to carry the IPv6 prefix information. When I say “cleaner”, I allude to the fact that it is just a much simpler configuration, and more what you would “expect”. Example 3 demonstrates this configuration. Note that I have stripped all IPv4 off the devices, so I must set a 32-bit router ID for BGP as it is not able to set one automatically from an interface on the device.
Example 3: IPv6 Carrying IPv6 NLRI
You might be wondering about verification of BGP neighbors with our various methods of IPv6 configuration. We know that we love show ip bgp summary in an IPv4 environment for this. For IPv6, use the command show bgp ipv6 unicast summary. Note how this command specifically calls out the address family that you are working with and will show you the tabular peering and prefix count you would expect.
As you recall from earlier in this blog series, there are many wonderful filtering mechanisms we can choose from for IPv4 BGP environments. The wonderful news is that we have this same set of techniques available for IPv6. This would include such mechanisms as:
AS Path Filtering
Example 4 shows a sample filtering configuration using a prefix list. Note that this configuration really does not require you to re-learn any technologies. We love it when that happens!
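To illustrate why nothing needs to be re-learned, here is a small Python sketch of how a single prefix-list entry is evaluated. The logic is the same as in IPv4, just with longer addresses. This is a simplified model under my own assumptions (one entry only, with no sequence numbers and no implicit deny), the function name is hypothetical, and the prefixes come from the 2001:db8::/32 documentation range rather than from the topology in this post.

```python
import ipaddress
from typing import Optional


def entry_matches(candidate: str, entry: str,
                  ge: Optional[int] = None, le: Optional[int] = None) -> bool:
    """Return True if a candidate prefix matches one prefix-list entry."""
    cand = ipaddress.IPv6Network(candidate)
    ent = ipaddress.IPv6Network(entry)
    if ge is None and le is None:
        # With no ge/le keyword, only an exact prefix and length match counts.
        return cand == ent
    # ge sets the minimum prefix length, le sets the maximum prefix length.
    low = ge if ge is not None else ent.prefixlen
    high = le if le is not None else cand.max_prefixlen
    return cand.subnet_of(ent) and low <= cand.prefixlen <= high


if __name__ == "__main__":
    print(entry_matches("2001:db8:1::/48", "2001:db8::/32", le=48))
    print(entry_matches("2001:db8:1:1::/64", "2001:db8::/32", le=48))
```

In this sketch, the /48 falls inside the entry and within the allowed length, so it matches, while the /64 is too specific and does not.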
Example 4: IPv6 Prefix Filtering in BGP
That's going to completely wrap up our BGP series. Keep an eye out for our upcoming blog series, which will be dedicated to advanced OSPF concepts. Take good care.
In this post, we're going to take a look at BGP scalability mechanisms and related concepts.
BGP Scalability Mechanisms
Just as IP address depletion has been a concern with the Internet, so has the depletion of available autonomous system numbers. To help solve this, the engineers turned to a familiar solution. They marked an AS number range as private-use only. This permits you to experiment with AS construction and policy in a lab (for example) and use AS numbers that are guaranteed not to conflict with Internet-based systems.
Remember, the AS number is a 16-bit number permitting up to 65,536 AS numbers. The private space is marked as 64512-65535.
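The arithmetic here is easy to check with a few lines of Python. This sketch uses the private range exactly as stated above, and the function name is hypothetical.

```python
# 16-bit AS number space and the private-use range as given in the text.
TOTAL_16_BIT_AS_NUMBERS = 2 ** 16
PRIVATE_AS_RANGE = range(64512, 65536)  # 64512 through 65535 inclusive


def is_private_as(asn: int) -> bool:
    """Return True if a 16-bit AS number falls in the private-use range."""
    if not 0 <= asn < TOTAL_16_BIT_AS_NUMBERS:
        raise ValueError("not a 16-bit AS number")
    return asn in PRIVATE_AS_RANGE


if __name__ == "__main__":
    print(TOTAL_16_BIT_AS_NUMBERS)
    print(len(PRIVATE_AS_RANGE))
    print(is_private_as(65001), is_private_as(100))
```

That works out to 1,024 private AS numbers carved out of the 65,536 available.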
Another solution to the shortage has been to expand the AS number space. A larger space (32-bit number) has been approved.
For the longest time, BGP peer groups were considered an absolute must from a scalability perspective. We would configure peer groups for the benefit of much smaller configuration files. And we also configured peer groups for performance improvements.
The performance benefits have since been superseded by much-improved mechanisms. That said, many organizations still use peer groups because they are easily understood and they shorten configurations.
BGP peer groups arose to solve the ridiculous amount of redundancy in BGP configurations. Consider the simple (and very small) Example 1. Even this simple example still communicates the amount of redundant configuration.
Example 1: A Typical BGP Configuration without Peer Groups
Clearly, that is a lot of configuration commands per neighbor. And many of your neighbors are going to share the same characteristics. It makes sense to group their configurations in a peer group type of configuration. Example 2 shows how you can configure and use a BGP peer group.
Example 2: BGP Peer Groups
Keep in mind that if you have specific configurations for a specific neighbor, you can still enter them in the configuration and they will apply in addition to the peer group configurations.
The other thing about peer groups, and why they were so frequently used, is that they offered performance enhancements as well. As a matter of fact, that was the original reason for their creation. It would help with the efficiency of BGP's operations to use peer groups for neighbors instead of individual neighbor configurations.
A more modern (and more effective) approach is to use session templates to shorten configurations. And from a performance enhancement perspective, we now have (as of IOS 12 and later) dynamic update groups. They provide performance enhancements without you needing to configure anything as far as peer groups or templates are concerned.
When you think about a peer group, you realize it's like a template for your settings. And it's going to enable you to utilize session parameters and also policy parameters. Well, the new and improved methodology separates these functionalities into session templates and policy templates.
Thanks to session templates and policy templates, we place the settings required for a session to be properly established in a session template, and we place the settings that involve policy actions in a policy template.
One of the great things about using these session or policy templates or both is that they follow an inheritance model. You can have a session template that does certain things with a session. Then you can set up a direct inheritance so that when you create another one it incorporates the things in the one you created previously. This inheritance model is going to give us more flexibility and we can create some really nice scalable designs for BGP implementations.
You can use templates or you can use peer groups, but they are mutually exclusive choices. So decide on your approach in advance: are you going with the legacy peer group approach, or with the session and policy template approach? Make that choice and stick with it on the device, because you cannot use both approaches simultaneously.
Now, you would guess that the configuration would be pretty straightforward for session templates, and it is. Remember, a session template holds anything that is relevant to the session. Setting timers or setting the remote-as, for example, would be considered session parameters.
Update source and eBGP multihop are other examples. All of this is relevant to the session, and that is what we would have in the session template. We begin by creating the template with template peer-session followed by a name. Inside that template configuration mode we can do inheritance, so we can inherit settings from another peer session, and we can set our remote-as and/or update source. When we're all done, we use the exit-peer-session command to leave the configuration mode for that session. Example 3 shows the configuration of a session template.
Example 3: BGP Session Templates
It is a simple matter of configuring a neighbor with the neighbor statement and using inherit peer-session and then giving the name of the peer session that we created for our session template, and that's going to give that neighborship those session settings.
Remember, if you wanted to do some additional configuration of the neighbor, you certainly could. Just reference the neighbor by IP address and add whatever settings you want outside of the peer session template. You still have the same flexibility that we saw with peer groups, where you can configure individual settings for a specific neighbor outside of the template.
You might think that policy templates would be of a similar construction and usage to session templates, and you would be right. Remember, if your session templates are where we're going to configure the parameters that would relate to a BGP session, of course, policy templates are going to be where we store settings that are going to be applying to policy.
Example 4 shows the configuration and usage of a BGP policy template.
Example 4: BGP Policy Templates
Yes, all of these settings that we discussed when we were discussing policy manipulations, those are going to be what we would do inside of a policy template. Now, one key differentiator between our policy template and our session template, though, is the fact that inheritance is going to be even more flexible here.
For instance, we can go up to seven different templates that we can directly inherit policy from. This gives us even more powerful inheritance capabilities with the policy templates when compared to the session templates.
Once again, if we want to do independent individual settings for policy to a specific neighbor, we can do that by adding the appropriate neighbor commands.
Thanks to loop prevention and the IBGP split-horizon rule, among other factors, we know that we need to come up with some scalability solutions for IBGP peerings. One of those solutions is route reflectors.
Examine the topology shown in Figure 1. Notice R3 is to be configured as a route reflector.
Figure 1: A Sample Route Reflector Topology
The configuration of route reflection is amazingly simple as it is all handled on the route reflector itself (R3). The clients that we're going to have (the R4, R5, and R6 route reflector clients) are completely unaware of the configuration and are configured for IBGP peerings with R3 as normal. Example 5 shows an example of a route reflector configuration. Note this is through the simple specification of a route reflector client.
Example 5: BGP Route Reflection
The route reflector automatically creates a cluster ID value for the cluster, and that device and its clients are part of what we call a route reflector cluster. Cisco recommends that you permit the automatic assignment of the cluster ID to identify the cluster. This is a 32-bit identifier that BGP pulls from the route reflector.
The magic of route reflection is in how the rules of IBGP change. For example, if an update comes in from a route reflector client (let’s say R4), then the R3 device “reflects” this update to its other clients (R5 and R6) as well as its non-clients (R1 and R2). All of this updating occurs even though our configuration for IBGP is well short of the full mesh of peerings that would ordinarily be required.
Now what if the update comes in from a non-route reflector client (R1)? The route reflector will send that update to all of its route reflector clients (R4, R5, and R6). But then R3 is going to follow the rules of IBGP, and in this case, it will not send an update via IBGP to the other non-route reflector client (R2).
In order to solve this issue, you would need to create a peering from R1 to the R2 device using IBGP. Or, of course, you could add R2 as a route reflector client of R3.
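The reflection rules just described can be summarized in a few lines of Python. This is only a sketch of the IBGP rules as described for this topology (it ignores EBGP peers and best-path selection), and the function name is hypothetical.

```python
def reflect_targets(source: str, clients: set[str], non_clients: set[str]) -> set[str]:
    """Return the IBGP peers to which a route reflector passes an update."""
    if source in clients:
        # From a client: reflect to the other clients and to all non-clients.
        return (clients | non_clients) - {source}
    if source in non_clients:
        # From a non-client: reflect to clients only (normal IBGP rule otherwise).
        return set(clients)
    raise ValueError(f"{source} is not an IBGP peer of this route reflector")


if __name__ == "__main__":
    clients = {"R4", "R5", "R6"}
    non_clients = {"R1", "R2"}
    print(sorted(reflect_targets("R4", clients, non_clients)))
    print(sorted(reflect_targets("R1", clients, non_clients)))
```

Running this for an update from R1 shows R2 missing from the result, which is exactly the gap that the extra R1-to-R2 peering (or making R2 a client) is meant to close.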
There is another way that we could attack the issue with IBGP scalability, and this is to manipulate the EBGP behavior. We do this with confederations. You just don't see confederations used as much as route reflection, and the reason is they are going to add quite a bit of complexity to your topology, and they can make troubleshooting more challenging. Figure 2 shows an example confederation topology.
Figure 2: A Sample Confederation Topology
We've got our AS 100 here. And what we do when we confederate is we go in and we make little sub autonomous systems inside of our main autonomous system. We would number these with, yes, you guessed it, private-use-only autonomous system numbers.
What we have is we are manipulating the EBGP behavior, because we are going to have confederation EBGP peerings that we can then configure between the appropriate devices that we want to use in these sub autonomous systems. As you might guess, they're not going to follow the same rules that our standard EBGP peerings would follow. Another important point is that this whole thing, to the outside non-confederated world, just looks like AS 100.
Inside, we see the actual AS sets that are inside there, and the confederated EBGP relationships between them. Other than eliminating the IBGP split horizon concern, what happens with the confederation EBGP peerings that make them different? Next hop behaviors have to change. The next hop does not change as we're going from one of these little confederations inside our AS to another confederation. Something else that happens is the local preference is going to be maintained between these different entities that we created for scalability. Also, the MED is going to be passed between those entities.
New AS Path segment types ensure that the confederation does not introduce a loop. AS_CONFED_SEQUENCE and AS_CONFED_SET are used as loop prevention mechanisms.
Example 6 shows a sample partial BGP confederation configuration.
Example 6: A Sample Partial BGP Confederation Configuration
You will often discover that you need to apply common policies to a large group of prefixes. This is made easy if you flag prefixes with a special attribute value called a community. Note that by themselves, community attributes do nothing to the prefixes other than to affix an identifier value. These are 32-bit values (by default) that we can name to provide extra meaning.
You can configure community values so that they are meaningful to your AS only, or meaningful to a set of ASes. You can also have a prefix that carries multiple community attribute values. It is also simple to add, change, or remove community values as needed inside your BGP topology.
Community attributes can be represented in several formats. The older format is as follows:
Decimal - 0 to 4294967200
Hexadecimal – 0x0 to 0xffffffa0
The newer format is AA:NN:
AA is a 16-bit number that represents your AS number, followed by NN, a 16-bit number that you define for policy significance within your own AS. So you might use 100:101 for AS 100, where 101 is the number you have assigned to an internal policy that you want applied to prefixes carrying this community value.
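To see how the two notations relate, here is a minimal Python sketch that converts between the AA:NN form and the single 32-bit number. The function names are made up for illustration; the arithmetic simply places the AS number in the upper 16 bits and the policy value in the lower 16 bits.

```python
def community_to_int(text):
    """Convert 'AA:NN' notation to the 32-bit community value."""
    asn, value = (int(part) for part in text.split(":"))
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def int_to_community(number):
    """Convert a 32-bit community value to 'AA:NN' notation."""
    if not 0 <= number <= 0xFFFFFFFF:
        raise ValueError("community must fit in 32 bits")
    return f"{number >> 16}:{number & 0xFFFF}"


print(community_to_int("100:101"))       # 6553701
print(hex(community_to_int("100:101")))  # 0x640065
print(int_to_community(6553701))         # 100:101
```

The community 100:101 from the text is therefore the same value as decimal 6553701, which is why the newer notation is so much easier to read.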
There are also well-known community values. These are:
No-export – prefixes are not advertised outside of the AS; you can set this value as you send a prefix into a neighbor AS in order to cause that neighbor AS to not advertise the prefix beyond its AS boundary
Local-AS – prefixes with this community attribute are never advertised beyond the local AS (or beyond the local sub-autonomous system when confederations are in use)
No-advertise – prefixes with this community attribute are not advertised to any peer
These well-known community attributes are simply identified by their reserved names.
There are also extended communities that you can use. These offer 64 bits for the identification of communities! Often, these bits are set so that you have a TYPE:VALUE configuration. An extended community setting might look like this:
As you might guess, we set (and act on) community values using route maps. Example 7 shows an example. Note that this example also makes use of a prefix list. These are often used in BGP for the flexible identification of many prefixes. They are much more flexible than access lists for this purpose. You specify the bits that must match with a flexible prefix notation, and then you can specify a flexible length of the subnet mask that accompanies the prefixes.
Example 7: Setting Community Values in BGP
Note: It is very easy to set communities and then forget to send them. Do not forget the send-community property as you see in Example 7.
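To make the prefix list matching described earlier a little more concrete, here is a rough Python model of a single prefix list entry. It assumes the usual ge/le semantics (with neither keyword the mask length must match exactly, ge alone matches from that length up to the maximum, and le alone matches from the entry's own length up to le). The function name and sample prefixes are assumptions for illustration only.

```python
import ipaddress


def prefix_list_match(candidate, entry, ge=None, le=None):
    """Return True if candidate matches a prefix-list entry such as
    10.0.0.0/8 with optional ge/le mask-length bounds."""
    cand = ipaddress.ip_network(candidate)
    base = ipaddress.ip_network(entry)
    if ge is None and le is None:
        low = high = base.prefixlen          # exact mask length required
    else:
        low = ge if ge is not None else base.prefixlen
        high = le if le is not None else cand.max_prefixlen
    if not low <= cand.prefixlen <= high:
        return False
    # The candidate's leading bits must match the bits given in the entry.
    return cand.subnet_of(base)


print(prefix_list_match("10.1.2.0/24", "10.0.0.0/8", ge=16, le=24))  # True
print(prefix_list_match("10.0.0.0/12", "10.0.0.0/8", ge=16, le=24))  # False
print(prefix_list_match("11.1.2.0/24", "10.0.0.0/8", ge=16, le=24))  # False
```

One entry can therefore describe a whole family of prefixes, which is where the flexibility over access lists comes from.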
That's going to wrap up Part 5 of our BGP series. Coming up in Part 6, we'll take a look at how we can work with BGP in IPv6. Take good care.
In this post, we're going to take a look at configuring BGP to advertise Network Layer Reachability Information (NLRI), and also the configuration of a BGP routing policy.
Before we begin advertising NLRI using the various commands in this section, let's take a moment to discuss an old feature of BGP that Cisco disables by default for you. The feature is called BGP synchronization. For proof that Cisco has disabled this feature on your device, just perform a show running-config on one of your lab BGP speakers, and under the BGP process you will find the command no synchronization. If enabled, the synchronization feature prevents a BGP speaker from using or advertising an IBGP-learned prefix unless there is a corresponding entry for the prefix in the underlying IGP (or static routes). This helps to prevent “black hole” situations where devices in the path are not running BGP and cannot forward to the BGP prefix because they lack a route to that prefix from their IGP. The feature is off by default now because of the many scalability mechanisms in BGP that permit you to configure an IBGP topology without the requirement for a full mesh of IBGP peers. Another reason it is turned off is that it somewhat encouraged the redistribution of BGP prefixes into the underlying IGP, which is not always a safe or desirable design practice.
There is a reason that Cisco is moving away from the use of the network command to configure IGPs at the CLI. It is never a good idea in programming to have a command that does very different things when used in different areas. This is the case with the network command. When used with an IGP, the command enables the protocol on an interface (and also affects which prefixes are advertised). With BGP, the network command does something different. It does not enable BGP on specific interfaces; instead, it takes a prefix that exists (somehow) on the local device and injects it into BGP.
While the prefix that you might advertise into BGP is most often found in your IGP's contribution to the routing table, you can use other techniques to generate the prefix for advertisement. For example, you could create a loopback interface that possesses the network prefix that you want to advertise. Or you could create a static route, or even a static route pointing to Null0.
The one tricky bit associated with the network command in BGP is that if the subnet mask for your prefix is not on a classful boundary (10.0.0.0/8 is an example of a prefix that is on one), then you need to remember to use the mask keyword and supply the correct mask when using the command. Example 1 shows the creation of two loopback interfaces and the advertisement of their prefixes into BGP. Notice this example also shows the verification of these prefix advertisements on the ATL router.
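The classful-boundary rule can be expressed as a small Python helper. This is a sketch, not router output: the function names and sample prefixes are assumptions, and the helper only builds the text of a network statement, adding the mask keyword when the prefix length differs from the classful default for that address.

```python
import ipaddress


def classful_prefix_length(network):
    """Return the classful mask length (8, 16, or 24) for an IPv4 network."""
    first_octet = int(network.network_address) >> 24
    if first_octet < 128:
        return 8    # Class A
    if first_octet < 192:
        return 16   # Class B
    if first_octet < 224:
        return 24   # Class C
    raise ValueError("Class D and E addresses have no classful mask")


def network_statement(prefix):
    """Build a BGP network statement, adding mask only when it is required."""
    net = ipaddress.IPv4Network(prefix)
    if net.prefixlen == classful_prefix_length(net):
        return f"network {net.network_address}"
    return f"network {net.network_address} mask {net.netmask}"


print(network_statement("10.0.0.0/8"))   # network 10.0.0.0
print(network_statement("1.1.1.0/24"))   # network 1.1.1.0 mask 255.255.255.0
```

A /24 carved out of Class A space is the classic case where forgetting the mask keyword leaves the prefix unadvertised.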
Example 1: Using the Network Command in BGP
While the network command is simple and convenient, it would not be efficient if you had many prefixes to advertise. Another option is to redistribute prefixes into BGP from an IGP or static routes. Example 2 demonstrates redistributing prefixes that have been learned via EIGRP into BGP. Note in the verification that the origin code for these prefixes appears as a (?), indicating an Incomplete origin.
Example 2: Redistributing Prefixes into BGP
When you start advertising NLRI into BGP, you might come across prefixes in your BGP table (shown with show ip bgp) that have an (r) status code instead of the expected valid status code (*). The r status code indicates a RIB failure, meaning that BGP tried to place the prefix in the IP routing table (the RIB), but could not due to some issue.
The most common reason for RIB failure is administrative distance (AD). For example, IBGP-learned prefixes carry a pretty terrible AD of 200. This means that if your router has learned the prefix through an IGP (even one as bad as RIP with an AD of 120), it will be preferred over the IBGP prefix. As a result of this AD issue, BGP does not mark the prefix as valid. Note that this tends not to happen with EBGP-learned prefixes as they have a very preferable AD of 20 (by default).
Often, if it is desirable to have the prefix in both the IGP and BGP, administrators will manipulate the AD values on their routers to improve the AD of IBGP. For example, in the case of RIP and BGP, the admin could set the AD of IBGP-learned routes to 119 in order to make them preferred over the IGP in use.
In addition to catching RIB failures in the results of show ip bgp, you can use the more direct command show ip bgp rib-failure in order to see any prefixes in this status. This is especially helpful in the case of massive BGP tables.
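The AD tie-break behind most RIB failures can be sketched in Python. The AD values for IBGP (200), RIP (120), and EBGP (20) come from the discussion above; the remaining entries are the usual Cisco defaults, and the function names are purely illustrative.

```python
# Default administrative distances; lower is more trusted.
DEFAULT_AD = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "rip": 120,
    "ibgp": 200,
}

BGP_SOURCES = ("ebgp", "ibgp")


def rib_winner(sources, ad=DEFAULT_AD):
    """Return the route source that gets installed for a prefix."""
    return min(sources, key=lambda source: ad[source])


def is_rib_failure(sources, ad=DEFAULT_AD):
    """True if BGP offers the prefix but another source wins the RIB."""
    has_bgp = any(source in BGP_SOURCES for source in sources)
    return has_bgp and rib_winner(sources, ad) not in BGP_SOURCES


print(is_rib_failure(["rip", "ibgp"]))         # True: RIP (120) beats IBGP (200)
print(is_rib_failure(["rip", "ebgp"]))         # False: EBGP (20) beats RIP (120)

tuned = dict(DEFAULT_AD, ibgp=119)             # the AD tweak described above
print(is_rib_failure(["rip", "ibgp"], tuned))  # False: IBGP (119) now wins
```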
Configuring BGP Routing Policy
It is fairly common to encounter topologies where you explicitly do not want to advertise prefixes in your BGP table, or you might not want to receive certain prefixes from a BGP peer. Fortunately, there are many tools at your disposal for doing this. For example, here are just some methods you could employ in order to filter prefixes:
AS Path filters
Example 3 demonstrates one of these filtering methods for you. I selected the route map approach, because everyone (rightly so) loves route maps. Notice how we block one of the two prefixes we would be learning from AS 100 using this approach.
Example 3: Using a Route Map as a Prefix Filter in BGP
Notice the use of the clear ip bgp * soft command before I do my verification. This ensures the device refreshes the BGP information right away for me so that I do not have to wait for timer expirations when it comes to converging BGP on the new policy manipulations we have made.
Remember, BGP uses many different path attributes instead of just a simple metric in order to provide you with the opportunity to easily tune and tweak the manner in which routing occurs. Just some of the path attributes you might manipulate in order to effect policy would be:
You might wonder how AS Path could be manipulated in order to affect routing. After all, the AS Path is set by the routers in the path as the prefix traverses the AS topology. AS Path manipulation is often done with AS Path Prepending. You poison a prefix by prepending your own AS number to the path in order to make a longer (less preferable) AS Path. Like most of our path attribute manipulations, this is easily accomplished using a route map.
Let us walk through an example of using Local Preference to manipulate policy. We often use Local Preference to influence how we will route traffic outbound to a BGP prefix. We do this by setting Local Preference values inbound on multiple paths. Before we get started, understand three things about Local Preference: it is examined fairly high up in the BGP best path decision process, a higher value is preferred, and the values are only passed in IBGP updates. That last point is how the LOCAL got into the name Local Preference.
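Here is a minimal Python sketch of why prepending works. It assumes the receiving router is comparing only AS Path length (every earlier attribute is tied), and the AS numbers and function names are hypothetical.

```python
def prepend(as_path, own_as, times):
    """Return a new AS Path with own_as added to the front 'times' times."""
    return [own_as] * times + list(as_path)


def shortest_as_path(paths):
    """Pick the named path with the fewest AS hops."""
    return min(paths, key=lambda name: len(paths[name]))


# The same prefix, originated by hypothetical AS 200, offered over two links.
offers = {"link_a": [200], "link_b": [200]}

# Prepend AS 200 twice on link_a to make that path look longer.
offers["link_a"] = prepend(offers["link_a"], 200, 2)

print(offers["link_a"])          # [200, 200, 200]
print(shortest_as_path(offers))  # link_b
```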
To begin, I have advertised the same prefix into AS 200 (ATL and ATL2) from the TPA1 and TPA2 routers of AS 100. Looking at Example 4, you can see that this prefix (192.168.1.0) can be reached using the next hop of 10.10.10.1 and that this is the preferred path. An alternate path that would be used if this path failed would be through next hop 10.21.21.1.
Example 4: Getting Ready to Use Local Preference
It is now time to have some fun and change this behavior with an example path attribute manipulation. My approach will be to identify the prefix we want to manipulate (192.168.1.0) and raise the Local Preference value to be greater than the default of 100 for the path to TPA2 at next hop 10.21.21.1. I do this by manipulating the prefix as it enters via the 10.21.21.1 path. Example 5 shows this configuration.
Example 5: Manipulating Paths with Local Preference
Notice the preferred path is now via the next hop 10.21.21.1 as we desired. The non-default Local Preference value of 110 also displays for that prefix. This higher value is preferred and changes the selection made by the BGP Best Path Selection process.
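The effect of this policy can be modeled with a short Python sketch. The prefix and next hop addresses are the ones used in this walkthrough; the function names are assumptions, and the model looks only at Local Preference, letting a tie fall to the first path listed as a stand-in for the later tie-breakers.

```python
DEFAULT_LOCAL_PREF = 100


def apply_inbound_policy(paths, prefix, next_hop, local_pref):
    """Return copies of the paths with local_pref set on the matching one."""
    result = []
    for path in paths:
        path = dict(path)
        if path["prefix"] == prefix and path["next_hop"] == next_hop:
            path["local_pref"] = local_pref
        result.append(path)
    return result


def preferred(paths):
    """Highest Local Preference wins; ties fall to the first path listed."""
    return max(paths, key=lambda path: path["local_pref"])


paths = [
    {"prefix": "192.168.1.0", "next_hop": "10.10.10.1",
     "local_pref": DEFAULT_LOCAL_PREF},
    {"prefix": "192.168.1.0", "next_hop": "10.21.21.1",
     "local_pref": DEFAULT_LOCAL_PREF},
]

print(preferred(paths)["next_hop"])   # 10.10.10.1
tuned = apply_inbound_policy(paths, "192.168.1.0", "10.21.21.1", 110)
print(preferred(tuned)["next_hop"])   # 10.21.21.1
```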
That's going to wrap up Part 4 of our BGP series. Coming up in Part 5, we'll take a look at BGP scalability mechanisms. Take good care.
Now, in this post, you'll learn about how BGP neighborships are formed, within an autonomous system, between autonomous systems, and even between routers that are not directly connected. Also, we'll check out BGP authentication.
Given that BGP is an AS-to-AS routing protocol, it would make good sense that external BGP (i.e. eBGP) is a key ingredient in its operations. The very first thing that we need to keep in mind with eBGP is that the standards are built so that there is a requirement for a direct connection. This is something that we can work around (of course), but this point is worth consideration. Because a direct connection is assumed, the BGP protocol does two things:
It uses a time-to-live (TTL) value of 1 for the session. This forces what appears to be a direct connection between EBGP peers.
There is a check made to make sure the two devices live on the same subnet.
Another important point of consideration with eBGP peerings is the TCP ports that are going to be in use. This is especially important with firewall configurations that are protecting autonomous systems, which is a frequent situation. The first BGP speaker to initiate the state changes that take place as the neighborship forms will source traffic from a randomized ephemeral TCP port, and the destination port will be TCP port 179. The responding BGP speaker will source traffic from TCP port 179, and the destination port will be the randomized ephemeral port. Firewalls must be adjusted to accommodate the changes in communication based on which BGP speaker initiates the session, and this could, of course, change for a future session. Some administrators will even put mechanisms in place to ensure the peerings formed are sourced from a known direction.
What about IPv6? Well, as stated earlier in a previous post, BGP is very flexible and welcoming of IPv6 since the protocol was originally engineered that way. You can form eBGP (and iBGP) peerings using IPv6 addressing even if you're carrying IPv4 prefixes for the network layer reachability information.
To form your eBGP peering, you do the following:
Start the routing process for BGP, and specify the local AS (router bgp local_as_number).
Provide the remote eBGP speaker's IP address and the remote AS number (neighbor ip_of_neighbor remote-as remote_as_number).
Example 1 demonstrates the configuration and verification of an EBGP peering between the TPA1 and the ATL routers.
Example 1: Configuring an eBGP Peering
NOTE: To assist you when learning BGP, you might want to enable the debug ip bgp feature as you are building a peering. This will allow you to see the transition states in the neighborship. Also, to gain even more information about neighborships, you can use the command show ip bgp neighbors.
Creating an IPv6-based eBGP peering is also simple. The only real change is that we go under the address family for IPv4 and activate the neighborship. Address families in Cisco routers for BGP permit you to run many different Network Layer Reachability Information (NLRI) schemes under the same overall BGP process. Example 2 demonstrates the IPv6 peering approach.
Example 2: An EBGP Peering Configuration Using IPv6
If you examine the topology closely, you might notice something that looks a bit strange. Why is there an iBGP peering created between TPA1 and TPA2? It looks completely unnecessary. Well, in this case, looks can certainly be deceiving. One of the main considerations you must master regarding BGP is the fact that there is something called the iBGP Split Horizon Rule. This rule states that no iBGP speaker can take in an update from one iBGP peer and then send that same update to another iBGP peer. Wow. This requires that we fully mesh our iBGP speakers in order to provide full awareness of prefixes.
Early on in the development of BGP, a full mesh of peerings was not a big concern, especially if you are only talking about three routers like in our example. Unfortunately, over time, this requirement became untenable, and a couple of mechanisms exist (namely route reflectors and confederations) to work around this rule.
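A quick calculation shows why the full mesh requirement stops scaling. In this Python sketch, the function names are made up; the full mesh count is the standard n(n-1)/2 formula, and the comparison assumes a single route reflector with every other router as its client.

```python
def full_mesh_sessions(routers):
    """Number of iBGP sessions needed to fully mesh the given routers."""
    return routers * (routers - 1) // 2


def single_reflector_sessions(routers):
    """Sessions needed when one router reflects for all of the others."""
    return routers - 1


for count in (3, 10, 100):
    print(count, full_mesh_sessions(count), single_reflector_sessions(count))
# 3 3 2
# 10 45 9
# 100 4950 99
```

Three routers need only three sessions, but one hundred routers need 4,950, and every one of them has to be configured by hand on both ends.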
Another important consideration with IBGP is redundancy. We certainly want to establish multiple physical links between devices, but what happens if the link being used for BGP fails? How can we have an automatic cutover to a peering using an alternate link?
An easy way to solve this is to implement loopback addresses, and use these addresses to peer. This is something we will often do with our BGP peerings, and it might require, depending on your vendor, some additional commands in order to get it to work. For example, with Cisco, we must specifically indicate that the source of the peering is the loopback IP address.
NOTE: Another important consideration when peering between loopback addresses in iBGP is that the loopback addresses must actually be reachable between the BGP speakers. This is where an Interior Gateway Protocol (IGP), such as OSPF or EIGRP, comes in very handy.
Example 3 shows the configuration of an iBGP peering between the TPA and TPA1 devices. Note we use the loopback approach in the event we want to add redundant links between the devices in the future.
Example 3: Configuring an iBGP Peering
During the eBGP peerings section of this post, we discussed the expectation by BGP that your neighbors be directly connected. During the iBGP section, we discussed the advantage of peering between loopbacks for redundancy. Now it is time to answer the question: What if your eBGP speakers are not directly connected? In fact, if we want to peer between loopbacks with eBGP to take advantage of potential redundancy, how would we do that, since the loopback interfaces are NOT directly connected with each other?
BGP solves this with the eBGP multihop option. With the eBGP multihop setting, you indicate the maximum number of permissible hops. This overrides the check by BGP for a TTL of 1 as we presented earlier in the post. But what about the requirement for a direct connection? BGP actually disables this check quietly in the background for us automatically when we use the eBGP multihop feature.
Example 4 demonstrates the eBGP multihop configuration between TPA1 and ATL. Multihop is needed here, because we are peering between the loopbacks of the devices.
Example 4: eBGP Multihop
Most organizations today add authentication to their BGP configurations in order to help protect them against a wide variety of attacks. Authentication helps to ensure that you do not just peer willingly with any BGP speaker out there. Admittedly, an unwanted peering is a bit tougher to pull off with BGP than with other routing protocols, since configuring a peering is a manual process that must be carried out on both of the devices wanting to peer. Even with this said, authenticating the devices (eBGP or even iBGP) is an excellent idea.
The great news is, Cisco makes it remarkably easy to add authentication to your peerings. You simply add a password (i.e. shared secret) on each device configured for the peering. Be sure to understand that this shared secret is going to appear in clear text (by default) inside of your configuration. You might want to consider using the service password-encryption command in order to do at least a weak encryption of those clear text passwords appearing in the router configuration.
Message Digest 5 (MD5) authentication is the result of your simple password configuration on the devices. Example 5 shows authentication added to the configurations for TPA1 and ATL.
Example 5: Configuring Authentication for your BGP Peers
That's going to wrap up Part #3 of our BGP series. Coming up in Part #4 we'll take a look at how we advertise Network Layer Reachability Information. Take good care.
Part 1 of our blog series on Border Gateway Protocol (BGP) gave you an overview of BGP and then delved into BGP message types and neighbor states. Now, in this post, you'll learn about one of the most challenging aspects of BGP, how it makes its path selection decision. While routing protocols such as RIP, OSPF, and EIGRP each have their own metrics used to pick the "best" path to a destination network, BGP uses a collection of path attributes (PAs).
BGP Path Attributes
When your BGP speaker receives a BGP prefix, there are going to be many path attributes tagged to it, and we know that these are going to be critical when it comes to BGP doing things like choosing a very best path to a destination. Interestingly, not all of these path attributes are created equal.
All BGP path attributes fall into one of four main categories. Note that this list also provides example attributes in each category. Do not be too concerned with these specific attribute values now, as you will understand many of them fully when you complete this blog series.
Well-Known Mandatory (for example: Origin, AS Path, and Next Hop)
Well-Known Discretionary (for example: Local Preference)
Optional Transitive (for example: Community)
Optional Non-Transitive (for example: MED)
Notice that two of the categories begin with the term well-known. Well-known means that all routers must recognize this path attribute. The two other categories begin with the term optional. Optional means that the BGP implementation on the device doesn't have to recognize that path attribute at all.
Then we have the terms mandatory and discretionary associated with the well-known term. Mandatory means that the update must contain that attribute. If the attribute does not exist, a notification error message will result, and the peering will be torn down. Discretionary, of course, would mean the attribute does not have to be in the update.
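With the optional attribute categories, we have transitive and non-transitive. If an attribute is transitive, the device needs to pass that path attribute on toward its next neighbor, even if the device itself does not recognize it. If it is non-transitive, a device that does not recognize the attribute can just ignore that attribute value and not pass it along.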
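The four categories can be summarized in a small Python model. The attribute-to-category table below uses well-known examples (Community as optional transitive and MED as optional non-transitive are standard classifications); the function names and return strings are illustrative assumptions, not protocol messages.

```python
ATTRIBUTE_CATEGORY = {
    "ORIGIN": "well-known mandatory",
    "AS_PATH": "well-known mandatory",
    "NEXT_HOP": "well-known mandatory",
    "LOCAL_PREF": "well-known discretionary",
    "COMMUNITY": "optional transitive",
    "MED": "optional non-transitive",
}


def missing_mandatory(update_attributes):
    """List the well-known mandatory attributes absent from an update."""
    required = {name for name, category in ATTRIBUTE_CATEGORY.items()
                if category == "well-known mandatory"}
    return sorted(required - set(update_attributes))


def handle_unrecognized(category):
    """What a speaker does with an attribute it does not recognize."""
    if category.startswith("well-known"):
        return "error"      # every implementation must recognize these
    if category == "optional transitive":
        return "forward"    # pass it along to the next neighbor
    return "discard"        # optional non-transitive: quietly ignore


print(missing_mandatory(["ORIGIN", "AS_PATH"]))        # ['NEXT_HOP']
print(handle_unrecognized("optional transitive"))      # forward
print(handle_unrecognized("optional non-transitive"))  # discard
```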
Example 1 shows the examination of several of the path attributes for a prefix that has been received by the TPA1 router from the ATL router. Note that we use the show ip bgp command in order to see this information that is stored in the BGP routing database. Specifically, this output shows the attributes of Next Hop, Metric (MED), LocPrf (Local Preference), Weight, and Path (AS Path).
Example 1: Viewing Some of the BGP Path Attributes for a Prefix
The Origin Attribute
The origin attribute in BGP is an attempt to record where a prefix came from. There are three possibilities when it comes to the origin for this attribute: IGP, EGP, and Incomplete. You can see from the legend in Example 1 that the codes Cisco uses for these origins are i, e, and ?. For the prefix shown in Example 1, you can see that the origin is IGP. This indicates that the prefix made its way into this topology thanks to the network command inside of the configuration of that source device. We will cover the network command in all of its glory later on in this series. The term IGP here assumes that the prefix came from an Interior Gateway Protocol entry. Let's say we have a prefix in our OSPF routing table, and then we use the network command inside of BGP to put it into the BGP ecosystem. Of course, IGPs are not the only source for prefixes that might carry this attribute. For example, you might create a local loopback interface on the device, then use the network command to have this local prefix advertised into BGP.
EGP is referencing the now antiquated Exterior Gateway Protocol, the predecessor to BGP. As a result, you are never going to see this origin code.
Incomplete means that BGP is unsure of exactly how the prefix was injected into the topology. The most common scenario here is that the prefix was redistributed into Border Gateway Protocol from some other protocol, typically an IGP.
It is a fair question to ask why the origin code is of such importance. The answer lies in the fact that it is a key factor when BGP is using its algorithm to select a best path to a destination in the network. It can break “ties” between multiple alternative paths in the network. We also give this attribute great attention, because it is indeed one of the well-known, mandatory attributes that must exist in our updates.
The AS Path Attribute
AS Path is a well-known mandatory attribute. It is absolutely critical for the best path decision, as well as loop prevention inside of Border Gateway Protocol.
Examining our topology shown in Figure 1, consider a prefix originated at TPA. This update is sent to TPA1, and TPA does not add (called prepending in BGP) its own AS of 100 in the AS Path, since the neighbor it is sending the update to is in its own AS per the iBGP peering.
Figure 1: A Sample BGP Topology
When TPA1 sends this update to ATL, it will prepend the AS number of 100 to the update. Following this logic, ATL can update ATL2 and will not prepend its own AS number. It is not until ATL2 sends this on to some other AS that it will prepend the AS of 200. This means that when we examine a sample AS path as shown in Example 2, the rightmost AS in the path is the AS that first originated the prefix (100), and the leftmost AS is the AS that delivered the prefix to the local device (342).
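The prepending walk just described can be traced with a short Python sketch. The router names and AS 100 and AS 200 come from the topology in this post; AS 300 is a hypothetical external AS added for illustration, and the function names are assumptions.

```python
def advertise(as_path, sender_as, receiver_as):
    """Return the AS Path as the receiver sees it.

    The sender prepends its own AS only when the update crosses an AS
    boundary (an eBGP session); iBGP leaves the path untouched.
    """
    if sender_as == receiver_as:
        return list(as_path)
    return [sender_as] + list(as_path)


def accepts(as_path, local_as):
    """Loop prevention: reject any path that already contains our own AS."""
    return local_as not in as_path


path = []                          # prefix originated on TPA in AS 100
path = advertise(path, 100, 100)   # TPA  -> TPA1 (iBGP): no prepend
path = advertise(path, 100, 200)   # TPA1 -> ATL  (eBGP): prepend 100
path = advertise(path, 200, 200)   # ATL  -> ATL2 (iBGP): no prepend
path = advertise(path, 200, 300)   # ATL2 -> hypothetical AS 300: prepend 200

print(path)                # [200, 100]
print(accepts(path, 300))  # True
print(accepts(path, 100))  # False: AS 100 sees its own AS and drops it
```

The originating AS ends up on the right and the delivering AS on the left, matching the way the path reads in show ip bgp output.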
Example 2: A Sample BGP AS Path
The Next Hop Attribute
It's really no surprise that a BGP prefix has an attribute called Next Hop. After all, a router is going to need to know where to send traffic for that prefix. The Next Hop attribute fulfills this need. An interesting point here, however, is the fact that Next Hop in BGP does not work exactly as it does in most IGPs. Also of note is that the rules change when you are examining iBGP versus eBGP.
With an Interior Gateway Protocol, when a device sends an update to its neighbor, the default next hop value is the IP address of the interface sending the update. This continues to be reset by each router as the update traverses the topology. The next hop takes on a simple “hop-by-hop” paradigm.
With BGP, when we have an eBGP peering and the prefix is sent, the Next Hop is indeed going to be (by default) the IP address of the eBGP speaker sending the update. However, this eBGP speaker's IP address is going to be retained as the Next Hop as the prefix is passed on from iBGP speaker to iBGP speaker. Very often we see the Next Hop attribute populated with an IP address that is not the device that handed us the update. It is really an address that represents the neighboring AS, which provided us with the prefix. So, it is correct to think of BGP as an “AS-to-AS” protocol instead of a “hop-to-hop” protocol.
This can cause some interesting issues. The main consideration here is that you must ensure all of your BGP speakers can reach the Next Hop value in the attribute so that they consider the path valid. Smartly, BGP speakers will consider a prefix invalid if they cannot reach the Next Hop value.
Fortunately, there is a workaround you can use in specific cases. You can take an iBGP device and instruct it to set itself as the Next Hop value whenever you need to. This is done with a manipulation of the peering using the neighbor command, as shown in Example 3.
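The default Next Hop rules and the workaround can be captured in a few lines of Python. This is a simplified sketch: the function name, parameters, and the 10.1.1.1 address are assumptions for illustration, and it models only the behavior described above.

```python
def next_hop_sent(current_next_hop, session, sender_ip, next_hop_self=False):
    """Return the Next Hop value carried in an update to a neighbor.

    session is "ebgp" or "ibgp". Over eBGP the sender sets itself as the
    Next Hop; over iBGP the value is retained unless next_hop_self is set.
    """
    if session == "ebgp" or next_hop_self:
        return sender_ip
    return current_next_hop


# The prefix arrives over eBGP from a speaker at 10.10.10.1.
hop = next_hop_sent(None, "ebgp", "10.10.10.1")
print(hop)  # 10.10.10.1

# Passed on over iBGP by a hypothetical router at 10.1.1.1: value retained.
print(next_hop_sent(hop, "ibgp", "10.1.1.1"))  # 10.10.10.1

# The same iBGP update with the workaround applied.
print(next_hop_sent(hop, "ibgp", "10.1.1.1", next_hop_self=True))  # 10.1.1.1
```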
Example 3: Manipulating the Default Next Hop Behavior in BGP
The BGP Weight Attribute
Weight is a very interesting attribute with BGP, because it is specific to Cisco. The good news is that since Cisco is such a giant in the industry, many other vendors will support the use of Weight as an attribute.
Weight is also one of the more unusual attributes, because the value is not passed to other routers. Weight is a value that is assigned to our prefixes as a locally significant value. Weight is a simple number in the range of 0 through 65535, and the higher the weight value, the higher the preference for that path. When the prefix is locally generated, it will get a weight of 32768. Otherwise, the default weight is 0 for a prefix.
How might Weight be used? It seems strange at first, since it is not passed to other BGP speakers. However, the answer is simple. Let’s say your router receives the same prefix from two different autonomous systems that it peers with. If the administrator wants to prefer one of the paths for whatever reason, they can manipulate the local Weight value on the preferred path and instantly influence BGP's best path decision process.
BGP Best Path Selection
As stated earlier, we know that with IGPs, we have a metric value that is key for determining the best path to a destination. In the case of OSPF, that metric is based on cost, which is based on bandwidth. With BGP, there are many path attributes that a prefix can have. These all lend themselves to the BGP Best Path Selection Algorithm. Figure 2 shows the steps (beginning at the top) that are used in the Cisco BGP selection of best paths.
Figure 2: The BGP Best Path Selection Algorithm
As you examine these path decision criteria, you might immediately question why it has to be this complicated. Remember, when we're dealing with something like the Internet, we want there to be as many “tuning knobs” as possible for BGP policy. We want to be able to control as much as possible how prefixes are being shared and preferred throughout such a large and complex network.
NOTE: Other vendors follow a similar order and consideration of path attributes, but there might be slight differences. For example, Juniper would have Juniper-only considerations in their algorithm, but the main order of the well-known, mandatory attributes would be about the same. For example, the shortest AS Path would come before something like the Origin type.
When preparing for a certification exam that covers BGP, you should make yourself aware of key facts about this algorithm. For example, it is critical to know that Cisco considers the highest Weight value before looking at any other path attribute.
It is also very important to realize that before this analysis of the best path can even take place against a prefix, there must be some checks the prefix must pass in order to be even compared against other paths. For example, as discussed earlier, the Next Hop must be accessible.
Once a prefix is considered valid, the analysis starts from the top down of these best path criteria. If there is a difference in Weight values for multiple paths of a prefix, the path with the highest Weight is the preferred path, and any further analysis stops. Notice also in this algorithm (as it is often referred to) that as you near the bottom of the list, the values compared are almost silly. They exist there just to break any ties that are resulting from earlier comparisons. After all, at least one path must be considered best.
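The top-down, first-difference-wins nature of the decision process lends itself to a short Python sketch. This covers only a simplified subset of the criteria (Weight, Local Preference, AS Path length, Origin, then MED), skips the locally originated check and the later tie-breakers, and compares MED across all paths for simplicity. The function and field names are assumptions for illustration.

```python
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # IGP beats EGP beats Incomplete


def make_path(name, **overrides):
    """Build a path record with default attribute values."""
    path = {"name": name, "next_hop_reachable": True, "weight": 0,
            "local_pref": 100, "as_path": [100], "origin": "i", "med": 0}
    path.update(overrides)
    return path


def best_path(paths):
    """Return the best valid path, or None if no path is valid."""
    candidates = [p for p in paths if p["next_hop_reachable"]]
    if not candidates:
        return None
    # Tuple comparison works top down: the first field that differs decides.
    return min(candidates, key=lambda p: (
        -p["weight"],               # highest Weight wins
        -p["local_pref"],           # then highest Local Preference
        len(p["as_path"]),          # then shortest AS Path
        ORIGIN_RANK[p["origin"]],   # then best Origin code
        p["med"],                   # then lowest MED
    ))


a = make_path("a")
b = make_path("b", local_pref=110, as_path=[300, 400, 100])
c = make_path("c", weight=500, local_pref=90)

print(best_path([a, b])["name"])     # b: higher Local Preference
print(best_path([a, b, c])["name"])  # c: Weight is checked first
```

Path b wins over path a despite its longer AS Path, because Local Preference is examined earlier, and path c beats both because Weight is examined before anything else.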
I hope you've enjoyed this look at BGP path attributes. In Part 3 of this series, you'll learn about how BGP neighborships are formed, within an autonomous system, between autonomous systems, and even between routers that are not directly connected. See you then.