ipSpace.net Blog by Ivan Pepelnjak

It’s high time for another summer break (I get closer and closer to burnout every year - either I’m working too hard or I’m getting older ;).

Of course we’ll do our best to reply to support (and sales ;) requests, but it might take us a bit longer than usual. I will publish an occasional Worth Reading or Watch Out blog post, but don’t expect anything deeply technical for the next two months.

We’ll be back (hopefully refreshed and with tons of new content) in early September, starting with the network automation course on September 3rd and the VMware NSX workshop on September 10th.

In the meantime, try to get away from work (hint: automating stuff sometimes helps ;), turn off the Internet, and enjoy a few days in your favorite spot with your loved ones!


Daniel Teycheney attended the Spring 2019 Building Network Automation Solutions online course and sent me this feedback after completing it (and creating some interesting real-life solutions on the way):

I spent a bit of time the other day reflecting on how much I’ve learned from the course in terms of technical skills, and the amount I’ve learned has been great. I literally had no idea about things like Git, Jinja2, CI testing, or reading YAML files, and had only briefly seen Ansible before.

I’m not an expert now, but I understand these things and have real practical experience with these subjects, which has given me great confidence to push on and keep getting better.

Read more ...

When I was still at university, fourth-generation programming languages were all the rage, prompting us to make jokes along the lines of “the fifth generation will implement do what I don’t know how.”

The research team working in the Networked Systems Group at ETH Zurich, headed by prof. Laurent Vanbever, got pretty close. The description of their tool says:

Read more ...

Christoph Jaggi sent me this observation during one of our SD-WAN discussions:

The centralized controller is another shortcoming of SD-WAN that hasn’t really been addressed yet. In a global WAN it can and does happen that a region gets cut off due to a cut cable or an attack. Without a connection to the central SD-WAN controller, the part that is cut off cannot even communicate within itself, as there is no control plane…

A controller (or management/provisioning) system is obviously the central point of failure in any controller-based network, but we have to go beyond that and ask a simple question: “What happens when the controller cluster fails and/or when nodes lose connectivity to the controller?”

The worst-case scenario is the orthodox SDN architecture with the centralized control plane residing in the controller. While packet forwarding might continue to work until the flows time out, even ARP stops working.

Architectures based on a bit more operational experience, like the Big Switch fabric, can deal with short-term failures. Big Switch claims ARP entries reside in the edge switches, so they can keep ARP going even when the controller fails. It might also be possible to pre-provision backup paths in the network (see also: SONET/SDH) so the headless fabric can deal with link failures (but not link recoveries, because those require path recalculation). Dealing with external topology changes like VM migration is obviously mission impossible.

Some architectures deal with controller failure by falling back to traditional behavior. For example, ESXi hosts that lose connectivity with the NSX-V controller cluster enter controller disconnected mode in which they flood every BUM packet on every segment to every ESXi host in the domain. While this approach obviously works, try to figure out how much overhead (and wasted CPU cycles) it generates.

At the complete other end of the spectrum are systems with a traditional distributed control plane that use the SDN controller purely for management tasks. Cisco ACI immediately comes to mind - as I usually joke during my “NSX or ACI” workshops, you could turn off the APIC controller cluster when going home for the weekend and the ACI fabric would continue to work just fine.

Where are SD-WAN systems in this spectrum? We don’t know, because the vendors are not telling us how their secret sauce works. However, at least some vendors claim their magic SD-WAN controller replaces routing protocols, which means that controller failure might prevent edge topology changes from propagating across the network.

There’s also the nasty question of key distribution. In traditional systems like DMVPN, edge nodes exchange point-to-point keys with IKE and use shared secrets or pre-provisioned certificates to prevent man-in-the-middle attacks. In an SD-WAN system the controller might do the key distribution, in which case I wish you luck when you face a nasty WAN partition (or an AWS region failure if the controller runs in the cloud).
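To illustrate the difference: in a classic DMVPN design everything needed to build and re-key the tunnels lives in the device configuration itself, so no controller has to be reachable. Here’s a minimal IOS-style spoke sketch (addresses, key value, and interface names are made up):

  crypto isakmp policy 10
   encryption aes 256
   authentication pre-share
   group 14
  ! pre-shared key is provisioned locally - IKE needs no central controller
  crypto isakmp key DMVPN-SECRET address 0.0.0.0 0.0.0.0
  crypto ipsec transform-set DMVPN-TS esp-aes 256 esp-sha-hmac
   mode transport
  crypto ipsec profile DMVPN-PROF
   set transform-set DMVPN-TS
  !
  interface Tunnel0
   ip address 10.0.0.11 255.255.255.0
   ip nhrp network-id 1
   ip nhrp nhs 10.0.0.1
   ip nhrp map 10.0.0.1 192.0.2.1
   ip nhrp map multicast 192.0.2.1
   tunnel source GigabitEthernet0/0
   tunnel mode gre multipoint
   tunnel protection ipsec profile DMVPN-PROF

An SD-WAN edge node that gets its keys pushed from a controller might have no such fallback once the controller becomes unreachable.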

Summary: Things are never as rosy as they appear in PowerPoint presentations and demos. Figure out everything that could potentially go wrong (like WAN partitioning), try to find out from the product documentation what happens in those cases, and ask some really hard questions (or change the vendor) if the documentation is not useful. Finally, verify every claim a $vendor makes in a lab.


SD-WAN is the best thing that could have happened to networking according to some industry “thought leaders” and $vendor marketers… but it seems there might be a tiny little gap between their rosy picture and reality.

This is what I got from someone blessed with hands-on SD-WAN experience:

Read more ...

One of my readers sent me this question:

How can I learn more about reading REST API information from network devices and storing the data into tables?

Long story short: it’s like learning how to drive (well) - you have to master multiple seemingly unrelated tasks to get the job done.
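To give you a feel for how those tasks fit together, here’s a minimal Python sketch (the device address, credentials, and API endpoint are made up - every platform has its own URL structure and payload format) that pulls interface data from a REST API and stores it in a SQLite table:

  #!/usr/bin/env python3
  # Minimal sketch: fetch interface data from a (hypothetical) device REST API
  # and store it in a SQLite table. Adjust the endpoint and field names to match
  # whatever your platform actually returns.
  import requests
  import sqlite3

  URL = "https://192.0.2.1/api/v1/interfaces"        # hypothetical endpoint

  resp = requests.get(URL, auth=("admin", "secret"), verify=False, timeout=10)
  resp.raise_for_status()
  interfaces = resp.json()       # assuming a list of dicts with name/status keys

  db = sqlite3.connect("inventory.db")
  db.execute("""CREATE TABLE IF NOT EXISTS interfaces
                (device TEXT, name TEXT, status TEXT, description TEXT)""")
  db.executemany(
      "INSERT INTO interfaces VALUES (?, ?, ?, ?)",
      [("192.0.2.1", i["name"], i["status"], i.get("description", ""))
       for i in interfaces])
  db.commit()
  db.close()

Even this trivial example touches HTTP, authentication, TLS, JSON, SQL and Python data structures - which is exactly why it feels like several unrelated skills at once.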

Read more ...

One of the first things I realized when I started my Azure journey was that the Azure orchestration system is incredibly slow. For example, it takes almost 40 seconds to display six routes from a per-VNIC routing table. Imagine trying to troubleshoot a problem and having to cope with a 30-second delay on every single SHOW command. Cisco IGS/R was faster than that.
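For the record, the operation I’m talking about is something along the lines of the following Azure CLI command (resource group and NIC names are made up):

  az network nic show-effective-route-table --resource-group MyRG --name MyVM-nic --output table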

If you’re old enough you might remember working with VT100 terminals (or an equivalent) connected to 300 baud modems… where typing too fast risked getting the output out of sync, resulting in painful screen repaints (here’s an exercise for the youngsters: how long does it take to redraw an 80x24 character screen over a 300 bps connection?). That’s exactly how I felt using the Azure CLI - the slow responses I was getting were severely hampering my productivity.
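In case you don’t feel like doing the math, here’s the back-of-the-envelope version (assuming 10 bits on the wire per character - 8 data bits plus start and stop bits):

  80 x 24 = 1,920 characters per screen
  1,920 characters x 10 bits = 19,200 bits
  19,200 bits / 300 bps = 64 seconds

… which is not that far from a 40-second Azure response.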

Read more ...

I always love to hear from networking engineers who managed to start their network automation journey. Here’s what one of them wrote after watching the Ansible for Networking Engineers webinar (part of the paid ipSpace.net subscription, also available as an online course).

This webinar helped me a lot in understanding Ansible and the benefits we can gain. It is a big area to grasp for a non-coder, and this webinar was exactly what I needed to get started (in a lab), including a lot of tips and tricks and how to think. It was more fun than I expected, so I started with Python just to get a better grasp of programming and Jinja.

In early 2019 we made the webinar even better with a series of live sessions covering new features added to recent Ansible releases, from core features (loops) to networking plugins and new declarative intent modules.


One of my subscribers sent me an interesting puzzle:

One of my colleagues configured a single-area OSPF process in a customer VRF, but instead of using area 0, he used area 123 nssa. Obviously it works, but I was thinking: “What the heck, a single OSPF area MUST be in Area 0!”

Not really. OSPF behaves identically within an area (modulo stub/NSSA behavior) regardless of the area number…
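For reference, a configuration along those lines would look something like this IOS-style sketch (VRF name, process ID, and addressing are made up; the VRF itself is assumed to be defined elsewhere):

  router ospf 123 vrf CUSTOMER
   area 123 nssa
   network 10.1.0.0 0.0.255.255 area 123

There’s no area 0 anywhere, and as long as every interface in the VRF sits in the same area there is nothing to carry between areas - so a backbone area is simply not needed.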

Read more ...

In my quest to understand how much buffer space we really need in high-speed switches I encountered an interesting phenomenon: we no longer have a gut feeling for what makes sense, sometimes going as far as assuming that 16 MB (or 32 MB) of buffer space per 10GE/25GE data center ToR switch is another $vendor shenanigan focused on cutting costs. Time for another set of Fermi estimates.

Let’s take a recent data center switch using the Trident II+ chipset and having 16 MB of buffer space (source: the awesome packet buffers page by Jim Warner). Most switches using this chipset have 48 10GE ports and 4-6 uplinks (40GE or 100GE).
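As a quick sanity check before diving in (my own back-of-the-envelope numbers, based purely on the figures above), splitting the buffer evenly across the 10GE ports gives:

  16 MB / 48 ports ≈ 340 KB ≈ 2.7 Mbit per port
  2.7 Mbit / 10 Gbps ≈ 0.27 ms of buffering per port

… roughly a quarter of a millisecond of line-rate traffic per port. Keep in mind the buffer is dynamically shared, so a single congested port can temporarily grab a much larger chunk.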

Read more ...