Category Archives: 3. Data Center

Study notes on data center topics:
– Cisco UCS
– Cisco MDS
– Design
– Media Types

ACI Training Videos

ACI Training Videos help you understand how to deliver software flexibility to modern IT consumption models and ensure agile Data Center network environments. The ACI development team and senior ACI leaders share their experiences and knowledge about the challenges and best practices to maximize the power of scalable centralized automation solutions and policy-driven application profiles within your network environment.

https://learningnetwork.cisco.com/community/learning_center/aci-training-videos/videos?mkt_tok=eyJpIjoiWTJSa1pESmpPRFEyWm1VNSIsInQiOiJJSHNXR3Q5V1pyTkpoamFWM1pcL0UydkxZVndaTERiMWxSVVpyeFlzV1I4XC9Dc2p6MEdwb01GYVJ6Z2J6N2FWTEFIdCsxcWVnZlpLc2dPNXkwZG9TYkt2aklXVWVudVEzYlwvMm9kb29vcUpYeFwvanN1elIxaUVjOFdvUDliWE9lcUkifQ%3D%3D

OTV Deployment

Important OTV Features

Scalability
– Extends Layer 2 LANs over any network that supports IP
– Designed to scale across multiple data centers

Simplicity
– Supports transparent deployment over existing networks without redesign
– Requires minimal configuration commands (as few as four; see the sketch after this list)
– Provides single-touch site configuration for adding new data centers

Resiliency
– Preserves existing Layer 3 failure boundaries
– Provides automated multihoming
– Includes built-in loop prevention

Efficiency
– Optimizes available bandwidth by using equal-cost multipathing and optimal multicast replication
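
To give a feel for how small that configuration is, here is a minimal sketch, assuming a Nexus 7000 OTV edge device running in multicast mode, pushed with the netmiko Python library. The device name, credentials, join interface, site VLAN, multicast groups and extended VLAN range are all placeholders to adapt to your own transport.

```python
# Minimal sketch: push an OTV overlay configuration to a Nexus 7000 with netmiko.
# Hostname, credentials, interfaces, VLANs and group addresses are placeholders.
from netmiko import ConnectHandler

otv_config = [
    "feature otv",
    "otv site-vlan 99",                  # site VLAN used for AED election
    "otv site-identifier 0x1",           # unique identifier per data center site
    "interface Overlay1",
    # The "as few as four" commands live under the overlay interface:
    "otv join-interface Ethernet1/1",    # uplink toward the IP transport
    "otv control-group 239.1.1.1",       # multicast group for the OTV control plane
    "otv data-group 232.1.1.0/28",       # SSM range for multicast data traffic
    "otv extend-vlan 100-150",           # VLANs stretched between sites
    "no shutdown",
]

device = {
    "device_type": "cisco_nxos",
    "host": "n7k-otv-edge.example.com",  # placeholder edge device
    "username": "admin",
    "password": "secret",
}

conn = ConnectHandler(**device)
print(conn.send_config_set(otv_config))
conn.disconnect()
```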

https://www.cisco.com/c/en/us/products/collateral/routers/asr-1000-series-aggregation-services-routers/guide-c07-735942.html
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/nx-os/OTV/quick_start_guide/b-Cisco-Nexus-7000-Series-OTV-QSG.html

TRILL

Transparent Interconnection of Lots of Links:

http://searchnetworking.techtarget.com/definition/Transparent-Interconnection-of-Lots-of-Links-TRILL
http://slideplayer.com/slide/3425124/
https://www.cisco.com/c/en/us/about/press/internet-protocol-journal/back-issues/table-contents-53/143-trill.html

ACI Stretch Fabric Design

  1. Cisco recommends a minimum of 3 APIC servers?
    • Is this per site, or is it possible to have a distributed setup installed across different sites? The documentation says, “We only allow a single APIC in lab/test environments where redundancy is not required.” So does that mean you can still use a single server to control the devices? But then there is this statement: “When the connection between two sites is lost, the site with one APIC controller will be in the minority (site 2 in the figure above). When a controller is in the minority, it cannot be the leader for any shards. This limits the controller in site 2 to read only operations; administrators cannot make any configuration changes through the controller in site 2”

So does that mean I cannot use a single APIC server to control the infrastructure?

  2. Split-brain condition?
    • What does a split-brain condition mean? Does it apply to two APIC servers, or to multiple sites with distributed APICs?
    • Can you give a simple scenario in which split brain occurs?

Answer:

See Cisco white paper c11-737855 and the references below:

https://supportforums.cisco.com/t5/application-centric/increasing-apic-size-and-split-brain-condition/td-p/3185120
http://policyetc.com/post/aci-stretched-fabric-design
https://aci-troubleshooting-book.readthedocs.io/en/latest/apic.html#majority-and-minority-handling-clustering-split-brains
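
A quick way to see which controllers are healthy and whether the cluster is degraded is to read the cluster membership objects over the APIC REST API. The sketch below is illustrative only: the APIC address and credentials are placeholders, and it assumes the infraWiNode class and its health/operSt attributes described in the troubleshooting guide linked above.

```python
# Hedged sketch: read APIC cluster membership and health over the REST API.
# APIC address and credentials are placeholders; TLS verification is disabled
# only to keep the example short.
import requests

APIC = "https://apic1.example.com"
AUTH = {"aaaUser": {"attributes": {"name": "admin", "pwd": "secret"}}}

session = requests.Session()
session.verify = False

# Authenticate; the APIC returns a token that is kept as a session cookie.
session.post(f"{APIC}/api/aaaLogin.json", json=AUTH).raise_for_status()

# Query the cluster membership objects (assumed class name: infraWiNode).
resp = session.get(f"{APIC}/api/node/class/infraWiNode.json")
resp.raise_for_status()

for obj in resp.json().get("imdata", []):
    attrs = obj["infraWiNode"]["attributes"]
    # health is expected to read e.g. fully-fit or data-layer-partially-diverged
    print(attrs.get("id"), attrs.get("addr"), attrs.get("health"), attrs.get("operSt"))
```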

APIC/Fabric Discovery Process II

In this discovery process, a fabric node is considered active when the APIC and node can exchange heartbeats through the Intra-Fabric Messaging (IFM) process. The IFM process is also used by the APIC to push policy to the fabric leaf nodes.

Fabric discovery happens in three stages. The leaf node directly connected to the APIC is discovered in the first stage. The second stage of discovery brings in the spines connected to that initial seed leaf. Then the third stage processes the discovery of the other leaf nodes and APICs in the cluster.

The diagram below illustrates the discovery process for switches that are directly connected to the APIC. Specific verification for the other parts of the process is covered in the troubleshooting guide linked at the end of this post.

The steps are:

1. Link Layer Discovery Protocol (LLDP) neighbor discovery
2. Tunnel End Point (TEP) IP address assignment to the node
3. Node software upgrade, if necessary
4. Policy Element IFM setup

apicd05

During fabric registration and initialization a port might transition to an “out-of-service” state. Once a port has transitioned to an out-of-service status, only DHCP and CDP/LLDP protocols are allowed to be transmitted. Below is a description of each out-of-service issue that may be encountered:

fabric-domain-mismatch – Adjacent node belongs to a different fabric
ctrlr-uuid-mismatch – APIC UUID mismatch (duplicate APIC ID)
wiring-mismatch – Invalid connection (leaf to leaf, spine to non-leaf, leaf fabric port to non-spine, etc.)
adjacency-not-detected – No LLDP adjacency on the fabric port

Ports can go out of service due to wiring issues. Wiring issues are reported through the lldpIf object; information on this object can be browsed at the following location in the MIT: /mit/sys/lldp/inst/if-[eth1/1]/summary.
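
To check the same things from the API side, you can list the fabricNode objects (their fabricSt attribute shows whether a node is still being discovered or is already active, and address shows the assigned TEP) and then pull the lldpIf object for the port in question. This is only a sketch: the APIC address, credentials, node ID and interface are placeholders, and the exact wiring-issue attribute name on lldpIf is an assumption based on the description above.

```python
# Hedged sketch: verify fabric discovery state and inspect LLDP wiring issues
# through the APIC REST API. Address, credentials, node and port are placeholders.
import requests

APIC = "https://apic1.example.com"
AUTH = {"aaaUser": {"attributes": {"name": "admin", "pwd": "secret"}}}

s = requests.Session()
s.verify = False
s.post(f"{APIC}/api/aaaLogin.json", json=AUTH).raise_for_status()

# 1. Every registered fabric node, with its role, TEP address and fabric state.
resp = s.get(f"{APIC}/api/node/class/fabricNode.json")
resp.raise_for_status()
for obj in resp.json()["imdata"]:
    node = obj["fabricNode"]["attributes"]
    print(node["id"], node["name"], node["role"], node["address"], node["fabricSt"])

# 2. LLDP state on one fabric port of a leaf (node 101, eth1/1 in this example).
dn = "topology/pod-1/node-101/sys/lldp/inst/if-[eth1/1]"
resp = s.get(f"{APIC}/api/node/mo/{dn}.json")
resp.raise_for_status()
for obj in resp.json()["imdata"]:
    attrs = obj["lldpIf"]["attributes"]
    # Assumption: wiring problems surface in a wiringIssues attribute on lldpIf.
    print(attrs["id"], attrs.get("adminSt"), attrs.get("wiringIssues", "<none>"))
```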

http://aci-troubleshooting-book.readthedocs.io/en/latest/fabric_init.html

APIC/Fabric Discovery Process I

When you cable up your ACI environment:
– Connect spines to leaves and leaves to spines.
– Do not connect spines to spines (S-S) or leaves to leaves (L-L).
– Plug in your APIC controller.

You can then connect to the APIC web interface and start the discovery process. This is a zero-touch fabric, which means we don’t need to configure any of the switches in the environment; the controller does it all for us.

ApICdisco99
When the APIC discovers the first device, you:
– Give it a name
– Give it a node ID (number)
– Make it part of the fabric

Actual Administrative Console

1. Connect via HTTPS and log in.
apicd01

2. Discover Devices (Fabric > Fabric Membership).
apicd02
(First discovered device)

3. Add the device to the network/fabric (double-click the first device).
apicd03
Give it a name and node ID, then click Update.

The APIC then creates a secure connection to the leaf and performs all of the configuration: Layer 2 and Layer 3 setup, IP addressing (the leaf obtains its IP address from the APIC using DHCP), adding the node to the topology, and pulling event information. As you register additional leaves and spines during the discovery process, the system continues to configure those devices.
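
The same "give it a name and node ID" step can also be done through the REST API instead of the GUI, by posting a fabricNodeIdentP object to the controller's node identity policy. The sketch below is only an illustration; the APIC address, credentials, serial number, node ID and switch name are placeholders.

```python
# Hedged sketch: register a discovered switch (serial -> node ID + name) via the
# APIC REST API. Address, credentials, serial, node ID and name are placeholders.
import requests

APIC = "https://apic1.example.com"
AUTH = {"aaaUser": {"attributes": {"name": "admin", "pwd": "secret"}}}
SERIAL, NODE_ID, NAME = "SAL1234ABCD", "101", "leaf-101"

s = requests.Session()
s.verify = False
s.post(f"{APIC}/api/aaaLogin.json", json=AUTH).raise_for_status()

payload = {
    "fabricNodeIdentP": {
        "attributes": {
            "dn": f"uni/controller/nodeidentpol/nodep-{SERIAL}",
            "serial": SERIAL,     # serial shown under Fabric > Fabric Membership
            "nodeId": NODE_ID,    # the "number" given in step 3 above
            "name": NAME,         # the switch name
        }
    }
}

# Posting to the node identity policy is the API equivalent of clicking Update.
resp = s.post(f"{APIC}/api/node/mo/uni/controller/nodeidentpol.json", json=payload)
resp.raise_for_status()
print("Registered", NAME, "as node", NODE_ID)
```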

The system automatically documents the topology of the network (Fabric > Topology).
apicd04

2 tier over 3 tier architecture

My question about traditional versus new-era data center network architecture.

I saw an article saying we are moving toward a 2-tier architecture in our data center infrastructure, since we now have a lot of east-west traffic and the traditional 3-tier design is insufficient for it.

The DC 2-tier setup was:
N9K platform <– Core
N9K with N2K <– Agg/Access

We used to see 2-tier/collapsed-core architecture in small and medium campuses.

The data center uses a Clos fabric made up of spine nodes and leaf nodes. The Layer 2 and Layer 3 outside links (connecting to domains outside of the Clos fabric), as well as the firewalls and other service insertion, are typically handled on a leaf node (although I think there are cases where some of those things are supported on the spine).

Generally speaking, spine nodes only connect to leaf nodes and leaf nodes only connect to spines (within the backbone of the fabric). This architecture offers equal-cost multipathing, non-blocking links, predictable latency, and so on.

The main difference between them is the massive bandwidth you get with a Clos (leaf-spine-leaf) topology compared with a hierarchical topology (3-tier core/aggregation/access). Because every leaf is connected to every spine, a leaf's transport capacity into the fabric is the sum of its links to all the spines.

So, if you have 4 leaves with 40 Gbps links to the spines, and 2 spines, then every leaf has 80 Gbps of transport capacity into the fabric (40 Gbps per link, per spine), as the quick calculation below shows.
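
Here is that arithmetic as a tiny script; the 40 Gbps links, 2 spines and 4 leaves are just the numbers from the example above.

```python
# Fabric capacity for the example above: every leaf has one uplink to every spine.
link_gbps = 40   # speed of each leaf-to-spine link
spines = 2
leaves = 4

per_leaf_capacity = link_gbps * spines                 # 40 Gbps x 2 spines = 80 Gbps per leaf
total_leaf_spine_links = link_gbps * spines * leaves   # 320 Gbps of leaf-spine links overall

print(f"Each leaf can push {per_leaf_capacity} Gbps into the fabric")
print(f"The fabric has {total_leaf_spine_links} Gbps of leaf-spine link capacity in total")
# Adding a spine (or faster links) raises every leaf's capacity without any leaf-to-leaf cabling.
```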

spl917

Also, there are deployments where the spines are connected as border devices, as seen in Multi-Pod deployments. Usually the rule of thumb is that only leaves connect to the spines and nothing else goes there, but with this implementation you can merge two pods and connect them through the spines. I believe it started to be supported in ACI version 2.2, and the spine/leaf switches must support the IPN (only supported on second-generation spines/leaves).

https://learningnetwork.cisco.com/thread/112163

Increasing APIC Size

Why is it recommended to use a minimum of three APIC servers?

To understand why three APICs is the recommended minimum, you must understand how the APICs distribute information among themselves. All ACI data sets are generated and processed by the Distributed Policy Repository, and that data is partitioned into logically bounded subsets called shards (like database shards). Each shard is then broken into three replicas, or copies. Each APIC holds a replica of every shard, but only one APIC is the leader for a particular shard. This distributes the workload evenly and load-balances processing across the cluster of three, and it also acts as a fail-safe in case an APIC goes down.

Now that the theory is out of the way, imagine one of your three APICs goes down. The remaining two will negotiate who will now be the leader for the shards that the failed APIC was in charge of. The workload is then load-balanced across the two, and the cluster becomes fully fit again. Running with only two APICs is really not advised because of the split-brain condition: this occurs when APIC 1 and APIC 2 both think they are the leader for a shard and cannot agree, so the shard is in contention and the cluster is unfit (“data layer partially diverged”). With the cluster in this state it is not advisable to make changes in the GUI; I don’t remember if it's even allowed.

In the case of only one APIC, that APIC does all the work; it is the leader for all shards, but if it goes down you cannot make any changes at all. The data plane will continue forwarding, but with no APIC there is no way to create new policies or make changes.
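
As a rough illustration of the shard and replica idea described above (a toy model, not how the Distributed Policy Repository is actually implemented), the sketch below spreads shard leadership across three controllers, re-elects leaders when one fails, and shows why a two-controller cluster can end up in a split-brain tie.

```python
# Toy model of shard leadership in an APIC cluster. Illustration only; the real
# Distributed Policy Repository works differently under the hood.

SHARDS = [f"shard-{i}" for i in range(1, 10)]   # pretend the ACI data is split into 9 shards

def elect_leaders(shards, alive_apics):
    """Assign one leader per shard, spread round-robin across the reachable APICs."""
    if not alive_apics:
        return {}
    return {shard: alive_apics[i % len(alive_apics)] for i, shard in enumerate(shards)}

# Healthy cluster: three APICs, leadership spread evenly.
print("healthy:", elect_leaders(SHARDS, ["apic1", "apic2", "apic3"]))

# One APIC fails: the remaining two take over its shards and the cluster heals.
print("apic3 down:", elect_leaders(SHARDS, ["apic1", "apic2"]))

# Split brain with only two APICs: the link between them fails, each side still
# considers itself alive and elects itself leader for every shard. With no third
# replica to break the 1-vs-1 tie, every shard is in contention.
side_a = elect_leaders(SHARDS, ["apic1"])
side_b = elect_leaders(SHARDS, ["apic2"])
conflicts = [s for s in SHARDS if side_a[s] != side_b[s]]
print(f"{len(conflicts)} of {len(SHARDS)} shards have two would-be leaders (split brain)")
```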

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/b_KB_Cluster_Management.html#task_3F7041739BD147B3A3BA9C2EA42115F8

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/b_kb-aci-stretched-fabric.html

http://aci-troubleshooting-book.readthedocs.io/en/latest/apic.html#majority-and-minority-handling-clustering-split-brains