CEF polarization?

Hi guys,

At my workplace we ran into a load-balancing issue.
We have two ASR9Ks with four 10-gig interfaces between
them. No cost modification was done, and the routers run OSPF and IS-IS
as IGPs. The issue was that traffic was flowing HEAVILY through
one of the links. We raised a TAC case and the engineer issued
ip cef loadbalancing algorithm adjust <value>
He said it may be because of uneven flows of varying bandwidth.
He tuned the value to 9, and after some hours the utilization
got slightly better.

As I understand it, CEF works like this:
for a single source-destination IP pair, the algorithm
XORs the bits of the source and destination addresses and chooses
the link to be used. We are using the default method,
per-destination load sharing.
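
A minimal sketch of the simplified model described above, for illustration only. The real ASR9K hash is proprietary and more involved; `pick_link` and the sample addresses are hypothetical.

```python
import ipaddress

def pick_link(src: str, dst: str, n_links: int) -> int:
    """Toy model: XOR the two IPv4 addresses and reduce modulo the link count."""
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    return (s ^ d) % n_links

# Every packet of a given source/destination pair lands on the same link,
# while a different pair may land on another one.
link_a = pick_link("10.0.0.1", "192.168.1.10", 4)
link_b = pick_link("10.0.0.2", "192.168.1.10", 4)
print(link_a, link_b)
```

In this model the choice depends only on the address pair, so one very heavy pair can load a single link no matter how many links exist.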

After we encountered this issue, the TAC engineer told me
that it was a CEF polarization problem.
I tried to understand the methods to overcome this problem:
TAC said we can include ports in the hash, and some blogs suggested
using the universal ID algorithm.

My question is:

1) Can anyone explain, with an example,
how we can avoid CEF polarization in our network?

Some help would be appreciated...

Comments

  • Your load-sharing mechanism should be based on your traffic patterns. Say you have an internal web server on your network that is upstream from your ASR9K, and all your users are downstream from it. If your load-sharing mechanism is set to per-destination IP, it would make absolutely no difference to the traffic flowing toward that server: ALL of it would go through only one of the 10-gig interfaces. If you look at the options for your EtherChannel, shown below, src-ip would be better suited, and you would get a lot more load sharing over the links in your EtherChannel.

     

    R2(config)#port-channel load-balance ?
      dst-ip       Dst IP Addr
      dst-mac      Dst Mac Addr
      src-dst-ip   Src XOR Dst IP Addr
      src-dst-mac  Src XOR Dst Mac Addr
      src-ip       Src IP Addr
      src-mac      Src Mac Addr
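
The difference between dst-ip and src-ip for the single-server case can be sketched roughly as follows. This is a toy model, assuming the link is simply the chosen address modulo the number of links; `link_for`, the server address, and the client range are hypothetical.

```python
import ipaddress
from collections import Counter

def link_for(address: str, n_links: int) -> int:
    """Toy hash: the IPv4 address as an integer, modulo the link count."""
    return int(ipaddress.IPv4Address(address)) % n_links

server = "10.1.1.10"                                # single upstream web server
clients = [f"10.2.0.{i}" for i in range(1, 101)]    # 100 downstream users

# Traffic from the clients toward the server, hashed two different ways.
by_dst = Counter(link_for(server, 4) for _ in clients)   # key: destination IP
by_src = Counter(link_for(c, 4) for c in clients)        # key: source IP

print("dst-ip:", dict(by_dst))
print("src-ip:", dict(by_src))
```

With one destination, hashing on the destination alone gives every flow the same key, so only one link is used; hashing on the source spreads the 100 clients over all four links.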

  • We are not using port channels in our scenario.

    If my understanding is right, CEF load balancing (i.e. per-destination load sharing)

    works only by XORing the last bits of the source and destination IP addresses and choosing the outgoing link according to the result of the XOR operation.

    Am I right?

  • Have you read the following link:

    http://www.cisco.com/c/en/us/support/docs/ip/express-forwarding-cef/116376-technote-cef-00.html

    Thanks

     

     



     

  • Yes, I read it, but I'm looking for more explanation of the part about overcoming polarization...

  • peetypeety ✭✭✭

    I'm not familiar with the ASR9K platform in particular, but I vaguely recall that the CEF hashing algorithm is a 17-way function. That suggests to me that it only XORs 17 possible bits, not 96+ possible bits (32-bits source address, 32-bits dest address, 16-bits source port, 16-bits dest port, who knows what else factors in). I suspect the thingy that TAC had you change somehow manipulates which bits are compared; perhaps 17 bits is all the linecard CPUs can justify comparing at blazing rates, but choosing WHICH bits is still configurable.

    That's my guess, and I'm sticking to it. ;)

  • CEF polarization is the result of inefficient hashing of a set of flows received from a preceding router.

    When the hashing function computed for a set of flows produces the same result that the preceding neighbor obtained, traffic may not be distributed equally across multiple paths, and per-destination CEF load balancing fails.

    To overcome this, you should tell the router to consider other elements so that it computes an even distribution of flows across its paths. A good idea is to use the universal algorithm, which is an improvement on the original algorithm.

    You could also use the L4 port algorithm, which takes ports into account as well and increases the chances of producing different hash values.

    However, I have never worked on ASR9K platforms, so I don't know which algorithms are available there. My answer addresses the issue from a general perspective.
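
As a rough sketch of why adding L4 ports helps, consider many flows between a single pair of hosts. The XOR-based hashes below are stand-ins, not the real CEF functions, and `hash_l3`, `hash_l4`, and the sample flows are hypothetical.

```python
from collections import Counter

def hash_l3(src: int, dst: int) -> int:
    """Stand-in hash over the IP addresses only."""
    return src ^ dst

def hash_l4(src: int, dst: int, sport: int, dport: int) -> int:
    """Stand-in hash that also folds in the TCP/UDP ports."""
    return src ^ dst ^ sport ^ dport

SRC, DST = 0x0A000001, 0x0A000002        # 10.0.0.1 -> 10.0.0.2
# 100 flows between the same two hosts, differing only in source port.
flows = [(SRC, DST, sport, 443) for sport in range(40000, 40100)]

l3_links = Counter(hash_l3(s, d) % 4 for s, d, _, _ in flows)
l4_links = Counter(hash_l4(s, d, sp, dp) % 4 for s, d, sp, dp in flows)

print("IP only:   ", dict(l3_links))
print("IP + ports:", dict(l4_links))
```

With addresses only, all 100 flows share one hash input and one link. Once ports are part of the input, the same flows spread across all four links.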

  • CEF polarization is caused by routers all making the same hashing decision as traffic flows through the network.

    Like this......

    Router.A is doing ECMP for subnet 1.1.1.0 to Router.B and Router.C. Based on the hashing algorithm, it'll send some traffic towards B and some towards C.

    The problem is that if Router.B also has two ECMP options and applies exactly the same algorithm, it sends everything out one path only.

    It's almost as if CEF sends all the even numbers out interface 1 and the odd numbers out interface 2; if someone upstream filters out the even numbers, you'll only use interface 2.

    So.....

    To fix this, Cisco brought out the Universal Algorithm. It inserts a router's unique 4-byte 'Universal ID' into the hash, which offsets it so that each router hashes slightly differently.

    The Universal Algorithm has been further enhanced with the L4 Port Algorithm and the Tunnel Algorithm.

    The best fit will depend on the sort of traffic you're seeing polarized: is it towards a specific destination...or port...or from a specific host..?

    The CCIE Routing and Switching v5.0 Official Cert Guide Vol 1 explains it quite well.
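
The Router.A / Router.B scenario above can be simulated with a small sketch. SHA-256 is used purely as a stand-in for the proprietary CEF hash, and `ecmp_path`, `paths_used_on_b`, and the router ID values are hypothetical.

```python
import hashlib
from collections import Counter

def ecmp_path(src: str, dst: str, n_paths: int, router_id: int = 0) -> int:
    """Pick a path from a hash of (router_id, src, dst).

    router_id=0 on every router models the original algorithm, where all
    routers hash identically; a distinct value per router models a
    per-router 'universal ID' seed.
    """
    data = f"{router_id}|{src}|{dst}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# 512 flows from different sources towards one host in subnet 1.1.1.0.
flows = [(f"10.0.{i // 256}.{i % 256}", "1.1.1.10") for i in range(512)]

def paths_used_on_b(id_a: int, id_b: int) -> Counter:
    """Router.A sends path-0 flows to Router.B; count B's own path choices."""
    to_b = [(s, d) for s, d in flows if ecmp_path(s, d, 2, id_a) == 0]
    return Counter(ecmp_path(s, d, 2, id_b) for s, d in to_b)

same_hash = paths_used_on_b(0, 0)                      # identical algorithm
unique_ids = paths_used_on_b(0xA1A1A1A1, 0xB2B2B2B2)   # per-router seed

print("same algorithm:", dict(same_hash))
print("unique IDs:    ", dict(unique_ids))
```

When both routers hash identically, every flow that Router.A sent to Router.B already hashed to path 0, so Router.B picks path 0 for all of them. With a different seed on each router, Router.B's decision is no longer tied to Router.A's and both of its paths carry traffic.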

  • The number of bits chosen depends on the type of line card, Typhoon or Trident. That is what the command help states, as below:

     

    "<0-31>  Adjust count - configure up to 31 for Typhoon; up to 3 for Trident cards; Value will be masked on Trident (to make it <= 3) if this configured value is beyond 3"

     

     

  • Yup, I agree with you. I saw this in a Cisco document too, but it failed to explain things at the level I expected; it's just vague.

    I need a calculation-level example to understand it, because I am confused about which bits of the port number and IP address are chosen by default, how they are calculated, and how they produce the output link.
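
For a calculation-level picture, here is a minimal sketch of how an adjust count could change link selection. It assumes, as guessed earlier in this thread, that the adjust value changes which bits of a 32-bit flow hash are used, and that "masked on Trident" means a bitwise AND with 3. The real ASR9K implementation is not public, so `select_link`, the bit rotation, and the sample hash value are all hypothetical.

```python
def select_link(hash32: int, adjust: int, n_links: int, max_adjust: int = 31) -> int:
    """Rotate a 32-bit hash right by the adjust count, then pick a link.

    max_adjust=31 models Typhoon; max_adjust=3 models Trident, where a
    larger configured value is masked down (assumed here to be adjust & 3).
    """
    adjust &= max_adjust
    rotated = ((hash32 >> adjust) | (hash32 << (32 - adjust))) & 0xFFFFFFFF
    return rotated % n_links

flow_hash = 0xCAA8010B   # hypothetical hash of one flow

print(select_link(flow_hash, 0, 4))                 # no adjustment
print(select_link(flow_hash, 9, 4))                 # adjust 9, Typhoon-style
print(select_link(flow_hash, 9, 4, max_adjust=3))   # adjust 9 masked to 1
```

Under these assumptions the same flow hash maps to a different link depending on the adjust count, which would be enough to break the correlation with an upstream router that uses a different count.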

  • DaveHill, when I asked the TAC engineer whether we are using the universal unique-ID algorithm, anti-polarization weight, or ports to overcome CEF polarization, he shared this document by Alexander Thuijs:

    https://supportforums.cisco.com/document/111291/asr9000xr-loadbalancing-architecture-and-characteristics

    I tried to understand that document for 2 days, but it was way over my head. When I asked him to explain how it ACTUALLY works, he said, "CEF is implemented differently on different platforms, and as it is Cisco proprietary I cannot share the implementation and its internal workings."

    I really felt terrible about Cisco. ;)

  • peetypeety ✭✭✭

    Why are you trying to get so deep into this? Do you not know your traffic well enough to know if it's single-flow, a few flows, or lots of flows? As the TAC engineer said, it's different per platform (probably even different per linecard), but you also need to respect that it's proprietary, as showing you the CEF algorithm gives you a view into their memory management structures within TCAM and such, and that's definitely something they don't want in the hands of competitors.

    Do you ask your car manufacturer how long the spark plugs fire? Probably not. Would you get this frustrated if they wouldn't tell you? I hope not.

    Oh, if your ASR9k cards are built on the Trident chip, I do send my apologies. :D

  • Our ASR9Ks are running Typhoon-based line cards.

    I got curious because each time we face uneven link utilization, we raise a TAC case and they change the polarization value. After some "x" time the issue resolves temporarily, but not permanently; this type of problem keeps popping up occasionally in various parts of our network. So I thought, is there something wrong with the algorithm? That made me curious and I started to dig deep, but Cisco slammed the door in my face :P
