BSR candidate RP

Can someone please explain how the candidate RP is elected when we have configured BSR? When does hash-mask-length come into the picture, and what is the use of hash-mask-length?

Comments

  • JoeM ✭✭✭

    Hi Mukuljoshi,

    Here is a good blog by Petr -- Understanding BSR Protocol

  • Thanks JoeM, I understand that a hash is calculated for every (RP, group) pair... my question is how these RPs are assigned to different groups...

  • Thanks JoeM, I understand that a hash is calculated for every (RP, group) pair... my question is how these RPs are assigned to different groups...

    In BSR, the candidate RP address, the hash-mask-length, and the multicast group address are hashed together for each group. The hash algorithm is run for each group address against each candidate RP, and a hash value is calculated. The candidate RP with the highest calculated hash value becomes the RP for that particular multicast group. All groups with the same seed hash value correspond to the same RP.
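
    For reference, the exact function appears in Petr's blog and in RFC 2362 (section 3.7). A minimal sketch in Python (illustration only -- the candidate RP addresses below are made up):

        # PIMv2 BSR hash, per RFC 2362 section 3.7:
        #   Value(G,M,C) = (1103515245 * ((1103515245 * (G & M) + 12345) XOR C) + 12345) mod 2^31
        import ipaddress

        def pim_hash(group, mask_len, c_rp):
            g = int(ipaddress.IPv4Address(group))
            c = int(ipaddress.IPv4Address(c_rp))
            m = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF  # hash-mask-length -> 32-bit mask
            return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31

        # The candidate RP with the highest value becomes the RP for this group.
        for c_rp in ("150.1.2.2", "150.1.3.3"):
            print(c_rp, pim_hash("239.1.1.1", 30, c_rp))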

    HTH

     

     

  • JoeM ✭✭✭


    Thanks JoeM, I understand that a hash is calculated for every (RP, group) pair... my question is how these RPs are assigned to different groups...

     

    For me, there are a couple of key points in Petr's blog.

     

    1.   The individual routers make the final decisions based on priority/hash value. Nothing is "assigned" by the BSR. Each router receives all of the Group-to-RP sets.

    From Petr's blog: ".....Ultimately, the bootstrap messages containing Group to RP-set mappings are received by every multicast router in the domain and used to populate their RP caches. It’s up to the routers to select the best matching RP from the sets advertised by the BSR router. It is important that all routers select the same RP for the same group, otherwise the multicast sources might miss receivers. In order to make full use of Group to RP-set information routers might want to select different RPs for different groups. As mentioned previously, the load balancing procedure must yield the same result on all routers, to maintain synchronous mapping. Here is how this distribution procedure works, in pseudo-code....."

     

    2.     show ip pim rp-hash <multicast group>

    With this command, we can see that each multicast router receives all of the Group-to-RP sets. It then chooses the RP with the lowest priority; among candidates with equal priority, the HIGHEST hash value wins.

    Take a look at the two examples at the end of Petr's blog.   Notice that the multicast router (non-RP/non-BSR) receives all RP info.  Then look at which RP it chooses for the given group.

    It is interesting to see how this is different from Auto-RP.

  • JoeM ✭✭✭

    delete.   double-posted for some reason

  • Thank you, Hari!!

    OK, so the highest hash is considered... and how is that useful in load balancing between different RPs?

    Suppose I have 4 RPs and I want to load balance between all 4 of them for different groups, so what should my hash-mask-length be?

    How should I choose the hash-mask-length? ...I think with hash-mask-length 30 we can load balance between 4 RPs; please correct me if I am wrong.

  • How should I choose the hash-mask-length? ...I think with hash-mask-length 30 we can load balance between 4 RPs; please correct me if I am wrong.


    To load balance between n RPs, the hash-mask-length to use is log2(n).

    What this means is that in order to spread the load between 2 RPs, use hash-mask-length 1; to spread the load between 4 RPs, use hash-mask-length 2; to spread the load between 8 RPs, use hash-mask-length 3; and so on and so forth. This is just mathematics.
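
    As a quick check on that arithmetic (a sketch only; note this gets refined later in the thread, because the four fixed leading bits of a multicast address turn out not to count toward the mask):

        # Bits needed to give n RPs their own group ranges: ceil(log2(n)).
        import math
        for n in (2, 4, 8, 10, 16):
            print(n, "RPs ->", math.ceil(math.log2(n)), "bit(s)")
        # Note that 10 RPs need 4 bits, not 3: log2(10) is about 3.32.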

  • Hi HS628 

    Finally, after some homework, I found this...

    The candidate RP is elected based on the criteria below (a code sketch of this order follows the formula):

    1) lowest priority

    2) highest hash value

    3) highest IP address

     

    If we want to load balance, then 2^(32 - hash_mask_length) = number of RPs for load balancing.
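
    A small sketch of that selection order in Python (my own illustration using the RFC 2362 hash; the candidate list is hypothetical):

        # Select the RP for a group: 1) lowest priority, 2) highest hash
        # value, 3) highest candidate RP IP address.
        import ipaddress

        def pim_hash(group, mask_len, c_rp):
            g = int(ipaddress.IPv4Address(group))
            c = int(ipaddress.IPv4Address(c_rp))
            m = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF
            return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31

        def select_rp(group, mask_len, candidates):  # candidates: [(ip, priority), ...]
            return min(candidates,
                       key=lambda rp: (rp[1],                                # 1) lowest priority
                                       -pim_hash(group, mask_len, rp[0]),    # 2) highest hash
                                       -int(ipaddress.IPv4Address(rp[0]))))  # 3) highest IP

        print(select_rp("239.1.1.1", 30, [("150.1.2.2", 0), ("150.1.3.3", 0)]))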

    Thank you everyone for your valuable answers !!

     


  • JoeM ✭✭✭

    To load balance between n RPs, the hash-mask-length to use is log2(n).

    What this means is that in order to spread the load between 2 RPs, use hash-mask-length 1; to spread the load between 4 RPs, use hash-mask-length 2; to spread the load between 8 RPs, use hash-mask-length 3; and so on and so forth. This is just mathematics.

    Hi HS628,

    I need some help with this, as my understanding was different.

    Rather than designating the number of RPs for load balancing, it provides that a single RP will get a given number of consecutive groups -- before the next block of groups goes to the next RP, and so on for the remaining RP candidates (an RP round-robin).

    Longer hash masks give more frequent RP-to-group changes (better load balancing).

      For example:

    with a /31, every RP candidate will get 2 consecutive group-sets

    with a /30, every RP candidate will get 4 consecutive group-sets

    with a /29, every RP candidate will get 8 consecutive group-sets

     

    Here is another thread discussing this same doubt.


    Re: 8.19 BSR - Multiple RP Candidates


     

    Thanks for any clarification on this. Much appreciated.

  • Joe,

     

    Load balancing is probably a misleading term to use in this context. The hash mask is used to give different groups different RPs. But the RP assigned to a group won't change -- it's determined by the hash algorithm and stays that way. The formula could be 32 - log2(n) instead of log2(n). Tied up now. Will try to come back to this later.

  • JoeM ✭✭✭

    HS628,

    Thanks for following-up on this. I would like to get this exact (no doubts).

    Here is a quote from the multicast section,  BSR - Multiple RP Candidates.

    "Using this “pseudo-random” selection procedure, the whole multicast address space is partitioned among different RPs. Every
    RP will get approximately 2^[32-hash-mask-length] groups assigned,
    provided that there is enough RPs to evenly distribute the load.
    "

     

    This is the crux of my doubt about this. I have seen the equation suggesting the hash-mask-length for a given number of RP-Candidates. But I am understanding it as it is stated above.

    If we have a hash-mask-length of 28, that would give

            2^[32 - 28] =  2^4 = 16 groups per RP  (round-robin)

    Then (I believe) it would round-robin back to the first RP-Candidate -- until the address-space is exhausted. 

     

    ====================

    Ahh-haa moment: 

    HS628, I think I now understand the difference in our explanations.  I agree that "Load-Balancing" is a bad name for this. 

    You are coming at the mask from the other direction, DIVIDING the address space evenly in large slices of consecutive groups.  

    My method above would round-robin consecutive groups to the RP's, until the address-space is exhausted.  

      rp1=16  rp2=16   rp3=16   rp4=16....ok start again....rp1=16  rp2=16....until exhausted.

     

    HS628 or Anyone,  feel free to correct this, if it is wrong.  I am going to try and lab this.

  • Joe,

     

    I don't have time to do a more formal write-up. Below are a few notes. Hope it helps. Corrections always welcome.

     

    ~~~~~~~~~~~~

    Hash-mask-length dictates the number of bits from the group address (starting from the left) to be included in the RP hash value computation. This computation is performed against all candidate RPs. The candidate RP yielding the highest hash value will be chosen as the RP.

    By default, zero (0) bits from the group address are used in the hash value computation, resulting in all groups sharing the same RP.

    In theory, a hash-mask-length of n will divide all group addresses into 2^n segments, with each segment sharing a distinct RP.

    In practice, hash-mask-lengths 1 through 4 will produce exactly the same result. This is because all multicast addresses must begin with the binary bit string 1110, which means the first 4 bits are the same.

    However, hash-mask-lengths 1 through 4 are not the same as hash-mask-length 0, and will most likely produce a different RP, because 1110 is not the same as nil.

    Therefore, to be more exact, given hash-mask-length n (n >= 4), the total multicast group space will be divided into 2^(n-4) segments, with each segment sharing a distinct RP.

    Finally, to be even more exact, the above is true only if there are enough RPs to go around; otherwise multiple segments will still need to share the same RP.
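
    These notes are easy to check against the hash sketch posted earlier (RP addresses invented; the sample groups are spread across the whole 224.0.0.0/4 space):

        # For each hash-mask-length, count the masked segments and how many
        # RPs actually win across sample groups spanning 224/4.
        import ipaddress

        def pim_hash(group, mask_len, c_rp):
            g = int(ipaddress.IPv4Address(group))
            c = int(ipaddress.IPv4Address(c_rp))
            m = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF
            return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31

        rps = ["150.1.2.2", "150.1.3.3", "150.1.7.7", "150.1.8.8"]
        groups = [f"{o1}.{o2}.0.1" for o1 in range(224, 240) for o2 in (0, 128)]
        for n in range(0, 9):
            m = (0xFFFFFFFF << (32 - n)) & 0xFFFFFFFF
            segments = {int(ipaddress.IPv4Address(g)) & m for g in groups}
            winners = {max(rps, key=lambda r: pim_hash(g, n, r)) for g in groups}
            print(f"mask {n}: {len(segments)} segment(s), {len(winners)} RP(s) in use")

    Lengths 0 through 4 collapse everything into one segment (one RP); length 5 splits the space in two; and so on, consistent with the 2^(n-4) rule above.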

  • To load balance between n RPs, the hash-mask-length to use is log2(n).

     



    log2(n) + 4 ?

     

     

  • JoeM ✭✭✭

    Hi HS628,

    Sorry. I am not really clear on your math.  If you have time, can you take a look at Task 8.19 (BSR - Multiple RP Candidates)?  The solution seems to reinforce what I am saying.

    Thanks

    task 8.19 partial requirements:

    •  R5 should distribute this information and instruct all routers to load balance multicast groups between the two RPs.
    •  Use the maximum possible hash mask length to evenly distribute the load across the RPs.

    partial answer:

                ip pim bsr-candidate lo0 31

    The solution explains this as an odd/even group distribution. I am calling it a round-robin until exhaustion of the multicast address space.

     

  • Joe,

     

    I think the WB got it backwards.

    Below is copied directly from the command ref:


    ip pim bsr-candidate



    hash-mask-length

     

    (Optional) Length of a mask (32 bits maximum) that is to be ANDed with the group address before the PIMv2 hash function is called. All groups with the same seed hash correspond to the same RP. For example, if this value is 24, only the first 24 bits of the group addresses matter. The hash mask length allows one RP to be used for multiple groups. The default hash mask length is 0.

     

     

     

    My own test results line up with Cisco's interpretation.

    If you need to split the load between only two RPs, a single bit from the group address will suffice. Hash-mask-length greater than 1 (actually, 5) doesn't hurt, but won't give you any extra benefit either.

    NOTE: By single bit, I mean the bit beyond the initial 1110 bits.

  • task 8.19 partial requirements:

    •  R5 should distribute this information and instruct all routers to load balance multicast groups between the two RPs.
    •  Use the maximum possible hash mask length to evenly distribute the load across the RPs.

     

    Maximum possible hash mask length would be 32! You can run some tests to see if that will give you the same result as 31. If the answer is yes, the WB is wrong; otherwise, I am right. LOL!!

  • JoeM ✭✭✭

    ok HS, 

    I felt challenged.  I have tested it with up to 5 RP's.  

    You are half right.  ;-)

    • the mask has no effect until a length of at least 5 (which divides the address space right down the middle)

     

    What I am saying is correct. 

    There is one caveat. The hash values definitely have some randomness built-in.


    • Due to the built-in randomness, a longer mask gives truer load balancing between RPs -- and is KIND OF a (random) round-robin using all RPs.
    • a shorter mask is still KIND OF round-robin
    • a shorter mask takes larger swaths of address space per RP before changing RPs.

     

    My conclusion is that a longer hash-mask-length gives truer and more visibly verifiable load balancing. The randomness does not guarantee a predictable change between RPs (it is more-or-less).

     

    Also, OP's final equation is not correct:

    If we want to load balance, then 2^(32 - hash_mask_length) = number of RPs for load balancing.

    The more (semi) accurate equation (workbook) is                           

           2^(32-hash-mask-length) =  groups per RP (given in round-robin)

     

    ================================

    TESTING

    Posting the results, will take up a lot of space.

    For anyone else testing this, here is what I did.

     

    BSR: note that changes to the hash mask are immediate; no need to clear RP mappings.

                ip pim bsr-candidate lo0  32

     (2-5) RP-candidates:

                ip pim rp-candidate lo0

     

    After changing the hash-mask-length,  copy and paste this into any multicast router:


    do sh ip pim rp-hash 224.0.0.1  | in v2
    do sh ip pim rp-hash 224.0.0.2  |  in v2
    do sh ip pim rp-hash 224.0.0.3  |  in v2
    do sh ip pim rp-hash 224.0.0.4  |  in v2
    do sh ip pim rp-hash 224.0.0.5  |  in v2
    do sh ip pim rp-hash 224.0.0.6  |  in v2
    do sh ip pim rp-hash 224.0.0.7  |  in v2
    do sh ip pim rp-hash 224.0.0.8  |  in v2
    do sh ip pim rp-hash 224.0.0.9  |  in v2
    do sh ip pim rp-hash 224.0.0.10  | in v2
    do sh ip pim rp-hash 224.0.0.11  | in v2
    do sh ip pim rp-hash 224.0.0.12  |  in v2
    do sh ip pim rp-hash 224.0.0.13  |  in v2
    do sh ip pim rp-hash 224.0.0.14  |  in v2
    do sh ip pim rp-hash 224.0.0.15  |  in v2
    do sh ip pim rp-hash 224.0.0.16  |  in v2
    do sh ip pim rp-hash 224.0.0.17  |  in v2
    do sh ip pim rp-hash 224.0.0.18  |  in v2
    do sh ip pim rp-hash 224.0.0.19  |  in v2
    do sh ip pim rp-hash 224.0.0.20  | in v2
    !next sample
    do sh ip pim rp-hash 228.1.1.1  | in v2
    do sh ip pim rp-hash 228.1.1.2  |  in v2
    do sh ip pim rp-hash 228.1.1.3  |  in v2
    do sh ip pim rp-hash 228.1.1.4  |  in v2
    do sh ip pim rp-hash 228.1.1.5  |  in v2
    do sh ip pim rp-hash 228.1.1.6  |  in v2
    do sh ip pim rp-hash 228.1.1.7  |  in v2
    do sh ip pim rp-hash 228.1.1.8  |  in v2
    do sh ip pim rp-hash 228.1.1.9  |  in v2
    do sh ip pim rp-hash 228.1.1.10  | in v2
    !next sample
    do sh ip pim rp-hash 232.0.0.1  | in v2
    do sh ip pim rp-hash 232.0.0.2  |  in v2
    do sh ip pim rp-hash 232.0.0.3  |  in v2
    do sh ip pim rp-hash 232.0.0.4  |  in v2
    do sh ip pim rp-hash 232.0.0.5  |  in v2
    do sh ip pim rp-hash 232.0.0.6  |  in v2
    do sh ip pim rp-hash 232.0.0.7  |  in v2
    do sh ip pim rp-hash 232.0.0.8  |  in v2
    do sh ip pim rp-hash 232.0.0.9  |  in v2
    do sh ip pim rp-hash 232.0.0.10  | in v2
    !next sample
    do sh ip pim rp-hash 239.255.255.1  | in v2
    do sh ip pim rp-hash 239.255.255.2   |  in v2
    do sh ip pim rp-hash 239.255.255.3   |  in v2
    do sh ip pim rp-hash 239.255.255.4  |  in v2
    do sh ip pim rp-hash 239.255.255.5   |  in v2
    do sh ip pim rp-hash 239.255.255.6   |  in v2
    do sh ip pim rp-hash 239.255.255.7   |  in v2
    do sh ip pim rp-hash 239.255.255.8   |  in v2
    do sh ip pim rp-hash 239.255.255.9   |  in v2
    do sh ip pim rp-hash 239.255.255.10   | in v2
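
    (For anyone repeating this, the probe list above can be generated instead of typed out; a convenience sketch in Python, using the same sample ranges:)

        # Emit the "show ip pim rp-hash" probes used above.
        samples = ([f"224.0.0.{i}" for i in range(1, 21)] +
                   [f"228.1.1.{i}" for i in range(1, 11)] +
                   [f"232.0.0.{i}" for i in range(1, 11)] +
                   [f"239.255.255.{i}" for i in range(1, 11)])
        for g in samples:
            print(f"do sh ip pim rp-hash {g} | in v2")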

    ===================

    OUTPUT  will look like this (note the RP and Mask used)

    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.1  | in v2
      RP 150.1.8.8 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255) <-- 32
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.2  |  in v2
      RP 150.1.2.2 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.3  |  in v2
      RP 150.1.8.8 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.4  |  in v2
      RP 150.1.8.8 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.5  |  in v2
      RP 150.1.10.10 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.6  |  in v2
      RP 150.1.8.8 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.7  |  in v2
      RP 150.1.2.2 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.8  |  in v2
      RP 150.1.7.7 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.9  |  in v2
      RP 150.1.8.8 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.10  | in v2
      RP 150.1.7.7 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.11  | in v2
      RP 150.1.3.3 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)
    Rack1R4(config-if)#do sh ip pim rp-hash 224.0.0.12  |  in v2
      RP 150.1.2.2 (?), v2
      PIMv2 Hash Value (mask 255.255.255.255)

    etc

    etc

    etc

    ===========

    Showing how many RP's I was using (five):

     sh ip pim rp-hash 224.0.0.20

     RP 150.1.3.3 (?), v2
        Info source: 150.1.5.5 (?), via bootstrap, priority 0, holdtime 150
             Uptime: 00:05:13, expires: 00:02:21
      PIMv2 Hash Value (mask 255.255.255.255)
        RP 150.1.7.7, via bootstrap, priority 0, hash value 131020395
        RP 150.1.10.10, via bootstrap, priority 0, hash value 359109156
        RP 150.1.8.8, via bootstrap, priority 0, hash value 87956298
        RP 150.1.3.3, via bootstrap, priority 0, hash value 1974385695
        RP 150.1.2.2, via bootstrap, priority 0, hash value 1898356108

     

     

    Hope this helps!   ;-)

  • My conclusion is that a longer hash-mask-length gives truer and more visibly verifiable load balancing. The randomness does not guarantee a predictable change between RPs (it is more-or-less).

     

    Joe,

     

    To be honest, I was 100% aware of that when I posted my replies. In essence, this is a probability problem, meaning the results will depend on the actual distribution of the sampling. I can easily construct a selected input that would prove or disprove a specific claim. The results only make total sense when the data set is large enough. A complete explanation will require a lot of time, which I don't have now.

     

    Bottom line, the explanation given in the WB is problematic. I will try to do a more complete write-up on this when I get some time later.

     

    Cheers!

  • JoeM ✭✭✭

    Given the "pseudo-randomness" (workbook), what kind of sampling do you want to see? Initially my sampling was at 1/4 steps of the complete multicast address space (population).

    That is why my conclusion is that longer hash-mask-lengths (smaller groupings) are much easier to verify.

     

    I still have the lab up.  Let me know what you want to see, and I will run it.

  • Joe,

     

    It's difficult for me to explain these concepts in English, as I don't even know a lot of the math terminology in English. But let me give it a try anyway:

    1. In order to evenly distribute ALL multicast groups among n RPs, at least log2(n) significant bits must be borrowed from the group address and fed into the hash algorithm.

    Here, ALL can also mean a sufficiently large number. When the number of multicast groups is small, increasing the number of bits taken from the multicast address can help randomize the hash feed and create a more accurate result.

    2. The algorithm creates an illusion of round-robin-ness surrounding the RPs, but "round-robin" itself is not part of the algorithm.


    ~~~~~~~~~~~~

    1. When hash-mask-length == 0 (and 1,2,3,4), group to RP mapping is as follows:

    224.0.0.0/4 --> RP1

    Total number of addresses in each group == 2^28.


    2. When hash-mask-length == 5, group to RP mapping becomes:

    224.0.0.0/5 --> RP1
    232.0.0.0/5 --> RP2

    Total number of addresses in each group == 2^27.


    3. When hash-mask-length == 6, group to RP mapping becomes:

    224.0.0.0/6 --> RP1
    228.0.0.0/6 --> RP2
    232.0.0.0/6 --> RP3
    236.0.0.0/6 --> RP4

    Total number of addresses in each group == 2^26.


    4. When hash-mask-length == 7, group to RP mapping becomes:

    224.0.0.0/7 --> RP1
    226.0.0.0/7 --> RP2
    228.0.0.0/7 --> RP3
    230.0.0.0/7 --> RP4
    232.0.0.0/7 --> RP5
    234.0.0.0/7 --> RP6
    236.0.0.0/7 --> RP7
    238.0.0.0/7 --> RP8

    Total number of addresses in each group == 2^25.


    ........


    23. When hash-mask-length == 31, each pair of multicast groups will get its own RP:

    224.0.0.0/31       --> RP1
    224.0.0.2/31       --> RP2
    ...
    239.255.255.252/31 --> RP(2^23-1)
    239.255.255.254/31 --> RP(2^23)

    Total number of addresses in each group == 2^1 ==  2


    24. When hash-mask-length == 32,

    Each multicast group will get its own RP!

    Total number of addresses in each group == 2^0 == 1

     

    *** The above scheme holds completely true only if

    a. There is a sufficient number of candidate RPs

    b. The hash algorithm does not produce any collisions
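
    One part of the scheme that is easy to verify with the earlier hash sketch (RP addresses hypothetical): groups sharing the same first hash-mask-length bits always land on the same RP, because the group enters the hash only as (G AND M).

        # Two groups in the same /31 share a masked prefix -> same RP;
        # at /32 they hash separately and may land on different RPs.
        import ipaddress

        def pim_hash(group, mask_len, c_rp):
            g = int(ipaddress.IPv4Address(group))
            c = int(ipaddress.IPv4Address(c_rp))
            m = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF
            return (1103515245 * ((1103515245 * (g & m) + 12345) ^ c) + 12345) % 2**31

        rps = ["150.1.2.2", "150.1.3.3", "150.1.7.7", "150.1.8.8"]
        pick = lambda g, n: max(rps, key=lambda r: pim_hash(g, n, r))
        assert pick("239.255.255.254", 31) == pick("239.255.255.255", 31)
        print(pick("239.255.255.254", 32), pick("239.255.255.255", 32))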

  • JoeM ✭✭✭

    HS,

    Your English is excellent. Don't even worry about that.  I understand everything you are saying.

    My sticking point is OP's follow-up question and his final conclusion (equation). I think this is where I have been differing in opinion.

    Suppose I have 4 RPs and I want to load balance between all 4 of them for different groups, so what should my hash-mask-length be?

    How should I choose the hash-mask-length? ...I think with hash-mask-length 30 we can load balance between 4 RPs; please correct me if I am wrong.

    ....If we want to load balance, then 2^(32 - hash_mask_length) = number of RPs for load balancing.....

    <break>    I have to run an errand.  I will come back to this later.  Thanks for your explanations.  It has made me dig deeper into the subject.

     

  • Hi JoeM,

    Thank you for correcting me... I am also trying to lab this and understand it, although this is quite confusing...
