The plenary session at 2 p.m. on the 16th of November, 2010, as follows:
CHAIR: Welcome back from lunch, I hope you enjoyed it. This plenary session has only one topic, which the presenters, who will introduce themselves shortly, will cover, hopefully with some participation from your side as well. I just want to remind everyone of the PGP key signing party today at 6:00 in the room downstairs. And with that, if the people still standing would take a seat, we will get started. Thomas.
CLARENCE FILSFILS: Good afternoon. My name is Clarence Filsfils, I am an engineer at Cisco Systems, I work for the CRS architecture team. I am presenting this with Thomas Telkamp from Cariden and Paolo Lucente from pmacct. This session is on best practices in network planning and traffic engineering. The three of us have been preparing this together and we want to make it as practical as possible, so we really welcome any questions. On my side there is no problem, you can interrupt me at any point in the flow to give comments or ask questions. That is really appreciated.
So, introduction and objective. What is the objective of capacity planning? The objective is to enforce the SLA with the minimum amount of bandwidth deployed in the network. What is the SLA? It's an availability target for loss, latency and jitter characteristics. How is it measured? Most often with POP-to-POP active probes, per class of service, and by measuring the drops on a per-link, per-class basis. There is only one thing to remember: the way to enforce the SLA is simply to put in more capacity than the demand, and to have more capacity than the demand frequently enough. And that is really the highlight, frequently enough, because the SLA that has to be enforced is always an availability target, so it's 99 percent or 99.99 percent. It means that part of the planning process is the recognition that some types of unlikely, rare events that may lead to congestion are not going to be solved. Congestion will occur in these cases, but that is not a problem, because they are infrequent enough that the availability targets will still be met.
So the rest of the talk is really about how to make sure that the capacity is there for the demands as frequently as needed for the availability target. That is the focus of this talk.
In brief, the capacity planning process is the following. As inputs we take the topology, the routing policy, the QoS policy per link and the per-class traffic matrix, and as output we model or simulate the per-class, per-link overprovisioning factor, and especially whether it is below a specified target, which is typically in the area of 80 to 90, 95 percent. So what is this OP factor? It's basically the load that is expected for these four inputs divided by the capacity, and if the load divided by the capacity is below the target, typically 80 to 95 percent depending on your environment, you will decide that the network is correctly planned.
So you do this process: you take these inputs, you derive the output, and you do this with a tool like the one from Cariden. The outcome of this OP factor computation will be either that the OP factor is below the target, in which case you are happy, or that it is not, and then the hard work really starts, where you have three choices: either you change one of the four inputs, or you change the target for your output, or you simply accept the fact that, in the specific case you were modelling, this type of failure or traffic evolution, you are not going to meet your overprovisioning factor. You decide: so be it, this is one of the cases where I know I am not going to be without congestion, but I expect it to be rare enough that my availability target for the SLA is not compromised. It is very important to understand that accepting this is part of the process, at least to make it efficient.
So now we will quickly review how to gather these four inputs. The topology is very easy: you get it best by sniffing the link-state database from the routers. Very easy to do. What is a bit more complex is that you don't have one single topology, you have all the variations as the network evolves on a daily basis, which means that you need to plan for all the possible evolutions due to failure. That means link and node failures, which are, of course, easy to generate. The more difficult topologies are the SRLG failures. These are all the shared fates: fibre ducts, bridges, common buildings, common cities. We will come back to this later on, in the IP section, to show some advances that are going to simplify the collection of these SRLGs.
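As an illustration of the failure cases just listed, here is a minimal sketch of how a planning script might enumerate link, node and SRLG failure scenarios from a topology. The topology, the link names and the SRLG membership are invented for the example; a real tool would derive them from the link-state database and from fibre records.

```python
from itertools import chain

# Hypothetical toy topology: link name -> (node A, node B)
LINKS = {
    "l1": ("A", "B"),
    "l2": ("B", "C"),
    "l3": ("A", "C"),
    "l4": ("C", "D"),
}

# Hypothetical SRLGs: shared-risk name -> links that fail together
SRLGS = {
    "duct-1": {"l1", "l3"},
}


def failure_scenarios(links, srlgs):
    """Return {scenario name: set of failed link names}."""
    scenarios = {"no-failure": set()}
    # Single link failures: one scenario per link.
    for name in links:
        scenarios[f"link:{name}"] = {name}
    # Single node failures: every link attached to the node goes down.
    nodes = sorted(set(chain.from_iterable(links.values())))
    for node in nodes:
        scenarios[f"node:{node}"] = {
            name for name, ends in links.items() if node in ends
        }
    # SRLG failures: all member links go down at the same time.
    for name, members in srlgs.items():
        scenarios[f"srlg:{name}"] = set(members)
    return scenarios


scenarios = failure_scenarios(LINKS, SRLGS)
for name, failed in scenarios.items():
    print(name, sorted(failed))
```

Each scenario would then be fed, together with the routing policy and the traffic matrix, into the simulation that computes the per-link load.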
Routing policy, that is our second input. Actually, we need to have the routing policy for the primary path and for the secondary path. For the primary path, if you operate a plain IGP network, that is extremely simple: it's based on the link cost. If you operate a dynamic MPLS TE network, it is complex, and many people forget about this. It's complex because each head-end computes its paths independently, and the path it will pick depends on what the other routers did, so there is a notion of when in time this occurred with respect to the choices of the others. So it is not deterministic, and that is quite complex to deal with.
Static MPLS TE is where an offline tool such as MATE does the computation for the placement of the primary TE LSPs. From the planning process viewpoint it's very simple, because the tool computed them, but nothing is free in life. The trade-off is that, from a daily operation viewpoint, you are no longer trusting the control plane of the router to dynamically compute this, you have an offline tool computing it, so there is an increase in complexity and certainly a decrease in resiliency, because you depend on an offline tool.
I will touch on this very quickly, it's a complex slide. I prepared it for your reading as a reference, and Thomas will come back later on with specific examples. Part of the routing policy, and it's obvious, is about what happens on failure, because during the failure you will be in protection mode and after the failure you will reoptimise on the new topology. So you need to make sure that the capacity is there to support the demands during the transient rerouting time and also in the reoptimised topology after the failure.
IS-IS or OSPF is again simple because it's deterministic: it's Dijkstra based on the link cost. MPLS TE is again going to be complex because it's non-deterministic, both for the placement of the primary TE LSPs and for the placement of the backups. If you use an LFA type of protection, I would say it's moderately complex, because in the way we devised the algorithm we really worked hard to make the selection of the LFA deterministic. For example, at Cariden they know the way the algorithm works, and most of the time they will know which backup is used, so they can simulate it efficiently. But there will always be cases where you have two backups that are equal from all the viewpoints of the algorithm, and in the end there might be a random choice between the two. So for a fraction of the backup policy you will have a level of non-determinism, but far less than for MPLS TE.
The third input is the QoS policy per link, and it's very easy because the bandwidth allocation policy is constant on all the links of the service provider, so you simply derive it once and it's applicable to all the links in your topology.
So that is very easy.
And finally, the other parameter that is very important is the overprovisioning factor. As the output, we are going to compute the planned or expected load on a per-link, per-class basis and divide it by the capacity we set aside for that class of service on that link. This ratio we compare to an overprovisioning factor target: if it's below the OP factor, we say yes, the planning is correct, we are going to be enforcing the SLA; if it's above, we are at risk of not fulfilling the SLA. So what is the OP factor? As I said, for links of 10 gigs and above, the very conservative rule of thumb is in the 80 to 90 percent range. It's very conservative. There are some implicit assumptions; this is a brief overview, so we cannot go into all the details, but usually there is an implicit assumption that your network carries a lot of small and unrelated flows. If you have a small number of large flows, this number may no longer be conservative and would need to be reduced.
A few references, if you want to dig into this matter. We provide them on this slide; I will skip it, but these are research papers or practical papers based on real networks with real measurements that show what kind of overprovisioning factor could be used to guarantee a certain level of latency and loss on a per-link basis. And they confirm that the 80 to 90 percent area is conservative.
Small digression. We are not going to have any other QoS discussion later on in this talk, again due to time constraints, but everything that we are going to explain with Thomas and Paolo is applicable on a per DiffServ class of service basis. We are going to explain it on a per-aggregate basis, but the aggregate is nothing else than one DiffServ class, so everything you can do for one you can do multiple times if you have multiple DiffServ classes in your network.
I just describe here, with one easy example, how it works. Assume you have a link of 10 gigs and two classes, class one and class two, with loads that are expected to be two gigs and six gigs respectively. You know that your QoS policy for bandwidth allocation is such that 90 percent of the bandwidth goes to class one and 10 percent goes to class two, and especially that you have a work-conserving scheduler, which means that any bandwidth not used by one class is reusable by the other class. So what would be the overprovisioning factor you expect in this case? For class one, you expect 2 gigs and you reserve 90 percent of the 10 gigs, so it's two divided by nine. Your planned OP factor is 22 percent, and we saw that we would need to be below 85, 90 percent, so it's extremely safe. And that is basically how any premium class of service is built.
They have so much bandwidth reserved for them compared to the real demands that the OP factor is typically in the range of 5 percent. It's really overprovisioned, so there is really no risk.
Class number two carries 6 gigs and has only 10 percent as guaranteed bandwidth, but obviously it reuses any bandwidth not used by class one, so the OP factor for class two is six gigs divided by eight, which is 75 percent. You see here that class two, the lower-grade type of class, has a planned OP factor of 75 percent, which is much closer to the 90 percent target, and that is the reason why the SLA that is going to be linked to class two will have an availability that is less good. If class one has an availability of maybe five nines, class two will have an availability of two nines, and that is really how QoS-based capacity planning is done. So it's very simple in practice.
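The two-class example can be checked with a few lines of code. This is only a sketch of the arithmetic in the example: it assumes that, with a work-conserving scheduler, a class can use the larger of its guaranteed share and whatever the other classes leave unused. The function name and the 90 percent target constant are chosen for illustration.

```python
OP_TARGET = 0.90  # assumed planning target, from the 80 to 95 percent range


def op_factors(capacity_gbps, loads_gbps, shares):
    """Planned per-class overprovisioning factor on one link.

    Assumption: work-conserving scheduler, so the bandwidth available to a
    class is the larger of its guaranteed share and the capacity left over
    by the other classes.
    """
    total_load = sum(loads_gbps.values())
    result = {}
    for cls, load in loads_gbps.items():
        guaranteed = shares[cls] * capacity_gbps
        leftover = capacity_gbps - (total_load - load)
        available = max(guaranteed, leftover)
        result[cls] = load / available
    return result


factors = op_factors(
    capacity_gbps=10.0,
    loads_gbps={"class1": 2.0, "class2": 6.0},
    shares={"class1": 0.9, "class2": 0.1},
)
for cls, factor in factors.items():
    status = "ok" if factor <= OP_TARGET else "at risk"
    print(f"{cls}: OP factor {factor:.0%} ({status})")
```

With the numbers from the example this gives 2/9 for class one and 6/8 for class two, matching the 22 percent and 75 percent quoted.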
And this is nothing new. In '99 I was working on an ATM deployment for a CBR service and we were doing exactly this, so it's exactly the way ATM QoS was actually built in reality.
Traffic matrix. So now, the next section focuses on one of the parameters for the capacity planning process. We had four input parameters: the topology, the routing policy, the QoS policy and the traffic matrix. Now we zoom into the traffic matrix because it's crucial, nothing can be done without it, and it's where work is needed from the service provider side.
So what is the traffic demand matrix? It is the matrix that represents the traffic from anywhere to anywhere in the network. It's typically measured on a per class of service basis if you have a DiffServ deployment, and it typically represents the peak traffic or at least a very high percentile, so nobody plans with the average traffic matrix. It's based on measurement, estimation and deduction.
So we will come back to this later on.
So there are two types of traffic matrices, the internal and the external traffic matrix. The internal one is the minimum; it's what people do 90 percent of the time. It's the matrix from POP to POP, core router to core router, aggregation router to aggregation router, or PE to PE, so edge router to edge router. The external traffic matrix is the same, but the from and the to would be the peering ASes. So, for example, I would compute it from all my PE devices around the border of my network, but I would always say it's coming from that peer AS and it's going towards that peer AS via that BGP next hop, so it includes the internal traffic matrix. It's, if you wish, from which PE to which PE but also from which AS to which AS. That is the peer AS, not the real source AS and final AS; those don't matter as a first order of magnitude.
90 percent of the time it's done for the internal traffic matrix, which is an approximation, but at least start with this. So we are going to focus a little on this. For the internal traffic matrix collection, there are two common ways, not necessarily easy, but in practice, if I look at the deployments, many people in the end do it like this. One is to collect the LDP MIB, because you will have the traffic per LSP, to each BGP next hop, so you have your internal traffic matrix, but you don't have any class of service information and you don't have any external information. Other people deploy a TE mesh and collect the counters on the TE tunnels, so they get a traffic matrix which covers the area of the network from the TE head-ends to the TE tail-ends. Because of scalability it's very rarely PE to PE, it's more likely POP to POP or aggregation router to aggregation router, so it's a subset of the internal traffic matrix. It requires a TE mesh, which is an operational pain, so you are not going to deploy a TE mesh just for this; that would be a bad idea. You deploy it because you have other reasons and then you use it to build your traffic matrix, but never deploy it for the traffic matrix, there are better ways to do it. So it's an incomplete internal matrix, it doesn't have the external information and it is not QoS-based. There are quite a few drawbacks. In my opinion, the best is NetFlow v9. You deploy it on the PE devices, or on the aggregation routers on the incoming interfaces from the PEs, so you have the information on a per-PE basis.
You have the information for your complete internal traffic matrix because it's per BGP next hop, you have the information for your incoming and especially outgoing peer ASes, so you have your external traffic matrix, and it is QoS-based. So if it were my decision, I would go with NetFlow v9, but in reality I would think that maybe not a majority but a lot of actual deployments use the LDP MIB or a TE mesh.
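To make the difference between the internal and the external matrix concrete, here is a minimal sketch that aggregates flow records into both. The record field names (`ingress_pe`, `bgp_next_hop`, `src_peer_as`, `dst_peer_as`, `cos`) are hypothetical and are not NetFlow v9 field names; the egress PE is assumed to be identified by the BGP next hop, and the sketch simply sums bytes for one measurement interval rather than computing a peak or percentile.

```python
from collections import defaultdict

# Hypothetical flow records for one measurement interval.
FLOWS = [
    {"ingress_pe": "PE1", "bgp_next_hop": "PE3", "src_peer_as": 65001,
     "dst_peer_as": 65010, "cos": "premium", "bytes": 4000},
    {"ingress_pe": "PE1", "bgp_next_hop": "PE3", "src_peer_as": 65002,
     "dst_peer_as": 65010, "cos": "premium", "bytes": 1000},
    {"ingress_pe": "PE2", "bgp_next_hop": "PE3", "src_peer_as": 65001,
     "dst_peer_as": 65011, "cos": "best-effort", "bytes": 7000},
]


def build_matrices(flows):
    """Return (internal, external) traffic matrices as dicts of byte counts."""
    internal = defaultdict(int)  # (ingress PE, egress PE, class) -> bytes
    external = defaultdict(int)  # adds the incoming and outgoing peer AS
    for f in flows:
        pe_pair = (f["ingress_pe"], f["bgp_next_hop"])
        internal[pe_pair + (f["cos"],)] += f["bytes"]
        external[(f["src_peer_as"],) + pe_pair
                 + (f["dst_peer_as"], f["cos"])] += f["bytes"]
    return dict(internal), dict(external)


internal, external = build_matrices(FLOWS)
for key, volume in sorted(internal.items()):
    print("internal", key, volume)
```

The external matrix keeps the PE pair in its key, which is why it contains the internal matrix: summing over the peer AS columns gives the internal one back.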
Demand estimation: it looks like it will never work, and I am going to explain why it works. Demand estimation is an offline tool capability; for example, MATE from Cariden supports this. The problem that it solves is the following:
You collect the link loads on all your network and out of this you derive your traffic matrix, which means you take N constraints and you solve for N-squared unknowns, so mathematically it will never work. And indeed, in this chart, the X axis shows the known demands, so what the demands actually are in the network, and on the Y axis you see what was estimated by the algorithm. If the algorithm were correct, everything would be on the line, and indeed it's not, which means that the algorithm is not able to solve that problem in the normal case; it's impossible to solve for N-squared unknowns with N inputs. However, it doesn't matter. Why? You are not doing planning for the normal topology, because that is never the worst case. The planning, and the bandwidth you put aside, is determined by the failures. And this algorithm is going to give very good results for all the failure cases. Indeed, this is shown in this example: again the known values, but now for the worst case, so the link loads under failure, against the ones that were estimated, and we see the algorithm works very well. How come? The reason for this, it's easy to... unfortunately, it no longer works. Thanks.
Now it works. The intuition behind this paradox is the following:
Suppose you have two demands going towards what is BWI, in Washington: one from Auckland to Washington and one going from PA to DC. It is indeed going to be very difficult for the algorithm to know, on the basis of the link load, how much traffic is on the blue demand and how much is on the orange demand. It doesn't know how to split it. But what will be true is that if the link here fails, it doesn't matter, because the blue and the orange demands will both be rerouted via Chicago, and so the link loads expected on the links here and here are going to be correctly estimated, whatever each individual demand is. Each individual demand is wrongly estimated, but the demands behave in groups, like the stock market: everybody buys and sells at the same time, and that is the same here. They go onto the backup route at the same time, so the value of each individual demand doesn't matter.
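The intuition can be shown with a toy calculation. The sketch below assumes, as in the example, that the blue and orange demands share the measured link and that both are rerouted over the same backup path when that link fails; the load of 10 and the candidate splits are invented numbers.

```python
# Measured load on the link shared by the blue and the orange demand
# (hypothetical value, in Gbps).
MEASURED_SHARED_LINK_LOAD = 10.0


def backup_link_load(blue, orange):
    """Load on the backup path (via Chicago in the example) after the shared
    link fails, assuming both demands are rerouted onto it together."""
    return blue + orange


# Every split of the measured load is an equally valid "estimate" of the
# two individual demands: the link measurement cannot tell them apart.
candidate_splits = [
    (blue, MEASURED_SHARED_LINK_LOAD - blue)
    for blue in (0.0, 2.5, 5.0, 7.5, 10.0)
]

predicted_loads = {backup_link_load(b, o) for b, o in candidate_splits}
for blue, orange in candidate_splits:
    print(f"blue={blue:4.1f} orange={orange:4.1f} "
          f"-> backup link load {backup_link_load(blue, orange):.1f}")
```

Whatever split the estimator picks, the predicted load on the backup path is the same, which is why the failure-case link loads come out right even though the individual demands do not.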
So this is quite powerful, because if you have an incomplete traffic matrix, let's say your collection process from NetFlow, from the LDP MIB or from your TE mesh is incomplete, you can use these measurements for the matrix positions you know, and you can fill in what you don't know with this algorithm. It will give you even better results because you will have more known parameters.
Another topic: when you do the planning, in practice you never plan for the traffic matrix of today. You plan for the forecasted traffic matrix. It's basically based on measurements over multiple periods of time in the past: you find a rate of growth and then you decide, I am going to plan for the next six months, taking into account that this compound growth goes on for six, 12, 18 months, and you determine how much capacity you need, and where, to anticipate that bandwidth.
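A minimal sketch of the compound-growth forecast described here, assuming a constant monthly growth rate derived from two past measurements. The demand values and function names are hypothetical; a real forecast would fit the rate over more than two points and per matrix entry.

```python
def monthly_growth_rate(old_demand, new_demand, months_between):
    """Compound monthly growth rate implied by two measurements."""
    return (new_demand / old_demand) ** (1.0 / months_between) - 1.0


def forecast(demand, monthly_rate, months_ahead):
    """Demand after compounding the monthly rate for months_ahead months."""
    return demand * (1.0 + monthly_rate) ** months_ahead


# Hypothetical measurements for one demand: 4.0 Gbps a year ago, 6.0 Gbps now.
rate = monthly_growth_rate(old_demand=4.0, new_demand=6.0, months_between=12)
forecasts = {months: forecast(6.0, rate, months) for months in (6, 12, 18)}

print(f"monthly growth: {rate:.2%}")
for months, demand in forecasts.items():
    print(f"in {months:2d} months: {demand:.2f} Gbps")
```

The forecasted matrix, not today's, is then what goes into the failure simulations to decide where capacity has to be added.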
So this point is another help that can be provided by a planning tool. When you collect the input for your traffic matrix, either from the LDP MIB, from the TE mesh or from NetFlow measurements, there will always be inaccuracies as part of the measurement process, and you also collect the interface loads. So, for the peak time, let's say yesterday, you observe the traffic matrix from the viewpoint of, let's say, your LDP MIB, but you also know what the load was on each link at the peak time yesterday in the network. A nice exercise is to feed in this traffic matrix and see whether it fits what you saw yesterday, and often it will not perfectly match what you saw. So as part of a planning tool there is, I think, and Thomas could say more on this, an algorithm to actually fine-tune and regress the traffic matrix measurements to better fit them to the actual loads that were collected on a per-link basis, to refine the measurements and bring them closer to reality.
That is it. So Paolo Lucente is going to continue and then Thomas will follow.
PAOLO LUCENTE: Thanks very much. So, I am Paolo Lucente, founder and developer of the pmacct project.
What is pmacct? It is, in a nutshell, a telemetry data collector, replicator and exporter; this is traditionally what the software is about. Also, as stated on the slide, it is open source and free, and you can check it out right away at the URL in the corner.
To elaborate a little bit more on what it does, you can see on the left-hand side of this slide some traffic capturing methods, for example NetFlow and sFlow, which I would refer to as telemetry export protocols, and on the other side you see a number of storage methods, which include memory tables but also very well-known free, open source relational databases like MySQL and PostgreSQL.
What happened as of last year is that pmacct introduced a Quagga-based BGP daemon, and it was implemented as a parallel thread within the collector. You can imagine you have two threads: one is listening for sFlow or NetFlow messages and the other is keeping up peering sessions with the routers in the observed routing domain. A key implementation feature is that the BGP routes which are received are kept separate, so they are never used to compute a global routing table. Why is that? Because the idea, as you can see at the bottom of this slide, is to join telemetry data and routing data based on the source address of the router. So what you really want to do is to get into the very details of the local routing table in a specific corner of the network. You don't want to compute anything global.
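The join described here can be pictured with a small sketch. This is not pmacct code: it is an illustration, with invented addresses and record fields, of keeping one routing table per BGP peer and looking up a flow's destination in the table of the router that exported the flow, using a longest-prefix match.

```python
import ipaddress

# One RIB per router, keyed by the router's address. The tables are kept
# separate and never merged into a global table. All values are invented.
RIBS = {
    "192.0.2.1": {
        "0.0.0.0/0": {"next_hop": "192.0.2.9", "peer_as": 65020},
        "203.0.113.0/24": {"next_hop": "192.0.2.3", "peer_as": 65010},
    },
    "192.0.2.2": {
        "203.0.113.0/24": {"next_hop": "192.0.2.4", "peer_as": 65011},
    },
}


def longest_match(rib, address):
    """Return the attributes of the most specific prefix covering address."""
    addr = ipaddress.ip_address(address)
    best_net, best_attrs = None, None
    for prefix, attrs in rib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None
                            or net.prefixlen > best_net.prefixlen):
            best_net, best_attrs = net, attrs
    return best_attrs


def enrich(flow):
    """Join a flow record with the RIB of the router that exported it."""
    rib = RIBS.get(flow["exporter"])
    bgp = longest_match(rib, flow["dst"]) if rib is not None else None
    return {**flow, "bgp": bgp}


flow_a = enrich({"exporter": "192.0.2.1", "dst": "203.0.113.9", "bytes": 1500})
flow_b = enrich({"exporter": "192.0.2.2", "dst": "203.0.113.9", "bytes": 1500})
print(flow_a["bgp"], flow_b["bgp"])
```

The same destination resolves to a different next hop depending on which router exported the flow, which is the point of keeping the per-router views separate.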
But why BGP at the collector, when some BGP information is also included in sFlow or NetFlow? Because telemetry protocols should report on the forwarding plane, so on the traffic being passed through the network, and the export protocol should stay away, or as far away as possible, from control plane information. Otherwise it's like you have a nice protocol and you choose to move the control plane over it over and over again, right?
So, getting to the specific topic of capacity planning and traffic engineering, two export models can be seen from a telemetry perspective, a NetFlow perspective. The first of the two is, I would say, the most relevant because it leads to a strategic solution: from all the PE routers, all the provider edge routers, telemetry data is collected ingress-only at the edge interfaces and, on top of that, the full BGP routing table is also collected from those PE routers. What you achieve with that is a traffic matrix with a unique view of the network, which is the edge-to-edge traffic matrix. It is very useful for, yeah, profiling of customers, peers and transits; in fact some people might have heard me speak before about specifically this export model in the context of peering. But in the context of capacity planning and traffic engineering, coupled with iBGP information, you can start working out what-if scenarios, which is pretty interesting.
The second export model is to get ingress measurements at core interfaces from P and PE routers. Now, this will not generate a unique view of the observed routing domain traffic any more, it will generate N traffic matrices, but it is very simple: it doesn't require any routing information, and it is a tactical solution. One of the scenarios where this can be useful is if you have congestion on a backbone link and at some stage you want to understand what is happening: what is this traffic which is filling my link? And then you want to do some reverse or forward path work, you know, to follow the flow of traffic to the edge and map it to the big picture. So this can be seen either as a tactical sort of model or as an add-on to the previous one.
Let's go with some illustrations. This is the first model illustrated. You can see we have three flows over here, and from the first model you get essentially the information of where traffic gets into the network, where it gets out of the network, where it is from and where it is going to. You can see an aggregation method defined over here. Of course, the traffic matrix can be enriched with whatever you want, for example with BGP-related primitives: you can use communities, local preference, MED and so forth. And from over here you can see that every flow is originating just one record, so it's a unified view of what is going on.
On the second slide, here, you can see the illustration of the second export model. It gives a lot of insight into how traffic moves through the backbone; for example, you can see that the red flow is getting through B3P1 and it's getting out at P 8. And as you can see from here, a flow no longer originates a single traffic matrix entry: you have several of those.
Of course, this model also has some information about where traffic is coming from and going to, but it's really about the origin and the final AS number, which can be used for some kind of correlation with the first model, right?
And yes, speaking a little bit about scalability: we have been looking at the export models at a very high level, so let's see a little bit what the challenges behind them are. In the first model we spoke about getting telemetry plus BGP from the PE routers, so the idea would be that every PE router is BGP peering with the collector. The very important task, then, is to understand what the memory footprint of the collector is. Right. At the very beginning of the implementation in pmacct, that is the red line you see, I was just telling everybody: OK, it's 50 megs per peer, that is it. Of course an optimisation phase kicked in, and that originated the blue curve that you can see over here. A lot of overlapping information was removed and a lot more information is shared, so with the new memory model, as soon as you have some 20 BGP peers or something like that, you already go down to around 20 megs per peer or something like that. Of course these measurements are all with half a million IPv4 routes, 50K IPv6 routes and a 64-bit executable. One thing that is very important to comment on is that this leads to a very interesting conclusion: for 500 peers in these conditions, so half a million IPv4 routes and things like that, the memory consumption would be around 9 gigs. And with 500 PE routers we are already speaking about, you know, a very large tier 1 or a huge incumbent in a large country.
Still on scalability, so aggregation and temporal grouping. The original idea back in 2003 was that if you try to save every micro flow on disk, that is not really scalable. I still think that is true today, and an essential element to scale to very large scenarios is to perform both spatial and temporal aggregation. What I mean, and how to configure it, you see in these two lines, right? You have this aggregate line and the SQL history line. In this case you are building an aggregate like this out of all the micro flows and you create five-minute bins. This feeds an SQL table, so you have a schema out of it, and you can already see one interesting property: the SQL table name can contain some variables. With those you can do what I call in the title of this slide the temporal grouping, so every hour you cycle to another table, right? So you are defining five-minute counters for a certain aggregate, and every hour you change your SQL table to another one. That helps a lot in keeping the tables small and the indexes small and manageable, so that, yeah, it works better.
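The idea of five-minute counters stored in hourly tables can be sketched as follows. This is an illustration of the aggregation logic only, not of how pmacct implements or configures it; the aggregate key (source and destination AS), the `acct_` table name pattern and the flow records are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

BIN_SECONDS = 300  # five-minute counters


def bin_start(ts):
    """Round a Unix timestamp down to the start of its five-minute bin."""
    return ts - (ts % BIN_SECONDS)


def table_name(ts):
    """Hypothetical hourly table name with time variables in it."""
    stamp = datetime.fromtimestamp(ts, tz=timezone.utc)
    return stamp.strftime("acct_%Y%m%d_%H")


def aggregate(flows):
    """Group micro flows into {table: {(bin, src_as, dst_as): bytes}}."""
    tables = defaultdict(lambda: defaultdict(int))
    for f in flows:
        start = bin_start(f["ts"])
        tables[table_name(start)][(start, f["src_as"], f["dst_as"])] += f["bytes"]
    return tables


def at(hour, minute):
    """Unix timestamp on the day of the talk, for building sample flows."""
    return int(datetime(2010, 11, 16, hour, minute,
                        tzinfo=timezone.utc).timestamp())


FLOWS = [
    {"ts": at(14, 2), "src_as": 65001, "dst_as": 65010, "bytes": 100},
    {"ts": at(14, 4), "src_as": 65001, "dst_as": 65010, "bytes": 200},
    {"ts": at(14, 7), "src_as": 65001, "dst_as": 65010, "bytes": 50},
    {"ts": at(15, 1), "src_as": 65001, "dst_as": 65010, "bytes": 75},
]

tables = aggregate(FLOWS)
for name, rows in sorted(tables.items()):
    print(name, dict(rows))
```

Four micro flows collapse into three counters spread over two hourly tables, which is the effect that keeps each table and its indexes small.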
Still on this, we see another interesting thing, which is the spatial grouping, because of course you can do both temporal grouping and spatial grouping. You can divide your devices, for example, based on geographic location or, say, as over here, clusters 1, 2, 3 and 4, where I refer to BGP clusters for example, and you can divide them across different tables, so some of them are saving to one table and some others to another, and things like that. It is still the same concept, taken further.
Still on scalability, a last slide. Of course you can do all of these nice things, and at some stage you will still hit a barrier, so what can you do next, because it might not all fit. First of all, you can hit memory limits: not all the BGP sessions can be kept in a single collector. Or the CPU can't cope with the pace of the telemetry export. So you can still go further. You can, first of all, separate the collector from the database, so you use two different boxes. Then of course you can assign routing elements to collectors, so you have multiple collectors, and of course, if you are exporting BGP and telemetry data out of the routing elements, you should still keep the BGP and telemetry of a given element together. And then you can go even further and assign collectors to databases, or you can cluster the database. Still, the matrix on the other side can get very big, so can you do anything to reduce it? Yes. You can keep smaller routers out of the equation. In case you are using dense, modular boxes, you can keep some specific services or customers out of the equation: you might be more interested in IP transit customers, because the others are making much less traffic volume in comparison, so they might be negligible. You can focus on the relevant traffic direction, and when I say that, I think very much about CDNs or ISPs, right? Maybe just one of the two directions is relevant and the other is negligible. And of course, last but not least, increase the sampling rate.
And yeah, a nearly concluding slide. This is a couple of SQL queries. Don't have a heart attack because of SQL, I keep saying that. SQL is your friend, and these are two queries in which you can see that from both the strategic and the tactical export model you can get data out. What I would like to communicate to you is that with a couple of very simple SQL queries it's very easy to get data out of the database and inject it into third-party tools, because at the very end the goal behind collecting traffic is doing something with it. So it's very important that the handover interface is very well thought out and rich, which is essentially what SQL is doing for you.
And I leave you with the further information, because time is never enough to speak about everything. I leave you with some official examples in case you want to venture into pmacct; of course you can buy me a beer and we can speak together about it. There are also pointers to some previous presentations, especially in the context of peering, and very interesting topics, provider on address space, how to discover it. It's a very important topic: my manager always says that after the first question, which is how much the solution is going to cost, comes how much it is going to cost to maintain it. So it is very important that something can, yeah, be automated and the support also discovered. Maybe you want to check out also that link over there. Thanks very much.
THOMAS TELKAMP: Clarence and Paolo Lucente have been talking about the objectives and data collection; I am going to talk about network planning and traffic engineering. So first a note about traffic management, because this is such a broad term. It really ranges from strategic planning, so if you work for a large company, like my previous employer, they ask you for five-year forecasts on what you need to spend on equipment, and you need to have some idea of what you are going to build out and what it will cost. That is a totally different question than, for example, tactical traffic engineering, where there is a problem right now in the network and you need to have the data and the tools to do something about it. The more long-term planning and traffic management is mostly offline-based, so it really doesn't matter too much in detail what your current network looks like, how much traffic there exactly is, what the state is. But the more you get into architecture and especially operations, you need to be very close to the network itself, obviously.
So, talking a bit about network design and topologies. Here you see a very simple network design, based on a ring, basically: you have your operational capacity and you have exactly the same capacity as a reserve. This is a very simple way of building your network: if you make sure you always have 50% spare capacity, then if something fails the traffic fills up that spare 50% and you are fine. This works perfectly, no problem at all. But it is expensive, and even though lots of operators have their own fibre and think provisioning capacity doesn't cost them anything, at least the IP equipment is expensive, and if you need to compete on price you might want to be a bit more efficient than this. This is one-to-one protection. It is similar to what you do on the optical layer: one wavelength over the primary path, and you protect it with another over a secondary path, but you need to have both available, so you provision one plus one, twice the capacity. You can do it the same way over here with rings, and again you keep adding capacity on both sides of the rings. It works, but it is expensive, and the question is whether we can do better than this.
So, the difference going from one to one versus one to N: again, here you have one link and another path as protection. Over here, we have three links. Now, what happens is, if we take the top one as a primary link, it is protected by the orange link, but the orange link can also protect the blue link over here. So one link in the network is acting as backup capacity for two other links in the network. So what happens is you get 2:1 protection. For every 100 dollars, or a multiple of that, spent on working capacity, you now spend 150 in total to have it protected, rather than 200, because the protection is shared by two links. This is what we want to generalise as a mesh in the network, so this is meshing, and this will save you costs. What we have seen in practice is that with a mesh you can achieve something like 15 to 20 percent less expense on the network. You can do the same on a ring. You add a bypass link that carries traffic between the two sides. It doesn't have to go via this side, and you have three paths between the two sides of the ring. OK. This sounds very simple and it almost looks like there is no reason not to do this immediately, but there are obviously costs; it's not as simple as it looks. You need diversity. If you have a fibre ring and build a network over it, you can add express routes as much as you want but it's not going to help you, because if one side of the optical ring goes down you will lose both links. So you need to have the physical diversity as well, and we will see more about that in an example.
Also there is an engineering and architecture consideration. How do you make use of that link? Just putting it there does not automatically guarantee it's actually used for protection, so what I want to talk about here is how to make this work.
A few observations: Just looking at the link utilisations is not going to help you. That tells you what is happening now, whether you are at 50/50, but it doesn't tell you anything about the meshing, so you need to be topology aware, which we discussed earlier, and you need to look at failures. And that comes back to my first slide: the boundaries between planning, engineering and operations departments are not that clear any more. I mean we are all working on the same problem. Especially in, let's say, the old world, with circuits, you had a planning department which was completely separated, but this is much more overlapping and integrated.
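A minimal sketch of the arithmetic behind shared protection, assuming all links cost the same, the backup link is sized like one working link, and only one working link fails at a time. The function name and the cost unit are illustrative, not from the talk.

```python
def total_capacity_cost(working_cost: float, links_sharing_backup: int) -> float:
    """Cost of one working link plus its share of a single backup link.

    Assumption: the backup has the same cost as a working link and is
    shared equally by `links_sharing_backup` working links.
    """
    if links_sharing_backup < 1:
        raise ValueError("at least one working link must use the backup")
    return working_cost * (1 + 1 / links_sharing_backup)


for n in (1, 2, 3, 4):
    cost = total_capacity_cost(100, n)
    print(f"backup shared by {n} link(s): {cost:.0f} per 100 of working capacity")
```

With one link per backup this gives 200, and with the backup shared by two links it gives the 150 quoted above. The saving shrinks with each extra link, and it only holds if the links sharing the backup never fail together.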
First thing to look at when evaluating these is the failure planning. Over here we have a sample network based on an actual design that has a lot of red in it, which means links are congested, so over 100 percent, in the worst case. Every link here shows the highest utilisation it can ever get. This is based on simulating link failures: we fail a link, we reroute all the traffic and check the results. It also shows which links are causing the problem, and all this red proves to be generated by only one link here: if this one fails, it will create all these problems. So now, by simulating all the possible failures I know what potential problems there are in my network. Obviously, the next step is changing the topology. So again, we have a problem here between Chicago and Detroit, and we add a direct link from Chicago to Washington; this is one of the bypasses, the meshing you add to the network. You immediately see that it will at least solve the immediate problem over here. And this fits with the rest of the story: you need to have your traffic matrix end to end. If you put that link in here and put an IGP metric on it, which traffic will use the link? If you are just looking at the link utilisations, you will not get the answers here.
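The failure simulation described here can be sketched roughly as follows. This is a simplified illustration, not the tool used in the talk: the four-node topology, metrics, capacities and traffic matrix are all made up, links are treated as undirected, and each demand follows a single shortest path (no ECMP).

```python
import heapq

# Hypothetical network: (node_a, node_b) -> (IGP metric, capacity in Gbit/s)
LINKS = {
    ("A", "B"): (1, 10),
    ("B", "C"): (1, 10),
    ("A", "D"): (2, 10),
    ("D", "C"): (2, 10),
}
# Hypothetical end-to-end traffic matrix in Gbit/s
DEMANDS = {("A", "C"): 6, ("A", "B"): 3, ("D", "C"): 5}


def shortest_path(links, src, dst):
    """Dijkstra over undirected links; returns the links used, or None."""
    adj = {}
    for (a, b), (metric, _) in links.items():
        adj.setdefault(a, []).append((b, metric, (a, b)))
        adj.setdefault(b, []).append((a, metric, (a, b)))
    best = {src: 0}
    heap = [(0, src, [])]
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if dist > best.get(node, float("inf")):
            continue
        for nxt, metric, link in adj.get(node, []):
            if dist + metric < best.get(nxt, float("inf")):
                best[nxt] = dist + metric
                heapq.heappush(heap, (dist + metric, nxt, path + [link]))
    return None


def utilisation(links, demands):
    """Route every demand and return load / capacity per link."""
    load = {link: 0.0 for link in links}
    for (src, dst), gbps in demands.items():
        path = shortest_path(links, src, dst)
        if path is None:
            continue  # demand is disconnected under this failure
        for link in path:
            load[link] += gbps
    return {link: load[link] / links[link][1] for link in links}


def worst_case(links, demands):
    """Per link: highest utilisation over all single link failures,
    together with the failed link that causes it (None = no failure)."""
    worst = {l: (u, None) for l, u in utilisation(links, demands).items()}
    for failed in links:
        remaining = {l: v for l, v in links.items() if l != failed}
        for link, util in utilisation(remaining, demands).items():
            if util > worst[link][0]:
                worst[link] = (util, failed)
    return worst


for link, (util, cause) in worst_case(LINKS, DEMANDS).items():
    print(link, f"{util:.0%}", "worst when failing", cause)
```

In this toy network every link is below 100 percent in the normal state, yet two links exceed it in their worst case. That information is not visible from current link utilisations alone; it needs the topology and the end-to-end traffic matrix.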
And the last one is growth, especially for new customers but also just your growth over time, which Clarence talked about. For example, you want to know what happens if you add 4 gigabits of customer traffic. You need to have an idea of its distribution. Your potential customer is not going to tell you where the traffic is going to go; you need to make an assumption based on your existing traffic and distribute the customer's traffic in the same way.
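One way to read "distribute the customer's traffic in the same way" is a proportional split over the destinations the ingress site already sends to. A minimal sketch under that assumption; the site names and volumes are hypothetical.

```python
def add_customer_traffic(matrix, ingress, extra_gbps):
    """Spread extra traffic entering at `ingress` over destinations in
    proportion to what `ingress` already sends to each of them."""
    existing = {dst: v for (src, dst), v in matrix.items() if src == ingress}
    total = sum(existing.values())
    if total == 0:
        raise ValueError("no existing traffic at this ingress to copy from")
    grown = dict(matrix)
    for dst, volume in existing.items():
        grown[(ingress, dst)] = volume + extra_gbps * volume / total
    return grown


# Hypothetical traffic matrix in Gbit/s
MATRIX = {("FRA", "HAM"): 2.0, ("FRA", "BER"): 6.0, ("HAM", "BER"): 1.0}
grown = add_customer_traffic(MATRIX, "FRA", 4.0)
print(grown)
```

The grown matrix can then be fed into the same failure simulation as the current one to see which links the new customer would push over their limit.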
That was just a very quick overview of the exercises you can do to plan your network using your traffic matrices. Now going into the very much related topic of optimisation. Network engineering and traffic engineering are basically the same thing. When doing your network engineering, you actually know what your traffic is, or you have an idea what it's going to be, and you build the network to meet the needs of the traffic. The other one is the other way around: you tune your traffic to fit the network. So the better you do the network engineering, the less traffic engineering you need. And of course, although there is a lot you can optimise, all the steps we have discussed so far are still very relevant.
Now there are lots of questions and I don't have all the answers here in the remaining time of this session, but you need to first figure out what you are optimising for: do I want to lower my delay, lower my costs, make my resilience optimal? There are many things you can optimise for. And if you have multiple objectives they might even exclude each other. Low cost and high resilience already start biting each other. Which approach? You can start tuning your IGP metrics or go the MPLS TE route. Strategic or tactical: do I want to plan ahead and prevent problems occurring, or wait until it goes wrong and figure out what to do, so tactical? How often do I re-evaluate and come up with new traffic engineering? If you go the MPLS TE route, there are more questions: how do I create my mesh, do I use dynamic tunnels or statically routed tunnels? If I use dynamic tunnels, how do I size them, online or offline optimisation? And there is traffic sloshing, which I will explain later.
When I was, this morning, in the session about complexity, I suddenly realised why I was talking about all this MPLS stuff: it does not fit in the picture of simplicity. You already see how many questions come up, and maybe the goal of this presentation is to show what kind of complexity there is and what simpler solutions we might be able to choose.
So I am going to do this relatively quickly, because this is just an explanation of what traffic engineering is and I think that is known to the people here.
So in conventional routing it's done hop by hop, which is very resilient, but all the traffic to a common destination takes the same path: it comes from R1 and R2, but once at R3 it will follow the same path. It doesn't take utilisations into account, so you have to follow the links and see what happens. The question is, what do we optimise for, what is cost-effective? Also, there is a nice algorithm which is called maximum flow, and that tells you what the theoretical maximum is of traffic you can get on the network. So all these options for traffic engineering you can very nicely compare against the solution of maximum flow, so you know you will never get more than that percentage of traffic on the network.
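To make the idea of a theoretical upper bound concrete, here is a small sketch of the classic single-source, single-sink maximum flow computation (Edmonds-Karp). It is a simplification: comparing against a whole traffic matrix is a multi-commodity problem, which this does not solve. The topology and capacities are invented.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until
    none is left. `capacity` maps directed (u, v) pairs to capacity."""
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = {}
    for (u, v), cap in capacity.items():
        residual.setdefault(u, {}).setdefault(v, 0)
        residual.setdefault(v, {}).setdefault(u, 0)
        residual[u][v] += cap
    total = 0
    while True:
        # Breadth-first search for a path with spare residual capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual.get(u, {}).items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck on that path, then update the residuals
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical directed capacities in Gbit/s
CAPACITY = {
    ("S", "A"): 10, ("A", "T"): 7, ("A", "B"): 5,
    ("S", "B"): 4, ("B", "T"): 6,
}
print(max_flow(CAPACITY, "S", "T"))
```

Here the two links into T limit the result to 13, whatever the routing scheme. Shortest-path routing on a single path would carry less, and the gap between the two is what traffic engineering can try to recover.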
So the traffic engineering we can split into two kinds of objectives. You can say, I just optimise for the normal state of my network, and if there is a failure, then I will figure out what to do later. That is what we call just minimising the maximum utilisation for the normal working network. Or you can optimise for failures as well. This is more resilient but will take more capacity on the network. There is a big difference between them. If we look at this here, we have an OC-12 versus an OC-3 link, and in the normal case that is no problem, I make sure all my traffic goes on the big link. But if the big link fails, all the traffic falls back on the 155 megabit link and I have a problem. So it really makes a big difference whether I choose to optimise for the normal case or take all the failures into account as well.
This is the really bad news. Traffic engineering does not create any capacity. If your network is full, it's full; you can tune all you like, it will still not help. Especially on a ring. If you have a ring and one side breaks, traffic needs to go the other side and traffic engineering is not going to help you at all. You can always run into examples like that; full is full and you need to upgrade. That still applies.
One of the choices we had is IGP metric tuning versus MPLS traffic engineering. A lot of people don't believe that tuning your IGP metrics will help. I can change metrics; like in this example, all the traffic was going here, so I increased the metric to 3 and all the traffic goes on the other path. So I can solve a problem, but it will always create the same problem somewhere else in the network. And this is my experience too: if you try this manually in your network this will always be the case. You create problems somewhere else. One of the things that you maybe don't realise, and which is hard to do manually, is that a very powerful mechanism for routing in a network is ECMP. You can split your traffic, and if you set your metrics in a smart way so that it splits in many places, you get a very nice distribution and a resilient topology. As we will see later on, this brings all kinds of good stuff at no cost: there is no protocol or equipment or more complexity.
A lot of research papers on this, if you want to read more about it. A lot of different approaches to the same problem. In the end, it's finding better metrics and there are multiple ways of doing so.
Example, again the same network. I am going to cover this quickly. This is a network with just shortest delay routing, so all the traffic is on the links with the shortest delay, and based on those IGP metrics there is a lot of congestion in this area here. This is an example: all the traffic in the north congests the link over here. Even with no failure it is congested. This is with a given topology and a given traffic matrix; it is not a real network, there you would not have congestion in the normal case. You start tuning your metrics: here is the problem, you increase the metric, and yes, it works, the traffic shifts to the other side, and it creates exactly the same problem there. This is an example of just shifting your traffic around manually and not solving the problem.
Looking at all the failures, again this is the same picture as we saw earlier, it looks like you need to do a lot of upgrading, and the question is, could I have prevented this by changing my IGP metrics? So we run one of the algorithms, and what I get is a network that even under any possible circuit failure does not have congestion on it, based on this topology and the given forecasted traffic matrix. And you immediately see, if I select one traffic demand from the west coast to the east coast, that it is using ECMP, so there is a lot of load balancing on the network, making more efficient use of the resources that way.
There is no guarantee that this always works and again, you cannot create capacity; it can just make better use of the meshing in your network. These are some studies we did on real topologies from real operators. 100 percent is the maximum flow algorithm I talked about, and green are the optimised metrics, which get you somewhere between 80, 90 and 95 percent of what you could achieve. Doing explicit primary and secondary paths in MPLS gives you better coverage, because you have so much control over the traffic: you can explicitly route every demand on the network. It has an operational cost but it is a little bit more efficient.
Now, I have a lot of slides about MPLS, I still have time for that. A lot of considerations about MPLS, and thinking back about the session this morning about complexity, I think, looking back at my own slides, mostly I am explaining here why MPLS traffic engineering is so complex. Also the sense I got this morning is that a lot of people here agree that all this complexity is probably not necessary and we need to go back to simpler networks. But I think there are also a lot of newer IP networks, specifically at the mobile operators and not so much the Internet carriers, that maybe don't have this in-house, or the history, and maybe they don't come to RIPE meetings. And if their vendor or somebody else tells them that with MPLS traffic engineering you can create all these circuits again that you had on your previous network, they start building it, not realising all the different options that there are. So the next slides are really to give my view on some of the trouble you run into. For a lot of things, I actually don't have a solution, either.
So again, the choices are dynamic paths, where you let the network figure out how to route them, or explicit routes. For the dynamic paths, you need to specify the bandwidth and the network will figure out what to do. The problem is, and Clarence explained it as well, it's non-deterministic: you don't know beforehand what is going to happen. If you have multiple tunnels that compete for the same link you don't know which is going to win. You might play with priorities and things like that but it's not predictable what is going to happen. And each router only has a view from its own perspective, there is no global view of the network, so it's not optimal from a global perspective. On the positive side, it's very resilient, it can deal with any failure and adapt again to what is happening.
The static paths are very deterministic, so it looks like leased lines, circuits or optical circuits, whatever, but it's a lot of computation and it's very difficult to deploy, because if you have thousands of tunnels in the network and you need to maintain an explicit primary and secondary path for each, that is going to take a lot of resources, and you need a good management system to be able to do that, if at all possible.
For the dynamic solution it looks very simple: in theory you assign a bandwidth to an LSP and it will find a path that has that bandwidth available. Simple. Problem solved. But the question is, what is the bandwidth for an LSP? I mean, if you look even at your links you see the traffic going up and down, and if you split that into very small parts, every one carries very little traffic with a lot of fluctuation, say a peak of 10 megabits. How do you size it? Do you set it to 10 megabits? That is fine, but most of the day you will be over-provisioning. So it's a very difficult question how to size these tunnels. There are solutions for that. Let me see...
You can do online sizing. Let the router figure out how to set the bandwidths. That is an option. I mean, I think all the router vendors have a feature for that, where the router measures the traffic and sets the bandwidth it needs to reserve based on that traffic. But you have two options. Either you update very often to keep up to date with the traffic, but then all your 10,000 tunnels are going to be resized and rerouted all day long, so you get a lot of complexity on your network. Or you wait a while, but then the traffic might have gone up and your LSP has not been resized. This is one study we did, where this is the actual traffic on the network and these are the reservations based on the router tracking the traffic. What happens is, if you want to make sure that your reservations are always higher than or equal to the traffic, you need to reserve much more than the traffic you see, because you need to be a little bit ahead of the growth. During the day the traffic goes up, but you need to provision so much more that you overshoot dramatically. I know there are solutions being worked out to actually automatically adapt to these things and do it in a more sensitive way, so maybe that will work, but it still is an issue of concern.
Offline sizing is equally complex, as I said earlier. What do you set it to? You see traffic over a day. How much do you reserve?
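The lag and overshoot described here can be reproduced with a deliberately simplified model of online sizing. This is not any vendor's auto-bandwidth algorithm: it assumes the reservation is reset at fixed intervals to the peak seen in the previous interval, multiplied by a headroom factor, and the traffic samples are invented.

```python
def auto_bandwidth(samples, adjust_every, headroom=1.0):
    """Return the reservation in force while each sample is measured."""
    reservations = []
    current = samples[0] * headroom  # assumed initial reservation
    peak = 0.0
    for i, sample in enumerate(samples, start=1):
        reservations.append(current)
        peak = max(peak, sample)
        if i % adjust_every == 0:
            current = peak * headroom  # resize at the end of the interval
            peak = 0.0
    return reservations


def shortfalls(samples, reservations):
    """Number of samples where traffic exceeded the reservation."""
    return sum(1 for s, r in zip(samples, reservations) if s > r)


def overprovision(samples, reservations):
    """Total reserved bandwidth relative to total traffic."""
    return sum(reservations) / sum(samples)


# Hypothetical LSP traffic in Mbit/s, growing during the day
TRAFFIC = [10, 12, 15, 19, 24, 30, 37, 45]

tracking = auto_bandwidth(TRAFFIC, adjust_every=4)
print(tracking, shortfalls(TRAFFIC, tracking))

padded = auto_bandwidth(TRAFFIC, adjust_every=4, headroom=2.4)
print(padded, shortfalls(TRAFFIC, padded), overprovision(TRAFFIC, padded))
```

Without headroom the reservation trails the rising traffic in almost every sample. Adding enough headroom to stay ahead removes the shortfalls but reserves well over the traffic actually carried, which is the trade-off the study in the talk points at.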
We have basically discussed this.
Then, the deployment of the tunnel mesh. If you have a very large network, your edges might not be able to do MPLS traffic engineering; smaller devices do not have the capability or power to deal with many tunnels. Or if you create a full mesh between them, with 1,000 devices you need one million tunnels between them and that is a large number for the core to deal with. So what a lot of people do is start the traffic engineering in the core. Obviously on those two uplinks you don't need any traffic engineering, because you need to have 50% spare capacity anyhow. You do your traffic engineering in the core, and in a very typical deployment you have two core routers, your edge feeding traffic into both, and you do traffic engineering on your core network. It looks like a very good solution and some deployments use this. But there is one very unfortunate result of this, which I think a lot of people overlook, and that is what we call sloshing. So again, the same network: here is your edge router and it homes into two core routers, A and B, and they nicely have a tunnel from A to E and from B to F, so traffic can go out here again, and the tunnels will route over the network and make sure there is enough capacity. Traffic goes to router A and then goes into tunnel 1, and that has the appropriate reservation for that traffic.
But now a failure happens in the network, so the link over here fails. The tunnel automatically reroutes because it's a dynamic tunnel; it will nicely find a good path again for that traffic. But the issue is, the traffic from the edge router is not going to router A any more, because it sees the link failure here, thinks the other way is now the shortest and sends the traffic to B. It doesn't have any knowledge of the tunnels; it sends the traffic to another router, and that one didn't expect it. So the reservation on the second tunnel is too low, it was not expecting this traffic. So this tunnel has rerouted and has the right reservation but not the traffic. This one is on the old path with no reservation for it, but it has the traffic. This is a very difficult problem to get around. One solution is to use forwarding adjacencies to advertise your tunnels into the IGP, but that has a big scalability problem, because if you have a mesh of tens of thousands of tunnels you inject a lot of data into your IGP. Another one could be to separate this from the routing plane and do load balancing over those two links into the core. That is a feasible solution. So it's something to take into account when doing MPLS traffic engineering.
Comparing these different approaches on a real network: this actually is a presentation I gave myself at TERENA in 2004. The network has a lot of meshing, which was not so much a design decision but more a result of years of hardly being able to keep up with traffic growth and doing whatever was possible. Otherwise there probably would have been a little bit nicer design than this one. And that network, at that time, was based on MPLS traffic engineering. So this is an ideal case: we have the topology and the traffic and we can simply compare the options I just discussed.
Earlier, I talked about optimising the highest utilised link in the network, but that is only one data point. If there is one link in your network that is at 90 percent but the rest is at 10, then you have a good network, so it's unfair. I look at the highest loaded 200 links in the network, which gives you a much broader picture of how the network is doing, and I look at the utilisation; 100 percent is obviously full. If I use delay based metrics, basically ignoring capacity and putting everything on the shortest path, that gives you a lot of congestion. That is a little unfair, because the network was not built to have these metrics. Then in black is CSPF, so that is dynamic tunnels, and it works perfectly well. You see a few links are at 100 percent because that is what MPLS TE does: if there is no capacity it starts rerouting to other links. You see a few at 100 and then it goes nicely down to 50. Blue are the optimised metrics: on the same network we optimise the IGP metrics, and that gets the utilisation down to about 80% and the rest even lower. This is the normal case, so no failures. Either solution does its job nicely.
Then we look at failures. And again, I will ignore the delay based metrics because that does not make sense. You see one link a little bit over 100 percent that probably has to do with the sloshing I just discussed. You see a lot more links at 100 percent, that is because again MPLS fills them completely and the optimised metrics still manage to get the utilisation under 100 percent. So this proves that optimising the metrics for any possible link failure was able to get any link under 100 percent.
This is a similar study, this was presented by Martin Horneffer from Deutsche Telekom, and it's similar data but in a different format now. So here you have two axes: one is the normal utilisation and one is the worst case, so again the worst case is under any possible link failure in the network, and now we only look at the highest utilised link in the network. If we start with default metrics, the network was not built for that, so that doesn't work. If we look at the theoretical maximum flow algorithm, that is the green one over here, so in the normal case I cannot get the network under 50 percent, and not under about 90 percent in failure. And then we have the different options. Dynamic MPLS tunnels work perfectly well for the normal case. 100 percent. Due to the sloshing it does not work for the failure case, because this was a core deployment. If we look at explicit primaries and secondaries, completely defined over the network, that does extremely well. It almost gets you to the theoretical optimum, but at the cost of something that I personally think is almost impossible to maintain in your network. And again the metric based approach: we had a few different options, you can focus on the normal or the worst case, and somewhere in the middle you find a point where for both the normal case and failures it manages to get your network under 100 percent. You are very close. At some point you will need to upgrade the links again in the network. But it makes more efficient use of it.
That is similar study so I am going to skip that one.
So, also to give you my perspective instead of just summarising all the options: if you have some amount of meshing, or the ability to do some amount, it can really save you costs and it doesn't increase the risk of the network. You can do this in a very safe way. For changing your metrics you need a tool; there are multiple tools to do this, or you can write your own tool. It's simple to deploy, there is no protocol involved, it doesn't cost you anything apart from the software and it's not adding any complexity to the network. There are a few requirements: it cannot deal very well with having an OC-3 in parallel with a 40 gig link. That doesn't really work well with metrics, but if you stay away from that it really gets you a lot.
On the MPLS TE side, dynamic tunnels are very resilient; they can deal with any situation. They do a good job there. But it's difficult to come up with a good mesh and good tunnel sizing, and it's non-deterministic.
Explicit tunnels: Extremely efficient but it's so hard to deploy, I don't think it's a good idea to do and I don't think anybody is doing that on a large scale any more as far as I know.
I will give it to Clarence for a few notes.
CLARENCE: OK, why are we talking about LFA FRR? Because in the first slide the objective is that we want to enforce the SLA, and we saw that to enforce the SLA we basically need to enforce an overprovisioning factor frequently enough, but we missed one impact on the SLA: the SLA can be failed due to rerouting conditions. During those conditions you have loss of connectivity until you are reoptimised on the new IGP path, except if you have a fast reroute backup in the meantime. The availability of a fast reroute backup is a parameter in your capacity planning process. And we touched on this with the backup routing policy that we mentioned in the introduction.
Here, we are going to zoom into one of the different fast reroute approaches to achieve this reduced loss of connectivity during the transient processing of a failure. And the one we focus on is LFA fast reroute. Why focus on this one? Specifically because on one side it is simple, it doesn't add anything to your network, it uses the IGP, but on the other side it depends on the topology whether a backup exists or not. And so, the planning process will be key to make sure that you have the level of backup that you seek.
So, what is the per-prefix LFA algorithm? I am S and I would like to protect the link to F. The drawing shows visually that the shortest path to two destinations, D1 and D2, is via the link SF, so if it fails I will have loss of connectivity to D1 and D2, and my SLA might be impacted. If the availability of the service for the class of service that I am planning for the destinations D1 and D2 is very tight, I may want to have a backup. What kind of backup could I have in this case? The IGP can, itself, automatically compute that in order to protect D1 when the link SF fails, S can simply push the packets to C. Indeed, the shortest path from C to D1 visually is not coming back to S, so that is a loop-free alternate path for D1. And the loop-free alternate path for D2 is for S to push the traffic to E when the link SF fails. This depends on the topology, because automatically the router S is going to compute the shortest paths of its neighbours C and E for each destination, and it will determine whether it can use this neighbour as a loop-free alternate for this destination.
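The check that S performs for each neighbour can be sketched with the basic loop-free condition from RFC 5286: a neighbour N is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The topology and metrics below are hypothetical, chosen only to mirror the S, F, C, E, D1, D2 example; the function names are illustrative.

```python
import heapq


def build_adj(links):
    """Turn undirected (a, b) -> metric entries into an adjacency map."""
    adj = {}
    for (a, b), metric in links.items():
        adj.setdefault(a, {})[b] = metric
        adj.setdefault(b, {})[a] = metric
    return adj


def dijkstra(adj, src):
    """Shortest distance from src to every reachable node."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue
        for nxt, metric in adj[node].items():
            if d + metric < dist.get(nxt, float("inf")):
                dist[nxt] = d + metric
                heapq.heappush(heap, (d + metric, nxt))
    return dist


def link_protecting_lfas(adj, s, primary, dest):
    """Neighbours of s, other than the primary next hop, whose own
    shortest path to dest does not come back through s."""
    dist_s = dijkstra(adj, s)
    alternates = []
    for n in adj[s]:
        if n == primary:
            continue
        dist_n = dijkstra(adj, n)
        if dist_n[dest] < dist_n[s] + dist_s[dest]:
            alternates.append(n)
    return alternates


# Hypothetical metrics; S reaches both D1 and D2 via F in the normal case
ADJ = build_adj({
    ("S", "F"): 1, ("F", "D1"): 1, ("F", "D2"): 1,
    ("S", "C"): 2, ("C", "D1"): 2,
    ("S", "E"): 2, ("E", "D2"): 2,
})
print(link_protecting_lfas(ADJ, "S", "F", "D1"))
print(link_protecting_lfas(ADJ, "S", "F", "D2"))
```

With these metrics C qualifies for D1 and E for D2, but not the other way round, which shows how the result is decided per prefix and purely by topology and metrics.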
The benefits: It's very simple, and that is the major reason why so many designers are looking at LFA, specifically at the edge of the network, which is the biggest part of the network today. It's a 50 millisecond protection because it's pre-computed, it's extremely deployment friendly because there is no IETF protocol change, and you can deploy incrementally. Very good scaling, no degradation of the IGP convergence, and it provides node and link protection in one shot. I am going to focus on the planning aspect.
There is one big issue: it is topology dependent. Is your neighbour able to give you a loop-free alternate path, one that doesn't come back to you? It depends on the topology. So when you are looking at the applicability of LFA for your network, I think there is this logical tree, where you have three cases. The first case is that you do all your planning for the availability of your SLAs on the basis of IGP convergence. If LFA is available, great, it's 50 milliseconds; if it's not available, no worry, you were anyway planning for sub-second, so you will have sub-second. So there it's a bonus without any cost. It's the easiest way to use it.
The other approach is to say, no, from the SLA definition of my tightest class I need 50 milliseconds. Not all the time, never, nobody needs this; remember, it's always an availability. But I need 50 milliseconds with high probability, so for most of my link or node failures. And there are really two camps among the designers at the service providers I work with. On one side are those who can manage the topology to ensure LFA, so they pick the technology and that is fine. Others decide not to manage the topology for LFA, and for these it's better not to try to use LFA, because it's not going to be magical; it will simply not work if you do not optimise the topology for it.
And then finally, there is another aspect: the application of LFA in the edge, in the POP or in newly built access aggregation. There, there is a high applicability; it's the sweet spot. Why? Because the topology in these places at the edge or in access aggregation is very, very good for LFA.
So, in the backbone, two camps. Some networks will give a very good LFA applicability. You can see this with the red bars: here we have eleven different service provider topologies, and for five we see the probability to have an LFA backup for any link failure is above 95 percent, and for some others it might be around 75, 69 percent, where the topology was not optimised. So it's not high, it's not 80, 90, but it's not bad. The sweet spot is in the access aggregation topology. It's now a Working Group draft in the RTG Working Group. It explains why LFA is such a sweet spot technology for the edge: because you will have these types of topologies. We call this one the triangle, or the aggregation routers are full mesh into the core routers, or they can be connected with a square, and in these three topologies, plus another one that is explained in the draft, if you follow a few basic rules you are going to have 100 percent link protection without any micro-loop and 99 percent node protection, and you can repeat these 100 times; the properties are kept whatever the topology of the backbone. So the benefit of this is huge, but only if you follow these few rules. What happens if you do not follow these rules? If you do not follow these rules you need a planning tool. You need a planning tool to compute what is going to happen with your topology: will you have LFA or not? So one part of the analysis is to see what my coverage is, how much backup do I have? And the second phase is really like bandwidth capacity planning: you are going to optimise your topology to increase the amount of backup you are going to have for the different types of failures you are going to plan for.
Giving it back to, I think, yes, giving it back to Thomas. He is going to wrap this up with a final example, both taking into account the bandwidth capacity planning process, to which we dedicated 90 percent of the time, and taking into account the backup planning process and engineering.
Thomas: So I will use the last five minutes for this example. It actually was a real network that was built, but obviously I cannot use the data from somebody else's network to present here at a conference, so I rebuilt the whole network, not exactly the same, in Germany, just to pick a country to make it somewhat real. It is a network with all the same things I have seen in the one that we were doing, so I can actually show it here and show you what the considerations were. So it was a mobile backbone network; I projected it on Germany but where it was doesn't matter. It's using IP over optical links, so 10 gig or 40 gig links. There was a projected traffic matrix based on what was expected or what was probably already known from the existing network, and the objectives were: make it cost-effective, and make it very low delay, because apparently they care about delays. Well, obviously they care, but they measure the delay and they think it's really bad if you have one millisecond more. I think it's not true. This was the objective, and there was a request for LFA coverage. We will see that these things don't really agree with each other all the time; at the end you choose which you prefer. Again, the topology was IP over optical; there were six core sites and many more that only had PEs as access routers.
The design rules: because we need to minimise the delay, all the circuits on the optical layer, so the IP circuits that are wavelengths, need to be on the shortest delay path. We don't want to have any unnecessary delay in the network. We try to make 1:N protection. The PEs that do not have core routers are homed into the two closest core sites, but it needs to be diverse on the optical layer.
On the IP layer, there are two core routers in the core sites and two PE routers in all the other sites as well. There are edge routers that represent the traffic that is behind the PEs, all the media gateways, Internet access, whatever. To get the lowest delay we set the IGP metrics equal to the delay on the optical layer, because we know how the optical circuits are routed, and we play with the IGP metrics according to the draft just mentioned.
Here you see the picture of the optical backbone. It doesn't exist, I just made this up, but it's somewhat real. These are all the optical links, and this is more schematic than geographical. You see also the numbers in there; they are metrics that are 10 times the delay. So 19 means there is a 1.9 millisecond delay between Hamburg and Berlin. We know the fibre is right.
The core sites were decided on; those are all the sites where you see A and Z, such as Hamburg, Berlin and Frankfurt. We route all the IP circuits over the lowest delay optical path we can get, and you get this layout. You see a ring here, but here you already see a problem occurring. The link from Dusseldorf to Frankfurt and on to Munich sits on the same optical link as the link from Dusseldorf to Stuttgart. This is the lowest delay, but it seems very inefficient from a utilisation and cost perspective, because if you build the IP topology out of this, it looks like this: every brown arrow over here becomes a link on the IP layer, so it's a wavelength here and becomes an IP link over here. You see the topology. I already put the expected traffic matrix on it, but now if this optical link goes down I lose both links on the IP layer. So the meshing you have here, because you have one, two, three links between the two sites, is gone, because on the optical layer it's not diverse. And you actually get congestion here based on the traffic matrix.
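The shared-risk problem described here can be checked mechanically once each IP link is mapped to the optical segments its wavelength rides on. The following is a minimal sketch of that check, not the tool used in the talk; the link names and the optical routing in `ip_link_paths` are invented for illustration and do not reproduce the slide.

```python
from collections import defaultdict

# Hypothetical mapping: IP link -> optical segments its wavelength rides on.
# The routing below is made up for illustration only.
ip_link_paths = {
    "Dusseldorf-Frankfurt": [("Dusseldorf", "Frankfurt")],
    "Dusseldorf-Stuttgart": [("Dusseldorf", "Frankfurt"), ("Frankfurt", "Stuttgart")],
    "Dusseldorf-Hamburg": [("Dusseldorf", "Hamburg")],
}


def shared_risk_groups(paths):
    """Return optical segments that carry more than one IP link."""
    by_segment = defaultdict(set)
    for ip_link, segments in paths.items():
        for a, b in segments:
            # Treat optical segments as undirected.
            by_segment[tuple(sorted((a, b)))].add(ip_link)
    return {seg: sorted(links) for seg, links in by_segment.items() if len(links) > 1}


risks = shared_risk_groups(ip_link_paths)
for segment, links in risks.items():
    print(segment, "carries", links)
```

Any segment reported here is a single optical failure that removes several IP links at once, which is exactly the case where apparent meshing on the IP layer gives no real diversity.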
Luckily it's easy to solve without adding too much delay. One of the links we reroute this way to Frankfurt, and the other we move away from Frankfurt, so for the Dusseldorf link we make sure it doesn't go via Frankfurt. That way we have a diverse design, and you can see the delays: you can add them up, but the difference is fairly small.
The next step was adding the remote sites with the PEs. We have Kiel; the closest one is Hamburg, at 0.7 milliseconds. There is no overlap, you have two redundant links and you connect them, no problem.
Bonn is a little bit harder. You start with the closest one, Dusseldorf; that is your primary path, which most of your traffic is on. The second closest is Frankfurt, but the optical wavelength would sit on the same link as the one to Dusseldorf, so if that link goes down I lose my complete site. I could reroute the circuit to Frankfurt, but then that is not the closest core site any more, so Bonn becomes dual homed into Dusseldorf and Stuttgart. You stay very close to your design objectives.
This is the network you get following those rules. Now the IGP metrics show you the delay times a hundred, so 5.9 milliseconds between Munich and Berlin. I put the projected traffic on it, which was not measured but given to us, and if I now look at all the failures, the highest utilisation I get is 90 percent on these 10 gig links. With the two circuits on the same optical link, the highest utilisation was 110 percent, so I get at least a 20 percent reduction in worst-case utilisation, simply by making the design diverse and making use of the meshing.
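The "look at all the failures" step can be sketched as a small simulation: route every demand on the shortest IGP path, fail each link in turn, and record the highest link utilisation seen. This is a simplified illustration under stated assumptions, not the planning tool used for the talk: the three-node topology, metrics and demands are invented, only single-link failures are considered, and each demand follows one shortest path (no ECMP).

```python
import heapq
from collections import defaultdict


def shortest_path(links, src, dst):
    """Dijkstra over undirected links {(a, b): metric}; returns a node list or None."""
    adj = defaultdict(list)
    for (a, b), metric in links.items():
        adj[a].append((b, metric))
        adj[b].append((a, metric))
    dist, prev, heap = {src: 0}, {}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, metric in adj[u]:
            if d + metric < dist.get(v, float("inf")):
                dist[v], prev[v] = d + metric, u
                heapq.heappush(heap, (d + metric, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def max_utilisation(links, capacity_gbps, demands):
    """Highest per-direction link utilisation for the given demands."""
    load = defaultdict(float)
    for (src, dst), gbps in demands.items():
        path = shortest_path(links, src, dst)
        if path is None:
            continue  # demand is disconnected in this failure state
        for hop in zip(path, path[1:]):
            load[hop] += gbps
    return max(load.values(), default=0.0) / capacity_gbps


def worst_case(links, capacity_gbps, demands):
    """Worst utilisation over the no-failure case and every single-link failure."""
    worst = max_utilisation(links, capacity_gbps, demands)
    for failed in links:
        remaining = {l: m for l, m in links.items() if l != failed}
        worst = max(worst, max_utilisation(remaining, capacity_gbps, demands))
    return worst


# Toy example: three sites in a triangle, 10 Gb/s links, demands in Gb/s.
links = {("A", "B"): 10, ("B", "C"): 10, ("A", "C"): 10}
demands = {("A", "B"): 4, ("A", "C"): 5}
print("no failure:", max_utilisation(links, 10, demands))
print("worst case:", worst_case(links, 10, demands))
```

To model an optical failure that removes several IP links at once, as in the non-diverse design, the loop in `worst_case` would remove a group of links per iteration instead of a single one.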
The last one is actually applying the LFA algorithms, so we can check for all the links whether it's possible to find an LFA to protect the circuit for all the traffic, for some of it, or maybe for no traffic. Green means I can protect everything, so there is an LFA for every prefix. Yellow means I can protect some of the prefixes but not others. And red means I cannot protect anything.
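The green, yellow and red classification can be sketched with the basic loop-free condition from RFC 5286: a neighbour N of router S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The sketch below is an illustration under assumptions, not the algorithm of any particular tool: destinations are nodes standing in for prefixes, only link protection is checked, and the six-node topology and metrics are invented.

```python
from itertools import product

INF = float("inf")


def all_pairs(nodes, links):
    """Floyd-Warshall shortest distances over undirected links {(a, b): metric}."""
    dist = {(a, b): (0 if a == b else INF) for a, b in product(nodes, repeat=2)}
    for (a, b), metric in links.items():
        dist[a, b] = dist[b, a] = metric
    for k, i, j in product(nodes, repeat=3):  # k is the outermost loop
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist


def classify(nodes, links):
    """Colour each directed link (S, E) by LFA coverage of the destinations using it."""
    dist = all_pairs(nodes, links)
    nbrs = {n: set() for n in nodes}
    for a, b in links:
        nbrs[a].add(b)
        nbrs[b].add(a)
    colours = {}
    for s in nodes:
        for e in nbrs[s]:
            # Destinations for which E is a primary (shortest-path) next hop of S.
            dests = [d for d in nodes
                     if d != s and dist[s, e] + dist[e, d] == dist[s, d]
                     and dist[s, e] == links.get((s, e), links.get((e, s)))]
            # A destination is protected if some other neighbour is loop-free for it.
            protected = [d for d in dests
                         if any(dist[n, d] < dist[n, s] + dist[s, d]
                                for n in nbrs[s] - {e})]
            if not dests:
                colours[s, e] = "unused"
            elif len(protected) == len(dests):
                colours[s, e] = "green"
            elif protected:
                colours[s, e] = "yellow"
            else:
                colours[s, e] = "red"
    return colours


# Toy topology: a square A-B-C-D, a triangle B-C-F, and a stub E hanging off A.
nodes = ["A", "B", "C", "D", "E", "F"]
links = {("A", "B"): 10, ("B", "C"): 10, ("C", "D"): 10, ("D", "A"): 10,
         ("A", "E"): 10, ("B", "F"): 10, ("C", "F"): 10}
colours = classify(nodes, links)
for link in [("B", "F"), ("A", "B"), ("A", "E")]:
    print(link, colours[link])
```

In this toy case the triangle link comes out green, the square link yellow and the stub link red, which illustrates why coverage depends so strongly on topology and metrics.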
The average is that 75 percent of the interfaces in the network can be protected. You see a lot of red here, but that is actually on the backbone, and this has to do with the topology. I will show you later why this happens. That is what Clarence explained: it's very topology dependent, and we can see if we can change the routing to make this better, but also see how this impacts our other objectives.
This really is an application of the draft, and I highly suggest reading it if you are interested. You build your site topology: you have two core routers, two aggregation routers, and this is all your edge traffic. I set the metrics here very high, lower here, and lowest over here, and it's all specified, and you see you get a nice green square and a triangle. So if you follow these rules, which actually are mostly common sense because you probably would have done it this way anyway, you get perfect LFA coverage for all your POPs.
So the problem sits in the topology of the backbone, and has a bit to do with the fact that this metric here is extremely low; it has a metric of one. That means that for traffic going to Stuttgart, if I would send it to the other core router in Dusseldorf, it will send it back to me. If you choose different routing with separate A and B planes, so you make that metric extremely high, then you suddenly have 100 percent coverage.
The last thing I tried is to run metric optimisation on this network, just to see what it would do with LFA coverage. Again, metric optimisation introduces a lot of load balancing and ECMP, which is good. If I do that, I reroute things off the shortest paths, so my delay goes up a little bit; it goes up by only 0.2 milliseconds on average, not that much, and a lot of the red links become yellow, so the percentage of total coverage has gone up. A lot of the things we discussed come together and you can see how you apply them. If you have the right data and tools it's not too hard to put it together.
So, with the three of us, we tried to give you a picture of how you can put all these things together. For the router vendors, it is to make sure all the data is there and that they properly support it, so you can get the link-state database from the router, and later on (we skipped that part) also to provide a better view of it, because right now we all do this manually. There are tools to collect this data, to create a traffic matrix out of it, and to simulate. This is the simple stuff: once you have all the data you need, you see what happens, and you can see if you can optimise or change or apply the rules from the draft, etc., and put it all together. The last step is for people who think this is relevant to try it themselves and use all the references we have in here as well. So that was the whole story together. If there are any questions for Paola Lucente or me, feel free to ask.
AUDIENCE SPEAKER: Google. Does packet size distribution matter for a capacity planning exercise?
Thomas: The packet size distribution might affect QoS in some way, but I don't immediately see how it would affect capacity planning.
AUDIENCE SPEAKER: What about per-packet overhead?
Thomas: That is more application planning. I mean, on the network there is some overhead from the encapsulation on the SONET layer or whatever.
AUDIENCE SPEAKER: If you look at Ethernet, at the inter-frame gap in the Ethernet standard, and then at specific implementations, you will realise that, as an example, not everything is always counted: the inter-frame gap and preamble are not. So actually it does matter, and it can matter up to 10 percent for the assessment of your capacity planning.
Thomas: Good point. It does come back in the exercise for capacity planning: there is a percentage you feel safe filling your links to, which is based on experiments and real data that will have taken those things into account. That gap tells us how much there is between your five-minute measurements and when the link is really full. So what you mention sits in there, not in an explicit way but simply because of the simulations you do and the research on that. But I think it's a very good point to include as well.
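The size of the effect raised in this exchange can be estimated with a few lines of arithmetic. Standard Ethernet puts 20 bytes on the wire around every frame (7 bytes of preamble, 1 byte of start-of-frame delimiter and a minimum inter-frame gap of 12 byte times). The sketch below assumes, as the audience speaker suggests some implementations do, that the interface counters exclude these 20 bytes; whether that holds has to be checked per platform, and the traffic figures are invented.

```python
# Preamble (7) + start-of-frame delimiter (1) + minimum inter-frame gap (12).
PER_FRAME_OVERHEAD_BYTES = 20


def uncounted_share(avg_frame_bytes):
    """Fraction of wire capacity used by bytes the counters do not see."""
    return PER_FRAME_OVERHEAD_BYTES / (avg_frame_bytes + PER_FRAME_OVERHEAD_BYTES)


def wire_utilisation(counted_bps, avg_frame_bytes, line_rate_bps):
    """Utilisation on the wire, given a counter-based rate that excludes the overhead."""
    frames_per_second = counted_bps / (8 * avg_frame_bytes)
    wire_bps = counted_bps + frames_per_second * PER_FRAME_OVERHEAD_BYTES * 8
    return wire_bps / line_rate_bps


# Hypothetical link: counters report 8 Gb/s on a 10 Gb/s port, 400-byte average frames.
print(f"counted: {8e9 / 10e9:.0%}, on the wire: {wire_utilisation(8e9, 400, 10e9):.0%}")
for size in (64, 180, 400, 1500):
    print(f"avg frame {size:>4} bytes: {uncounted_share(size):.1%} of wire bytes uncounted")
```

The smaller the average frame, the larger the uncounted share, which is why the packet size distribution can shift the safe fill level of a link.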
SPEAKER: One comment on the question. When we were preparing for this, and as we carry on with other iterations, one objective is to ease the use, because we see more and more deployment of this, more and more use, and this will increase because people have to reduce costs, so it's an important way to reduce costs. The second objective is to really trigger feedback and make sure that we hear from colleagues in the industry who have been through this and have seen things that could be simplified or improved. So if there is something that we should change or correct, that is also why we are doing it: to collect this information.
AUDIENCE SPEAKER: Can you please also clarify: you mentioned that you expect to have less than 50 milliseconds recovery time. Is that for a single tunnel, hundreds of LSPs, 10,000 LSPs?
SPEAKER: Your question is very good. We were discussing it. LFA is a fast reroute technique, so you pre-compute in advance and you pre-load the backup into the hardware, such that when there is a directly connected link failure you enable the backup. Your question is the one a practitioner asks: yes, yes, yes, but in practice how much time does it take to really kick in all this pre-computed information? I was driving this project for Cisco, so I can talk about it. It must be 50 milliseconds, and not linear: it is independent of the number of prefixes, so if you have 5,000 prefixes in your IGP it doesn't matter how many prefixes you have. So it is prefix-independent. Compare MPLS TE fast reroute: in 1999, in the first implementation that was released on the 12K, when a directly connected link failure occurred the LSPs were placed on the backup one by one, so the implementation was dependent on the number of LSPs, and it was 50 milliseconds only for a small number of tunnels. Nowadays, again talking about what I know, since 3.4 it is LSP-independent. So for what I am responsible for, it is LSP-independent and prefix-independent, but you are right, it depends on the implementation, and when you qualify this in the lab you need to verify it.
CHAIR: OK thank you very much. I think we have to stop now. But thank you for the whole session. We are now going to the coffee break. The next session starts at 4:00. It's no longer in the single room, the IPv6 Working Group will take place in this room and the DNS in the side room which is downstairs if you go to the right you will see some stairs that go up a little bit, go down those stairs, you will find the room.
LIVE CAPTIONING BY AOIFE DOWNES RPR
DOYLE COURT REPORTERS LTD, DUBLIN IRELAND.