Preamble – analogies
Imagine you have a car. A car needs maintenance, so every now and then you top up the engine oil, especially when the oil warning light comes on. In the same way, a computer sometimes asks for updates, so you install them.
Now imagine that the fresh engine oil has a side effect: you’re driving along nicely when your car suddenly changes. The steering wheel is now on the right, in front of the passenger seat, while the pedals are still on the left.
You call the car mechanic, but even he can’t use his tow truck, because it has the same problem. It seems your ordinary engine oil is somehow magic and breaks cars. But the car’s manufacturer recommended this type of oil, so how would you know better?
Can you blame your mechanic?
You can’t use your car anymore. Now imagine you have to fix it by shouting into the exhaust pipe and then performing a rain dance around the hood. Then suddenly the problem goes away. Sometimes. Some of the cars stay crooked, and you actually have to disassemble and reassemble them.
The actual issue
This is what happened to me with the current UniFi Controller 6. We upgraded earlier this week. Ubiquiti staff had recommended this version to us to solve some other issues.
But one day later, one of our customers was down with massive ARP storms: 100’000 packets per second, generated out of thin air.
After many hours, we finally noticed this:
This is not something that should ever happen, and you cannot reproduce it.
What happened is that every access point or switch was causing a storm, because each one appeared to act as a giant echo device. Like a mirror.
Some sites went down. We even caused weird packet loss at a larger local hosting company that has some of our access points in its datacenter. I only found that out because I noticed one MAC address jumping between switches: sometimes it showed up at the real server on one side, sometimes at an AP on the other.
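If you want to look for that jumping-MAC symptom without staring at switch tables for hours, something like the following could help. It is only a rough sketch, not anything from UniFi itself: it assumes you have already collected MAC table observations as (timestamp, MAC, location) tuples from your switches by whatever means you have, and the function name, the threshold and the sample data are all made up for illustration.

```python
from collections import defaultdict


def find_flapping_macs(observations, min_moves=3):
    """Return {mac: move_count} for MACs that changed location often.

    observations: iterable of (timestamp, mac, location) tuples, where
    location is any label you choose, e.g. "core-switch/port-3".
    min_moves is an arbitrary, hypothetical threshold.
    """
    last_seen = {}
    moves = defaultdict(int)
    # Process observations in time order so that moves are counted correctly.
    for _, mac, location in sorted(observations):
        mac = mac.lower()
        previous = last_seen.get(mac)
        if previous is not None and previous != location:
            moves[mac] += 1
        last_seen[mac] = location
    return {mac: count for mac, count in moves.items() if count >= min_moves}


# Hypothetical sample data: one MAC bouncing between a server port and an
# AP uplink, and one MAC that stays where it belongs.
sample = [
    (1, "aa:bb:cc:00:00:01", "core-switch/port-3"),
    (2, "aa:bb:cc:00:00:01", "ap-uplink/port-7"),
    (3, "aa:bb:cc:00:00:01", "core-switch/port-3"),
    (4, "aa:bb:cc:00:00:01", "ap-uplink/port-7"),
    (5, "aa:bb:cc:00:00:02", "core-switch/port-4"),
    (6, "aa:bb:cc:00:00:02", "core-switch/port-4"),
]

print(find_flapping_macs(sample))  # {'aa:bb:cc:00:00:01': 3}
```

A MAC that keeps moving between two places in a short time is a strong hint that something is reflecting frames back into the network, which matches the mirror behaviour described above.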
This only affects sites that have been on UniFi controllers for many years, six or more I suspect. So it is most likely a problem with upgrades on top of upgrades on top of upgrades, where the config gets corrupted. Sometimes even worse.
Look at my office:
Only nobody noticed, because everyone was out fixing issues, and I’ve been in “work from home” quarantine since March.
Let’s see what Ubiquiti has to say about this. They certainly wasted an entire day for four people here.