The Day the Core Goes Down: What Every ISP Should Have Ready

The Core Is the Heart of the ISP

The core is not just “a pair of routers.” It’s the point where everything converges:

  • the access network
  • the transport links
  • critical services
  • the Internet exit
  • authentication and management systems

When the core fails, everything fails.
And if there’s no plan, the outage quickly turns into chaos.


What Usually Happens When There’s No Preparation

When the core goes down without a prior plan, the scenario is almost always the same:

  • nobody knows exactly what failed
  • changes are tried blindly
  • configurations are changed directly in production
  • there’s no updated documentation
  • customers call before the NOC understands what’s happening

Time passes, pressure increases, and every bad decision worsens the impact.


The Right Question Is Not “If,” But “When”

Many ISPs design their network for normal operation, but not for the day things go wrong.

The right question is not:
“Can the core fail?”

The real question is:
“What happens when it fails?”

And the answer should be written, tested, and known by the team.


What Every ISP Should Have Ready

1. Real Redundancy, Not Theoretical

It’s not enough to have two devices if:

  • they’re in the same rack
  • they depend on the same switch
  • they share the same power feed
  • they run the same configuration, and nobody has validated it

Redundancy must be electrical, physical, and logical, and it must be tested.
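As a rough illustration of the point above, a simple inventory check can flag "redundant" pairs that still share a failure domain. This is only a sketch: the `CoreDevice` record, its field names, and the device, rack, and feed names are hypothetical, and a real inventory would come from your own source of truth.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CoreDevice:
    # Hypothetical inventory record; field names are assumptions for illustration.
    name: str
    rack: str
    uplink_switch: str
    power_feed: str


def shared_failure_domains(a: CoreDevice, b: CoreDevice) -> List[str]:
    """Return the attributes on which two 'redundant' devices share a failure domain."""
    shared = []
    for attr in ("rack", "uplink_switch", "power_feed"):
        if getattr(a, attr) == getattr(b, attr):
            shared.append(attr)
    return shared


# Example data (made up): two core routers that look redundant on paper.
core1 = CoreDevice("core-1", rack="R12", uplink_switch="sw-agg-1", power_feed="feed-A")
core2 = CoreDevice("core-2", rack="R12", uplink_switch="sw-agg-2", power_feed="feed-A")

print(shared_failure_domains(core1, core2))  # ['rack', 'power_feed']
```

An empty list does not prove the pair is redundant. It only means the inventory shows no obvious shared dependency, which is why the failover itself still has to be tested.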


2. Clear and Updated Documentation

In an emergency, there’s no time to “figure it out later.”

The documentation must quickly answer questions about:

  • real core topology
  • role of each device
  • critical dependencies
  • failover paths
  • emergency access

If documentation only lives in someone’s head, it’s not documentation.


3. Backups That Work (and Are Tested)

It’s not enough to “have backups.”

You need to know:

  • where they are
  • what date they’re from
  • how they’re restored
  • how long it takes to be operational again

A backup that was never tested is just an illusion of security.
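The questions above can be turned into a periodic check. The sketch below assumes a hypothetical `BackupRecord` structure and arbitrary thresholds (7 days for backup age, 90 days between restore tests); both are placeholders to adapt, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupRecord:
    # Hypothetical record; in practice this would come from your backup tooling.
    device: str
    location: str
    taken_at: datetime
    last_restore_test: Optional[datetime]  # None means a restore was never tested


def backup_problems(
    rec: BackupRecord,
    now: datetime,
    max_age: timedelta = timedelta(days=7),        # assumed threshold
    max_untested: timedelta = timedelta(days=90),  # assumed threshold
) -> List[str]:
    """List the reasons a backup should not be trusted yet."""
    problems = []
    if now - rec.taken_at > max_age:
        problems.append("stale")
    if rec.last_restore_test is None:
        problems.append("never tested")
    elif now - rec.last_restore_test > max_untested:
        problems.append("restore test overdue")
    return problems


now = datetime(2024, 6, 1)
fresh_untested = BackupRecord("core-1", "backup-host:/cfg/core-1", datetime(2024, 5, 30), None)
old_backup = BackupRecord("core-2", "backup-host:/cfg/core-2", datetime(2024, 5, 1), datetime(2024, 1, 1))

print(backup_problems(fresh_untested, now))  # ['never tested']
print(backup_problems(old_backup, now))      # ['stale', 'restore test overdue']
```

The check covers where the backup is, how old it is, and whether a restore was ever exercised. How long a restore takes can only be learned by actually performing one.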


4. Emergency Procedures

In a serious outage, the team needs clear answers:

  • who makes decisions
  • what to touch and what not to
  • in what order to act
  • when to escalate
  • when to communicate

Procedures reduce errors and lower stress in critical moments.


5. Useful Monitoring and Alerts

Monitoring shouldn’t raise its first alert after the customer has already complained.

It should:

  • detect degradations
  • anticipate failures
  • show real impact
  • allow prioritization

Poorly designed alerts generate noise and delay reaction.
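One way to detect degradation before a hard failure is to compare the latest measurement against a recent baseline, then rank open alerts by how many customers they affect. The following is a minimal sketch under assumed values: the window size, the 2x factor, the sample latencies, and the alert names and customer counts are all illustrative.

```python
from statistics import median
from typing import Sequence


def is_degraded(samples: Sequence[float], window: int = 5, factor: float = 2.0) -> bool:
    """True if the latest sample exceeds `factor` times the median of the
    `window` samples that precede it. Returns False when there is not enough history."""
    if len(samples) < window + 1:
        return False
    baseline = median(samples[-(window + 1):-1])
    return samples[-1] > factor * baseline


# Made-up latency samples in milliseconds: the link is still up, but clearly worse.
latency_ms = [4.1, 4.0, 4.3, 4.2, 4.1, 9.8]
print(is_degraded(latency_ms))  # True

# Prioritize by real impact: (alert name, affected customers), all hypothetical.
alerts = [("pop-north uplink", 120), ("core-1 to core-2", 8400), ("mgmt vlan", 15)]
by_impact = sorted(alerts, key=lambda alert: alert[1], reverse=True)
print(by_impact[0][0])  # core-1 to core-2
```

Using the median as the baseline keeps a single earlier spike from masking or triggering an alert, which helps limit noise.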


The Worst Time to Think Is During the Outage

Many ISPs start designing their plan when the core is already down.
That’s the worst possible time.

Preparation is done cold, with time and technical judgment.
Execution is done hot, following what was planned.


Going Down Is Not the Problem

Every core goes down at some point.
The problem is:

  • not knowing how to come back
  • taking longer than necessary
  • learning the lesson too late

A mature ISP is not measured by whether it fails or not, but by how it responds when it happens.


Being Prepared Is a Strategic Decision

Investing in preparation is not a technical expense; it’s a business decision that pays off in:

  • less downtime
  • fewer lost customers
  • less operational stress
  • more internal confidence

At Ayuda.LA we help ISPs prepare before the critical day, not after.

If your core works today but you’re not sure what would happen if it failed, the countdown to that day has already started.

Let’s talk before it arrives.