BGP for ISPs: the 7 most costly configuration mistakes (and how to avoid them)

BGP (the Border Gateway Protocol) is the routing protocol that holds the internet together. For an ISP, it is also the protocol whose misconfiguration causes the costliest and most visible incidents, and the ones that are hardest to explain to a customer.

Unlike a down server or a broken physical link, a BGP incident can propagate far beyond your network. In extreme cases it can affect other ISPs, end up in the trade press, and create legal exposure.

This article documents the seven BGP configuration mistakes we see most often at Latin American ISPs, their consequences, and concrete ways to avoid them.


Mistake 1: Not filtering prefixes received from customers (no prefix-lists)

The problem: Accepting every prefix a customer announces without validating that the customer is entitled to announce it.

Consequence: A customer can announce prefixes that are not theirs, whether by mistake or malice. If your network accepts and re-announces them, you are participating in a route hijack. The fallout ranges from harming third-party connectivity to landing on other operators' blocklists.

The solution: Implement an explicit prefix-list per BGP customer session, allowing only the prefixes assigned to that customer. Combine this with filters built from IRR (Internet Routing Registry) data and with RPKI for cryptographic origin validation.

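! Example: Cisco IOS
! Only this exact /24 is permitted. The implicit deny at the end of the
! prefix-list rejects everything else, including more-specifics.
ip prefix-list CLIENTE-ASN65001-IN seq 5 permit 203.0.113.0/24
!
router bgp 65000
 ! Assumption: the customer's AS is 65001, as the list name suggests
 neighbor 192.168.1.1 remote-as 65001
 neighbor 192.168.1.1 prefix-list CLIENTE-ASN65001-IN in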
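
To make the matching behavior of that filter concrete, here is a minimal Python sketch of an exact-match inbound filter. It is a model for reasoning and not router code; the `ALLOWED` list and the function name `accept_from_customer` are hypothetical and only mirror the single prefix-list entry in the example above.

```python
import ipaddress

# Hypothetical allow-list mirroring the one prefix-list entry shown above.
ALLOWED = [ipaddress.ip_network("203.0.113.0/24")]


def accept_from_customer(announced: str, allowed=ALLOWED) -> bool:
    """Exact-match filter: accept a prefix only if it equals an allowed entry.

    A more-specific such as 203.0.113.0/25 is rejected, just as a
    prefix-list entry without ge/le matches only the exact length.
    """
    prefix = ipaddress.ip_network(announced)
    return any(prefix == entry for entry in allowed)


if __name__ == "__main__":
    for candidate in ("203.0.113.0/24", "203.0.113.0/25", "198.51.100.0/24"):
        print(candidate, accept_from_customer(candidate))
```

The point of the sketch is the default: anything not explicitly listed is dropped, so a customer's typo or leak stops at your edge.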

Mistake 2: Not filtering prefixes sent to upstreams (route leaks)

The problem: Propagating routes learned from customers or peers to your upstreams without filtering them.

Consequence: A route leak can turn your network into involuntary transit between two other networks, saturating your links and potentially causing global-scale incidents. Some of the most notorious internet outages in recent years (including the June 2019 leak that disrupted Cloudflare) involved route leaks as cause or amplifier.

The solution: Tag routes with BGP communities that record where they were learned, and apply explicit export policies. Toward upstreams, export only your own prefixes (originated in your AS) and those of customers paying for transit.
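
The tag-on-import, filter-on-export idea can be sketched in a few lines of Python. This is an illustrative model and not a router configuration; the community values (`65000:100` and so on), the `Route` class, and the function names are all assumptions made for the example, since every operator defines its own community scheme.

```python
from dataclasses import dataclass

# Hypothetical community scheme for AS 65000: one community per route source.
LEARNED_FROM = {
    "own": "65000:100",
    "customer": "65000:200",
    "peer": "65000:300",
    "upstream": "65000:400",
}

# Only own and customer routes may be exported to an upstream.
EXPORTABLE_TO_UPSTREAM = {LEARNED_FROM["own"], LEARNED_FROM["customer"]}


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset


def tag_on_import(prefix: str, source: str) -> Route:
    """Attach the community that records where the route was learned."""
    return Route(prefix, frozenset({LEARNED_FROM[source]}))


def export_to_upstream(route: Route) -> bool:
    """Default deny: export only routes carrying an exportable community."""
    return not route.communities.isdisjoint(EXPORTABLE_TO_UPSTREAM)


if __name__ == "__main__":
    for source in LEARNED_FROM:
        route = tag_on_import("203.0.113.0/24", source)
        print(source, export_to_upstream(route))
```

Note the design choice: a route with no recognized community is not exported. Failing closed is what prevents a leak when someone forgets to tag a new session.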


Mistake 3: No max-prefix on peering sessions

The problem: Not setting a maximum number of prefixes accepted per BGP session.

Consequence: If a peer has a configuration problem and starts announcing tens of thousands of unexpected prefixes, your router can saturate its routing table, burn CPU processing updates, or simply collapse.

The solution: Configure maximum-prefix on every BGP session with a reasonable limit and a defined action (warning first, then tear-down):

! Cisco IOS
router bgp 65000
 neighbor 10.0.0.1 maximum-prefix 1000 80

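The value 80 is a warning threshold: the router logs a warning once the peer passes 80% of the limit, and tears the session down when the limit of 1000 prefixes is exceeded. On Cisco IOS the session then stays down until it is cleared manually, unless a restart interval is configured. Tune the limit to what you expect from each peer.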
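
As a rough illustration of how the two numbers interact, the following Python sketch models the decision for a given prefix count. It is an approximation of the behavior described here, not vendor code; the function name `max_prefix_action` and the returned labels are hypothetical.

```python
def max_prefix_action(received: int, limit: int = 1000, threshold_pct: int = 80) -> str:
    """Approximate model of a max-prefix check.

    Returns "teardown" when the count exceeds the limit, "warn" when it
    exceeds threshold_pct percent of the limit, and "ok" otherwise.
    """
    if received > limit:
        return "teardown"
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    if received * 100 > limit * threshold_pct:
        return "warn"
    return "ok"


if __name__ == "__main__":
    for count in (500, 800, 801, 1000, 1001):
        print(count, max_prefix_action(count))
```

With the values from the example, a peer that normally sends about 500 prefixes has headroom, a jump past 800 produces a warning you can act on, and anything above 1000 drops the session.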


Mistake 4: Not implementing RPKI (Resource Public Key Infrastructure)

The problem: Trusting BGP announcements without cryptographic validation that the originating AS is authorized to announce that prefix.

Consequence: Your network is vulnerable to route hijacks in which a malicious or misconfigured AS announces prefixes that are not theirs. Your traffic, and your customers' traffic, can be steered to the wrong place.

The solution: Deploy an RPKI validator (Routinator, OctoRPKI, Fort) and configure routers to reject prefixes with INVALID state. It is one of the most important advances in routing security in recent years, and implementation cost is relatively low.

/* Example JunOS policy (defined under policy-options) */
policy-statement RPKI-POLICY {
    term INVALID {
        from {
            protocol bgp;
            validation-database invalid;
        }
        then reject;
    }
}
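
For readers who want to see what the validator's verdict means, here is a minimal Python sketch of route origin validation in the style of RFC 6811. It is a teaching model, not a replacement for a real validator; the `ROA` tuple, the function name `validate_origin`, and the sample data are hypothetical.

```python
import ipaddress
from typing import Iterable, NamedTuple


class ROA(NamedTuple):
    prefix: str      # prefix the ROA covers
    max_length: int  # longest prefix length the holder authorizes
    asn: int         # AS authorized to originate it


def validate_origin(prefix: str, origin_asn: int, roas: Iterable[ROA]) -> str:
    """Return "valid", "invalid", or "not-found" for one announcement."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # subnet_of() raises TypeError across IP versions, so check first.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matched by none: reject candidate.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    roas = [ROA("203.0.113.0/24", 24, 65001)]
    print(validate_origin("203.0.113.0/24", 65001, roas))
    print(validate_origin("203.0.113.0/24", 65002, roas))
    print(validate_origin("198.51.100.0/24", 65001, roas))
```

The distinction that matters operationally is between "invalid" (a ROA exists and the announcement contradicts it) and "not-found" (no ROA covers the prefix). The policy above rejects only the former.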

Mistake 5: BGP sessions without MD5 authentication

The problem: Establishing BGP sessions without TCP-MD5 authentication.

Consequence: An attacker on the network can send forged TCP packets to manipulate or tear down BGP sessions. The attack requires access to the transport network, but in shared environments (IXPs, colocation) that is a real risk.

The solution: Configure MD5 authentication on all eBGP sessions. It is a basic measure with near-zero implementation cost.


Mistake 6: Not monitoring BGP session state in real time

The problem: Finding out that a BGP session has failed only when a customer calls to report lost connectivity.

Consequence: High detection time = high MTTR = SLA breach = penalties and loss of trust.

The solution: Monitor BGP sessions actively and alert immediately. Options include Zabbix or Prometheus with alerts routed to PagerDuty or Telegram, and specialized tools such as pmacct or GoBGP for routing analysis. Alerts should arrive in seconds, not minutes.

An ISP operating a 24/7 NOC should have every BGP session visible on a centralized dashboard, with automatic escalation when a session goes down.
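
Whatever tool collects the data, the core alerting logic is a comparison between two polls. The Python sketch below shows one way to express it, assuming a hypothetical collector that returns a mapping of peer address to BGP state; the function name `sessions_to_alert` and the message format are illustrative only.

```python
from typing import Dict, List

ESTABLISHED = "Established"


def sessions_to_alert(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """List peers that were Established in the last poll and are not now.

    A peer that is absent from the current poll is treated as down.
    Sessions that were already down are not reported again.
    """
    alerts = []
    for peer, old_state in sorted(previous.items()):
        new_state = current.get(peer, "Missing")
        if old_state == ESTABLISHED and new_state != ESTABLISHED:
            alerts.append(f"BGP session to {peer} left Established (now {new_state})")
    return alerts


if __name__ == "__main__":
    before = {"10.0.0.1": "Established", "10.0.0.2": "Established"}
    after = {"10.0.0.1": "Established", "10.0.0.2": "Active"}
    for message in sessions_to_alert(before, after):
        print(message)
```

Alerting on the transition rather than on the state keeps a long-dead session from paging the NOC on every poll, while a short poll interval keeps detection time in the range of seconds.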


Mistake 7: Not documenting routing policy

The problem: Routing policy lives in one or two engineers’ heads, with no formal documentation.

Consequence: In an incident, the engineer who “knows how it works” may be unavailable. Configuration changes made without understanding the full policy can have unexpected effects. (This is, again, the technical hero problem we describe here.)

The solution: Document organizational routing policy in an SSOT (Single Source of Truth): which prefixes are accepted from each peer type, which communities are used and what they mean, which filters exist and why. Keep this documentation in version control (Git) and update it with every change.
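
One practical benefit of keeping policy as data in Git is that filters can be rendered from it instead of typed by hand. The Python sketch below illustrates the idea under stated assumptions: the `POLICY` structure and the function `render_prefix_list` are hypothetical, the naming scheme follows the prefix-list in Mistake 1, and only IPv4 is handled.

```python
import ipaddress
from typing import List

# Hypothetical single source of truth; in practice this would live in a
# version-controlled file (YAML, JSON) reviewed with every change.
POLICY = {
    "customers": [
        {"asn": 65001, "prefixes": ["203.0.113.0/24"]},
    ],
}


def render_prefix_list(customer: dict) -> List[str]:
    """Render Cisco IOS-style prefix-list lines for one customer."""
    name = f"CLIENTE-ASN{customer['asn']}-IN"
    lines = []
    for index, prefix in enumerate(customer["prefixes"], start=1):
        # IPv4Network raises ValueError on typos or on host bits being set,
        # so a bad entry fails here instead of reaching a router.
        network = ipaddress.IPv4Network(prefix)
        lines.append(f"ip prefix-list {name} seq {index * 5} permit {network}")
    return lines


if __name__ == "__main__":
    for customer in POLICY["customers"]:
        print("\n".join(render_prefix_list(customer)))
```

Because the rendered configuration is derived from the documented policy, the documentation cannot silently drift from what is deployed.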


The operational perspective

BGP is a protocol that forgives little. A wrong configuration can have consequences far beyond your network. But it is also a protocol where best practices are well documented and accessible: MANRS (Mutually Agreed Norms for Routing Security), LACNIC and ARIN documents, the NANOG community.

The challenge for an ISP is not lack of information: it is having the time, team, and processes to implement these practices in a live operation without interrupting service. Our network engineering services are designed for exactly that context.


Want to audit your network’s BGP security?

At Ayuda.LA we review BGP configuration and routing policies for ISPs in Latin America. If you want to assess your network against these best practices, we can do a no-commitment review.

Request a review →


This article is part of our series on network operations for ISPs. If you want to read upcoming posts, follow us on LinkedIn.