Wednesday, December 19, 2012

What is the difference between DHCPv6 and DHCPv4?

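I wrote this a long time ago, and I think I pulled it out once last year or so. I just stumbled on it while looking for something on my hard drive, so I expanded it a little, translated it into Chinese, and am posting it here as an off-site backup.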


What is the difference between DHCP6 and DHCP4? (How does DHCP6 for IPv6 differ from DHCP4 for IPv4?)

Here are several differences worth noting:

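1. They are two completely different protocols:
  • DHCP4 grew out of BOOTP, an old protocol, through a series of extensions.
  • DHCP6 is a protocol developed from scratch. It improves on some of the less efficient behaviors of DHCPv4.
2. Multicast:
  • Like many IPv6 protocols, DHCP6 communicates over multicast for efficiency, rather than the broadcast used by DHCPv4.
3. Link-local address:
  • Because it uses multicast, a client must already have an IP address before it can send a request packet, unlike DHCPv4, where a MAC address is enough. DHCP6 clients therefore send their requests from the link-local address that every IPv6 interface configures for itself.
4. A single message exchange is enough:
  • A client can send the needs of all its interfaces to the DHCP6 server in one DHCP6 request, and the server can return everything those interfaces need (IP addresses, for example) in one reply. This is more efficient than sending multiple requests.
5. Stateful or Stateless:
  • DHCP6 can operate in Stateful or Stateless mode.
  • In Stateful mode the client obtains its IPv6 address and other data from the DHCP6 server. This mode is almost identical to the way DHCP4 works.
  • In Stateless mode the client obtains its IPv6 address by some other means (SLAAC, for example), and DHCP6 only supplies other information, such as the DNS servers.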

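6. DHCP6 cannot provide the default gateway (default router):
  • One of the most criticized problems of DHCP6 is that it cannot give the client its default gateway. Router Advertisements (RA) must be used for that.
NOTE:
  • Because DHCP6 has to be paired with RA in this way, and because the design of RA rests on some overly idealistic assumptions, network design, management and troubleshooting all require extra steps. An enterprise adopting IPv6 has to revise its standard operating procedures (SOP) for client networks on a large scale, which brings a high hidden cost.
  • This functional gap in DHCP6 has been under review for a long time (many people raised doubts as soon as DHCPv6 came out, and I took a few shots at it myself ^_^), but the two camps are deadlocked and the fix is still at the draft stage. The first version was proposed in 2011 and the latest, fifth version in August 2012, but it has not been adopted yet. ( http://tools.ietf.org/html/draft-ietf-mif-dhcpv6-route-option-05 )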
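
Points 2 and 3 can be made concrete with a short Python sketch. It uses the well-known DHCPv6 values from the protocol specification (multicast group ff02::1:2, UDP ports 546 and 547, message type 1 for Solicit) and derives a link-local source address from a MAC address with the modified EUI-64 method. The function names and the sample MAC address are made up for illustration, and many current hosts build their interface identifier differently, so this is only a sketch of the idea, not what any particular client does.

```python
import ipaddress
import struct

# Well-known DHCPv6 values (not taken from the post itself).
ALL_DHCP_RELAY_AGENTS_AND_SERVERS = ipaddress.IPv6Address("ff02::1:2")
CLIENT_PORT = 546   # UDP port the client listens on
SERVER_PORT = 547   # UDP port servers and relay agents listen on
SOLICIT = 1         # message type of the first client message


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Build an fe80::/64 address from a MAC using modified EUI-64.

    Hypothetical helper: assumes a colon-separated 48-bit MAC.
    """
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + interface_id)


def solicit_header(transaction_id: int) -> bytes:
    """Return the 4-byte DHCPv6 header: 1-byte type + 3-byte transaction id."""
    return struct.pack("!I", (SOLICIT << 24) | (transaction_id & 0xFFFFFF))


source = link_local_from_mac("00:11:22:33:44:55")  # sample MAC, illustration only
print("source      :", source, "port", CLIENT_PORT)
print("destination :", ALL_DHCP_RELAY_AGENTS_AND_SERVERS, "port", SERVER_PORT)
print("header bytes:", solicit_header(0xABCDEF).hex())
```

The client never needs a broadcast: the Solicit goes from its own link-local address to a multicast group that only DHCPv6 servers and relay agents join.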

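A small digression to finish.

IPv6 is somewhere between ten and twenty years old by now. Seen from today, many of the advantages it had back then have mostly disappeared as the IPv4-related protocols were revised and extended. From an enterprise network point of view, the only benefit of IPv6 that really helps business operations is "lots of IP addresses". Most of its other original selling points can be achieved in existing environments, with solutions that are already mature and stable. Apart from the number of addresses, the things that cannot be done any other way are, frankly, things that enterprise operations do not need or do not consider important.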

Monday, December 17, 2012

GRE keep-alive on Juniper J-/SRX ?

GRE itself is purely stateless and has no built-in mechanism to detect tunnel status, so each vendor has created its own method of checking whether a GRE tunnel is alive.

For example, Cisco IOS can configure keepalives on the GRE tunnel interface, and Juniper JUNOS can configure keepalives at the [edit protocols oam gre-tunnel interface interface-name] level.

Unfortunately, Juniper J-series and SRX do not support [edit protocols oam] at this moment. The unconditionally "up" status of the GRE interface can therefore lead to black-holed traffic.

In my environment I run BGP over the GRE tunnel between the devices at the two ends. Fortunately I can enable BFD on the BGP peering session to detect loss of connectivity and react to a network failure more quickly.

It is very easy to configure BFD for BGP on Juniper, as shown below:

[edit protocols bgp group XXX] or
[edit protocols bgp group XXX neighbor YYY]
set bfd-liveness-detection minimum-interval 1000

The unit of the interval is milliseconds, so 1000 means 1 second.
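
How quickly a failure is actually noticed depends on the detection multiplier as well as the interval. Below is a minimal Python sketch of that arithmetic. It assumes a multiplier of 3, which to my knowledge is the Junos default but is not set in the configuration above, and it ignores the interval negotiation between the two peers. The function name is hypothetical.

```python
def bfd_detection_time_ms(minimum_interval_ms: int, multiplier: int = 3) -> int:
    """Rough time before BFD declares the session down.

    Simplification: the real value comes from the interval negotiated
    with the peer, multiplied by the peer's detection multiplier.
    """
    if minimum_interval_ms <= 0 or multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return minimum_interval_ms * multiplier


# With "minimum-interval 1000" and the assumed multiplier of 3:
print(bfd_detection_time_ms(1000), "ms")
```

Under these assumptions a 1000 ms interval means a dead tunnel is detected after roughly three seconds, far sooner than the usual BGP hold time.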

While BFD is being set up, the existing BGP session stays intact. It is safe to configure BFD on one side first and then work on the other side. The "clear bfd adaptation" command is also hitless.

It is always good to have OAM or BFD when running things over Metro Ethernet or a tunnel.

Saturday, December 15, 2012

Juniper SRX-100, GRE over IPsec and Bypass Session Table

Juniper made a very unwelcome decision to discontinue packet-mode JUNOS on J-series routers as of 9.4, which raised a lot of concern among people like me who use a J-series box as a "real" router.

Since then we have had to worry about new sessions per second, as well as concurrent sessions, whenever we deploy and operate the router. According to rumors I have heard, the new SRX firewall may eventually replace the J-series router. Juniper also markets the SRX as a firewall and "security" router for branch offices.

The only reason I can imagine is that internal politics keep the product management team from listening to customers and make it insist that "security router" is a good selling point. However, Juniper must be fully aware of the burden the session table imposes when a J-series and/or SRX is deployed as a pure router. Otherwise the "packet-mode" and "selective packet-mode" functions would never have been created.

Back in the packet-mode JUNOS days, it was a happy time to play with a J-series router with every capability available, including IPsec VPN. With flow-mode JUNOS, once the router is switched into packet-mode we can no longer build IPsec VPNs to remote sites, but running in flow-mode makes our NOC nervous, worrying about session table usage all the time.

According to the selective packet-mode document, if we establish GRE over IPsec to the remote site and put the GRE interface and all downlink interfaces into packet-mode, we should be able to bypass session creation on those interfaces. As a result we should see only a handful of sessions related to IPsec itself, rather than the huge number of user sessions that pass through the box.

The requirement is to carry layer-3 IP traffic to the remote site rather than layer-2 frames, so TCP should be able to adjust its packet size by itself instead of relying on fragmentation and reassembly when user packets are encapsulated in GRE. The IDP module is therefore not needed (J-series / SRX use the IDP module to reassemble GRE packets... another weird / bad decision.)

The following configuration and performance test use two SRX-100 boxes to demonstrate the idea.

/* == SRX-100 VPN Box @ LAB2 == */
interfaces {
    fe-0/0/0 {
        description "## PC under LAB2 ##";
        unit 0 {
            family inet {
                filter {
                    input packet-mode-ipv4;
                }
                address 10.2.0.254/24;
            }
        }
    }
    gr-0/0/0 {
        description "## GRE overhead 24 bytes ##";
        unit 1 {
            description "## GRE to HQ ##";
            tunnel {
                source 172.31.0.2;
                destination 172.31.0.1;
                path-mtu-discovery;
            }
            family inet {
                mtu 1400;
                filter {
                    input packet-mode-ipv4;
                }
                address 10.0.0.2/30;
            }
        }
    }
    fe-0/0/3 {
        description "## Internet Uplink ##";
        unit 0 {
            family inet {
                address 2.2.2.2/24;
            }
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 10.0.2.253/32 {
                    primary;
                    preferred;
                }
                address 172.31.0.2/32;
            }
        }
    }
    st0 {
        description "## IPsec overhead (proposal std/std) 62 bytes ##";
        unit 1 {
            description "## ipsec to HQ ##";
            family inet {
                mtu 1438;
            }
        }
    }
}
routing-options {
    rib inet.0 {
        static {
            /* == default route to Internet == */
            route 0.0.0.0/0 next-hop 2.2.2.254;
            /* == HQ GRE End-Point == */
            route 172.31.0.1/32 next-hop st0.1;
            /* == HQ PC == */
            route 192.168.101.0/24 next-hop 10.0.0.1;
        }
    }
}
security {
    ike {
        policy ike_pol_hq {
            mode main;
            proposal-set standard;
            pre-shared-key ascii-text "MYKEY";
        }
        gateway gw_hq {
            ike-policy ike_pol_hq;
            address 1.1.1.1;
            local-identity inet 2.2.2.2;
            external-interface fe-0/0/3.0;
        }
    }
    ipsec {
        policy ipsec_pol_hq {
            perfect-forward-secrecy {
                keys group2;
            }
            proposal-set standard;
        }
        vpn hq {
            bind-interface st0.1;
            vpn-monitor;
            ike {
                gateway gw_hq;
                ipsec-policy ipsec_pol_hq;
            }
            establish-tunnels immediately;
        }
    }
    alg {
        dns disable;
        ftp disable;
        h323 disable;
        mgcp disable;
        msrpc disable;
        sunrpc disable;
        real disable;
        rsh disable;
        rtsp disable;
        sccp disable;
        sip disable;
        sql disable;
        talk disable;
        tftp disable;
        pptp disable;
    }
    flow {
        tcp-session {
            no-syn-check;
            no-syn-check-in-tunnel;
            no-sequence-check;
        }
    }
    policies {
        from-zone trust to-zone untrust {
            policy trust-to-untrust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
        from-zone untrust to-zone untrust {
            policy untrust-to-untrust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
        from-zone trust to-zone trust {
            policy trust-to-trust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
    }
    zones {
        security-zone trust {
            host-inbound-traffic {
                system-services {
                    all;
                }
                protocols {
                    all;
                }
            }
            interfaces {
                lo0.0;
                fe-0/0/0.0;
                gr-0/0/0.1;
                st0.1;
            }
        }
        security-zone untrust {
            screen untrust-screen;
            host-inbound-traffic {
                system-services {
                    ping;
                    ike;
                }
            }
            interfaces {
                fe-0/0/3.0;
            }
        }
    }
}
firewall {
    family inet {
        filter packet-mode-ipv4 {
            term all-packet-mode {
                then {
                    packet-mode;
                    accept;
                }
            }
        }
    }
}
/* == END of config on VPN Box @ LAB2 == */


/* =============================================== */


/* == SRX-100 VPN Box @ HQ == */
interfaces {
    fe-0/0/0 {
        description "## PC under HQ ##";
        unit 0 {
            family inet {
                filter {
                    input packet-mode-ipv4;
                }
                address 192.168.101.254/24;
            }
        }
    }
    gr-0/0/0 {
        description "## GRE overhead 24 bytes ##";
        unit 2 {
            description "## GRE to LAB2 ##";
            tunnel {
                source 172.31.0.1;
                destination 172.31.0.2;
                path-mtu-discovery;
            }
            family inet {
                mtu 1400;
                filter {
                    input packet-mode-ipv4;
                }
                address 10.0.0.1/30;
            }
        }
    }
    fe-0/0/3 {
        description "## Internet Uplink ##";
        unit 0 {
            family inet {
                address 1.1.1.1/24;
            }
        }
    }
    lo0 {
        unit 0 {
            family inet {
                address 10.0.1.253/32 {
                    primary;
                    preferred;
                }
                address 172.31.0.1/32;
            }
        }
    }
    st0 {
        description "## IPsec overhead (proposal std/std) 62 bytes ##";
        unit 2 {
            description "## ipsec to lab2 ##";
            family inet {
                mtu 1438;
            }
        }
    }
}
routing-options {
    rib inet.0 {
        static {
            /* == default route to Internet == */
            route 0.0.0.0/0 next-hop 1.1.1.254;
            /* == LAB2 GRE End-Point == */
            route 172.31.0.2/32 next-hop st0.2;
            /* == LAB2 PC == */
            route 10.2.0.0/24 next-hop 10.0.0.2;
        }
    }
}
security {
    ike {
        policy ike_pol_lab2 {
            mode main;
            proposal-set standard;
            pre-shared-key ascii-text "MYKEY";
        }
        gateway gw_lab2 {
            ike-policy ike_pol_lab2;
            address 2.2.2.2;
            local-identity inet 1.1.1.1;
            external-interface fe-0/0/3.0;
        }
    }
    ipsec {
        policy ipsec_pol_lab2 {
            perfect-forward-secrecy {
                keys group2;
            }
            proposal-set standard;
        }
        vpn hq {
            bind-interface st0.2;
            vpn-monitor;
            ike {
                gateway gw_lab2;
                ipsec-policy ipsec_pol_lab2;
            }
            establish-tunnels immediately;
        }
    }
    alg {
        dns disable;
        ftp disable;
        h323 disable;
        mgcp disable;
        msrpc disable;
        sunrpc disable;
        real disable;
        rsh disable;
        rtsp disable;
        sccp disable;
        sip disable;
        sql disable;
        talk disable;
        tftp disable;
        pptp disable;
    }
    flow {
        tcp-session {
            no-syn-check;
            no-syn-check-in-tunnel;
            no-sequence-check;
        }
    }
    policies {
        from-zone trust to-zone untrust {
            policy trust-to-untrust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
        from-zone untrust to-zone untrust {
            policy untrust-to-untrust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
        from-zone trust to-zone trust {
            policy trust-to-trust {
                match {
                    source-address any;
                    destination-address any;
                    application any;
                }
                then {
                    permit;
                }
            }
        }
    }
    zones {
        security-zone trust {
            host-inbound-traffic {
                system-services {
                    all;
                }
                protocols {
                    all;
                }
            }
            interfaces {
                lo0.0;
                fe-0/0/0.0;
                gr-0/0/0.2;
                st0.2;
            }
        }
        security-zone untrust {
            screen untrust-screen;
            host-inbound-traffic {
                system-services {
                    ping;
                    ike;
                }
            }
            interfaces {
                fe-0/0/3.0;
            }
        }
    }
}
firewall {
    family inet {
        filter packet-mode-ipv4 {
            term all-packet-mode {
                then {
                    packet-mode;
                    accept;
                }
            }
        }
    }
}
/* == END of config on VPN Box @ HQ == */

/* =============================================== */

/*== iperf tcp performance test from lab2 to hq == */

ylchang@lab2pc:~> iperf -mN -i 1 -w 1m -c 192.168.101.101
------------------------------------------------------------
Client connecting to 192.168.101.101, TCP port 5001
TCP window size: 1.00 MByte (WARNING: requested 1.00 MByte)
------------------------------------------------------------
[  3] local 10.2.0.202 port 52399 connected with 192.168.101.101 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  5.88 MBytes  49.3 Mbits/sec
[  3]  1.0- 2.0 sec  4.88 MBytes  40.9 Mbits/sec
[  3]  2.0- 3.0 sec  5.00 MBytes  41.9 Mbits/sec
[  3]  3.0- 4.0 sec  5.00 MBytes  41.9 Mbits/sec
[  3]  4.0- 5.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  5.0- 6.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  6.0- 7.0 sec  5.00 MBytes  41.9 Mbits/sec
[  3]  7.0- 8.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  8.0- 9.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  9.0-10.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  0.0-10.0 sec  51.5 MBytes  43.1 Mbits/sec
[  3] MSS size 1348 bytes (MTU 1388 bytes, unknown interface)
/*== End of iperf tcp performance test == */

With this configuration, we successfully bypass session creation when traffic travels through the IPsec VPN between the sites. The entire box has only 5 sessions (including the telnet session I used to log in), even when the user PCs create tens of thousands of connections across the sites.
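
The MTU values in the two configurations and the MSS that iperf reports fit together. The Python sketch below walks through that arithmetic. It takes the overhead figures from the interface descriptions (62 bytes for IPsec, 24 bytes for GRE) and assumes a 1500-byte MTU on the Internet uplink and a 12-byte TCP timestamp option on the test hosts. Neither assumption is stated in the post, and the function name is hypothetical.

```python
UPLINK_MTU = 1500            # assumption: plain Ethernet on fe-0/0/3
IPSEC_OVERHEAD = 62          # from the st0 description
GRE_OVERHEAD = 24            # from the gr-0/0/0 description
GRE_INET_MTU = 1400          # configured "mtu 1400" on the GRE unit
IPV4_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12    # assumption: timestamps enabled on the PCs


def mtu_budget() -> dict:
    """Return the MTU/MSS values implied by the numbers above."""
    st0_mtu = UPLINK_MTU - IPSEC_OVERHEAD          # room left inside IPsec
    gre_ceiling = st0_mtu - GRE_OVERHEAD           # largest packet GRE could carry
    mss_plain = GRE_INET_MTU - IPV4_HEADER - TCP_HEADER
    mss_with_timestamps = mss_plain - TCP_TIMESTAMP_OPTION
    iperf_mtu_guess = mss_with_timestamps + IPV4_HEADER + TCP_HEADER
    return {
        "st0_mtu": st0_mtu,
        "gre_ceiling": gre_ceiling,
        "mss_plain": mss_plain,
        "mss_with_timestamps": mss_with_timestamps,
        "iperf_mtu_guess": iperf_mtu_guess,
    }


for name, value in mtu_budget().items():
    print(f"{name:20s} {value}")
```

Under these assumptions the st0 MTU of 1438 follows from 1500 - 62, the configured GRE MTU of 1400 sits safely below the 1414-byte ceiling, and the "MSS size 1348 bytes (MTU 1388 bytes)" line from iperf is consistent with a 1400-byte path once the timestamp option is subtracted.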