18:13:33.637338 A > B: . 1:1461(1460) ack 1 win 17248 (DF)
.
.
.
18:13:35.561691 A > B: . 1514021:1515481(1460) ack 1 win 17248 (DF)
18:13:35.561814 A > B: . 1515481:1516941(1460) ack 1 win 17248 (DF)
18:13:35.561938 A > B: . 1516941:1518401(1460) ack 1 win 17248 (DF)
18:13:35.562059 A > B: . 1518401:1519861(1460) ack 1 win 17248 (DF)
18:13:35.562174 A > B: . 1519861:1521321(1460) ack 1 win 17248 (DF)
18:13:35.564008 B > A: . ack 1481901 win 64680 (DF)
18:13:35.564383 A > B: . 1521321:1522781(1460) ack 1 win 17248 (DF)
18:13:35.564499 A > B: . 1522781:1524241(1460) ack 1 win 17248 (DF)
18:13:35.615576 B > A: . ack 1484821 win 64680 (DF)
18:13:35.615646 B > A: . ack 1487741 win 64680 (DF)
18:13:35.615716 B > A: . ack 1490661 win 64680 (DF)
18:13:35.615784 B > A: . ack 1493581 win 64680 (DF)
18:13:35.615856 B > A: . ack 1496501 win 64680 (DF)
18:13:35.615952 A > B: . 1524241:1525701(1460) ack 1 win 17248 (DF)
18:13:35.615966 B > A: . ack 1499421 win 64680 (DF)
18:13:35.616088 A > B: . 1525701:1527161(1460) ack 1 win 17248 (DF)
18:13:35.616105 B > A: . ack 1502341 win 64680 (DF)
18:13:35.616211 A > B: . 1527161:1528621(1460) ack 1 win 17248 (DF)
18:13:35.616228 B > A: . ack 1505261 win 64680 (DF)
18:13:35.616327 A > B: . 1528621:1530081(1460) ack 1 win 17248 (DF)
18:13:35.616349 B > A: . ack 1508181 win 64680 (DF)
18:13:35.616448 A > B: . 1530081:1531541(1460) ack 1 win 17248 (DF)
18:13:35.616565 A > B: . 1531541:1533001(1460) ack 1 win 17248 (DF)
18:13:35.616891 A > B: . 1533001:1534461(1460) ack 1 win 17248 (DF)
In this example, an ACK is generated for every two data segments that arrive.
(The segment size is slightly larger in this example because there is no
timestamp option, even though the source host is the same.)
How to detect
This condition can be observed in a packet trace when the advertised MSS is
significantly larger than the actual PMTU of the connection.
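The check described above can be sketched as a small script over tcpdump lines in the format used by these traces. This is only an illustrative sketch: the helper names and the `ratio` threshold are assumptions, and the MSS value 65240 is borrowed from the SYN packets shown in the traces for problem 2.3.

```python
import re

# Matches the "first:last(length)" field tcpdump prints for data segments.
SEGMENT_RE = re.compile(r"(\d+):(\d+)\((\d+)\)")


def largest_segment(trace_lines, sender):
    """Return the largest payload length sent by `sender` in the trace."""
    largest = 0
    for line in trace_lines:
        if f" {sender} > " not in line:
            continue  # Line was sent by the other host.
        match = SEGMENT_RE.search(line)
        if match:
            largest = max(largest, int(match.group(3)))
    return largest


def mss_exceeds_path(advertised_mss, trace_lines, sender, ratio=2.0):
    """Flag a trace whose data segments are far smaller than the advertised MSS.

    `ratio` is an arbitrary illustrative threshold, not a value from the memo.
    """
    observed = largest_segment(trace_lines, sender)
    return observed > 0 and advertised_mss >= ratio * observed


trace = [
    "18:13:35.561691 A > B: . 1514021:1515481(1460) ack 1 win 17248 (DF)",
    "18:13:35.564008 B > A: . ack 1481901 win 64680 (DF)",
    "18:13:35.564383 A > B: . 1521321:1522781(1460) ack 1 win 17248 (DF)",
]

suspicious = mss_exceeds_path(65240, trace, "A")
```

A receiver that advertised 1460 would not be flagged by the same trace, since the observed segments already fill that MSS.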
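How to fix
Several solutions for this problem have been proposed:
A simple solution is to ACK every other packet, regardless of size. This has
the drawback of generating large numbers of ACKs when handling many small
packets; applications such as the X Window System show this behavior.
A slightly more complex solution is to monitor the size of incoming segments
and try to determine the segment size the sender is using. This requires
slightly more state in the receiver, but it makes the computations more
accurate and helps avoid silly window syndrome.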
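The second approach might look roughly like the sketch below. It assumes, as a simplification, that the largest segment seen so far is a good estimate of the sender's segment size. The class and method names are hypothetical, and the delayed-ACK timer that a real TCP also needs is left out.

```python
class AckPolicy:
    """Decide when to ACK, using an estimate of the sender's segment size."""

    def __init__(self, advertised_mss):
        # Until data arrives, the only available guess is our own advertised MSS.
        self.estimated_size = advertised_mss
        self.largest_seen = 0
        self.unacked_bytes = 0

    def on_segment(self, length):
        """Record an arriving data segment; return True if an ACK is due now."""
        # Assumption: the sender's segment size equals the largest segment seen.
        self.largest_seen = max(self.largest_seen, length)
        self.estimated_size = self.largest_seen
        self.unacked_bytes += length
        # ACK once two estimated-size segments' worth of data is pending.
        if self.unacked_bytes >= 2 * self.estimated_size:
            self.unacked_bytes = 0
            return True
        return False


policy = AckPolicy(advertised_mss=65240)
decisions = [policy.on_segment(1460) for _ in range(4)]
```

With a threshold of twice the advertised MSS (2 * 65240 bytes), the same receiver would accumulate 90 segments of 1460 bytes before the byte count alone triggered an ACK.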
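2.3 Name of Problem
Determining MSS from PMTU
Classification
Performance
Description
The MSS advertised at the start of a connection should be based on the MTU of
the interfaces on the system. (For efficiency and other reasons this may not
be the largest MSS possible.) Some systems instead use values determined by
PMTUD to decide which MSS to advertise.
This results in an advertised MSS that is smaller than the largest MTU the
system can receive.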
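Significance
The advertised MSS tells the remote system the largest TCP segment that can be
received [RFC879]. If this value is too small, the remote system is forced to
use a smaller segment size when sending, purely because the local system
discovered a particular PMTU earlier.
Given the asymmetric nature of many routes on the Internet [Paxson97], it is
entirely possible that the return PMTU differs from the sending PMTU. Limiting
the segment size in this way can reduce performance and defeat the PMTUD
algorithm.
Even if the route is symmetric, artificially lowering the limit on segment
size makes it impossible to probe later to determine whether the PMTU has
changed.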
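Implications
The whole point of PMTUD is to send segments that are as large as possible. If
a long-running connection cannot detect a larger PMTU, the potential
performance gain cannot be realized. This defeats the purpose of PMTUD.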
Relevant RFCs
RFC 1191.
[RFC879] discusses MSS calculations and appropriate values. Note that this
practice does not conflict with any of the specifications in these RFCs.
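Trace file demonstrating it
This trace was taken using tcpdump running on an intermediate host. Host A
initiates two separate connections, A1 and A2, to host B. Router C is the
location of the MTU bottleneck. As usual, TCP options are removed from all
non-SYN packets.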
22:33:32.305912 A1 > B: S 1523306220:1523306220(0)
                win 8760 <mss 1460> (DF)
22:33:32.306518 B > A1: S 729966260:729966260(0)
                ack 1523306221 win 16384 <mss 65240>
22:33:32.310307 A1 > B: . ack 1 win 8760 (DF)
22:33:32.323496 A1 > B: P 1:1461(1460) ack 1 win 8760 (DF)
22:33:32.323569 C > A1: icmp: 129.99.238.5 unreachable -
                need to frag (mtu 1024) (DF) (ttl 255, id 20666)
22:33:32.783694 A1 > B: . 1:985(984) ack 1 win 8856 (DF)
22:33:32.840817 B > A1: . ack 985 win 16384
22:33:32.845651 A1 > B: . 1461:2445(984) ack 1 win 8856 (DF)
22:33:32.846094 B > A1: . ack 985 win 16384
22:33:33.724392 A1 > B: . 985:1969(984) ack 1 win 8856 (DF)
22:33:33.724893 B > A1: . ack 2445 win 14924
22:33:33.728591 A1 > B: . 2445:2921(476) ack 1 win 8856 (DF)
22:33:33.729161 A1 > B: . ack 1 win 8856 (DF)
22:33:33.840758 B > A1: . ack 2921 win 16384
[...]
22:33:34.238659 A1 > B: F 7301:8193(892) ack 1 win 8856 (DF)
22:33:34.239036 B > A1: . ack 8194 win 15492
22:33:34.239303 B > A1: F 1:1(0) ack 8194 win 16384
22:33:34.242971 A1 > B: . ack 2 win 8856 (DF)
22:33:34.454218 A2 > B: S 1523591299:1523591299(0)
                win 8856 <mss 984> (DF)
22:33:34.454617 B > A2: S 732408874:732408874(0)
                ack 1523591300 win 16384 <mss 65240>
22:33:34.457516 A2 > B: . ack 1 win 8856 (DF)
22:33:34.470683 A2 > B: P 1:985(984) ack 1 win 8856 (DF)
22:33:34.471144 B > A2: . ack 985 win 16384
22:33:34.476554 A2 > B: . 985:1969(984) ack 1 win 8856 (DF)
22:33:34.477580 A2 > B: P 1969:2953(984) ack 1 win 8856 (DF)
[...]
Note that the SYN packet for session A2 specifies an MSS of 984.
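Trace file demonstrating correct behavior
As before, this trace was taken using tcpdump running on an intermediate host.
Host A initiates two separate connections, A1 and A2, to host B. Router C is
the location of the MTU bottleneck. As usual, TCP options are removed from all
non-SYN packets.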
22:36:58.828602 A1 > B: S 3402991286:3402991286(0) win 32768
                <mss 4312,wscale 0,nop,timestamp 1123370309 0,
                echo 1123370309> (DF)
22:36:58.844040 B > A1: S 946999880:946999880(0)
                ack 3402991287 win 16384
                <mss 65240,nop,wscale 0,nop,nop,timestamp 429552 1123370309>
22:36:58.848058 A1 > B: . ack 1 win 32768 (DF)
22:36:58.851514 A1 > B: P 1:1025(1024) ack 1 win 32768 (DF)
22:36:58.851584 C > A1: icmp: 129.99.238.5 unreachable -
                need to frag (mtu 1024) (DF)
22:36:58.855885 A1 > B: . 1:969(968) ack 1 win 32768 (DF)
22:36:58.856378 A1 > B: . 969:985(16) ack 1 win 32768 (DF)
22:36:59.036309 B > A1: . ack 985 win 16384
22:36:59.039255 A1 > B: FP 985:1025(40) ack 1 win 32768 (DF)
22:36:59.039623 B > A1: . ack 1026 win 16344
22:36:59.039828 B > A1: F 1:1(0) ack 1026 win 16384
22:36:59.043037 A1 > B: . ack 2 win 32768 (DF)
22:37:01.436032 A2 > B: S 3404812097:3404812097(0) win 32768
                <mss 4312,wscale 0,nop,timestamp 1123372916 0,
                echo 1123372916> (DF)
22:37:01.436424 B > A2: S 949814769:949814769(0)
                ack 3404812098 win 16384
                <mss 65240,nop,wscale 0,nop,nop,timestamp 429562 1123372916>
22:37:01.440147 A2 > B: . ack 1 win 32768 (DF)
22:37:01.442736 A2 > B: . 1:969(968) ack 1 win 32768 (DF)
22:37:01.442894 A2 > B: P 969:985(16) ack 1 win 32768 (DF)
22:37:01.443283 B > A2: . ack 985 win 16384
22:37:01.446068 A2 > B: P 985:1025(40) ack 1 win 32768 (DF)
22:37:01.446519 B > A2: . ack 1025 win 16384
22:37:01.448465 A2 > B: F 1025:1025(0) ack 1 win 32768 (DF)
22:37:01.448837 B > A2: . ack 1026 win 16384
22:37:01.449007 B > A2: F 1:1(0) ack 1026 win 16384
22:37:01.452201 A2 > B: . ack 2 win 32768 (DF)
Note that sessions A1 and A2 use the same MSS.
How to detect
This can be detected using a packet trace of two separate connections. The
first should invoke PMTUD; the second should start soon enough after the first
that the PMTU value has not yet timed out.
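As a rough illustration of this check, the sketch below pulls the MSS option out of the SYN lines of a tcpdump trace and compares the two connections. It assumes each packet has been joined onto a single line first; the function names are hypothetical, and the sample lines are the SYN packets from the problem trace above.

```python
import re

# A SYN line looks like "<time> A1 > B: S <seq>:<seq>(0) win N <mss M...>".
SYN_RE = re.compile(r"(\S+) > (\S+): S .*?<mss (\d+)")


def syn_mss_by_connection(trace_lines):
    """Map (source, destination) to the MSS offered in that direction's SYN."""
    offers = {}
    for line in trace_lines:
        match = SYN_RE.search(line)
        if match:
            src, dst, mss = match.groups()
            offers[(src, dst)] = int(mss)
    return offers


def mss_shrinks(offers, first, second, peer):
    """True if the second connection offers a smaller MSS than the first."""
    return offers[(second, peer)] < offers[(first, peer)]


problem_trace = [
    "22:33:32.305912 A1 > B: S 1523306220:1523306220(0) win 8760 <mss 1460> (DF)",
    "22:33:32.306518 B > A1: S 729966260:729966260(0) ack 1523306221 win 16384 <mss 65240>",
    "22:33:32.323496 A1 > B: P 1:1461(1460) ack 1 win 8760 (DF)",
    "22:33:34.454218 A2 > B: S 1523591299:1523591299(0) win 8856 <mss 984> (DF)",
]

offers = syn_mss_by_connection(problem_trace)
affected = mss_shrinks(offers, "A1", "A2", "B")
```

Run against the SYN lines of the correct trace, where both connections offer an MSS of 4312, `mss_shrinks` returns False.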
How to fix
The MSS should be determined from the MTUs of the interfaces on the system, as
outlined in [RFC1122] and [RFC1191].
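The arithmetic behind this fix can be sketched as follows, assuming IPv4 and TCP headers of 20 bytes each with no options. The function names are hypothetical. The point is that a cached PMTU limits what the host sends, not what it advertises.

```python
IPV4_HEADER_BYTES = 20  # Assumes no IP options.
TCP_HEADER_BYTES = 20   # Assumes no TCP options.


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented packet of size `mtu`."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def advertised_mss(interface_mtu, cached_path_mtu=None):
    """MSS to offer in a SYN, derived from the interface MTU only.

    `cached_path_mtu` is accepted but deliberately ignored here.
    """
    return mss_for_mtu(interface_mtu)


def initial_send_size(peer_mss, cached_path_mtu):
    """Largest segment to send first; this is where the cached PMTU applies."""
    return min(peer_mss, mss_for_mtu(cached_path_mtu))


offer = advertised_mss(1500, cached_path_mtu=1024)
first_segment = initial_send_size(65240, cached_path_mtu=1024)
```

With an interface MTU of 1500 the offer stays at 1460 even after a PMTU of 1024 has been cached, while the first data segment is still limited to 984 bytes. The problem trace shows the opposite behavior: connection A2 advertises 984, which is 1024 minus 40.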
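3 Security Considerations
The one security concern raised by this memo is that ICMP black holes are
often caused by over-zealous security administrators who block all ICMP
messages. It is vitally important that those who design and deploy security
systems understand the impact of strict filtering on upper-layer protocols.
The safest web site in the world is worthless if most TCP implementations
cannot transfer data from it. It would be far better to fix all of the black
holes than to fix all of the TCP implementations.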
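4 Acknowledgements
Thanks to Mark Allman, Vern Paxson, and Jamshid Mahdavi for generous help
reviewing the document, and to Matt Mathis for suggestions and comments on the
various mechanisms that can cause PMTUD black holes. The structure for
describing TCP problems, and the early description of that structure, come
from [RFC2525]. Special thanks to Amy Bock, who helped perform the PMTUD tests
that discovered these bugs.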
5 References
[RFC2581] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion
Control", RFC 2581, April 1999.
[RFC1122] Braden, R., "Requirements for Internet Hosts --
Communication Layers", STD 3, RFC 1122, October 1989.
[RFC813] Clark, D., "Window and Acknowledgement Strategy in TCP",
RFC 813, July 1982.
[Jacobson89] V. Jacobson, C. Leres, and S. McCanne, tcpdump, June
1989, ftp.ee.lbl.gov
[RFC1435] Knowles, S., "IESG Advice from Experience with Path MTU
Discovery", RFC 1435, March 1993.
[RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC
1191, November 1990.
[RFC1981] McCann, J., Deering, S. and J. Mogul, "Path MTU
Discovery for IP version 6", RFC 1981, August 1996.
[Paxson97] V. Paxson, "End-to-End Routing Behavior in the
Internet", IEEE/ACM Transactions on Networking (5),
pp. 601-615, Oct. 1997.
[RFC2525] Paxson, V., Allman, M., Dawson, S., Fenner, W., Griner,
J., Heavens, I., Lahey, K., Semke, I. and B. Volz,
"Known TCP Implementation Problems", RFC 2525, March
1999.
[RFC879] Postel, J., "The TCP Maximum Segment Size and Related
Topics", RFC 879, November 1983.
[RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC 2001,
January 1997.
6 Author's Address
Kevin Lahey
dotRocket, Inc.
1901 S. Bascom Ave., Suite 300
Campbell, CA 95008
USA
Phone: +1 408-371-8977 x115
email: kml@dotrocket.com
7 Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.
RFC 2923: TCP Problems with Path MTU Discovery