News Archive (2006)
For current news, check out the LatestNews section
New TWiki software
I'm in the process of upgrading the TWiki software to the new 4.0.5 release. Since this is a major update, things are in a somewhat broken state right now. I'm trying to make the wiki at least usable again. The major purpose of this entry, aside from warning users, is to check whether topics can be updated at all. OK, so this works. The new TWiki software has a "WYSIWYG" (what you see is what you get) editing feature. While I'm sure that this will appeal to many people, I'm slightly worried that its use could break consistency of layout compared to the old ASCII-based markup. For fixing typos etc. the WYSIWYG feature is probably great, but maybe not so much for adding new topics and paragraphs.
– Main.SimonLeinen - 17 Dec 2006
Linux post-2.6.19 changes
As always, after a new Linux kernel release (see the last entry), there is a flurry of changes that have been queued while the previous release was "frozen". I already noticed a few changes that are relevant for network performance. The drivers for the Intel PRO/1000 family of (Gigabit) Ethernet adapters were improved, including new dynamic interrupt throttling modes for Interrupt Coalescence (or Interrupt Moderation). The e1000 driver will also support IPv6 TSO. All adapters from Chelsio should now be supported by the stock kernel, although the TOE functions will still require Chelsio's proprietary driver. A new family of Gigabit and 10Gb Ethernet adapters from NetXen is now supported.
Also, the TCP Vegas implementation was slightly modified to better cope with delayed ACKs.
– Main.SimonLeinen - 3 Dec 2006
Linux 2.6.19 Release
Linux 2.6.19 was released today. It includes the fixes to H-TCP and CUBIC mentioned below. I found another interesting change related to network performance, and TSO in particular: John Heffner had sent a patch to the netdev mailing list under the subject of Bound TSO defer. He had observed that TSO can make traffic more bursty, in particular over slow links. The patch should reduce the burstiness.
Other features in 2.6.19 include new filesystems such as ext4, GFS2, and eCryptfs.
In addition, normal (non-root) users can now access some information using ethtool.
– Main.SimonLeinen - 2 Dec 2006
Fixes for H-TCP and CUBIC to Linux kernel tree
Two fixes were applied to Linus' kernel sources yesterday:
- For a possible integer overflow with H-TCP at rates over 500 Mb/s
- For a scaling-related math error in CUBIC
– Main.SimonLeinen - 27 Oct 2006
Linux default TCP congestion control algorithm changed from BIC to CUBIC
Shortly after the release of the 2.6.18 kernel, there was a huge flurry of changes being integrated for the next release (2.6.19). Two of these changes concern TCP congestion control:
- The default congestion control algorithm will be changed from BIC to CUBIC
- The default congestion control algorithm will be more deterministic (independent of module load order), and selectable during kernel configuration. The current configuration choices include Bic, Cubic, Htcp, Vegas, Westwood, and Reno.
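To check which algorithm a given machine actually ends up with, something like the following sketch might help. It assumes a reasonably recent Linux kernel that exposes the `tcp_congestion_control` and `tcp_available_congestion_control` sysctls under `/proc/sys/net/ipv4`, and a Python build that defines `socket.TCP_CONGESTION`; the helper names are made up for this illustration. On other systems the functions simply return `None` or an empty list.

```python
import socket
from pathlib import Path

# Usual procfs location of the TCP sysctls on Linux (assumed, not checked).
PROC_DIR = Path("/proc/sys/net/ipv4")


def read_sysctl(name):
    """Return the stripped contents of a net.ipv4 sysctl, or None if unreadable."""
    try:
        return (PROC_DIR / name).read_text().strip()
    except OSError:
        return None


def default_algorithm():
    """System-wide default algorithm name (e.g. 'cubic'), or None off Linux."""
    return read_sysctl("tcp_congestion_control")


def available_algorithms():
    """Names of the algorithms that are built in or currently loaded."""
    value = read_sysctl("tcp_available_congestion_control")
    return value.split() if value else []


def socket_algorithm(sock):
    """Algorithm used by one TCP socket, or None if it cannot be queried."""
    opt = getattr(socket, "TCP_CONGESTION", None)
    if opt is None:
        return None
    try:
        # The kernel returns the name in a NUL-padded buffer.
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    return raw.split(b"\x00", 1)[0].decode()


if __name__ == "__main__":
    print("default:  ", default_algorithm())
    print("available:", available_algorithms())
```

Running this before and after a kernel upgrade should show whether the default has moved from BIC to CUBIC on a particular host.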
-- Main.SimonLeinen - 27 Sep 2006
ABC (TCP Appropriate Byte Counting) default changed in Linux 2.6.18 kernel
According to a change note in Linus' kernel source management system, the default for TCP ABC (RFC3465 Appropriate Byte Counting) has been changed from on to off. This change was integrated in the 2.6.18 release. The kernel patch includes an update to the documentation for the kernel option, which concisely explains what ABC is about.
The reason that was given for changing the default to _off_ is that ABC would "unfairly penalize[...] applications that do small writes". Well, I'm always wary when I hear about "fairness" in connection with TCP, so I cannot judge the merits of the (three-character :-) code change. But the documentation change definitely looks like an improvement!
OpenSolaris DTrace Network Provider
DTrace (Dynamic Tracing) is an operating system facility that can be used to "instrument" software systems for measurement and diagnosis without modifying their code. It includes the "D" programming language and provides dynamic measurement points from different "providers", which can be in kernel components, run-time libraries, or separate run-time systems such as the Java VM.
DTrace was initially implemented in Sun's Solaris 10 system, but has been (at least in part) ported to FreeBSD, and its integration into a future version of Mac OS X has been announced. (Linux has Kprobes, which is also a dynamic instrumentation tool, but apparently limited to the kernel.)
Most DTrace usage so far has focused on traditional application and OS issues such as virtual memory, file I/O, CPU, and lock contention issues. But in principle it would be very attractive for network performance debugging, in particular because it can facilitate measurements over several "layers" fairly seamlessly.
The proposal for a new Network DTrace provider has been posted to the DTrace community discussion site on OpenSolaris.org with a call for feedback. Please consider reading the proposal and posting your comments from a network performance worker's point of view!
– Main.SimonLeinen - 19 Sep 2006
NDT 3.3.12 Release
Since last Wednesday, a new version of the NetworkDiagnosticTool (NDT) has been available for download. This version is the product of a "Google Summer of Code" project awarded to Jakub Slawinski. The announcement can be found in the ndt-announce mailing list archive. Notable new features include IPv6 support, and a distribution of the server side (including a Web100-enhanced Linux 2.6.17 kernel) as a bootable Live-CD based on the popular Knoppix system.
I have tested the software on a separate server at SWITCH last week, and today I decided to upgrade our "production" NDT server on ndt.switch.ch to the new version. The applet user interface definitely looks nicer than with the old version!
IPv6 support does work, but I haven't managed to use it from an applet inside a browser, although I am 100% certain that both my browser - I usually use Mozilla 1.7 on Linux and Solaris - and my Java runtime (1.5 something) have IPv6 support. The web100clt command-line client uses IPv6 by default, and I included an example in the NetworkDiagnosticTool topic. When I download the applet code (=Tcpbw100.jar=) and run it locally as a Java application with the right options (=java -Djava.net.preferIPv6Addresses=true -jar Tcpbw100.jar ndt.switch.ch=), it will do an IPv6 measurement. But inside the browser I always get IPv4 measurements. I still need to find out why. At any rate, it would be nice if the applet noticed when both IPv4 and IPv6 are available, and gave the user the choice.
– Main.SimonLeinen - 28 Aug 2006
DS3.3.3 (PERT Performance Guides) is out
GN2 project deliverable DS3.3.3 was published today as GN2-06-135v2: PERT Performance Guides. This is sort of a snapshot of this PERT Knowledge Base. Please read it, and contribute to the wiki to make the next release (even :-) better!
Calls for Papers: PFLDnet 2007, PAM 2007
The Call for Papers for PFLDnet 2007 just came in. The conference will be on 7-8 February, 2007 in Marina del Rey, CA (US). I added a note to the PFLDnet section in the TcpHighSpeedVariants topic. Submission deadline is 13 October, 2006.
Another conference of interest is the Passive and Active Measurement Conference (PAM). PAM2007 will take place in Louvain-la-Neuve (BE) on 5-6 April, 2007. The Call for Papers is also out for this one. Registration deadline is 5 October, 2006, with the full papers due one week later (same day as PFLDnet).
– Main.SimonLeinen - 22 Aug 2006
IP Journal article about Gigabit TCP; TCP Westwood+; Joint Techs
Gigabit TCP article in the IP Journal
I was at the IETF meeting in Montréal last week, where the latest issue of the IP Journal was handed out. It has an article on Gigabit TCP by Geoff Huston, with significant review from Larry Dunn. So it must be good - I haven't yet had the time to read the article, but it has interesting graphs about the congestion control behavior of different TCP variants. Seems to be very helpful for those of us who want to be able to understand the many different methods that have been proposed in the past. See the TcpHighSpeedVariants topic for a pointer to the article.
TCP Westwood+
On the end2end-interest mailing list, there was an announcement of a patch for the Linux kernel to implement the TCP Westwood+ variant developed at the Politecnico di Bari. I have added a new topic WestwoodPlusTCP under TcpHighSpeedVariants.
Internet2 Joint Techs Meeting: 16-19 July 2006
The Joint Techs Meeting is going on these days in Madison, Wisconsin. It is being streamed in several formats. In particular, there's an HD video version consisting of a 20 Mb/s MPEG-2 stream over IPv4 multicast. Try to receive it - it should be an excellent test of your LAN infrastructure and host processing power. Channel information is on the above-mentioned streaming page: http://winmedia.internet2.edu/jtmadison2006
Incidentally as I am typing this, someone is presenting measurement tools that are very relevant to our work (e.g. Thrulay). Unfortunately I cannot receive the HD stream because I am at home. The low-bandwidth stream has decent quality though.
– Main.SimonLeinen - 18 Jul 2006
Recent Linux kernel performance improvements
GSO (Generic Segmentation Offload)
Herbert Xu explains GSO (Generic Segmentation Offload) in a post to the =netdev@vger.kernel.org= mailing list. Before GSO, Linux had separate support for LargeSendOffload (LSO) for TCP (called "TSO" or TCP Segmentation Offload) and for UDP (called "UFO" or UDP Fragmentation Offload), but only for devices that have hardware "offloading" support for these features.
The idea behind GSO seems to be that many of the performance benefits of LSO (TSO/UFO/...) can be obtained in a hardware-independent way, by passing large "superpackets" around for as long as possible, and deferring segmentation to the last possible moment - for devices without hardware segmentation/fragmentation support, this would be when data is actually handed to the device driver; for devices with hardware support, it could even be done in hardware.
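Why deferring segmentation helps can be illustrated with a toy model. The sketch below is not how the kernel implements GSO; the `Counter` "layers", the function names, and the 64 KiB / 1448-byte figures are made up for illustration. It only counts how often per-packet processing runs when a payload is split early versus just before the (imaginary) driver.

```python
def segment(payload, mss):
    """Split one large "superpacket" payload into chunks of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


class Counter:
    """Stand-in for one stack layer; it only counts how often it is invoked."""

    def __init__(self):
        self.calls = 0

    def process(self, packet):
        self.calls += 1
        return packet


def send_early_segmentation(payload, mss, layers):
    """Segment first, so every layer handles every small packet."""
    wire = []
    for seg in segment(payload, mss):
        for layer in layers:
            seg = layer.process(seg)
        wire.append(seg)
    return wire


def send_deferred_segmentation(payload, mss, layers):
    """GSO-like: layers see a single superpacket, split at the last moment."""
    for layer in layers:
        payload = layer.process(payload)
    return segment(payload, mss)


if __name__ == "__main__":
    data = bytes(64 * 1024)
    early = [Counter() for _ in range(3)]
    late = [Counter() for _ in range(3)]
    send_early_segmentation(data, 1448, early)
    send_deferred_segmentation(data, 1448, late)
    print("layer calls, early split:   ", sum(c.calls for c in early))
    print("layer calls, deferred split:", sum(c.calls for c in late))
```

Both variants put the same segments on the wire; the difference is only in how many times the upper layers are traversed, which is where the savings would come from.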
The GSO code is being added to the Linux kernel tree between 2.6.17 and 2.6.18.
TSO/ExplicitCongestionNotification conflicts being resolved?
Apparently, the current Linux kernel implementation disables TSO when ECN is used. The move to GSO could provide a good opportunity to lift this restriction.
TCP Segmentation Offload over IPv6 for tg3
In addition, a patch for TSO support for TCP over IPv6 for the =tg3= driver has been committed today, despite David Miller's warnings about the difficulties of implementing TSO for IPv6.
-- Main.SimonLeinen - 04 Jul 2006
Changes for SC Bandwidth Challenge
The SC06 conference sent their June 2006 newsletter last night, including a note about their Bandwidth Challenge, which they seem to be attempting to make more realistic every year:
SC06 Bandwidth Challenge: End-to-End Achievement
"For six years, the Bandwidth Challenge has been an exciting and engrossing activity at SC conferences. This year the Bandwidth Challenge will focus on a different important facet of networking: End-to-End achievement. Can you fully utilize one 10 Gig path, end-to-end, disk-to-disk, from SC06 in Tampa back to your home institution, using the actual production network back home? Can you realize, demonstrate and publish all of the configuration, troubleshooting, tuning and policies, not only to show off at SC06, but to leave a legacy at your home institution whereby your scientists can achieve the same results after you? This is a decidedly different slant from the previous Bandwidth Challenge competitions, but one that is well worth embracing and will prove both challenging and inspiring!"
More information: http://sc06.supercomp.org/conference/hpc_bandwidth
This Web page mentions that for the next show, there should be a lower number of dedicated "lambdas" into the conference venue, the idea being that users share the existing research networking infrastructure to get there.
– Main.SimonLeinen - 21 Jun 2006
New York Times mentioning propagation delay and its impact on performance
The Technology section of the New York Times ran an article about huge new data centers being built by Google and other companies, Hiding in Plain Sight, Google Seeks More Power. The article mentioned that Google in particular is distributing their servers globally because of speed-of-light based delay:
Google has found that for search engines, every millisecond longer it takes to give users their results leads to lower satisfaction. So the speed of light ends up being a constraint, and the company wants to put significant processing power close to all of its users.
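The speed-of-light constraint can be put into rough numbers. The sketch below assumes that light in optical fibre travels at about two thirds of its vacuum speed (a refractive index of roughly 1.5) and ignores routing detours, queueing, and server processing, so it gives a lower bound rather than a prediction. The function name and the 6000 km example path are made up for illustration.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.5       # assumed typical value for optical fibre


def min_rtt_ms(path_km, refractive_index=FIBRE_REFRACTIVE_INDEX):
    """Lower bound on the round-trip time over a fibre path, in milliseconds."""
    if path_km < 0:
        raise ValueError("path length must not be negative")
    speed_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return 2 * path_km / speed_km_s * 1000


if __name__ == "__main__":
    for km in (100, 1000, 6000):
        print(f"{km:>5} km of fibre: at least {min_rtt_ms(km):.1f} ms round trip")
```

Under these assumptions, a 6000 km path costs roughly 60 ms per round trip before any server does any work, which is why moving servers closer to users is the only way to win back those milliseconds.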
– Main.SimonLeinen - 15 Jun 2006
Slashdot topic: "ISPs Offer Faster Speeds, Why Don't We Get Them?"
This is an interesting topic in general, and although this is mostly about commercial issues such as "truth in advertising", some of the discussion might be relevant for PERT/end-to-end performance work. Here's the catchphrase from the topic introduction:
my grandmother signed up for the 3Mbps DSL plan through Verizon, however a speed test said she was only getting 750Kbps
My current personal pet peeve is that we should stop using "bandwidth" as the indicator for performance (or even "connectivity", see e.g. slide 11 of David West's talk about DANTE's intercontinental activities at TNC'06).
– Main.SimonLeinen - 2 Jun 2006
Xtrace
While randomly surfing the Internet, I stumbled over the new RAD Lab at U.C. Berkeley. The Lab is holding their summer retreat as I write this, and one of the talks today is about Xtrace: a Cross-layer network trace tool. This seems to study ways of providing performance (and other) instrumentation over multiple nodes and multiple protocol layers. In a somewhat researchy stage, but certainly relevant to PERT/end-to-end performance work!
(The talk starts in about six hours, so hurry if you want to participate. I don't know where it is, but it seems to be a 4-5 hour bus ride from Berkeley. Also the meeting is called a "retreat", so it is probably not public. :-)
– Main.SimonLeinen - 1 Jun 2006
Discussions at TNC 2006
I'm back from the TERENA Networking Conference (TNC 2006) in Catania, Italy. Had some interesting discussions there about NetworkBufferSizing. Many backbone operators are afraid of buying routers with small buffers. I have been, too, but now I have convinced myself that with the money we save by buying less-buffered routers, we should be able to provision enough bandwidth so that queueing will never be a problem...
nepim (Network Pipemeter)
Found the announcement for a new tool called nepim or "Network Pipemeter" in the comp.protocols.tcp-ip USENET newsgroup. This seems to be similar to Iperf - can someone look at it?
-- Main.SimonLeinen - 20 May 2006
Back from the IETF meeting. The pmtud minutes had some information about Linux plans for implementing the new PathMTU proposal, which I noted in that topic.
-- Main.SimonLeinen - 27 Mar 2006
Linux 2.6.16 came out today. Its BIC TCP implementation was changed to use the cubic window growth function suggested in CUBIC. Added a note to TcpHighSpeedVariants.
Linux 2.6.16 also adds a new random-packet-corruption feature to netem. Added a note to the NetEm topic, and cleaned up the text a bit.
– Main.SimonLeinen - 20 Mar 2006
The deadline for the second release of the "performance guides" deliverable (DS3.3.2v2) is drawing close - it should be ready for internal review by the end of March. In preparation, I made some modifications to this Knowledge Base, in particular surrounding the traceroute family of tools, where there was some duplication as well as some missing information. There are still gaps that I'd like others to work on, for example some demos of Windows tools like PingPlotter or PathPing, as well as more introductory text on when and how to use traceroute, its limitations, etc.
I'd also get rid of the GeneralTools topic, which I consider a catch-all category that can only confuse readers. By moving traceroute out of it, it's already much smaller. The remaining text on "ping" could be moved to a separate topic, maybe under "active measurement" (but I'm not thrilled about that category either :-), and the other small sections on Linux-specific end-system monitoring and configuration tools could be moved to LinuxOSSpecific.
– Main.SimonLeinen - 28 Feb 2006
ACM Queue has started to publish "QueueCasts" on their Web server. These are interviews in the form of "podcasts", i.e. MP3 files that you can put on your portable music player and listen to while on the bus etc. I found the interview with Jarod Jenson, "Large Scale Systems: Best Practices", particularly interesting from a network performance point of view. Jarod talks about modern system instrumentation such as Sun's DTrace, the general problem of "finger-pointing", and many other performance issues in the context of large distributed systems. On the ACM Queue site, there is also a text version of an interview with Jarod Jenson by Kirk McKusick, "A Conversation with Jarod Jenson".
– Main.SimonLeinen - 17 Feb 2006
Continued my investigation of Van Jacobson's networking rearchitecture ideas (see yesterday's entry). Added the pointers to the VanJacobson topic, and sent them to Jamal Hadi Salim, who is favorably mentioned in Van's slides.
-- Main.SimonLeinen - 04 Feb 2006
Today there was quite a bit of traffic on the IRTF End-to-end mailing list on comparative evaluation of TcpHighSpeedVariants, including a somewhat heated discussion between Doug Leith and Injong Rhee, who were both posting from PFLDnet 2006 in Tokyo. This prompted me to look at Injong et al.'s A step toward realistic evaluation of high-speed TCP protocols again (already referenced from TcpHighSpeedVariants), and print it out. While I was at the site, I surfed up to the BIC TCP Web page and noticed a broken reference to the above-mentioned comparison paper, which I pointed out to Injong (who was probably on his way back from Tokyo).
Surfing on from there I somehow stumbled over Dave S. Miller's (DaveM) blog entry about a talk at LCA 2006 by Van Jacobson on rearchitecting device drivers and buffer management for networking. Very intriguing. Googled through the Blogosphere until I found the slides of this presentation.
-- Main.SimonLeinen - 03 Feb 2006
Main.TobyRodwell suggested adding a "Latest News" section to the PERT Knowledge Base, so let's try this.
-- Main.SimonLeinen - 31 Jan 2006