Vista Multimedia Playback and Network Throughput

A few weeks ago a poster with the handle dloneranger reported in the 2CPU forums that he experienced reduced network throughput on his Vista system when he played audio or video. Other posters chimed in with similar results, and in the last week attention has been drawn to the behavior by other sites, including Slashdot and ZDNet blogger Adrian Kingsley-Hughes.

Many people have correctly surmised that the degradation in network performance during multimedia playback is directly connected with mechanisms employed by the Multimedia Class Scheduler Service (MMCSS), a feature new to Windows Vista that I covered in my three-part TechNet Magazine article series on Windows Vista kernel changes. Multimedia playback requires a constant rate of media streaming, and playback will glitch or sputter if its requirements aren’t met. The MMCSS service runs in the generic service hosting process Svchost.exe, where it automatically prioritizes the playback of video and audio in order to prevent other tasks from interfering with the CPU usage of the playback software:

When a multimedia application begins playback, the multimedia APIs it uses call the MMCSS service to boost the priority of the playback thread into the realtime range, which covers priorities 16-31, for up to 8ms of every 10ms interval, depending on how much CPU the playback thread requires. Because other threads run at priorities in the dynamic range, at 15 and below, even very CPU-intensive applications won’t interfere with the playback.
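As a rough illustration of the numbers in that description, the following Python sketch models the boost window. The constant and function names are made up for this example and are not part of any Windows API; only the figures (priorities 16-31, up to 8ms of every 10ms) come from the text above.

```python
# Toy model of the MMCSS boost window described above.
# All names here are illustrative; none of this is a Windows API.

REALTIME_PRIORITIES = range(16, 32)  # priorities 16-31 inclusive
DYNAMIC_PRIORITIES = range(1, 16)    # priorities 1-15 inclusive
INTERVAL_MS = 10                     # length of one scheduling interval
MAX_BOOST_MS = 8                     # longest boosted stretch per interval


def boosted_time_ms(cpu_needed_ms: float) -> float:
    """Time per interval the playback thread runs boosted, capped at 8ms."""
    return max(0.0, min(cpu_needed_ms, MAX_BOOST_MS))


def unboosted_time_ms(cpu_needed_ms: float) -> float:
    """Time per interval that is not spent at the boosted priority."""
    return INTERVAL_MS - boosted_time_ms(cpu_needed_ms)


def runs_first(priority_a: int, priority_b: int) -> bool:
    """A ready thread with the higher priority is scheduled ahead of a lower one."""
    return priority_a > priority_b
```

For instance, a playback thread that wants 9.5ms of CPU per interval is boosted for only 8ms, which leaves 2ms of every interval in which it does not outrank everything else.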

You can see the boost by playing an audio or video clip in Windows Media Player (WMP), running the Reliability and Performance Monitor (Start->Run->Perfmon), selecting the Performance Monitor item, and adding the Priority Current value for all the Wmplayer threads in the Thread object. Set the graph scale to 31 (the highest priority value on Windows) and you’ll easily spot the boosted thread, shown here running at priority 21:

Besides activity by other threads, media playback can also be affected by network activity. When a network packet arrives at a system, it triggers a CPU interrupt, which causes the device driver for the device at which the packet arrived to execute an Interrupt Service Routine (ISR). Other device interrupts are blocked while ISRs run, so ISRs typically do some device book-keeping and then perform the more lengthy transfer of data to or from their device in a Deferred Procedure Call (DPC) that runs with device interrupts enabled. While DPCs execute with interrupts enabled, they take precedence over all thread execution, regardless of priority, on the processor on which they run, and can therefore impede media playback threads.

Network DPC receive processing is among the most expensive, because it includes handing packets to the TCP/IP driver, which can result in lengthy computation. The TCP/IP driver verifies each packet, determines the packet’s protocol, updates the connection state, finds the receiving application, and copies the received data into the application’s buffers. This Process Explorer screenshot shows how CPU usage for DPCs rose dramatically when I copied a large file from another system:

Tests of MMCSS during Vista development showed that, even with thread-priority boosting, heavy network traffic can cause enough long-running DPCs to prevent playback threads from keeping up with their media streaming requirements, resulting in glitching. MMCSS’ glitch-resistant mechanisms were therefore extended to include throttling of network activity. MMCSS does so by issuing a command to NDIS, the driver that gives packets received by network adapter drivers to the TCP/IP driver. The command causes NDIS to “indicate”, or pass along, at most 10 packets per millisecond (10,000 packets per second).
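To make the reasoning concrete, here is a minimal Python sketch of why capping the packet rate bounds the CPU time DPCs can take from a playback thread. It is a toy model, not how MMCSS or NDIS is implemented: the per-packet DPC cost is an invented figure, and all names are hypothetical. Only the 10ms interval and the 10,000 packets-per-second cap come from the article.

```python
# Toy model: receive DPCs run ahead of every thread, so they eat into the
# CPU time a playback thread can get in each 10ms interval.

INTERVAL_MS = 10.0
PER_PACKET_DPC_US = 30.0  # ASSUMPTION: invented per-packet cost, not a measurement
NDIS_CAP_PPS = 10_000     # cap from the article: 10 packets per millisecond


def dpc_ms_per_interval(packets_per_sec: float) -> float:
    """Milliseconds of each interval consumed by receive DPCs."""
    packets = packets_per_sec * INTERVAL_MS / 1000.0
    return min(INTERVAL_MS, packets * PER_PACKET_DPC_US / 1000.0)


def playback_glitches(packets_per_sec: float, playback_needs_ms: float) -> bool:
    """True if DPCs leave less CPU time than playback needs in an interval."""
    available_ms = INTERVAL_MS - dpc_ms_per_interval(packets_per_sec)
    return available_ms < playback_needs_ms


def throttled(packets_per_sec: float) -> float:
    """Receive rate after the cap on indicated packets is applied."""
    return min(packets_per_sec, NDIS_CAP_PPS)
```

Under these assumed numbers, an unthrottled 80,000 packets per second would consume the whole interval, so a playback thread needing 4ms would glitch. Capping the rate at 10,000 packets per second limits DPC time to 3ms per interval and leaves 7ms for threads.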

Because the standard Ethernet frame size is about 1500 bytes, a limit of 10,000 packets per second equals a maximum throughput of roughly 15MB/s. 100Mb networks can handle at most 12MB/s, so if your system is on a 100Mb network, you typically won’t see any slowdown. However, if you have a 1Gb network infrastructure and both the sending system and your Vista receiving system have 1Gb network adapters, you’ll see throughput drop to roughly 15%.

Further, there’s an unfortunate bug in the NDIS throttling code that magnifies throttling if you have multiple NICs. If you have a system with both wireless and wired adapters, for instance, NDIS will process at most 8000 packets per second, and with three adapters it will process a maximum of 6000 packets per second. 6000 packets per second equals 9MB/s, a limit that’s visible even on 100Mb networks.
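The throughput figures above follow from simple arithmetic, sketched here in Python. The function names are illustrative, a megabyte is taken as 10^6 bytes, and link capacity ignores protocol overhead. The packet caps per adapter count are the ones reported in the article.

```python
FRAME_BYTES = 1500  # approximate standard Ethernet frame size

# Packet caps by number of adapters, as reported in the article.
CAP_PPS_BY_ADAPTERS = {1: 10_000, 2: 8_000, 3: 6_000}


def capped_mb_per_sec(adapters: int, frame_bytes: int = FRAME_BYTES) -> float:
    """Maximum receive throughput in MB/s while the throttle is active."""
    return CAP_PPS_BY_ADAPTERS[adapters] * frame_bytes / 1_000_000


def link_mb_per_sec(link_mbit: int) -> float:
    """Raw link capacity in MB/s, ignoring protocol overhead."""
    return link_mbit / 8


def fraction_of_link(adapters: int, link_mbit: int) -> float:
    """Share of the link usable while throttled, never more than 1.0."""
    return min(1.0, capped_mb_per_sec(adapters) / link_mb_per_sec(link_mbit))
```

With full-size frames this gives 15, 12 and 9 MB/s for one, two and three adapters. On a 1Gb link (125 MB/s raw) the single-adapter cap works out to about 12% of capacity, while on a 100Mb link (12.5 MB/s raw) only the three-adapter case falls below what the link can deliver. Passing a smaller `frame_bytes` shows how partially filled frames lower the ceiling further.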

I caused throttling to be visible on my laptop, which has three adapters, by copying a large file to it from another system and then starting WMP and playing a song. The Task Manager screenshot below shows how the copy achieves a throughput of about 20%, but drops to around 6% on my 1Gb network after I start playing a song:

You can monitor the number of receive packets NDIS processes by adding the “packets received per second” counter in the Network object to the Performance Monitor view. Below, you can see the packet receive rate change as I ran the experiment. The number of packets NDIS processed didn’t reach the theoretical throttling maximum of 6,000, probably due to handshaking with the remote system.

Despite even this level of throttling, Internet traffic, even on the best broadband connection, won’t be affected. That’s because the multiplicity of intermediate connections between your system and another one on the Internet fragments packets and slows down packet travel, and therefore reduces the rate at which systems transfer data.

The throttling rate Vista uses was derived from experiments that reliably achieved glitch-resistant playback on systems with one CPU on 100Mb networks with high packet receive rates. The hard-coded limit was short-sighted with respect to today’s systems that have faster CPUs, multiple cores and Gigabit networks. In addition to fixing the bug that affects throttling on multi-adapter systems, the networking team is actively working with the MMCSS team on a fix that penalizes network traffic less dramatically while still delivering a glitch-resistant experience.

Stay tuned to my blog for more information.

Published Monday, August 27, 2007 8:00 AM by markrussinovich

Comments

# re: Vista Multimedia Playback and Network Throughput

Great article Mark, thank you for giving a clear explanation of the problem.

Monday, August 27, 2007 2:32 PM by Tom

# re: Vista Multimedia Playback and Network Throughput

Is this the reason my games perform A LOT worse with Windows Vista? Does playing audio in games while running TeamSpeak (or Ventrilo) at the same time reduce the graphics and processor output for my games because audio is at a higher priority (especially in MMORPGs, where the network is important)?

is there a way to put this value back to normal?

thanks

Monday, August 27, 2007 4:56 PM by Serge Munger

# re: Vista Multimedia Playback

Running XP I just tried playing an MP3 file, a video file and downloading several files from the internet all at the same time.  The audio and video files played perfectly and there was no slowdown in my network speed.

"mechanisms employed by the Multimedia Class Scheduler Service (MMCSS), a feature new to Windows Vista"

Seems to me Microsoft tried to "fix" something that wasn't broken.

Monday, August 27, 2007 5:28 PM by Richard McBeef

# re: Vista Multimedia Playback and Network Throughput

Very useful to know. I've spent a good bit of time trying to figure out why my file transfers were so slow. During all of that debugging, I probably had a music file running in the background...

Serge: Games performance is probably more related to video driver issues. My understanding is that the new Vista video driver model is theoretically just as fast as the XP video driver model, but it requires a lot of stuff to be re-written by the driver developers for best performance. Until it all gets re-written, they're using pieces from the XP driver (changing Vista's input to what the XP driver expects, then changing the XP driver's output to what Vista expects), and the result is that some tasks are done less efficiently.

On my laptop, I could hardly get any video files to play without glitching. CPU usage would go up to 100% and stay there, even though the media file played just fine with only 20% CPU usage on XP. Then by accident I was running another program that was incompatible with Aero and forced Vista to switch into the XP-style video mode. Suddenly my video started playing back smoothly at 20% video usage, just like XP.

Monday, August 27, 2007 5:37 PM by Doug

# re: Vista Multimedia Playback and Network Throughput

Serge,

I don't own Vista yet, but the first thing I thought of when I started reading this post was "games".  Good question...

Monday, August 27, 2007 5:42 PM by Matt

# re: Vista Multimedia Playback and Network Throughput

Mark,

A sincere thank you for explaining the issue in detail and not “sugar coating” it; you might renew my faith in Microsoft.

Monday, August 27, 2007 6:47 PM by Anonymous

# re: Vista Multimedia Playback and Network Throughput

Running XP I just tried playing an MP3 file, a video file and downloading several files from the internet all at the same time.  The audio and video files played perfectly and there was no slowdown in my network speed.

Monday, August 27, 2007 5:28 PM by Richard McBeef

*****

It won't affect your internet speed; just your LAN speed.

Monday, August 27, 2007 7:07 PM by Dr.Butt

# Solution found

My first thought was to disable the MMCSS service, but the Windows Audio service is dependent on it.

So I ran regedit, and changed the key

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Audiosrv\DependOnService

Just remove MMCSS from the list in that key, set MMCSS to Disabled in Services, then reboot.

As soon as I rebooted I was able to copy files at 40mb/s+ while listening to audio

-Courtney

Monday, August 27, 2007 7:25 PM by Courtney

# re: Vista Multimedia Playback and Network Throughput

Why is XP not having this problem then? This seems like an unnecessary feature of Vista.

I've spent countless hours over a couple of months trying to figure out why copying files over the network was going at 5MB/s

Monday, August 27, 2007 7:27 PM by Andrew

# re: Vista Multimedia Playback and Network Throughput

It seems to me that Vista has things backwards here. Rather than ensuring that music playback can be performed within its realtime constraints, and having programmable facilities within the kernel to ask for that, it's arbitrarily degrading other bits of the system "just in case" they might interfere. The dependency is incorrect. It ought to be the scheduler making these kinds of decisions - not the sound subsystem.

It smells more like a nasty hack - as if last-minute testing showed that there were glitches in playback with the latest kernel, so quick fixes needed to be bludgeoned in.

Monday, August 27, 2007 7:35 PM by barrkel

# re: Vista Multimedia Playback and Network Throughput

"While DPCs execute with interrupts enabled, they take precedence over all thread execution, regardless of priority, on the processor on which they run"

Am I reading this correctly: The DPCs can all run on core #1 and multimedia can run on the other cores? -> no interruption of multimedia?

Monday, August 27, 2007 8:57 PM by zzz

# re: Vista Multimedia Playback and Network Throughput

Serge: No, the MMCSS service is only used by multimedia applications like WMP (actually that's probably all that uses it ATM).

You can revert to old behavior by right clicking My Computer, clicking Manage, finding the Services entry on the left pane and finding the MMCSS service mentioned and stopping it.  To make this setting permanent, you have to go into properties and set the startup type to Disabled.  However it will not make your games run any faster. (Hmm, Courtney notes it's dependent on the Windows Audio user-mode stack and offers a workaround... I didn't know that.  Thanks Courtney!)

Games are slower overall because of all the added functionality and features in Vista... even with many services, eyecandy, and programs disabled I still find programs run better in XP (and even better in Linux!).  For example, BioShock runs horribly on my computer in Vista with input and audio lag making it unplayable.  This didn't surprise me too much as my computer didn't meet the minimum specs in the CPU department.  But, in XP, it ran quite acceptably.  I was even able to up the details settings without performance degradation.

I would recommend gamers to stick to Windows XP for the time being, unless you have a DirectX 10 video card and plan on playing DirectX 10 games (and you really, REALLY need DirectX 10 for some reason).  Also note that DirectX 10 cards will not be able to take advantage of new DirectX 10.1 features in Vista SP1.  You'd need a new card for those.

Richard: As noted in the article, you have to have quite high bandwidth (over 100mbits/s) to be able to notice any slowdown.  Also there was something broken they fixed... audio stuttering or jittering in WMP and other media applications.  Now if they only did the same thing for games I might switch over to Vista permanently.

Doug: Unless you're speaking of LAN transfers on a 1gbps network, I doubt that was your problem.

I have noticed problems when running on an IPv4 network.  IPv6 drivers are enabled by default for every network adapter, meaning every time you try to connect to a remote computer/website, Vista tries using IPv6 first.  If your network doesn't support it, this only results in wasted time.  You can disable this protocol by finding the moved "Network Connections" folder (go to Network Control Panel and click on whatever side entry relates to showing network adapters) and then right clicking Properties on desired adapters and unchecking the IPv6 protocol entries.

Andrew: XP can have this problem if enough apps are sucking up the CPU.  Some multimedia apps, like Winamp, solve the problem themselves by boosting their own priority (Winamp runs at High by default), but boosting to High or above can be dangerous: if the application is not stable, it may freeze the entire system (fortunately Winamp is stable; I've never had problems in that area).

Whenever I have an application which is running a tad pokey, especially games, I boost it to Above Normal using Process Explorer or Task Manager (cmon Mark, fix Procexp so it can replace Taskmgr when UAC is enabled).  This works with most programs (a few games go wonky though).

barrkel: I think all programmers do this.  It's called implementing the required and requested features with the least amount of work.  We're a lazy people. ;)

zzz: I'm guessing you could minimize DPCs and interrupts on one core by putting only threads essential to music playback there, but most likely this would make the core underutilized.  And just like with memory, you don't want to leave it unused because you just end up wasting time which you could be saving by filling it with work.

Monday, August 27, 2007 10:13 PM by Dan

# re: Vista Multimedia Playback and Network Throughput

An excellent explanation Mark.

Maybe I am oversimplifying things, but you mention it limits to 10,000 packets. Wouldn't the obvious solution be to increase this number (or make it configurable)? Obviously if you make it too high, your songs can skip. If you have a multi-core CPU (as most will be in the next few years), then it would seem to me to be unnecessary. Perhaps Windows could detect whether this optimisation is really helpful.

Monday, August 27, 2007 11:09 PM by Adam

# re: Vista Multimedia Playback and Network Throughput

Great post Mark!

Vinicius Canto

MVP Windows Server - Admin Frameworks

Brazil

Tuesday, August 28, 2007 12:13 AM by Vinicius Canto [MVP]

# re: Vista Multimedia Playback and Network Throughput

"I have noticed problems when running on an IPv4 network.  IPv6 drivers are enabled by default for every network adapter, meaning every time you try to connect to a remote computer/website, Vista tries using IPv6 first.  If your network doesn't support it, this only results in wasted time."

Actually, the time wasted should be very little.  Unless the hostname has an AAAA record associated with it, the system shouldn't use IPv6 to try and communicate with the host.

The only time it becomes a major delay, is when the remote host supports IPv6, and either the local or remote host isn't properly configured for IPv6.  Then, it has to time out, then switches to IPv4.

IIRC, Teredo is on by default in Vista, which means (provided the teredo relay isn't down and firewall isn't blocking) you have functioning IPv6 on your end enough to be useful.

Tuesday, August 28, 2007 12:21 AM by Brie Bruns

# re: Vista Multimedia Playback and Network Throughput

Windows Vista does allow a fine grained control of priority boost, you just need to find where to look for it instead of disabling the service (MMCSS).

http://msdn2.microsoft.com/en-us/library/ms684247.aspx

Tuesday, August 28, 2007 12:56 AM by Tanveer Badar

# re: Vista Multimedia Playback and Network Throughput

Please elaborate on why this kludge is required in Vista, when XP degrades gracefully in both the Single CPU+High Network Load+Multimedia case, and the insufficient resources to process multimedia properly case.

See: http://episteme.arstechnica.com/eve/forums/a/tpc/f/99609816/m/910004196831?r=628001007831#628001007831

Tuesday, August 28, 2007 2:08 AM by F16PilotJumper

# re: Vista Multimedia Playback and Network Throughput

Why does Vista need such a system?

Even an old Win2K box, on a PIII 1GHz with 512MB, can play an audio file without glitches during a network transfer.

Today we have "big" CPUs, and Vista cannot play an audio file smoothly during a network transfer? Such a shame ...

Tuesday, August 28, 2007 2:09 AM by Bill2

# re: Solution

I wrote up the details on my fix on my blog

http://courtneymalone.com/2007/08/28/a-note-on-vista-network-speed/

Tuesday, August 28, 2007 3:54 AM by Courtney Malone

# re: Vista Multimedia Playback and Network Throughput

Regarding multiple network cards. If I disable the other network cards will the performance go up (you said 3 cards 1/3 network performance?)

also whats up with  this key?

HKLM\Software\Microsoft\Windows NT\Currentversion\Multimedia\SystemProfile\SystemResponsiveness

seems this is an easy registry edit to allocate less cpu time for MMCSS

Tuesday, August 28, 2007 4:00 AM by Julian W

# re: Vista Multimedia Playback and Network Throughput

I developed lots of audio drivers for the older Microsoft OSes, and interestingly, had to use the DPC to process audio mixing and such so that it wouldn't glitch on systems with lots of thread activity.  The trick to avoid using all the CPU cycles for audio was to keep track of how much time was being spent in the DPC vs. the scheduled threads.

In Vista, trying to "guess" how much of the CPU is being used for MM operations is unrealistic, since it spans the domain from 128kb MP3 playback to DRM'ed HD audio/video playback.  

The MMCSS algorithm defaults to 8ms for MM threads and 2ms for the other threads, but the flaw in the algorithm is that the IRPT and DPC times are not subtracted from the wall-clock times.

Suppose that a really busy network transfer used up 50% of the cpu in the DPC routines.  MMCSS should then see that only 4ms is available for mm out of the 8ms maximum.  The audio stack must do its part, then, by letting the MMCSS know if that 4ms was enough to process 10ms of output.  For simple audio, it may need only 1ms to generate 10ms of audio output.  OTOH, it may need more than 4ms to ensure no glitching, and THAT is the point that the network should be throttled back.

I think Vista gives most of the info needed to do this right -- the new kernel thread time accounting now subtracts DPC (and probably IRPT) from the thread's scheduling quantum.  The audio stack "pulls" output from the hardware back through the various drivers, which should allow determination of how well the computational load is being handled.

Looks like SP1 is needed now more than ever!  Tell Ballmer to forget the market-timing issues and just get a Vista SP out in the field.  I'm even thinking seriously about ripping Vista off my new laptop and desktop if SP1 is not going to appear in the near future.

Thanks for your as usual well-stated description of the problem!

Tuesday, August 28, 2007 5:05 AM by Jerry Schneider

# re: Vista Multimedia Playback and Network Throughput

Mark,

I've been following your insights for years. Once again, thank you for the excellent explanation. We have 2 users on the network who use Vista and have complained w/network issues.

Once again Mark, thanks for the years and all the great utilities.

PS: I use XP but have tried Vista. After spending over 15 hours of learning and tweaking, I finally gave up and uninstalled it. After 3 weeks of hard disk thrashing (I continually fought w/Vista services which "magically" reset to default thrash mode), it became futile. You may want to pass up the chain, it is for these reasons Vista will not be quickly adopted.

Tuesday, August 28, 2007 8:26 AM by Steve

# re: Vista Multimedia Playback and Network Throughput

All good and nice, but :

1.  It seems that people have this problem even AFTER they disabled MMCSS.

2.  The performance impact is way bigger (from 500..1000 Mbps to sub-100 Mbps or even lower). This is by no means a minor impact.

3. As others pointed out, it seems like a classic problem of "Shoot yourself in the foot"  because previous versions of Windows do not have this issue nor audio playback smoothness problems.

4. If you really want to help maybe you should take a look on the original forum page (http://forums.2cpu.com/showthread.php?t=83112) and see that the impact reported by other users and what you explain here does not quite add up.

5. As a personal note : with today's computing power  this kind of problem on dual/quad core machines is ....well I don't know if I should say "funny" or "pathetic" :)

Tuesday, August 28, 2007 8:44 AM by Gigel

# re: Vista Multimedia Playback and Network Throughput

"100Mb networks can handle at most 12MB/s, so if your system is on a 100Mb network, you typically won’t see any slowdown".

I think you need to learn a bit about networking. Not every frame is full. At half full, the limited number of frames is reached at 7.5MB/s, well within a 100Mb networks ability.

Tuesday, August 28, 2007 9:02 AM by c3

# re: Vista Multimedia Playback and Network Throughput

This whole thing reeks of shoddy hack.

Seriously, whose idea was this? Have they been taken out and shot yet?

Tuesday, August 28, 2007 9:04 AM by Jimbo

# re: Vista Multimedia Playback and Network Throughput

sell out. you were all about discovering and fixing bugs/glitches, or even malware. now you are part of the MS propaganda. this kind of problem shows how flawed windows vista is. davinci would have torn the whole thing apart and started again.

Tuesday, August 28, 2007 9:06 AM by tomthebomb

# re: Vista Multimedia Playback and Network Throughput

Does this take into account jumbo frames if you're using a gigabit network?  If your frame size is, say, 9000 bytes, at 10000 packets/sec and 8 bits/byte, that's 720Mbit (theoretical) throughput.  That's still less than 100% but it's not horrible...

Tuesday, August 28, 2007 9:09 AM by steveo

# re: Vista Multimedia Playback and Network Throughput

This doesn't make any sense design-wise.

Why does every OS so far, even WinXP and every *NIX OS, have no trouble handling multimedia + network?

Furthermore pc hardware get faster over time not slower - the thinking behind this design implies that hardware gets slower because previously in other OS it worked fine.

Once again, braindead developers at M$ which doesnt surprise me at all

Tuesday, August 28, 2007 9:09 AM by Mr. Obvious

# re: Vista Multimedia Playback and Network Throughput

So a non-privileged userland application is able to modify global kernel parameters?

Sounds like a good idea to me.

Any other bits of the kernel I can mess with from a non-priv account?

Tuesday, August 28, 2007 9:26 AM by Dave

# re: Vista Multimedia Playback and Network Throughput

Why have to guess a good value for max packets/sec while playing audio? Why not have some intelligent monitoring to find if audio playback thread is starving for cpu? Like check for empty buffers? Profile normal playback cpu needs, and make sure it receives that?

vajk

Tuesday, August 28, 2007 9:30 AM by Vajk

# re: Vista Multimedia Playback and Network Throughput

What I find most discouraging about this isn't the hack to workaround what was probably a non-issue, but the fact that *copying a file* takes 41% of the CPU.  What kind of networking stack has that kind of processor overhead?

Tuesday, August 28, 2007 9:41 AM by Rob

# re: Vista Multimedia Playback and Network Throughput

To me, this seems like overly optimistic resource allocation...

I agree 100% that it needed to happen, (and I'm grateful for Mark's detailed response/ explanation) however, it seems to be at so much overhead that it degrades other services...

Given the multi-core CPU scenario we live in now, is this optimization even so necessary?

I remember trying to set up MS ISA 2 years ago on a SMP server (in a hurry) and got very poor performance due to "CPU's fighting over controlling the NIC's" - (I realize that now that you can pin a NIC to a particular CPU)... Is this similar?

Is there any way to pin or isolate these competing processes such that we are not stuck with such low limits to utilization?

thanks again for a great explanation Mark.

Tuesday, August 28, 2007 9:46 AM by Chris

# re: Vista Multimedia Playback and Network Throughput

What I think a few posters are missing is the fact that this has nothing to do with overall CPU speed.  Overall, CPUs are fast and getting faster.  This has to do with priority over the course of milliseconds.  

Both networking and media playback require instant results - because TCP/IP gets processed when it's received, and because most audio is sampled at over 40 kHz.  On a 2 GHz machine, that means that it's playing something every 22 microseconds (about once every 45,350 processor cycles, and it takes a few cycles to play something).  If it delays too long, or drops a few samples, you hear skipping.

As a result, the concern is that if both network functions and multimedia need the CPU *right now* then you have a collision.  That's why MS limited the networking so it only comes in on average once every 100 microseconds, to prevent that.

Tuesday, August 28, 2007 9:54 AM by Zachary Pruckowski

# re: Vista Multimedia Playback and Network Throughput

One addition: Microsoft completely rewrote the TCP/IP stack for Vista, and doing so they surely made some... hm, mistakes. This might be another reason of strange behavior.

See http://www.microsoft.com/technet/community/columns/cableguy/cg0905.mspx and lots of other pages.

Tuesday, August 28, 2007 10:05 AM by K

# re: Vista Multimedia Playback and Network Throughput

Actually, multimedia glitching is a pretty well-known problem to heavy audio and video users (i.e. multitrack work, editing, not just playback).  A whole alternate universe of drivers has sprung up to deal with this.  It's much less of a problem than it used to be, but under heavy loads, it's still an issue.

Sounds like Microsoft tried to be over-proactive about it, and (as others have said) shot themselves in the foot.  I wonder if this isn't something that went into Vista back in 2001 as an obvious necessity, and then wasn't looked at toward the end of the release cycle...

Tuesday, August 28, 2007 10:07 AM by Jay Levitt

# re: Vista Multimedia Playback and Network Throughput

Responding specifically to Chris, my understanding is that Vista can't do the kind of things you talk about (segregate functions by processor) even if it would be an effective solution, because they simply can't target dual-core machines yet.

There's a significant adoption lag that Microsoft has to adapt to.  Let's briefly look at the gaming world for an example, and then come back to this.  Currently, game designers/programmers for many games have to make sure the game can run on computers as far back as pre-HT P4s.  Therefore, they can't fully optimize for a two-core world across the line yet.  

The same situation occurs with Vista.  Given the average age of the PC install base, and Vista's minimum CPU requirements (800 MHz single core), they can't optimize for dual-core in that sort of a manner.  Also, remember that Vista was designed in a period from 2001-2006.  Dual-cores only became available in mid-2005, and weren't really prevalent in most selling models until mid-2006 (release of Conroe in July, and subsequent price war).  The Vista RTM was shortly thereafter.

Tuesday, August 28, 2007 10:09 AM by Zachary Pruckowski

# re: Vista Multimedia Playback and Network Throughput

Something as essential as music playback and file copying should not require days of investigation by users. Someone write better docs!

PS - I have a *phone* running a 200 MHz ARM with a piddly little OS and am able to play streaming MP3's on a broadband wireless network with no hassles. And a dual-core, 2 GHz desktop uses 41% CPU to simply copy files???

Tuesday, August 28, 2007 10:20 AM by s t

# re: Vista Multimedia Playback and Network Throughput

I did some playing on my XP system and got similar results as Richard.  I transferred a copy of Win2KSP4 while I listened to a podcast I have downloaded on my local system.  I did not see any decay in the file transfer speed over the LAN.   I do not consider Mark a sellout because he reported  on this issue.  He identified it, he did not condone it.  IMHO, this seems like a misplaced performance tweak on MS part.  I hope it goes away in SP1.  We still have not moved to Vista, I have intentionally ordered new systems with XP.  It works and I am not willing to turn my shop into a test bed for a first release of an OS.

Tuesday, August 28, 2007 10:28 AM by Guy

# re: Vista Multimedia Playback and Network Throughput

Hi, I am the IPv6 Program Manager at Microsoft.  

Regarding the comment "Unless the hostname has an AAAA record associated with it, the system shouldn't use IPv6 to try and communicate with the host."

Even if the destination has an A and an AAAA record, Vista will prefer IPv4 over Teredo.  The order of precedence is IPv6, IPv4, THEN Teredo.  So a destination host with an A and AAAA record will always be reached using IPv4, NOT TEREDO.

The only time Teredo would be used is if the destination host ONLY had a AAAA record, and there are darn few of those out there.  In other words, leaving IPv6 (and Teredo) enabled on your home PC has absolutely no impact on your networking performance.

Tuesday, August 28, 2007 10:52 AM by Sean Siler [MSFT]

# re: Vista Multimedia Playback and Network Throughput

Rob: ... *copying a file* takes 41% of the CPU.  What kind of networking stack has that kind of processor overhead?

I observed this kind of behavior when copying files using SMB. The same file copied from the same server using FTP was many times faster.

Regards, Lothar

Tuesday, August 28, 2007 10:54 AM by Lothar

# re: Vista Multimedia Playback and Network Throughput

@Chris   re: (I realize that now that you can pin a NIC to a particular CPU)

can you elaborate?  I'm looking around and can't find any info - this could be helpful on one of my servers.

Thanks

Tuesday, August 28, 2007 11:00 AM by Erik

# re: Vista Multimedia Playback and Network Throughput

I have had a ton of issues with media playback in Vista on both single core and dual core platforms, with and without Aero even when it could easily support it.  My solution has been to shut down as many ancillary services as possible, and there are a lot.  Most, if not all, of the security services are gone, and that was strictly for my sanity.  Many of the disk services and indexing are shut down. I do actually know where my content is and don't need any help finding it.  And then I shut down some more things that just seemed to be hanging out and not really providing any immediately useful service.  The result is a fairly smooth running system that runs Aero, the sidebar, other applications, and my iTunes videos full screen without glitches.  Prior to shutting all of these things down, iTunes videos were unwatchable, streaming internet video (i.e. simple slideshows) was unwatchable, and the whole media experience was enough to make me beg for XP back, or even 2000 Pro.  My disk access has gone from essentially constant to only when I'm actively doing something.  Now I have my LED for power that stays on, and my disk activity LED is no longer solid 24/7.  Things that ran fine on XP systems with less than half the performance of the current system now run like they should.  Large files also move across the Gigabit network like they should.  Vista still won't play wmv files that run fine on my other XP boxes.  Winamp will play them just fine on Vista, but Media Player says there is a problem with my WHQL video card.

Why all of this?  There are many other issues with Vista playing media than the one network issue mentioned here, and eliminating many of the ancillary services will go a long way, but not nearly all the way, to solving them.  The error codes that Windows Media Player provides are of course useless because there is no description for them.  I would have thought that all of the effort going into the driver certification process and application certification process would have resulted in a more stable and better performing system, but alas this has not been the case.

Tuesday, August 28, 2007 11:03 AM by Mike E

# re: Vista Multimedia Playback and Network Throughput

> Despite even this level of throttling, Internet

> traffic, even on the best broadband connection, won’t

> be affected.

This is only a valid analogy for someone downloading a single file off the Internet from a single source.

Throw in P2P-anything and you get a huge number of small packets which Vista will happily throttle for you [sigh], regardless of your actual bandwidth.

One problem is you use packets/sec regardless of how 'full' those packets are.  Another is that Vista apparently has a horribly inefficient TCP stack... 40% CPU usage for a file copy?

Tuesday, August 28, 2007 11:07 AM by Redesign It

# re: Vista Multimedia Playback and Network Throughput

"Games are slower overall because of all the added functionality and features in Vista... even with many services, eyecandy, and programs disabled I still find programs run better in XP (and even better in Linux!).  For example, BioShock runs horribly on my computer in Vista with input and audio lag making it unplayable.  This didn't surprise me too much as my computer didn't meet the minimum specs in the CPU department.  But, in XP, it ran quite acceptably."

- Not to mention all the DRM throughout the audio and video stack.  Benefitting who exactly?  Not really the consumer, they're better off with XP arguably.  DX10 runs a fair bit quicker on XP and Linux (with the "backported" un-official release), that surely means something is very wrong.

Tuesday, August 28, 2007 11:11 AM by Ken Davis

# re: Vista Multimedia Playback and Network Throughput

So this explains the problems with MCE's having stutter problems? As they push data over the network while doing 'media playback' at the same time?

The Transcode360 forums are full of this problem too.

Tuesday, August 28, 2007 12:09 PM by Lee

# The *MAXIMUM* standard frame is 1500

Correction: "the standard Ethernet frame size is about 1500 bytes" should be "the *MAXIMUM* standard Ethernet frame size is about 1500 bytes."

This is significant because it is a "well known fact" that most Ethernet frames are a lot less than the maximum length.  It is only when pumping lots of data (e.g. file transfers) that maximum size frames are used heavily, and even then the acknowledgment frames going back to the data source are typically 64 bytes (minimum size).

The minimum frame is 64 bytes, which works out to about 164,000 frames/second so throttling at 10,000 frames/second will throttle the network a *whole* lot worse than the calculation based on 1500 byte packets.

Note that there is the same amount of overhead in a 1500 byte frame and a 64 byte frame, meaning the amount of useful data that gets passed in shorter frames is even worse.
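
The arithmetic behind these frame rates can be checked with a short Python sketch. The function names are made up for illustration, and the 10,000 frames/second cap is simply the figure quoted above. The per-frame wire overhead of 20 bytes (8-byte preamble plus 12-byte interframe gap) is standard Ethernet; the 164,000 figure appears to match a 100 Mb/s link when only the 12-byte gap is counted, though that is a guess about how it was derived.

```python
# Back-of-the-envelope Ethernet frame-rate arithmetic.
# Function names are hypothetical; the 10,000 frames/s cap is the
# throttle figure quoted in the discussion above.

def max_frames_per_second(link_bps, frame_bytes, overhead_bytes=20):
    """Frames per second a link can carry at a given frame size.

    overhead_bytes is the per-frame cost on the wire beyond the frame
    itself: 8 bytes of preamble plus a 12-byte interframe gap.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


def throughput_at_cap_bps(frame_bytes, cap_fps=10_000):
    """Bits per second that get through if frames are capped at cap_fps."""
    return frame_bytes * 8 * cap_fps


for size in (64, 1500):
    line_rate = max_frames_per_second(100e6, size)
    capped_mbps = throughput_at_cap_bps(size) / 1e6
    print(f"{size:>4}-byte frames: {line_rate:,.0f} frames/s at 100 Mb/s line rate, "
          f"{capped_mbps:.2f} Mb/s under a 10,000 frames/s cap")
```

At a 10,000 frames/second cap, 1500-byte frames still carry 120 Mb/s, while 64-byte frames carry only 5.12 Mb/s, which is the gap between the two calculations.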

Tuesday, August 28, 2007 12:33 PM by gvb

# re: Vista Multimedia Playback and Network Throughput

Robert Love (Linux kernel developer and author of Linux Kernel Development) has a good article comparing Linux's implementation with Vista's. He states that even on Gigabit Ethernet, the Linux equivalent of Vista's DPC is unable to generate more than a minuscule amount of CPU usage.

So can it be concluded that Vista's network performance is abysmal, and that to plug that hole Microsoft had to play around with prioritizing multimedia and penalizing network performance? Why would this ugly hack be required if Vista's network implementation did not consume this much CPU?

Mark - An answer would be really appreciated. Thanks.

Here is the URL for Robert Love's blog post - http://blog.rlove.org/2007/08/those-dang-dpcs-clogging-mmcss.html

Tuesday, August 28, 2007 12:36 PM by LinuxGuy

# re: Vista Multimedia Playback and Network Throughput

The real questions, I suppose, have to do with the various virtualization scenarios - XP under Vista, or Vista under XP, and whether this resource allocation can be "fooled" under these circumstances.

Tuesday, August 28, 2007 12:45 PM by Neil Prestemon

# re: Vista Multimedia Playback and Network Throughput

I just tried killing MMCSS along with the registry hack for the audio service, and it works like a charm. I was stuck at 10% transfers, and now I hit 80%. Fantastic! The interesting thing is why I was experiencing this before. Well, the majority of my network transfers are with media files, which means that I open a folder with lots of media files (i.e., AVIs) and copy them to another folder with lots of media files. No music playing, no video playing, and yet Vista slows down. Why? Because Explorer is busy creating previews of all those media files, hence kicking in MMCSS and its silly slowdown! Well there you go, problem solved, another useless piece of software goes down.

Tuesday, August 28, 2007 12:51 PM by FooBarBaz

# re: Vista Multimedia Playback and Network Throughput

And, yes, the same hardware running Linux can play back a glitch free mp3 while keeping the network link fully loaded with send and receive data.

The answer to this particular problem - switch to Linux.

Tuesday, August 28, 2007 1:04 PM by Anon

# re: Vista Multimedia Playback and Network Throughput

"The minimum frame is 64 bytes, which works out to about 164,000 frames/second so throttling at 10,000 frames/second will throttle the network a *whole* lot worse than the calculation based on 1500 byte packets."

True, but how often does one send out more than 10000 little frames per second?  Usually when you're transferring more than 10000 frames it's because they're part of, for example, a large TCP stream, and those frames are likely the maximum size, i.e. that of the MTU.

Tuesday, August 28, 2007 1:06 PM by steveo

# re: Vista Multimedia Playback and Network Throughput

Mark,

Thanks for an honest discussion of the issue. It's sad that MS can spend billions developing Vista and an obvious magic-number hack (and a buggy one too!) like this is acceptable for production release to the world.

Tuesday, August 28, 2007 1:07 PM by Pete R

# re: Vista Multimedia Playback and Network Throughput

"PS - I have a *phone* running a 200 MHz ARM with a piddly little OS and am able to play streaming MP3's on a broadband wireless network with no hassles. And a dual-core, 2 GHz desktop uses 41% CPU to simply copy files???"

Does your phone transfer files at gigabit network speeds? I presume it does not so it appears you didn't read the article and are making an invalid comparison.

Tuesday, August 28, 2007 1:46 PM by Leo Davidson

# re: Vista Multimedia Playback and Network Throughput

"Why would this ugly hack be required if Vista network implementation did not consume this much CPU."

It seems to be a scheduling/throttling issue, rather than one caused by something using too much CPU. The problem is that fixed magic numbers were chosen, perhaps based on the worst-case hardware, and obviously without thinking about the requirements of gigabit networks.

I don't think Vista is actually using 100% CPU in this situation; the problem is that it's still throttling the network "just in case" of something that won't happen.

With gigabit networks so common now, and for years, it's surprising that this slipped through but it looks like it's being addressed and it's great to have it explained openly, and in detail, so soon after the issue was diagnosed.

Tuesday, August 28, 2007 1:51 PM by Leo Davidson

# re: Vista Multimedia Playback and Network Throughput

Leo Davidson said -

"It seems to be a scheduling/throttling issue, rather than one caused by something using too much CPU. The problem is that fixed magic numbers were chosen, perhaps based on the worst-case hardware, and obviously without thinking about the requirements of gigabit networks."

If I could do network I/O without taxing the CPU I would not land in trouble with scheduling, and neither would I need to do any throttling. The problem is not the magic numbers themselves - the problem lies in the fact that they were needed.

There is no excuse to use more than a tiny amount of CPU to do network I/O even at Gigabit speeds if the OS and networking stack are designed sanely. Read the blog post I linked to.

Tuesday, August 28, 2007 2:26 PM by LinuxGuy

# re: Vista Multimedia Playback and Network Throughput

I'm not so sure that the issue as described here scopes the symptoms broadly enough.  I have a 100Mb network, 3GHz PC and all the latest Realtek HD audio drivers. I generally use iTunes connected to my file server where my audio files are stored, and a number of other players for DVD and movie files both locally and on the server. The PC was previously happily running Windows 2000 and 2003 Server (it's my development box), so I know it's not the hardware.  Under Vista, sound output is dire: sound playback hops, skips and squeaks to the extent that I now do my editing on another PC running Win2k.  I have the same problem on a Sony AR31S which also has HD Audio playback.

Using the same perfmon settings as above, I'm not seeing the same profile.  My network access is very 'spikey': zero to 400kps spikes while iTunes plays MP3.

All these drivers are relatively frequently updated and I've noticed that the performance changes somewhat between versions (up and down), but is never resolved. Even setting iTunes to use its max buffer size (which would hopefully overcome a variable speed network problem) does not help.

I feel that the problem is perhaps deeper in the HD Audio system that Vista introduces.  The early Realtek HD drivers were rubbish and, if I recall, MS pushed an HD Audio fix early on. Perhaps there's more to learn and resolve?  Unfortunately I've not the network diagnostics and audio driver skills to provide better diagnosis.  I cannot identify any perfmon counters to measure HD Audio drivers, but iTunes CPU usage runs around 5-15%.

I'm just hanging out for a proper fix (doh!) and currently play music on a regular CD player.  Bummer.

Tuesday, August 28, 2007 2:34 PM by Stewart

# re: Vista Multimedia Playback and Network Throughput

It's refreshing to hear a valid, lucid explanation of what the issue really is instead of the background "noise" created by the FUD mongers at Slashdot.  Thank you once again for leading the way to understanding core Windows technologies.

Tuesday, August 28, 2007 2:35 PM by FusionGuy

# re: Vista Multimedia Playback and Network Throughput

"It's refreshing to hear a valid, lucid explanation of what the issue really is instead of the background "noise" created by the FUD mongers at Slashdot.  Thank you once again for leading the way to understanding core Windows technologies."

This is a "lucid explanation" of a really BAD DESIGN.

This is not FUD, it's purely bad design... Do you know another OS in the world that needs more than 40% of the processor to handle GigE traffic?

Tuesday, August 28, 2007 2:49 PM by Lucio

# re: Vista Multimedia Playback and Network Throughput

Do you even know how to measure kernel time as a percentage of total processing time in any OS?  I'll give you a hint -- it doesn't show up in a typical CPU utilization chart.

Tuesday, August 28, 2007 3:04 PM by Dupe

# re: Vista Multimedia Playback and Network Throughput

This is not only a Windows problem. You can find the same on many other OSes.

Tuesday, August 28, 2007 4:04 PM by CableGuy

# re: Vista Multimedia Playback and Network Throughput

Lucio, nobody is claiming there isn't a problem here. MS (or at least two MS employees in their blogs) are being quite open about the fact there is a problem that needs fixing.

The FUD that I believe FusionGuy was referring to was people jumping to baseless conclusions that this issue was caused by the evil multimedia DRM in Vista and other such nonsense.

It's great that Mark and Larry (http://blogs.msdn.com/larryosterman/default.aspx) from Microsoft have been able to shed light on this new issue in such detail and I hope it puts an end to the FUD. The issue itself still needs to be solved but it's clear that it's being taken seriously, being worked on by the people that need to fix it, and not being hidden or excused.

Tuesday, August 28, 2007 5:38 PM by Leo Davidson

# re: Vista Multimedia Playback and Network Throughput

Mark - sorry, I know you're one of the most credible living authorities on Windows, but you have a lot to learn about IP networks.

A "standard Ethernet frame" isn't 1500 bytes at all - as has been pointed out, the minimum is just 64 bytes, and a realistic average for a TCP stream would be somewhere between 600 and 900 bytes. If you're using (the increasingly popular) UDP or some other connectionless protocol, the average tends to get smaller.

Tuesday, August 28, 2007 5:59 PM by Mike

# re: Vista Multimedia Playback and Network Throughput

"It's great that Mark and Larry (http://blogs.msdn.com/larryosterman/default.aspx) from Microsoft have been able to shed light on this new issue in such detail and I hope it puts an end to the FUD. The issue itself still needs to be solved but it's clear that it's being taken seriously, being worked on by the people that need to fix it, and not being hidden or excused."

No.  This is where you're downright WRONG.

There should be no "issue" in the first place.  Have you read any of the notes in this thread?  Re-read everything, re-read all of them.  Everyone is saying the same thing: how exactly did a FLAW like this get 1) created, 2) pass QA/testing, 3) shipped, and 4) not noticed after the release?

What needs to happen is that Microsoft needs to fire the FTE (or terminate the contract of the dashtrash individual) who idealised/designed the methodology used, and also needs to terminate whoever was involved in the writing/production of the related code.

Yes, someone needs to get fired over this.  I will repeat myself: FIRED.  Terminated.  No more job.  No more free Starbucks coffee, no more little ball-rubbing managerial meetings with PMs, no more time wasting.  Whoever wrote the code needs to be axed, NOW.  Oh, but I'm sure if they're a FTE, they won't be terminated, because Microsoft never fires FTEs -- they just relocate them into other depts. and let them make the same mistakes over and over.  Or maybe they'll become a PM.  ;)

I'm well aware of Microsoft's business practises -- I mean, everyone who works there has taken Microsoft's SBC 2007, right?  SBC talks about *taking responsibility* and making good decisions, and that's what Microsoft supposedly prides itself on doing.  Was this engineering flaw even remotely a good decision?  No.  Was this engineering methodology discussed with other programmers?  Probably.  And no one chimed in with "Uh, this probably isn't a good idea, Joe..."?  Fire them all.

So who's going to pay?  Whose ass is on the line?  Oh right, I forgot: the consumer's.  Every Vista customer will suffer because one or two half-ass engineers at Microsoft decided "this way is AWESOME".

This entire engineering flaw proves that Microsoft is intentionally hiring programmers who need to "think opposite of UNIX" (and not "think outside the box"), does *absolutely NO form of decent QA* on their products, and simply put, doesn't give a rats ass about a better end-user/consumer experience -- all they care about is more services, more abstraction (WINDOWS MEDIA LIBRARY ENUMERATION DEVICE MANAGEMENT SERVICE!!!! YEAH!!!), and more CRAP.

Do not tell me Vista needs a high-end P4 CPU when you've got _engineering flaws_ like this in your device ABI.  This is atrocious.  Back to what I said originally: someone, or some people, need to get FIRED over this.

Tuesday, August 28, 2007 6:33 PM by I like sausages

# re: Vista Multimedia Playback and Network Throughput

Mark, I commend to your notice the eHome tool VPT - contact John Pennock for details. Using this tool, I generated an ~85Mbps outbound TCP stream (on a 100bT network), which immediately fell to ~42Mbps when a local media file was played. This seems a bit more intrusive than your analysis indicated.

VPT allows you to drive configurable bit-rates, as well, both in TCP and UDP protocols, using your choice of packet lengths (or msg sizes for some tests).

You might find it interesting that playing media on the sending side of this flood test results in a larger performance penalty than playing on the receiving side. (That's what's described above) Also, increasing the message size to ~10 MTU almost completely eliminates the effect, at least on the receiving side.

Hope this tool helps shed some light on the matter, when used by more capable hands than mine.

Tuesday, August 28, 2007 6:51 PM by Dick Martin Shorter

# re: Vista Multimedia Playback and Network Throughput

Sounds like MS still don't understand scheduling... Or just presume that no one watches a DVD while doing other things.

I tend to agree with "I like sausages"... The people responsible for this should be fired, or at least demoted to a level where they don't touch code at all (janitor, marketing, graphical people, I don't care).

Tuesday, August 28, 2007 7:27 PM by Tim B

# re: Vista Multimedia Playback and Network Throughput

"Tests of MMCSS during Vista development showed that, even with thread-priority boosting, heavy network traffic can cause enough long-running DPCs to prevent playback threads from keeping up with their media streaming requirements, resulting in glitching."  I'm going to assume that it wasn't "normal" media streaming, but DRM-protected media.  And probably not music, but video.  However, the biggest question that I have is who designed the IP stack so that it handles packets for the TCP/IP driver.  Both TCP and UDP are designed to throttle back if the CPU can't keep up with the traffic.  The DPC should be moving the packet into a buffer and then returning, not calculating TCP checksums.  Let someone interruptible do that; if the buffer fills up in the meantime, let the retransmit algorithms deal with it.

Tuesday, August 28, 2007 8:57 PM by samwyse

# re: Vista Multimedia Playback and Network Throughput

I wish the limit was lower yet. On my wireless network, transferring data at even moderate speeds (a few megabits per second) can cause glitching in audio playback. And this is with a fast dual-core processor.

Tuesday, August 28, 2007 8:59 PM by Adam Zey

# re: Vista Multimedia Playback and Network Throughput

@Erik:

We had several Dual Processor P4's with Hyperthreading, and we were trying to set up MS ISA 2004 on Windows 2003 SP1.  The servers had multiple NIC's, and we ran network throughput testing software between the servers either side of the ISA servers we were testing.

I think it was the combination of using NIC bonding (aka Gigabit-etherchannel) as well as having multiple processors.

It seemed like the nics were continually being handled by different CPU's with some sort of context switch penalty when this happened.  We could only manage something like 70Mbits through these servers when the theoretical maximum should have been 2 Gigabit.

We ran out of time and so replaced the servers with Linux, which did not exhibit the problem (could exceed at least 1 gigabit throughput).

However afterwards I read somewhere that there is a setting somehow to enforce each nic to a particular cpu only, which may give better performance.

http://support.microsoft.com/kb/252867 explains  how to set the processor affinity to a single nic, as per the ISA recommendation in KB293640.  

My point to Zachary is that it's not about optimizing code specifically for Hyperthreading/SMP, but more that since it's very common now to have HT or SMP that something would need to be done to counteract this symptom.  Perhaps the TCP/IP stack was fixed past Windows 2003 SP1 or in Vista so this is no longer applicable...

Wednesday, August 29, 2007 1:05 AM by Chris

# re: Vista Multimedia Playback and Network Throughput

The whole trick is to enforce unfair rules in a way the user expects.  Prioritizing audio/video playback over network traffic is very reasonable indeed!  The throttling leaves somewhat to be desired though...

DPCs in the TCP/IP stack being long-running and non-preemptible (except by other interrupts) seems a little problematic.  Once the critical sections of the interrupt handlers are finished, is there any reason not to queue up the remaining work so it can be scheduled fairly against other processes?

The current approach essentially results in priority inversion on behalf of the processes that are doing all of the heavy I/O because the time spent in the kernel doing interrupt processing is not charged against them.  When that happens scheduler priority classes become largely irrelevant and the rest of the system starves.  It might also undermine rate management algorithms depending on how much buffering is going on.

Is it possible to prevent this case through more accurate bookkeeping that takes into account the amount of time spent performing kernel services "on behalf of" a user-space process?  If the consuming process can be identified early enough in the processing chain, then subsequent processing could be deferred unless the process has enough available CPU cycles in its budget for it to proceed.  Thus a process that performed very heavy network I/O could only starve other processes if it resided in a sufficiently elevated priority class.

Essentially, that would make "realtime" priority classes meaningful in the face of DPCs.  But it sounds like a whole lot of work.
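
To make the bookkeeping idea concrete, here is a toy Python sketch. Everything in it is hypothetical (the class name, the microsecond budgets, the per-packet cost model), and it says nothing about how any real kernel accounts for deferred work. It only shows the shape of the proposal: charge deferred processing to the consuming process, and hold it back once that process has used up its budget.

```python
from collections import deque


class BudgetedDeferral:
    """Toy model: charge deferred kernel work to the process it serves.

    All names and units here are illustrative assumptions, not a real
    kernel interface.
    """

    def __init__(self, budgets_us):
        # Remaining CPU budget (microseconds) per process for this period.
        self.budget = dict(budgets_us)
        # Work held back because its owner had no budget left.
        self.deferred = {pid: deque() for pid in budgets_us}

    def on_packet(self, pid, cost_us):
        """Process a packet now if its owner can pay; otherwise queue it.

        Returns True if the packet was processed immediately.
        """
        queue = self.deferred[pid]
        if not queue and self.budget[pid] >= cost_us:
            self.budget[pid] -= cost_us
            return True
        queue.append(cost_us)  # keep arrival order for this process
        return False

    def replenish(self, pid, amount_us):
        """Top up a process's budget and drain whatever it can now afford.

        Returns the number of deferred packets processed.
        """
        self.budget[pid] += amount_us
        queue = self.deferred[pid]
        processed = 0
        while queue and self.budget[pid] >= queue[0]:
            self.budget[pid] -= queue.popleft()
            processed += 1
        return processed


acct = BudgetedDeferral({"file_copy": 100})
print(acct.on_packet("file_copy", 60))   # True: 40 us of budget left
print(acct.on_packet("file_copy", 60))   # False: deferred, budget too low
print(acct.replenish("file_copy", 50))   # 1: the deferred packet now runs
```

In this model a process doing heavy network I/O only delays its own packets once its budget runs out, rather than starving everything else.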

So...

What happens when using multi-core CPUs?  Are ordinary processes allowed to run concurrently with DPCs?  If so, can this property be used as part of the algorithm to tune throttling?

Wednesday, August 29, 2007 1:35 AM by Jeff Brown

# re: Vista Multimedia Playback and Network Throughput

Surely setting multimedia playback to the highest possible priority means that it takes all the CPU time it needs and the rest can be used for other processes (i.e., handling network traffic). What's the point of having prioritized processes otherwise? It seems like there's some kind of fundamental flaw in Vista process scheduling. God help us all if anybody tries to use Vista for important stuff like medical equipment monitoring.

Wednesday, August 29, 2007 2:30 AM by tck

# re: Vista Multimedia Playback and Network Throughput

At such a low speed this might be due to something else.

Wednesday, August 29, 2007 3:59 AM by James

# re: Vista Multimedia Playback and Network Throughput

Obviously, if the network was not demanding so much CPU, you could give more to the audio subsystem. But Vista would probably not be Vista if it did.

Wednesday, August 29, 2007 4:02 AM by ccj

# re: Vista Multimedia Playback and Network Throughput

I very much disagree with "I like sausages".

Someone tells a story (and I can't remember who) about a programmer who makes a mistake which costs his employer a huge sum of money (millions, maybe even 10s of millions). He gets called into see the boss, fully expecting to be fired, and gets a thorough roasting. At the end of the meeting he asks the boss about handing in his security badge, but the boss replies "I just spent x-million on your education. Why would I want to fire you?"

Wednesday, August 29, 2007 4:18 AM by Tom M

# If you think it's trivial, you don't understand the problem

"Heads must roll?".  Before you pass out from over-ranting, go learn about the difference between a real-time OS and one that schedules threads of various priority.

Oversimplifying, real-time means being able to GUARANTEE a RT thread will receive cycles either periodically and/or for a specified period.  A trivial example is the engine computer in your new car -- it must be able to schedule the proper sparkplug to fire within a few microseconds at high rpm.

OSes designed for real-time don't make for very good interactive, graphical user experiences, since it's not important, for example, for all threads, particularly the majority that do things like draw pixels, to run at a precisely specified time.  For this use, you want an OS that can schedule threads using variable priority, while avoiding priority inversion and such, to get a responsive interface.

The basic desktop PC OSes (MS, *nix) use the latter kernel scheduling design because you want a nice user experience.  As you can probably deduce, there are a few types of functions running on the desktop OS that require GUARANTEED cycles at specific times, and if they don't get their "tick" EVERY time, you hear a pop or click or video streak or ....

So how do you implement such a hybrid OS?  The old way (before Vista) was to schedule and run the RT-critical stuff in the Interrupt/DPC arena, leaving the scheduler to handle everything else as interruptible threads.  The problem is that as you add more and more time-critical processes that need guaranteed cycles, you start writing a mini-scheduler that controls the stuff in the IRP/DPC world, and it's really difficult to sync and interact this with the other world of scheduled threads.

Vista, using MMCSS and a new dynamic thread priority/importance mechanism, is attempting to move this Real-Time stuff back under a generic thread-scheduling design. One could argue that all of the DPC stuff in Vista should be converted to scheduled RT threads, and that would prevent anything except uncontrolled interrupt handlers from delaying RT functionality.

In RTM Vista, it looks like the network hardware stuff is still running in DPCs that are capable of stalling the RealTime stuff.  The obvious question is why are the DPCs doing so much?  If you think it's easy to do things like avoid deadlocks when you have DPCs and RT threads sharing data, then it won't be obvious to you why a network DPC may have to do more than minimal operation on a chunk of data.  

Blaming "some idiot FTE" for these design problems isn't pointing the gun in the right direction.  Rather, the problem is how to dynamically limit the network DPC activity based on how likely the RT threads are to miss their scheduled time.  You obviously don't want to wait until you hear a pop/click before throttling back, so you must design an anticipatory algorithm to prevent interference by throttling the rate of interrupts that run DPCs. (This is extremely hard to do reliably, across all performance levels of CPUs and network hardware, making the "hack" look more appealing.)

My guess is that future Vista scheduling mechanisms will move most or all of the remaining DPC-level code into real-time threads like the multimedia is now, but until that happens, the next best thing will be to come up with a more effective algorithm to dynamically limit the DPC activity based on how likely the RT threads are to miss their scheduled time.

Read Larry Osterman's blog quoted a few comments above about how isochronous streams require RT scheduling, and realize that major redesign of the kernel is not gonna happen in a SP, leaving the likely near-term solution to be limited dynamic network throttling based on a better anticipatory algorithm for the RT stuff.  (I'm glad I'm not the one who has to come up with the solution -- I've been there and done similar, and it's a brain-twister.) However, some time in the future, the beauty of converting DPC tasks to RT threads is that threads scale very nicely to dual/quad/heptapenta core processors.  

And yes, I realize that Linux doesn't have this problem -- their equivalent of kernel DPCs is handled in the scheduler (which is the main reason it was so tough to get it right).

(I don't work for MS -- in fact, I often don't agree with some of the design approaches they have taken, but I hate to see great software engineers flamed just because understanding the problem is beyond the grasp of some.)

As us old kernel farts used to say, flames to /dev/null  (google it if you don't grok)

Wednesday, August 29, 2007 4:49 AM by Jerry Schneider

# re: Vista Multimedia Playback and Network Throughput

I am perplexed by the claim that Internet traffic is not affected by this, and even more perplexed by its reasoning. Is the writer perhaps unaware that people with beefy Internet connections get 9 gigabits per second of bandwidth in long distance benchmarks, and still much more than 1 gigabit in real world applications?

Wednesday, August 29, 2007 6:18 AM by Erno

# re: Vista Multimedia Playback and Network Throughput

Jerry Schneider - Isn't DPCs consuming a large amount of CPU the main short-term problem for MSFT to resolve? If the DPCs operated efficiently (i.e. they did not consume more than a few percent CPU), would the scheduler not have more time to give to the RT audio task, and things be fine? Why would one give up on optimizing the DPCs? With modern NICs offering offloads, interrupt coalescing and other such sophisticated features, DPCs have no business using so much CPU.

Can anyone explain why optimizing DPCs would not work?

Wednesday, August 29, 2007 6:41 AM by LinuxGuy

# re: Vista Multimedia Playback and Network Throughput

The more I read about this issue the stranger it becomes. There are so many problems it's even hard to know where to start.

1. Why does the audio system require the following:

"For instance, in Vista, the audio engine runs with a periodicity of 10 milliseconds. That means that every 10 milliseconds, it MUST wake up and process the next set of audio samples, or the user will hear a "pop" or “stutter” in their audio playback. It doesn’t matter how fast your processor is, or how many CPU cores it has, the engine MUST wake up every 10 milliseconds, or you get a “glitch”."

Audio is not a hard real-time system; it should be soft real-time. You process enough ahead of time and send it to the audio card's buffer so that if you miss a period by, say, +/-5 ms, it does not cause a problem. Unless the OS doesn't trust the audio card.

The part about "it doesn't matter how many CPUs you have" is also quite wrong. If I have more CPUs, then there's a better chance that one of them is free, or lightly loaded enough to process the audio every 10ms.

"So it doesn’t matter how much horsepower your machine has, it’s about how many interrupts have to be processed."

But won't a machine with more horsepower handle interrupts faster? And this gets us into problem 2.

2. "Network DPC receive processing is among the most expensive, because it includes handing packets to the TCP/IP driver, which can result in lengthy computation. The TCP/IP driver verifies each packet, determines the packet’s protocol, updates the connection state, finds the receiving application, and copies the received data into the application’s buffers."

Why does it need to verify each packet? One would think that can be done on the network card. Determining the packet protocol, updating the connection state and finding the receiving application should be quite fast on today's computers, and I do hope that it is not copying the whole data but just updating pointers (I assume the data has already been copied from the network driver, since it does a checksum on it). And check this link out (from 2003, no less):

http://www.microsoft.com/whdc/device/network/NetAdapters-Drvs.mspx

"TCP and IP Checksum Offload: For most common network traffic, offloading checksum calculation to the network adapter hardware offers a significant performance advantage by reducing the number of CPU cycles required per byte. Checksum calculation is the most expensive function in the networking stack"

"In the Windows Performance Lab, we have measured TCP throughput improvements of 19% when checksum was offloaded during network-intensive workloads."

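For readers wondering what the quoted "checksum calculation" actually involves, here is a minimal sketch of the standard Internet checksum (the 16-bit one's-complement sum used in IP, TCP and UDP headers). It is illustrative Python only, not the Windows TCP/IP driver's code, and the function name `internet_checksum` is made up for this example. The point is that the loop touches every byte, which is why the cost is usually quoted per byte and why offloading it to the adapter can help.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        # Add each big-endian 16-bit word to the running sum.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


# A sample 20-byte IPv4 header with its checksum field set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
print(hex(internet_checksum(header)))  # 0xb861

# A receiver verifies by summing the header with the checksum filled in;
# a correct header yields 0.
filled = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
print(internet_checksum(filled))  # 0
```
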
And then there's 3, which is quite funny in itself: hard-coding a throttling value in code. This is from the same 2003 MS article (different application, but same principle):

"One reason to load the table from the registry is to avoid hard-coded values in the driver. This also allows for experimenting with the table entries to find the optimal values for several representative workloads."

Wednesday, August 29, 2007 9:56 AM by Bogdan Solomon

# re: Vista Multimedia Playback and Network Throughput

Yikes!!!

Why is it that whenever someone from Microsoft gives some decent technical information on a recent problem, especially ones that haven't yet been fixed, everyone seems to pile in with a "MS-have-catfood-for-brains" mentality?

I have some ideas about these demographic groups, but no substantive evidence. However, I think the groups are:

1) Generally ignorant people

2) *nix advocates

3) People who maybe didn't pass an MS interview, and don't feel it fair... (and if you're in that category and have a fix, now's the time to let the channels know...)

Of course, if you really hate a particular OS behaviour, you can always switch to an alternative, or roll your own. (And if you choose the last option, then scheduling network and MM code will be the least of your worries!!)

Wednesday, August 29, 2007 1:43 PM by Paul G.

# re: Vista Multimedia Playback and Network Throughput

Bogdan,

Regarding your 1st point:

As stated in Larry's blog and his further comments below it, MS are already getting stick for the latency of the sound system in Vista. Adding *more* latency like you suggest, so that the audio thread doesn't have to wake up as often, is not an option. Larry mentions voice communication as being one thing which is particularly sensitive to latency. (Playing musical instruments through your computer is another example.)

(I don't know enough about the other two points to comment.)

Not directed at Bogdan:

A lot of people are still talking about this issue here and elsewhere as if it were one of CPU usage. It isn't. It's not about not being able to do enough on the CPU; it's about being able to schedule things often enough that they don't glitch. The audio thread wakes up once every 10ms and does a minuscule amount of work. It needs very little CPU to work but if it misses that 10ms wake-up call then everything falls apart.

Larry also states that it is very easy to make XP's audio system glitch under load.

Wednesday, August 29, 2007 1:48 PM by Leo Davidson

# DPCs are from Mars, the Scheduler is from Venus

Ignoring multi-core issues for the moment, realize that DPCs are implemented with a queue -- new DPCs are entered either at the front (High Importance DPCs) or at the rear (regular DPCs) by interrupt handlers. The interrupt handler is mostly concerned with getting the DPC queued and then re-enabling the hardware interrupts. As a DPC completes its processing, the next DPC (if any) is removed from the queue and processed. The important take-away is that NO scheduled threads can run while pending DPCs are present.
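The queueing rule described above can be sketched as a toy model. This is a simplified, single-CPU illustration in Python, not kernel code; the names `DpcQueue`, `insert`, `drain` and `run_cpu` are hypothetical and chosen only for this example.

```python
from collections import deque


class DpcQueue:
    """Toy single-CPU DPC queue: high importance goes to the front."""

    def __init__(self):
        self._queue = deque()

    def insert(self, name, high_importance=False):
        if high_importance:
            self._queue.appendleft(name)  # front of the queue
        else:
            self._queue.append(name)      # rear of the queue

    def drain(self):
        """Run every pending DPC in queue order and return that order."""
        executed = []
        while self._queue:
            executed.append(self._queue.popleft())
        return executed


def run_cpu(dpc_queue, ready_thread):
    """All pending DPCs run before any scheduled thread gets the CPU."""
    timeline = dpc_queue.drain()
    timeline.append(ready_thread)
    return timeline


q = DpcQueue()
q.insert("nic_rx_1")
q.insert("nic_rx_2")
q.insert("timer", high_importance=True)

timeline = run_cpu(q, "audio_thread")
print(timeline)  # ['timer', 'nic_rx_1', 'nic_rx_2', 'audio_thread']
```

However high the audio thread's priority, it lands at the end of the timeline, because thread priority only orders threads against other threads.
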

If the audio RT threads MUST run every 10ms, then it only takes some DPCs queued just before or during the RT thread's execution to delay the final mixing and output enough to hear a glitch.

So why not give the Audio a larger output buffer so DPC delays don't matter?  Remember that you must fill the buffer before starting the audio to play, and the larger the buffer, the longer the "delay" until the sound is heard from the speakers.  A good musician can detect audio delays in the order of 4ms or so, and most people will notice 20ms delay, particularly in cases like watching video showing the source of an impulse sound such as a drum.  So, keeping a 15ms buffer from under-run requires 10ms audio thread scheduling AND keeping DPCs from delaying that thread by more than say 4ms.  (Actually, multiple threads must run every 10ms because audio data is "pulled" from upstream sources, and these sources must execute within the "pull" to generate new audio data.)  I hope this gives a little insight into how difficult it is to keep audio from glitching.
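To put numbers on the buffer trade-off, here is a rough sketch in Python. It assumes a deliberately simple model that is not taken from the Vista audio engine: the buffer is full at time 0, drains at one millisecond of audio per millisecond, and is topped up to full each time the audio thread actually wakes. The function name `find_underruns` and the sample delays are invented for illustration.

```python
def find_underruns(buffer_ms, period_ms, wake_delays_ms):
    """Return the 1-based periods in which the buffer runs dry.

    wake_delays_ms[k-1] is how late the k-th wake-up was, for example
    because pending DPCs held off the audio thread.
    """
    underruns = []
    last_refill = 0.0  # buffer assumed full at t = 0
    for k, delay in enumerate(wake_delays_ms, start=1):
        wake = k * period_ms + delay
        # More time than the buffer holds has passed since the last refill.
        if wake - last_refill > buffer_ms:
            underruns.append(k)
        last_refill = wake
    return underruns


delays = [0, 4, 0, 6, 0]  # milliseconds of lateness per wake-up

# 15 ms buffer with a 10 ms period: 5 ms of slack, so the 6 ms delay glitches.
print(find_underruns(15, 10, delays))  # [4]

# A 30 ms buffer survives the same delays, but the listener hears the
# sound up to 30 ms late.
print(find_underruns(30, 10, delays))  # []
```

The slack is simply the buffer length minus the period, which is why a 4ms cap on DPC delays leaves only a small margin with a 15ms buffer.
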

So that leaves "optimizing" the DPC stuff to prevent it from delaying the RT audio. (I don't know the network stack like I do audio, so I'll only attempt to discuss the issue.) For a high-end datapoint, a 1gbit LAN interface with a bunch of minimum-sized packets (say 64 bytes each) could queue a DPC every microsecond or less (a 64-byte frame takes roughly half a microsecond on the wire), assuming one DPC per packet.
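A back-of-the-envelope check of the per-packet interval at line rate, as a small Python sketch. It assumes one DPC per received frame, which is a simplification (adapters with interrupt coalescing batch several frames per interrupt). The function name `frame_interval_us` is made up here; the 20 bytes of overhead are the standard Ethernet preamble (8 bytes) plus inter-frame gap (12 bytes).

```python
def frame_interval_us(link_bits_per_s, frame_bytes, overhead_bytes=0):
    """Time one frame occupies the wire, in microseconds."""
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return bits_on_wire / link_bits_per_s * 1e6


GIGABIT = 1_000_000_000

payload_only = frame_interval_us(GIGABIT, 64)
with_overhead = frame_interval_us(GIGABIT, 64, overhead_bytes=20)

print(f"{payload_only:.3f} us per 64-byte frame")      # 0.512 us
print(f"{with_overhead:.3f} us including wire overhead")  # 0.672 us

# Frames that can arrive within one 10 ms audio period at line rate.
frames_per_period = int(10_000 / with_overhead)
print(frames_per_period)
```

Either way, thousands of minimum-sized frames can arrive inside a single 10ms audio period, so the per-frame cost of receive processing adds up quickly.
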

Because DPCs "preemptively interrupt" the executing thread, sharing any type of data between the DPC and a thread is complicated.  Suppose the DPC needs to look at something like the protocol handler table, and suppose that network threads must frequently "update" this table.  What if a DPC interrupts the thread doing a table update operation?  The DPC can't look at the table because the data may be inconsistent, but the DPC can't go to sleep to let the thread finish updating the table. So, as threads update the table, they must be able to acquire write ownership on it and this might require disabling interrupts during the update.  But what if that thread is swapped with a higher priority thread while IRPTs are disabled?  It's a really complicated problem to solve correctly.

In practice, this serialized-data-access problem means that a DPC must share as little data as possible with the associated threads.  But, for example, if the DPC can't even determine which protocol queue to drop the packet into, networking breaks down.  I suspect that the current DPC-based network handling is forced to maintain DPC-specific state and thus have to process more of the packet in the DPC to avoid the access serialization problems.  Maybe things like firewall packet filtering must be done in th