A few weeks ago we announced the beta release of Dryad, a big data capability that runs on clusters based on Windows HPC Server 2008 R2 and Microsoft HPC Pack 2008 R2 with Service Pack 2.
There's now a white paper by David Chappell that explains the technology, newly renamed LINQ to HPC, in some detail:
Introducing LINQ to HPC: Processing Big Data on Windows
This paper provides an overview of the big data capability provided by LINQ to HPC. High performance computing jobs today are most often CPU-bound, and so the goal of running on a cluster is to let them use many CPUs at once. Big data jobs, by contrast, are typically I/O bound—they’re limited by how fast they can read and write data. Rather than relying primarily on processing power, these jobs most often do relatively simple computations on massive amounts of information. Running big data jobs on a cluster still makes sense, but the goal isn’t to use lots of CPUs at once—it’s to use lots of disks at once. When a single application can read from many disk spindles simultaneously, processing that data gets much faster. LINQ to HPC provides a platform for developers to build applications that can process large amounts of unstructured data. LINQ to HPC is part of Windows HPC Server 2008 R2 SP2.
It's definitely worth a read if you're considering evaluating LINQ to HPC. Once you've read the paper and want to see it in action, the beta download includes dozens of example LINQ to HPC queries.
You can also watch Saptak Sen's TechEd 2011 talk, Running "Big Data" Applications on a Windows HPC Server Cluster, on Channel 9.