December, 2006

All Posts
  • Port25

    What Lies Beneath: Setting up underlying HPC tools

    • 0 Comments

    by kishi on December 21, 2006 07:34pm


    This blog continues what I started writing about in Thinking About HPC Infrastructure and what Frank wrote about in Overloading Clusters.

    After reading through the previous blogs on HPC, someone might ask, “What are some of the core components of HPC?” After all, once you’ve seen the outside of a Maserati or a De Tomaso Pantera, you’re not going to be satisfied just by ogling it. Even after a test drive, the engineer in you will want to pop the hood and see what’s inside. Taking a similar approach, let’s uncover some underlying HPC technologies by looking at a basic HPC setup. Once all the provisioning has been completed, the HPC system will be physically deployed with an OS and the relevant drivers, utilities, etc. Yet before the actual HPC application can be installed across the cluster, there remains a critical step in the process: configuring the cluster and file system along with any tools and interfaces such as MPI (Message Passing Interface). After peeling back the HPC application layer, it’s worthwhile to do a “deep-dive” into what really runs HPC clusters. The broad categories of these tools are:

      • Cluster Management tools e.g. CSM
      • Job Scheduling tools e.g. SCALI, Maui
      • Resource Management tools e.g. Torque


    If you’re trying to understand the “WHY” behind the existence of these tools and their importance, take Cluster Management as an example. Cluster configuration, installation and management can be difficult and require intimate familiarity with the HPC hardware, OS, underlying architecture, etc. Without specific tools that attend to and manage specific underlying HPC sub-components, HPC just wouldn’t be what it is. So it is worthwhile to look at the installation experience of tools such as the ones listed above to understand the complexity of HPC systems. Ready? Let’s dive into the installation and function of these tools:

    1. SCALI: The SCALI management and MPI software packages provide deployment, monitoring and job scheduling services for a cluster. After you deploy this software, you will be able to see all the compute nodes that have been preconfigured or are configured on your system. SCALI will enable you to monitor the systems and run jobs using the SCALI graphical interface. In order to license the SCALI software, you must use the scainstall command to produce a license request file. This file can then be sent to SCALI to receive a permanent key. For those who need some hand-holding through this, SCALI luckily provides very comprehensive documentation on their website. A large portion of the SCALI Manage User’s Guide is dedicated to pre-setup planning and configuration of the cluster and the network. The documentation provides detailed recommendations about how you can set up your Ethernet-based network environment and out-of-band management network. It also provides a general overview of how to install and configure higher-performance interconnects, including bonded Ethernet, Infiniband, Myrinet and SCI. The SCALI Manage interface provides simple tools to assist in configuring and testing DET, Infiniband, and Myrinet devices for use with the SCALI MPI implementation. The SCALI MPI software supports multiple Infiniband stacks including Mellanox, Topspin, Voltaire and Infinicon.

    2. HP-MPI: HP-MPI is Hewlett-Packard’s Linux-based implementation of the Message Passing Interface (MPI). Many of the utilities distributed with HP-MPI are similar to those of other common MPI implementations such as MPICH, e.g. mpicc, mpirun, etc. In order to use the HP-MPI software, a license is required for each CPU core in the cluster. To obtain a license file you must collect the MAC address from each node (typically that of eth0) and enter that information into a form at licensing.hp.com. The resulting file can then be copied to the compute node. The HP-MPI software is non-functional until licensing files are generated for the nodes.
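
    Collecting the MAC address from every node is the tedious part of that licensing step. As a rough sketch only, the Python below reads the address that Linux exposes under /sys/class/net for a given interface. The function name read_mac and the sysfs_root parameter are my own inventions for illustration (the parameter exists so the function can be pointed at a test directory); nothing here is part of HP-MPI, and the licensing form itself still has to be filled in by hand.

```python
import os


def read_mac(interface: str = "eth0", sysfs_root: str = "/sys/class/net") -> str:
    """Return the MAC address Linux exposes for an interface, lower-cased.

    read_mac and sysfs_root are hypothetical names used for this sketch.
    """
    path = os.path.join(sysfs_root, interface, "address")
    with open(path, encoding="ascii") as handle:
        return handle.read().strip().lower()


if __name__ == "__main__":
    try:
        print(read_mac("eth0"))
    except OSError as exc:
        # The node may name its first interface something other than eth0.
        print(f"could not read MAC address: {exc}")
```

    Run on each compute node (for example over ssh), this gives one line per node that can be pasted into the license request.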

    3. CSM (Cluster Systems Management): The CSM software suite is designed to automate the deployment and management of cluster nodes. Nodes can be remotely installed with an operating system as well as the CSM software for later monitoring. The CSM software supports RedHat and Novell on multiple platforms. In order to obtain and install the CSM software, one must register with IBM’s website and download the required RPMs. Once configured, CSM can remotely install the operating system and/or the CSM software on the compute nodes. Much like Platform ROCKS, CSM makes use of PXE functionality and RedHat’s kickstart or the autoyast software to remotely install the operating system. The CSM software provides multiple methods for defining the nodes that should be deployed and managed:

    a. The first method involves creating a hostname mapping (hostmap) file, which is a colon-delimited file that defines a number of attributes of each node.
    b. The second method also involves manually creating and editing a file, in this case a “node definition” (nodedef) file. This is the method suggested by the documentation for use with small clusters.
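
    To make the idea of a colon-delimited node file a little more concrete, here is a minimal Python sketch of parsing one. The column names (hostname, power method, console method, install method) and the sample lines are hypothetical and chosen only for illustration; they are not CSM’s actual hostmap layout, so check the CSM documentation for the real attribute order.

```python
from typing import Dict, List

# Hypothetical column order; a real CSM hostmap file defines its own attributes.
FIELDS = ["hostname", "power_method", "console_method", "install_method"]


def parse_hostmap(text: str) -> List[Dict[str, str]]:
    """Parse colon-delimited node lines into dictionaries, skipping blanks and comments."""
    nodes = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        values = line.split(":")
        if len(values) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}: {line!r}")
        nodes.append(dict(zip(FIELDS, values)))
    return nodes


# Illustrative values only, not taken from any real cluster.
SAMPLE = """\
# hostname:power:console:install
node01:ipmi:serial:kickstart
node02:ipmi:serial:kickstart
"""

if __name__ == "__main__":
    for node in parse_hostmap(SAMPLE):
        print(node["hostname"], node["install_method"])
```

    Rejecting lines with the wrong number of fields up front is a deliberate choice: a typo in a node file is much cheaper to catch before a remote install starts than after.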

    Proper remote power and remote console capabilities greatly ease the administration and deployment of the compute nodes; however, according to the CSM FAQ, remote power management is not absolutely required. All the compute nodes must be rebooted (remotely or manually). They are then PXE booted and installed with RHEL4 using the kickstart installation system.

    4. Maui and Torque: Both Torque and Maui are free software which must be compiled from the source distribution on the head node. Maui is an open-source job scheduler for compute clusters. It supports a number of task management features not found in other parallel batch processing software, including policy-based scheduling and prioritization of tasks. Torque is an open-source resource manager for managing compute nodes and scheduled jobs. It can integrate with Maui to provide additional features for scheduling and managing scheduled tasks. Torque can be installed by following the guidance in the Torque 2.0 Admin Manual.

    5. Platform Rocks: Platform Rocks is cluster deployment software that facilitates the deployment of various software stacks (“rolls”) onto the compute nodes. The software is capable of deploying the base operating system and the utilities required for cluster administration, management and scheduling. The software can also manage configuration and updates to ensure consistency throughout the cluster. Platform Rocks is a suite of utilities that are packaged together as separate installable rolls. One of the main goals of the software is to allow for easy installation and integration of third-party rolls and applications. One unique aspect of the Platform Rocks installation approach is that the software installs an operating system on the head node and installs all the required rolls at the same time. The software can also automatically set up the subsystem required to install an operating system and other packages (such as management agents) on the compute nodes.

    That about does it for a quick “deep-dive”. Let me insert a gentle reminder that these are not the only cluster or resource management technologies out there in the HPC space, but rather the most prevalent ones. If you have worked with additional tools, we’d like to hear from you. Thank you for tuning in to Port 25. HAPPY HOLIDAYS!

  • Port25

    Robotics Redux: Demo My Robot

    • 0 Comments

    by jcannon on December 21, 2006 12:26pm

     

    If you missed our post yesterday, we started a fascinating conversation with the Robotics team about the impetus and design goals of the new Robotics Studio, from distributed intelligence & network-based agents to why web browsers can provide strong interfaces to robotics control. In the second part of this conversation, we get a sense of what early work is possible with demos being run in the Robotics lab.


    You'll note the reference to some of the work in the Channel9 sandbox to provide open code sharing - a simulation sample tutorial is here. Feel free to dig around.

     

     

    Attachment: roboticsp2.mp3
  • Port25

    Paul Thurrott talks Open Source with Sam

    • 0 Comments

    by jcannon on December 21, 2006 11:41am


    Just a quick note & pointer to Paul Thurrott's interview with Sam on Open Source, the Lab and why these initiatives are so important to Microsoft and to our customers. The interview was conducted back in October, but the podcast just went live at Windows IT Pro.

  • Port25

    Something wonderful has happened... Number Five is alive!

    • 0 Comments

    by jcannon on December 20, 2006 11:03am


    Last week, Microsoft released the first version of Robotics Studio, an SDK that contains three sets of tools; first, a common runtime architecture that can be used across robot devices; second, a set of programming tools that harness the power of Visual Studio, and a physics engine, to allow programmers to build & test their robots in simulated 3D space; and finally, a set of tutorials and sample code to get started. In this video (there are two - we'll post the second tomorrow), we learn a little bit about the design goals & architecture behind Robotics Studio...tomorrow we have some very cool demo time with the team.

    Why is robotics such an interesting area to watch? According to the latest statistics from the International Federation of Robotics, about 1 million personal robots will be sold around the world this year. And by 2025, the Japanese Robot Association predicts the personal robot industry will be worth more than $50 billion a year worldwide.


    Additional Links:
    - Robotics Studio Home Page
    - Download Robotics Studio (Free for Non-Commercial Use)
    - Channel9 Interview

    Attachment: roboticsp1.mp3

  • Port25

    Crash Data Collection and Analysis

    • 0 Comments

     

    by anandeep on December 19, 2006 12:35pm

    Archana Ganapathi is a Computer Science graduate student at the University of California, Berkeley.

    Archana has been working in the area of Empirical Computer Science (which relies on real data rather than theory or simulation) and some of her research is on computer crashes. She worked on collecting data on Windows crashes and is in general interested in the idea of using real data to advance Computer Science.

    Her home page is http://www.cs.berkeley.edu/~archanag/ . Her paper on Windows crashes is “Crash Data Collection: A Windows Case Study” and another interesting paper she has written is “Why do Internet services fail, and what can be done about it?”

    The Open Data Repository link referenced in the video is http://institutes.lanl.gov/data/. This is a temporary link with a public data set, as there is currently no official link for the repository that will eventually be hosted by USENIX. We'll be sure to pass along the official link as soon as it is available.

    -Anandeep

     

     

    Attachment: archana.mp3
  • Port25

    Languages Have Become Too Easy...

    • 0 Comments

    by hjanssen on December 18, 2006 02:40pm


    I have finally found a way to write more blogs!!! When I am in the office I have so much work that I rarely get enough time to sit down and concentrate on a blog. When I get home (My wife tells me normally later than she wants me to) I do not always have the desire to write a blog.    But I am flying for work this week and I am finding all kinds of time!

    For me, the line that epitomizes the fact that I must have turned into my parents is “When I was Young”. Yet I am finding myself starting this blog with exactly that.

    First, let me describe the catalyst for this blog:

    A few months ago I attended OSCON 2006. One of the sessions I went to was called ‘PHP Security Hoedown’, given by Ed Finkler (http://conferences.oreillynet.com/cs/os2006/view/e_sess/9527).

    Basically, this session was about PHP security. It was a response to security problems people have been finding with PHP, specifically with the installation and running of PHP.

    He stated that a large part of the security problems that PHP seems to be suffering from can be summed up like this (I have taken some liberty in paraphrasing some of the things that were said, but check the above link for his original presentation):

    PHP has a fairly shallow learning curve. Because the learning curve is shallow, the people using it cover a wide range of skill sets. Basically almost anybody can get started in PHP and get something running pretty quickly.

    Only a small percentage of them are top-level people who could be considered ‘experts’ in the language.

    So, now we are getting to the part that I warned about. ‘When I was Young’.

    Many moons ago, now more than I am willing to legally admit to, I started my career with Philips/AT&T, who at the time had a joint venture developing very complex digital telephone switches: the 5ESS line. This was a very sophisticated telephone system that was almost completely written in C.

    When I started my programming career with AT&T (now over 20 years ago) you had to go through a lengthy process of learning the language C. Carrier grade software was and still is of a very complex nature. As anyone who has ever written in C knows, it is a very powerful language that provides you with a very large gun to shoot yourself in almost every body part if you are not careful. So we were trained very well before we were let loose writing switching code. There was another requirement if you wanted to make the jump into C++ (mind you, this was when there was no C++ compiler yet, only CFront, which was a pre-compiler/parser): you were not allowed to write in C++ unless you had been programming in C for at least 3 years consistently.

    There really were not as many higher level languages as there are today.

    For the last few years I have seen more and more computer languages born, and in some cases die. They all try to fix what their authors thought was missing in the languages that came before them. Another trend has been to make languages more accessible and easier to use for people from all walks of life who want to program. Imagine that! A language that does not require a 4 year degree to work in!

    Some of these languages, for example PHP and Ruby (the trend is certainly not limited to these languages, I might add!), allow people with a limited computing background to make fairly decent programs in a small amount of time.

    But this is where some of the security issues are showing up. The languages are becoming easier to use. But a lot of the operating systems they run on really have not become easier. So, many of these programs are now used without the realization on the part of the installer or programmer what the effect and impact of running their programs are on the operating systems. This seems to be a problem on both Linux and Windows platforms.

    Although I applaud making programming languages easier for the more casual user, I do see that we are forgetting in many cases to make the environments these programs need to run in safer and easier as well.

    I have seen so many times programs that write their files in ‘interesting’ and unsecured places. The presence of multiple libraries that might or might not support the application (heck, I am not sure what makes the thing run, so I will just copy all kinds of libraries in an attempt to make the application work).

    File permissions that are set incorrectly, readable by the world. Incorrect owners etc.

    And these are just some of the issues that seem to be present. And unfortunately a lot of these problems are easily fixed.
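
    The permission problems mentioned above are at least easy to detect. As a rough illustration (my own sketch, assuming a POSIX system; the function name find_loose_permissions is made up for this example), a few lines of Python can walk a directory and flag files that any user on the machine can read or write:

```python
import os
import stat
import sys
from typing import Iterator, Tuple


def find_loose_permissions(root: str) -> Iterator[Tuple[str, str]]:
    """Yield (path, problem) for regular files under root that any user can read or write."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # the file vanished or cannot be inspected; skip it
            if not stat.S_ISREG(mode):
                continue  # ignore symlinks, sockets, devices, etc.
            if mode & stat.S_IWOTH:
                yield path, "world-writable"
            elif mode & stat.S_IROTH:
                yield path, "world-readable"


if __name__ == "__main__":
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, problem in find_loose_permissions(start):
        print(f"{problem}: {path}")
```

    World-readable is not always wrong (static web content has to be readable by someone), so treat the output as a list of things to look at rather than a verdict.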

    But I think that we need to do more as developers and system architects. Some of the suggestions that come to mind are:

      • Provide security and architecture primers as part of the languages that are being developed. This should make it easier for the end app developer to have an appreciation of the program they wrote and the environment it will run in. (Tips and tricks documents, do’s and don’ts documents, etc.)
      • Keep up with the development of the operating systems to make it safer/easier to deploy these new languages. UML in Linux might be a step in the right direction, and so is the new security mode that Internet Explorer runs in on Vista. But more needs to be done.
      • Have experts in the language provide more support in the area of the interaction with the OS and application programming for the target audience.
      • Make installers easier to use and smarter, taking a lot of the work of deployment out of the hands of those who want to write code without needing a masters in the OS they are deploying on. WIX for Windows does a very nice job. There are a few on Linux as well (rpm for example), but I would say they have some way to go before they are easier and safer to use.
      • Have ‘self check’ modes in the languages that are being developed. E.g. start the program the end user just wrote, and the language will have a mode that will warn/comment/suggest things to the app developer. (Much as there was lint in Unix, but it should be part of the execution of the application program, and it has to be user friendly. Lint output at times was downright sadistic to try to decipher :)
      • Force files to be created in safe areas.
      • A lot of OSS software comes with ‘configure’, which is a very old and robust way of building make files and their dependencies. Now create something called ‘deploy’ that will do the same thing for the completed applications the end programmer just created. The things it should check, for example, are:
        • Are the libraries it needs in the correct place?
        • Set up the environment variables if needed.
        • Does it follow the language author’s best practices for deployment? (Make application programs go to /usr/local/bin instead of /bin.)
        • Make sure that the directories it gets deployed in are not owned by the wrong owner/groups.
      • Have more interaction between the OS developers and the language developers to help each other build better languages and safer deployments on the OS.
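
    To show how small a first version of the ‘deploy’ idea from the list above could be, here is a rough Python sketch. It is my own illustration and not an existing tool: the function name deploy_check, its parameters and the list of discouraged directories are all assumptions, and it assumes a POSIX system.

```python
import os
import stat
from typing import List, Optional

# Assumed policy for this sketch: applications do not belong in these directories.
DISCOURAGED_DIRS = ("/bin", "/sbin")


def deploy_check(install_dir: str,
                 required_libs: List[str],
                 required_env: List[str],
                 expected_uid: Optional[int] = None) -> List[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    if expected_uid is None:
        expected_uid = os.getuid()  # default: the user running the check
    warnings = []

    # Are the libraries it needs in the correct place?
    for lib in required_libs:
        if not os.path.exists(lib):
            warnings.append(f"missing library: {lib}")

    # Are the environment variables it needs set?
    for var in required_env:
        if var not in os.environ:
            warnings.append(f"environment variable not set: {var}")

    # Is the target directory sensible, and owned by the expected user?
    try:
        info = os.stat(install_dir)
    except OSError:
        warnings.append(f"install directory does not exist: {install_dir}")
        return warnings
    if os.path.abspath(install_dir) in DISCOURAGED_DIRS:
        warnings.append(f"install into /usr/local/bin rather than {install_dir}")
    if info.st_uid != expected_uid:
        warnings.append(f"unexpected owner (uid {info.st_uid}) for {install_dir}")
    if info.st_mode & stat.S_IWOTH:
        warnings.append(f"install directory is world-writable: {install_dir}")
    return warnings


if __name__ == "__main__":
    for message in deploy_check("/usr/local/bin", [], ["HOME"]):
        print("WARNING:", message)
```

    Returning a list of warnings instead of failing on the first problem is intentional: like a dashboard light, it tells the developer everything that looks wrong and leaves the decision to them.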


    It seems to me that languages need to be developed more with the end user in mind regarding deployment and the OS’s they will be running on. A language can have all the cool features you ever thought of, but if on deployment you create system issues or, worse, a bad security hole, then it all will have been just a hobby.

    I can equate it to getting your driver’s license. Getting your license is fairly easy (at least in the US it is), and you can get it without knowing anything at all about cars. Car manufacturers have realized this and have made their cars tell the driver what is wrong with them. Now if you keep on driving your car with the ‘check engine light’ on, well, then you are on your own.

    If we want languages to be adopted and thrive, we better find a way to build in a ‘check program’ light.

  • Port25

    Barry Wellman, SD Clark Professor of Sociology, on Social Network Analysis and Community

    • 0 Comments

     

    by Bryan Kirschner on December 15, 2006 01:46pm

    Web 2.0. Enterprise 2.0. Open Source 2.0.  All the latest expectations for major revs of a good chunk of the information technology world seem to be heavily based on excitement about the possibilities for new forms of social networking and collaboration. 

    Nobody has more to say about how this can be done right—or wrong—than Barry Wellman. 

    Dr. Barry Wellman is the S.D. Clark Professor of Sociology at the University of Toronto and is the director of Netlab, a scholarly network studying computer networks, communication networks, and social networks.  To quote from an introduction to a tribute event, Barry “pioneered innovative approaches to three fields:

    • Social network/structural analysis,
    • Personal community and social support analysis, and
    • The study of cybersociety (which he calls "living networked in a wired world").”

    He has authored 3 books and more than 200 journal articles.  He is, to use images from social network analysis, perhaps the biggest “hub” of folks—students, former students, and industry and academic collaborators—who study online and offline communities (including open source communities) there is.

    He’s also a really nice guy and was kind enough to take a few minutes to talk with us while he was at the University of Washington’s ISchool talking about “What is the Internet Doing to Community and Vice Versa?.”

     

     

    Attachment: wellman.mp3
  • Port25

    Festivus for the Rest of Us

    • 0 Comments

    by jcannon on December 13, 2006 06:29pm

     

    It’s been an interesting nine months on Port 25. For those keeping track, the endeavors of our lab have taken us to Portland, New York, California, Thailand, Boston and more. We’ve had  the chance to speak to some leading minds in the free and commercial open source world, including Eric Allman, Andi Gutmans, Tim O’Reilly, Matt Asay, Miguel de Icaza, among others.  And there’s more to come.  So we thought, at this time of year, it was time for a pause – a moment of examination - to try something different.

    So here’s the idea.  While we’ve had the fortunate opportunity to talk to many provocative folks across the globe that have been very generous with their time and knowledge, we’ve yet to turn the camera on ourselves and let you ask the questions.  So let’s do exactly that. …We’ll take user-submitted questions (unedited), compile them, and then go around the table with the staff of the Open Source Software Lab to get the answers. Don’t hold back, feel free to air grievances (by grievances, I mean tough questions), or challenging technical issues you’re working on. We’ll try our best to address the most challenging, and most common, submissions. And given the often fiery tone on Port 25, there’s only one guiding principle to be smart about: questions of a derogatory, legal or unprofessional tone will likely be ignored. 

    Otherwise, the ball is in your court to pose whatever Linux, Windows or OSS-related question that’s buzzing in your brain. Use the comments below to post your question (in the interest of total transparency), or if you prefer, you can submit a question via e-mail. We’ll take the top 7-10 questions and get Sam, Kishi, Bryan, Hank and Anandeep all together right after the New Year, and tape a roundtable discussion of the Q&A session. We’ll post the resulting conversation on Port 25, in totality, afterwards. If it’s a productive discussion, we can schedule more – or even think about a live Town Hall chat with more folks from across Microsoft. The tone of the conversation is really up to you.


    Looking forward to hearing your questions ~ have a merry Festivus :)


    PS. It may help to keep in mind the backgrounds of our lab staff – ie, it’s unlikely we can answer questions related to nuclear physics (that I’m aware of – Sam might have a few tricks up his sleeve).

  • Port25

    It's Like This...Or Maybe Like That...(Part 1.2)

    • 0 Comments

    by Bryan Kirschner on December 12, 2006 02:24pm

    It’s been just over a month since I last blogged on the law-and-open-source analogy, and, despite a cool, unrelated entry in the middle, I feel my blog karma is running dangerously low… But (proving either that life is a journey of continuous learning and joyful surprise, or, more simply, that good things come to schlubs who drag their feet) last week not only did NPR run a story on legal apprenticeship programs, I also heard a speaker who thoughtfully referenced Foucault in a talk on fostering innovation.

    The first line of reasoning was that, rather than being a case where the “source code” of the domain (law) was “closed,” law is a case where it is really, really “open”:

    legal documents are almost universally public as well, so you can seek an example of someone else’s filing, brief, etc. from among literally millions of such documents, from the lowliest pleading to the most momentous Supreme Court argument. If a situation where the full text of millions of legal artifacts is available freely (or for the price of distribution) isn’t like open source code…I’m not sure what is!

    The implication, going way, way back to the first blog in this chain, is that it seems you can make a lot of knowledge (qua legal artifacts) “more like open” without new artifacts becoming “cheap” (lawyers still are, or at least feel, expensive).

    There is a different angle, though: the  second topic was about restricting access to the profession itself.  Is law an example where the “openness” of knowledge is counterbalanced by extreme restrictions on being able to function as a lawyer—specifically  multiple years of (often expensive) law school?  In this blog entry I am going to take a stab at articulating three views on this: let’s call them “the bad scenario,” the “not bad but not great scenario,” and the “pretty cool scenario.”

    (Since we’re comparing to open source development, I am going to make one simplifying assumption: having to pass a test—the bar exam—is entirely compatible with being “open.”    I say this because I see it as analogous to many OSS communities: to become (say) a committer you basically show up and start doing some work to demonstrate your skills; the test is analogous.  If the bar exam is broken somewhere in terms of content or form factor, I see that as a tweak as opposed to fundamental to the show-up-and-demonstrate-your-ability analogy.)

    First, “the bad scenario” (this is where Foucault comes in). To be somewhat painfully reductive, Foucault observed that there is a relationship between the structures of power and (ostensibly objective) knowledge, and that a characteristic of the modern world is the application of “discipline.” Discipline, broadly speaking, is defining and conditioning a state of behavior that creates non-egalitarian power relationships such that it becomes “normal” and the dynamic becomes invisible within a formally egalitarian, “fair” and “open” environment. The real rubber-hits-the-road example here would be a cycle that works like this: based on a scientific and meritocratic rationale, three years of law school becomes a standard; as a consequence, most lawyers who pass through this system view it as both feasible (since they did it) and meritocratic (since they by definition did well, or at least well enough, in it). In tandem, it also means lawyers are relatively scarce and expensive and have similar billing rates and incomes, because they all ponied up fairly homogeneous investments. Thus, they have psychological and material incentives to perpetuate the system, because in rejecting it they would not only repudiate their own accomplishments, but possibly introduce competitors who could undercut them. In the end, the system justifies its own perpetuation without controversy…because it seems to comport with the accepted paradigm of a modern profession.

    On to the “not bad but not great” scenario. This is where Spence and Signalling Theory come in. The Wikipedia entries linked above are concise, so I recommend reading them: the upshot is that for education to be used as a “signal” to help employers choose more valuable employees, it is not necessary for education to have any intrinsic value. The reason this is “kind of OK” (despite the fact that investing in education even if it doesn’t increase productivity seems intuitively perverse) is that it at least solves a (communication) problem, and can be economically efficient. By contrast, the “bad scenario” is quite likely to cause the behavior of the system to become very, very inefficient relative to a truly open consideration of all the options.

    Finally, the “pretty cool” scenario. Seven states enable people to become lawyers without attending law school through some type of apprenticeship program:

    In Vermont, participants don't need a college degree, but they must have completed three-quarters of their undergraduate course work. Then they have to spend 25 hours a week for four years studying alongside a licensed attorney.

    In Washington state, a “law clerk”

    …shall study for 4 calendar years. Each calendar year shall consist of 12 months, with a minimum of 120 hours of study each month, including the time spent in performing the duties of a law clerk. The tutor shall give personal supervision to the law clerk averaging at least 3 hours each week. "Personal supervision" is defined as time actually spent with the law clerk for the exposition and discussion of the law, the recitation of cases, and the critical analysis of the law clerk's written assignments.

    (In both cases these positions can be paid jobs.) This suggests a radically different paradigm for entry into the profession (one might say, and Foucault might agree, a “throwback” to a previous era), and a lovely mentoring dynamic. What’s particularly interesting is that women outnumber men in apprenticeship programs, and the typical age of the participants is older than that of law students. And while, at least according to NPR, the likelihood of programs like these increasing is low, among law schools there is an increasing incorporation of paid internships and flexible schedules.

    I think parallels to each scenario can be made to the open source development domain, and thinking about the “balance” of scenarios in both the legal and software development domains will be a fascinating discussion.  That blog will be along in less than 30 days, I promise….

  • Port25

    The 15 Most Useful Technologies for me in 2006

    • 0 Comments

     

    by billhilf on December 11, 2006 09:00pm

     

    We all use technology every day. This is the list of the 15 technologies that I found most useful (and in some cases extremely fun) in 2006. It includes all sorts of things: devices, software, open source, Apple, Microsoft, and so on. It's not about the manufacturer or the licensing model; it's just a list of the things I found useful and fun in 2006, and maybe it will give you some holiday shopping ideas for your geek loved ones.

    1. Sonos music system

    Blasting Styx's 'Lorelei' throughout your house, streamed wirelessly, is just awesome. Sonos is a Linux-based device (built by former Microsoft engineers). I have it set up to network mount my Windows server, which holds all my digitized music, so I can play literally every song I own anywhere in my house (or different music in different rooms) without pulling wires everywhere. And you all love Styx, don't deny it.

    2. Infrant Technologies ReadyNAS

    I back up that Windows server with a great and affordable NAS appliance from Infrant Technologies. It's a Linux-based appliance that holds a terabyte of SATA disks with a nifty technology they call 'X-RAID' which allows you to easily swap disks without reconfiguring the RAID setup.

    3. Newsgator InBox

    Like many, I now live primarily in email and RSS. I rarely visit traditional web sites; most information I get is RSS based. Newsgator delivers my RSS feeds directly into Outlook and is now an indispensable tool for my daily information consumption. Scoble turned me on to Newsgator and I've been a happy user ever since. Some of my favorite feeds? Make magazine, TechCrunch, TED Blog, O'Reilly Radar, National Geographic News, a bunch of personal blogs I follow and customized 'smart feeds' that are like pre-scripted Technorati searches.

    4. iWeb

    iWeb is easily my favorite graphical Web site builder for simple, personal web sites. I don't use the .Mac services (my personal Web sites run on Windows and Linux of course!), but the iWeb tool is still very easy and quickly builds attractive Web sites. My one significant complaint about iWeb is that the actual file size of the sites it builds is just ridiculous. The Apple iWeb team really needs to work on optimizing this. Hint: just use some typical storage minimums from the top Web hosters as your target.

    5. Ruby on Rails

    I don't do any real development any more but I certainly tinker and I've really enjoyed building some Web applications with Rails. The framework is quick to understand and lightweight enough to get simple web apps up and running. I've built apps on Linux and Windows (for the latter this was useful). I also love PHP and we have some great things going on there, but Rails was my programming experiment for 2006.

    6. G-Shock Atomic Solar watch

    I paid about $40 for this watch at Costco and I love it. It syncs with the atomic clock in Colorado, so time is always accurate. It runs on solar power (including office lighting). You can drive a tank over it or swim to the bottom of a lake with it on. It has multiple time zones, alarms, and a nifty blue light that automatically turns on when you rotate your wrist to check the time. The only downside is that the atomic clock sync is radio based and doesn't work when I'm in Ankara or Manila, but the watch still keeps the last known 'good time', so it's not too big of an issue. It's my perfect watch.

    7. Game Systems: Alienware Area-51 PC and XBOX 360

    I relieve stress through jogging and gaming (not at the same time). I use good running shoes and I use good game systems. These two are the best. My Alienware PC is juiced heavy, with dual nVidia cards SLI configured, 4GB memory, two Intel procs, etc. etc. It rocks. My 360 is also stellar and I just started using the new 1080p HD output, which makes the games that much more real – Splinter Cell baby!

    8. RadioShack switching power supply

    One addition I needed for my 360 system was cooling, as I keep the 360 in a tightly closed cabinet space. I use a RadioShack switching power supply to power two CPU fans I have mounted to my wood media cabinet (see bottom of this blog for a photo) to blow out the warm air and keep the 360 in a cool environment. The power supply replaced a kooky 6V battery configuration I had previously, and now I just switch on the fans and game away. This isn't necessary for all Xbox systems of course, but my cabinet has really poor air flow. There are some other ways to do this too, such as liquid cooling (literally modding the console itself), but they are rather complex – check this out.

    9. Games: World of Warcraft (PC) and Gears of War (360)

    It's hard to pick favorite games, but I've had lots of fun with WoW and GoW in 2006. Lately, I've let my 58 Warlock take a holiday in Azeroth while I build my GoW skills, and I really do think that Gears is one of the best games for the Xbox 360 system. The graphic details, game play, challenge, and pacing are phenomenal and it's exciting to see titles like this push the 360's capabilities. And do play with a friend; it is much more fun. If you like shooters, GoW is a must-have. As for WoW, I hesitate to recommend it too strongly as it can occupy a LOT of your time if you get hooked. But… it is easily the best MMORPG I've played and the next major rev, Burning Crusade, looks fantastic.

    10. AvantBrowser

    I love IE7 and I use Firefox quite a bit, but for most of 2006 I used AvantBrowser, which is based on the IE engine and adds a load of features, like tabbed browsing and RSS capabilities. It's fast and functional. It's free, but if you use it you should donate to help the developers out.

    11. Parallels Desktop virtualization for Intel based Macs

    I use Parallels to run various Linux distributions on my MacTel (the Intel based iMac). It's good software and is currently the only virtualization solution for MacTels. You can also run Windows virtualized. My main complaint: on the wireless iMac keyboard, the right 'Control-Alt' keys escape the mouse from the virtualized OS window but the left 'Control-Alt' keys do not. Silly bug, but annoying.

    12. Motorola Q

    I try not to carry a bunch of gadgets with me (which is why I love the multifunction G-Shock watch). The Q is the ideal phone for my day to day communications. It runs Windows Mobile 5.0 and allows me to sync with Exchange and read mail in Pocket Outlook (gzip is used in 5.0 and it helps save serious bandwidth when sync'ing mail, see here). The phone is very slim and you can carry it in your pocket easily. Battery life could be better, but for the form factor I can live with recharging nightly. I use the Samsung i830 when I go international as it has both GSM and CDMA capable radios, but the Q is the phone to beat.

    13. Microsoft Office 2007 and Office Communicator

    Many, if not most, Microsoft employees have been using Office 2007 and Office Communicator for most of 2006. This isn't a product pitch; these technologies have made my life significantly easier since I started using them. The new Office 2007 UI is awesome and makes productivity tool illiterates like me seem like power users. Communicator has also significantly changed how we IM at Microsoft. It's integrated with our Exchange server infrastructure and with our phones, so I can right click someone's name in my IM window, select 'Call' and my desk phone will automatically flip to speakerphone and dial their number. It's quite literally changed how we communicate at work.

    14. Windows Vista

    I like Vista. There, I said it. Faster, more reliable, more secure, more intuitive and better looking than any desktop system I've used. I'd say this whether I worked for Microsoft or not. Chris Sells says it better here. These things among others have made Vista a very important and useful tool for me.

    15. SpamBayes

    SpamBayes is an anti-spam filter written in Python that plugs into Outlook. It's an open source project maintained on SourceForge. I've been using it since before 2006 and it's saved me from the massive waterfall of spam that I get from all the various email accounts that I route into Outlook. Runs on Windows (Outlook), Linux/Unix and MacOS.

    *Bonus*  15.5 Photosynth

    This isn't really something that made my life easier in 2006, but it's just smoking hot software that is worth spending a few minutes with. It's a tech preview from our Live Labs and it's a very cool new way to look at photos. Some huge possibilities with this technology. Enjoy!

    (tech preview requires XPSP2 or Vista and IE6 or IE7)

     

    Happy holidays and here's to a great 2007.

    -Bill

     

     

  • Port25

    Windows Unified Data Storage Server


    by MichaelF on December 05, 2006 06:36pm


    Today Microsoft announced the availability of the Windows Unified Data Storage Server 2003 (WUDSS). Built for mixed environments, this solution is highly interoperable, without loss of performance in either NFS or CIFS environments. Remote administration is also provided through both ActiveX RDP and Java RDP client support.

    This sounded interesting so we sent Hank to talk with Tres Hill, Sr Product Manager, to find out why this announcement is significant for IT professionals running heterogeneous environments.


    Attachment: treshill.mp3

  • Port25

    Experimental or Production?


    by anandeep on December 05, 2006 03:33pm


    There are two things people figure out about me (mainly because I tell them!): one, that I am crazy about airplanes, and two, that I love stirring controversy! And in this blog I get an opportunity to bring those two favorite things together.

    There are two kinds of light or General Aviation airplanes out there - the "production/certified" airplanes (referred to as "Spam Cans") and "homebuilt/experimental" airplanes. 

    You probably have heard of the manufacturers of the "Spam Cans" - they have names like Cessna, Piper and Beechcraft. These are large companies with lots of engineers who mass produce airplanes and sell them to you if you part with large sums of money. They also give them to you in any color as long as it's creamish. These are faithful, reliable if boring airplanes. Nothing wrong with them but they are not fun. They also make a lot of compromises in speed, maneuverability, weight carrying ability or runway length requirements - and usually don't excel in any of those criteria. They are the airplanes every commercial enterprise uses though. Almost everybody learns to fly in them. Some of them are bush planes in Alaska and Africa and are the lifeline of a lot of people there - nothing to sneeze at! Below is a picture of a Cessna 150 (my ex-airplane) that I used to build up hours and tour the Pacific Northwest. One of the "Spam Cans" but beloved nevertheless.

    Then there are people who accept no compromises. They decided they didn't want to accept a hired engineer’s opinion of the best design. They went to work designing their own planes and then offering plans or kits so that other people could build them. One of the early pioneers of this was Burt Rutan, now famous as the designer of the first private spaceplane, "SpaceShipOne", who offered a kit for an airplane christened "VariViggen" that had its tailplanes in front (in a configuration called a "canard"). It could go faster than any production plane on much less power and was stall proof, which meant that it was a lot safer than the regular planes. The other success story is Richard VanGrunsven, whose company "Van's Aircraft" has built a family of aircraft called "RV"s (there is still some debate as to whether that means "Recreational Vehicle" or "Richard VanGrunsven"). As of the time of writing there were 4861 RVs built and flying - more RVs ship every year than any commercial light plane manufacturer in the world can produce! These aircraft are speedier, more maneuverable, and have better weight carrying ability or shorter runway length requirements than comparable production aircraft with the same horsepower. These planes are known as “homebuilts” or “amateur-built” or “experimental”. The “experimental” title comes from the placard that they have to exhibit by law. This is also the placard all manufactured planes and military aircraft have to exhibit until they are certified by the FAA. It doesn’t necessarily mean that the aircraft is an experiment in progress.

    OK - where's the controversy? I am saying that the Open Source Software movement is like the experimental aircraft movement, and asserting that commercial software companies are like production aircraft companies.

    After all, there is a community among experimental builders that rivals the OSS community. They share ideas freely, give each other plans for improvements and are very loyal and committed to the cause. One instance of such a community is "Van's Air Force". These communities and the experimental manufacturers are also on the cutting edge of technology, pioneering cheap "all glass" computer screen instrumentation in light planes among other things. Like Linux, most successful experimental aircraft have a solid “kernel” that is built and maintained one way, but, also like Linux, the “distributions” abound based on builders’ personal preferences. For instance, from the very successful Van's RV-4 came the Harmon Rocket. Ubuntu and RedHat aren’t THAT different! :-)

    But it's not all applehood and mother pie. Building these airplanes (even if you do all the work yourself) is not much cheaper than buying a general aviation airplane. You do have to build them to get all the advantages, and it is considered a truism in the community: "build if you like to build, buy if you like to fly!" This means that it takes serious commitment to build one of these things, and you had better take a lot of pleasure in just the act of building. Of course, you could buy one of these already built, but would you trust the builder? Build quality is very variable! Certification standards are conservative and lengthy for a reason - a small variation can result in a catastrophic outcome. These aircraft are also more demanding to fly than the boring old "Spam Can".

    I fly a Cessna 182 for the Civil Air Patrol - and I wouldn't want to fly an experimental airplane that I myself hadn't built. (Even if I built one - would I?) That is because we fly in the mountains with heavy loads (survival gear, direction finding equipment, and individuals who are - shall we say - "weight challenged"). I know the 182 won't do things spectacularly but will do its standard thing as long as I follow the manual. It's heavy on the controls, isn't that fast and has a high fuel consumption - but it can carry a heavy load and land in a reasonable distance. And I can be sure that all the improvements that Cessna has mandated have been incorporated, since it would be illegal not to. Not so for the experimentals, since the builder, not the kit manufacturer, is the legal manufacturer and can make his own decisions! It isn’t the “experimental” placard that scares me; it’s the fact that I would have to form a judgment on my own on every INSTANCE of what is fundamentally the same design.

    Am I taking the analogy too far? To be truthful, I don't know - but it is certainly worth thinking about!

    Now if you send e-mail to Sam Ramji telling him how much you liked this blog - I might be able to afford a house in the Puget Sound Area and this RV-8 kit that I want at the same time! :-)

  • Port25

    Thinking about HPC Infrastructure


    by kishi on December 01, 2006 04:21pm


    I started the first HPC blog (see “previous blog”) with an understanding that HPC is an area where there has been a surge of activity from a development/investment standpoint. This segment of Information Technology has experienced a heightened level of engagement from OEMs and partners, all trying to meet the growing computing needs of their customers. So after getting a basic understanding of why HPC matters, the next logical step that needed uncovering was “how to think” about HPC Infrastructure and tap into the “wisdom” behind managing it. You might ask why this is relevant. For starters, setting up HPC Infrastructure is an experience that, just like any other infrastructure, be it Network or Storage, requires intricate planning and intimate familiarity with its individual contributing components. In the case of HPC, let’s just say you really need to know your nodes :-). Let’s talk more about what’s involved in setting up an HPC Infrastructure and how to think about it as a whole:

    1.    Investment Impetus: To successfully plan and design an HPC Infrastructure, the first and foremost step should be to “look beneath the surface”. This simply means understanding the primary reason for investing in HPC. The demand for HPC equipment, linked to a set of business objectives, should have a clear purpose around the outcome and expectation. This is truer today than at any other time because the consumption of HPC cycles, specifically in the research and development areas across all verticals, has seen a steady 70% growth over the past four years (Source: primeur). Despite this tremendous growth in the proliferation of HPC technology, the growth pattern itself is sporadic. One of the reasons for it may be the complexity, not only in terms of design but also in terms of consumption. Take the case of SHARCNET in Southern Ontario, which developed a long range plan around adoption and implementation of HPC technology. According to the report, some of the elementary challenges around planning for HPC emerge from the fact that “it is an enabling technology for an extremely diverse set of researchers”. This embodies the essence of the sentiment behind the complexity and diversity predominant in the HPC space.

    2.    Planning and Designing Hardware: While thinking about planning and designing an HPC infrastructure implementation, I spoke to several folks in this area, drew from a decade and a half of my experience as an Infrastructure Architect and thought of some key areas that I would consider. These include:

    a.  Facility considerations (Rackspace, Power and Cooling): Ask any enterprise level Datacenter manager what his/her top 10 pain-points are and you are bound to hear the words “rackspace, power and cooling” in what follows. Dig deeper and you’ll realize that in any datacenter, there’s a fixed number of colos (colocation spaces) you can populate based on the HVAC designs. This means that rackspace is what’s at a premium in each of these colos, with every “U” accounted for. Packing dense chipsets into small form-factor servers adds to existing power and cooling challenges.

    Translation – you need more outlets and more airflow per rack than you did a decade ago, when a handful of 4U and 5U servers took up the entire rack.

    b.  Physical Plant planning: To quote the resident HPC Guru Frank Chism: “I cannot over emphasize the importance to planning for physical plant in HPC deployments. Things like room and raceways for well managed and planned cabling. HPC uses more cable than anything except maybe SAN. Also, pay attention to floor loads, air flow, clean and redundant power. Finally, never never forget out-of-band management. Deep subfloor really helps with all that cabling”.

    Translation – Effective HPC performance calls for an effective HPC design, which includes tweaking hard as well as soft components. These components can be as covert as chip-design or as overt as subfloor depth.

    c.  Hardware and Processing Power: Pushing the envelope on hardware and processor architectures today translates to increased performance (the heart and soul of HPC). Adding energy efficient hardware on top of the architecture amounts to greater investment in raw computing power, which in turn translates to building a sound HPC infrastructure. The key advantages one needs to look for in this scenario are faster data access and increased instruction throughput. The word “performance” is repeated throughout this topic because it IS what HPC is all about: the ability to reduce the number of cycles needed to process data. Addressing the hardware and processing specs as part of core requirements ensures a smoother build-out.

    3.    Implementing HPC Tools and Software: Like any other piece of hardware, an HPC cluster is just hardware until software and tools exploit the underlying architecture so that it can do what it does best: compute. When thinking of some core elements of HPC tools and software, here’s how I thought to break them up:

    a.  Setup and deployment systems: Setting up HPC clusters goes back to what I said earlier in Section 1 – what do you want to do with it? Although there are various methods that allow you to drive the software and installation experience of an HPC system, the bottom line is that this depends to a great extent on which components make up the genetic composition of the HPC cluster you ordered. Taking a look at some HPC software setup and deployment tools out there, a few mainstream ones are SCALI and HP-MPI (HP’s message passing interface). These packages provide deployment, monitoring and job scheduling services for managing and administering an HPC cluster, just like IBM’s CSM (Cluster Systems Management). In the Open Source space, there are Maui and Torque, which work as job scheduler and resource manager, respectively, for managing compute nodes and clusters. Platform Rocks is another suite of utilities that allows installation and integration of third party apps.

    b.  Parallel FS: This is truly what I think is going to be the frontier for some intense activity over the next few years. Using Wikipedia’s description, “Distributed parallel file systems stripe data over multiple servers for high performance. Some of the distributed parallel file systems use object storage device (OSD) (In Lustre called OST) for chunks of data together with centralized metadata servers such as Ceph Scalable, Distributed File System from University of California, Santa Cruz. (Fault-tolerance in their roadmap.), Lustre from Cluster File Systems. (Lustre has failover, but multi-server RAID1 or RAID5 is still in their roadmap for future versions.) and Parallel Virtual File System (PVFS, PVFS2)”.

    Deep-Dive: At base, parallel file systems are global namespaces for files that achieve high bandwidth via parallelism. That bandwidth comes in three dimensions: high aggregate bandwidth, high single stream bandwidth, and high metadata operations per second. No one seems to have achieved high performance in all of these dimensions. Don’t forget that the volumes of data are so large that backup is a major undertaking and thus reliability is required as well. Further, nobody seems to be able to make a parallel file system that handles high-speed bulk data and also performs well for short I/Os, such as the ones you generate when compiling a major application.
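
    To make the idea of striping concrete, here is a minimal sketch in Python of round-robin striping across a handful of servers. It is only an illustration: the function names, the in-memory lists standing in for storage servers, and the tiny stripe size are all assumptions made for this example, and this is not how Lustre, PVFS or Ceph actually lay out data.

```python
from typing import List


def stripe(data: bytes, num_servers: int, stripe_size: int) -> List[List[bytes]]:
    """Split data into fixed-size chunks and deal them round-robin to servers.

    Each inner list stands in for one (hypothetical) storage server.
    """
    if num_servers < 1 or stripe_size < 1:
        raise ValueError("num_servers and stripe_size must be positive")
    servers: List[List[bytes]] = [[] for _ in range(num_servers)]
    for offset in range(0, len(data), stripe_size):
        chunk_index = offset // stripe_size
        servers[chunk_index % num_servers].append(data[offset:offset + stripe_size])
    return servers


def reassemble(servers: List[List[bytes]]) -> bytes:
    """Read the chunks back in the same round-robin order they were written."""
    chunks = []
    rows = max((len(s) for s in servers), default=0)
    for row in range(rows):
        for server in servers:
            if row < len(server):
                chunks.append(server[row])
    return b"".join(chunks)


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, num_servers=3, stripe_size=4)
    for i, chunks in enumerate(layout):
        print(f"server {i}: {chunks}")
    print(reassemble(layout) == payload)
```

    Because consecutive chunks land on different servers, a large sequential read can pull from all of them at once, which is where the aggregate bandwidth comes from. A very short read touches only one chunk on one server, so it gets none of that benefit.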

    c.  Multiple Networks: A final comment on implementation of HPC is that HPC often has multiple networks. For example, it does little good to have a parallel file system that delivers gigabytes per second of data to single nodes if the network can’t handle that much bandwidth!
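
    The bottleneck argument above is simple arithmetic, and a rough sketch may help. The function name and the numbers below are assumptions chosen purely for illustration (they are not measurements of any real system), and the calculation ignores protocol overhead.

```python
def effective_node_bandwidth(fs_gbytes_per_s: float, link_gbits_per_s: float) -> float:
    """Return the data rate (GB/s) a node can actually see.

    The node is limited by whichever is slower: what the file system can
    deliver, or what the node's network link can carry. Link speeds are
    usually quoted in gigabits, so convert to gigabytes first.
    """
    link_gbytes_per_s = link_gbits_per_s / 8.0
    return min(fs_gbytes_per_s, link_gbytes_per_s)


if __name__ == "__main__":
    # Hypothetical file system able to feed 2 GB/s to a single node.
    print(effective_node_bandwidth(2.0, 1.0))   # node on a 1 Gb/s link
    print(effective_node_bandwidth(2.0, 20.0))  # node on a 20 Gb/s link
```

    With the slower link the node sees only a fraction of what the file system could deliver, which is one reason HPC sites often run a separate, faster network for storage and message passing alongside the management network.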

    So in conclusion, here’s a recap on the learning behind setting up HPC Infrastructure:

      • Comprehensive understanding of WHY you’re investing in HPC and what you expect as an outcome
      • Deep familiarity with the core HPC Hardware and design components
      • Facility and Physical plant considerations to ensure adequate cabling and subfloor space
      • Visibility into prominent HPC based software and toolsets
      • Understanding the three dimensions of bandwidth
      • And finally accommodating the concept of “Multiple Networks” into node design to accommodate the required bandwidth


    I look forward to getting back to you with more on HPC over the next few weeks. Until then, “Happy Computing”!!
