by admin on June 06, 2006 03:30pm


Research Strategy Corner:  Disambiguating “Open”

Disambiguate (transitive verb):  to establish the true meaning of an expression, regulation, or ruling that is confusing or that could be interpreted in more than one way

I’m a Research Strategist with the Open Source Lab here at Microsoft. When folks ask what that means, I usually tell them the second-best definition of “strategist” I’ve ever heard is “a researcher who gets to make stuff up,” but the first-best is “someone who establishes a series of steps to achieve a goal.” The latter is what my job is all about. In the process, because it involves synthesizing both technical information and insights from computer science, organizational science, and sociology research, I sorta get to make stuff up, as long as the math works (which is probably a little bit different from what the average civilian bystander thinks of as “making stuff up”).

What’s the goal? The title of this post summarizes it concisely: disambiguating “open.” When the phrase “open source” is used, for example, it could represent a simple descriptive statement of fact about code visibility (read any good mash-ups lately?); it could also be referring to software artifacts available under a fairly wide range of license types; or it could be intended to refer to something compliant with a very specific set of criteria like the Open Source Initiative’s ten-point definition (http://opensource.org/docs/definition.php). It could be referring to one of tens of thousands of single-developer projects on SourceForge, or to highly coordinated efforts like FreeBSD; and, on the other hand altogether, it could be in marketing materials from big corporations like IBM or Novell (if this animation is still up on Novell’s site, you can experience a dizzying array of suggestions for what “open means to your enterprise”: http://www.novell.com/solutions/?sourceidint=hdr_productsandsolutions).

“Open” is one of those words that, in the software domain today, is becoming increasingly probabilistically uninformative. Applied to an endeavor (like a software development project) or an artifact (like a piece of software), the word less and less enables you to accurately predict attributes of that endeavor or artifact. That matters because it is those attributes, of the endeavor (like who built it and how) and of the artifact (like architecture, coupling, interaction paradigm), that may actually help you better predict what will happen now and further on down the road if you choose to take a dependency on something.
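One way to make “probabilistically uninformative” concrete is mutual information: how many bits knowing a label tells you about some other attribute. What follows is only a minimal sketch in Python, not a method from the lab. The data sets, the attribute (license family), and the function name are all made up for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Estimate I(X; Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)     # marginal counts of the label
    py = Counter(y for _, y in pairs)     # marginal counts of the attribute
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical observations of (label, license family); invented, not real data.
# Case 1: the label perfectly predicts the attribute.
informative = [("open", "permissive")] * 50 + [("closed", "proprietary")] * 50

# Case 2: the label and the attribute are statistically independent.
uninformative = (
    [("open", "permissive")] * 25
    + [("open", "proprietary")] * 25
    + [("closed", "permissive")] * 25
    + [("closed", "proprietary")] * 25
)

print(mutual_information(informative))    # label carries a full bit here
print(mutual_information(uninformative))  # label carries nothing here
```

In the first made-up data set the label tells you everything about the attribute; in the second it tells you nothing. A word that is becoming uninformative is one drifting from the first situation toward the second.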

I don’t care how the characterization “open” makes you feel, whether it is nervous or giddy with excitement: my objective function is the ability to use one bit of information to reliably predict other bits of information.  In this space I’ll share our efforts to go about doing that with regard to “open” and what we find in the lab and in the world of academic research—but first I’ll give you some visibility into how I start out structuring lines of inquiry.

There are a few different approaches you can take to things. For our purposes here, an analytic approach is like an argument from first principles. The position of Free/Libre software advocates is essentially analytic: their argument is that software should be a certain way given the set of principles they start from, and there really isn’t any evidence you could collect out in the world that would change their minds. This discussion isn’t particularly relevant to my day job. An empirical approach is all about data and probability: if you know foo, you predict bar better. This is entirely relevant but is exactly what we don’t know enough about yet. A phenomenological approach (http://plato.stanford.edu/entries/phenomenology/ if you really want to know) starts with experience as it is experienced, and this is useful for starting to disambiguate open source. Here’s why: I don’t want to argue about what’s open and what’s not or about whether things should be a certain way. I want to build an informative base of data that lets me characterize and analyze endeavors and artifacts underneath this fuzzy term “open”.

So we can start by asking: what would I (and, if there’s enough established shared meaning, other people) experience as a phenomenon that is certainly “like open source” versus certainly “not like open source”? I suggest two sets of statements along an axis shown in figure 1, below. Once we have this down, we have a starting point for the collection of empirical data that (if our starting point is right) will position endeavors and artifacts somewhere along a continuum between the two extremes.

Figure 1: Phenomenological approach to characterizing endeavors

I won’t start into operationalized definitions here, because to some extent that would defeat the purpose of capturing a top-of-mind response to the statements themselves. So what do you think: top of mind, as you experience these collections of statements, is the essence of “like” and “not like” open source represented? Do they raise questions? Controversy? Let me know and we can dive into where some of these come from (yes, I said my experience-as-experienced, but remember, when I make stuff up the math has to work, and there’s a lot of great research out there that can help tune these characterizations).

Disambiguation:  Because what you don’t know you don’t know probably will hurt you