Testing with Appium - Exchange data between your tests and your app

In my previous post, Q&A on Appium - Should I use it?, I focused on some scenarios where Appium might not be the perfect choice; one of those scenarios is data exchange. I know I might sound redundant, but let me repeat myself: data exchange is not a defect in Appium or Selenium; it is a scenario which is simply not covered by the specification, and is therefore difficult to achieve. So do not blame Appium or Selenium if you cannot easily send and receive information to and from your app while testing.

Selenium API at a glance

If we take a look at the commands available in the most common Appium drivers, we will probably not find anything designed for data exchange. Consider, for example, the Javascript driver and the commands exposed there: nothing specifically focuses on exchanging information. Just automation. The same goes for the other drivers: there is no dedicated method that enables data exchange between the test and the app being tested. So what do we do? We try to achieve that result by ourselves.

One method to exchange data

There is one pair of methods in Selenium which I did not mention in the previous section, and which is actually the solution to our problem:

 executeScript(script, args)
executeAsyncScript(script, args)

These are the names of the methods as they appear in the W3C WebDriver specification. The two commands allow the test to execute some Javascript code in the (webview-based) app. It might not sound so promising, but there is one very important thing about those two guys: they can return stuff!

Returned value

There is some complexity behind how those two functions are implemented in any given WebDriver; however, the W3C defines the basics of their logic as well as the protocol used to return a value to the caller. The W3C specification is very generic, so here I will try to make things a bit easier. According to the specification, when executeScript is invoked (the same goes for executeAsyncScript), the following must happen:

  1. The caller must provide the script in a string.
  2. The string is parsed when reaching the browser's window object (in a webview-based app, remember that we have a browser) and set as the body of a Javascript function object acting on the global context and taking as input the arguments passed to executeScript.
  3. The function is executed in a safe block, so that its return value is captured and any exception is caught.
  4. In case of exception, the error is set as the response status.
  5. In case a value is returned, it is parsed and sent back in the response's value field. Response's status field is set to success.
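
The steps above can be pictured with a small simulation. This is only a sketch of the idea and not how any real driver is implemented: the runScript function and the shape of the response object are assumptions made for illustration.

```javascript
// Simplified simulation of the executeScript flow described above.
// runScript and the { status, value } response shape are hypothetical.
function runScript(script, args) {
  try {
    // Step 2: the script string becomes the body of a function object.
    var fn = new Function(script);
    // Step 3: run it in a safe block; args are visible as `arguments` in the script.
    var value = fn.apply(null, args);
    // Step 5: a returned value goes into the response's value field.
    return { status: "success", value: value === undefined ? null : value };
  } catch (err) {
    // Step 4: an exception becomes the response status.
    return { status: "error", value: String(err.message) };
  }
}

var ok = runScript("return arguments[0] + arguments[1];", [2, 3]);
var failed = runScript("throw new Error('boom');", []);
console.log(ok);     // { status: 'success', value: 5 }
console.log(failed); // { status: 'error', value: 'boom' }
```

Note how the script is just a function body: that is why a bare return statement is legal in the string passed to executeScript.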

Also, when a value is to be returned by the function, the following must happen:

  1. If the value is undefined or null, null is returned.
  2. If the value is a number or a string, then no further operation is done and the value is returned as it is.
  3. If the value is a DOM element, then the corresponding WebElement is constructed and returned.
  4. If the value is an array, then return an array whose elements are built by applying this algorithm to each element.
  5. If the value is an object consisting of custom properties, then return the same object after applying this algorithm to each property in it.
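
The recursive nature of these rules is easier to see in code. The following is a rough sketch under stated assumptions: isDomElement and toWebElement are hypothetical helpers (a real driver builds a proper element reference), and elements are detected here only by their nodeType so that the sketch can run outside a browser.

```javascript
// Sketch of the return-value conversion rules listed above.
// isDomElement and toWebElement are hypothetical stand-ins.
function isDomElement(value) {
  // DOM element nodes have nodeType === 1.
  return typeof value === "object" && value !== null && value.nodeType === 1;
}

function toWebElement(element) {
  // Placeholder for the WebElement a driver would construct.
  return { webElement: element.id || "unknown" };
}

function convertReturnValue(value) {
  if (value === undefined || value === null) return null;                    // rule 1
  if (typeof value === "number" || typeof value === "string") return value;  // rule 2
  if (isDomElement(value)) return toWebElement(value);                       // rule 3
  if (Array.isArray(value)) {                                                // rule 4
    return value.map(function (item) { return convertReturnValue(item); });
  }
  if (typeof value === "object") {                                           // rule 5
    var result = {};
    Object.keys(value).forEach(function (key) {
      result[key] = convertReturnValue(value[key]);
    });
    return result;
  }
  return value; // other primitives (e.g. booleans) pass through in this sketch
}

var converted = convertReturnValue({
  id: "i3451",
  tags: [1, undefined],
  node: { nodeType: 1, id: "item_selection" }
});
console.log(JSON.stringify(converted));
// {"id":"i3451","tags":[1,null],"node":{"webElement":"item_selection"}}
```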

So as you can see, we can return something from the app to our tests. By invoking a script which returns some interesting information, we can actually send requests and get data in return. However, things are not so easy.

Not everybody implements the full specification

The problem is that not every driver implements the algorithms I just mentioned 100%; this is especially true for the algorithm used to build the returned value. Based on my experience, so far it is safe to return numbers or strings. When it comes to whole objects or DOM elements, sorry to say, it is not certain that what we get back from the device is exactly what we were expecting. For this reason, for the sake of test stability and reliability, it is better to always keep in mind the following:

Always return simple types only: in my experience this is the most reliable way to ensure successful data exchange.

Returning complex data: JSON serialization

So, how can we exchange complex data if we can only return strings or numbers from our app? Actually we do not really care about returning numbers, strings are just fine! The answer to our question is the following:

To enable data exchange, it is possible to send and receive objects as strings via JSON serialization.

The following picture will probably provide a better description of the basic idea.

We can write Javascript queries to send via executeScript, and get a JSON string back. When our query is executed, we must take care of stringifying the value and returning the resulting string instead of the Javascript object. Once we get the string in our test, we can use one of the many JSON parsing libraries available to convert it into a dynamic structure or a hash table (dictionary).
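
As a minimal sketch of this round trip, both sides are shown here in Javascript for brevity. The scriptBody function stands in for the code that would run in the webview, and the object it builds is made up for illustration.

```javascript
// App side: what the script sent through executeScript would do.
function scriptBody() {
  var result = { status: 1, message: null, value: { id: "i3451", text: "Item 4" } };
  return JSON.stringify(result); // only a plain string crosses the driver boundary
}

// Test side: the string is parsed back into a structure we can inspect.
var json = scriptBody();
var parsed = JSON.parse(json);
console.log(typeof json);     // string
console.log(parsed.value.id); // i3451
```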

Possible scenarios

If the idea is clear, then we can move on to picturing some possible scenarios. Where might we ever need to exchange data? Consider the following examples:

  • Getting the possible values that a user can select from a combo box.
  • Getting the chosen value from a list of possible options when we are not using form controls in our page, but a complex HTML or jQuery control.
  • Getting the value typed by the user in an editable div in one page of your hybrid app.
  • Checking the return value of an AJAX call to the server.

Listing more scenarios would be pointless. Believe me, you will surely find a situation where data exchange is needed, especially if your tests are a bit more complicated than usual automation and your apps have articulated user stories.
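
To make the first scenario a bit more concrete, here is a rough sketch of the logic a script could use to read the values of a combo box. The element id "cuisine", the readComboValues name and the fake document object are all assumptions made for illustration; on a device, the function body would be sent as a string and would use the real document.

```javascript
// Script logic for the combo box scenario; "cuisine" is a hypothetical element id.
function readComboValues(doc) {
  var select = doc.getElementById("cuisine");
  if (select == null) {
    return JSON.stringify({ status: -1, message: "Combo box not found", value: null });
  }
  var values = [];
  for (var i = 0; i < select.options.length; i++) {
    values.push({ value: select.options[i].value, text: select.options[i].text });
  }
  return JSON.stringify({ status: 1, message: null, value: values });
}

// Minimal stand-in for the webview document, only for trying the logic off-device.
var fakeDocument = {
  getElementById: function (id) {
    if (id !== "cuisine") return null;
    return { options: [{ value: "it", text: "Italian" }, { value: "jp", text: "Japanese" }] };
  }
};

var comboJson = readComboValues(fakeDocument);
console.log(comboJson);
```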

An example: values in a list

Let's consider a real case and see data exchange in action. Our app deals with restaurants and is able to locate the most famous ones close to where the user is. In a page of our app, after the user issues the search command, we are able to see the list of possible restaurants in a radius of 200 meters. We want to test that some restaurants are displayed when the user is located in a specific area. In our page, the list of restaurants will look like the following:

 <ol id="item_selection">
  <li class="item" data-id="i0463">Item 1</li>
  <li class="item" data-id="i3957">Item 2</li>
  <li class="item" data-id="i9248">Item 3</li>
  <li class="item" data-id="i3451" data-selected>Item 4</li>
</ol>

Remember that our app is a hybrid app, so it relies on HTML. Here we could have used a simple combo box; however, my point is to prove that we can exchange data coming from any part of the DOM, not just form controls.

Serializing with Javascript

In our test we need to get access to the list, in particular we want the ID (available as a data attribute) of the restaurant which is selected. How can we do this? In our test we can have the following code.

 [TestMethod]
public void TestSelection() {
  AppiumDriver driver = this.Driver; // The test class hosts the driver (assumed to be connected)

  // Retrieving from device. The script uses single quotes for its strings
  // so that it can live inside a C# verbatim string without escaping.
  object ret = driver.ExecuteScript(@"
    // Javascript executed as the body of an anonymous function in the webview
    var ret = {};

    var ol = document.getElementById('item_selection');
    if (ol == null) {
      ret.status = -1; // List not found
      ret.message = 'Could not find list';
      ret.value = null;
      return JSON.stringify(ret);
    }

    // The ol element could be found
    var lis = ol.getElementsByTagName('li');
    if (lis.length === 0) { // getElementsByTagName never returns null
      ret.status = 0; // No items
      ret.message = 'No items in list';
      ret.value = null;
      return JSON.stringify(ret);
    }

    // We have the list, need to locate the selected item
    ret.status = 1;
    ret.message = null;
    ret.value = null; // Stays null when no item is marked as selected
    for (var i = 0; i < lis.length; i++) {
      if (lis[i].dataset.selected != null) {
        ret.value = {};
        ret.value.id = lis[i].dataset.id;
        ret.value.text = lis[i].textContent; /* Use innerText on old IE */
      }
    }

    return JSON.stringify(ret);
  ");

  if (ret == null) {
    throw new InvalidOperationException("Appium had problems executing script");
  }

  // If we get here, the serialized Javascript object should be available
  string json = (string)ret;

  // Getting the object, this function will be implemented later 
  // and the type returned will be specified as well
  var obj = this.ParseJSON(json);
  // Code will continue later on...  
}

As you can see, in our invocation we pass the body of a Javascript function which will be executed on the device by Appium. Serialization is performed by means of JSON.stringify, an API which was standardized in ECMAScript 5.
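
One detail of JSON.stringify is worth keeping in mind when designing the returned object: properties whose value is undefined are dropped from the output, while null values are kept. If, for instance, an element had no data-id attribute, dataset.id would be undefined and the id property would silently disappear from the string, which could then be a problem for a parser on the test side that expects the property to be present. The sketch below shows the behavior; stringifyKeepingKeys is a hypothetical helper, not part of any library.

```javascript
// undefined properties are omitted, null properties are preserved.
var withUndefined = JSON.stringify({ id: undefined, text: "Item 4" });
var withNull = JSON.stringify({ id: null, text: "Item 4" });
console.log(withUndefined); // {"text":"Item 4"}
console.log(withNull);      // {"id":null,"text":"Item 4"}

// Hypothetical helper: a replacer turns undefined into null so every key survives.
function stringifyKeepingKeys(obj) {
  return JSON.stringify(obj, function (key, value) {
    return value === undefined ? null : value;
  });
}

var kept = stringifyKeepingKeys({ id: undefined, text: "Item 4" });
console.log(kept); // {"id":null,"text":"Item 4"}
```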

Deserializing in any other language, e.g. C#

How do we parse JSON in our test? The nice thing about JSON is that practically every programming language has at least one API to parse it. C# is no different: it provides a class called JavaScriptSerializer in the System.Web.Script.Serialization namespace. This class, however, gets a bit tricky to use when deserializing a JSON string which was not serialized from a concrete type using the same class (which is exactly our case). For this reason we are going to rely on .NET's WCF and use the DataContractJsonSerializer, which requires a bit more code, but gives your tests a better and more consistent shape.

The first thing we need is a C# representation of the object we will use to exchange data. In WCF we achieve this by means of Data Contracts which are a WCF-specific concept describing the information being sent between two endpoints. The following code will actually describe the ret structure we used in the previous Javascript chunk.

 [DataContract]
public class SelectedItemReturnStruct {
  private int status;
  private string message;
  private SelectedItemValue value;

  /// Data member's Name must match Javascript object ret's property
  [DataMember(Name = "status", IsRequired = true)]
  public int Status { 
    get { return this.status; } 
    set { this.status = value; } 
  }

  /// Data member's Name must match Javascript object ret's property
  [DataMember(Name = "message", IsRequired = true)]
  public string Message { 
    get { return this.message; } 
    set { this.message = value; } 
  }

  /// Data member's Name must match Javascript object ret's property
  [DataMember(Name = "value", IsRequired = true)]
  public SelectedItemValue Value { 
    get { return this.value; } 
    set { this.value = value; } 
  }
}

[DataContract]
public class SelectedItemValue {
  private string id;
  private string text;

  /// Data member's Name must match Javascript object ret's property
  [DataMember(Name = "id", IsRequired = true)]
  public string Id { 
    get { return this.id; } 
    set { this.id = value; } 
  }

  /// Data member's Name must match Javascript object ret's property
  [DataMember(Name = "text", IsRequired = true)]
  public string Text { 
    get { return this.text; } 
    set { this.text = value; } 
  }
}

You will also need to import the System.Runtime.Serialization namespace to get access to the DataContract and DataMember attributes.

The key point about data contracts is that they provide a mapping between objects of different types. The DataContract attribute tells the framework that an instance of a class can be created as long as the other entity sends an object of a class implementing the same contract. In this case, we will receive a JSON string describing an object with some properties. The serializer will use the contract to associate each Javascript property with a property of our class. The framework looks for matching names, which is why we must ensure that the property names we use in the Javascript object match the Name values of the DataMember attributes in our classes.

We are now ready to perform deserialization by implementing the ParseJSON method introduced in the previous code.

 public SelectedItemReturnStruct ParseJSON(string json) {
  DataContractJsonSerializer js = new DataContractJsonSerializer(typeof(SelectedItemReturnStruct));

  // UTF-8 keeps non-ASCII characters (e.g. accented restaurant names) intact;
  // the using block disposes the stream even if ReadObject throws
  using (MemoryStream ms = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(json))) {
    return (SelectedItemReturnStruct)js.ReadObject(ms);
  }
}

The previous code assumes that your test project references the System.Runtime.Serialization assembly and that your file includes the following directive: using System.Runtime.Serialization.Json;. The System.IO namespace is also needed.

We can now complete the test method we left unfinished before.

 [TestMethod]
public void TestSelection() {
  // We were here...
  var obj = this.ParseJSON(json);
  
  if (obj.Status == -1) {
    // No list
    throw new InvalidOperationException("List could not be found!");
  }
  if (obj.Status == 0) {
    // No items
    Assert.Inconclusive();
  }

  // A list could be found and items as well
  if (obj.Value == null) {
    // No item marked as selected ==> Nothing to test
    return;
  }

  string id = obj.Value.Id;
  string text = obj.Value.Text;
  // Test whatever you want from now on...
}

It was not that difficult, and this pattern can be implemented for whatever type of object you need to exchange.
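
If many scripts need to return data this way, the status/message/value envelope can be built in one place. The following is only a sketch of that idea: toEnvelope is a hypothetical helper, and its status codes (1 for success, -1 for failure) are an assumption of this example rather than a convention of Appium or Selenium.

```javascript
// Hypothetical helper that wraps any query in the status/message/value envelope.
function toEnvelope(query) {
  var ret = { status: 1, message: null, value: null };
  try {
    var value = query();
    ret.value = value === undefined ? null : value;
  } catch (err) {
    ret.status = -1;
    ret.message = String(err && err.message ? err.message : err);
  }
  return JSON.stringify(ret); // always a string, whatever the query did
}

var okJson = toEnvelope(function () { return { id: "i3451", text: "Item 4" }; });
var errorJson = toEnvelope(function () { throw new Error("List not found"); });
console.log(okJson);    // {"status":1,"message":null,"value":{"id":"i3451","text":"Item 4"}}
console.log(errorJson); // {"status":-1,"message":"List not found","value":null}
```

On the test side, the same outer data contract can then be reused, with only the type of the value member changing from one query to another.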
