Introducing Linux support on Azure Batch
We are excited to announce today a preview of Azure Batch for Linux virtual machines. This brings the power of Batch “scheduling-as-a-service” to customers with Linux applications and workflows across industries and scientific research.
Under the covers, Batch uses Virtual Machine Scale Sets to deploy and manage Linux virtual machines. The Batch agent that manages job and task execution on compute nodes is written in Python for portability. This complements the existing support in Batch for Cloud Services. VM Scale Sets will also give us additional features down the road, such as custom VM images.
This is a short guide to using Linux on Batch. A detailed document will be available on the Azure Batch documentation site shortly.
- Prerequisite
- Using Linux on Batch
- Command Line Support (Azure xplat CLI)
- Troubleshooting - Using SSH
- Linux Distribution Support Matrix
- Pricing
- References
Prerequisite
Batch Account
A Batch account can be created in the Azure management portal. See here for detailed instructions.
Client SDK
The Batch service provides a set of SDKs for interacting with it.
- The .NET SDK can be downloaded from NuGet.
- The Python SDK is part of the Azure Python SDK on PyPI.
- The Node.js SDK can be found on npm.
- Java support is coming soon.
Using Linux on Batch
If you are not familiar with the Azure Batch service, here is a good step-by-step tutorial on using Batch.
A typical Batch application involves four steps: creating a client, creating a pool, creating a job, and adding tasks to the job. The behavior is the same as with a normal Batch client; the only difference is how you specify the OS of the pool's virtual machines.
With Linux on Batch, instead of an OS family and version, the client must specify the publisher, offer, SKU, and version of the VM image, plus the node agent SKU id. The supported Linux distributions are listed in the support matrix later in this post. For example, the code in step 2 below creates a pool based on the latest version of Ubuntu 14.04-LTS.
1. Initializing Batch Client
creds = SharedKeyCredentials(account, key)
config = BatchServiceConfiguration(creds, base_url=batch_url)
client = BatchService(config)
2. Creating a pool using the Python SDK
new_pool = PoolAddParameter(id="<pool name>", vm_size="STANDARD_A4")  # example VM size
new_pool.target_dedicated = 4
start_task = StartTask()
start_task.run_elevated = True
start_task.command_line = '<pool prep task cli>'
new_pool.start_task = start_task
# Select the VM image by publisher, offer, SKU, and version
ir = ImageReference(
    publisher="Canonical",
    offer="UbuntuServer",
    sku="14.04.2-LTS",
    version="latest")
# Pair the image with the matching Batch node agent
vmc = VirtualMachineConfiguration(
    image_reference=ir,
    node_agent_sku_id="Batch.Node.Ubuntu 14.04")
new_pool.virtual_machine_configuration = vmc
# Submit the pool to the service (by analogy with client.job.add in step 3)
client.pool.add(new_pool)
3. Submitting a job using the Python SDK
pool_info = PoolInformation(pool_id=pool_id)
job = JobAddParameter(id=jobId, pool_info=pool_info)
job_preparation_task = JobPreparationTask()
job_preparation_task.command_line = "<start up cli>"
job_preparation_task.run_elevated = True  # If specified, the task will run as su
job.job_preparation_task = job_preparation_task
client.job.add(job)
4. Adding a task to the job
taskId = '<task name>'
taskCmdLine = '<task cli>'
task = TaskAddParameter(id=taskId, command_line=taskCmdLine)
task.run_elevated = True  # If specified, the task will run as su
client.task.add(jobId, task)  # jobId is the id of the job submitted in step 3
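Real workloads usually add many tasks rather than one. The following is a minimal sketch of one way to prepare the id and command line for each task before passing them to TaskAddParameter as in step 4. The helper name build_task_specs, the task id pattern, and the input file names are illustrative assumptions, not part of the Batch SDK.

```python
def build_task_specs(input_files, command_template):
    """Return one {'id', 'command_line'} dict per input file.

    Hypothetical helper: each dict holds the two values that step 4
    passes to TaskAddParameter(id=..., command_line=...).
    """
    specs = []
    for index, name in enumerate(input_files):
        specs.append({
            "id": "task-{0:03d}".format(index),
            "command_line": command_template.format(input=name),
        })
    return specs


# Example input files and command line are made up for illustration.
specs = build_task_specs(["a.txt", "b.txt"], "/bin/bash -c 'wc -l {input}'")
for spec in specs:
    print(spec["id"], spec["command_line"])
```

Each resulting pair can then be submitted in a loop using the same call shown in step 4.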
For advanced features, refer to the MSDN and Python API reference documentation.
Command Line Support (Azure xplat CLI)
Azure Batch support has been added to the latest version of the Azure Command-Line Interface (xplat CLI). Install the Azure CLI via npm to interact with the Batch service from the console.
The initial release provides account management and job/task submission through the "azure batch" subcommands. Support for Linux pool management will follow shortly.
Troubleshooting - Using SSH
RDP cannot be used with Linux VMs. Instead, Batch exposes SSH access to all nodes in the pool. The following Python code shows how to find the SSH connection information for each node.
nodes = client.compute_node.list(pool_id)
for node in nodes:
    # Look up the public IP address and SSH port for this node
    login = client.compute_node.get_remote_login_settings(pool_id,
                                                          node.id)
    print("{0} {1} {2} {3}".format(node.id,
                                   node.state,
                                   login.remote_login_ip_address,
                                   login.remote_login_port))
Sample output, showing the IP address and SSH port of each node:
tvm-3436469628_1-20160320t055249z ComputeNodeState.idle 52.160.94.74 50002
tvm-3436469628_2-20160320t055249z ComputeNodeState.idle 52.160.94.74 50003
tvm-3436469628_3-20160320t055249z ComputeNodeState.idle 52.160.94.74 50000
tvm-3436469628_4-20160320t055249z ComputeNodeState.idle 52.160.94.74 50001
Remember to add a user beforehand so that you have credentials for SSH logon. Once logged on, you can set up key-based authentication just as on a normal Linux VM. The following Python code adds a user. It prompts for a password, but it is also possible to specify a public key for key-based authentication instead.
import datetime
import getpass

pool_id = ...
username = ...
password = getpass.getpass()  # prompts for the password without echoing it
user = ComputeNodeUser()
user.name = username
user.password = password
user.is_admin = True
# The account expires 30 days from now
user.expiry_time = (datetime.datetime.today() + datetime.timedelta(days=30)).isoformat()
# Add the user to every node in the pool
nodes = client.compute_node.list(pool_id)
for node in nodes:
    client.compute_node.add_user(pool_id, node.id, user)
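Once a user exists on the nodes, the IP address and port from the login settings can be turned into an ssh command line. The sketch below assumes the standard OpenSSH client and its -p option; the helper name ssh_command and the user name "batchuser" are made up for illustration, and the node values are copied from the sample output above.

```python
def ssh_command(username, ip_address, port):
    """Build the ssh invocation for one compute node (hypothetical helper)."""
    return "ssh -p {0} {1}@{2}".format(port, username, ip_address)


# Node id, IP address, and port taken from the sample output above.
sample_nodes = [
    ("tvm-3436469628_1-20160320t055249z", "52.160.94.74", 50002),
    ("tvm-3436469628_2-20160320t055249z", "52.160.94.74", 50003),
]

for node_id, ip_address, port in sample_nodes:
    # "batchuser" stands in for the name passed to add_user
    print(node_id, ssh_command("batchuser", ip_address, port))
```

In the sample output the nodes share one public IP address and differ only by port, so the port is what selects the node.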
Linux Distribution Support Matrix
Distro | Publisher | Offer | SKU | NodeAgentSKUId |
Ubuntu | Canonical | UbuntuServer | 14.04.0-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.1-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.2-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.3-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.4-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 15.10 | batch.node.debian 8 |
Debian | Credativ | Debian | 8 | batch.node.debian 8 |
SUSE | SUSE | openSUSE | 13.2 | batch.node.opensuse 13.2 |
SUSE | SUSE | openSUSE-Leap | 42.1 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES | 12 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES | 12-SP1 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES-HPC | 12 | batch.node.opensuse 42.1 |
CentOS | OpenLogic | CentOS | 7.0 | batch.node.centos 7 |
CentOS | OpenLogic | CentOS | 7.1 | batch.node.centos 7 |
CentOS | OpenLogic | CentOS | 7.2 | batch.node.centos 7 |
Oracle Linux | Oracle | Oracle-Linux-7 | OL70 | batch.node.centos 7 |
Note that this list is not exhaustive and may be subject to change. The client should use the ListNodeAgentSKU REST API call to get the full list of images supported on the account.
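Because the node agent SKU id must match the chosen image, a client may want to look it up rather than hard-code it. The following is a minimal sketch that copies the support matrix above into a local dictionary. The names NODE_AGENT_SKUS and node_agent_sku_for are assumptions for illustration and are not part of the SDK; the authoritative list remains the one returned by ListNodeAgentSKU.

```python
# (publisher, offer, sku) -> node agent SKU id, copied from the support
# matrix above. This is a static snapshot and may be out of date.
NODE_AGENT_SKUS = {
    ("Canonical", "UbuntuServer", "14.04.0-LTS"): "batch.node.ubuntu 14.04",
    ("Canonical", "UbuntuServer", "14.04.1-LTS"): "batch.node.ubuntu 14.04",
    ("Canonical", "UbuntuServer", "14.04.2-LTS"): "batch.node.ubuntu 14.04",
    ("Canonical", "UbuntuServer", "14.04.3-LTS"): "batch.node.ubuntu 14.04",
    ("Canonical", "UbuntuServer", "14.04.4-LTS"): "batch.node.ubuntu 14.04",
    ("Canonical", "UbuntuServer", "15.10"): "batch.node.debian 8",
    ("Credativ", "Debian", "8"): "batch.node.debian 8",
    ("SUSE", "openSUSE", "13.2"): "batch.node.opensuse 13.2",
    ("SUSE", "openSUSE-Leap", "42.1"): "batch.node.opensuse 42.1",
    ("SUSE", "SLES", "12"): "batch.node.opensuse 42.1",
    ("SUSE", "SLES", "12-SP1"): "batch.node.opensuse 42.1",
    ("SUSE", "SLES-HPC", "12"): "batch.node.opensuse 42.1",
    ("OpenLogic", "CentOS", "7.0"): "batch.node.centos 7",
    ("OpenLogic", "CentOS", "7.1"): "batch.node.centos 7",
    ("OpenLogic", "CentOS", "7.2"): "batch.node.centos 7",
    ("Oracle", "Oracle-Linux-7", "OL70"): "batch.node.centos 7",
}


def node_agent_sku_for(publisher, offer, sku):
    """Return the node agent SKU id for an image, or raise KeyError."""
    key = (publisher, offer, sku)
    if key not in NODE_AGENT_SKUS:
        raise KeyError("image not in the local support matrix: {0}".format(key))
    return NODE_AGENT_SKUS[key]


print(node_agent_sku_for("Canonical", "UbuntuServer", "14.04.2-LTS"))
```

The returned string is the value to pass as node_agent_sku_id when building the VirtualMachineConfiguration in step 2.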
Pricing
Azure Batch is built on Azure Cloud Services and Azure Virtual Machines technology. Batch itself is offered in a free tier, which means you are charged only for the compute resources you use. When creating a pool (either through https://portal.azure.com or the API), you choose the type of pool to create. If you choose Cloud Service, which is Windows only, you are charged based on the Cloud Services pricing meters. If you choose Virtual Machine, which provides Linux, you are charged based on the Linux Virtual Machines pricing meters. At the time of this blog post, the Linux VM price is lower than the Cloud Service/Windows VM price.
References
MSDN API reference (Coming soon.)
Samples (Coming soon.)