New Azure Container Instance Vulnerability — #Azurescape — What to Do?

David Okeyode
Sep 9, 2021


UPDATED:

Microsoft and the threat research unit at Palo Alto Networks (Unit42) jointly disclosed a severe vulnerability that a malicious Azure user could have exploited to execute code on other users’ containers and steal customer secrets! The vulnerability has been tagged #Azurescape. Microsoft has since fixed part of the underlying flaw and notified customers that could have been impacted — https://msrc-blog.microsoft.com/2021/09/08/coordinated-disclosure-of-vulnerability-in-azure-container-instances-service/ (from the information in the disclosure, it appears the flaw was fixed on 31st August 2021).

This is the second cross-tenant/cross-account vulnerability disclosed in Azure in the past few weeks after being identified by security researchers! The diligent mind behind this one is @yuval_avrahami, a cloud security researcher at Palo Alto Networks.

The recent disclosures raise questions about trust in multi-tenant cloud platform services. To achieve ease of use, speed, and reduced management cost, the price we pay is less visibility! But without visibility and the ability to scrutinize/audit the underlying implementations, can we really trust what is going on “under the hood”? I’ll share some of my thoughts on this in a follow-up post, so watch out for it.

What is ACI?

ACI is a Container as a Service (CaaS) offering in Azure (similar to AWS Fargate and Google Cloud Run). It provides an option to run containers serverlessly. Customers schedule a single container image or a group of containers (called “container groups”, similar to the concept of a pod in Kubernetes) to run on platform-managed infrastructure — no need to deal with VMs or orchestrators.

ACI can run both Linux and Windows containers, but most customer deployments will be Linux, as Windows containers in ACI lack important capabilities like container groups, volume mounting, and private network integration.

Apart from running single containers or container groups, ACI can also be leveraged as Azure Kubernetes Service (AKS) virtual nodes (in scale-out scenarios). This uses a “virtual kubelet” implementation to provision pods inside ACI from AKS.

REFERENCE: https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview#linux-and-windows-containers

REFERENCE: For sample use cases for ACI, review this link.

What is the vulnerability that was disclosed?

According to Microsoft’s documentation, ACI customers don’t (or shouldn’t) have direct access to the underlying host OS, container runtime engine or infrastructure. A security researcher decided to test this claim!

The researcher used a tool that they created called “WhoC” (which I blogged about here) to enumerate the underlying container runtime, only to discover that ACI was running an outdated and vulnerable version of “runC” released in 2016! The researcher then used a previously known exploit, CVE-2019-5736, to escape from the container to the node, and detected that the node was managed by Kubernetes (whose version was also out of date!). Finally, the researcher discovered that when a bridge pod running in the cluster was “tricked” into sending an “exec” request to the node, it leaked a privileged Kubernetes service account token that granted access to the entire cluster, making it possible to compromise nodes and containers running other customers’ workloads!

Read the full details of the research here — https://unit42.paloaltonetworks.com/azure-container-instances/

Some urgent actions!

Before you start to investigate ACI and other Azure logs for suspicious events, what actions should you take urgently?

1. Review container instance resources in your Azure subscriptions that pass sensitive data or credential information. This is a recommendation that Microsoft called out on the disclosure blog!

ACI provides multiple ways to pass sensitive configuration information or credentials to containers in a container group at runtime.

  • Passing an “environment variable” (which makes the value readable in plain text both at the management plane and at runtime)
  • Passing a “secure environment variable” (which hides the value from the management plane, but it is still accessible in plain text at runtime)
  • Using a “secret volume” (which works similarly to a Kubernetes secret: it passes sensitive data in files mounted as a volume; the data is not visible at the management plane, but it is not encrypted at runtime)
  • In rare cases, a “fileShare volume” (which mounts an Azure file share in read/write mode to the container instance); the common use case is persisting data in an external store, which presents a different challenge if the data being stored or read is sensitive
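To make these options concrete, the illustrative fragment below (ARM template style; image, names, and values are all placeholders) shows a plain environment variable, a secure environment variable, and a secret volume as they appear in a container group definition. Note that secret volume values are supplied base64-encoded:

```json
{
  "properties": {
    "containers": [
      {
        "name": "app",
        "properties": {
          "image": "myregistry.azurecr.io/app:v1",
          "environmentVariables": [
            { "name": "LOG_LEVEL", "value": "info" },
            { "name": "API_TOKEN", "secureValue": "s3cr3t-value" }
          ],
          "volumeMounts": [
            { "name": "app-secrets", "mountPath": "/mnt/secrets" }
          ]
        }
      }
    ],
    "volumes": [
      {
        "name": "app-secrets",
        "secret": { "apiKey": "czNjcjN0" }
      }
    ]
  }
}
```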

Cloud-native security solutions with CSPM capabilities like Prisma Cloud help you automate these types of checks across your Azure cloud footprint. You can also use the following checks with Azure Policy — note that these are not built-in policies, but custom checks you would create:

a. Azure Container Instance uses environment variables
Assessment: containers[].environmentVariables exists
Action: If sensitive data like API tokens or credential information was passed here before the flaw was fixed by Microsoft on 31st August 2021, regenerate and revoke it immediately! Then review the logs from before the revocation for suspicious usage. The absence of suspicious events does not mean an attacker does not have the key; they may simply be lying low.

b. Azure Container Instance uses a secret volume
Assessment 1: volumes[].secret exists
Assessment 2: containers[].volumeMounts exists
Action: If a secret volume was used to pass sensitive data like API tokens and credential information before the flaw was fixed, regenerate and revoke them immediately! Then review logs for usage.

c. Azure Container Instance uses a file share volume
Assessment: volumes[].azureFile exists
Action: Review the mounted file share for potentially sensitive information. The volume is mounted in read/write mode, so check not only for data access but also for potentially malicious changes. Hopefully, you have Azure diagnostic logs enabled for the storage account and the file service (if not and the workload is critical for your organization, why not?). Follow your sensitive-data-access investigation process for this.
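If you prefer to script the three assessments above rather than build Azure Policy definitions, they can be sketched as simple checks over a container group’s ARM JSON (a sketch only; property paths follow the Microsoft.ContainerInstance/containerGroups schema):

```python
# Sketch of checks a, b and c over a container group's ARM JSON.
# Property paths follow the Microsoft.ContainerInstance/containerGroups schema.

def uses_environment_variables(group: dict) -> bool:
    """Check a: any container in the group declares environment variables."""
    containers = group.get("properties", {}).get("containers", [])
    return any(c.get("properties", {}).get("environmentVariables")
               for c in containers)

def uses_secret_volume(group: dict) -> bool:
    """Check b: a secret volume exists and at least one container mounts a volume."""
    props = group.get("properties", {})
    has_secret = any("secret" in v for v in props.get("volumes", []))
    has_mount = any(c.get("properties", {}).get("volumeMounts")
                    for c in props.get("containers", []))
    return has_secret and has_mount

def uses_file_share_volume(group: dict) -> bool:
    """Check c: an Azure file share volume is attached to the group."""
    return any("azureFile" in v
               for v in group.get("properties", {}).get("volumes", []))

# Example: a group passing an API token via an environment variable.
sample = {"properties": {"containers": [{"properties": {
    "environmentVariables": [{"name": "API_TOKEN", "secureValue": "..."}]}}]}}
print(uses_environment_variables(sample))  # True -> review and rotate
```

You would feed these functions the JSON returned when exporting or listing your container group resources.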

It is worth mentioning that the recommended way to pass sensitive configuration information in ACI is to use a managed identity with RBAC to programmatically retrieve the sensitive data from a key vault resource. But as I touch on in my second point below, even that can be implemented in a way that increases impact in the case of an incident like this.
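For illustration, the sketch below shows what that retrieval looks like at the REST level using only the Python standard library (vault and secret names are placeholders; in practice you would use the azure-identity and azure-keyvault-secrets SDKs instead). The managed identity token comes from the instance metadata service (IMDS), which is only reachable from inside the instance:

```python
import json
import urllib.request

# Token request to the Azure Instance Metadata Service (IMDS); the
# resource parameter is the URL-encoded form of https://vault.azure.net
IMDS_TOKEN_URL = (
    "http://169.254.169.254/metadata/identity/oauth2/token"
    "?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net"
)

def get_vault_token() -> str:
    """Obtain a Key Vault access token via the container's managed identity."""
    req = urllib.request.Request(IMDS_TOKEN_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]

def get_secret(vault_name: str, secret_name: str) -> str:
    """Read a secret with the Key Vault REST API (api-version 7.2)."""
    url = (f"https://{vault_name}.vault.azure.net/secrets/"
           f"{secret_name}?api-version=7.2")
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {get_vault_token()}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["value"]
```

Nothing sensitive is baked into the image or the container group definition; the trade-off is that anyone with a shell inside the container can make the same two calls, which is exactly the risk discussed next.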

2. Review container instance resources in your Azure subscriptions that have an associated managed identity.

Managed identities allow an Azure service to assume permissions for accessing other Azure services (or other Azure AD protected resources) without having credentials in code (similar to an IAM role in AWS). Currently, about 34 services in Azure support this capability, including ACI (in preview). If an attacker can gain shell access to a container in ACI, they could potentially request and obtain an OAuth token, which can then be used to attack other Azure services that the identity has been granted access to (the default token lifetime is 24 hours, and it cannot be revoked). I showed an example of this in a demonstration here: https://youtu.be/P-mtmB4xYxI?t=1960

OAuth access token issued with ACI managed identity
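These tokens are standard JWTs, so you can see for yourself how long a leaked token remains usable by decoding its payload (a sketch only; it does no signature verification, which is fine for inspection):

```python
import base64
import json
import time

def token_seconds_remaining(jwt: str) -> float:
    """Return seconds until the token's 'exp' claim (payload decode only;
    the signature is not verified)."""
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["exp"] - time.time()

# Example with a fabricated token whose 'exp' is 24 hours away.
payload = {"exp": int(time.time()) + 24 * 3600}
fake_jwt = "e30." + base64.urlsafe_b64encode(
    json.dumps(payload).encode()).decode().rstrip("=") + ".sig"
print(round(token_seconds_remaining(fake_jwt) / 3600))  # ~24 (hours)
```

That 24-hour window, combined with the fact that the token cannot be revoked, is why rotation of the identity (not just the secrets) matters after a suspected compromise.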

a. Azure Container Instance uses a managed identity
Assessment: identity exists
Action: Review the access granted to the identity. There is very little good reason for a managed identity to be granted permissions on a wide scope such as a subscription or a management group. This is just bad practice! You will want to review Azure activity logs for suspicious events performed with the identity on the management plane if it has that access.
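As part of that review, flagging overly broad role-assignment scopes can be scripted. A sketch over standard Azure scope strings (the function name is mine):

```python
# Sketch: flag role-assignment scopes broader than a resource group.
# Scope strings follow the standard Azure resource ID format.

def scope_is_too_broad(scope: str) -> bool:
    parts = [p for p in scope.split("/") if p]
    # Management group scope, e.g.
    # /providers/Microsoft.Management/managementGroups/mg1
    if parts[:1] == ["providers"] and "managementGroups" in parts:
        return True
    # Bare subscription scope, e.g. /subscriptions/<id>
    return parts[:1] == ["subscriptions"] and len(parts) == 2

print(scope_is_too_broad("/subscriptions/1111"))                     # True
print(scope_is_too_broad("/subscriptions/1111/resourceGroups/rg1"))  # False
```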

If the identity is used to grant access to a key vault resource, verify whether the permissions were granted using an “access policy” or using “RBAC”. This matters because the “access policy” model for key vault is scoped at the resource level and could potentially be used to retrieve other sensitive data in the vault (hopefully you have diagnostic logs enabled to check for this; if not and the vault is used for a production workload, why not?). The action here is to regenerate the identity so that issued OAuth access tokens cannot be refreshed!

Other security practices/implementation to look into

Apart from investigating ACI instances and other Azure logs, what else can you do to protect yourself going forward? The two vulnerabilities that we’ve seen in the last few weeks for Azure won’t be the last. It is important to design security in a way that assumes such vulnerabilities will occur. Here are some practical ways to implement this principle in relation to this particular vulnerability:

1. Your containers should be minimalist where possible

Where possible, the container images for your applications should be super-minimalist or “distroless”. This means they should contain only an application and its runtime dependencies, without a package manager, shell, or OS. This limits the ability of an attacker to connect to the container over the API! The screenshot below shows a sample error from when I attempted to connect to a distroless container in ACI.

Connecting to a distroless image in ACI
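For reference, a distroless image is typically produced with a multi-stage build like the sketch below (the base images are real, but the Go app and paths are placeholders); the final stage contains no shell for az container exec to attach to:

```dockerfile
# Build stage: full toolchain image (placeholder Go app).
FROM golang:1.17 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app

# Final stage: distroless base with no shell or package manager.
FROM gcr.io/distroless/static
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```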

2. Implement runtime protection for production container apps regardless of where or how you run them — Azure Kubernetes Service (AKS), Azure RedHat OpenShift (ARO), Virtual Machines, App Service, Functions and even serverless services like Azure Container Instances (ACI).

Runtime protection capabilities allow you to monitor aspects of a running container such as process executions, file system calls, and network connections. Policies can then be applied to either alert or take action when a violating event occurs. Vendors take different approaches depending on the container runtime environment, but for a serverless environment like ACI, I like the Prisma Cloud app-embedded defender approach, as no code change is required and it can be automated in a pipeline.

With a simple command like the one below, runtime protection can be embedded into a Dockerfile without having to modify any code. This can be done automatically in a pipeline using the twistcli tool or supported extensions like those for GitHub Actions and Azure DevOps.

$ ~/twistcli app-embedded embed -u $TWISTLOCK_USER -p $TWISTLOCK_PASSWORD --address $TWISTLOCK_CONSOLE --app-id $APP_ID --data-folder "/tmp" Dockerfile

Centralized policies can then be configured to monitor or prevent unauthorized processes or network connections (see screenshot below).

See this link for a walkthrough of Prisma Cloud runtime protection for ACI — https://github.com/davidokeyode/prismacloud-workshops-labs/blob/main/workshops/azure-cloud-protection-pcce/modules/11-protect-serverless-container-workloads.md

To get into more detail on Azure security, please check out my books: Implementing Azure Security Technologies (defense) and Penetration Testing Azure for Ethical Hackers, co-authored with @kfosaeen (offense).



Written by David Okeyode

Author of four books on cloud security — https://amzn.to/2Vt0Jjx. I also deliver beginner-to-advanced cloud security training to organizations.
