Revisiting EC2 Instance IDs

Back in September, I published “Anatomy of an EC2 Resource ID”, in which I pointed out some curious patterns in EC2’s ID scheme and proposed a method of “decoding” these patterns to reveal an underlying serial number. In that post I was careful to write that “while the patterns are indisputable, there remain unknowns and quirks that remind us that such ‘black box’ observation has its limits”.

This week, the black box became a little bit whiter.

Sören Bleikertz, a computer science student writing his Master’s thesis on EC2 security, poked into the Xen hypervisor used by EC2 and made some observations regarding EC2’s underlying architecture. Among his findings on the storage and networking configurations, Sören pointed out that each instance is given a unique name (the “Xen domain”) such as dom_32504936, and that this number seems to behave like a serial number, growing from day to day. Sound familiar yet?

Well, it turns out that the number in this Xen domain name is none other than the underlying instance serial number uncovered in my previous research! This leads to an important conclusion: the decoding method was accurate. The serial number exists and, based on everyone’s input, we even got the formula right.

With Sören’s technique at hand we can now uncover the constants needed for all EC2 regions. Until now, only us-east-1, which thanks to RightScale enjoyed a 3-year history of data, gave us enough to extract the constants; for the other regions we simply did not have enough data. Surprisingly, it turns out that the constants are in fact identical for all regions. What threw us off the scent is that, unlike us-east-1, which very likely started its serial numbers from zero, the other regions did not. For example, the serial numbers for the 3-month-old us-west-1 region are already in the range of 752 million. Those for eu-west-1 are in the 500 million range. We can safely assume that hundreds of millions of instances have not in fact been spun up in those regions. What makes more sense is that each region was assigned a different starting point in order to ensure globally unique instance IDs.

An additional finding of Sören’s is that the image file for the root disk points to a path on the VM host such as /mnt/instance_image_store_3/262768. It turns out that the number at the end of this path is, again, simply the AMI ID – decoded. For example, we can re-encode 262768 to yield ami-19a34270, which is Alestic’s Ubuntu Karmic Base image. Similar to instance IDs, the underlying image ID also seems to have different ranges in each AWS region.
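The re-encoding step can be sketched as a small Ruby helper. This is only an illustration: encode_ec2_id is a hypothetical name, and it assumes that image IDs use the same XOR constants as the instance ID formula in this post, which is consistent with the 262768 example.

```ruby
# Sketch: encode an underlying serial number into an EC2-style resource ID.
# encode_ec2_id is a hypothetical helper, not part of any AWS tool. The XOR
# constants are the ones observed for instance IDs and are assumed to apply
# to image IDs as well.
def encode_ec2_id(prefix, serial)
  c1   = serial >> 24            # bits 24 and up
  c2   = (serial >> 16) & 0xFF   # bits 16..23
  c3   = serial & 0xFFFF         # low 16 bits
  c3_1 = (serial >> 8) & 0xFF    # high byte of c3
  c3_2 = serial & 0xFF           # low byte of c3

  d1 = c1 ^ c3_2 ^ 0x69
  d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
  d3 = c3 ^ 0x4000

  format("%s-%02x%02x%04x", prefix, d1, d2, d3)
end

puts encode_ec2_id("ami", 262768)  # prints ami-19a34270
```

The same helper with the prefix "i" reproduces the instance ID encoding, so a single function covers both resource types under the stated assumption.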

As a bonus of Sören’s discoveries and the connection to the IDs, it’s now possible to infer your instance ID (and image ID) locally, without even consulting the EC2 instance metadata. Why someone would prefer this to the metadata is a good question, but it’s a fun exercise nonetheless. Here’s a Ruby script that does just that:

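#!/usr/bin/ruby
# Infers the EC2 instance ID from the Xen domain name of the running VM.
$stderr.puts("Detecting VM domain ID (may take a few moments)")

# Probe domain IDs until xenstore-ls succeeds for one of them.
dom_id = (1..65535).find do |i|
  system("xenstore-ls /local/domain/#{i} > /dev/null 2>&1")
end
abort("Could not detect the VM domain ID") if dom_id.nil?

$stderr.puts("VM domain ID is #{dom_id}")

# The domain name looks like dom_32900610; strip the trailing newline.
dom_name = `xenstore-read /local/domain/#{dom_id}/name`.strip
$stderr.puts("VM domain name is #{dom_name}")

# Split the serial number into the components used by the encoding.
numeric_id = dom_name.split("_").last.to_i
c1 = numeric_id >> 24
c2 = (numeric_id >> 16) & 0xFF
c3 = numeric_id & 0xFFFF
c3_1 = (numeric_id >> 8) & 0xFF
c3_2 = numeric_id & 0xFF

# Apply the XOR constants to obtain the hex fields of the instance ID.
d1 = c1 ^ c3_2 ^ 0x69
d2 = c2 ^ c3_1 ^ 0x40 ^ 0xe5
d3 = c3 ^ 0x4000

instance_id = sprintf("i-%02x%02x%04x", d1, d2, d3)
puts(instance_id)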

This requires xen-utils to be installed on the machine (on Ubuntu, run apt-get install xen-utils-3.3). Here’s an example run:

# ./get_instance_id.rb
Detecting VM domain ID (may take a few moments)
VM domain ID is 1423
VM domain name is dom_32900610
i-6a554602
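Going the other way, the XOR steps in the script can be inverted to recover the serial number from an ID. The following is a minimal sketch, not an official tool: decode_ec2_id is a hypothetical helper name, and it assumes the 8-hex-digit ID format shown in this post.

```ruby
# Sketch: recover the underlying serial number from an EC2-style ID.
# decode_ec2_id is a hypothetical helper that inverts the XOR steps of the
# script in this post; it assumes an ID with exactly 8 hex digits.
def decode_ec2_id(id)
  hex = id.split("-").last
  d1 = hex[0, 2].to_i(16)
  d2 = hex[2, 2].to_i(16)
  d3 = hex[4, 4].to_i(16)

  # Undo the encoding: the low 16 bits come first, since the two upper
  # bytes were XORed with the bytes of c3.
  c3   = d3 ^ 0x4000
  c3_1 = (c3 >> 8) & 0xFF
  c3_2 = c3 & 0xFF
  c1 = d1 ^ c3_2 ^ 0x69
  c2 = d2 ^ c3_1 ^ 0x40 ^ 0xe5

  (c1 << 24) | (c2 << 16) | c3
end

puts decode_ec2_id("i-6a554602")  # prints 32900610
```

The result matches the numeric suffix of the domain name in the example run, dom_32900610.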

Thanks once more to Sören for the great detective work.

Thomas Clayton

Thomas Clayton is a cloud computing blogger. This blog shares his cloud market research and commentary.
