Simple enumaration

Simple enumeration

There are times in life when it is useful to enumerate all possible ordered configurations under a certain length for a given set of symbols.

A set of symbols is sometime called an alphabet, and a finite, ordered combination of elements of the alphabet is called a word.

Let’s write a simple program that will enumerate all possible words up to a given length on a given alphabet. We’ll be using Python, but the basic logic should be easy to adapt to most other high level programming languages.

The basic idea is to create a small procedure that generates all words for a given length, then call it for each length up to a certain maximum length.

Our most basic function with take the form

enumerate(alphabet, length)

with alphabet being a list of characters and length being a positive integer. One could argue that a set would have been the Python data structure most suited to represent an alphabet, since it has no doubles. In practice, however, we do make ample use of the fact that a Python list can be indexed. It is possible to convert a set to a list with the command

l = [ a for a in s ]

with s being a set.

The body of this function will be very simple: we will simply call another function that enumerates all tuples of a given length for all lengths up to the maximum desired length.

def enumerate(alphabet, length):
    for i in range(length + 1):
        enumerate_of_length(alphabet, i)

The next step will be to define the enumerate_of_length function. This function will use the indexability of lists. We start by defining an ‘index vector’, which is initialized to the zero vector of length n. We then print the word associated with this vector, and increment the vector.

def enumerate_of_length(alphabet, n):
    v = [0 for i in range(n)]
    l = len(alphabet) - 1
    while v != [l for i in range(n)]:
        print([alphabet[i] for i in v])
        inc(v, l)
    print([alphabet[l] for i in range(n)])

The process of incrementing the vector is described by the following code:

def inc(v, l):
    i = 0
    inc_idx(v, i, l)

def inc_idx(v, i, l):
    if v[i] == l:
        v[i] = 0
        inc_idx(v, i + 1, l)
    else:
        v[i] += 1

We start at the leftmost character. If it is equal to the “last” character in the alphabet, we move to the second to leftmost character. If that one is also at ther “maximum” value for characters in the alphabet, we go right one more character, and repeat the process.

In other words, we increment the characters by cycling through their order in the list, starting from the rightmost character.

The full source code is

def enumerate(alphabet, length):
    for i in range(length + 1):
        enumerate_of_length(alphabet, i)

def enumerate_of_length(alphabet, n):
    v = [0 for i in range(n)]
    l = len(alphabet) - 1
    while v != [l for i in range(n)]:
        print([alphabet[i] for i in v])
        inc(v, l)
    print([alphabet[l] for i in range(n)])

def inc(v, l):
    i = 0
    inc_idx(v, i, l)

def inc_idx(v, i, l):
    if v[i] == l:
        v[i] = 0
        inc_idx(v, i + 1, l)
    else:
        v[i] += 1



enumerate(['a','b','c'], 3)

This concludes the tutorial on how to enumerate all words of less than a certain length on a given alphabet in Python. A practical application of this is brute forcing passwords.

🏡