How To Write a Spintax Parser in Python

Greg Bizup
Mar 18, 2024

Spinning Endlessly Into The Void

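Digital marketers often need to randomly vary the content of their automatic replies and outreach messages. Spintax offers a simple way to get this done: alternatives are wrapped in curly braces and separated by pipes, and one of them is picked at random each time the text is processed. For example, hello {world|everybody} may be processed as hello world or as hello everybody.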

In more complicated scenarios, spintax may be nested. For example, you may have {hello {world|everybody}|Uhh, goodbye.}. Nesting allows a lot more variation in the messages we send, but makes the parsing algorithm a little more complex.
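To see why nesting complicates things, here is a small sketch (an illustration, not part of the parser) comparing two regular expressions on the nested example. The variable names are made up for this snippet.

```python
import re

nested = "{hello {world|everybody}|Uhh, goodbye.}"

# A naive non-greedy match starts at the first "{" and stops at the
# first "}", so it cuts straight through the nested group
naive = re.findall(r"\{.*?\}", nested)
print(naive)  # ['{hello {world|everybody}']

# Forbidding braces inside the match finds only the innermost group
innermost = re.findall(r"\{[^{}]*\}", nested)
print(innermost)  # ['{world|everybody}']
```

The second pattern is the reason an innermost-first approach works: each match is guaranteed to be a flat list of options.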

A Simple Python Spintax Processing Algorithm

Curious about how spintax works and how it is processed? This article will show you one of the many ways you can handle spintax parsing in Python. Before I go any further: there's already a library for this, which you can check out on this GitHub repo. While its algorithm works nicely and it has a PyPI package that you can install using pip, I found the logic behind the code somewhat difficult to follow.

It turns out there are already plenty of spintax implementations out there in PHP, JavaScript, and Python. The one I describe here is the cleanest, I think. For those who are interested in a Java implementation, I have also ported this to Java, and it can be found on my GitHub.

The Steps To Our Algorithm

Now is a good time to identify how exactly we intend to process our spintax. We need an algorithm that can handle both nested spintax and parallel (side-by-side) spintax groups. Here are the general steps I chose to follow:

  1. If spintax is not present, just return the plain string
  2. Identify the innermost spintax groups, for example {inner|group} in {outer group|{inner|group}}
  3. Spin the innermost groups
  4. Replace each spintax group in the input string with its spun result
  5. Repeat the process recursively until no spintax remains
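The steps above can be traced on the example from step 2. The following is a minimal sketch for illustration only, not the parser itself: it always picks the first option instead of a random one so the passes are predictable, and the names INNERMOST and trace_passes are made up for this snippet.

```python
import re

# A group with no braces inside it, i.e. an innermost group
INNERMOST = re.compile(r"\{([^{}]*)\}")


def trace_passes(text: str) -> list[str]:
    """Return the string after each pass, always picking the first option."""
    passes = [text]
    while INNERMOST.search(text):
        # Replace every innermost group with its first alternative
        text = INNERMOST.sub(lambda m: m.group(1).split("|")[0], text)
        passes.append(text)
    return passes


for step in trace_passes("{outer group|{inner|group}}"):
    print(step)
# {outer group|{inner|group}}
# {outer group|inner}
# outer group
```

Each pass removes one level of nesting, so the number of passes equals the nesting depth.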

The algorithm described here uses a few simple functions along with recursion to get clean code that handles both nested and parallel spintax. I will show you the code first, and explain it afterwards:

"""Dead simple spintax parser that uses recursion to parse nested spintax"""

import re
import random

s = "I {love {{python|snek language}|{java|coffee language}}|hate ruby} and thats {that|about it|all she wrote}"

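def has_spintax(s: str) -> bool:
    """Return True if the string still contains at least one spintax group"""
    # Look for a brace pair with no braces inside; unlike ".*", this also
    # matches groups that span several lines
    return re.search(r"\{[^{}]*\}", s) is not None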

def extract_innermost_groups(s: str) -> list[str]:
    """Get only the innermost groups that do not contain any other spintax
    
    e.g. "{hello {world|woop}|hiya}" -> ["{world|woop}"]
    """
    return re.findall(r"\{[^{}]*\}", s)

def choose(s: str) -> str:
    """Choose which string to go with from spintax string

    e.g. {hello|world} -> hello 
    """
    s = re.sub(r"[{}]", "", s)
    return random.choice(s.split("|"))

def spin(s: str) -> str:
    """Recursively spin until no more spintax is available"""
    if not has_spintax(s):
        return s
    for group in extract_innermost_groups(s):
        # Replace one occurrence at a time so identical groups are spun independently
        s = s.replace(group, choose(group), 1)
    return spin(s)


if __name__ == "__main__":
    # Print 30 random variations of the example string
    for _ in range(30):
        print(spin(s))
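The recursive version calls itself once per level of nesting. As an alternative sketch, not the implementation described in this article, the same innermost-first idea can be written as a loop using re.subn with a callback. The name spin_iterative and the optional rng parameter (handy for reproducible output in tests) are assumptions introduced here for illustration.

```python
import random
import re
from typing import Optional

# A group with no braces inside it, i.e. an innermost group
INNERMOST = re.compile(r"\{([^{}]*)\}")


def spin_iterative(text: str, rng: Optional[random.Random] = None) -> str:
    """Spin innermost groups pass by pass until none are left."""
    rng = rng if rng is not None else random.Random()
    while True:
        # subn calls the lambda once per match, so identical groups
        # are each chosen independently
        text, replaced = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if replaced == 0:
            return text


# Passing a seeded Random makes the result repeatable between runs
print(spin_iterative("hello {world|everybody}", random.Random(0)))
```

Because the callback runs once per match, a string like {a|b} {a|b} can produce mixed results such as a b, which is usually what you want from side-by-side groups.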

Happy spinning!