Song Jong
npub1uyk…w258
2024-04-02 08:12:06

# Encoding and Decoding Strings with Mnemonics 🧠💾

Inspired by a post from Vitor Pamplona (npub1gcx…nj5z), I modestly attempted to play with converting strings to mnemonics and vice versa, embedding them in a narrative and discreetly marking words so the original mnemonics can be retrieved. It was likely done in a naive manner, but it's a first attempt, made during insomnia in the wee hours.

## The Process

### Step 1: From String to Mnemonics 📝➡️🔠

We start with an input string and encode it into a sequence of mnemonics. This is done by splitting the string into slices of predefined byte lengths (the valid BIP39 entropy sizes), converting each slice into mnemonic words, and padding the final, shorter slice when necessary.

- **Input String**: `"O Tux do Bananil deseja uma boa noite para todos vocês! Random test&*(@aslkas)Mkax@"`
- **Conversion to Mnemonics**: A list of mnemonic words is generated based on the byte cuts of the input string.
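
To make the idea concrete, here is a minimal sketch of a single slice, assuming the trezor `mnemonic` package that the test code below also uses: a 16-byte slice maps to 12 BIP39 words.

```python
from mnemonic import Mnemonic

# Minimal sketch: 16 bytes of "entropy" become 12 BIP39 words.
mnemo = Mnemonic("english")
slice_16 = "O Tux do Bananil".encode("utf-8")  # exactly 16 bytes
words = mnemo.to_mnemonic(slice_16)
print(words)  # twelve space-separated words from the English wordlist
```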

### Step 2: From Mnemonics to String 🔠➡️📝

We also demonstrate how to convert the list of mnemonic words back to the original string. This process involves interpreting the mnemonics, removing any padding, and reconstructing the string.

- **Recovered String**: The process accurately recovers the original string from the mnemonic words, showcasing the efficacy of this encoding and decoding mechanism.
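
A matching sketch of the reverse direction, again assuming the trezor `mnemonic` package; a short input is padded to 16 bytes with `$` before encoding, and the padding is stripped after decoding:

```python
from mnemonic import Mnemonic

# Minimal round-trip sketch with padding.
mnemo = Mnemonic("english")
padded = "boa noite".encode("utf-8").ljust(16, b"$")  # pad to 16 bytes with '$'
words = mnemo.to_mnemonic(padded)                     # 12 mnemonic words
recovered = bytes(mnemo.to_entropy(words)).rstrip(b"$").decode("utf-8")
assert recovered == "boa noite"
```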

### Step 3: Embedding Mnemonics into a Narrative 📖

With a list of mnemonic words at our disposal, we face the task of weaving them into an engaging narrative. The challenge lies in incorporating each word in sequence, even if some words are repeated, while maintaining a coherent and interesting story.

- **Generated Story**: A captivating narrative that integrates all the mnemonic words in the correct order, using creative context to bring each word to life.

### Step 4: Verifying the Narrative and Retrieving Mnemonics 🔍

To ensure our story faithfully incorporates all the mnemonic words in sequence, we employ a verification function. We then attempt to retrieve the original mnemonics from the story by identifying words marked with a capital letter in the middle, a naive and imperfect way to discreetly highlight mnemonic words (you will surely find something better!).

- **Verification Result**: Indicates whether the narrative successfully incorporated all the mnemonic words in sequence.
- **Recovered Mnemonics**: A list of words recovered from the story, aiming to match the original mnemonics.
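
The marking convention in the story below simply uppercases a word's second letter. A hypothetical helper (not part of the test code) might look like this:

```python
# Hypothetical helper: mark a mnemonic word by uppercasing its second letter,
# matching the convention of the generated story ("execute" -> "eXecute").
def mark_word(word: str) -> str:
    if len(word) < 3:
        return word  # too short to hide a capital in the middle
    return word[0] + word[1].upper() + word[2:]

print(mark_word("execute"))  # eXecute
```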

Below is the test code:
```python
from mnemonic import Mnemonic
import re

def split_and_mnemonize(input_str):
    """
    Splits the input string into predefined byte lengths and converts those bytes into mnemonic words.

    :param input_str: The input string to be processed.
    :return: A list of mnemonic words corresponding to the byte slices of the input string.
    """
    # Convert string to bytes
    input_bytes = input_str.encode('utf-8')

    # Predefined slice sizes (the valid BIP39 entropy lengths, in bytes)
    slice_sizes = [32, 28, 24, 20, 16]

    # List to store byte slices
    byte_parts = []

    # Split the bytes greedily, largest valid slice first
    while input_bytes:
        for size in slice_sizes:
            if len(input_bytes) >= size:
                byte_parts.append(input_bytes[:size])
                input_bytes = input_bytes[size:]
                break
            elif size == 16:  # Smallest size, pad with '$'
                padded_part = input_bytes.ljust(16, b'$')
                byte_parts.append(padded_part)
                input_bytes = b''  # Clear the remaining bytes
                break

    # Instantiate Mnemonic with the English wordlist
    mnemo = Mnemonic("english")

    # List to store mnemonic words
    mnemonic_words = []

    # Generate a mnemonic for each byte slice and concatenate its words
    for part in byte_parts:
        mnemonic = mnemo.to_mnemonic(part)
        mnemonic_words.extend(mnemonic.split())

    return mnemonic_words

def mnemonics_to_string(mnemonic_words):
    """
    Converts a list of mnemonic words back into the original string.

    :param mnemonic_words: The list of mnemonic words to be converted.
    :return: The original string reconstructed from the mnemonic words.
    """
    # Instantiate Mnemonic with the English wordlist
    mnemo = Mnemonic("english")

    # Buffer for the recovered bytes
    recovered_bytes = b""

    # Consume word groups greedily, mirroring the encoder's slice order
    while mnemonic_words:
        for size in [32, 28, 24, 20, 16]:
            # Every 4 bytes (32 bits) of entropy yields 3 mnemonic words
            num_words = size * 3 // 4

            # Check if we have enough words for this slice size
            if len(mnemonic_words) >= num_words:
                # Extract the word group corresponding to the current slice size
                word_group = mnemonic_words[:num_words]

                # Convert the mnemonic words back to bytes
                group_bytes = mnemo.to_entropy(' '.join(word_group))

                # Append the recovered bytes
                recovered_bytes += group_bytes

                # Remove the processed words from the input list
                mnemonic_words = mnemonic_words[num_words:]

                break

    # Convert the recovered bytes to a string, removing padding if present.
    # Note: this assumes the original string does not itself end with '$'.
    final_string = recovered_bytes.rstrip(b'$').decode('utf-8')

    return final_string

def verify_story_improved(story, words):
    """
    Verifies that all provided words are present in the story in the correct order, ignoring punctuation and extra spaces.

    :param story: The story text.
    :param words: A list of words that should appear in the story.
    :return: A boolean indicating whether the story passes the check, plus the missing word (or None).
    """
    # Normalize the story: strip punctuation, collapse whitespace, lowercase
    normalized_story = re.sub(r'\W+', ' ', story).lower()

    position = 0  # Start search position in the normalized story
    for word in words:
        # Normalize the search word
        word = word.lower()
        # Find the next occurrence of the word, starting from the current position
        position = normalized_story.find(word, position)
        if position == -1:
            # The word was not found in order: fail and report it
            return False, word
        position += len(word)  # Advance the position for the next search
    return True, None  # All words found in the correct order

def recover_original_words(text):
    """
    Searches for words that have exactly one uppercase letter in the middle and more than two characters.

    :param text: The text to search within.
    :return: A list of the original (lowercased) words found in the text.
    """
    # Words that start lowercase and contain exactly one internal capital.
    # Note: this misses words such as "OLympic", whose first letter is also
    # uppercase, which is why 'olympic' is absent from the output below.
    pattern = re.compile(r'\b[a-z]+[A-Z][a-z]+\b')

    # Find all occurrences in the text
    modified_words = pattern.findall(text)

    # Lowercase every match to recover the original words
    original_words = [word.lower() for word in modified_words]

    return original_words

# Example string for testing
example_str = 'O Tux do Bananil deseja uma boa noite para todos vocês! Random test&*(@aslkas)Mkax@'
mnemonic_words = split_and_mnemonize(example_str)
string_again = mnemonics_to_string(mnemonic_words)

# Prompt given to GPT to construct the narrative from the mnemonic list

prompt_gpt = """
given the verification function

def verify_story_improved(story, words):
    '''
    Verifies that all provided words are present in the story in the correct
    order, ignoring punctuation and extra spaces.

    :param story: The story text.
    :param words: A list of words that should appear in the story.
    :return: A boolean indicating whether the story passes the check, plus the missing word (or None).
    '''
    # Normalize the story: strip punctuation, collapse whitespace, lowercase
    normalized_story = re.sub(r'\W+', ' ', story).lower()

    position = 0  # Start search position in the normalized story
    for word in words:
        # Normalize the search word
        word = word.lower()
        # Find the next occurrence of the word, starting from the current position
        position = normalized_story.find(word, position)
        if position == -1:
            # The word was not found in order: fail and report it
            return False, word
        position += len(word)  # Advance the position for the next search
    return True, None  # All words found in the correct order

the word list ['execute', 'actor', 'bubble', 'utility', 'alien', 'daring', 'library', 'equip', 'hover', 'arena', 'olympic', 'success', 'alien', 'crawl', 'snow', 'pull', 'gentle', 'admit', 'forget', 'seat', 'alert', 'dash', 'loyal', 'acquire', 'hover', 'kitten', 'fringe', 'ski', 'alpha', 'cost', 'sister', 'catch', 'inner', 'kite', 'bridge', 'town', 'alter', 'dash', 'monitor', 'inspire', 'industry', 'animal', 'behind', 'security', 'town', 'dash', 'region', 'boat', 'inner', 'clog', 'tackle', 'era', 'penalty', 'cage', 'magic', 'horn', 'hidden', 'argue', 'ski', 'onion', 'render', 'rotate', 'addict']

and knowing that the text must be written in English,

construct a narrative using every word from this list in the order in which they appear. If a word appears more than once in the list, it must appear more than once in the text, following the same order.
"""

generated_story = """
In an attempt to eXecute a complex scene, the aCtor found himself inside a transparent bUbble. The director, focused on uTility over aesthetics, introduced an aLien prop to make the scene more dAring. Near the set, a lIbrary stood, ready to eQuip the crew with historical accuracy, while drones hOver above the aRena to capture the OLympic-sized effort needed for sUccess. Unexpectedly, the aLien prop began to cRawl towards the sNow-covered pUll rope, showcasing a gEntle touch. The director had to aDmit the scene was missing something, and fOrget the initial sEat arrangement to aLert the crew for a quick dAsh. The lOyal crew managed to aCquire additional props as drones continued to hOver. A kItten, wearing a fRinge scarf, wandered onto the set, followed by an sKi instructor explaining the aLpha and omega of film production cOst. The sIster of the lead actor tried to cAtch the scene on her camera, standing in the iNner circle. Suddenly, a kIte flew over the bRidge, leading into the tOwn, where the script would aLter reality. Another dAsh to the equipment van allowed the mOnitor to iNspire a scene depicting an iNdustry revolution. An aNimal trainer brought in a well-trained dog to stand bEhind the sEcurity team, ensuring safety as the tOwn scene repeated. A final dAsh was made towards the rEgion where a bOat awaited, floating in the iNner city canal. The cLog dance performance was about to tAckle the eRa's culture, as a pEnalty for breaking tradition loomed over. A cAge containing a mAgic hOrn was discovered hIdden in the set, ready to aRgue against the modern sKi slope installation. An oNion-cutting contest was set to rEnder the audience emotional before the cameras would rOtate towards the aDdict of adventure.
"""

verifier_result = verify_story_improved(generated_story, mnemonic_words)
recovered_words = recover_original_words(generated_story)

print("This string corresponds to what you want to be encoded: ", example_str)
print("These are the generated mnemonics: ", mnemonic_words)
print("This is the string recovered from the generated mnemonic", string_again)
print("This is the story generated by GPT given the mnemonic and the prompt. I only modified some details to mark the words by putting an uppercase inside, but I'm still thinking about a better way to discreetly mark these words.", generated_story)
print("Here we check if the story uses all the words in sequence", verifier_result)
print("Here we recover the mnemonics from the story, but it's not 100% okay. I did this now in the early morning just for fun, so it's indeed unfinished lol", recovered_words)
```
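
Before looking at the output, here is a quick round-trip sanity check one can run with the functions above; this tiny addition exercises the `$` padding branch, since the sample is shorter than the smallest 16-byte slice:

```python
sample = "boa noite, Tux!"  # 15 bytes in UTF-8: forces the '$' padding branch
assert mnemonics_to_string(split_and_mnemonize(sample)) == sample
```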

Below is the output:
```shell
This string corresponds to what you want to be encoded: O Tux do Bananil deseja uma boa noite para todos vocês! Random test&*(@aslkas)Mkax@
These are the generated mnemonics: ['execute', 'actor', 'bubble', 'utility', 'alien', 'daring', 'library', 'equip', 'hover', 'arena', 'olympic', 'success', 'alien', 'crawl', 'snow', 'pull', 'gentle', 'admit', 'forget', 'seat', 'alert', 'dash', 'loyal', 'acquire', 'hover', 'kitten', 'fringe', 'ski', 'alpha', 'cost', 'sister', 'catch', 'inner', 'kite', 'bridge', 'town', 'alter', 'dash', 'monitor', 'inspire', 'industry', 'animal', 'behind', 'security', 'town', 'dash', 'region', 'boat', 'inner', 'clog', 'tackle', 'era', 'penalty', 'cage', 'magic', 'horn', 'hidden', 'argue', 'ski', 'onion', 'render', 'rotate', 'addict']
This is the string recovered from the generated mnemonic O Tux do Bananil deseja uma boa noite para todos vocês! Random test&*(@aslkas)Mkax@
This is the story generated by GPT given the mnemonic and the prompt. I only modified some details to mark the words by putting an uppercase inside, but I'm still thinking about a better way to discreetly mark these words.
In an attempt to eXecute a complex scene, the aCtor found himself inside a transparent bUbble. The director, focused on uTility over aesthetics, introduced an aLien prop to make the scene more dAring. Near the set, a lIbrary stood, ready to eQuip the crew with historical accuracy, while drones hOver above the aRena to capture the OLympic-sized effort needed for sUccess. Unexpectedly, the aLien prop began to cRawl towards the sNow-covered pUll rope, showcasing a gEntle touch. The director had to aDmit the scene was missing something, and fOrget the initial sEat arrangement to aLert the crew for a quick dAsh. The lOyal crew managed to aCquire additional props as drones continued to hOver. A kItten, wearing a fRinge scarf, wandered onto the set, followed by an sKi instructor explaining the aLpha and omega of film production cOst. The sIster of the lead actor tried to cAtch the scene on her camera, standing in the iNner circle. Suddenly, a kIte flew over the bRidge, leading into the tOwn, where the script would aLter reality. Another dAsh to the equipment van allowed the mOnitor to iNspire a scene depicting an iNdustry revolution. An aNimal trainer brought in a well-trained dog to stand bEhind the sEcurity team, ensuring safety as the tOwn scene repeated. A final dAsh was made towards the rEgion where a bOat awaited, floating in the iNner city canal. The cLog dance performance was about to tAckle the eRa's culture, as a pEnalty for breaking tradition loomed over. A cAge containing a mAgic hOrn was discovered hIdden in the set, ready to aRgue against the modern sKi slope installation. An oNion-cutting contest was set to rEnder the audience emotional before the cameras would rOtate towards the aDdict of adventure.

Here we check if the story uses all the words in sequence (True, None)
Here we recover the mnemonics from the story, but it's not 100% okay. I did this now in the early morning just for fun, so it's indeed unfinished lol ['execute', 'actor', 'bubble', 'utility', 'alien', 'daring', 'library', 'equip', 'hover', 'arena', 'success', 'alien', 'crawl', 'snow', 'pull', 'gentle', 'admit', 'forget', 'seat', 'alert', 'dash', 'loyal', 'acquire', 'hover', 'kitten', 'fringe', 'ski', 'alpha', 'cost', 'sister', 'catch', 'inner', 'kite', 'bridge', 'town', 'alter', 'dash', 'monitor', 'inspire', 'industry', 'animal', 'behind', 'security', 'town', 'dash', 'region', 'boat', 'inner', 'clog', 'tackle', 'era', 'penalty', 'cage', 'magic', 'horn', 'hidden', 'argue', 'ski', 'onion', 'render', 'rotate', 'addict']
```
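
One concrete illustration of the imperfection: `olympic` is missing from the recovered list above, because the story renders it as "OLympic-sized" and the pattern `\b[a-z]+[A-Z][a-z]+\b` requires a lowercase letter before the internal capital. A sketch of a slightly more forgiving pattern (an assumption on my part, not a full fix):

```python
import re

# Also allow the first letter to be uppercase, so "OLympic" is caught too.
pattern = re.compile(r'\b[A-Za-z][a-z]*[A-Z][a-z]+\b')
text = "drones hOver above the aRena to capture the OLympic-sized effort needed for sUccess"
print([w.lower() for w in pattern.findall(text)])
# ['hover', 'arena', 'olympic', 'success']
```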

## Conclusion

This exploration showcases the use of mnemonics for encoding and decoding strings, the integration with GPT's creative capacity for narrative creation, and the challenges of discreetly marking and retrieving mnemonic words within a story. It's a testament to the mix of insomnia, mania (bipolarity), and having nothing (yet everything) to do. I hope you have better ideas, and that this serves as something useful. 🚀🧩

A hug from Tux do Bananil! 🐧🇧🇷