
How to convert bookmarks.html to workable text of folders, titles and urls?
Hello, are there any best practices for converting Firefox bookmarks once they're exported? The exported bookmarks.html is an HTML file that is nearly impossible to edit by hand. What I need is a conversion that leaves the folders, titles and URLs in a form that can be edited.
I tried Vi with some regex substitutions, but the resulting file is still not clean.
All Replies (3)
Hi
Have you tried the backup tool? It exports the data as a JSON file, which you might find easier to work with.
Thanks, I have tried both the HTML and JSON formats now. The code AI gives me is wrong and nothing is working. I think I'll ask on some other channels for the conversion code, until Firefox has an easier way to edit bookmarks.
Code examples:
$ cat bookmarks_edit.tsv
jq -r '
  # Recursive walk that carries the folder path as an array.
  # Field names assume the Firefox backup layout (type/title/uri/children).
  def walk_tree($path):
    if .type == "text/x-moz-place-container" then
      (.title // "Unnamed") as $fname
      | (.children // [])[]
      | walk_tree($path + [$fname])
    elif .type == "text/x-moz-place" then
      "\(.uri)\t\(.title)\t\($path | join("/"))"
    else
      empty
    end;
  (.children // [])[] | walk_tree([])
' bookmarks-2025-10-11.json > bookmarks_edit.tsv
HTML to Python:
from bs4 import BeautifulSoup

with open('bookmarks.html', 'r', encoding='utf-8') as f:
    soup = BeautifulSoup(f, 'html.parser')

# Print every link as "title | url" (folders are not tracked here)
for link in soup.find_all('a'):
    print(f"{link.text} | {link.get('href')}")
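The snippet above lists titles and URLs but loses the folders. Here is a minimal sketch that keeps them, using only the standard library. It assumes the export follows the usual Netscape bookmark layout, where a folder name sits in an H3 tag and its contents in the DL list that follows. The class and function names (BookmarkParser, bookmarks_to_rows) are made up for illustration, and I have only traced it against the small sample shown.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (folder_path, title, url) rows from a Netscape-style bookmarks file."""

    def __init__(self):
        super().__init__()
        self.rows = []       # collected (folder_path, title, url) tuples
        self.stack = []      # one entry per open <DL>; None for an unnamed list
        self.pending = None  # folder name from the most recent <H3>
        self.text = None     # text buffer while inside <H3> or <A>, else None
        self.href = ""       # HREF of the <A> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.text = []
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.text = []
        elif tag == "dl":
            # A <DL> belongs to the <H3> seen just before it (assumed layout)
            self.stack.append(self.pending)
            self.pending = None

    def handle_data(self, data):
        if self.text is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self.text is not None:
            self.pending = "".join(self.text).strip()
            self.text = None
        elif tag == "a" and self.text is not None:
            path = "/".join(name for name in self.stack if name is not None)
            self.rows.append((path, "".join(self.text).strip(), self.href))
            self.text = None
        elif tag == "dl" and self.stack:
            self.stack.pop()


def bookmarks_to_rows(html_text):
    """Return a list of (folder_path, title, url) tuples."""
    parser = BookmarkParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Tiny hand-written sample, not a real export
sample = """<DL><p>
    <DT><H3>News</H3>
    <DL><p>
        <DT><A HREF="https://example.com/a">Site A</A>
    </DL><p>
    <DT><A HREF="https://example.org/b">Site B</A>
</DL><p>"""

for path, title, url in bookmarks_to_rows(sample):
    print(f"{path}\t{title}\t{url}")
```

For a real export, read the file with open('bookmarks.html', encoding='utf-8') and pass its contents to bookmarks_to_rows. Bookmarks that sit outside any named folder come back with an empty folder path.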
JSON to Python:
import json

with open("bookmarks.json", "r", encoding="utf-8") as f:
    data = json.load(f)

def extract_bookmarks(node):
    # Bookmarks in a Firefox JSON backup have type 'text/x-moz-place'
    if node.get('type') == 'text/x-moz-place':
        print(f"{node.get('title', '')} | {node.get('uri', '')}")
    # Folders keep their contents under 'children'
    for child in node.get('children', []):
        extract_bookmarks(child)

# A Firefox backup is one root folder (there is no 'roots' key), so walk from the top
extract_bookmarks(data)
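To get folders, titles and URLs out of the JSON backup as an editable file, something like the sketch below might work. It assumes the Firefox backup layout, where folders have type 'text/x-moz-place-container' and bookmarks have type 'text/x-moz-place' with the address under 'uri'. The names flatten and write_tsv are my own, and the sample is hand-written, not a real backup. It goes through the csv module so that tabs or quotes inside a title do not break the columns.

```python
import csv
import json


def flatten(node, path=()):
    """Yield (folder_path, title, url) for every bookmark at or below node."""
    node_type = node.get("type")
    if node_type == "text/x-moz-place":
        yield "/".join(path), node.get("title") or "", node.get("uri", "")
    elif node_type == "text/x-moz-place-container":
        title = node.get("title") or ""
        # A folder with an empty title (such as the top-level root)
        # adds nothing to the path.
        child_path = path + (title,) if title else path
        for child in node.get("children", []):
            yield from flatten(child, child_path)
    # Anything else (separators, for example) is skipped


def write_tsv(root, out_path):
    """Write one tab-separated row per bookmark, with a header line."""
    with open(out_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["folder", "title", "url"])
        writer.writerows(flatten(root))


# Tiny hand-written sample in the assumed layout
sample = {
    "type": "text/x-moz-place-container",
    "title": "",
    "children": [
        {
            "type": "text/x-moz-place-container",
            "title": "toolbar",
            "children": [
                {"type": "text/x-moz-place", "title": "Site A",
                 "uri": "https://example.com/a"},
                {"type": "text/x-moz-place-separator"},
            ],
        }
    ],
}

for row in flatten(sample):
    print(row)
```

With a real backup, load it with json.load as in the snippet above and call write_tsv(data, 'bookmarks_edit.tsv').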
This is a little out of our remit here. You might want to ask the experts at https://stackoverflow.com/questions for help with this.