The Big Rewrite #2330

Open
Changaco opened this issue Mar 23, 2024 · 6 comments
@Changaco
Member

This issue is for the rewrite of a significant portion of Liberapay's source code that I first mentioned in liberapay/salon#541 (comment). The basic idea is to switch from SQL to a new database technology built from scratch in Python. The plan is to close all the technical issues related to the database (namely #245, #762, #980, #1113, #1312, #1595, #1701, #1727, #1736, #1962 and #2010) and take advantage of the migration to fix inaccuracies and inconsistencies in the data.

Sadly the work is still very far from complete. It's a big change and was always going to take a while, but there's also the problem that too much of my time and energy is consumed by other things.

@boehs

boehs commented Mar 28, 2024

> The basic idea is to switch from SQL to a new database technology built from scratch in Python

Why? I can only see this ending poorly. Databases are immensely difficult to program, and people spend PhDs working on them.

@Changaco
Member Author

Changaco commented Apr 1, 2024

> Why?

Because it's painfully slow and difficult to build something correctly when you don't have the right tools. I've been working on this Python+PostgreSQL code base for a decade now, and I've come to the conclusion that SQL isn't the right tool for this job. The work I've done so far on this issue has confirmed that the Liberapay source code can be significantly simplified and improved by replacing SQL queries with Python code.

The new Python module needed to make this work will be significantly simpler than an SQL database, because its job will be simpler. It won't need a query parser and planner, for example. It will fulfill two basic needs: consensus and storage. The consensus part is an implementation of Raft that I've already partly written from scratch, based on the Raft paper. The storage part will be an implementation of well-known data structures. Efficiency will be difficult to achieve with Python, but it might not be necessary to fully optimize the storage layer right from the start.
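
(Illustration for readers unfamiliar with Raft, not code from the Liberapay repository: a minimal sketch of the follower-side AppendEntries handling described in the Raft paper. The `Entry` and `Follower` names and their fields are hypothetical, and elections, persistence, and networking are left out entirely.)

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int        # term in which the leader created the entry
    command: object  # opaque command for the state machine


@dataclass
class Follower:
    current_term: int = 0
    log: list = field(default_factory=list)  # log[0] holds Raft index 1
    commit_index: int = 0

    def append_entries(self, term, prev_index, prev_term, entries, leader_commit):
        """Return True if the entries were accepted, False otherwise."""
        if term < self.current_term:
            return False  # request comes from a stale leader
        self.current_term = term
        # Consistency check: our log must contain prev_index with prev_term.
        if prev_index > 0:
            if len(self.log) < prev_index:
                return False
            if self.log[prev_index - 1].term != prev_term:
                return False
        for offset, entry in enumerate(entries):
            index = prev_index + 1 + offset
            if len(self.log) >= index:
                if self.log[index - 1].term == entry.term:
                    continue  # entry already present
                del self.log[index - 1:]  # conflict: drop it and everything after
            self.log.append(entry)
        if leader_commit > self.commit_index:
            # Never commit past the last entry covered by this request.
            self.commit_index = min(leader_commit, prev_index + len(entries))
        return True
```

A leader that receives `False` for a log mismatch would retry with an earlier `prev_index` until the follower's log matches; that retry loop is not shown here.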

A distinctive feature of the new API is its intuitive and automatic way of handling the fact that data changes over time. An SQL database can of course store multiple revisions of the same information, but it doesn't know if it's doing that, so for example it can't make it easy to manage how many revisions should be kept or whether to delta-compress them to save space.
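
(Illustration only: the actual API has not been published in this thread, so the `VersionedStore` class below, its method names, and the `max_revisions` parameter are all hypothetical. The sketch shows what it can mean for a store to know that it holds several revisions of the same value, so that retention is handled by the store instead of by each caller. Delta compression is not attempted.)

```python
class VersionedStore:
    """Keeps up to `max_revisions` timestamped values per key."""

    def __init__(self, max_revisions=3):
        if max_revisions < 1:
            raise ValueError("max_revisions must be at least 1")
        self.max_revisions = max_revisions
        self._history = {}  # key -> list of (timestamp, value), oldest first

    def set(self, key, value, timestamp):
        revisions = self._history.setdefault(key, [])
        if revisions and timestamp <= revisions[-1][0]:
            raise ValueError("timestamps must be strictly increasing per key")
        revisions.append((timestamp, value))
        # The retention policy is applied here, once, for every key.
        del revisions[:-self.max_revisions]

    def get(self, key, at=None):
        """Return the value as of time `at`, or the latest value if `at` is None."""
        revisions = self._history.get(key, [])
        if at is not None:
            revisions = [r for r in revisions if r[0] <= at]
        if not revisions:
            raise KeyError(key)
        return revisions[-1][1]


store = VersionedStore(max_revisions=2)
store.set("username", "alice", timestamp=1)
store.set("username", "alice2", timestamp=2)
store.set("username", "alice3", timestamp=3)
latest = store.get("username")            # most recent revision
earlier = store.get("username", at=2)     # value as it was at time 2
```

With plain SQL tables, the equivalent pruning rule would usually have to be written by hand for each table that stores history.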

@boehs

boehs commented Apr 2, 2024

This feels domain specific, which is fair. Efficiency and time for development were my main concerns. Would this replace everything, even stuff like user accounts?

@Changaco
Member Author

Changaco commented Apr 2, 2024

I wouldn't say that there's anything “domain specific” in this.

SQL will be completely replaced. Liberapay will no longer require or use a PostgreSQL database.

@boehs

boehs commented Apr 4, 2024

Well, I wish you luck in this endeavor.

@Changaco
Member Author

(Cloudflare is continuing to push SQL-based solutions: "Zero-latency SQLite storage in every Durable Object". I wonder if they've already realized that it's counterproductive to use SQL this way and are already aiming to remove that layer of abstraction in the future.)

Changaco self-assigned this Oct 29, 2024
Changaco pinned this issue Oct 29, 2024