
This is the same topic I had an intense argument about with my coworkers at the company formerly called FB a decade ago. The belief is that most joins are 1-2 deep, and that many-hop queries with reasoning are rare to non-existent.

I wonder how you reconcile the demand for LLMs with multihop reasoning with the statement above.

I think a lot of what is stated here describes how things work today and where established companies operate.

The contradictions in their positions are plain and simple.



There are worst-case optimal algorithms for multi-way and multi-hop joins. This does not require giving up the relational model.
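To make that concrete, here is a minimal sketch of a triangle query evaluated one variable at a time, in the spirit of generic join. The relation names R, S, T and the helper names are made up for illustration, and this is not how any particular engine implements it. The point is only that the inputs are ordinary relations of pairs and the last variable is bound by intersecting candidate sets rather than by materializing a binary join.

```python
def index(pairs):
    """Map each first component to the set of second components."""
    idx = {}
    for x, y in pairs:
        idx.setdefault(x, set()).add(y)
    return idx


def triangles(R, S, T):
    """Evaluate Q(a, b, c) = R(a, b), S(b, c), T(a, c) variable by variable."""
    R_ab, S_bc, T_ac = index(R), index(S), index(T)
    out = []
    for a in R_ab:
        if a not in T_ac:      # a must appear in both R and T
            continue
        for b in R_ab[a]:
            if b not in S_bc:  # b must have at least one S tuple
                continue
            # c must satisfy both S(b, c) and T(a, c): intersect the two sets.
            for c in S_bc[b] & T_ac[a]:
                out.append((a, b, c))
    return sorted(out)


# Self-join on a single edge relation, as a graph query would do.
E = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(triangles(E, E, E))  # [(1, 2, 3)]
```

A pairwise plan would first build all two-hop paths R(a, b), S(b, c) and only then filter by T, which is where the intermediate blowup comes from.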


I maintain LadybugDB, which implements WCOJ (inherited from the KuzuDB days). So I don't disagree with the idea. It's just that it's a graph database with relational internals and some internal warts that make it hard to compose queries. Working on fixing them.

https://github.com/LadybugDB/ladybug/discussions/204#discuss...


Another important test is whether it's WCOJ on top of relational storage, or whether the compressed sparse row (CSR) structure is actually persisted to disk. The PGQ implementations don't persist it.

There are second-order optimizations that LLMs logically implement that CSR-implementing DBs don't. With sufficient funding, we'll be able to pursue those as well.


CSR is an array-based trie, hence very costly to update. It can serve as an index for parts of the graph that will almost never change, but not otherwise.
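A rough sketch of why, assuming the textbook two-array layout (an offsets array indexed by source node plus a flat targets array); the function names are made up for illustration and real engines add compression and paging on top. Reads are a single slice, but an in-place insert has to shift the tail of the targets array and bump every later offset.

```python
def build_csr(num_nodes, edges):
    """Build (offsets, targets) from a list of (src, dst) pairs."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1            # count out-degree per source
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]     # prefix sum gives slice boundaries
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()         # next free slot for each source
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, node):
    """Out-neighbors of node: one contiguous slice of targets."""
    return targets[offsets[node]:offsets[node + 1]]


def insert_edge(offsets, targets, src, dst):
    """In-place insert: shifts the array tail and every later offset."""
    targets.insert(offsets[src + 1], dst)
    for i in range(src + 1, len(offsets)):
        offsets[i] += 1


offsets, targets = build_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
print(neighbors(offsets, targets, 0))    # [1, 2]
insert_edge(offsets, targets, 1, 0)
print(neighbors(offsets, targets, 1))    # [2, 0]
```

A single insert is O(V + E) in the worst case, which is fine for a bulk-loaded, rarely changing graph and painful for anything write-heavy.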


That makes it a good match for columnar databases, which already operate on the read-only, read-mostly part of the spectrum.

Perhaps people can invent LSM-like structures on top of them.
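One way such a thing might look, purely as a thought experiment: keep the CSR arrays immutable, absorb writes in a small in-memory buffer, merge the two at read time, and rebuild the arrays once the buffer grows past a threshold. The class name DeltaCSR and the flush_threshold parameter are hypothetical, and the sketch ignores deletes (which would need tombstones), durability, and multi-level compaction.

```python
class DeltaCSR:
    """Immutable CSR base plus a small mutable write buffer (hypothetical)."""

    def __init__(self, num_nodes, edges, flush_threshold=4):
        self.num_nodes = num_nodes
        self.flush_threshold = flush_threshold
        self.delta = {}        # src -> dsts added since the last compaction
        self.delta_size = 0
        self._rebuild(sorted(edges))

    def _rebuild(self, edges):
        # edges must be ordered by src; count degrees, then prefix-sum.
        offsets = [0] * (self.num_nodes + 1)
        for src, _ in edges:
            offsets[src + 1] += 1
        for i in range(self.num_nodes):
            offsets[i + 1] += offsets[i]
        self.offsets = offsets
        self.targets = [dst for _, dst in edges]

    def add_edge(self, src, dst):
        # Writes go to the buffer only; the base arrays are untouched.
        self.delta.setdefault(src, []).append(dst)
        self.delta_size += 1
        if self.delta_size >= self.flush_threshold:
            self.compact()

    def neighbors(self, node):
        # Reads merge the base slice with any buffered edges.
        base = self.targets[self.offsets[node]:self.offsets[node + 1]]
        return base + self.delta.get(node, [])

    def compact(self):
        # Fold the buffer into a freshly built base, like an LSM merge.
        edges = [(s, d) for s in range(self.num_nodes)
                 for d in self.neighbors(s)]
        self.delta, self.delta_size = {}, 0
        self._rebuild(edges)


g = DeltaCSR(3, [(0, 1), (2, 0)], flush_threshold=2)
g.add_edge(0, 2)               # buffered, base arrays unchanged
print(g.neighbors(0))          # [1, 2]
g.add_edge(1, 2)               # hits the threshold and triggers compaction
print(g.targets)               # [1, 2, 2, 0]
```

The trade-off is the usual LSM one: writes become cheap appends, reads pay for the merge, and the rebuild cost is amortized over many inserts.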

But at least establish that CSR on disk is a basic requirement before you claim that you're a legit graph database.



