The thing is, it's actually a very difficult engineering/research/infra problem to run complicated queries on enormous data lakes. All the obvious ways to do it are prohibitively slow and expensive. Every bit of performance you can squeeze out of this, you unlock the ability for people to work with their data more easily. So there is huge value in having some centralized companies sink lots of R&D into trying to solve these problems well.