A big limitation for skills (or agents using browsers) is that the LLM is working against raw HTML, the DOM, or pixels. The new WebMCP API solves this: apps register schema-validated tools via navigator.modelContext, so the agent gets structured JSON to work with and can be far more reliable.
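To make the shape of this concrete, here is a minimal sketch of a page registering a tool. WebMCP isn't shipped anywhere yet, so the mock of `navigator.modelContext` and the `add-todo` tool are stand-ins; the exact method names in the spec may still change.

```javascript
// Stand-in for navigator.modelContext (the real API is still being incubated):
const modelContext = (typeof navigator !== "undefined" && navigator.modelContext) || {
  tools: new Map(),
  registerTool(tool) { this.tools.set(tool.name, tool); },
  async callTool(name, args) { return this.tools.get(name).execute(args); },
};

// The app exposes a structured, schema-validated tool instead of making
// the agent scrape the DOM:
modelContext.registerTool({
  name: "add-todo",
  description: "Add a todo item to the current list",
  inputSchema: {
    type: "object",
    properties: { title: { type: "string" } },
    required: ["title"],
  },
  async execute({ title }) {
    // In a real app this would update app state / the local DB.
    return { content: [{ type: "text", text: `Added todo: ${title}` }] };
  },
});

// An agent can now call the tool with plain JSON:
modelContext.callTool("add-todo", { title: "buy milk" })
  .then((result) => console.log(result.content[0].text)); // logs "Added todo: buy milk"
```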
WebMCP is currently being incubated in the W3C [1], so if it lands as a proper browser standard, this becomes an endpoint every website can expose.
I think browser agents/skills + WebMCP might actually be the killer app for local-first apps [2]. Remote APIs need hand-crafted endpoints for every possible agent action. A local DB exposed via WebMCP gives the agent generic operations (query, insert, upsert, delete) that it can freely compose across multiple steps of reads and writes, at zero latency and offline-capable. The agent operates directly on a data model rather than orchestrating UI interactions, which is what makes complex tasks actually reliable.
For example, the user can ask "Archive all emails I haven't opened in 30 days except from these 3 senders" and the agent then runs the NoSQL query and updates locally.
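A hypothetical sketch of what that request could compile to: the Mango-style selector is the kind of thing the agent would generate, and the in-memory evaluator stands in for the local DB (the data shape and sender addresses are made up, not a specific database's API).

```javascript
const THIRTY_DAYS = 30 * 24 * 60 * 60 * 1000;
const keepSenders = ["alice@example.com", "bob@example.com", "carol@example.com"];

// Mango-style selector the agent could generate from the user's request:
const selector = {
  lastOpenedAt: { $lt: Date.now() - THIRTY_DAYS },
  sender: { $nin: keepSenders },
  archived: false,
};

// Minimal in-memory evaluation of that selector (stand-in for the DB engine):
function matches(doc, sel) {
  return Object.entries(sel).every(([field, cond]) => {
    if (typeof cond !== "object" || cond === null) return doc[field] === cond;
    return Object.entries(cond).every(([op, val]) => {
      if (op === "$lt") return doc[field] < val;
      if (op === "$nin") return !val.includes(doc[field]);
      throw new Error(`unsupported operator ${op}`);
    });
  });
}

function archiveStale(emails) {
  let archived = 0;
  for (const email of emails) {
    if (matches(email, selector)) { email.archived = true; archived++; }
  }
  return archived;
}

const inbox = [
  { sender: "newsletter@shop.com", lastOpenedAt: Date.now() - 45 * 24 * 3600 * 1000, archived: false },
  { sender: "alice@example.com", lastOpenedAt: Date.now() - 45 * 24 * 3600 * 1000, archived: false },
];
console.log(archiveStale(inbox)); // 1 -- the kept sender is skipped
```

The point is that the agent only needs the generic query/update primitives; the whole multi-step task happens locally against data the app already has.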
OpenAPI is primarily for machine-to-machine communication, which needs determinism and is optimized for cases like timestamps in Unix format with millisecond accuracy. MCP is optimized for a different use case, where the LLM has many limitations but a good "understanding" of text. Instead of sending `{ user: { id: 123123123123, first_name: "XYZYZYZ", last_name: "SDFSDF", gender: "..." } }` you could return "Mr XYZYZYZ" or "Mrs XYZYZYZ".
The LLM doesn't need all of that and can't parse it reliably anyway without additional tools (e.g. why should it spend tokens trying to convert a Unix timestamp just to understand the time?).
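A small sketch of the difference: the raw record is what an OpenAPI-style endpoint might return, and the formatter is what an MCP tool could return instead. The field names and values are illustrative.

```javascript
// Raw machine-to-machine payload (illustrative):
const raw = {
  user: { id: 123123123123, first_name: "XYZYZYZ", last_name: "SDFSDF", gender: "male" },
  last_login: 1735689600000, // Unix ms -- the LLM would waste tokens decoding this
};

// What an MCP tool could return instead: pre-digested, human-readable text.
function forLlm({ user, last_login }) {
  const title = user.gender === "male" ? "Mr" : "Mrs";
  const when = new Date(last_login).toISOString().slice(0, 10);
  return `${title} ${user.first_name}, last seen ${when}`;
}

console.log(forLlm(raw)); // "Mr XYZYZYZ, last seen 2025-01-01"
```

The server does the deterministic conversion once, so the model spends its tokens on reasoning rather than on parsing.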
Over the last few days I built the WebMCP plugin for the RxDB database [1]
The goal is to let agents interact with apps through explicit tools instead of DOM scraping or visual navigation. This works nicely because agents can run operations directly on the local-first data the UI already uses.
> The Web Locks API allows scripts running in one tab or worker to asynchronously acquire a lock, hold it while work is performed, then release it. While held, no other script executing in the same origin can acquire the same lock, *which allows a web app running in multiple tabs or workers to coordinate work and the use of resources.*
Yes, and by building small things, you can more easily try out different techniques and tools with less risk than doing the same with something big.
I think it's optimizing for learning versus revenue, which don't have to be mutually exclusive. Sometimes you need to start with one to get to the other.
Yes, most servers support WebSockets. But unfortunately most proxies and firewalls do not, especially in big company networks. Suggesting that my users use SSE for my database replication stream solved most of their problems. Also, setting up an SSE endpoint is like 5 lines of code. WebSockets require much more, and you also have to implement things like pings to ensure automatic reconnection. SSE with the JavaScript EventSource API has all you need built in:
But why add it to HTTP/3 at all? HTTP/1.1 hijacking is a pretty simple process. I suspect HTTP/3 would be significantly more complicated. I'm not sure that effort is worth it when WebTransport will make it obsolete.
I had 2.5 mg of THC every day for ~7 years. I couldn't remember the last dream I'd had when I quit THC in August. After not sleeping for 2-3 weeks I started having vivid nightmares every night for about a week. I'm still having extremely vivid dreams since, but they're no longer all terrifying. Sleeping better than ever, and my anxiety is also better than ever.
I just read your comment after posting mine and it sounds like you've had a similar (but unfortunately opposite!) experience. The vivid dreams stop for me a few weeks after they start. Are your vivid dreams "permanent", or has it only been a short while since you started experiencing them?
Indeed, and IME, the dreams I have after taking a break from daily THC use are extremely vivid - to the point that I can remember them in detail for days afterwards. I enjoy that a lot.
I've found Ghana to be the only country in West Africa where you can reliably outsource and get quality work back with no headaches. Maybe if I did business in French, Senegal would be reliable as well. But the rest of West Africa has a long way to go even getting reliable rule of law. (Which is weird, because you would think Nigeria would have its act together.)
But yes, Kenya is the star out in East Africa. Even among a lot of other scrappy nations in the EAC, Kenya stands out. No question.
For African outsourcing, I can't recommend Ghana and Kenya enough. Only problem right now is that you kind of have to know someone to get access to the really good guys. Demand is high relative to the guys available with known track records.
Very cool, thanks for sharing! I have writing work on the side for engineers, if you know any. Great way to get your writing skills going while getting paid to play with tech.
- [1] https://webmachinelearning.github.io/webmcp/
- [2] https://rxdb.info/webmcp.html