If I'm reading correctly, the indexing (Context Store) is neutral/unopinionated? How does it select fields for indexing?
Have you done any testing on guided indexing, or metadata layers on top of the data? My experience so far on similar work is that getting data in front of an agent isn't enough context to get useful/reliable answers enough of the time. I.e. _what_ you index, and how you signpost for agents, becomes really important (unless your data is super clean I guess). This does look like a good foundation for that kind of tooling though!
Hi, @jessewmc. Thanks for your reply. Regarding your points:
> If I'm reading correctly, the indexing (Context Store) is neutral/unopinionated? How does it select fields for indexing?
While we haven't yet published details on the backend implementation, I can say that our implementation performs very well without needing to prioritize specific fields for indexing. We aim for large text fields to perform decently and retrieval based on small/compressible fields like ints to be fast. (More to come on this in the coming months.)
> Have you done any testing on guided indexing, or metadata layers on top of the data?
We've been testing with different data scales and shapes. Nothing detailed to share yet, but performance has (so far) never itself become the bottleneck in our agent testing. (The LLM thinking itself is often the bottleneck.)
> My experience so far on similar work is that getting data in front of an agent isn't enough context to get useful/reliable answers enough of the time.
Airbyte has rich metadata on our upstream connectors' data models, which I think helps us a lot in delivering helpful context to the agent. Another option, when optimizing for specific use cases, is to build your own agent tools on top of our Agent SDK. This lets you make the calls organic and build the tools in a way that makes natural sense to the agent, regardless of source shape or which system(s) the data is coming from.
> This does look like a good foundation for that kind of tooling though!
We agree! Thanks again for sharing your thoughts here.
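To make the "signposting" idea from this exchange concrete, here is a minimal sketch in plain Python. It does not use the Agent SDK or any Airbyte API; `FieldHint`, `ToolSpec`, and `search` are hypothetical names invented for illustration. The assumption is simply that each tool carries a short description of what it is for and which fields are worth searching, and that this description is what gets shown to the agent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldHint:
    """Hypothetical per-field signpost shown to the agent."""
    name: str
    description: str
    searchable: bool = True


@dataclass
class ToolSpec:
    """Hypothetical tool description: a purpose plus field-level hints."""
    name: str
    purpose: str
    fields: list[FieldHint] = field(default_factory=list)

    def describe(self) -> str:
        # Render the text an agent would see when deciding whether to call the tool.
        lines = [f"{self.name}: {self.purpose}"]
        for hint in self.fields:
            flag = "searchable" if hint.searchable else "display only"
            lines.append(f"  - {hint.name} ({flag}): {hint.description}")
        return "\n".join(lines)


def search(records: list[dict], spec: ToolSpec, query: str) -> list[dict]:
    """Case-insensitive substring match, restricted to fields marked searchable."""
    needle = query.lower()
    names = [hint.name for hint in spec.fields if hint.searchable]
    return [
        record
        for record in records
        if any(needle in str(record.get(name, "")).lower() for name in names)
    ]
```

The point of the sketch is only that the guidance (what each field means, which ones to search) sits next to the data, so the agent is not left to guess from raw column names.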
Great launch btw! I have some questions if you don't mind
You mentioned that performance was never an issue; I'm really intrigued by how this is achieved.
I have 3 general questions:
1. How big (estimate in bytes) and complex were the test datasources? I couldn't find this in the benchmark repo.
2. How is the business context managed? The blog post "Airbyte Agents: A New Era for Airbyte" mentions handling the business context, but the context layer docs only talk about schema discovery (I got a bit confused).
3. When you said performance was never an issue, do you mean the user always got the answer they were looking for?
Although not especially “current,” Normal Accidents: Living with High-Risk Technologies is a 1984 book by Yale sociologist Charles Perrow, which analyses complex systems from a sociological perspective. Perrow argues that multiple and unexpected failures are built into society's complex and tightly coupled systems, and that accidents are unavoidable and cannot be designed around. Several historical disasters are analysed. I read a newer edition published in 1999, and the author had added a chapter on Chernobyl, which turned out to be a textbook example of some of Perrow’s theory: in particular, that adding fail-safes also adds complexity, and so does not necessarily make anything safer. (The Chernobyl disaster was precipitated, at least in part, because they were on a tight schedule to test a fail-safe system.) The book is fascinating, a good page-turner, and hard to put down.
Perrow’s book is best combined with a reading of The Doomsday Machine: Confessions of a Nuclear War Planner, by Daniel Ellsberg.
I'm a retired neurosurgical anesthesiologist (38 years in practice). I read Perrow's book several years after it was published. I was struck by how relevant his points of failure were to the practice of anesthesiology, the concept of the danger of tight coupling. I referred to this book over subsequent decades in my presentations on Grand Rounds, but to my knowledge none of the residents or other attendings ever read it.
Other books I’ve much enjoyed, if your interest is in structural or other failures:
Why Buildings Fall Down: How Structures Fail by Matthys Levy and Mario Salvadori, a wide-ranging history of structural failures of various kinds, and their causes.
Ignition!: An Informal History of Liquid Rocket Propellants by John Drury Clark, which is a personal memoir from a senior researcher with many decades of experience developing rocket fuels - he is the proverbial Rocket Scientist. Most interesting, and amusing (in a morbid way), is the quite different culture of safety “back in the day” in this somewhat esoteric engineering/chemistry field.
Sure. Even a history of safety success contributes to this. We haven't had an accident in 3000 days, what was dangerous about this job again? Also what's this stupid policy for anyway, I've never seen anybody even come close to (non-dangerous-sounding fate) while working here.
But probably the policy is in place because it used to happen before the policy was in place. It's just not obvious to people who have never seen the consequences before.
Complacency kills! It's why it's usually the old farmers that die in stupid ways.
I'm also reminded of the Yale machine shop safety supervisor who died by getting herself wound around a lathe spindle. Working alone, late at night on powerful rotating machinery wearing loose clothing.
>Michele Dufault, a 22-year-old physics and astronomy major from Scituate, Massachusetts, was asphyxiated after her hair caught in a lathe in a Sterling Chemistry Laboratory machine shop, where she was working by herself in violation of the existing safety rules.
Hmmm. I was positive that at the time the report came out, they said that she was actually one of the people assigned to monitor machine shop users for safety issues. Maybe that got confused with her taking the advanced safety classes.
Inviting Disaster: Lessons From the Edge of Technology was one of the texts for an aerospace class I didn't take but friends did, but honestly you can just read the book.
There are lots of frameworks for teaching safety and programs for compliance and such but they are far too easy to cargo cult if you don't appreciate safety and the need for safety culture and UNDERSTAND what failures look like.
And when you really understand the need and how significant failures happened... "state of the art" tools and practices take a back seat, they can be useful but they're just tools. What you need is people developing the appropriate vision, and with that the right things tend to follow.
The STPA and CAST handbooks are available for free from the MIT Partnership for Systems Approaches to Safety and Security website. They are phenomenal.
Wow, that's a name I haven't heard in a long time! I really miss the way tacops 2.2 felt; I never did get along with the 3.x versions. It was definitely a formative gaming experience for me as well.
What does this mean? Is there a list of approved conversation allowed on trails? What difference does it make if they are talking to someone virtually or physically present on the trail?
I guess it depends. When you're talking to a person directly, most likely your ears aren't blocked. So you get the feedback of your voice levels, and adjust accordingly. This is really difficult to do when you have earphones in, blocking everything. Usually people with both earphones in tend to raise their voices nearly to a shouting level. I don't know if OP does that or not though.
It's more that their full attention isn't on the meeting at hand - how can it be when they're wandering some nature trail taking it all in? If I spent a week busting my ass to get some feature shipped, and in a sprint recap meeting all I saw was my boss wandering a forest trail going "Uh-huh, uh-huh, wow there's a red-breasted warbler..." I'd become very upset.
That’s not true, you should try it. My full attention is on the call. Walking the trail, as it’s also the same trail every day, is fully on autopilot.
Really. Your comment kind of annoys me because it seems you have no experience of this, while I have several years of experience with it, yet you are sure that you know better than I do what my experience of it is?
Edit: Also, my teams know about it and all agree with it, some of them do walks themselves.
Are we gatekeeping walks and meetings now? You can walk and talk and not annoy other people -- if other people are even around. You can also walk and talk and give enough attention to your meeting. Use your judgement.
177ms from west coast of NA, but that is slower than I would have estimated. Other than HN, almost everything else I use regularly feels subjectively way slower, although I've not checked response time.
Congrats! Ordered. Been waiting for a physical copy, being able to scribble notes in the margins and bookmark and flip back and forth by hand just works so much better for me.