Consistency: new models don't behave the same on every task as their predecessors. So you end up building pipelines that rely on specific behavior, only to find that the new model performs worse on a particular task, or simply behaves differently and needs prompt adjustments. They can also fundamentally change default model settings between releases; for example, the Gemini 2.5 models handled temperature settings completely differently than previous models. It creates a moving target that you constantly have to adjust and rework around, instead of a platform that you, and by extension your users, can rely on. Other providers have much longer deprecation windows, so they must at least understand this frustration.
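One partial mitigation while you're stuck on a hosted API is to pin a dated model snapshot and set sampling parameters explicitly instead of trusting defaults. A minimal sketch with the OpenAI Python SDK (the snapshot name, temperature, and prompt are just illustrative):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    # Pin a dated snapshot rather than a floating alias, so a
    # provider-side alias update can't silently change behavior.
    model="gpt-4o-2024-08-06",
    # Set temperature explicitly instead of relying on the default,
    # since defaults can change between model generations.
    temperature=0.2,
    messages=[{"role": "user", "content": "Classify this ticket: ..."}],
)
print(response.choices[0].message.content)
```

This doesn't stop the snapshot from eventually being deprecated, but it does stop behavior from drifting underneath you in the meantime.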
> Consistency: new models don't behave the same on every task as their predecessors. So you end up building pipelines that rely on specific behavior
If this is a deal breaker, then self-hosting is the only solution. Due to the hardware premium, every model hosted by a third party will eventually be deprecated to make room for newer, better, and more efficient models.
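For what that can look like in practice, here's a minimal sketch assuming vLLM and an open-weights model (the model name is illustrative); the point is that you decide when, or whether, the model ever changes:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    # In practice, pin `revision` to a specific commit SHA of the weights
    # so the model can never change underneath you.
    revision="main",
)

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Classify this ticket: ..."], params)
print(outputs[0].outputs[0].text)
```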
Sure, but Google also leaves little to no overlap between models, and will often keep a model in preview (which many companies cannot use in production for legal reasons) right up until the previous model is deprecated.
The point is that if you want to build a platform customers can rely on, on their own feature-development schedules, you need to support models for longer periods of time. For example, OpenAI still offers older models like GPT-4, which was released in 2023; that gives customers plenty of time to test, experiment, and eventually migrate to a newer model if it makes sense.