AI’s branching future

Will the future belong to all-powerful, broadly capable artificial intelligence agents that navigate the world on our behalf? Or will it be populated by many specialised digital aides, each trained for a narrow task and called on only when needed?

Some mixture of the two seems likely, but the sheer pace of change has left even the field’s leaders admitting they have little idea what things will look like beyond a year or two.

For backers of the idea of one AI to rule them all, there has been plenty of encouraging news. OpenAI, for example, added a shopping feature to ChatGPT this week, a pointer to how AI agents could reshape the economics of ecommerce. Using a single query to get a chatbot to conduct product research and make purchase recommendations threatens to collapse the entire “funnel” that guides buyers, placing OpenAI at its centre.

While advances like these may grab the most attention, a new generation of more specialised agents is taking shape behind the scenes. Narrowly targeted at particular tasks, they promise to be far cheaper both to build and to run.

Meta’s LlamaCon developer event this week gave a glimpse of the state of play. The social networking company is betting on the adaptability of its AI models, released as “open weight” systems that stop short of full open source but still allow others to adapt them.

One sign that Meta has hit a nerve in the wider tech world is the 1.2bn downloads its “open” Llama models have racked up in their first two years. Most of these are versions of Llama that other developers have adapted for particular uses and made available for anyone to download.

Techniques for turning these open-weight models into useful tools are evolving rapidly. Distillation, for instance, which infuses smaller models with some of the intelligence of much larger ones, has become a common technique. Companies with “closed” models, such as OpenAI, reserve the right to determine whether and how their models can be distilled. In the open-weight world, by contrast, developers are free to adapt the models as they see fit.
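At its core, distillation comes down to a simple training signal. The sketch below is illustrative only, not any particular lab’s method: it computes the loss a small “student” model would minimise to match a large “teacher” model’s softened output distribution.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing the teacher's view of near-miss answers.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's softened distribution and the
    # student's: the core training signal in knowledge distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.5]   # logits from a large "teacher" model
student = [3.5, 1.2, 0.4]   # logits from a small "student" model
loss = distillation_loss(teacher, student)
```

In training, the student’s weights would be nudged to drive this loss towards zero across many examples, so the small model absorbs the large one’s judgments rather than just its final answers.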

Interest in creating more specialised models has picked up in recent months as much of the focus of AI development has shifted beyond the data-intensive and hugely expensive initial training of the largest models. Instead, much of the latest special sauce is created in a subsequent step known as post-training, as well as during the so-called test-time phase that reasoning models use to work through problems, with a technique known as reinforcement learning used to shape the results.

One of the most powerful forms of post-training uses a company’s own data to shape a model during the reinforcement learning phase, making it far more reliable for business uses, according to Ali Ghodsi, chief executive of Databricks. Speaking at the Meta event, he said this was only possible with open models.

Another favourite new trick is combining the best parts of different open models. After DeepSeek shocked the AI world with the success of its low-cost R1 reasoning model, for example, other developers quickly learnt to copy the reasoning “traces” that showed how it worked its way through a problem, and used these to train Meta’s Llama to do the same.
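In data terms, trace-based training amounts to packaging a stronger model’s worked solutions as fine-tuning records for a smaller one. The sketch below is purely illustrative, with an invented record format and a made-up example rather than any real model’s output:

```python
def make_sft_record(question, trace, answer):
    # Hypothetical record format for supervised fine-tuning: the
    # student is trained to reproduce both the step-by-step reasoning
    # trace and the final answer, not the answer alone.
    return {
        "prompt": question,
        "completion": f"<think>{trace}</think>\n{answer}",
    }

record = make_sft_record(
    "What is 17 * 24?",
    "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408",
    "408",
)
```

A corpus of such records, generated by a reasoning model, can then be fed to an ordinary fine-tuning pipeline to teach an open-weight model the same step-by-step habit.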

These and other techniques promise a rising tide of smart agents that run on cheaper hardware and consume less power.

For the model builders, on the other hand, it adds to the risk of commoditisation: that cheaper alternatives will undercut the most expensive and advanced models.

As the costs of AI fall, however, users could be the biggest winners of all, at least those companies in a position to design specialised agents and embed them in their daily work processes.

richard.waters@ft.com
