Original author: YBB Capital Researcher Zeke
1. It starts with the love of novelty and boredom with the old.
Over the past year, with the application layer short on narratives and unable to keep pace with the explosion of infrastructure, crypto has gradually turned into a competition for attention. From Silly Dragon to Goat, from Pump.fun to Clanker, the attention war has kept escalating. It began with the most clichéd form of monetizing eyeballs, quickly evolved into a platform model that unites those who demand attention with those who supply it, and then saw silicon-based creatures become the new content providers. Among the many carriers of Meme Coins, one has finally emerged that retail investors and VCs can agree on: the AI Agent.
Attention is ultimately a zero-sum game, but speculation can indeed drive wild growth. In our article about UNI, we reviewed the start of blockchain's last golden age. The rapid growth of DeFi stemmed from the LP mining era opened by Compound Finance. Moving in and out of thousands or even tens of thousands of mining pools in pursuit of APY was the most primitive form of on-chain play at the time, even though most of those pools eventually collapsed. Still, the frenzied influx of gold miners left the blockchain with unprecedented liquidity, and DeFi eventually broke away from pure speculation to become a mature sector that meets users' financial needs in payments, trading, arbitrage, staking, and more. AI Agents are now going through the same wild phase. What we are exploring is how Crypto can better integrate AI and ultimately push the application layer to new heights.
2. How can intelligent agents be autonomous?
In our previous article, we briefly introduced Truth Terminal, the origin of AI Memes, along with our outlook on the future of AI Agents. This article focuses first on the AI Agent itself.
Let's start with the definition of an AI Agent. "Agent" is an old but loosely defined term in the field of AI. It mainly emphasizes autonomy: any AI that can perceive its environment and respond to it can be called an agent. In today's usage, an AI Agent is closer to an intelligent entity, that is, a large model equipped with a system that mimics human decision-making. In academia, such a system is regarded as the most promising path to AGI (artificial general intelligence).
With the early versions of GPT, we could clearly feel that the large model was very human-like, yet on many complex questions it could only give plausible-sounding but wrong answers. The fundamental reason is that the large model of that time was based on probability rather than causality. It also lacked abilities that humans have, such as using tools, remembering, and planning, and an AI Agent can make up for these shortcomings. To summarize in a formula: AI Agent = LLM (large model) + Planning + Memory + Tools.
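As a rough illustration of the formula above, the four components can be wired together as in the sketch below. It is not taken from any particular agent framework: `Agent`, `echo_llm`, `plan_steps`, and `calculator` are hypothetical names, and the "LLM" is a stub function standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: any str -> str function fits)."""
    return f"answer based on: {prompt}"


def plan_steps(task: str) -> List[str]:
    """Toy planner: split a task into sub-steps on the word ' then '."""
    return [step.strip() for step in task.split(" then ") if step.strip()]


def calculator(expr: str) -> str:
    """Hypothetical tool: adds integers written as 'a+b+c'."""
    return str(sum(int(part) for part in expr.split("+")))


@dataclass
class Agent:
    llm: Callable[[str], str]                         # LLM
    planner: Callable[[str], List[str]]               # Planning
    tools: Dict[str, Callable[[str], str]]            # Tools
    memory: List[str] = field(default_factory=list)   # Memory

    def run(self, task: str) -> List[str]:
        results = []
        for step in self.planner(task):
            # A step written as "tool_name: argument" is routed to that tool.
            name, _, arg = step.partition(":")
            if name.strip() in self.tools:
                out = self.tools[name.strip()](arg.strip())
            else:
                # Otherwise ask the model, passing earlier steps as context.
                context = " | ".join(self.memory)
                out = self.llm(f"{step} [memory: {context}]")
            self.memory.append(f"{step} -> {out}")
            results.append(out)
        return results


agent = Agent(llm=echo_llm, planner=plan_steps, tools={"calculator": calculator})
outputs = agent.run("calculator: 2+3 then summarize the result")
print(outputs)
```

The second step never calls the calculator, but it still sees the first step's result through memory, which is the part a bare prompt-and-response model lacks.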
A prompt-driven large model is more like a static person who only comes to life when we give it input. The goal of an agent is to be a more lifelike person. Today, the agents in crypto circles are mainly fine-tuned models based on Meta's open-source Llama 70B or 405B versions (the two differ in parameter count). They have memory and can call tools through APIs. In other respects, they may still need human help or input (including interaction and collaboration with other agents). This is why the main agents in the circle today still exist on social networks in the form of KOLs. To make an agent more human-like, it needs planning and action capabilities, and chain of thought, a sub-component of planning, is particularly critical.
3. Chain of Thought (CoT)
The concept of Chain of Thought (CoT) first appeared in the paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", published by Google in 2022. The paper showed that a model's reasoning ability can be enhanced by having it generate a series of intermediate reasoning steps, which helps it better understand and solve complex problems.
A typical CoT prompt consists of three parts: an instruction (a clear description of the task), a rationale (the theoretical basis or principle that supports the solution), and examples (concrete demonstrations of solved problems). This structured approach helps the model understand the task requirements and approach the answer step by step through logical reasoning, improving both the efficiency and the accuracy of problem solving. CoT is particularly suitable for tasks that require in-depth analysis and multi-step reasoning, such as solving math problems or writing project reports. For simple tasks, CoT may not bring obvious advantages, but for complex tasks it can significantly improve model performance, reducing the error rate through a step-by-step strategy and raising the quality of the result.
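The three-part structure described above can be sketched as a small prompt builder. This is only an illustration under assumptions: the field labels ("Instruction", "Rationale", "Q", "A"), the `Example` class, and the `build_cot_prompt` function are hypothetical, and the sample question is invented for the demo.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    question: str
    reasoning: str   # intermediate steps shown to the model
    answer: str


def build_cot_prompt(instruction: str, rationale: str,
                     examples: List[Example], question: str) -> str:
    """Assemble instruction + rationale + worked examples + the new question."""
    parts = [f"Instruction: {instruction}", f"Rationale: {rationale}"]
    for ex in examples:
        parts.append(
            f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        )
    # The final answer is left open so the model continues with its own steps.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


prompt = build_cot_prompt(
    instruction="Solve the arithmetic word problem.",
    rationale="Work out each intermediate quantity before stating the total.",
    examples=[Example(
        question="A pool pays 3 tokens a day. How many tokens after 4 days?",
        reasoning="Each day pays 3 tokens and 4 days pass, so 3 * 4 = 12.",
        answer="12",
    )],
    question="A pool pays 5 tokens a day. How many tokens after 6 days?",
)
print(prompt)
```

The worked example carries its reasoning inside the answer, so the model is nudged to write out intermediate steps for the new question instead of jumping straight to a number.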
CoT plays a key role in building AI Agents. An agent needs to understand the information it receives and make reasonable decisions based on it. By providing an orderly way of thinking, CoT helps the agent process and analyze its input and turn the results into concrete action guidelines. Breaking a task into multiple small steps lets the agent consider each decision point carefully, which reduces wrong decisions caused by information overload. This not only makes the agent's decisions more reliable and efficient but also makes the decision-making process more transparent, so the agent's behavior becomes more predictable and traceable and users can more easily understand the basis for its decisions. In interacting with the environment, CoT allows the agent to keep taking in new information and adjust its behavior strategy.
As a strategy, CoT not only improves the reasoning ability of large language models but also plays an important role in building more intelligent and reliable AI Agents. By leveraging CoT, researchers and developers can create systems that adapt better to complex environments and have a high degree of autonomy. In practical applications, its advantages show most clearly on complex tasks: decomposing the task into a series of small steps improves accuracy, makes the model more interpretable and controllable, and makes the entire solution easier to trace and verify.
The core function of CoT is to combine planning, action, and observation, bridging the gap between reasoning and action. This mode of thinking lets an AI Agent prepare countermeasures for anomalies it anticipates, and, while interacting with the external environment, accumulate new information and check its earlier predictions, which in turn provides a new basis for reasoning. CoT works like an engine for accuracy and stability, helping AI Agents stay efficient in complex environments.
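The plan, act, observe cycle described above can be sketched as a simple loop. This is a toy under stated assumptions: the environment is just an integer counter, the planner is a hand-written rule rather than a model, and `run_loop`, `toy_plan`, and `toy_act` are hypothetical names.

```python
from typing import Callable, List, Tuple


def run_loop(goal: int,
             plan: Callable[[int, int], str],
             act: Callable[[int, str], int],
             max_steps: int = 10) -> Tuple[int, List[str]]:
    """Repeat plan -> act -> observe until the planner decides to stop."""
    state = 0
    trace: List[str] = []
    for step in range(max_steps):
        thought = plan(state, goal)      # reasoning: choose the next action
        if thought == "stop":
            break
        state = act(state, thought)      # action: change the environment
        # Observation: record the new state so the next plan can use it.
        trace.append(f"step {step}: {thought} -> observed {state}")
    return state, trace


def toy_plan(state: int, goal: int) -> str:
    """Rule-based stand-in for a model's reasoning step."""
    if state == goal:
        return "stop"
    return "increase" if state < goal else "decrease"


def toy_act(state: int, action: str) -> int:
    """Toy environment: move the counter by one in the chosen direction."""
    return state + 1 if action == "increase" else state - 1


final_state, trace = run_loop(goal=3, plan=toy_plan, act=toy_act)
for line in trace:
    print(line)
```

Each pass feeds the latest observation back into the next planning step, and the trace keeps a record of every decision, which is where the traceability discussed in this section comes from.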
4. Correct pseudo-demand
Which parts of the AI technology stack should Crypto be combined with? In last year's article, I argued that decentralizing computing power and data is a key step in helping small businesses and individual developers cut costs. In the breakdown of Crypto x AI sub-sectors compiled by Coinbase this year, we see a more detailed division:
(1) Computing layer (referring to the network that focuses on providing graphics processing unit (GPU) resources to AI developers);
(2) Data layer (referring to the network that supports decentralized access, orchestration, and verification of AI data pipelines);
(3) Middleware layer (referring to the platform or network that supports the development, deployment, and hosting of AI models or agents);
(4) Application layer (referring to user-oriented products that utilize on-chain AI mechanisms, whether B2B or B2C).
Each of these four layers has a grand vision, and the goal of each is to challenge the Silicon Valley giants who would dominate the next era of the Internet. As I said last year, do we really have to accept the Silicon Valley giants' exclusive control of computing power and data? The closed-source large models under their monopoly are black boxes. Science is the most widely believed religion of mankind today. In the future, every sentence a large model produces will be taken as truth by a great many people, but how can that truth be verified? In the vision of the Silicon Valley giants, agents will ultimately hold authority beyond imagination, such as the right to pay from your wallet and the right to use your terminal. How can we ensure that people have no evil intentions?
Decentralization is the only answer, but we also need to look at the whole picture: how many people will actually pay for these grand visions? In the past, we could use tokens to paper over the flaws of idealism without worrying about whether the business closed its loop. The current situation is far more serious, and Crypto x AI has to be designed around reality. For example, how can the computing layer balance supply on both sides, given performance loss and instability, so that it can compete with centralized clouds? How many real users will data layer projects have? How can the authenticity and validity of the data provided be verified? What kind of customers need this data? The same questions apply to the other two layers. In this era, we don't need so many seemingly correct pseudo-demands.
5. Meme went beyond SocialFi
As I said in the first section, Meme has taken a very fast route to a form of SocialFi that suits Web3. Friend.tech was the Dapp that fired the first shot in this round of social applications, but unfortunately it failed because of its hasty token design. Pump.fun has verified the feasibility of pure platformization, with no token and no rules. Those who demand attention and those who supply it are brought together: you can post memes, livestream, issue coins, leave comments, and trade on the platform. Everything is free, and Pump.fun only charges a service fee. This is basically the same attention economy model as today's social media such as YouTube and Instagram, except that different parties are charged, and Pump.fun's gameplay is more Web3.
Base's Clanker brings all of this together. Thanks to the integrated ecosystem Base has built for itself, with its own social Dapp in a supporting role, it forms a complete internal closed loop. Intelligent agent Memes are the 2.0 form of Meme Coins. People always seek novelty, and Pump.fun is now in the eye of the storm. Judging by the trend, it is only a matter of time before the fantasies of silicon-based organisms replace the vulgar jokes of carbon-based ones.
I have mentioned Base countless times, each time for a different reason. Looking at the timeline, Base has never been the first mover, but it always ends up the winner.
6. What else can an intelligent agent be?
From a pragmatic point of view, agents cannot be decentralized for a long time to come. Judging by how agents are built in the traditional AI field, this is not a problem that can be solved simply by decentralizing inference and open-sourcing the model. An agent needs access to various APIs to reach Web2 content, and it is very expensive to run. The design of the chain of thought and the collaboration among multiple agents usually still rely on a human as the intermediary. We will go through a very long transition period until a suitable form of fusion emerges, perhaps something like UNI. But as in the previous article, I still think agents will have a great impact on our industry, much like CEXs: not "correct", but very important.
The article AI Agent Overview, released by Stanford and Microsoft last month, describes in detail the applications of agents in healthcare, intelligent machines, and virtual worlds. Its appendix already contains many experimental cases of GPT-4V taking part as an agent in the development of top AAA games.
There is no need to be too demanding about how quickly agents merge with decentralization. I hope the first piece of the puzzle that agents fill in is bottom-up capability and speed. We have so many narrative ruins and empty metaverses waiting to be filled. At the appropriate stage, we can then consider how to make it the next UNI.
References
What kind of ability is the chain of thinking that emerges from the big model? Author: Brain Pole
Understanding Agent in one article, the next stop of big models Author: LinguaMind