Introduction

The era of large language models (LLMs) has undeniably arrived, with widespread adoption by individuals around the world. Although these models distill high-quality general knowledge, their generality means that prompt engineering is required to meet specific user needs. Synchronizing a user's context with the model is cumbersome, and despite extensive training on general data, LLMs struggle to adopt an individual user's perspective. Even after significant effort to align contexts, advanced long-context models often fail to perform user-specific tasks accurately. Our experiments show that extracting relevant information from long contexts and performing even simple reasoning over it is nearly impossible for current models, yet the demand for personalized solutions continues to grow.

To address these challenges, we propose designing Artificial General Intelligence (AGI) systems with LLMs at their core. Such a system would store not only raw data but also significant conclusions derived through reasoning, linking semantically related information to simplify complex queries. As an intermediate stage, memory may be represented as natural-language descriptions. Ultimately, everyone should have their own Lifelong Personal Model (LPM), envisioned as a memory pearl that encapsulates and compresses all types of personal memories.

Insufficient effective long context length

Although many claim that long-context LLMs are the solution to everything, our reasoning-in-a-haystack experiments confirm that even the most advanced models, such as GPT-4o and GPT-4-turbo, find it nearly impossible to extract relevant information from long contexts and perform simple reasoning over it (the dark bottom-right corners of each sub-block in Fig. 1).

Fig. 1: Scores of different models (GPT-4o, GPT-4-turbo, GPT-3.5-turbo) for responses under various context lengths and different levels of question difficulty (1-step reasoning, 2-step reasoning, 3-step reasoning).
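For concreteness, below is a minimal sketch of how such a reasoning-in-a-haystack probe can be constructed: a chain of k linked facts is scattered through filler text, and the model must follow the whole chain to answer. The fact templates, token estimate, and helper names are illustrative, not the exact protocol behind Fig. 1.

```python
import random
from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

FILLER = "The sky was clear and the market was quiet that day. "

def build_haystack(context_tokens: int, k: int):
    """Scatter a k-hop chain of facts through filler text."""
    entities = [f"entity_{i}" for i in range(k + 1)]
    facts = [f"{a}'s contact is {b}." for a, b in zip(entities, entities[1:])]
    haystack = [FILLER] * (context_tokens // 13)  # ~13 tokens per filler sentence (crude)
    for fact in facts:  # bury each fact at a random position
        haystack.insert(random.randrange(len(haystack)), fact + " ")
    question = (f"Starting from {entities[0]}, follow the 'contact' relation "
                f"{k} times. Who do you reach?")
    return "".join(haystack), question, entities[-1]

context, question, expected = build_haystack(context_tokens=32_000, k=3)
client = OpenAI()
reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": f"{context}\n\n{question}"}],
)
print("expected:", expected, "| got:", reply.choices[0].message.content)
```

Sweeping `context_tokens` and `k` yields the grid of scores visualized in Fig. 1: accuracy degrades as either the context grows or the number of reasoning hops increases.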

AI-Native Memory System

Based on the conclusions from our experiments and related research, we propose that AGI should comprise three levels: an L0, L1, and L2 system. The L0 system uses raw data itself as memory, with an architecture comparable to Retrieval-Augmented Generation (RAG). In the L1 system, memories are stored as natural language, including reasoning over, summarization of, and induction from single- and multi-turn dialogues.
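As a rough illustration of the L0 level, the sketch below retrieves raw user records by embedding similarity, RAG-style. The records, the embedding model, and the helper name are placeholders rather than a prescribed stack.

```python
# A bare-bones L0 memory: raw user records retrieved RAG-style by embedding
# similarity. A sketch only; the specific components are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer  # example embedding model

records = [
    "2024-03-02: Discussed moving the team offsite to Kyoto.",
    "2024-03-09: Prefers morning meetings; blocks afternoons for writing.",
    "2024-04-01: Asked for vegetarian options at the offsite.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
index = encoder.encode(records, normalize_embeddings=True)

def l0_retrieve(query: str, k: int = 2) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = index @ q  # cosine similarity, since vectors are unit-normalized
    return [records[i] for i in np.argsort(-scores)[:k]]

print(l0_retrieve("What does the user eat?"))
```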

In the L2 system, the LLM serves as the core processor, with the LLM's context acting as RAM and the memory system functioning as hard-disk storage. Unlike RAG, which handles only raw data, this memory system not only tightly connects semantically related information but also simplifies complex reasoning at query time. The memories may exist as natural-language descriptions that users can consume directly. The parameterized L2 LPM can retrieve all types of L1 data through prompts, eliminating the need for a complex intermediate framework in this end-to-end solution.
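The processor/RAM/disk analogy can be made concrete as a query path: consult the parameterized L2 model first, and page L1/L0 memories into the context window only when needed. All interfaces in this sketch are assumptions, not the system's actual API.

```python
from typing import Callable

def answer(query: str,
           lpm: Callable[[str], str],          # L2: the fine-tuned personal model
           l1_lookup: Callable[[str], str],    # L1: natural-language memories
           l0_retrieve: Callable[[str], str],  # L0: raw-record RAG
           confident: Callable[[str], bool]) -> str:
    draft = lpm(query)  # the "CPU" consults its own weights first
    if confident(draft):
        return draft
    # Otherwise page memories from "disk" (L1/L0 stores) into "RAM" (the context).
    context = l1_lookup(query) or l0_retrieve(query)
    return lpm(f"Context:\n{context}\n\nQuestion: {query}")
```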

Fig. 2: Overview of the AI-Native Memory System

Diving into the AI-Native Memory System

L1 Memory: Natural language memory

In L1, memory integrates various natural-language elements together with multimodal data such as images, audio, and sensor signals. Key components include user bios, interests, preferences, and social connections. These elements are organized by granularity, ranging from fine-grained tags to high-level summaries.
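One plausible shape for such entries, with the granularity levels made explicit, is sketched below; the field and type names are ours, not the paper's.

```python
# An illustrative schema for L1 memory entries (assumed, not prescribed).
from dataclasses import dataclass, field
from enum import Enum

class Granularity(Enum):
    TAG = "tag"              # e.g. "vegetarian"
    STATEMENT = "statement"  # e.g. "Prefers morning meetings."
    SUMMARY = "summary"      # e.g. a paragraph-level bio section

@dataclass
class L1Memory:
    text: str         # natural-language content
    kind: str         # "bio" | "interest" | "preference" | "social"
    granularity: Granularity
    sources: list[str] = field(default_factory=list)  # L0 records it was induced from

profile = [
    L1Memory("vegetarian", "preference", Granularity.TAG, ["2024-04-01"]),
    L1Memory("Works closely with Bob on the offsite plan.", "social",
             Granularity.STATEMENT, ["2024-03-02"]),
]
```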

Memory facilitates advanced inference and reasoning by extracting patterns from both single-session and cross-session interactions. This enables the identification of trends and the provision of personalized insights, enhancing the overall user experience.

L2 System with AI-Native Memory

In the L2 stage, memory evolves into a neural-network model, becoming "AI-native" memory. This L2 Lifelong Personal Model (LPM) encodes all of a user's memories, functioning as a personalized world model. It predicts user behavior and offers suggestions based on historical data, much like an auto-complete function. Unlike L1, L2 captures subtler patterns that elude natural-language definition, offering an end-to-end solution. By prompting the L2 model, all L1-defined information remains accessible.
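As a hedged illustration of the auto-complete analogy, the snippet below prompts a (hypothetical) fine-tuned personal checkpoint with recent memories and lets it predict the user's next request; the model name and prompt format are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "your-org/lpm-user-123" is a hypothetical per-user checkpoint, not a real release.
tok = AutoTokenizer.from_pretrained("your-org/lpm-user-123")
model = AutoModelForCausalLM.from_pretrained("your-org/lpm-user-123")

prompt = ("Recent memories:\n"
          "- Booked flights to Kyoto for the offsite.\n"
          "- Prefers morning meetings.\n"
          "Predict the user's next request:")
ids = tok(prompt, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=40, do_sample=False)
print(tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True))
```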

Data and model security are crucial: the LPM protects user information by keeping each user's history separate and training on it independently.

Specifically, our pipeline consists of Multi-Scale Data Augmentation, Hybrid Model Training & Inference, and Automated Evaluation. Multi-Scale Data Augmentation covers the data-synthesis process: because user records are typically sparse, a sound data-synthesis strategy is needed. We therefore designed several synthesis directions, including Me-following, Time-sensitive, Relationship, and Prediction-enhanced schemes.
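One way to realize the four directions is as prompt templates sent to a teacher model, as in the sketch below; the exact prompts behind the pipeline are not public, so these wordings are illustrative.

```python
# Illustrative prompt templates for the four synthesis directions.
SYNTHESIS_TEMPLATES = {
    "me_following":        "From these records, write Q&A pairs about the user's own habits:\n{records}",
    "time_sensitive":      "Write Q&A pairs whose answers depend on dates and ordering in:\n{records}",
    "relationship":        "Write Q&A pairs about people and relations mentioned in:\n{records}",
    "prediction_enhanced": "Write Q&A pairs asking what the user is likely to do next, given:\n{records}",
}

def synthesize(records: list[str], teacher) -> list[dict]:
    """Expand sparse user records into training pairs along all four axes.

    `teacher` is any LLM callable mapping a prompt string to generated text.
    """
    joined = "\n".join(records)
    return [
        {"scheme": scheme, "data": teacher(template.format(records=joined))}
        for scheme, template in SYNTHESIS_TEMPLATES.items()
    ]
```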

After data synthesis, the data is refined and filtered by a Teacher Model. Post-filtering, we apply the LLM + LoRA approach for Parameter-Efficient Fine-Tuning (PEFT). Given training costs, we interleave this approach with RAG + long-context schemes on a fixed cadence, so that users see immediate experience upgrades between fine-tuning cycles.
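For the PEFT step, a minimal LLM + LoRA setup with the Hugging Face peft library might look like the following; the base model and hyperparameters below are placeholders rather than the exact configuration used in our experiments.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")  # example base
config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adapt attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of base weights
```

Because only the low-rank adapters are trained, a per-user model stays cheap to fine-tune and store, which is what makes the one-LPM-per-user design tractable.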

To better train and evaluate the model, we decompose the capabilities of personalized large models into four dimensions: memorization, understanding, prediction, and recommendation. Automated labeling is conducted with larger-scale models, and we plan to train a Preference Model (Reward Model) on the accumulated data to further improve Reinforcement Learning from Personality Feedback (RLPF).
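Automated evaluation along the four axes can be implemented as LLM-as-judge scoring with a larger model, as sketched below; the rubric wording and judge model are illustrative.

```python
from openai import OpenAI

DIMENSIONS = ["memorization", "understanding", "prediction", "recommendation"]
client = OpenAI()

def judge(question: str, reference: str, answer: str) -> dict[str, int]:
    """Score an answer 1-5 on each capability axis with a larger judge model."""
    scores = {}
    for dim in DIMENSIONS:
        rubric = (f"Rate the ANSWER's {dim} quality against the REFERENCE "
                  f"on a 1-5 scale. Reply with a single digit.\n"
                  f"QUESTION: {question}\nREFERENCE: {reference}\nANSWER: {answer}")
        reply = client.chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": rubric}],
        )
        # Assumes the judge complies with the single-digit format.
        scores[dim] = int(reply.choices[0].message.content.strip()[0])
    return scores
```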

Fig. 3: LPM Framework

With a group of volunteer users, we compared the effectiveness of the LPM against ultra-long-context large models and the current best RAG method. The final results are shown in the table below:

As can be seen, the LPM achieves the best results at a lower inference cost. We can reasonably anticipate the day when the LPM is integrated into the entire AGI system as AI-Native Memory.

Summary

We conducted focused research on AI-Native Memory in the context of the Me.bot scenario, supported by volunteer users. We developed a pipeline for Lifelong Personal Models (LPM) and compared it across various scenarios, achieving strong results that demonstrate both the potential and the necessity of the LPM as the AI-Native Memory within an AGI system. Moving forward, we will open the LPM to more users while ensuring privacy, and, as it matures, connect users to numerous third-party platforms and services to provide the most personalized and efficient experience.

In our vision, memory is strongly associated with the user and remains neutral towards specific applications. We believe future AGI agents will first query AI-Native Memory to see whether it can provide the necessary information; if not, AI-Native Memory will interact with the actual user to gather more. Thus, AI-Native Memory will sit at the core of all interaction and personalization between users and AGI agents. Importantly, this personalization is not merely traditional content recommendation, but a service that marks the starting point of the AI journey.

More details: https://arxiv.org/pdf/2406.18312