By now, many of you have likely encountered large language models, whether by running the software yourself, subscribing to an online service, or trying one of the free or beta offerings available. These models are typically vast, containing billions of parameters and trained on extensive unstructured language data. In the AI industry, parameter count often correlates with a model’s accuracy, since more data and more parameters broaden the range of information the model can generate or recall. However, a significant issue persists: hallucinations.

The Challenge of Hallucinations

Generative AI models, which include large language models, take a prompt and generate a response probabilistically, choosing each piece of output according to how likely the model judges it to be. While these models can provide detailed information, they often produce erroneous outputs, a phenomenon known as hallucination. These errors can stem from various sources, such as incorrect factual information embedded in the training data or the probabilistic nature of the model’s response generation.
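
To make the probabilistic point concrete, here is a minimal sketch of how sampling alone can produce a wrong answer. It is a toy, not how any particular model is implemented: the candidate answers, the scores (`logits`), and the function names are all made up for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(candidates, logits, temperature=1.0, rng=random):
    """Pick one candidate at random, weighted by its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical candidates for completing "The transistor was invented in ..."
# The scores are invented; the first candidate is the correct one.
candidates = ["1947", "1948", "1951"]
logits = [2.0, 1.2, 0.3]

probs = softmax(logits)
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_token(candidates, logits, rng=rng) for _ in range(1000)]
wrong = sum(1 for d in draws if d != "1947")

print(f"P(correct) = {probs[0]:.2f}, wrong answers in 1000 draws: {wrong}")
```

Even though the correct answer has the highest probability here, a share of the draws still land on a wrong year. Lowering `temperature` concentrates probability on the top candidate, which reduces this kind of error but does not help when the model's top candidate is itself wrong.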

One issue is the models’ difficulty in maintaining factual accuracy. For instance, when asked about a popular figure’s birthday, a general model gave the correct year but the wrong date, citing instead the date associated with the invention of the transistor. This is a common kind of error, attributed to how information is arranged in the model’s embedding space.

Another issue is the nature of the training data, which often includes public, sometimes contradictory, information. These models are designed to generate answers, whether correct or not, unless explicitly programmed to avoid certain topics.

Addressing Hallucinations

Several methods have been developed to mitigate hallucinations in AI models:

  1. Domain-Specific Models: Training models exclusively on relevant data can improve accuracy within a specific field, though they may struggle with generalization.
  2. Co-Prompting: This technique involves pairing user prompts with relevant, accurate background information, though it increases computational requirements significantly.
  3. Fine-Tuning: Starting with a general model and refining it with curated data can enhance accuracy, but this process can be computationally intensive.
  4. Retrieval Augmented Generation (RAG): This method lets a model consult a validated database while generating a response, though how well it performs varies.
  5. Mixture of Experts (MoE): Using multiple smaller models, each optimized for specific tasks, can improve accuracy and performance, as seen with Mixtral 8x7B.
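
The idea shared by co-prompting and RAG, pairing the user's question with trusted background facts before the model sees it, can be sketched in a few lines. This is only an illustration under simple assumptions: the fact store, the product name, and the helper functions are hypothetical, and retrieval is done by plain word overlap rather than the embedding search a real system would likely use.

```python
import re

# Hypothetical validated fact store; a real system would query a vetted database.
FACTS = [
    "The return window for the X100 router is 30 days.",
    "The X100 router supports firmware updates over USB.",
    "Support hours are 9am to 5pm on weekdays.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, facts, top_k=2):
    """Return up to top_k facts that share at least one word with the question."""
    q = tokenize(question)
    ranked = sorted(facts, key=lambda f: len(q & tokenize(f)), reverse=True)
    return [f for f in ranked[:top_k] if q & tokenize(f)]

def build_prompt(question, facts):
    """Prepend the retrieved facts to the question as grounding context."""
    context = "\n".join(f"- {f}" for f in retrieve(question, facts))
    return (
        "Answer using only the facts below. "
        "If they do not cover the question, say you do not know.\n"
        f"Facts:\n{context}\n"
        f"Question: {question}"
    )

question = "What is the return window for the X100 router?"
prompt = build_prompt(question, FACTS)
print(prompt)
```

The resulting prompt is longer than the bare question, which is where the extra computational cost of these approaches comes from: every retrieved fact is more text the model has to process.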

A New Approach: Memory Tuning

A recent paper introduced a technique called Memory Tuning, which embeds specific data into models efficiently. The method builds on the concept of MoE, using adapters that are tuned to curated data at a much higher rate than in traditional fine-tuning. The resulting approach, named Mixture of Memory Experts (MoME), enables near-perfect recall of specific information without significantly impacting the model’s general reasoning capabilities.
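
For readers unfamiliar with the MoE idea that this builds on, the following is a rough sketch of generic top-k expert routing: a query is matched against a set of experts, and only the best matches contribute a correction to the base model's output. It is not the paper's implementation. The vectors, the dot-product scoring, and the function names are all assumptions chosen to keep the example small.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k_route(query, expert_keys, k=2):
    """Return (expert_index, weight) pairs for the k best-matching experts."""
    scores = [dot(query, key) for key in expert_keys]
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in best)
    exps = [math.exp(scores[i] - peak) for i in best]  # softmax over the winners only
    total = sum(exps)
    return [(i, e / total) for i, e in zip(best, exps)]

def forward(query, base_output, expert_keys, expert_deltas, k=2):
    """Add the weighted corrections of the selected experts to the base output."""
    output = list(base_output)
    for i, weight in top_k_route(query, expert_keys, k):
        for j, delta in enumerate(expert_deltas[i]):
            output[j] += weight * delta
    return output

# Hypothetical toy values: four experts, three-dimensional vectors.
expert_keys = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
expert_deltas = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5], [0.2, 0.2, 0]]
query = [0.9, 0.1, 0.0]
base_output = [1.0, 1.0, 1.0]

routes = top_k_route(query, expert_keys)
result = forward(query, base_output, expert_keys, expert_deltas)
print("selected experts:", routes)
print("adjusted output:", result)
```

The point of the design is that unselected experts cost nothing at inference time, so specialized knowledge can be added in small pieces without retraining or re-running the whole model.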

Memory Tuning allows models to embed hard facts, significantly reducing hallucinations. This technique is particularly effective for models with a few billion parameters, making it suitable for various applications, from product support to language models for coding.

Future Implications

The shift from convolutional neural networks (CNNs) to transformers revolutionized AI, and Memory Tuning could represent a similar leap. By optimizing specific data areas without overhauling entire embedding tables, Memory Tuning offers a cost-effective and computationally efficient solution to hallucinations. The impact on inference costs and hardware requirements will be an area of ongoing research, with the potential to drive significant changes in the AI landscape.


Memory Tuning presents a promising solution to the persistent problem of hallucinations in AI models. By embedding hard facts and optimizing specific data areas, this technique enhances model accuracy and reliability. As AI continues to evolve, innovative methods like Memory Tuning will play a crucial role in advancing the technology and its applications.

Author Jason McArdle

Braincloud Group
