DeepSeek tests “sparse attention” to slash AI processing costs

Ever wonder why ChatGPT slows down during long conversations? The culprit is a fundamental mathematical challenge: the cost of the Transformer attention mechanism grows quadratically with the length of the text, so processing long sequences requires massive computational resources, even with the efficiency tricks companies have already deployed. While US tech giants can afford to throw more hardware at the problem, Chinese AI company DeepSeek, cut off from a steady supply of some advanced AI chips by export restrictions, has extra motivation to squeeze more performance from less silicon.

On Monday, DeepSeek released an experimental version of its latest simulated reasoning language model, DeepSeek-V3.2-Exp, which introduces what it calls “DeepSeek Sparse Attention” (DSA). It’s the company’s implementation of a computational technique likely already used in some of the world’s most prominent AI models. OpenAI pioneered sparse transformers in 2019 and used the technique to build GPT-3, while Google Research published work on “Reformer” models using similar concepts in 2020. (The full extent to which Western AI companies currently use sparse attention in their latest models remains undisclosed.)

Despite sparse attention being a known approach for years, DeepSeek claims its version achieves “fine-grained sparse attention for the first time” and has cut API prices by 50 percent to demonstrate the efficiency gains. But to understand more about what makes DeepSeek v3.2 notable, it’s useful to refresh yourself on a little AI history.
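To see why sparsity helps, consider standard attention: every token scores itself against every other token, so doubling the context length quadruples the compute. Sparse attention restricts each token to a subset of positions. The toy sketch below implements one classic sparse pattern, a sliding window, purely for illustration; it is not DeepSeek's DSA, whose fine-grained token-selection mechanism is different and more sophisticated.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def windowed_attention(queries, keys, values, window):
    """Sliding-window sparse attention: token i attends only to positions
    within `window` of i, so work is O(n * window) instead of O(n^2)."""
    n, dim = len(queries), len(values[0])
    output = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Dot-product scores against the local window only.
        scores = [sum(q * k for q, k in zip(queries[i], keys[j]))
                  for j in range(lo, hi)]
        weights = softmax(scores)
        # Weighted sum of the windowed value vectors.
        output.append([
            sum(w * values[j][d] for w, j in zip(weights, range(lo, hi)))
            for d in range(dim)
        ])
    return output
```

With `window=0` each token attends only to itself and returns its own value vector unchanged; with `window >= n` the pattern degenerates to full (dense) attention, making the cost trade-off easy to see.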
