• queermunist she/her@lemmy.ml
    6 days ago

    DeepSeek showed there is potential in abandoning the AGI pathway (which is impossible with LLMs) and instead training lots of different specialized models that can be switched between for different tasks (at least, that’s how I understand it).

    So I’m not going to assume LLMs will hit a wall, but it’s going to require another paradigm shift, and we just aren’t seeing that from the current crop of developers.
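The "many specialized models, switched between per task" idea above can be sketched minimally like this. Everything here is a hypothetical illustration: the model names, the keyword router, and the stub callables are all stand-ins (a real system would use a learned classifier and actual trained models).

```python
# Hypothetical sketch: route each prompt to a specialized model.
# Names and routing rules are illustrative, not any real system's API.

def classify_task(prompt: str) -> str:
    """Naive keyword-based router; a real router would be a learned classifier."""
    text = prompt.lower()
    if any(word in text for word in ("code", "function", "bug")):
        return "code-model"
    if any(word in text for word in ("equation", "integral", "prove")):
        return "math-model"
    return "general-model"

def route(prompt: str, models: dict) -> str:
    """Dispatch the prompt to whichever specialized model the router picks."""
    return models[classify_task(prompt)](prompt)

# Stub callables standing in for separately trained specialized models.
models = {
    "code-model": lambda p: "[code-model answer]",
    "math-model": lambda p: "[math-model answer]",
    "general-model": lambda p: "[general-model answer]",
}

print(route("solve this equation for x", models))  # → [math-model answer]
```

The appeal of this design is that each specialized model can be much smaller than one giant generalist, so only the relevant one runs per request.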

    • Voroxpete@sh.itjust.works
      6 days ago

      Yes, but the basic problem doesn’t change: you’re spending billions to make millions. And DeepSeek’s approach only works because they’re able to essentially distill the output of less efficient models like Llama and GPT. So they haven’t actually solved the underlying technical issues; they’ve just found a way to break into the industry as a smaller player.

      At the end of the day, the problem is not that you can’t ever make something useful with transformer models; it’s that you cannot make that useful thing in a way that is cost effective. That’s especially a problem if you expect big companies like Microsoft or OpenAI to continue to offer these services at an affordable price. Yes, Copilot can help you code, but that’s worth Jack shit if the only way for Microsoft to recoup their investment is by charging $200 a month for it.

      • jumping_redditor@sh.itjust.works
        6 days ago

        AI has a large initial cost, but older models will continue to exist, and open-source models will continue to take potential profit from the corps.

        • Voroxpete@sh.itjust.works
          6 days ago

          It does have a large initial cost. It also has a large ongoing cost. GPU time is really, really pricey.

          Even putting aside training and infrastructure, OpenAI still loses money on even its most expensive paid subscribers. While companies like DeepSeek have shown ways of reducing those costs, the savings still aren’t enough to make these models profitable to run at the workloads they’re intended to handle, and attempts to reduce their fallibility make them even more expensive, because they basically just involve running the model multiple times over.
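The "running the model multiple times over" approach can be sketched as repeated sampling with a majority vote (often called self-consistency). This is a hedged illustration: `call_model` is a hypothetical stand-in for a real (expensive) LLM API call, not any vendor's actual interface.

```python
# Sketch of reducing fallibility by sampling n answers and keeping
# the most common one. Each vote is a full forward pass, so serving
# cost scales linearly with n — the cost multiplier described above.

from collections import Counter

def call_model(prompt: str) -> str:
    """Hypothetical placeholder for one expensive model invocation."""
    return "42"

def self_consistency(prompt: str, n: int = 5) -> str:
    """Run the model n times and return the majority answer."""
    answers = [call_model(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # → 42
```

With n samples per request, inference spend is multiplied by n, which is why reliability tricks like this make an already unprofitable service even more expensive to run.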

    • skulblaka@sh.itjust.works
      6 days ago

      That was pretty much always the only viable path forward for LLM-type AIs. It’s an extension of the same machine learning technology we’ve been building up since the 1950s.

      Everyone trying to approximate an AGI with it has been wasting their time and money.