• NutWrench@lemmy.ml · 7 points · 2 days ago

    The “1 trillion” never existed in the first place. It was all hype by a bunch of Tech-Bros, huffing each other’s farts.

        • boomzilla@programming.dev · 5 points · 2 days ago

          I watched one video and read two pages of text, so take this with a mountain of salt. From that I gathered that DeepSeek R1 is the model you interact with when you use the app. The complexity of a model is expressed as its number of parameters (though I don’t know yet what those are), which dictate its hardware requirements. R1 contains 671 bn parameters and requires very, very beefy server hardware; one video said it would take tens of GPUs. And it seems you want a lot of VRAM on your GPU(s), because that’s what AI craves. I’ve also read that 1 bn parameters require about 2 GB of VRAM.
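That 2 GB-per-billion rule of thumb comes from storing weights at 16-bit precision, i.e. two bytes per parameter. A back-of-the-envelope sketch in Python (illustrative only; it ignores activations, context cache, and runtime overhead, so real usage is higher):

```python
def weight_vram_gb(params_bn: float, bytes_per_param: float = 2.0) -> float:
    """Rough VRAM needed just to hold a model's weights.

    fp16/bf16 weights take 2 bytes per parameter; quantized models
    (e.g. 4-bit) take proportionally less.
    """
    return params_bn * bytes_per_param  # bn params x bytes/param = GB

print(weight_vram_gb(1))       # ~2 GB for a 1 bn parameter model
print(weight_vram_gb(3))       # ~6 GB for a 3 bn model at fp16
print(weight_vram_gb(3, 0.5))  # ~1.5 GB for the same model at 4-bit
print(weight_vram_gb(671))     # ~1342 GB for full DeepSeek R1
```

Which is why a quantized 3 bn model is about the biggest that sits comfortably on a 6 GB card, while R1 proper needs a rack of server GPUs.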

          Got a 6-core Intel, a 1060 with 6 GB VRAM, 16 GB RAM, and EndeavourOS as a home server.

          I just installed Ollama in about half an hour, using Docker on the above machine, with no previous experience with neural nets or LLMs apart from chatting with ChatGPT. The installation includes Open WebUI, which seems better than the default UI you get at ChatGPT. I downloaded the qwen2.5:3b model (see https://ollama.com/search), which contains 3 bn parameters. I was blown away by the result. It speaks multiple languages (including displaying e.g. hiragana), knows how many fingers a human has, can calculate, can write valid Rust code and explain it, and it is much faster than what I get from free ChatGPT.

          The WebUI offers a nice feedback form for every answer where you can give the AI hints via text, a 1–10 score rating, and thumbs up/down. I don’t know how it incorporates that feedback, though. The WebUI seems to support speech-to-text and vice versa. I’m eager to see if this Docker setup even offers APIs.
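For what it’s worth, the stock Ollama container does expose an HTTP API on port 11434, separate from the WebUI. A minimal sketch using only the Python standard library; the host, port, and model tag are assumptions based on the setup described above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for one JSON reply instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    """POST a prompt to the local Ollama server and return its reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# With the container running, e.g.:
#   ask("qwen2.5:3b", "How many fingers does a human have?")
```

Open WebUI additionally offers an OpenAI-compatible endpoint, so most existing client libraries can point at it as well.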

          I probably won’t be using the proprietary stuff anytime soon.

          • tooclose104@lemmy.ca · 4 points · 3 days ago

            Apparently phone too! Like three cards down was another post linking to instructions on how to run it locally on a phone, in a container app or Termux. Really interesting. I may try it out in a VM on my server.

      • Mongostein@lemmy.ca · +6/−6 · 3 days ago

        Yeah, but you have to run a different model if you want accurate info about China.

        • Alsephina@lemmy.ml · 2 points · 21 hours ago

          Unfortunately it’s trained on the same US-propaganda-filled English data as any other LLM and spits out the same talking points. The censors are easy to bypass too.

        • Phoenicianpirate@lemm.ee · 5 points · 2 days ago

          Yeah, but China isn’t my main concern right now. I’ve got plenty of questions to ask and knowledge to seek, and I’d rather not be broadcasting that stuff to a bunch of busybody jackasses.

          • Mongostein@lemmy.ca · +1/−2 · 2 days ago

            I agree. I don’t know enough about all the different models, but surely there’s a model that’s not going to tell you “<whoever’s> government is so awesome” when asking about rainfall or some shit.

      • vga@sopuli.xyz · 2 points · 2 days ago

        They’d need to do some pretty fucking advanced hackery to be able to do surveillance on you just via the model. Everything’s possible I guess, but … yeah perhaps not.

        If they could do that, essentially nothing you do on your computer would be safe.

    • Cowbee [he/they]@lemmy.ml · 18 points · 3 days ago

      On the brightside, the clear fragility and lack of direct connection to real productive forces shows the instability of the present system.

      • leftytighty@slrpnk.net · +10/−1 · 3 days ago

        And no matter how many protectionist measures that the US implements we’re seeing that they’re losing the global competition. I guess protectionism and oligarchy aren’t the best ways to accomplish the stated goals of a capitalist economy. How soon before China is leading in every industry?

        • Cowbee [he/they]@lemmy.ml · +12/−1 · 3 days ago

          This conclusion was foregone when China began to focus on developing the Productive Forces and the US took that for granted. Without a hard pivot, the US can’t even hope to catch up to China’s productive trajectory, and even a hard pivot wouldn’t guarantee them a chance in the first place.

          In fact, protectionism has frequently backfired, sending other nations to seek inclusion in BRICS or more favorable relations with BRICS nations.

  • SocialMediaRefugee@lemmy.ml · 40 points · 3 days ago

    This just shows how speculative the whole AI obsession has been. Wildly unstable and subject to huge shifts since its value isn’t based on anything solid.

  • Arehandoro@lemmy.ml · +41/−2 · 3 days ago

    Nvidia’s most advanced chips, H100s, have been banned from export to China since September 2022 by US sanctions. Nvidia then developed the less powerful H800 chips for the Chinese market, although they were also banned from export to China last October.

    I love how in the US they talk about meritocracy, competition being good, blablabla… but they rig the game from the beginning. And even so, people find a way to be better. Fascinating.

    • shawn1122@lemm.ee · 20 points · 3 days ago

      You’re watching an empire in decline. Its words stopped matching its actions decades ago.

    • Breve@pawb.social · 11 points · 3 days ago

      Don’t forget about the tariffs too! The US economy is actually a joke that can’t compete on the world stage anymore, except by wielding the enormous capital of a handful of tech billionaires.

  • toothbrush@lemmy.blahaj.zone · 86 points · 4 days ago

    One of those rare lucid moments by the stock market? Is this the market correction that everyone knew was coming, or is some famous techbro going to technobabble some more about AI overlords and they return to their fantasy values?

    • themoonisacheese@sh.itjust.works · +65/−1 · 4 days ago

      It’s quite lucid. The new thing uses a fraction of the compute of the old thing for the same results, so Nvidia cards, for example, are going to be in far less demand. That said, Nvidia stock has been riding way too high on the AI hype for the last two years or so, and despite the plunge it’s still not back to normal.

      • davel [he/him]@lemmy.ml · 6 points · 4 days ago

        If AI is cheaper, then we may use even more of it, and that would soak up at least some of the slack, though I have no idea how much.

          • Zaktor@sopuli.xyz · +7/−1 · 4 days ago

            And the data is not available. Knowing the weights of a model doesn’t really tell us much about its training costs.

  • Clent@lemmy.dbzer0.com · 18 points · 3 days ago

    No surprise. American companies are chasing fantasies of general intelligence rather than optimizing for today’s reality.

    • Naia@lemmy.blahaj.zone · 18 points · 3 days ago

      That, and they are just brute-forcing the problem. Neural nets have been around forever, but it’s only in the last five or so years that they could do anything. There’s been little to no real breakthrough innovation; they just keep throwing more processing power at it: more inputs, more layers, more nodes, more links, more CUDA.

      And their chase for general AI just reflects their short-sighted wish to replace workers with something they don’t have to pay and that won’t argue about its rights.

      • supersquirrel@sopuli.xyz · 2 points · 3 days ago

        Also, all of these technologies forever and inescapably rely on a foundation of trust with users and with the people who are the sources of quality training data; “trust” being something US tech companies seem hell-bent on lighting on fire and pissing off the decks of their CEOs’ yachts.

    • Zink@programming.dev · 2 points · 3 days ago

      I don’t have one to cancel, but I might celebrate today by formatting the old windows SSD in my system and using it for some fast download cache space or something.

  • protist@mander.xyz · 63 points · 4 days ago

    Emergence of DeepSeek raises doubts about sustainability of western artificial intelligence boom

    Is the “emergence of DeepSeek” really what raised doubts? Are we really sure there haven’t been lots of doubts raised previous to this? Doubts raised by intelligent people who know what they’re talking about?

    • floofloof@lemmy.ca · 25 points · 3 days ago

      Ah, but those “intelligent” people cannot be very intelligent if they are not billionaires. After all, the AI companies know exactly how to assess intelligence:

      Microsoft and OpenAI have a very specific, internal definition of artificial general intelligence (AGI) based on the startup’s profits, according to a new report from The Information. … The two companies reportedly signed an agreement last year stating OpenAI has only achieved AGI when it develops AI systems that can generate at least $100 billion in profits. That’s far from the rigorous technical and philosophical definition of AGI many expect. (Source)

      • Naia@lemmy.blahaj.zone · 3 points · 3 days ago

        Which is actually something DeepSeek is able to do.

        Even if it can still generate garbage when used incorrectly, like all of them, it’s still impressive that it will tell you it doesn’t “know” something but can try to help if you give it more context, which is how this stuff should be used anyway.

  • Etterra@discuss.online · +38/−4 · 3 days ago

    Good. LLM AIs are overhyped, overused garbage. If China putting one out is what it takes to hack the legs out from under its proliferation, then I’ll take it.

      • ArchRecord@lemm.ee · 20 points · 3 days ago

        Possibly, but in my view, this will simply accelerate our progress towards the “bust” part of the existing boom-bust cycle that we’ve come to expect with new technologies.

        They show up, get overhyped, and attract loads of investment; eventually the cost craters and availability becomes widespread. Suddenly it doesn’t look new and shiny to investors, since everyone can use it for extremely cheap, so the overvalued companies lose that valuation, the companies using it solely to please investors drop it, and mostly just the implementations that actually improved the product stick around, thanks to user pressure rather than investor pressure.

        Obviously this isn’t a perfect description of how everything in the world will play out in every circumstance every time, but I hope it gets the general point across.

      • WoodScientist@sh.itjust.works · 6 points · 3 days ago

        It’s not about hampering proliferation, it’s about breaking the hype bubble. Some of the western AI companies have been pitching to have hundreds of billions in federal dollars devoted to investing in new giant AI models and the gigawatts of power needed to run them. They’ve been pitching a Manhattan Project scale infrastructure build out to facilitate AI, all in the name of national security.

        You can only justify that kind of federal intervention if it’s clear there’s no other way. And this story here shows that the existing AI models aren’t operating anywhere near where they could be in terms of efficiency. Before we pour hundreds of billions into giant data center and energy generation, it would behoove us to first extract all the gains we can from increased model efficiency. The big players like OpenAI haven’t even been pushing efficiency hard. They’ve just been vacuuming up ever greater amounts of money to solve the problem the big and stupid way - just build really huge data centers running big inefficient models.

    • dependencyinjection@discuss.tchncs.de · +11/−6 · 3 days ago

      Overhyped? Sure, absolutely.

      Overused garbage? That’s incredibly hyperbolic. That’s like saying the calculator is garbage. The small company where I work as a software developer has already saved countless man-hours by using LLMs as tools, which is all they are if you take away the hype: a tool to help skilled individuals work more efficiently, not to replace skilled individuals entirely, as Sam “Dead Eyes” Altman would have you believe.

      • WoodScientist@sh.itjust.works · +1/−4 · 3 days ago

        LLMs as tools,

        Yes, in the same way that buying a CD from the store, ripping to your hard drive, and returning the CD is a tool.

    • shawn1122@lemm.ee · 4 points · 3 days ago

      DeepSeek R1 (the reasoning model) was only released on January 20. Still took a while, though.