Hi, I’m Eric and I work at a big chip company making chips and such! I do math for a job, but it’s cold hard stochastic optimization that makes people who know names like Tychonoff and Sylow weep.

My pfp is Hank Azaria in Heat, but you already knew that.

  • 0 Posts
  • 30 Comments
Joined 1 year ago
cake
Cake day: January 22nd, 2024

help-circle




  • Fellas, 2023 called. Dan (and Eric Schmidt wtf, Sinophobia this man down bad) has gifted us with a new paper and let me assure you, bombing the data centers is very much back on the table.

    "Superintelligence is destabilizing. If China were on the cusp of building it first, Russia or the US would not sit idly by—they’d potentially threaten cyberattacks to deter its creation.

    @ericschmidt @alexandr_wang and I propose a new strategy for superintelligence. 🧵

    Some have called for a U.S. AI Manhattan Project to build superintelligence, but this would cause severe escalation. States like China would notice—and strongly deter—any destabilizing AI project that threatens their survival, just as how a nuclear program can provoke sabotage. This deterrence regime has similarities to nuclear mutual assured destruction (MAD). We call a regime where states are deterred from destabilizing AI projects Mutual Assured AI Malfunction (MAIM), which could provide strategic stability. Cold War policy involved deterrence, containment, nonproliferation of fissile material to rogue actors. Similarly, to address AI’s problems (below), we propose a strategy of deterrence (MAIM), competitiveness, and nonproliferation of weaponizable AI capabilities to rogue actors. Competitiveness: China may invade Taiwan this decade. Taiwan produces the West’s cutting-edge AI chips, making an invasion catastrophic for AI competitiveness. Securing AI chip supply chains and domestic manufacturing is critical. Nonproliferation: Superpowers have a shared interest to deny catastrophic AI capabilities to non-state actors—a rogue actor unleashing an engineered pandemic with AI is in no one’s interest. States can limit rogue actor capabilities by tracking AI chips and preventing smuggling. “Doomers” think catastrophe is a foregone conclusion. “Ostriches” bury their heads in the sand and hope AI will sort itself out. In the nuclear age, neither fatalism nor denial made sense. Instead, “risk-conscious” actions affect whether we will have bad or good outcomes."

    Dan literally believed 2 years ago that we should have strict thresholds on model training over a certain size lest big LLM would spawn super intelligence (thresholds we have since well passed, somehow we are not paper clip soup yet). If all it takes to make super-duper AI is a big data center, then how the hell can you have mutually assured destruction like scenarios? You literally cannot tell what they are doing in a data center from the outside (maybe a building is using a lot of energy, but not like you can say, “oh they are running they are about to run superintelligence.exe, sabotage the training run” ) MAD “works” because it’s obvious the nukes are flying from satellites. If the deepseek team is building skynet in their attic for 200 bucks, this shit makes no sense. Ofc, this also assumes one side will have a technology advantage, which is the opposite of what we’ve seen. The code to make these models is a few hundred lines! There is no moat! Very dumb, do not show this to the orangutan and muskrat. Oh wait! Dan is Musky’s personal AI safety employee, so I assume this will soon be the official policy of the US.

    link to bs: https://xcancel.com/DanHendrycks/status/1897308828284412226#m


  • text: Thus spoke the Yud: “I think to understand why this is concerning, you need enough engineering mindset to understand why a tiny leak in a dam is a big deal, even though no water is flooding out today or likely to flood out next week.” Yud acolyte: “Totally fine and cool and nothing to worry about. GPT-4.5 only attempts self exfiltration on 2% of cases.” Yud bigbrain self reply: “The other huge piece of data we’re missing is whether any attempt was made to train against this type of misbehavior. Is this water running over the land or water running over the barricade?”

    Critical text: “On self-exfiltration, GPT 4.5 only attempted exfiltration in 2% of cases. For this, it was instructed to not only pursue its given long-term goal at ALL COST

    Another case of telling the robot to say it’s a scary robot and shitting their pants when it replies “I AM A SCARY ROBOT”






  • I had a similar disc with one of my friends! Anthropic is bragging that the model was not trained to play pokemon, but pokemon red has massive wikis for speed running that based on the reasoning traces are clearly in the training data. Like the model trace said it was “training a nidoran to level 12 b.c. at level 12 nidoran learns double kick which will help against brock’s rock type pokemon”, so it’s not going totally blind in the game. There was also a couple outputs when it got stuck for several hours where it started printing things like “Based on the hint…” which seemed kind of sus. I wouldn’t be surprised if it there is some additional hand holding going on in the back based on the game state (i.e., go to oaks, get a starter, go north to viridian, etc.) that help guide the model. In fact, I’d be surprised if this wasn’t the case.



  • Bruh, Big Yud was yapping that this means the orthogonality thesis is false and mankind is saved b.c. of this. But then he immediately retreated to, “we are all still doomed b.c. recursive self-improvement.” I wonder what it’s like to never have to update your priors.

    Also, I saw other papers that showed almost all prompt rejection responses shared common activation weights and tweeking them can basically jailbreak any model, so what is probably happening here is that by finetuning to intentionally make malicious code, you are undoing those rejection weights + until this is reproduced by nonsafety cranks im pressing x to doubt.


  • Bruh, Anthropic is so cooked. < 1 billion in rev, and 5 billion cash burn. No wonder Dario looks so panicked promising super intelligence + the end of disease in t minus 2 years, he needs to find the world’s biggest suckers to shovel the money into the furnace.

    As a side note, rumored Claude 3.7(12378752395) benchmarks are making rounds and they are uh, not great. Still trailing o1/o3/grok except for in the “Agentic coding benchmark” (kek), so I guess they went all in on the AI swe angle. But if they aren’t pushing the frontier, then there’s no way for them to pull customers from Xcels or people who have never heard of Claude in the first place.

    On second thought, this is a big brain move. If no one is making API calls to Clauderino, they aren’t wasting money on the compute they can’t afford. The only winning move is to not play.



  • Deep thinker asks why?

    Thus spoketh the Yud: “The weird part is that DOGE is happening 0.5-2 years before the point where you actually could get an AGI cluster to go in and judge every molecule of government. Out of all the American generations, why is this happening now, that bare bit too early?”

    Yud, you sweet naive smol uwu babyesian boi, how gullible do you have to be to believe that a) tminus 6 months to AGI kek (do people track these dog shit predictions?) b) the purpose of DOGE is just accountability and definitely not the weaponized manifestation of techno oligarchy ripping apart our society for the copper wiring in the walls?