Unless I’m misreading it (which is possible, it’s awfully late), he said he processed 60,000 rows and didn’t find what he was looking for, but his hard drive overheated on the full pass.
Disks don’t overheat just because they’re under load. Even if he f***** up and didn’t index the data correctly (I assume it’s a relational database since he’s talking about rows), the disk isn’t just going to overheat because the job is big. It’s going to be a lack of airflow or a missing heatsink.
I guarantee you he was running on an external NVMe, in one of those little shitty-ass Chinese enclosures. Or maybe one of those self-immolating SanDisk enclosures. Hell, maybe he’s on a desktop and he slapped a raw NVMe on his motherboard without a heatsink.
There are times when you want a brilliant college student on your team, but you need seasoned professionals to help them through the things they’ve never seen before and never done before.
Somehow I feel overclocking without understanding the consequences sounds like something a techbro would do.
Can’t be a relational database, Musk said the government doesn’t use SQL.
This cannot be real, wtf. This is cartoon levels of ineptitude.
Or sabotage by someone heading out? Please let this be resistance sabotage they haven’t noticed yet.
You guys aren’t running your software off Raspberry Pis with SD cards from the gas station?
My allowance is 5$ a month!
Look, all I’m saying is give Pis a chance.
Either she knows something novel, where processing data using voice coils is somehow beneficial, or she is someone who calls their computer a ‘hard drive’, which pretty much negates any claim to technical competence.
“I store my records on vinyl. You’ve probably never heard of them.”
60k rows of anything will be pulled into the file cache and do very little work on the drive. Possibly none after the first read.
Not if each row is pi!
When the only thing that is stopping kids from dismantling your government is an O(N^N) algorithm
Are you telling me there’s a difference between an inner and a cross join?
Cross join is obviously faster, I don’t even have to write “on”
my hard drive overheated
So, this means they either have a local copy on disk of whatever database they’re querying, or they’re dumping a remote db to disk at some point before/during/after their query, right?
Either way, I have just one question - why?
Even if it was local, a raspberry pi can handle a query that size.
Edit - honestly, it reeks of a knowledge level that calls the entire PC a “hard drive”.
My one question would be “How?”
What the hell are you doing that your hard drives are overheating? How do you even know it’s overheating as I’m like 90% certain hard drives (except NVMe if we’re being liberal with the meaning of hard drive) don’t even have temperature sensors?
The only conclusion I can come to is that everything he’s saying is just bullshit.
They have temp sensors. But I’ve never heard of an overheating drive.
Hard drives do get hot and need some cooling, but not at 60k rows. It’s either made up or their computer case is made of thermal cladding.
Imo if they can’t max out their hard drive for at least 24 hours without it breaking, their computer was already broken. They just didn’t know it yet.
Any reasonable SSD would just throttle if it was getting too hot, and I’ve never heard of a HDD overheating on its own, only if there’s some external heat sources, like running it in a 60°C room
You could query 60,000 rows on a low-tier smartphone. Makes no sense at all.
Can we think of any device someone might have that would struggle with 60k? Certainly an ESP32 chip could handle it fine, so most IoT devices would work…
Right? There’s no part of that xeet that makes any real sense coming from a “data engineer.”
Terrifying, really.
dude is 100% talking about SSDs. NVMe ones at that, he’s just stupid.
Or they’re doing it on a Diamondmax 9.
Does Elon only hire Chip from Sales Guy vs. Web Dude or something?
No, he also hires people who created a script to make fake ballots with a bias.
https://bsky.app/profile/denisedwheeler.bsky.social/post/3lhowh3ijgs2f
I cannot believe these people make more than me lol.
Is this a real post? I can’t seem to find it on that website “X, formerly known as Twitter.”
Some people have linked the discussion in other comment threads.
60k isn’t that much, I frequently run scripts against multiple hundreds of thousands at work. Wtf is he doing? Did he duplicate the government database onto his 2015 MacBook Air?
60k is laughably, embarrassingly small. It’s still sqlite-sized.
Sqlite can easily handle millions of rows. Don’t sell it short
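For a sense of scale, here is a minimal sketch using Python’s built-in `sqlite3` module. The `payments` table and its columns are made up for illustration and have nothing to do with whatever data was actually being queried. The only point is that 60,000 rows fit comfortably in an in-memory database.

```python
import sqlite3

# Hypothetical schema, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER PRIMARY KEY, agency TEXT, amount REAL)"
)

# Insert 60,000 synthetic rows.
rows = ((i, f"agency_{i % 50}", float(i % 1000)) for i in range(60_000))
conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)

# A full-table aggregate that touches every one of the 60k rows.
row_count, total = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM payments"
).fetchone()
print(row_count, total)
```

Nothing here goes near a physical disk, and the same code works the same way with a file-backed database.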
How about a 6.4TB sqlite database?
Should be enough to hold 60k rows
I’m not
I have an sqlite db that is a few GB in size, game saves using the format. Sadly almost all blob data, would love to play with it if it was a bit more readable
I mean, it’s even Excel-sized depending on how many columns. This is seriously sad and alarming.
Hey now that’s real close to the 65,535 16-bit limit (from 20 years ago)
Holy shit, if this is the issue, that’s too funny.
60k is a single JSON file.
A TI-86 can query 60k rows without breaking a sweat.
If his hard drive overheated from that, he is doing something very wrong, very unhygienic, or both.
He’s probably mining crypto on top of running his SQL queries.
What? You don’t run your hard drives in the oven while baking brownies? It makes them zesty.
There must be more join statements than column names
I’ve run searches over 60k lines of raw JSON on a 2015 MacBook Air without any problems.
Don’t know what Elmo’s minions are doing, but I’ve written code at least equally inefficient. It was quite a few years ago (the code was written in Perl) and I at least want to think that I’m better now (but I’m not paid to code anymore). The task was to pull in data from a CSV (or something like that, as I mentioned, it’s been a while) and it needed conversion to XML (or something similar).
The idea behind my code was that you could just configure which fields you want from arbitrary source data and where to place them in whatever supported destination format. I still think that the basic idea behind that project is pretty neat: just throw in whatever you happen to have and get something completely different out of the other end. And it worked as it should. It was just stupidly hungry for memory. 20k entries would eat up several gigabytes of memory on a workstation (and back then it was premium to have even 16G around) and it was also freaking slow to run (like 0.2 - 0.5 seconds per entry).
But even then I didn’t need to tweet that my hard drive is overheating. I well understood that my code was just bad and I even improved it a bit here and there, but it was still so very slow and used ridiculous amounts of RAM. The project was pretty neat and when you had a few hundred items to process at a time it was even pretty good; there were companies who relied on that code and paid for support. It just totally broke down with even slightly bigger datasets.
But, as I already mentioned, my hard drive didn’t overheat on that load.
No, it’s an external drive, apparently.
I mean if we were to sort of steelman this thing, there sure can be database relations and queries that hit only 60k rows but are still heavy as fuck.
Wow.
I’ve been processing a couple of billion rows of data on my machine, the fans didn’t even come on. WTF are they teaching “experts” these days, or has Elmo only hired people who claim that they can “wrangle data” and say “yes” ?
60k rows is generally very usable with even wide tables in row formats.
I’ve had pandas work with 1M plus rows with 100 columns in memory just fine.
After 1M rows move on to something better like Dask, polars, spark, or literally any DB.
The first thing I’d do with whatever data they’re running into issues with is rewrite it as partitioned and sorted parquet.
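Back-of-envelope numbers for the in-memory claim, as a rough sketch. It assumes every column is an 8-byte float64, which is an assumption on my part; string or object columns will take more. The helper name `raw_size_mb` is made up.

```python
BYTES_PER_FLOAT64 = 8  # assumption: every column is a 64-bit float


def raw_size_mb(rows: int, cols: int, bytes_per_value: int = BYTES_PER_FLOAT64) -> float:
    """Rough size of a dense numeric table in megabytes (10**6 bytes)."""
    return rows * cols * bytes_per_value / 1_000_000


small = raw_size_mb(60_000, 100)     # the 60k-row case
large = raw_size_mb(1_000_000, 100)  # the 1M-row case
print(small, large)
```

Under that assumption, 60k rows by 100 columns is about 48 MB and 1M rows by 100 columns is about 800 MB, so both fit in RAM on an ordinary laptop.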
My go-to tool of late is duckdb, comes with binaries for most platforms, works out of the box, loads any number of database formats and is FAST.
Even if querying data was processing-heavy and even if somehow the ‘hard drive’ got warm during this, then there still would need to be a hardware defect in order for the drive to overheat.
Yes, but this may be a symptom of an issue I’ve been seeing with younger programmers; they’ve siloed themselves so specifically into whatever programming they “specialize” in, that they become absolutely useless at dealing with absolutely anything else related to their job. And exacerbating this issue is the fact that they’ve grown up with systems that “just work”. Windows, iOS, and Android are all at the point where fucking around with hardware issues is very uncommon for the average person.
Asking this guy to solve a hardware problem is like asking him to tune a carburetor. He likely has not the slightest clue how to start.
In my experience, a lot of software dev degree paths basically don’t even have relevant classes on hardware at all. Classes on hardware are all in IT Helpdesk and Network Admin degree paths whereas the software dev students are dumped straight into Visual Studio right off the bat with no relevant understanding of the underlying hardware or OS.
My experience does not reflect yours. Computer Architecture, Discrete Math (logic gate math), and Operating System Concepts were all required classes in my CS degree from just a few years ago.
Honestly that’s good to hear. I’ve run into some devs who are completely mystified on how to connect to a remote database and couldn’t tell a socket from a sandwich.
In my degree, we had to write kernel mods and device drivers
You don’t teach a farmer how an internal combustion engine works. Computers are tools to software engineers. What they need to know is how to operate them, not how to maintain them.
the only reason farmers are afloat financially is BECAUSE they can rebuild an engine if needed.
Just look at the John Deere right to repair shit. It’s literally a huge problem.
No, not really. Programming requires understanding of the underlying hardware, at least to a certain extent. Otherwise performance issues will look like dark magic and optimizing anything would be impossible.
Where do you start debugging if something goes wrong with the software and your information level is this low? Do you look at network stats, CPU utilization, paging/swapping? Is the hard disk bandwidth the bottleneck? Without at least some passable understanding of computer architecture, people like this just throw up their hands, or throw whatever tricks they know at the wall and see what sticks.
A lot of farmers are learning how they work ’cause the companies that sell them the equipment keep fucking them over. I would argue that farmers nowadays need to know how that works along with basic programming to get past the anti-consumer bullshit companies put in to make it nigh impossible to fix things yourself.
Doesn’t matter if you know how to program, John Deere is just going to put some heavy-handed encryption and ID locking on their shit. What needs to happen is for John Deere to stop fucking doing this.
Most tractors are walking computers anyway, farmers are genuinely the most multi-talented people you will ever meet in your life.
What the fuck
How is he going to fix his tractor? Wait days for John Deere to send somebody? Let the crop rot on the vine?
It is good for the programmer to know how the computer operates, as well.
Just keep trying to justify your own lack of competency I guess. ¯\_(ツ)_/¯
You’ve never met a farmer in your life.
CS departments were doing poorly, but now they’re putting out farmers? No wonder all these new graduates can’t find a job.
Ooh wait 'til Musk realises he can improve US agricultural efficiency.
That’s the price of specialization. Don’t ask a software engineer to troubleshoot hardware. Don’t ask a backend dev to write a frontend. Don’t ask a proctologist to look at your cough.
You simply cannot be proficient at every sub-sub-specialty. That’s why we collaborate and hand the ‘my computer gets hot’ problems to the hardware people. The alternative would be an only moderately useful generalist.
I’m not asking everyone to be able to become a hardware specialist, but if you can’t even figure out “my computer gets hot” I’m not going to be able to trust anything you do. Identifying a heat issue does not take a rocket surgeon.
has Elmo only hired people who claim that they can “wrangle data” and say “yes” ?
There’s two issues going on:
- Elmo’s sociopathic approach to laying people off is public knowledge, and top experts have the luxury of not even applying for his jobs.
- Elmo’s ability to judge engineering talent has likely been wildly exaggerated thanks to how he has successfully bought organizations full of talented people, in the past.
He hired a bunch of 19-25 year olds. Not experts.
Hey! That’s offensive to 19-25 year olds, there are many who just finished college/university and are more than aware.
They’re just role playing like in movies, with no idea of the consequences.
lol a 19-21 yo isn’t going to have a degree lol
How on earth is it offensive to say they’re “not experts”? They’re not prodigies with PhDs. These specific young men are just technical enough and ideologically aligned.
Except they’re not; you’d know their tweet is false after the first year of any technical (IT-oriented) education.
First year? That shit is like A+ cert level knowledge or below, and A+ is damn near worthless. They would know that in the first few hours of a study guide
I was being generous when you consider the people in school who somehow pass, even when they don’t know a thing 🥲
Technical enough to be hired, is all I meant. 🙄
Apologies if I came across as hostile. I did not get your meaning through text.
There is nothing wrong with being 19-25. There’s something wrong with being wholly incompetent.
There’s not really anything wrong with being incompetent, so long as you have the humility to admit it and learn from people who know better, and try not to cause harm. That’s not Musk’s minions though.
I think it’s important to differentiate incompetence from ignorance. Ignorance is not knowing. Incompetence is not being able to fulfill the requirements for your assigned task. If you cannot fulfill the requirements for your given task, then you should not be given said task.
You have to understand that the average Trump voter probably knows everything they know about computers from watching the ‘wacky-zany hacker with personality issues/quirks’ “hack” into things by tippity tapping their fingies on a keyboard in your average copaganda performance.
This is something those types of people will believe.
I’ve used local hard drives since like 1992 and I have never ever gotten them to overheat.
heh
Literally every time someone dismisses Wikipedia, it’s because they believe something crazy that Wikipedia told them is wrong.
I checked Conservapedia once, and it’s actually unhinged. If someone tells you to look at that, or recommends it, they’re crazy.
Did they ever finish their own bible translation? The one they started because King James was too woke.
YES. It’s so unhinged. They have an entire page discussing if Obama is actually a Muslim.
“I read a book with a typo once. Libraries are a scam.”
Libraries are a scam. If they weren’t, they’d still have VHS rentals.
“YOU’RE JUST JEALOUS” is such a fucking pussy-ass response, too.
Molly White is very bright, and she makes them feel inadequate so they “have to” attack her. It’s truly pathetic.
But her last name is White so it’s a real dilemma for them.
God they picked out the ONE possible thing they could criticize her for, there’s like 3 other things RIGHT NEXT TO THAT
I used to perform data analysis of robotics firmware logs which would generate several million log lines per hour and that was my second job out of college.
I don’t know how you fuck up 60k lines that bad. Is he nesting 150 for loops and loading a copy of the data set in each one while mining crypto??
Substring searches in unindexed large string columns or cartesian explosion caused by shitty joins would be my initial guess.
Largely ignorant, but data-curious person here.
…what?
If there’s something you want to search by in a database, you should index it.
Indexing will create an ordered data structure that will allow much faster queries. If you were looking for the username gazter in an unindexed column, it would have to check literally every username entry. In a table of 1000000 entries it would check 1000000 times.
In an indexed column it might do something like ask to be pointed to every name beginning with “g”, then of those ask to be pointed to every name with the second letter “a” and so on. It would find out where in the database gazter is by checking only six times.
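A minimal sketch of the difference, using Python’s built-in `sqlite3` module. The `users` table and the index name `idx_users_username` are made up for illustration. Real engines typically use a B-tree rather than a literal letter-by-letter lookup, so for a million entries the cost is on the order of a couple of dozen comparisons instead of a million, but the idea is the same. `EXPLAIN QUERY PLAN` asks SQLite how it intends to run the query; the exact wording of its output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    ((f"user{i}",) for i in range(100_000)),
)
conn.execute("INSERT INTO users (username) VALUES ('gazter')")


def plan(sql: str) -> str:
    """Return SQLite's description of how it would execute the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


query = "SELECT id FROM users WHERE username = 'gazter'"

# No index on username yet: SQLite has to look at every row.
plan_without_index = plan(query)

# After indexing the column, the same query becomes an index search.
conn.execute("CREATE INDEX idx_users_username ON users (username)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a scan of the table, the second a search using the index.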
Substring matching is much more computationally difficult as it has to pull out each potentially matching value and run it through a function that checks if gazter exists somewhere in that value. Basically if you find yourself doing it you need to come up with a better plan.
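Here is a rough sketch of why substring matching hurts, again with Python’s built-in `sqlite3` and a made-up `users` table. Even with an index on the column, a `LIKE '%...%'` pattern with a leading wildcard cannot be narrowed down by the index, so SQLite still has to walk through every entry and test it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("CREATE INDEX idx_users_username ON users (username)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    ((f"user{i}",) for i in range(10_000)),
)
conn.execute("INSERT INTO users (username) VALUES ('xx_gazter_xx')")


def plan(sql: str) -> str:
    """Return SQLite's description of how it would execute the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Exact match: the index lets SQLite jump straight to the entry.
exact_plan = plan("SELECT id FROM users WHERE username = 'gazter'")

# Substring match: every value has to be pulled out and checked.
substring_plan = plan("SELECT id FROM users WHERE username LIKE '%gazter%'")

matches = conn.execute(
    "SELECT username FROM users WHERE username LIKE '%gazter%'"
).fetchall()

print(exact_plan)
print(substring_plan)
print(matches)
```

If substring search over lots of text is really what you need, a full-text index or a dedicated search engine is usually the better plan.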
Cartesian explosion would be when your query ends up doing a shit load of redundant work. Like if the query to load this thread were to look up all the posters here, get all their posts, get the threads from those posts and filter on the thread id.
Storing large volumes of text in a database column without optimization, then searching for small strings within it. It causes the database to basically search character by character to find a match by reading everything from disk. If you use indexes the database can do a lot of really incredible optimization to make finding values much faster, and honestly string searching is better suited to a non-relational DB engine (which is why search engines don’t use relational DBs).
Cartesian explosion is where you join related data together in a way that causes your result set to be wayyyy bigger than you expect. For example if you try to search through blog posts, but then also decide to bring in comments to search, then bring in the authors of those comments and all their comments from other posts. Result sets start to grow exponentially in that way, so maybe if you only search a few thousand blog posts you might be searching through millions of records because you designed your queries poorly.
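The blog example can be sketched with Python’s built-in `sqlite3`. All table names and row counts here are made up: 100 posts, 10 comments per post, and 20 authors who each wrote 50 of those comments. The counts show how each extra join multiplies the result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts    (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE authors  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, author_id INTEGER);
""")
conn.executemany("INSERT INTO posts VALUES (?, ?)", ((p, f"post {p}") for p in range(100)))
conn.executemany("INSERT INTO authors VALUES (?, ?)", ((a, f"author {a}") for a in range(20)))
# 10 comments per post, spread evenly over the 20 authors (50 comments each).
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    ((c, c // 10, c % 20) for c in range(1000)),
)


def count(sql: str) -> int:
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]


posts_only = count("SELECT COUNT(*) FROM posts")

# Bring in the comments on each post.
with_comments = count(
    "SELECT COUNT(*) FROM posts p JOIN comments c ON c.post_id = p.id"
)

# Also bring in each commenter and every comment they wrote anywhere.
with_all_author_comments = count("""
    SELECT COUNT(*)
    FROM posts p
    JOIN comments c  ON c.post_id = p.id
    JOIN authors a   ON a.id = c.author_id
    JOIN comments c2 ON c2.author_id = a.id
""")

print(posts_only, with_comments, with_all_author_comments)
```

With these made-up numbers, 100 posts turn into 1,000 rows after the first join and 50,000 after pulling in every comment by each commenter.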