• rumba@lemmy.zip
    English
    arrow-up
    28
    ·
    5 hours ago

    Unless I’m misreading it, which is possible since it’s awfully late, he said he processed 60,000 rows and didn’t find what he was looking for, but his hard drive overheated on the full pass.

    Disks don’t overheat just because there was load. Even if he f***** up and didn’t index the data correctly (I assume it’s a relational database since he’s talking about rows), the disk isn’t just going to overheat because the job is big. It’s going to be lack of airflow or lack of a heatsink.

    I guarantee you he was running on an external NVMe in one of those little shitty-ass Chinese enclosures. Or maybe one of those self-immolating SanDisk enclosures. Hell, maybe he’s on a desktop and he slapped a raw NVMe on his motherboard without a heatsink.

    There are times when you want a brilliant college student on your team, but you need seasoned professionals to help them through the things they’ve never seen before and never done before.

    • Deathray5@lemmynsfw.com
      arrow-up
      2
      ·
      37 minutes ago

      Somehow I feel overclocking without understanding the consequences sounds like something a techbro would do.

    • exu@feditown.com
      English
      arrow-up
      19
      ·
      3 hours ago

      Can’t be a relational database, Musk said the government doesn’t use SQL.

  • LillyPip@lemmy.ca
    arrow-up
    35
    ·
    6 hours ago

    This cannot be real, wtf. This is cartoon levels of ineptitude.

    Or sabotage by someone heading out? Please let this be resistance sabotage they haven’t noticed yet.

    • turnip@lemm.ee
      English
      arrow-up
      14
      ·
      edit-2
      5 hours ago

      You guys aren’t running your software off Raspberry Pis with SD cards from the gas station?

      My allowance is $5 a month!

  • nonentity@sh.itjust.works
    arrow-up
    19
    ·
    6 hours ago

    Either she knows something novel, where processing data using voice coils is somehow beneficial, or she is someone who calls their computer a ‘hard drive’, which summarily negates any claim to technical competence.

  • jkercher@programming.dev
    English
    arrow-up
    16
    ·
    8 hours ago

    60k rows of anything will be pulled into the file cache and cause very little work on the drive. Possibly none after the first read.
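
A rough back-of-the-envelope sketch of why that holds. The 1 KiB average row size is purely an assumption for illustration (nothing in the tweet says how wide the rows are); even at that size the whole table is small next to the RAM of any current machine.

```python
# Rough size estimate for a 60k-row table.
ROWS = 60_000
ASSUMED_BYTES_PER_ROW = 1024  # assumption: ~1 KiB per row, picked for illustration

total_bytes = ROWS * ASSUMED_BYTES_PER_ROW
total_mib = total_bytes / 2**20  # bytes -> MiB

print(f"{ROWS:,} rows x {ASSUMED_BYTES_PER_ROW} B = {total_mib:.1f} MiB")
```

Tens of megabytes is the kind of thing an OS would normally keep cached after one pass, so repeat reads should rarely touch the drive at all.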

    • wise_pancake@lemmy.ca
      arrow-up
      3
      ·
      6 hours ago

      Are you telling me there’s a difference between an inner and a cross join?

      Cross join is obviously faster, I don’t even have to write “on”
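
For anyone who missed the joke, here is a minimal sketch of the difference using Python’s built-in `sqlite3`. The `people` and `payments` tables and their sizes are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, person_id INTEGER)")

N = 300  # hypothetical table size, kept small so the cross join stays quick
conn.executemany("INSERT INTO people VALUES (?, ?)", [(i, f"person{i}") for i in range(N)])
conn.executemany("INSERT INTO payments VALUES (?, ?)", [(i, i) for i in range(N)])

# Inner join: each payment is matched to exactly one person.
inner_rows = conn.execute(
    "SELECT COUNT(*) FROM people JOIN payments ON payments.person_id = people.id"
).fetchone()[0]

# Cross join: every person is paired with every payment.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM people CROSS JOIN payments"
).fetchone()[0]

print(inner_rows, cross_rows)
```

The inner join returns N rows and the cross join returns N squared, which is how a small table turns into a very large intermediate result.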

  • zalgotext@sh.itjust.works
    arrow-up
    72
    arrow-down
    1
    ·
    13 hours ago

    my hard drive overheated

    So, this means they either have a local copy on disk of whatever database they’re querying, or they’re dumping a remote db to disk at some point before/during/after their query, right?

    Either way, I have just one question - why?

    • zenpocalypse@lemm.ee
      English
      arrow-up
      13
      ·
      edit-2
      9 hours ago

      Even if it was local, a raspberry pi can handle a query that size.

      Edit - honestly, it reeks of a knowledge level that calls the entire PC a “hard drive”.

    • GoodEye8@lemm.ee
      English
      arrow-up
      13
      ·
      10 hours ago

      My one question would be “How?”

      What the hell are you doing that your hard drives are overheating? How do you even know it’s overheating as I’m like 90% certain hard drives (except NVMe if we’re being liberal with the meaning of hard drive) don’t even have temperature sensors?

      The only conclusion I can come to is that everything he’s saying is just bullshit.

      • Auli@lemmy.ca
        English
        arrow-up
        12
        ·
        10 hours ago

        They have temp sensors. But I have never heard of an overheating drive.

        • xthexder@l.sw0.com
          arrow-up
          5
          ·
          edit-2
          9 hours ago

          Imo if they can’t max out their hard drive for at least 24 hours without it breaking, their computer was already broken. They just didn’t know it yet.

          Any reasonable SSD would just throttle if it was getting too hot, and I’ve never heard of an HDD overheating on its own, only if there’s some external heat source, like running it in a 60°C room.

          • Mniot@programming.dev
            English
            arrow-up
            2
            ·
            8 hours ago

            Can we think of any device someone might have that would struggle with 60k? Certainly an ESP32 chip could handle it fine, so most IoT devices would work…

            • zenpocalypse@lemm.ee
              English
              arrow-up
              3
              ·
              8 hours ago

              Right? There’s no part of that xeet that makes any real sense coming from a “data engineer.”

              Terrifying, really.

  • Tiefling IRL@lemmy.blahaj.zone
    arrow-up
    96
    ·
    15 hours ago

    60k isn’t that much, I frequently run scripts against multiple hundreds of thousands at work. Wtf is he doing? Did he duplicate the government database onto his 2015 MacBook Air?

    • 4am@lemm.ee
      arrow-up
      60
      ·
      15 hours ago

      A TI-86 can query 60k rows without breaking a sweat.

      If his hard drive overheated from that, he is doing something very wrong, very unhygienic, or both.

    • socsa@piefed.social
      English
      arrow-up
      6
      ·
      12 hours ago

      I’ve run searches over 60k lines of raw JSON on a 2015 MacBook air without any problems.
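
A minimal sketch of what a search like that looks like, using only the standard library. The records are synthetic stand-ins generated in memory (the field names are invented), not anything resembling the real data.

```python
import json
import time

# Build 60k lines of JSON in memory (synthetic stand-in for a real file).
lines = [
    json.dumps({"id": i, "name": f"user{i}", "active": i % 2 == 0})
    for i in range(60_000)
]

start = time.perf_counter()
# Parse every line and keep the records whose id is a multiple of 10,000.
matches = [rec for rec in map(json.loads, lines) if rec["id"] % 10_000 == 0]
elapsed = time.perf_counter() - start

print(f"{len(matches)} matches in {elapsed:.3f}s")
```

Parsing every one of the 60k lines is the naive approach, and it should still finish quickly on modest hardware.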

    • IsoKiero@sopuli.xyz
      English
      arrow-up
      10
      ·
      edit-2
      14 hours ago

      Don’t know what Elmo’s minions are doing, but I’ve written code at least equally inefficient. It was quite a few years ago (the code was written in Perl) and I at least want to think that I’m better now (but I’m not paid to code anymore). The task was to pull in data from a CSV (or something like that, as I mentioned, it’s been a while) and it needed conversion to XML (or something similar).

      The idea behind my code was that you could just configure which fields you want from arbitrary source data and where to place them in whatever supported destination format. I still think that the basic idea behind that project is pretty neat: just throw in whatever you happen to have and get something completely different out of the other end. And it worked as it should. It was just stupidly hungry for memory. 20k entries would eat up several gigabytes of memory on a workstation (and back then it was premium to have even 16G around) and it was also freaking slow to run (like 0.2 - 0.5 seconds per entry).

      But even then I didn’t need to tweet that my hard drive is overheating. I well understood that my code was just bad and I even improved it a bit here and there, but it was still so very slow and used ridiculous amounts of RAM. The project was pretty neat and when you had a few hundred items to process at a time it was even pretty good; there were companies who relied on that code and paid for support. It just totally broke down with even slightly bigger datasets.

      But, as I already mentioned, my hard drive didn’t overheat on that load.
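
For illustration only, here is a sketch of the configurable field-mapping idea described above, written in Python rather than the original Perl. It is not the original code: `FIELD_MAP`, `rows_to_xml` and the sample columns are all hypothetical. It streams one row at a time, which is one way to keep memory use flat as the input grows.

```python
import csv
import io
from xml.sax.saxutils import escape

# Hypothetical config: source CSV column -> destination XML tag.
FIELD_MAP = {"name": "FullName", "email": "Contact"}

def rows_to_xml(csv_file, field_map):
    """Yield one XML fragment per CSV row, so only one row is held at a time."""
    for row in csv.DictReader(csv_file):
        fields = "".join(
            f"<{tag}>{escape(row[col])}</{tag}>" for col, tag in field_map.items()
        )
        yield f"<entry>{fields}</entry>"

# Tiny in-memory sample; a real run would pass an open file instead.
source = io.StringIO(
    "name,email,unused\n"
    "Ada,ada@example.com,x\n"
    "Bob & Co,bob@example.com,y\n"
)
fragments = list(rows_to_xml(source, FIELD_MAP))
print("\n".join(fragments))
```

Columns that are not in the mapping (like `unused` here) are simply ignored, and special characters are escaped on the way out.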

    • vga@sopuli.xyz
      arrow-up
      1
      ·
      edit-2
      13 hours ago

      I mean if we were to sort of steelman this thing, there sure can be database relations and queries that hit only 60k rows but are still heavy as fuck.

  • Onno (VK6FLAB)@lemmy.radio
    arrow-up
    214
    ·
    18 hours ago

    Wow.

    I’ve been processing a couple of billion rows of data on my machine, the fans didn’t even come on. WTF are they teaching “experts” these days, or has Elmo only hired people who claim that they can “wrangle data” and say “yes” ?

    • wise_pancake@lemmy.ca
      arrow-up
      2
      ·
      6 hours ago

      60k rows is generally very usable with even wide tables in row formats.

      I’ve had pandas work with 1M plus rows with 100 columns in memory just fine.

      After 1M rows, move on to something better like Dask, Polars, Spark, or literally any DB.

      The first thing I’d do with whatever data they’re running into issues with is rewrite it as partitioned and sorted parquet.

      • Onno (VK6FLAB)@lemmy.radio
        arrow-up
        2
        ·
        6 hours ago

        My go-to tool of late is DuckDB: it comes with binaries for most platforms, works out of the box, loads any number of database formats and is FAST.

    • bleistift2@sopuli.xyz
      English
      arrow-up
      136
      ·
      18 hours ago

      Even if querying data was processing-heavy and even if somehow the ‘hard drive’ got warm during this, then there still would need to be a hardware defect in order for the drive to overheat.

      • IrateAnteater@sh.itjust.works
        arrow-up
        57
        ·
        18 hours ago

        Yes, but this may be a symptom of an issue I’ve been seeing with younger programmers; they’ve siloed themselves so specifically into whatever programming they “specialize” in, that they become absolutely useless at dealing with absolutely anything else related to their job. And exacerbating this issue is the fact that they’ve grown up with systems that “just work”. Windows, iOS, and Android are all at the point where fucking around with hardware issues is very uncommon for the average person.

        Asking this guy to solve a hardware problem is like asking him to tune a carburetor. He likely has not the slightest clue how to start.

        • Snot Flickerman@lemmy.blahaj.zone
          English
          arrow-up
          33
          arrow-down
          1
          ·
          edit-2
          17 hours ago

          In my experience, a lot of software dev degree paths basically don’t even have relevant classes on hardware at all. Classes on hardware are all in IT Helpdesk and Network Admin degree paths whereas the software dev students are dumped straight into Visual Studio right off the bat with no relevant understanding of the underlying hardware or OS.

          • atomicbocks@sh.itjust.works
            English
            arrow-up
            26
            ·
            16 hours ago

            My experience does not reflect yours. Computer Architecture, Discrete Math (logic gate math), and Operating System Concepts were all required classes in my CS degree from just a few years ago.

          • bleistift2@sopuli.xyz
            English
            arrow-up
            1
            arrow-down
            18
            ·
            17 hours ago

            You don’t teach a farmer how an internal combustion engine works. Computers are tools to software engineers. What they need to know is how to operate them, not how to maintain them.

            • KillingTimeItself@lemmy.dbzer0.com
              English
              arrow-up
              1
              ·
              7 hours ago

              the only reason farmers are afloat financially is BECAUSE they can rebuild an engine if needed.

              Just look at the john deere right to repair shit. It’s literally a huge problem.

            • hayalci@fstab.sh
              English
              arrow-up
              13
              ·
              17 hours ago

              No, not really. Programming requires understanding of the underlying hardware, at least to a certain extent. Otherwise performance issues will look like dark magic and optimizing anything would be impossible.

              Where do you start debugging if something goes wrong with the software and your information level is this low? Do you look at network stats? CPU utilization, paging/swapping? Is the hard disk bandwidth the bottleneck? Without at least some passable understanding of computer architecture, people like this just throw up their hands, or throw whatever tricks they know at the wall and see what sticks.
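
One cheap first check, sketched here with only the standard library, is to compare wall-clock time with CPU time. If the two are close, the job is probably compute-bound; if wall time is much larger, the process spent most of its time waiting on something (disk, network, swap). The `measure` helper and the two workloads are made up for the example, and `time.sleep` is only a stand-in for blocking I/O.

```python
import time

def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent running fn."""
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    fn()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start

def cpu_bound():
    # Pure computation: CPU time should track wall time closely.
    sum(i * i for i in range(300_000))

def waiting():
    # Stand-in for blocking on disk or network: wall time passes, CPU time barely moves.
    time.sleep(0.2)

cpu_wall, cpu_cpu = measure(cpu_bound)
wait_wall, wait_cpu = measure(waiting)
print(f"cpu_bound: wall={cpu_wall:.3f}s cpu={cpu_cpu:.3f}s")
print(f"waiting:   wall={wait_wall:.3f}s cpu={wait_cpu:.3f}s")
```

It does not tell you which resource is the bottleneck, but it does tell you whether to look at the code or at the I/O path first.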

            • chickenf622@sh.itjust.works
              arrow-up
              8
              ·
              16 hours ago

              A lot of farmers are learning how they work ’cause the companies that sell them the equipment keep fucking them over. I would argue that farmers nowadays need to know how that works along with basic programming to get past the anti-consumer bullshit companies put in to make it nigh impossible to fix things yourself.

              • KillingTimeItself@lemmy.dbzer0.com
                English
                arrow-up
                1
                arrow-down
                1
                ·
                6 hours ago

                doesn’t matter if you know how to program, john deere is just going to put some autistic encryption and ID locking on their shit, what needs to happen is for john deere to stop fucking doing this.

                Most tractors are walking computers anyway, farmers are genuinely the most multi-talented people you will ever meet in your life.

            • bane_killgrind@slrpnk.net
              English
              arrow-up
              6
              ·
              16 hours ago

              What the fuck

              How is he going to fix his tractor? Wait days for John Deere to send somebody? Let the crop rot on the vine?

            • sepi@piefed.social
              English
              arrow-up
              4
              ·
              16 hours ago

              CS departments were doing poorly, but now they’re putting out farmers? No wonder all these new graduates can’t find a job.

        • bleistift2@sopuli.xyz
          English
          arrow-up
          13
          arrow-down
          9
          ·
          17 hours ago

          That’s the price of specialization. Don’t ask a software engineer to troubleshoot hardware. Don’t ask a backend dev to write a frontend. Don’t ask a proctologist to look at your cough.

          You simply cannot be proficient at every sub-sub-specialty. That’s why we collaborate and hand the ‘my computer gets hot’ problems to the hardware people. The alternative would be an only moderately useful generalist.

          • IrateAnteater@sh.itjust.works
            arrow-up
            23
            ·
            17 hours ago

            I’m not asking everyone to be able to become a hardware specialist, but if you can’t even figure out “my computer gets hot” I’m not going to be able to trust anything you do. Identifying a heat issue does not take a rocket surgeon.

    • MajorHavoc@programming.dev
      arrow-up
      30
      ·
      16 hours ago

      has Elmo only hired people who claim that they can “wrangle data” and say “yes” ?

      There’s two issues going on:

      1. Elmo’s sociopathic approach to laying people off is public knowledge, and top experts have the luxury of not even applying for his jobs.
      2. Elmo’s ability to judge engineering talent has likely been wildly exaggerated, thanks to how he has successfully bought organizations full of talented people in the past.

      • Kane@femboys.biz
        arrow-up
        51
        arrow-down
        3
        ·
        18 hours ago

        Hey! That’s offensive to 19-25 year olds, there are many who just finished college/university and are more than aware.

        They’re just role playing like in movies, with no idea of the consequences.

        • entropicdrift@lemmy.sdf.org
          arrow-up
          22
          arrow-down
          6
          ·
          edit-2
          18 hours ago

          How on earth is it offensive to say they’re “not experts”? They’re not prodigies with PhDs. These specific young men are just technical enough and ideologically aligned.

          • Kane@femboys.biz
            arrow-up
            17
            ·
            18 hours ago

            Except they’re not, as you would know their tweet is false after your first year of any technical (IT-oriented) education.

            • Zorsith@lemmy.blahaj.zone
              English
              arrow-up
              14
              ·
              edit-2
              18 hours ago

              First year? That shit is like A+ cert level knowledge or below, and A+ is damn near worthless. They would know that in the first few hours of a study guide

              • Kane@femboys.biz
                arrow-up
                3
                ·
                18 hours ago

                I was being generous when you consider the people in school who somehow pass, even when they don’t know a thing 🥲

              • Kane@femboys.biz
                arrow-up
                6
                ·
                18 hours ago

                Apologies if I came across as hostile. I did not get your meaning through text.

      • Jo Miran@lemmy.ml
        arrow-up
        21
        ·
        18 hours ago

        There is nothing wrong with being 19-25. There’s something wrong with being wholly incompetent.

        • ploot@lemmy.blahaj.zone
          English
          arrow-up
          13
          ·
          17 hours ago

          There’s not really anything wrong with being incompetent, so long as you have the humility to admit it and learn from people who know better, and try not to cause harm. That’s not Musk’s minions though.

          • Jo Miran@lemmy.ml
            arrow-up
            8
            ·
            17 hours ago

            I think it’s important to differentiate incompetence from ignorance. Ignorance is not knowing. Incompetence is not being able to fulfill the requirements for your assigned task. If you cannot fulfill the requirements for your given task, then you should not be given said task.

    • Jax@sh.itjust.works
      arrow-up
      7
      ·
      18 hours ago

      You have to understand that the average Trump voter probably knows everything they know about computers from watching the ‘wacky-zany hacker with personality issues/quirks’ “hack” into things by tippity tapping their fingies on a keyboard in your average copaganda performance.

      This is something those types of people will believe.

  • vga@sopuli.xyz
    arrow-up
    15
    ·
    13 hours ago

    I’ve used local hard drives since like 1992 and I have never ever gotten them to overheat.

  • golden_zealot@lemmy.ml
    English
    arrow-up
    30
    ·
    edit-2
    15 hours ago

    I used to perform data analysis of robotics firmware logs which would generate several million log lines per hour and that was my second job out of college.

    I don’t know how you fuck up 60k lines that bad. Is he nesting 150 for loops and loading a copy of the data set in each one while mining crypto??

    • ButtDrugs@lemm.ee
      arrow-up
      10
      ·
      13 hours ago

      Substring searches in unindexed large string columns or cartesian explosion caused by shitty joins would be my initial guess.

        • manicdave@feddit.uk
          arrow-up
          2
          ·
          9 hours ago

          If there’s something you want to search by in a database, you should index it.

          Indexing will create an ordered data structure that will allow much faster queries. If you were looking for the username gazter in an unindexed column, it would have to check literally every username entry. In a table of 1,000,000 entries it would check 1,000,000 times.

          In an indexed column it might do something like ask to be pointed to every name beginning with “g”, then of those ask to be pointed to every name with the second letter “a” and so on. It would find out where in the database gazter is by checking only six times.
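
A minimal sketch of that difference using Python’s built-in `sqlite3`. The `users` table and the index name are invented for the example, and SQLite actually uses a B-tree rather than the letter-by-letter walk described above, but the effect is the same: a full scan becomes a targeted lookup. The exact wording of the plan text varies between SQLite versions, so treat the printed strings as indicative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [(f"user{i}",) for i in range(10_000)] + [("gazter",)],
)

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT id FROM users WHERE username = 'gazter'"

before = plan(query)  # no index yet: SQLite has to scan the whole table
conn.execute("CREATE INDEX idx_users_username ON users (username)")
after = plan(query)   # with the index: SQLite can search for the value directly

print("before:", before)
print("after: ", after)
```

The plan should switch from a `SCAN` of the table to a `SEARCH` using the new index.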

          Substring matching is much more computationally difficult as it has to pull out each potentially matching value and run it through a function that checks if gazter exists somewhere in that value. Basically if you find yourself doing it you need to come up with a better plan.
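
To make that concrete, here is a small `sqlite3` sketch (table, index and usernames are made up). Even with an index on the column, a pattern with a leading wildcard cannot use it to jump to the right place, so every value still has to be examined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("CREATE INDEX idx_users_username ON users (username)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("gazter",), ("xX_gazter_Xx",), ("someone_else",)],
)

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Exact match: the index can be searched directly.
exact = plan("SELECT id FROM users WHERE username = 'gazter'")

# Substring match with a leading wildcard: every entry has to be checked.
substring = plan("SELECT id FROM users WHERE username LIKE '%gazter%'")

hits = conn.execute(
    "SELECT username FROM users WHERE username LIKE '%gazter%' ORDER BY username"
).fetchall()

print("exact:    ", exact)
print("substring:", substring)
print("hits:     ", hits)
```

The exact match should show up as a `SEARCH` and the substring match as a `SCAN`, though the precise plan text depends on the SQLite version.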

          Cartesian explosion would be when your query ends up doing a shitload of redundant work. Like if the query to load this thread were to look up all the posters here, get all their posts, get the threads from those posts and filter on the thread id.

        • ButtDrugs@lemm.ee
          arrow-up
          2
          ·
          edit-2
          10 hours ago

          Storing large volumes of text in a database column without optimization, then searching for small strings within it. It causes the database to basically search character by character to find a match by reading everything from disk. If you use indexes the database can do a lot of really incredible optimization to make finding values much faster, and honestly string searching is better suited to a non-relational DB engine (which is why search engines don’t use relational DBs).

          Cartesian explosion is where you join related data together in a way that causes your result set to be wayyyy bigger than you expect. For example if you try to search through blog posts, but then also decide to bring in comments to search, then bring in the authors of those comments and all their comments from other posts. Result sets start to grow exponentially in that way, so maybe if you only search a few thousand blog posts you might be searching through millions of records because you designed your queries poorly.
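
The blog example above can be sketched with `sqlite3` to show how quickly the row counts multiply. The schema and the sizes (50 posts, 20 comments each, 10 authors) are hypothetical and deliberately tiny.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, author_id INTEGER);
""")

POSTS, COMMENTS_PER_POST, AUTHORS = 50, 20, 10  # hypothetical sizes
conn.executemany("INSERT INTO posts VALUES (?)", [(p,) for p in range(POSTS)])
conn.executemany(
    "INSERT INTO comments (post_id, author_id) VALUES (?, ?)",
    [(p, c % AUTHORS) for p in range(POSTS) for c in range(COMMENTS_PER_POST)],
)

# Just the posts.
posts_only = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]

# Posts joined to their comments.
with_comments = conn.execute(
    "SELECT COUNT(*) FROM posts p JOIN comments c ON c.post_id = p.id"
).fetchone()[0]

# ...and then every other comment by each commenting author.
with_author_history = conn.execute("""
    SELECT COUNT(*)
    FROM posts p
    JOIN comments c ON c.post_id = p.id
    JOIN comments other ON other.author_id = c.author_id
""").fetchone()[0]

print(posts_only, with_comments, with_author_history)
```

Fifty posts become 1,000 joined rows and then 100,000 once each author’s full comment history is pulled in, with only 1,000 comments actually stored.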