The whispering is all in her head and says she sucks

  • just_an_average_joe@lemmy.dbzer0.com
    ↑ 2 · 7 hours ago

    Not necessarily; CVs have complicated formatting. Nobody writes (or should write) blocks of text, and you don’t know how many columns the candidate is using. Is the candidate using a specific section to show a star-based or a word-based skill rating? So you can still search for individual keywords, but if you copy the whole PDF and paste it into a txt file (which is what will be forwarded to the ATS), it does not make much sense. The structure is too complicated to extract where you studied, what you studied and your grade, what other experience you have, how long you worked there, etc.

    Extracting structured data is a field of research in its own right. There is plenty of recent research on extracting structured data from academic PDFs (I was working on this at a research institute in Germany around 2022). Even when LLMs are used it can get really complicated, to the point that there are specialized LLMs for just that.

    But ATS systems are too cheap, or too low a priority, to even use OCR, let alone LLMs, so unfortunately the responsibility of making an easily parsable CV falls on the candidate.

    Try this next time you look at your CV: copy its text into a txt file, then think about whether you could write a program that reliably extracts your experience, education, interests, etc. It’s going to be super difficult, and even then it won’t generalize to thousands of other CVs.
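
    To make that concrete, here is a rough sketch of the kind of naive parser you might start with. Everything in it is made up for illustration: the heading list, the sample CV text, and the assumption that a two-column PDF comes out with its columns interleaved line by line (real copy-paste output varies by PDF and viewer).

```python
# Hypothetical heading names; real CVs use many more variants.
SECTION_HEADINGS = ("experience", "education", "skills", "interests")


def split_sections(text: str) -> dict[str, list[str]]:
    """Group lines under the most recent line that is exactly a known heading."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key = line.lower().rstrip(":")
        if key in SECTION_HEADINGS:
            # A new section starts here.
            current = key
            sections.setdefault(current, [])
        elif current is not None:
            # Ordinary line: attach it to the current section.
            sections[current].append(line)
    return sections


# Made-up single-column CV pasted as text: headings sit on their own lines.
single_column = """Experience
Acme GmbH, Developer, 2019-2022
Education
TU Example, MSc Computer Science
"""

# Made-up two-column CV pasted as text, assuming the columns end up
# interleaved, so two headings share one line.
two_column = """Experience Skills
Acme GmbH, Developer, 2019-2022 Python
Education Languages
TU Example, MSc Computer Science German
"""

good = split_sections(single_column)
bad = split_sections(two_column)
```

    With the single-column text, `good` has an `experience` and an `education` entry. With the interleaved text, no line matches a heading exactly, so `bad` is empty. You can patch that one case, but the next CV will break it in a different way.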

    • bufalo1973@lemmy.ml
      ↑ 2 · 3 hours ago

      All those “problems” apply to Word too. Maybe you use tables, maybe you use lists, maybe you use stars, maybe … So there’s no advantage in forcing people to use Word “because the machine can understand it better”. Because that’s a lie.

      • FlorianSimon@sh.itjust.works
        ↑ 2 · 2 hours ago

        Exactly what I was about to reply. Try copying a crazy multi-column Word document into text, and you’ll get similar results.

        Copy-pasting parts of your PDF document is not any more difficult than doing the same thing for a Word document.