Each time I try AMD graphics, something is fucked for me. Back with fglrx, fglrx just sucked, so I used Nvidia. Then I had an AMD right around when they finally had opensource drivers, but it was still buggy as hell. So I went with Nvidia again (first a GTX 790, then a GTX 1060). In the meantime I had a new work notebook where I also went with an AMD APU, and had driver crashes for a long time when I was in video calls and it had to decode multiple streams. That thankfully stabilized with Linux 6.4.

Since sooo many people in the community swear by AMD, I thought “dammit, let’s try it again for my new desktop” and got a 7800 XT … and I have to reboot ~5 times until I finally make it to a running X server or Wayland session. Apparently I am hit by this problem (at least I hope so). But that doesn’t even read nice … the fix seems to be to revert another fix for power management. So I either have a mostly non-booting card or suboptimal power management.

I start to regret having chosen AMD … again :-/ I seem to be cursed.

    • aksdb@lemmy.world (OP)
      10 months ago

      I did live like this with all my intel/nvidia systems just fine, though. If AMD tends to have bugs like this, they still seem to suffer from the same shitty software development attitude as they did back in the fglrx days… with the added advantage that people from the community can now firefight some of the problems. For a product I paid a few hundred euros for I expect some quality assurance for its driver development - that seems to work with nvidia.

  • haui@lemmy.giftedmc.com
    10 months ago

    This reads like an alternate reality for me. I bought a new 3060 Ti and using Wayland with it is nearly impossible for me. I tried in Ubuntu and had tons of errors, and in Debian/KDE it won’t even log in without X11 enabled.

    When you go to protondb.com every game has tons of fixes for nvidia cards and every forum has fixes for nvidia cards while amd mostly works oob.

          • narc0tic_bird@lemm.ee
            9 months ago

            Great that it works well on COSMIC. They don’t want to use COSMIC and that’s their choice. You don’t have to be so salty about it.

            • Michael Murphy (S76)@lemmy.world
              9 months ago

              What makes you think I’m “salty”? I’m not the one complaining about NVIDIA not working in Wayland, or saying that I’m going to sell my GPU.

              The only person who is salty is the one who would rather sell their GPU than use a Wayland desktop environment that supports NVIDIA as a first class citizen.

  • narc0tic_bird@lemm.ee
    9 months ago

    Ohh, so that’s the bug I’ve been experiencing ever since Fedora 39 updated to kernel 6.7. But I only get this on restarts, so cold starts work just fine. I actually have a 7800 XT as well.

    But other than that I only noticed one issue: video playback in Firefox sometimes shows visual artifacts across the screen while a game is running in the background (well, with Baldur’s Gate 3 at least). Fedora 39, KDE Plasma. Kernel 6.6 or 6.7 (or 6.5 for that matter). That said, I also had some suboptimal experiences with browser video playback on an AMD APU notebook under Windows (severe frame drops), so I’m not sure where to point my finger.

    Other than that it’s honestly been great. I switched from Windows + Nvidia to Linux + AMD basically January 1st of this year and only ever booted Windows twice to transfer game saves over for the few games that don’t have Steam Cloud.

    Turns out most of the problems I had with the Linux desktop were with Nvidia. I spent more time troubleshooting than actually using software. AMD isn’t perfect on Linux and with new kernel versions you’re liable to run into more issues, but AMD (and Intel) mostly work out of the box.

  • 1984@lemmy.today
    10 months ago

    It also matters what Linux distro you have. Some of them are horrible. I’m super happy with AMD graphics on Arch, and have no issues whatsoever, with probably 30 games in my Steam library that all work very well.

    So I think it may be your system and what drivers you installed, or some other config.

    I have a 6900 XT card, latest kernel, latest drivers. But I’ve had this graphics card since kernel 5.8 I think, with no issues.

  • loaExMachina@sh.itjust.works
    10 months ago

    Using an AMD RX 6600… Mostly going fine, tho I haven’t tried any big heavy games. One thing tho… Every time I turn on my computer, no display. I reboot it and then it works fine, but it never does the first time. One path I’ll investigate is the monitor: my monitors are both older and use DVI or VGA ports, so I have to use converters. I might try and get my hands on a more recent monitor to see if I still get the same problem. But if I do, I’m not even sure where to ask. I don’t even think it’s a Linux problem, because I tried removing my drive with Linux, leaving the one with Windows, and the problem remains. I also was using Mint when the problem started and switched to Arch (btw) since, and it doesn’t change a thing.

    • cevn@lemmy.world
      10 months ago

      I had a similar problem which was resolved by disabling the motherboard integrated graphics in bios settings.

      • loaExMachina@sh.itjust.works
        10 months ago

        Thank you! It didn’t seem to work on its own, but I also noticed I wasn’t booting in EFI mode, so maybe if I just change my booting partition and combine it with your advice it’ll work…

        • cevn@lemmy.world
          10 months ago

          Mine went back to no display only on boot, so I guess it didn’t work for me either :( good luck tho!!

          • loaExMachina@sh.itjust.works
            7 months ago

            I still haven’t found the solution, have you had any luck with yours?

            I tried switching every UEFI setting that seemed to have something to do with booting or GPUs, reinstalling the GPU BIOS, upgrading the mobo BIOS, getting a monitor I could plug in without a switch… All to no avail.

            Well, I think before upgrading the BIOS, one thing had a slightly different result: setting the boot mode to UEFI and disabling CSM made it display “no gop (graphic output protocol)” after a few minutes, and it offered to either take me to the UEFI settings or load defaults (which implied going back to CSM), after which it would boot that one time and then go back to doing the same thing.

            I don’t think I’ve had this error since the mobo BIOS upgrade, but still no display unless I reboot, unless the computer had been turned on until recently. I’m kinda out of ideas…

            • cevn@lemmy.world
              7 months ago

              …unfortunately no… I work around it by knowing what buttons to press but it’s pretty stupid.

  • Captain Janeway@lemmy.world
    10 months ago

    I’ve had similar issues. I don’t understand the love for AMD. My whole rig is AMD, but it’s constantly having GPU crashes. All games run at high FPS and my CPU temps seem nominal. But the games will crash. Everything from RimWorld to Baldur’s Gate 3. They all run pinned at 60fps but randomly crash. I’ve tried a thousand different configurations and drivers. I’ve tried Ubuntu and Linux Mint. I’m now just accepting that I can’t rely on it as a gaming rig. I like that AMD is trying to be progressive with open source drivers but the quality doesn’t seem to be there. My next rig might be Nvidia and Intel. But we will see.

    • bazsy@lemmy.world
      10 months ago

      Did you check the system logs to see what caused it?

      Many things can result in seemingly random crashes. Any overclock (including XMP and EXPO), an undervolt, or even a BIOS version can be problematic.

      I would check first if it’s stable on windows.

      • Captain Janeway@lemmy.world
        10 months ago

        It’s not stable on Windows either. But I haven’t looked at logs because I didn’t really know what - or how - to check.

        • bazsy@lemmy.world
          10 months ago

          Most distros use systemd and its logging solution, journald. You can use journalctl to read the logs around the time of the crash, e.g.:

          • journalctl -S -5m: shows the last 5 minutes. Use this when a game crashes but the system keeps working and does not reboot.
          • journalctl -b -1 -S -10m: shows the last 10 minutes from the previous boot. Use this if the crash froze the whole system and it rebooted.

          Look for red lines (errors) and what wrote them. AMD GPU faults usually mention ‘amdgpu’; memory errors could appear as ‘protection fault’.
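If scrolling through the journal by hand gets tedious, the two commands above could be wrapped in a small script. This is only a rough sketch: the function names `read_journal` and `filter_gpu_lines` are made up for illustration, and the keyword list just mirrors the two strings mentioned in the comment (‘amdgpu’ and ‘protection fault’), so it will miss anything worded differently.

```python
import subprocess

# Substrings suggested in the comment above; matching is case-insensitive.
# This list is an assumption, not an exhaustive set of GPU-related messages.
KEYWORDS = ("amdgpu", "protection fault")


def filter_gpu_lines(lines, keywords=KEYWORDS):
    """Return only the journal lines that mention one of the keywords."""
    lowered = [k.lower() for k in keywords]
    return [ln for ln in lines if any(k in ln.lower() for k in lowered)]


def read_journal(previous_boot=False, minutes=5):
    """Run journalctl and return its output as a list of lines.

    previous_boot=False, minutes=5  -> journalctl -S -5m
    previous_boot=True, minutes=10  -> journalctl -b -1 -S -10m
    """
    cmd = ["journalctl", "--no-pager"]
    if previous_boot:
        cmd += ["-b", "-1"]
    cmd += ["-S", f"-{minutes}m"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

After a game crash that left the system running, something like `print("\n".join(filter_gpu_lines(read_journal())))` would print only the matching lines; after a hard freeze and reboot, `read_journal(previous_boot=True, minutes=10)` corresponds to the second command.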

          • Captain Janeway@lemmy.world
            10 months ago

            journalctl -S -5m

            Looks like these are the errors I’m seeing. I know it’s not helpful to just drop this in the chat, but I’m doing it for posterity (and to let you know your comment did in fact help me)!

            Feb 04 16:47:40 computer kernel: [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out!
            Feb 04 16:47:40 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=17063130, emitted seq=17063132
            Feb 04 16:47:40 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 161654 thread redDispatcher9 pid 161668
            Feb 04 16:47:40 computer kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
            Feb 04 16:47:40 computer kernel: amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
            Feb 04 16:47:40 computer kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
            Feb 04 16:47:40 computer kernel: amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
            Feb 04 16:47:40 computer kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
            Feb 04 16:47:40 computer kernel: [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
            
            • bazsy@lemmy.world
              10 months ago

              Happy to help! Though you are right, this is a rather generic error that doesn’t help much; it just confirms that the GPU is the issue.

              At this point it could be a driver issue since there are similar open bug reports. A hardware problem is still possible since you previously said that it’s unstable on windows too, and power related issues can also lead to this error message.

              • Captain Janeway@lemmy.world
                9 months ago

                EDIT: Tentative solution: CoreCtrl

                CoreCtrl allowed me to underclock my Radeon 5600XT GPU (currently set values to GPU 800MHz and memory set to 500MHz). I say “tentative” because this problem has been persistent for years, but I’ve been running Cyberpunk for 1 hour at 60FPS on High settings (and mostly 60FPS on Ultra, but I had some FPS drops). Even if this solution isn’t 100% perfect, I think some combination of changing the GPU values is probably going to make my rig much more functional.

                I found CoreCtrl based on a Reddit thread last night but didn’t have time to test it until this evening after work. Seems to have made a world of difference.


                Yeah I’ve tried just about every feasible kernel parameter for the amdgpu module, updated my kernel to 6.2 on Linux Mint, and I’ve tried several different BIOS settings. My system runs everything reasonably. Even Cyberpunk 2077 is generally at 60FPS. But after about 5 minutes of gaming on Cyberpunk 2077, it crashes. Other games last longer, which is why I use Cyberpunk 2077 to stress test my system.

                These are my system specs:

                • PSU: 850 Watt 80 PLUS Gold Fully Modular ATX
                • CPU: AMD Ryzen 7 2700 Eight-Core Processor × 8
                • GPU: Radeon 5600XT
                • RAM: G.Skill DDR4-3600 CL16-19-19-39 1.35V (2x16GB = 32GB total system memory)
                • SSD: Samsung (MZ-V7E500BW) 970 EVO SSD 500GB - M.2 NVMe
                • MOBO: Asus x470 Pro
                • Other: TP-Link AC1200 PCIe WiFi Card for PC (Archer T5E) - Bluetooth 4.2, Dual Band Wireless Network Card installed in PCIEx1_3 which seems like it could be a variable I should remove, but I’ve tried removing it and didn’t see any changes in behavior. I’ve tried various PCIEx1_* slots with similar results.

                I don’t really see where I might be going wrong here. I bought this all ~4 years ago and I’ve always had these intermittent crashes. It’s admittedly worse on Linux, but it still occurred on Windows.

                Anyways, I spent about 5 hours last night reading bug forums, testing various amdgpu mod parameters, settings in my BIOS, and even re-configuring my fans to provide (potentially) more optimal cooling. None of this really made a difference. I run two 1080p monitors (not exactly breaking the bank here). I had a lot of hope regarding one forum about ring gfx_1.0.0 errors related to how AMD reads the GPU in Linux. My graphics card is detected as: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] and apparently some machines used to accidentally use the total allocated memory for 5700XT instead of the 5600XT. This resulted in some form of corrupt memory allocation. That sort of behavior would make sense for my system since it runs well, but just fails suddenly.

                Other errors I’ve seen are:

                Feb 04 20:17:01 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=116669, emitted seq=116671
                Feb 04 20:17:01 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 3668 thread redDispatcher12 pid 3684
                ...
                Feb 04 20:26:16 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=34068, emitted seq=34071
                Feb 04 20:26:16 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 4208 thread redDispatcher13 pid 4232
                Feb 04 20:26:17 computer kernel: [drm:do_aquire_global_lock.isra.0 [amdgpu]] *ERROR* [CRTC:77:crtc-0] hw_done or flip_done timed out
                ...
                Feb 04 21:00:43 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.3.0 timeout, signaled seq=3085, emitted seq=3086
                Feb 04 21:00:43 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 3771 thread redDispatcher8 pid 3783
                ...
                Feb 04 22:28:50 computer kernel: [drm:amdgpu_device_ip_early_init [amdgpu]] *ERROR* early_init of IP block  failed -19
                Feb 04 22:28:50 computer kernel: [drm:amdgpu_device_ip_early_init [amdgpu]] *ERROR* early_init of IP block  failed -19
                Feb 04 22:36:57 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=171774, emitted seq=171776
                Feb 04 22:36:57 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 4122 thread redDispatcher5 pid 4131
                ...
                Feb 04 22:45:46 computer kernel: [drm:do_aquire_global_lock.isra.0 [amdgpu]] *ERROR* [CRTC:77:crtc-0] hw_done or flip_done timed out
                Feb 04 22:45:56 computer kernel: [drm:do_aquire_global_lock.isra.0 [amdgpu]] *ERROR* [CRTC:80:crtc-1] hw_done or flip_done timed out
                Feb 04 22:46:19 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.1.0 timeout, signaled seq=123, emitted seq=124
                Feb 04 22:46:19 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 4187 thread redDispatcher8 pid 4202
                ...
                Feb 04 23:49:45 computer kernel: [drm:gfx_v10_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
                Feb 04 23:49:45 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=435155, emitted seq=435157
                Feb 04 23:49:45 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 3668 thread redDispatcher12 pid 3690
                ...
                Feb 04 23:58:58 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=66268, emitted seq=66270
                Feb 04 23:58:58 computer kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process GameThread pid 4180 thread redDispatcher11 pid 4196
                Feb 04 23:58:58 computer kernel: [drm:do_aquire_global_lock.isra.0 [amdgpu]] *ERROR* [CRTC:77:crtc-0] hw_done or flip_done timed out
                

                ^ These are all errors which occurred from various tests of amdgpu module settings and/or BIOS settings. The common thread is some form of ring XXXX timeout.
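To see at a glance which rings are timing out across a pile of log lines like the ones above, a short script can tally them. This is a sketch, not a diagnostic tool: the regular expression is written against the `ring ... timeout, signaled seq=..., emitted seq=...` wording seen in these particular logs, and the function name `summarize_ring_timeouts` is made up for illustration. Other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Matches the amdgpu_job_timedout message format seen in the logs above.
# Assumption: the wording is the same on the kernel being inspected.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def summarize_ring_timeouts(lines):
    """Count how many timeout messages each ring name produced."""
    counts = Counter()
    for line in lines:
        match = RING_TIMEOUT.search(line)
        if match:
            counts[match.group("ring")] += 1
    return counts
```

Feeding it the output of `journalctl` (one string per line) gives a `Counter` keyed by ring name, e.g. `gfx_0.0.0` or `comp_1.3.0`, which makes it easier to tell whether the graphics ring or a compute ring dominates.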

                These two threads seemed like my best chance, but their proposed solutions didn’t help:

                1. https://bugzilla.kernel.org/show_bug.cgi?id=201957
                2. https://bugzilla.kernel.org/show_bug.cgi?id=202665#c7
  • Hellmo_luciferrari@lemm.ee
    10 months ago

    And here I am with a 3090 having more issues than I have time for wishing I went with an AMD card. Sadly we both can see grass ain’t necessarily greener.

      • Hellmo_luciferrari@lemm.ee
        10 months ago

        I’ve tried the open source drivers, the proprietary dkms variant, and standard proprietary drivers and all give me issues.

          • Hellmo_luciferrari@lemm.ee
            9 months ago

            Wow, I can’t believe I missed your response. Sorry for such a late reply.

            General instability, absolutely. Multi display issues. And seemingly no matter what I do Wayland on KDE is basically unusable for me.

            • aksdb@lemmy.world (OP)
              9 months ago

              Ah, I can relate then. I drove my previous NVidia also on X11, with only occasional experiments into Wayland. Since X11 was good enough for me, I wasn’t too sad about this.

              • Hellmo_luciferrari@lemm.ee
                9 months ago

                Even with X11 I have had nothing but instability sadly.

                I wanted to switch to Arch like I did for my laptop, but the cons outweighed the pros ultimately for me.