A software developer and Linux nerd, living in Germany. I’m usually a chill dude, but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt; I usually try to be nice and give good advice, though.

I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things, too.

  • 9 Posts
  • 1.02K Comments
Joined 9 months ago
Cake day: June 25th, 2024

  • Uh, thanks. That really doesn’t look good. Copyright infringement is usually a civil matter, and I believe we already had sufficient laws to handle that in European countries. I haven’t read the cited new law, but I guess that “shortcut” just does away with everyone’s privacy. Plus it’s going to swamp the courts with cases. I’m not sure if they’re bored or anything… But either they just hand out fines without checking properly, or, if done properly, it’s a lot of additional work for the justice system. To the benefit of the copyright industry. Either way, it’s just bad for the people.




  • Last time I checked, Waydroid was one of the more common ways to launch Android apps on Linux. You can’t just package the bare app file, since you need the whole Android runtime and graphical environment. Plus, an app could include machine code for a different architecture than the desktop computer’s. So either you use a layer like Waydroid, or you bundle all of that together with the app in a Linux package…

    Android includes a lot more than just a Linux kernel. An app could request access to your GPS, or to your contacts, calendar or storage. And that’s not part of Linux. In fact, not even asking to run something in the background or opening a window translates to Linux. An Android app can do none of that unless the framework to deal with it is in place. That’s why we need emulation or translation layers.
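    For what it’s worth, the user-facing side of that layer is pretty small. A rough sketch of launching an app through Waydroid from a script; the package id is a placeholder, and the exact subcommands may differ between versions, so treat the details as assumptions and check waydroid --help:

    ```python
    import subprocess
    import time

    PACKAGE = "org.example.app"  # placeholder Android package id

    # Start the Android container session in the background
    # (harmless if one is already running).
    subprocess.Popen(["waydroid", "session", "start"])
    time.sleep(10)  # crude wait for the session; a real script would poll its status

    # Ask the running session to launch the app in its own window.
    subprocess.run(["waydroid", "app", "launch", PACKAGE], check=True)
    ```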


  • hendrik@palaver.p3x.de to Linux@lemmy.ml · Is Ctrl+D really like Enter? · 3 days ago
    That’s right. I don’t think there is a good way to do it. I just take whatever link the small Fediverse icon provides. But I don’t think it matters that much for your audience: they’re spread over several instances, and it’ll be an external link for some of them no matter what you do. I’m not sure whether we have the ambition to solve this. I don’t see anything the user could do. Either the software handles it in some way, or it is how it is.



  • hendrik@palaver.p3x.de to Linux@lemmy.ml · Is Ctrl+D really like Enter? · 3 days ago

    I don’t get the reference. This is the first time I’ve read that claim. But I’d certainly hope people know there is a difference between End Of Line and End Of File… I mean, they’re alike, they both end something. But it’s not the same thing. The article explains the details of how it’s handled.
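    If you want to see the difference for yourself, here’s a minimal sketch of my own (not from the article) that reads raw bytes from stdin. In a line-buffered terminal, Ctrl+D just flushes the pending input to the reader; only Ctrl+D at the start of an empty line makes read() return zero bytes, which is how end-of-file is signalled. There is no EOF character in the stream itself:

    ```python
    import os
    import sys

    # Read raw chunks straight from the terminal.
    while True:
        chunk = os.read(sys.stdin.fileno(), 1024)
        if not chunk:  # a zero-length read is the EOF condition
            print("EOF: read() returned 0 bytes")
            break
        print(f"read {len(chunk)} bytes: {chunk!r}")
    ```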


  • Hmmh, I mean, I always get annoyed if I do some hobby stuff and just want to contribute one small change, and then I’m required to also check the documentation, update the tests, sign a contributor agreement and go through an entire checklist for the pull request. I get why it’s necessary, but I wouldn’t like to see the whole thing applied to a project that used to be fun and have a very low barrier to everything. If we do it, and want to attract people like me, it has to be a compromise that fits the project’s nature and state… Preferably direct and with a flat hierarchy. I’d agree there most certainly is a correct amount of discipline and professionalism for every project.

    But I think (I’ve edited my previous comment) all we can strive for is very low coverage anyway. Most code of a federated app is concerned with networking, and we can’t test any of that with the usual procedures. So I think that makes the usual tests way less effective than with other software, at the same cost. I’m not sure what to make of this.

    I certainly like asserts. And a type system.
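    To illustrate the kind of lightweight safety net I mean (a made-up function, not actual PieFed code):

    ```python
    def vote_score(upvotes: int, downvotes: int) -> int:
        # The type hints let a checker like mypy catch misuse statically;
        # the assert documents an invariant and fails early at runtime.
        assert upvotes >= 0 and downvotes >= 0, "counts can't be negative"
        return upvotes - downvotes
    ```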

    By the way, what do you think of Flask? I always thought it’s quite elegant, and it rarely let me down. The only issue I have with Flask is that it isn’t async.
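    By elegant I mostly mean how little ceremony a minimal app needs. The standard hello-world, nothing project-specific:

    ```python
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def index() -> str:
        return "Hello from Flask!"

    if __name__ == "__main__":
        app.run(debug=True)  # development server only
    ```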


  • Uh, well, I contribute because I don’t need to write any tests and follow many procedures. Sure, we could make it a chore, maybe also use a statically typed language to prevent some possible bugs… I think it works well enough as is. And we kind of already have the other approach with Lemmy, and people are free to contribute to both projects. I’ve been running a PieFed instance for the better part of a year now and it’s never crashed. Python and the Flask framework also have some ways to deal with issues. And I’ve never enjoyed Django very much, neither deploying such projects as an admin nor coding in it…

    If it were up to me, I’d keep all the testing, complicated procedures, CI, code style and contribution guides for big and/or work-related projects. It makes things more professional, sure. But also more complex and less fun, and it slows down development… It’s a trade-off that depends on where in the development cycle a project is, and I think PieFed is in a fast-growth phase right now. And we can see with Lemmy that Rust plus some amount of testing doesn’t really help either: federation broke like once or twice(?) with major effects on the whole network, and I’ve heard admins complain about their databases. So I think realistically you’d need really complex tests, where you launch several nodes/instances in a simulated network, to test Fediverse apps properly. Unit tests or API tests won’t cut it. They definitely help, but they’re not suited to detect a lot of the annoying bugs.

    Long story short, I’m not really opposed to anything, but I don’t think I’d like to contribute unit tests or API tests. I’d applaud if someone did whatever those deployment tests are called that exercise distributed software in a simulated network of several nodes, maybe also including stress tests and mixed versions. Afaik no-one does this except really professional companies and academic research?!
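    To make that concrete, here’s a rough sketch of the kind of multi-node test I mean. Everything in it is hypothetical: it assumes two test instances are already running locally (say, started via docker compose), and the endpoint names are placeholders, not the real PieFed or Lemmy API:

    ```python
    import time

    import requests

    NODE_A = "http://localhost:8001"  # first test instance (placeholder)
    NODE_B = "http://localhost:8002"  # second test instance (placeholder)

    def test_post_federates():
        """Create a post on node A and wait for it to appear on node B."""
        title = "federation smoke test"
        requests.post(f"{NODE_A}/api/post", json={"title": title}).raise_for_status()
        for _ in range(30):  # poll for up to ~30 seconds
            hits = requests.get(f"{NODE_B}/api/search", params={"q": title}).json()
            if hits:
                return  # the post showed up on the other node
            time.sleep(1)
        raise AssertionError("post never arrived on node B")
    ```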



  • You’re right, the private data is a bit of a contrived example. I just wanted to make the argument that this isn’t just about copyright law. Something could be fair use under copyright law, but still illegal to use for other reasons. Which is problematic when doing unsupervised web scraping, for example. It’s definitely an issue, but out of scope if we limit the discussion to copyright only.

    I’ve just skimmed the last article; I’m going to read it tomorrow. But I don’t think I’d like to argue for extending copyright. I think that would be bad. But it’s debatable whether AI training falls into that category. I’m not sure how it is in different jurisdictions… Maybe it’s clear in the US? I always struggle to read American legislation. I can just say it’s not clear where I live. And that comes with consequences: companies do AI in other countries like the USA or China rather than in the EU, which is an issue for our economy and scientific progress. And wherever the law isn’t clear enough, that’s a disadvantage for smaller companies/institutions or individuals, since it’s the big companies who can easily afford lawyers. It also has consequences for me personally: for example, Meta’s use policy for the newer Llama models excludes Europeans, so I’m not allowed to use them. That might not be about copyright either, but it’s definitely due to unclear regulations.

    So I don’t advocate extending copyright. My stance is that we don’t have clear regulation in the first place. I’d leave all exemptions and specifics in place; we can leave libraries, music, research and reverse engineering as they are. But the current warfare is super unhealthy. We have some companies scraping everything, meanwhile other people come up with tarpits and countermeasures, like Cloudflare with their AI Labyrinth last week… One newer business model is introducing walled gardens so companies can make sure they’re the only ones selling their user data… I think that’s all very unhealthy, and it favors large companies doing that “research”. Meanwhile the internet gets flooded with slop, half the internet services are barely usable, and we might end up with dystopian Skynet corporations dominating the flow of information anyway.

    And I think that’s the bigger issue than copyright. If AI proves to be disruptive, it needs to be used somewhat ethically, and I think the only way to get there is regulation. We need to level the field so research and non-profits get a chance. We currently have “smaller” startups participating, and several companies release open-weight models. But we can’t rely on their altruism. My prediction is they’ll all stop once this starts to interfere with their business or those models get really useful. And then it’s going to be OpenAI and Anthropic & Co who get to decide what kind of information the world has access to. Which would be very bad. They also offer little transparency: more and more people rely on these services, and AI is very much a black box. The large companies stopped telling us what went in a few years ago, when all the copyright lawsuits started. The first Llama model still came with a scientific paper detailing all the datasets that went in, but as far as I understand, they stopped soon after. The rest are trade secrets. So if someone uses ChatGPT (which lots of people do), they’re completely at the mercy of OpenAI. OpenAI gets to decide in which ways the model is biased (or not), what it can and cannot answer, what is fed to the users. I think that’s the main issue with it. (Along with slop.)

    Copyright of training data is some sort of sideshow. But I still think we have a lot of unaddressed issues with AI, and leaving them open is just going to help the big players. We need clearer regulation, so a small company that can’t afford a lot of lawyers can also be 100% sure whether something is fair use or not. And personally, I think we need to hold them all accountable and force them to be more transparent with everything, like a rough description of the datasets. I’d also force generative AI services to implement watermarking, to at least try to tackle slop and people doing their homework with ChatGPT. Sure, this can all be circumvented, but we can at least try to do something about it. And I’d like it if big companies bought at least one copy of each book they use to train their AI; Meta or OpenAI can afford to pay a few millions. Otherwise they just leech off people’s content. I think it’s unfair that some people take quite some time to write books, Reddit comments and Wikipedia articles, and then someone else gets to make a big profit from that. It’s not straightforward to solve, but I don’t think it’s very healthy for humanity to just hand everything over to greedy companies. And I also don’t think it’s healthy to embark on warfare, which seems to be happening right now. That way we’re likely to all lose access to free information.
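    On watermarking: to give an idea of what such a requirement could mean technically, here’s a heavily simplified sketch of one published idea, biasing generation towards a pseudo-random “green list” of tokens that a detector can later test for statistically. This illustrates the concept only; it’s not any vendor’s actual scheme:

    ```python
    import hashlib
    import random

    def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
        # Seed a PRNG from the previous token so generator and detector
        # derive the same "green" half of the vocabulary.
        seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
        rng = random.Random(seed)
        return set(rng.sample(vocab, int(len(vocab) * fraction)))

    def looks_watermarked(tokens: list[str], vocab: list[str]) -> bool:
        # A watermarking generator prefers green-listed tokens; unmarked
        # text should land near the 50% base rate.
        hits = sum(tok in green_list(prev, vocab)
                   for prev, tok in zip(tokens, tokens[1:]))
        return hits > 0.7 * max(len(tokens) - 1, 1)
    ```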


  • Private conversations are something entirely different from publically available data

    But that’s kind of the question here… Is data processing fair use in every case? If yes, we’ve just brought private conversations and everything else in as well. If not: what are the requirements? Then we need to talk about which use cases we deem legitimate and how each gets handled… I think that’s exactly what we’re discussing here. IMHO that’s the point of the debate… It’s either everything… or nothing… or we need to discuss the details.

    Compensation for essentially making observations will inevitably lead to abuse of the system and deliver AI into the hands of the stupidly rich, something the world doesn’t need.

    I’m not sure about that. I mean, I gave some examples with licensing music at events and with libraries (in general). Do those also get abused by the rich? I don’t think so. At least not that much, which makes me think it might be a feasible approach. Of course it gets more complicated than that. Licensing music, for example, brings in collecting societies, and all those agencies have proven to be problematic in various ways. The licensing industry isn’t exactly fair either; they also mainly shove money into the hands of the rich… So a proper solution would be a bit more complicated than that.

    I mean, I’d like to agree with you here and have a straightforward parallel for how to deal with AI training datasets. But I don’t think it’s as easy as that. We can’t just say processing data is fair use, because there are a lot of details involved, as I said with privacy. We can’t process private data and just do whatever with it. We can’t do everything with copyrighted material, even if it’s public. Whether a use is legitimate already depends on the details, and I think the same applies to AI. It needs a more nuanced perspective than just allowing or prohibiting everything.


  • I’m not that educated on US law and whether everything is subsumed under fair use. I believe in Germany we have a separate rule for ephemeral copies during data processing and network transfers (§44a UrhG). So we don’t have to deal with that using a law that was more concerned with someone photocopying a book. And I believe some countries distinguish between commercial interests and non-profit research. Plus we have exemptions, for example allowing someone to play music at their non-profit events, even without the consent of the copyright holder. They still need to pay a “fair” amount, but it’s not up to the copyright holder to decide… We specify under what circumstances libraries can use content, again differentiating between interests, and we’ve had a rudimentary law concerning data mining for research since 2017.

    I think some specific laws like that would be better suited to guide the issue with AI towards a healthy solution than one blunt tool for everything. Why not say AI training is allowed, but requires fair compensation? We could even have a standardized way to opt in or opt out… I’m not sure if we need that. I’m fine with my blog posts and Free Software projects ending up in some AI. But I don’t want it to listen in on my private conversations, like an Alexa could… I believe that requires a law that distinguishes between the two. If everything is fair use, I can say goodbye to privacy, but at the same time cancel my Netflix and Spotify subscriptions, since I’m going to claim I’m just collecting all of that for future AI training.
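    For the opt-out part, something as simple as a robots.txt-style policy file could work. A minimal sketch; note that the ai.txt filename and the “train: deny” directive are made up here for illustration, no such standard exists:

    ```python
    import urllib.request

    def may_train_on(site: str) -> bool:
        # Fetch the site's (hypothetical) machine-readable AI policy.
        try:
            with urllib.request.urlopen(f"{site}/ai.txt", timeout=5) as resp:
                policy = resp.read().decode("utf-8", errors="replace")
        except OSError:
            return True  # no policy published: fall back to the legal default
        return "train: deny" not in policy.lower()

    print(may_train_on("https://example.org"))
    ```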

    I personally think we can’t allow Amazon to spy on me and just claim it’s fair use. So context matters. And I also think the goal and nature of the AI matter: research needs less strict rules than commercial interests. And I don’t think networking of digital devices can be handled the same way as AI training; I strongly believe that requires separate laws, which also need to factor in whether there is a legitimate interest to begin with.


  • Those are great links. I think I already read Cory Doctorow’s post.

    I think I already struggle with the premise. I think Google, Facebook, etc. using my data is NOT fair use. They cannot just publish my full name, pictures and texts without my explicit consent.

    And this is kind of lumping everything together again… For-profit AI and open-weight models to the benefit of humanity aren’t the same thing. And I think we should give open-weight models an advantage by applying different rules, i.e. let people use data more freely if they contribute something back and the resulting product can be used freely as well. And make the rules stricter for big, closed, for-profit services, and demand more transparency as well.

    I mean, realistically, we don’t have any proper rules in place. The AI companies, for example, just pirate everything from Anna’s Archive, and they’re rich enough to afford enough lawyers to get away with it. That’s unlike libraries, which pay for the books and DVDs on their shelves… So that’s definitely illegal by any standard.

    But I agree that learning something from a textbook is a different thing than copying it. The resulting knowledge escapes the copyrighted material. I believe that’s the same no matter whether it’s machine learning or me learning computer programming from textbooks… The thing is just that you can’t steal in the process. That’s still illegal. IMO.

    One of my fears is that AI really is as disruptive as people think, and that the market will be dominated by unsympathetic big-tech companies due to the nature of it. I think we need some good legislation to push AI in the right direction, or we’re going to end up in some sci-fi dystopia where big companies just shape the world and our lives to their liking.




  • hendrik@palaver.p3x.de to Linux@lemmy.ml · Help me like desktop linux · 14 days ago

    I’ve been using it for quite some time now and I don’t see the issue. I mostly use GNOME, which is kind of polished and minimalistic(?) and looks very cohesive to me. But I believe the same applies to other desktop environments as well. My package manager mostly gets out of the way and I don’t have to pay much attention to it. I even get browser extensions and all the stuff that ties into one another from the same distro maintainers. I’ve tried other operating systems as well, but on those I needed to install 50 small utilities to make things usable, and those kind of fight each other too. On Linux I try to avoid Flatpak, and I wouldn’t use Snap at all. We still(?) have most software available as proper packages.

    I can see how image editing might be an issue. We have what we have and for the rest you need to get one of the commercial products running.