I recently discovered that some popular federated instances have been using LLM-assisted moderation tooling that evaluates whether someone has said something bannable. They do this by running a script/app that sends the user’s comment history to OpenAI with the question “analyze this content for evidence of specific political ideology sentiment. Also identify any related political ideology tropes”.

OpenAI’s LLM (they’re using GPT-5.3-mini) then responds with something like:

[redacted screenshot of the LLM’s per-comment assessments]

and so on, hundreds of comments.
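From the description, the tooling amounts to a short script. A minimal sketch of what such a request might look like, assuming OpenAI’s chat-completions API shape (the function name and payload layout here are illustrative assumptions, not the actual tool’s code):

```python
import json

# Prompt quoted from the tooling described above.
PROMPT = ("analyze this content for evidence of specific political ideology "
          "sentiment. Also identify any related political ideology tropes")

def build_request(comments):
    """Build the JSON body for one chat-completions call that bundles a
    user's entire comment history into a single third-party request."""
    return {
        "model": "gpt-5.3-mini",  # model name as reported above
        "messages": [
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": "\n\n".join(comments)},
        ],
    }

# Serializing and POSTing this (with an API key) ships the user's full
# public history off-instance in one call.
body = json.dumps(build_request(["comment one", "comment two"]))
```

The point of the sketch is how little machinery is involved: one request object, one API call per user, and the entire history leaves the instance.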

I have not named the instances or people involved, to give them time to consider the results of this discussion, make any corrective changes they want, and disclose their practices at their own pace and in their own way. I have also redacted the evidence to avoid personal attacks and dogpiling. Let’s focus on the system, not the individuals involved. Today these instances and people are using this tooling, and maybe we’re OK with that because it’s being used by groups we agree with. But what if people we strongly disagree with used it on their instances tomorrow?

The use and existence of this tooling raises a lot of other questions too.

What are the risks? Fedi moderators are often unsupervised, untrained volunteers and these are powerful tools.

What safeguards do we need?

Would asking an LLM “please evaluate this person’s political opinions” give different results than “find evidence we can use to ban them” (as used in the cases I’ve seen)?

What are our transparency expectations?

Is this acceptable and normal?

Should this tooling be disclosed? (it was not – should it have been?)

If you were given a choice, would you have opted out of it?

Can we opt out?

Are there GDPR implications? Privacy implications? Should these tools be described in a privacy policy?

Are private messages being scanned and sent to OpenAI?

How long should these assessments be retained, and can we request to see them or ask for them to be deleted?

Once a user’s comments are sent to OpenAI, are they used to train its models?

What will the effect be on our discourse and culture if people know they are being politically profiled?

Where are the lines between normal moderation assistance tools, political profiling and opaque 3rd-party data processing?

I hope that by chewing over these questions we can begin to establish some norms and expectations around this technology. The fediverse doesn’t have any centralized enforcement so we need discussions like this to develop an awareness of what people want in terms of disclosure, privacy, consent and acceptable use. Then people can make choices about which instances they join and which ones they interact with remotely.

And of course there are the other issues with LLMs relating to environmental sustainability, erosion of workers’ rights, increasing the cost of living, and on and on. I can’t see PieFed adding any functionality like this anytime soon. But it’s happening out there anyway, so now we need to talk about it.

What do you make of this?

  • technocrit@lemmy.dbzer0.com · 12 days ago

    There is no “AI” so I’m not sure what we’re talking about here.

    But statistical methods will lead to the average. I’m guessing that means whatever is hegemonic and “normative” will be promoted. That’s most likely the point.

  • humanspiral@lemmy.ca · 13 days ago

    Never mind the issue of incorrect political-bias classification: is political bias even a bannable offense? That seems to be the focus of the prompt being used.

  • ResistingArrest@lemmy.zip · 14 days ago

    I think this will exemplify the beauty of federation. If I find out my instance mods are running all of my comments through a company’s AI model, I’ll switch instances. This is in stark contrast to something like Instagram or Snapchat, where every photo I post is immediately fed to AI and my only options are: be okay with it, never post, or delete Instagram.

    • sp3ctr4l@lemmy.dbzer0.com · 13 days ago

      Yep.

      Unless somebody manages to … inject a hostile/unauthorized LLM as a mod or admin or something, in an instance they’re not an admin of, in a comm they’re not a mod of…

      Then people react by personally blocking, or perhaps instance-wide defederating, or maybe conceivably someone actually uses this in a generally good way, to identify trolls/sock puppets.

      As to… LLM scraping of comments?

      Lemmy is public, anyone can do that.

      I’ve done it to myself with a local LLM hooked up to a search engine, and I’m not a mod or admin of anything.

      Which is why you probably should use a pseudonym and not give too much information about yourself, if you’re concerned about privacy… same… rules the internet has always had.

      I suppose instances could implement various anti-scraping measures, but that will never be 100% effective: scrapers versus anti-scrapers has basically always been an escalating arms race.
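sp3ctr4l’s point that anyone can do this is easy to make concrete: Lemmy exposes any account’s public history through its unauthenticated API. A sketch, assuming the v3 `/api/v3/user` endpoint (parameter names are from memory and may differ across versions):

```python
import json
import urllib.parse
import urllib.request

def user_history_url(instance, username, limit=50):
    """URL for a user's public history on a Lemmy instance (no auth needed)."""
    query = urllib.parse.urlencode(
        {"username": username, "sort": "New", "limit": limit})
    return f"https://{instance}/api/v3/user?{query}"

def fetch_comments(instance, username):
    # Any client, logged in or not, can pull this; mod status is irrelevant.
    with urllib.request.urlopen(user_history_url(instance, username)) as resp:
        data = json.load(resp)
    return [c["comment"]["content"] for c in data.get("comments", [])]
```

No moderator privileges, no login, nothing instance-side can fully prevent it, which is exactly the arms-race point above.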

    • Tollana1234567@lemmy.today · 14 days ago

      This is what Reddit does, and it destroyed their communities and left most subs full of bots. Reddit also lets you hide your history, so you can’t sniff out bots or chronic spammers.

      • OpenStars@piefed.social · 13 days ago

        Although it should only matter if you chose to subscribe to that community where the mod is in charge.

          • OpenStars@piefed.social · 13 days ago

            Obviously, but unless the modlog is being spammed with many entries and I receive notifications for each one individually (which is actually happening right now from communities on dbzer0; unlike Lemmy, PieFed actually sends notifications for such events, but I will set that sub-topic aside for a moment), then in theory I do not care if I am (preemptively?) banned from, let’s say, c/conservative@newreddit.com or c/extremetankiedeathsquad@thereallemmy.fuckallwesterners (to be clear, these are hypothetical made-up names!🤪), if I never wanted to post or vote on their content anyway. They are even doing me a favor if it prevents their content from showing in my feed (which it would on Lemmy iirc, but on PieFed it would not).

    • bonenode@piefed.social · 14 days ago

      Seeing that every single post we make is completely public, there is a high chance someone out there has already used all your comments to train an AI model. As you say, the only thing you can do is just not post anything anymore.

    • quediuspayu@lemmy.dbzer0.com · 13 days ago

      But you don’t even need to be a mod to do that. Anyone at any moment can run someone else’s entire comment and post histories through an LLM.

    • Goferking0@ttrpg.network · 13 days ago

      Are they farming for more, or trying to distract from their last attempt at “just asking questions” about statistics, where they spent time crafting a specific and poorly thought out hypothesis?

  • Obinice@lemmy.world · 13 days ago

    You stay far, FAR away from that shit, is what you do.

    Scanning people’s entire history for political leanings, etc? That’s some deeply dystopian stuff right there.

    It’s easy to forget that these sorts of communities are dictatorships with only as much transparency as the owner wants to share. Usually they’re benevolent dictators, so we don’t think about it too much. But they can change in a heartbeat - and we don’t ever really know what they’re really thinking, or doing behind the scenes.

    When the mask slips and they reveal this sort of thing, thinking we’ll just accept it and keep living under their rule, it’s time to read the red flags and GET OUT.

    Hopefully someone compiles a list of places that do this stuff, so we can avoid them like the plague <3

    • davel@lemmy.ml · 13 days ago

      All of that was already happening, because all Lemmy posts & comments are public.

    • wewbull@feddit.uk · 13 days ago

      Scanning people’s entire history for political leanings, etc? That’s some deeply dystopian stuff right there.

      Yep. It’s Cambridge Analytica and Palantir level shit.

      • sleepundertheleaves@infosec.pub · 13 days ago

        Don’t give it too much credit. It’s Reddit level shit. Current models are so good at providing the kind of reports mods want because Reddit’s automated mod tools have been running these assessments on hundreds of thousands of users for years and feeding the results back as training data.

        And let’s be real, a tool that assesses the public posts of a specific account isn’t doing anything different than mods already did. (Not to mention users - how many people, when they get into an online argument with someone, start going through their post history to find something to gotcha them with?). The LLM just does it faster.

  • Eugene V. Debs' Ghost@lemmy.dbzer0.com · 13 days ago

    This is the person calling you a tankie. Someone so afraid of words that they need a hallucinating robot to hold their hand and confirm that everything is a secret plot against them. The absolute only way I could see this being useful is for something like trying to sniff out if a Lemmy.world mod account is a leftist infiltrator or not. Someone who had a different opinion on a current event.

    You could maybe run a speech-pattern comparison, but that’s it. For everything else, you’ve just made Stupid Reddit, where the purpose of the forum is to feed training data to ChatGPT so it can profile Fediverse users.

    This is the kind of shit dystopian novels are made of. So angry about people calling out your actions that you built a tool to analyze why they did it, so you can purge users from your digital kingdom.

    I for one welcome flat.world and PieFed showing their true intentions: digital colonization of ActivityPub and removal of the people who helped to build it. They didn’t want to leave Reddit, they wanted to be Reddit. This is some Spez shit.

    Maybe in 2 weeks PieFed will hard-code that anyone Rimu has tagged for disagreeing with them or offering mild criticism is unable to make accounts or federate posts, failing with a false error code.

  • skisnow@lemmy.ca · 12 days ago

    LinkedIn’s LLM-powered automation banned my account on a false positive a few months ago. It took ages to get sorted out, and they treated me like shit the entire way through, even after acknowledging that they’d made a mistake. Sadly it’s extremely difficult to operate in my field without a LinkedIn account; otherwise I would delete it.

    This shit is poison

  • j_z@feddit.nu · 13 days ago

    I guess, given the already open nature of the Fediverse, my takeaway from this thread is that the OP is using their freedom to say they don’t like this particular style of moderation. Which might be useful, or not, for some moderators.

  • Alvaro@lemmy.blahaj.zone · 13 days ago

    Without going into the issue itself, it is such a ridiculous waste to use an LLM for something that a far simpler model could do, like 100x faster, locally, and essentially for free…

    Just search for “machine learning text moderation” and you will find all kinds of options. Not to mention that a simple 4B LLM could do this as well.

    One thing I really hate is how LLMs have completely overshadowed the entire ML/AI field and people just use them for everything.

    Using a trillion-parameter LLM for basic text moderation is like using a gaming rig to play Candy Crush.
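To make Alvaro’s “far simpler model” point concrete, here is a toy bag-of-words Naive Bayes classifier in pure Python. It is nowhere near a production moderation model, but it trains and classifies locally in microseconds and the data never leaves the instance (labels and training texts below are made up for illustration):

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs. Returns word counts per label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the label with the best add-one-smoothed log-likelihood."""
    vocab = {w for c in counts.values() for w in c}
    words = text.lower().split()
    def score(label):
        c = counts[label]
        total = sum(c.values())
        return sum(math.log((c[w] + 1) / (total + len(vocab)))
                   for w in words)
    return max(counts, key=score)
```

A real deployment would use an off-the-shelf text-moderation model rather than this toy, but the shape is the same: train once, classify locally, no third-party data transfer.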

    • Sl00k@programming.dev · 13 days ago

      I’ve been doing a lot of work in this area, and the issue tends to pop up around context. There are instances where a Haiku model will catch something far more accurately than gpt-oss-safeguard. Obviously you pay in costs; realistically, you need a full system for a proper implementation, where some flows go to a tiny LLM, some exclusively to classical ML, and some to a higher-intelligence model.

      People on the fediverse are generally pretty anti-AI, but it’s basically impossible to scale a platform without AI moderation. I would fully welcome any instance trying an implementation of [Osprey](https://discord.com/blog/osprey-open-sourcing-our-rule-engine) with LLMs.
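The tiered design Sl00k describes can be sketched as a simple router: cheap local checks first, with only ambiguous content escalated to an LLM. The thresholds, tier names, and stand-in scorer below are all illustrative assumptions, not any real system’s values:

```python
def cheap_score(text):
    """Stand-in for a fast local classifier; real systems use a trained model."""
    flagged = {"spamword", "slurword"}  # placeholder word list
    words = text.lower().split()
    return sum(w in flagged for w in words) / max(len(words), 1)

def route(text):
    """Route content to the cheapest tier that can decide it."""
    score = cheap_score(text)
    if score == 0.0:
        return "allow"            # clearly fine: no LLM call, no cost
    if score > 0.5:
        return "human_review"     # clearly bad: flag directly
    return "escalate_to_llm"      # ambiguous: spend the LLM budget here
```

The design choice being illustrated is cost-shaping: the expensive model only ever sees the small slice of traffic the cheap tiers cannot decide.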

      • bdonvr@thelemmy.club · 12 days ago

        it’s basically impossible to scale a platform without AI moderation.

        ???

        Did platforms not scale before the rise of LLMs?

      • forestbeasts@pawb.social · 12 days ago

        Why not just, not try to scale a platform, then? Not everything needs to Scale™.

        The Twitter side of the fediverse (Mastodon and suchlike) does perfectly fine with 100% real moderation by actual people. I’m honestly kind of surprised people are trying to run the Reddit side of the fediverse totally differently.

        – Frost

      • TRBoom@lemmy.zip · 13 days ago

        Yes, a Convolutional Neural Net (CNN) could do it, or even a plain neural net.

        You’ll want to create a sample of your handwriting with the letters isolated into their own pictures and labelled. Maybe 10 of each letter to start with? If you get bad results with that, make more samples. Each picture should be the same size as all the others. The starter course on Machine Learning I took way back when had us using a database of labelled numbers; each picture was 10 pixels by 10 pixels.

        Then pick a CNN model (or better yet several) and train them on your handwriting. You can find some here: https://huggingface.co/models?other=CNN

        Pick the one that does best. As part of that course I mentioned, I created an evolutionary algorithm to mutate, combine, and propagate CNNs to find out the best configurations for identifying images. The ones that performed the best got to combine with other top performers.

        You might also be able to find a CNN specific to handwriting and then fine tune it to yours with your samples.

        This is doing it raw, and there will be a lot of education for you along the way. There may be some prebuilt handwriting model you can just fine-tune, with easy instructions from the person who made it, all wrapped up in a nice bit of Python for you. Maybe.
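The “plain” end of TRBoom’s suggestion can be sketched without any ML library as a nearest-centroid baseline: average the pixels of each letter’s samples, then classify by the closest average. This is a toy baseline, not a recommendation, and assumes the flattened fixed-size labelled images described above:

```python
def train_centroids(samples):
    """samples: list of (flat_pixels, letter). Average the pixels per letter."""
    sums, counts = {}, {}
    for pixels, letter in samples:
        acc = sums.setdefault(letter, [0.0] * len(pixels))
        for i, p in enumerate(pixels):
            acc[i] += p
        counts[letter] = counts.get(letter, 0) + 1
    return {l: [v / counts[l] for v in acc] for l, acc in sums.items()}

def classify(centroids, pixels):
    """Pick the letter whose centroid is closest (squared Euclidean)."""
    return min(centroids,
               key=lambda l: sum((a - b) ** 2
                                 for a, b in zip(centroids[l], pixels)))
```

If this baseline already separates your letters, a CNN will only improve on it; if it is hopeless, you likely need more or cleaner samples before any model will help.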