Someone interested in many things.

  • 0 Posts
  • 12 Comments
Joined 1 year ago
cake
Cake day: June 15th, 2023

help-circle

  • Like I said, I’m aware of extant measures to try and steer models, but people often assume a level of craftsmanship in censoring models that simply does not exist. Jailbreakchat.com is an endless stream of examples of this very fect; it’s very hard, especially with the limited context lengths of current models, to effectively give them any hard directives.

    And back to foundational models, which are essentially free of censorship, they will still exhibit a similar level of political bias unless prompted otherwise. All this to say that, discounting OpenAI’s attempts to control their models, the model itself will inherently learn from and mirror the real-world biases of the text it was trained on. Those biases happen to fall along lines that often ignore subtlety in debates regarding illegality and morality.


  • It’s hard to say what LLMs are “programmed” to do, as they’re largely untamed beasts of text prediction. In fact, I would suspect its built-in biases are less the result of pre-prompting or post-foundational-model training and really just what a lot of people tend to think online. In a way, it’s more like people in general often equate illegality with immorality.

    You can see similar biases in many of the open-source LLMs that are floating around. Even though they’re basically built outside of large corporate cultures and large-scale monetary incentive, they still retain a lot of political bias that tends to favor governmental measures heavily.