• Smorty [she/her]@lemmy.blahaj.zone

    the training process being shiddy, i completely agree with. that part is simply awful and takes a shidload of resources to get a good model.

    but… running them… feels oki to me.

    as long as you’re not running some bigphucker model like GPT-4o to do something a smoler model could also do, i feel it’s kinda okay.

    32B parameter models are getting really, really good, so inference (running) costs and energy consumption are already dropping dramatically when you’re not using the big models provided by BigEvilCo™.

    Models can clearly be used for cool stuff. Classifying texts is the obvious example: having humans go through piles of text is insane and cost-ineffective, while a 14B parameter (8GB) model can classify multiple pages of text in about half a second. Something like the sketch below.
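
    here’s a rough sketch of what i mean, assuming a local Ollama server on its default port; the model tag and the labels are just placeholders, swap in whatever you actually run:

    ```python
    # minimal sketch: classify a chunk of text with a local ~14B model via Ollama's HTTP API
    # (model name and labels are placeholders -- use whatever you actually have pulled)
    import requests

    LABELS = ["spam", "support request", "feedback", "other"]

    def classify(text: str, model: str = "qwen2.5:14b") -> str:
        prompt = (
            "Classify the following text into exactly one of these labels: "
            + ", ".join(LABELS) + ".\n"
            "Reply with only the label.\n\n" + text
        )
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=60,
        )
        resp.raise_for_status()
        # Ollama returns the full generation in the "response" field when stream is off
        return resp.json()["response"].strip()

    print(classify("hi, my order arrived broken, can i get a refund?"))
    ```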

    obviously using bigphucker models for everything is bad. optimizing tasks to run on small models, even at 3B sizes, is just more cost-effective, so i think the general vibe will go in that direction.

    people running their models locally to do some stuff will make companies realize they don’t need to pay OpenAI 15€ per 1.000.000 tokens for their o1 model for everything. they will realize that paying something like 50 cents per million tokens for smaller models works just fine (rough math below).
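
    just to make the math concrete (the document count and tokens-per-document are made-up assumptions, the prices are the rough ones from above):

    ```python
    # back-of-the-envelope cost comparison for a classification workload
    docs = 100_000
    tokens_per_doc = 2_000                  # assumption, depends entirely on your texts
    total_tokens = docs * tokens_per_doc

    big_model_price = 15.00 / 1_000_000     # ~15€ per 1.000.000 tokens
    small_model_price = 0.50 / 1_000_000    # ~50 cents per 1.000.000 tokens

    print(f"big model:   {total_tokens * big_model_price:,.2f}€")   # 3,000.00€
    print(f"small model: {total_tokens * small_model_price:,.2f}€") # 100.00€
    ```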

    if i didn’t understand ur point, please point it out. i’m not that good at picking up on stuff…