- cross-posted to:
- technology@lemmy.ml
- hackernews@lemmy.bestiver.se
Isn't DeepSeek based on Qwen? At least the distilled models?
I think so, but this looks like an update of Qwen with some new tricks.
can grab it here
I find it absolutely wild how quickly we went from needing a full-blown data centre to run models of this scale to being able to run them on a laptop.