RustGPT: A pure-Rust transformer LLM built from scratch

(github.com)

253 points | by amazonhut 6 hours ago

19 comments

ramon156 4 hours ago
Cool stuff! I can see some GPT comments that can be removed
// Increased for better learning
this doesn't tell me anything
// Use the constants from lib.rs
const MAX_SEQ_LEN: usize = 80;
const EMBEDDING_DIM: usize = 128;
const HIDDEN_DIM: usize = 256;
these are already defined in lib.rs, why not use them (as the comment suggests)
[-]
- ericdotlee 1 hour ago
  Do you think vibe coded rust will rot the quality of language code generally?
  [-]
  - adastra22 1 hour ago
    These things will be corrected over time.
    [-]
    - yahoozoo 47 minutes ago
      How do you mean?
- tialaramex 2 hours ago
  For the constants is it possible the author didn't know how? I remember in my first week of Rust I didn't understand how to name things properly, basically I was overthinking it.
  [-]
  - vlovich123 1 hour ago
    Lots of signs this is an LLM-generated project. All the emojis in the README are a hint as well.
  - tayo42 1 hour ago
    From his reddit post
    https://old.reddit.com/r/rust/comments/1nguv1a/i_built_an_ll...
- tmaly 32 minutes ago
  did you add these as a PR ?
- sloppytoppy 3 hours ago
  [flagged]
untrimmed 5 hours ago
As someone who has spent days wrestling with Python dependency hell just to get a model running, a simple cargo run feels like a dream. But I'm wondering, what was the most painful part of NOT having a framework? I'm betting my coffee money it was debugging the backpropagation logic.
[-]
- ricardobeat 4 hours ago
  Have you tried uv [1]? It has removed 90% of the pain of running python projects for me.
  [1] https://github.com/astral-sh/uv
  [-]
  - mtlmtlmtlmtl 2 hours ago
    I'm sure it's true and all. But I've been hearing the same claim about all those tools uv is intended to replace, for years now. And every time I try to run any of those, as someone who's not really a python coder, but can shit out scripts in it if needed and sometimes tries to run python software from github, it's been a complete clusterfuck.
    So I guess what I'm wondering is, are you a python guy, or are you more like me? because for basically any of these tools, python people tell me "tool X solved all my problems" and people from my own cohort tell me "it doesn't really solve anything, it's still a mess".
    If you are one of us, then I'm really listening.
    [-]
    - OoooooooO 10 minutes ago
      As a mainly Python guy (Data Engineering so new project for every ETL pipeline = a lot of projects) uv solved every problem I had before with pip, conda, miniconda, pipx etc.
    - hobofan 2 hours ago
      I'm one of you.
      I'm about the highest tier of package manager nerd you'll find out there, but despite all that, I've been struggling to create/run/manage venvs out there for ages. Always afraid of installing a pip package or some piece of python-based software (that might muck up Python versions).
      I've been semi-friendly with Poetry already, but mostly because it was the best thing around at the time, and a step in the right direction.
      uv has truely been a game changer. Try it out!
    - Yoric 42 minutes ago
      As a developer: it basically solved all of my problems that could be solved by a package manager.
      As an occasional trainer of scientists: it didn't seem to help my students.
      [-]
      - buildbot 29 minutes ago
        It installs stuff super fast!
        It sadly doesn’t solve stuff like transformer_engine being built with cxx11 ABI and pytorch isn’t by default, leading to missing symbols…
    - tinco 1 hour ago
      As a Ruby guy: uv makes Python feel like it finally passed the year 2010.
      [-]
      - llIIllIIllIIl 1 hour ago
        Don’t forget to schedule your colonoscopy as a Ruby guy
    - OrderlyTiamat 1 hour ago
      I'm (reluctantly) a python guy, and uv really is a much different experience for me than all the other tools. I've otherwise had much the same experience as you describe here. Maybe it's because `uv` is built in rust? ¯\_ (ツ)_/¯
      But I'd also hesitate to say it "solves all my problems". There's plenty of python problems outside of the core focus of `uv`. For example, I think building a python package for distribution is still awkward and docs are not straightforward (for example, pointing to non-python files which I want to include was fairly annoying to figure out).
    - J_Shelby_J 2 hours ago
      Isn’t UV essentially cargo for python?
      [-]
      - adastra22 1 hour ago
        Somewhat literally so. It is written in Rust and makes use of the cargo-util crate for some overlapping functionality.
    - jhardy54 2 hours ago
      I’m a “Python guy” in that I write Python professionally, but also am like you in that I’ve been extremely underwhelmed by Portry/Pipenv/etc.
      Python dependencies are still janky, but uv is a significant improvement over existing tools in both performance and ergonomics.
  - DiabloD3 4 hours ago
    uv is great, but I think the real fix is just abandoning Python.
    The culture that language maintains is rather hostile to maintainable development, easier to just switch to Rust and just write better code by default.
    [-]
    - trklausss 3 hours ago
      Every tool for the right job. If you are doing tons of scripting (for e.g. tests on platforms different than Rust), Python can be a solid valid alternative.
      Also, tons of CAE platforms have Python bindings, so you are "forced" to work on Python. Sometimes the solution is not just "abandoning a language".
      If it fits your purpose, knock yourself out, for others that may be reading: uv is great for Python dependency management on development, I still have to test it for deployment :)
      [-]
      - aeve890 3 hours ago
        >Every tool for the right job. If you are doing tons of scripting (for e.g. tests on platforms different than Rust), Python can be a solid valid alternative.
        I'd say Go is a better alternative if you want to replace python scripting. Less friction and much faster compilation times than Rust.
        [-]
        DiabloD3 3 hours ago
        I am not a huge fan of Go, but if all the world's "serious" Python became Go, the average code quality would skyrocket, so I think I can agree to this proposal.
        physicsguy 2 hours ago
        Go performance is terrible for numeric stuff though, no SIMD support.
        [-]
        9rx 2 hours ago
        That's not really true, but we're talking about a Python replacement for scripting tasks, not core compute tasks, anyway. It is not like Python is the paragon of SIMD support. Any real Python workloads end up being written in C for good reason, using Python only as the glue. Go can also interface with C code, and despite all the flack it gets for its C call overhead it is still significantly faster at calling C code than Python is.
        [-]
        adastra22 1 hour ago
        For the record of people reading this, I wrote a multithreaded SIMD-heavy compute task in Go, and it suffered only 5% slowdown vs the original hand-optimized C++ version.
        The low level SIMD stuff was called out to over the c FFI bridge; golang was used for the rest of the program.
        DiabloD3 2 hours ago
        (given the context of LLMs) Unless you're doing CPU-side inference for corner cases where GPU inference is worse, lack of SIMD isn't a huge issue.
        There are libraries to write SIMD in Go now, but I think the better fix is being able to autovectorize during the LLVM IR optimization stage, so its available with multiple languages.
        I think LLVM has it now, its just not super great yet.
        wild_egg 2 hours ago
        Lots of packages out there using SIMD for lots of things.
        You can always drop into straight assembly if you need to as well. Go's assembler DX is quite nice after you get used to it.
        pclmulqdq 2 hours ago
        There are Go SIMD libraries now, and there's also easy use of C libraries via Cgo.
    - pjmlp 2 hours ago
      I know Python since version 1.6.
      It is great for learning on how to program (BASIC replacement), OS scripting tasks as Perl replacement, and embedded scripting in GUI applications.
      Additionally understand PYTHONPATH, and don't mess with anything else.
      All the other stuff that is supposed to fix Python issues, I never bothered with them.
      Thankfully, other languages are starting to also have bindings to the same C and C++ compute libraries.
    - airza 4 hours ago
      There's not really another game in town if you want to do fast ML development :/
      [-]
      - DiabloD3 3 hours ago
        Dunno, almost all of the people I know anywhere in the ML space are on the C and Rust end of the spectrum.
        Lack of types, lack of static analysis, lack of ... well, lack of everything Python doesn't provide and fights users on costs too much developer time. It is a net negative to continue pouring time and money into anything Python-based.
        The sole exclusion I've seen to my social circle is those working at companies that don't directly do ML, but provide drivers/hardware/supporting software to ML people in academia, and have to try to fix their cursed shit for them.
        Also, fwiw, there is no reason why Triton is Python. I dislike Triton for a lot of reasons, but its just a matmul kernel DSL, there is nothing inherent in it that has to be, or benefits from, being Python.... it takes DSL in, outputs shader text out, then has the vendor's API run it (ie, CUDA, ROCm, etc). It, too, would benefit from becoming Rust.
        [-]
        wolvesechoes 3 minutes ago
        > It, too, would benefit from becoming Rust.
        Yet it was created for Python. Someone took that effort and did it. No one took that effort in Rust. End of the story of crab's superiority.
        Python community is constantly creating new, great, highly usable packages that become de facto industry standards, and maintain old ones for years, creating tutorials, trainings and docs. Commercial vendors ship Python APIs to their proprietary solutions. Whereas Rust community is going through forums and social media telling them that they should use Rust instead, or that they "cheated" because those libraries are really C/C++ libraries.
        mountainriver 2 hours ago
        I love Rust and C, I write quite a bit of both. I am an ML engineer by trade.
        To say most ML people are using Rust and C couldn’t be further from the truth
        [-]
        Narishma 1 hour ago
        They said most people they knew, not most people.
        nkozyra 2 hours ago
        > Dunno, almost all of the people I know anywhere in the ML space are on the C and Rust end of the spectrum.
        I wish this were broadly true.
        But there's too much legacy Python sunk cost for most people though. Just so much inertia behind Python for people to abandon it and try to rebuild an extensive history of ML tooling.
        I think ML will fade away from Python eventually but right now it's still everywhere.
        [-]
        DiabloD3 1 hour ago
        A lot of what I see in ML is all focused around Triton, which is why I mentioned it.
        If someone wrote a Triton impl that is all Rust instead, that would do a _lot_ of the heavy lifting on switching... most of their hard code is in Triton DSL, not in Python, the Python is all boring code that calls Triton funcs. That changes the argument on cost for a lot of people, but sadly not all.
        airza 2 hours ago
        Okay. Humor me. I want to write a transformer-based classifier for a project. I am accustomed to the pytorch and tensorflow libraries. What is the equivalent using C?
        [-]
        adastra22 1 hour ago
        You do know that tensorflow was written in C++ and the Python API bolted on top?
        [-]
        wolvesechoes 10 minutes ago
        It could be written in mix of Cobol and APL. No one cares.
        People saying "oh those Python libraries are just C/C++ libraries with Python API, every language can have them" have one problem - no other language has them (with such extensive documentation, tutorials etc.)
    - Exuma 3 hours ago
      i hate python, but the idea of replacing python with rust is absurd
    - WhereIsTheTruth 1 hour ago
      abandoning Python for Rust in AI would cripple the field, not rescue it
      the disease is the cargo cult addiction (which Rust is full of) to micro libraries, not the language that carries 90% of all peer reviewed papers, datasets, and models published in the last decade
      every major breakthrough, from AlphaFold to Stable Diffusion, ships with a Python reference implementation because that is the language researchers can read, reproduce, and extend, remove Python and you erase the accumulated, executable knowledge of an entire discipline overnight, enforcing Rust would sabotage the field more than anything
      on the topic of uv, it will do more harm than good by enabling and empowering cargo cults on a systemic level
      the solution has always been education, teaching juniors to value simplicity, portability and maintainability
  - TheAceOfHearts 3 hours ago
    Switching to uv made my python experience drastically better.
    If something doesn't work or I'm still encountering any kind of error with uv, LLMs have gotten good enough that I can just copy / paste the error and I'm very likely to zero-in on a working solution after a few iterations.
    Sometimes it's a bit confusing figuring out how to run open source AI-related python projects, but the combination of uv and iterating on any errors with an LLM has so far been able to resolve all the issues I've experienced.
  - shepardrtc 1 hour ago
    uv has been amazing for me. It just works, and it works fast.
- codetiger 4 hours ago
  I guess, resource utilization like GPU, etc
- Galanwe 3 hours ago
  > spent days wrestling with Python dependency hell
  I mean I would understand that comment in 2010, but in 2025 it's grossly ridiculous.
  [-]
  - adastra22 59 minutes ago
    Yeah, because of a tool written in Rust, copying the Rust way of doing things for Python developers.
    [-]
    - Galanwe 3 minutes ago
      I am not even thinking of `uv`, but rather of pyproject.toml, and the various improvements as to how dependencies are declared and resolved. You don't get much simpler than a toml file listing your dependencies and constraints, along with a lock file.
      Also let's keep middle school taunts at home.
- zoobab 3 hours ago
  "a simple cargo run feels like a dream"
  A cargo build that warms up your CPU during winter while recompiling the whole internet is better?
  [-]
  - surajrmal 1 hour ago
    It has 3 direct dependencies and not too many more transitively. You're certainly not recompiling the internet. If you're going to run a local llm I doubt you're building on a toaster so build speed won't be a big ordeal either.
- taminka 5 hours ago
  lowkey ppl who praise cargo seem to have no idea of the tradeoffs involved in dependency management
  the difficulty of including a dependency should be proportional to the risk you're taking on, meaning it shouldn't be as difficult as it in, say, C where every other library is continually reinventing the same 5 utilities, but also not as easy as it is with npm or cargo, because you get insane dependency clutter, and all the related issues like security, build times, etc
  how good a build system isn't equivalent of how easy it is include a dependency, while modern languages should have a consistent build system, but having a centralised package repository that anyone freely pull to/from, and having those dependencies freely take on any number of other dependencies is a bad way to handle dependencies
  [-]
  - dev_l1x_be 4 hours ago
    > lowkey ppl who praise cargo seem to have no idea
    Way to go on insulting people on HN. Cargo is literally the reason why people coming to Rust from languages like C++ where the lack of standardized tooling is giant glaring bomb crater that poses burden on people every single time they need to do some basic things (like for example version upgrades).
    Example:
    https://github.com/facebook/folly/blob/main/build.sh
    [-]
    - taminka 3 hours ago
      i'm saying that ease of dependency inclusion should not be a main criterion for evaluating how good a build system is, not that it isn't the main criterion for many people...
      like the entire point of my comment is that people have misguided criteria for evaluating build systems, and your comment seems to just affirm this?
      [-]
      - CodeMage 1 hour ago
        Dependency management should most definitely be one of the main criteria for evaluating how good a build system is. What's misguided is intentionally opting for worse dependency management in an attempt to solve a people problem, i.e. being careless about adding dependencies to your project in circumstances when you should be careful.
      - Sl1mb0 2 hours ago
        > dependency inclusion _should not_ be a main criterion for evaluating how good a build system is
        That's just like, your opinion, man.
        [-]
        lutusp 1 hour ago
        > That's just like, your opinion, man.
        I would love to know how many younger readers recognize this classic movie reference.
        taminka 2 hours ago
        i mean, unless you have some absolute divine truths, that's kind of the best i have :shrug
      - adwn 2 hours ago
        > like the entire point of my comment is that people have misguided criteria for evaluating build systems, and your comment seems to just affirm this?
        I think dev_l1x_be's comment is meant to imply that your believe about people having misguided criteria [for evaluation build systems] is itself misguided, and that your favored approach [that the difficulty of including a dependency should be proportional to the risk you're taking on] is also misguided.
        [-]
        taminka 2 hours ago
        my thesis is that negative externalities of build systems are important and i don't know how to convince of importance of externalities someone whose value system is built specifically on ignoring externalities and only factoring in immediate convenience...
    - huflungdung 4 hours ago
      [dead]
  - quantumspandex 4 hours ago
    Security is another problem, and should be tackled systematically. Artificially making dependency inclusion hard is not it and is detrimental to the more casual use cases.
  - hobofan 1 hour ago
    > but having a centralised package repository that anyone freely pull to/from, and having those dependencies freely take on any number of other dependencies is a bad way to handle dependencies
    So put a slim layer of enforcement to enact those policies on top? Who's stopping you from doing that?
  - itsibitzi 4 hours ago
    What tool or ecosystem does this well, in your opinion?
    [-]
    - taminka 2 hours ago
      any language that has a standardised build system (virtually every language nowadays?), but doesn't have a centralised package repository, such that including a dependency is seamless, but takes a bit of time and intent
      i like how zig does this, and the creator of odin has a whole talk where he basically uses the same arguments as my original comment to reason why odin doesn't have a package manager
      [-]
      - zoobab 1 hour ago
        "a standardised build system (virtually every language nowadays?)"
        Python packages still manage poorly dependencies that are in another lang like C or C++.
  - MangoToupe 27 minutes ago
    > the difficulty of including a dependency should be proportional to the risk you're taking on
    Why? Dependency hell is an unsolvable problem. Might as well make it easier to evaluate the tradeoff between dependencies and productivity. You can always arbitrarily ban dependencies.
  - IshKebab 4 hours ago
    This is the weirdest excuse for Python's terrible tooling that I've ever heard.
    "It's deliberately shit so that people won't use it unless they really have to."
    [-]
    - taminka 3 hours ago
      i just realised that my comment sounds like it's praising python's package management since it's often so inconvenient to use, i want to mention that that wasn't my intended point, python's package management contains the worst aspects from both words: being centralised AND horrible to use lol
      my mistake :)
  - jokethrowaway 4 hours ago
    Is your argument that python's package management & ecosystem is bad by design - to increase security?
    In my experience it's just bugs and poor decision making on the maintainers (eg. pytorch dropping support for intel mac, leftpad in node) or on the language and package manager developers side (py2->3, commonjs, esm, go not having a package manager, etc).
    Cargo has less friction than pypi and npm. npm has less friction than pypi.
    And yet, you just need to compromise one lone, unpaid maintainer to wreck the security of the ecosystem.
    [-]
    - taminka 2 hours ago
      nah python's package management is just straight up terrible by every metric, i just used it as a tangent to talk about how imo ppl incorrectly evaluate build systems
techsystems 5 hours ago
> ndarray = "0.16.1" rand = "0.9.0" rand_distr = "0.5.0"
Looking good!
[-]
- worldsavior 19 minutes ago
  This doesn't mean anything. A project can implement things from scratch inefficiently but there might be other libraries the project can use instead of reimplementing.
- kachapopopow 5 hours ago
  I was slightly curious: cargo tree llm v0.1.0 (RustGPT) ├── ndarray v0.16.1 │ ├── matrixmultiply v0.3.9 │ │ └── rawpointer v0.2.1 │ │ [build-dependencies] │ │ └── autocfg v1.4.0 │ ├── num-complex v0.4.6 │ │ └── num-traits v0.2.19 │ │ └── libm v0.2.15 │ │ [build-dependencies] │ │ └── autocfg v1.4.0 │ ├── num-integer v0.1.46 │ │ └── num-traits v0.2.19 () │ ├── num-traits v0.2.19 () │ └── rawpointer v0.2.1 ├── rand v0.9.0 │ ├── rand_chacha v0.9.0 │ │ ├── ppv-lite86 v0.2.20 │ │ │ └── zerocopy v0.7.35 │ │ │ ├── byteorder v1.5.0 │ │ │ └── zerocopy-derive v0.7.35 (proc-macro) │ │ │ ├── proc-macro2 v1.0.94 │ │ │ │ └── unicode-ident v1.0.18 │ │ │ ├── quote v1.0.39 │ │ │ │ └── proc-macro2 v1.0.94 () │ │ │ └── syn v2.0.99 │ │ │ ├── proc-macro2 v1.0.94 () │ │ │ ├── quote v1.0.39 () │ │ │ └── unicode-ident v1.0.18 │ │ └── rand_core v0.9.3 │ │ └── getrandom v0.3.1 │ │ ├── cfg-if v1.0.0 │ │ └── libc v0.2.170 │ ├── rand_core v0.9.3 () │ └── zerocopy v0.8.23 └── rand_distr v0.5.1 ├── num-traits v0.2.19 () └── rand v0.9.0 ()
  yep, still looks relatively good.
  [-]
  - imtringued 3 hours ago
```
    cargo tree llm v0.1.0 (RustGPT)
    ├── ndarray v0.16.1
    │   ├── matrixmultiply v0.3.9
    │   │   └── rawpointer v0.2.1
    │   │       [build-dependencies]
    │   │       └── autocfg v1.4.
    │   ├── num-complex v0.4.6
    │   │   └── num-traits v0.2.19
    │   │       └── libm v0.2.15
    │   │           [build-dependencies]
    │   │           └── autocfg v1.4.0
    │   ├── num-integer v0.1.46
    │   │   └── num-traits v0.2.19 ()
    │   ├── num-traits v0.2.19 ()
    │   └── rawpointer v0.2.1
    ├── rand v0.9.0
    │   ├── rand_chacha v0.9.0
    │   │   ├── ppv-lite86 v0.2.20
    │   │   │   └── zerocopy v0.7.35
    │   │   │       ├── byteorder v1.5.0
    │   │   │       └── zerocopy-derive v0.7.35 (proc-macro)
    │   │   │           ├── proc-macro2 v1.0.94
    │   │   │           │   └── unicode-ident v1.0.18
    │   │   │           ├── quote v1.0.39
    │   │   │           │   └── proc-macro2 v1.0.94 ()
    │   │   │           └── syn v2.0.99
    │   │   │               ├── proc-macro2 v1.0.94 ()
    │   │   │               ├── quote v1.0.39 ()
    │   │   │               └── unicode-ident v1.0.18
    │   │   └── rand_core v0.9.3
    │   │       └── getrandom v0.3.1
    │   │           ├── cfg-if v1.0.0
    │   │           └── libc v0.2.170
    │   ├── rand_core v0.9.3 ()
    │   └── zerocopy v0.8.23
    └── rand_distr v0.5.1
        ├── num-traits v0.2.19 ()
        └── rand v0.9.0 ()
```
  - cmrdporcupine 4 hours ago
    linking both rand-core 0.9.0 and rand-core 0.9.3 which the project could maybe avoid by just specifying 0.9 for its own dep on it
    [-]
    - Diggsey 18 minutes ago
      It doesn't link two versions of `rand-core`. That's not even possible with rust (you can only link two semver-incompatible versions of the same crate). And dependency specifications in Rust don't work like that - unless you explicitly override it, all dependencies are semver constraints, so "0.9.0" will happily match "0.9.3".
- tonyhart7 5 hours ago
  is this satire or does I must know context behind this comment???
  [-]
  - stevedonovan 5 hours ago
    These are a few well-chosen dependencies for a serious project.
    Rust projects can really go bananas on dependencies, partly because it's so easy to include them
  - obsoleszenz 5 hours ago
    The project only has 3 dependencies which i interpret as a sign of quality
Snuggly73 4 hours ago
Congrats - there is a very small problem with the LLM - its reusing transformer blocks and you want to use different instances of them.
Its a very cool excercise, I did the same with Zig and MLX a while back, so I can get a nice foundation, but since then as I got hooked and kept adding stuff to it, switched to Pytorch/Transformers.
[-]
- icemanx 4 hours ago
  correction: It's a cool exercise if you write it yourself and not use GPT
  [-]
  - Snuggly73 4 hours ago
    well, hopefully the author did learn something or at least enjoyed the process :)
    (the code looks like a very junior or a non-dev wrote it tbh).
jlmcgraw 4 hours ago
Some commentary from the author here: https://www.reddit.com/r/rust/comments/1nguv1a/i_built_an_ll...
Charon77 5 hours ago
Absolutely love how readable the entire project is
[-]
- emporas 5 hours ago
  It is very procedural/object oriented. This is not considered good Rust practice. Iterators make it more functional, which is better, more succinct that is, and enums more algebraic. But it's totally fine for a thought experiment.
- koakuma-chan 4 hours ago
  It's AI generated
  [-]
  - Revisional_Sin 4 hours ago
    How do you know? The over-commenting?
    [-]
    - koakuma-chan 4 hours ago
      I know because this is how an AI generated project looks. Clearly AI generated README, "clean" code, the way files are named, etc.
      [-]
      - magackame 4 hours ago
        Not sure myself. Commit messages look pretty human. But the emojis in readme and comments like "// Re-export key structs for easier access", "# Add any test-specific dependencies here if needed" are sus indeed.
      - cmrdporcupine 4 hours ago
        To me it looks like LLM generated README, but not necessarily the source (or at least not all of it).
        Or there's been a cleaning pass done over it.
        [-]
        koakuma-chan 4 hours ago
        I think pretty clearly the source is also at least partially generated. None the less, just a README like that already sends a strong signal to stop looking and not trust anything written there.
    - adastra22 58 minutes ago
      Because the author said so on Reddit.
    - GardenLetter27 4 hours ago
      The repeated Impls are strange.
      [-]
      - magackame 4 hours ago
        Where? Don't see any on latest main (685467e).
        [-]
        yahoozoo 3 hours ago
        `llm.rs` has many `impl LLM` blocks
- yieldcrv 5 hours ago
  Never knew Rust could be that readable. Makes me think other Rust engineers are stuck in a masochistic ego driven contest, which would explain everything else I've encountered about the Rust community and recruiting on that side.
  [-]
  - GardenLetter27 4 hours ago
    Most Rust code looks like this - only generic library code goes crazy with all the generics and lifetimes, due to the need to avoid unnecessary mallocs and also provide a flexible API to users.
    But most people aren't writing libraries.
    [-]
    - cmrdporcupine 1 hour ago
      Don't underestimate what some programmers trying to prove their cleverness (or just trying to have fun) can do if left unchecked. I think most Rust code does indeed look like this but I've seen plenty of projects that go crazy with lifetimes and generics juggling where they don't have to.
  - jmaker 5 hours ago
    Not sure what you’re alluding to but that’s just ordinary Rust without performance or async IO concerns.
ndai 5 hours ago
I’m curious where you got your training data? I will look myself, but saw this and thought I’d ask. I have a CPU-first, no-backprop architecture that works very well on classification datasets. It can do single‑example incremental updates which might be useful for continuous learning. I made a toy demo to train on tiny.txt and it can predict next characters, but I’ve never tried to make an LLM before. I think my architecture might work well as an on-device assistant or for on-premises needs, but I want to work with it more before I embarrass myself. Any open-source LLM training datasets you would recommend?
[-]
- electroglyph 5 hours ago
  https://huggingface.co/datasets/NousResearch/Hermes-3-Datase...
  [-]
  - Snuggly73 4 hours ago
    To my untrained eye, this looks more like an instruct dataset.
    For just plain text, I really like this one - https://huggingface.co/datasets/roneneldan/TinyStories
- kachapopopow 5 hours ago
  huggingface has plenty of openai and antrophic user to assistant chains, beware there are dragons (hallucinations), but good enough for instruction training. I actually recommend distilling kimi k2 instead for instruction following capabilities.
Goto80 5 hours ago
Nice. Mind to put a license on that?
[-]
- thomask1995 4 hours ago
  License added! Good catch
kachapopopow 5 hours ago
This looks rather similar to when I asked an AI to implement a basic xor problem solver I guess fundementally there's really only a very limited amount of ways to implement this.
enricozb 5 hours ago
I did this [0] (gpt in rust) with picogpt, following the great blog by jaykmody [1].
[0]: https://github.com/enricozb/picogpt-rust [1]: https://jaykmody.com/blog/gpt-from-scratch/
abricq 4 hours ago
This is great ! Congratulations. I really like your project, especially I like how easily it is to peak at.
Do you plan on moving forward with this project ? I seem to understand that all the training is done on the CPU, and that you have next steps regarding optimizing that. Do you consider GPU accelerations ?
Also, do you have any benchmarks on known hardware ? Eg, how long would it take to train on a macbook latest gen or your own computer ?
ericdotlee 1 hour ago
This is incredibly cool, but I wonder when more of the AI ecosystem will move past python tooling into something more... performant?
Very interesting to already see rust based inference frameworks as well.
bionhoward 2 hours ago
That time to first token is impressive, it seems like it responds immediately
capestart 3 hours ago
Very cool project, always nice to see deep learning built from scratch in Rust without heavy frameworks.
lutusp 1 hour ago
It would have been nice to see a Rust/Python time comparison for both development and execution. You know, the "bottom line"?
bigmuzzy 5 hours ago
nice
trackflak 5 hours ago
[dead]
Emma_Schmidt 3 hours ago
[dead]
zenlot 2 hours ago
Rust == stars in GitHub.