Which NPM package has the largest version number?

(adamhl.dev)

152 points | by genshii 17 hours ago

19 comments

  • jonchurch_ 2 hours ago
    The author has run into the same problem that anyone who wants to do analysis on the NPM registry runs into: there's just no good first-party API for this stuff anymore.

    It seems this was their first time going down this rabbit hole, so for them and anyone else, I'd urge you to use the deps.dev Google BigQuery dataset [0] for this kind of analysis. It does indeed include NPM and would have made the author's work trivial.

    Here's a gist with the query and the results https://gist.github.com/jonchurch/9f9283e77b4937c8879448582b...

    [0] - https://docs.deps.dev/bigquery/v1/

  • stabbles 13 hours ago
    For Python (or PyPI) this is easier, since their data is available on Google BigQuery [1], so you can just run

        SELECT * FROM `bigquery-public-data.pypi.distribution_metadata` ORDER BY length(version) DESC LIMIT 10
    
    The winner is: https://pypi.org/project/elvisgogo/#history

    The package with the most versions still listed on PyPI is spanishconjugator [2], which consistently published ~240 releases per month between 2020 and 2024.

    [1] https://console.cloud.google.com/bigquery?p=bigquery-public-...

    [2] https://pypi.org/project/spanishconjugator/#history

    • Rygian 10 hours ago
      Regarding spanishconjugator, commit ec4cb98 has description "Remove automatic bumping of version".

      Prior to that commit, a cronjob would run the 'bumpVersion.yml' workflow four times a day, which in turn executes the bump2version python module to increase the patch level. [0]

      Edit: discussed here: https://github.com/Benedict-Carling/spanish-conjugator/issue...

      [0] https://github.com/Benedict-Carling/spanish-conjugator/commi...

      • dijksterhuis 6 hours ago
        i love the package owner’s response in that issue xD
    • passivegains 1 hour ago
      I decided my life could not possibly go on until I knew what "elvisgogo" does, so I downloaded the tarball and poked around. It's a pretty ordinary numpy + pandas + matplotlib project that makes graphs from CSV. One line jumped out at me:

          str_0 = ['refractive_index','Na','Mg','Al','Si','K','Ca','Ba','Fe','Type']

      The University of St Andrews has a laser named "elvis" that goes on a remote-controlled submarine: https://www.st-andrews.ac.uk/~bds2/elvislaser.htm

      I was hoping it'd be about go-go dancing to Elvis music, but physics experiments on light in seawater are pretty cool too.
    • breakingcups 10 hours ago
      Tangential, but I've only heard about BigQuery from people being surprised with gargantuan bills for running one query on a public dataset. Is there a "safe" way to use it with a cost limit, for example?
      • abxyz 9 hours ago
        Yes you can set price caps. The cost of a query is understandable ahead of time with the default pricing model ($6 per TB of data processed in a query). People usually get caught out by running expensive queries recursively. BigQuery is very cost effective and can be used safely.
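
As a rough illustration of the per-query arithmetic above, here is a minimal Python sketch. The $6-per-TB rate is the figure quoted in the comment, not a verified current price; `estimate_query_cost` is a hypothetical helper, and treating 1 TB as 10**12 bytes is an assumption.

```python
# Hypothetical helper: estimates on-demand cost from the bytes a query processes.
# The default rate is the $6/TB figure quoted in the thread (an assumption,
# not a verified price), and 1 TB is taken to be 10**12 bytes.
BYTES_PER_TB = 10**12


def estimate_query_cost(bytes_processed: int, usd_per_tb: float = 6.0) -> float:
    """Return the estimated cost in USD for a query that processes bytes_processed."""
    return bytes_processed / BYTES_PER_TB * usd_per_tb


# A query that processes 50 GB is charged for those 50 GB.
small_scan = estimate_query_cost(50 * 10**9)

# Processing a full 8 TB would cost 8 * 6 dollars at this rate.
full_scan = estimate_query_cost(8 * BYTES_PER_TB)
```

Under these assumptions the cost scales linearly with the data processed, so it can be estimated before the query is run.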
        • Bratmon 4 hours ago
          You can tell someone has worked in the cloud for too long when they start to think of $6 per database query as a reasonable price.
          • lenkite 1 hour ago
            We really need to go back to on-premise. We have surrendered our autonomy to these megacorps and now are paying for it - quite literally in many cases.
          • abxyz 2 hours ago
            My 3TB, 41 billion row table costs pennies to query day to day. The billing is based on the data processed by the query, not the table size. I pay more for storage.
          • morcus 2 hours ago
            Surely most queries should process much less than 1 TB of data?
        • Aeolun 5 hours ago
          Running ripgrep on my harddrive would cost me $48 at that price point.
    • thesystemisbust 11 hours ago
      You can also query for free at clickpy.clickhouse.com. If you click on any of the links on the visuals you can see the query used.

      The underlying dataset is hosted at sql.clickhouse.com e.g. https://sql.clickhouse.com/?query=U0VMRUNUIGNvdW50KCkgICBGUk...

      Disclaimer: I built this a while ago, but we maintain it at ClickHouse.

      oh and rubygems data is also there.

    • n4r9 10 hours ago
      > spanishconjugator [2], which consistently published ~240 releases per month between 2020 and 2024

      They also stopped updating major and minor versions after hitting 2.3 in Sept 2020. Would be interesting to hear the rationale behind the versioning strategy. Feels like you might as well use a datetimestamp for the version.

    • 0x500x79 3 hours ago
      deps.dev has a similar bigquery dataset across a couple more languages if someone wanted to do analysis across the other ecosystems they support.
  • bapak 11 hours ago
    > there are over 2800 legacy mixed-case packages, many of which have the same spelling as other existing lowercase packages

    This is insane

    • dotancohen 9 hours ago

      > This is insane
      
      Not for the JavaScript world.

      I hate to deride the entire community, but many of the collective community decisions are smells. I think that the low barrier to entry means that the community has many inexperienced influential people.

      • krapp 9 hours ago
        A lot of these decisions were made by a small number of corporations, not necessarily the community, after JavaScript went "enterprise" to make it seem more like a "serious" programming language to SV entrepreneurs.

        The bar for entry was always low with JavaScript, but it also used to be a lot more sane when it was a publicly-driven language.

  • sundarurfriend 10 hours ago
    The Julia General registry is locally stored as a tar.gz and has version info for all registered packages, so I tried this out for Julia packages. The top 5 are:

        DiffEqBase                  6.189.1   
        LoopVectorization           0.12.172  
        Reactant                    0.2.161   
        Mooncake                    0.4.159   
        Distributions               0.25.120  
    
    So, no crazy numbers or random unknown packages, all are major packages that have just had a lot of work and history to them. Out of the top 10, pretty much half were from the SciML ecosystem.

    Caveats/constraints: Like the post, this ignores non-SemVer packages (which mostly used date-based versions) and also jll (binary wrapper) packages which just use their underlying C libraries' versions. Among jlls, the largest that isn't a date afaict is NEO_jll with 25.31.34666+0 as its version.

    • int_19h 1 hour ago
      This would seem to imply that the vast majority of Julia packages are 0.x?
    • dotancohen 9 hours ago
      You might want to try a different sorting strategy. 0.25 is above 0.4. These are, I believe, what are called in Unix flags "human numbers".
      • Savageman 8 hours ago
        I understood the list is ordered by biggest number, aka 189 > 172 > 161 > 159 > 120
        • Ghoelian 5 hours ago
          I think in semver 0.4 usually means 0.04, not 0.40..., so it should be lower than 0.25.

          Edit: nevermind, I misunderstood your point
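
The two readings of the list in this sub-thread can be checked with a small Python sketch. It uses the five versions quoted in the parent comment; the helper name `parts` is illustrative, and it assumes plain x.y.z versions with no pre-release or build suffixes.

```python
# The five versions quoted in the parent comment.
versions = {
    "DiffEqBase": "6.189.1",
    "LoopVectorization": "0.12.172",
    "Reactant": "0.2.161",
    "Mooncake": "0.4.159",
    "Distributions": "0.25.120",
}


def parts(version: str) -> tuple:
    """Split 'x.y.z' into a tuple of integers so components compare numerically."""
    return tuple(int(p) for p in version.split("."))


# Ordering used in the list: by the single largest component (189 > 172 > ...).
by_largest = sorted(versions, key=lambda name: max(parts(versions[name])), reverse=True)

# Ordering by version precedence: compare major, then minor, then patch.
by_precedence = sorted(versions, key=lambda name: parts(versions[name]), reverse=True)

# Components are integers, not decimal fractions: minor 25 is greater than minor 4.
numeric_says_newer = parts("0.25.120") > parts("0.4.159")   # True
# Plain string comparison gets this wrong because '2' sorts before '4'.
string_says_newer = "0.25.120" > "0.4.159"                   # False
```

Sorting by largest component reproduces the order in the list, while sorting by precedence puts Distributions second, directly after DiffEqBase.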

  • aragonite 14 hours ago
    Incidentally I once ran into a mature package that had lived in the 0.0.x lane forever and treated every release as a patch, racking up a huge version number, and I had to remind the maintainer that users depending with caret ranges won't get those updates automatically. (In semver caret ranges never change the leftmost non-zero digit; in 0.0.x that digit is the patch version, so ^0.0.123 is just a hard pin to 0.0.123). There may occasionally be valid reasons to stay on 0.0.x though (e.g. @types/web).
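
A minimal Python sketch of the caret rule described above, assuming plain x.y.z versions (no pre-release tags). `parse`, `caret_upper_bound` and `satisfies_caret` are illustrative helpers, not the node-semver API.

```python
def parse(version: str) -> tuple:
    """Parse 'x.y.z' into a (major, minor, patch) tuple of integers."""
    major, minor, patch = (int(p) for p in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: tuple) -> tuple:
    """Exclusive upper bound: bump the leftmost non-zero component."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    # In the 0.0.x lane the patch number is the leftmost non-zero component.
    return (0, 0, patch + 1)


def satisfies_caret(version: str, base: str) -> bool:
    """True if `version` falls inside the range written as ^base."""
    v, b = parse(version), parse(base)
    return b <= v < caret_upper_bound(b)
```

With these helpers, `satisfies_caret("0.0.124", "0.0.123")` is false while `satisfies_caret("1.9.0", "1.2.3")` is true, which is the pinning effect the comment describes.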
    • robin_reala 11 hours ago
      Presumably they’re following https://0ver.org/
    • jve 13 hours ago
      Maybe that is intentional? Which package is it?
      • aragonite 13 hours ago
        It's the type definitions for developing chrome extensions. They'd been incrementing in the 0.0.x lane for almost a decade and bumped it to 0.1.0 after I raised the issue, so I doubt it was intentional:

        https://www.npmjs.com/package/@types/chrome?activeTab=versio...

        • creatonez 12 hours ago
          This is part of the DefinitelyTyped project. DT tends to get a lot of one-off contributions just for fixing the one error a dev is experiencing. So maybe they all just copied the version incrementing that previous commits had done, and no one in particular ever took the responsibility to say "this is ready now".
      • CITIZENDOT 8 hours ago
        threejs ?
  • franky47 13 hours ago
    Anthony Fu’s epoch versioning scheme (to differentiate breaking change majors from "marketing" majors) could yield easy winners here, at least on the raw version number alone (not the number of sequential versions released):

    https://antfu.me/posts/epoch-semver

    • bapak 11 hours ago
      > People often assume that a zero-major version indicates that the software is not ready for production

      I wonder why. Conventions that are being broken, maybe.

      • remedan 9 hours ago
        I don't know if this is the origin, but the semver spec says 0.x.y is unstable. Sure, not everybody uses semver, but it is popular enough for people to make incorrect assumptions.

        https://semver.org/#spec-item-4

        • int_19h 1 hour ago
          It's not the origin. Using 0.x for stuff like this was already a thing long before semver. For example, the very first release of Linux in 1991 was v0.01.
      • dotancohen 9 hours ago
        I agree with that sentiment.

        If the guy writing and maintaining the software is stating "this software is not stable yet" then who am I to disagree?

  • nosefurhairdo 14 hours ago
    The "winner" just had its 3000th release on GitHub, already a few patch versions past the version referenced in this article (which was published today): https://github.com/wppconnect-team/wa-version
    • genshii 13 hours ago
      After double-checking some things, the real winner is actually: https://github.com/nice-registry/all-the-package-names

      I made a fairly significant (dumb) mistake in the logic for extracting valid semver versions. I was doing a falsy check, so if any of major/minor/patch in the version was a 0, the whole package was ignored.

      The post has been updated to reflect this.
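
The kind of bug described above is easy to reproduce. The following is a hypothetical Python sketch, not the author's actual code (which isn't shown in this thread); it relies on the fact that `0` is falsy in Python just as it is in JavaScript. All function names are illustrative.

```python
import re
from typing import List, Optional


def parse_part(text: str) -> Optional[int]:
    """Return the component as an int, or None if it is not a plain number."""
    return int(text) if re.fullmatch(r"[0-9]+", text) else None


def parse_parts(version: str) -> List[Optional[int]]:
    return [parse_part(p) for p in version.split(".")]


def is_valid_buggy(version: str) -> bool:
    parts = parse_parts(version)
    # Bug: all() treats a legitimate 0 the same as a failed parse (None).
    return len(parts) == 3 and all(parts)


def is_valid_fixed(version: str) -> bool:
    parts = parse_parts(version)
    # Fix: only reject components that actually failed to parse.
    return len(parts) == 3 and all(p is not None for p in parts)
```

Here `is_valid_buggy("1.0.3")` rejects a perfectly valid version, while `is_valid_fixed("1.0.3")` accepts it and still rejects something like `"1.x.3"`.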

    • oconnore 14 hours ago
      This package also seems to just have a misbehaving github action that is in a loop.
      • genshii 14 hours ago
        Hmm yeah, I decided that one counts because the new packages have (slightly) different content, although it might be the case that the changes are junk/pointless anyway.
    • TZubiri 14 hours ago
      Brief reminder/clarification that these tools are used to circumvent WhatsApp ToS, and that they are used to:

      1- Spam
      2- Scam
      3- Avoid paying for the WhatsApp API (which is the only form of monetization)

      And that the reason this thing gets so many updates is probably a cat-and-mouse game where Meta updates their software continuously to avoid these types of hacks and the maintainers do so as well, whether in automated or manual fashion.

      • afiori 13 hours ago
        Considering the 18 billion price tag and the current mixing of user data between Meta and WhatsApp, I believe that Meta now has more revenue streams in mind than just the API pricing.
  • whilenot-dev 14 hours ago
    > Time to fetch version data for each one of those packages: ~12 hours (yikes)

    The author could improve the batching in fetchAllPackageData by not waiting for all 50 (BATCH_SIZE) promises to resolve at once. I just published a package for proper promise batching last week: https://www.npmjs.com/package/promises-batched

    • winrid 14 hours ago
      What's the benefit of promises like this here?

      Just spin up a loop of 50 call chains. When one completes you just do the next on next tick. It's like 3 lines of code. No libraries needed. Then you're always doing 50 at a time. You can still use await.

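          // each chain starts the next task as soon as its current one finishes
          async function work() { await thing(); process.nextTick(work); }

          for (let i = 0; i < 50; i++) { work(); }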

      then maybe a separate timer to check how many tasks are active I guess.
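
For comparison, the same "N lanes that each pull the next task" idea can be sketched in Python with asyncio. This is an illustrative analogue of the outline above, not the article's `fetchAllPackageData`; `run_pool` and its parameters are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def run_pool(
    items: Sequence[Any],
    worker: Callable[[Any], Awaitable[Any]],
    concurrency: int = 50,
) -> List[Any]:
    """Run worker(item) for every item, keeping at most `concurrency` in flight."""
    results: List[Any] = [None] * len(items)
    pending = iter(enumerate(items))  # one iterator shared by all lanes

    async def lane() -> None:
        # Each lane claims the next unprocessed item as soon as it is free,
        # so a slow task only occupies its own lane.
        for index, item in pending:
            results[index] = await worker(item)

    await asyncio.gather(*(lane() for _ in range(concurrency)))
    return results
```

Results come back in input order, and the loop ends on its own once the shared iterator is exhausted. A call might look like `asyncio.run(run_pool(names, fetch_one, concurrency=50))`, where `names` and `fetch_one` are placeholders for the caller's own data and coroutine.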

      • whilenot-dev 14 hours ago
        Promise.all waits for all 50 promises to resolve, so if one of these promises takes 3s while the other 49 take 0.5s, you're wasting 2.5s awaiting each batch.

        The implementation is rather simple, but more than 3 LoC: https://github.com/whilenot-dev/promises-batched/blob/main/s...
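
The waste described above can be put into numbers with a small simulation. This is a sketch under simplified assumptions (fixed task durations, no rate limiting or network variance); `batched_makespan` and `pooled_makespan` are hypothetical helper names.

```python
import heapq
from typing import List


def batched_makespan(durations: List[float], batch_size: int) -> float:
    """Total time when each batch must fully finish before the next one starts."""
    total = 0.0
    for start in range(0, len(durations), batch_size):
        # A batch takes as long as its slowest task.
        total += max(durations[start:start + batch_size])
    return total


def pooled_makespan(durations: List[float], concurrency: int) -> float:
    """Total time when a freed lane immediately picks up the next task."""
    lanes = [0.0] * concurrency  # time at which each lane becomes free
    heapq.heapify(lanes)
    finish = 0.0
    for duration in durations:
        free_at = heapq.heappop(lanes)   # earliest available lane
        done_at = free_at + duration
        heapq.heappush(lanes, done_at)
        finish = max(finish, done_at)
    return finish


# Two batches of 50 tasks; in each, one task takes 3s and the other 49 take 0.5s.
durations = ([3.0] + [0.5] * 49) * 2
batched = batched_makespan(durations, batch_size=50)
pooled = pooled_makespan(durations, concurrency=50)
```

In this toy model the batched run takes 6.0s and the pooled run 3.5s, because the 0.5s lanes start on the second batch's work while the slow task is still running.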

        • winrid 13 hours ago
          I know. My point is you can do better without a library.
          • halfmatthalfcat 10 hours ago
            Why not write all of our applications on one file? Why bother using (language specific) modules? To take your argument to the logical extreme, DRY is a fanatical doomsday computer science cult.
    • 1gn15 8 hours ago
      Worried about being rate limited or DoSing the server.
    • genshii 14 hours ago
      Ah this is cool, thanks!
  • athrowaway3z 13 hours ago
    One of the 'winners' I randomly googled.

    > carrot-scan -> 27708 total versions

    > Command-line tool for detecting vulnerabilities in files and directories.

    I can't help but feel there is something absurd about this.

    • Taek 12 hours ago
      Each version is likely a new vulnerability that got submitted, doesn't seem that weird.
      • darkwater 12 hours ago
        Shouldn't vulnerabilities be "data" in this context? You bump the vulns database but keep the code at the same version if the logic is the same.
        • OJFord 10 hours ago
          If it's baked into the tool (can run offline) then it would be unavoidable, need a new version to get a new release on the package manager.

          1.2.3 -> 1.2.3+1 (or +anything, date, whatever) could arguably be idiomatic semver though - that's what you do for packaging changes, like updating the description or categories to file it under etc. without actually changing the program.
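
One caveat worth illustrating: under the semver spec, build metadata (the part after `+`) is ignored when determining precedence, so `1.2.3+1` and `1.2.3` rank equally. A minimal Python sketch, assuming plain x.y.z cores without pre-release tags (`precedence_key` is a hypothetical helper):

```python
def precedence_key(version: str) -> tuple:
    """Return (major, minor, patch), dropping any '+build' metadata suffix."""
    core = version.split("+", 1)[0]
    major, minor, patch = (int(p) for p in core.split("."))
    return (major, minor, patch)


# Build metadata does not affect ordering...
same_rank = precedence_key("1.2.3+1") == precedence_key("1.2.3")
# ...whereas a patch bump does.
patch_is_newer = precedence_key("1.2.4") > precedence_key("1.2.3+20250915")
```

So a tool that resolves "latest" purely by semver precedence has no way to prefer `1.2.3+1` over `1.2.3`.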

          • chrisandchris 3 hours ago
            > (can run offline)

            And yet you need internet access when running the package with npx, even though the package is already locally installed.

            At least, I can't use npx on some of my VMs which do not have access to the internet. It just takes forever to start (and I get annoyed after some minutes).

        • pixl97 4 hours ago
          The particular problem here is that if you started out doing it wrong, then changing your update behavior would break everyone's scripting around it. By changing the 'code version', everyone's CI/CD system just keeps working the same way as with any other package.
  • EdSchouten 12 hours ago
    So 19494 is the largest? That's far lower than I expected. There's nobody out there that has put a date in a version number (e.g., 20250915)?
    • genshii 5 hours ago
      There are plenty of larger ones and plenty of ones that used the date as the version, but I was mainly curious about packages that followed semver.

      Any package version that didn't follow the x.y.z format was excluded, and any package that had fewer published versions than its largest version number was excluded (e.g. a package at version 1.123.0 should have at least 123 published versions)
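
The two exclusion rules above can be sketched in Python as follows. This is a hypothetical reconstruction from the description, not the author's code: the function names are invented, and it assumes the count compared against the largest number is the total number of published version strings.

```python
import re
from typing import List, Optional

# Strict x.y.z: no leading zeros, pre-release tags, or build metadata.
XYZ_RE = re.compile(r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)")


def largest_component(version: str) -> Optional[int]:
    """Largest of major/minor/patch, or None if the string is not plain x.y.z."""
    match = XYZ_RE.fullmatch(version)
    if match is None:
        return None
    return max(int(group) for group in match.groups())


def qualifying_number(published: List[str]) -> Optional[int]:
    """Largest version number for a package, or None if the package is excluded."""
    components = [c for c in map(largest_component, published) if c is not None]
    if not components:
        return None
    biggest = max(components)
    # Assumption: a package needs at least `biggest` published versions to count.
    return biggest if len(published) >= biggest else None
```

A package whose only release is `1.123.0` is dropped by the second rule, while one with `1.0.0`, `1.1.0` and `1.2.0` qualifies with a largest number of 2.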

    • rs186 9 hours ago
      Well, we are looking at npm packages, where every package is supposed to follow semantic versioning. The fact that we don't have date as version number means everyone is a good citizen.

      https://docs.npmjs.com/about-semantic-versioning

      • arcfour 5 hours ago
        Off the top of my head, CloudFlare uses a somewhat date based method of typing for their Workers types package, but it makes sense in context as you define compatibility dates for a Worker when you set it up, which automatically enables/disables potentially breaking features in the API.

        https://www.npmjs.com/package/@cloudflare/workers-types

  • joeyhage 1 hour ago
    > 46. aws-sdk -> 1692 (2.1692.0)

    AWS still made the top 50

  • tedk-42 13 hours ago
    Large number of released packages due to renovatebot / dependabot patching + release automation!

    If this was an actual measurement of productivity that bot deserves a raise!

  • geetee 15 hours ago
    I wonder if the author could have replicated the couchdb database locally to make their life easier.
  • kubatyszko 1 hour ago
    the one with -1 obviously ;)
  • paulirish 3 hours ago
    `latentflip-test` is from the same fellow who did the "What the heck is the event loop anyway?" JSConf EU talk that many have seen. https://youtu.be/8aGhZQkoFbQ
  • bmn__ 3 hours ago
  • nailer 15 hours ago
    > I was recently working on a project that uses the AWS SDK for JavaScript. When updating the dependencies in said project, I noticed that the version of that dependency was v3.888.0. Eight hundred eighty eight. That’s a big number as far as versions go.

    It also isn’t the first AWS SDK. A few of us in… 2012 IIRC… wrote the first one because AWS didn’t think node was worth an SDK.

  • zastai0day 12 hours ago
    Haha, good luck finding a real project that holds that title. It's always some squatted name, a dependency confusion experiment, or a troll publishing a package with version 99999.99999.99999 just to see what breaks. The "king" of that hill changes all the time. Just another day in the NPM circus.