Rule of thumb: every 10% increase in complexity cuts your potential user base in half.
This is why people make backups by copy-pasting files. This is why Excel is so dominant. This is why systems like HyperCard and git are not mainstream and never will be.
There is a large universe of tools people would love if only they would bother to learn how they worked. If only. Most people will just stick to whatever tools they know.
For most people the ability to go back and forward in time (linear history) is something they grasp immediately. Being able to go back in time and make a copy also requires no explanation. But having a version tree, forking and merging, having to deal with multiple timelines and the graphs that represent them -- that's where you lose people.
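To make the gap concrete, here's a minimal sketch (illustrative data structures of my own, not any particular tool's model): linear history is just a list with a cursor, while forking turns it into a graph of parent pointers.

    # Linear history: a list plus a cursor. Undo/redo is moving the cursor.
    history = ["v1", "v2", "v3"]
    cursor = 2  # currently at "v3"; undo is just cursor -= 1

    # Branching history: every version records a parent, so navigating
    # becomes a graph problem -- which timeline am I on, relative to what?
    versions = {
        "v1":  {"parent": None},
        "v2":  {"parent": "v1"},
        "v3":  {"parent": "v2"},
        "v2b": {"parent": "v1"},  # a fork: two timelines share an ancestor
    }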
> Rule of thumb: every 10% increase in complexity cuts your potential user base in half.
I agree this is an accurate rule of thumb. However, if the complexity lets users achieve more, then the complexity can earn its keep. Using version control is so beneficial that software engineers put up with the complexity. The ability to maintain a more complicated model in one's head and use it to produce more value is not something that all users are able to do. More sophisticated users can afford to use more complicated tools.
However, the sophisticated users are reined in by network effects. If you want to work with people, then everyone needs to be able to deal with the complexity. Programmers are more sophisticated than most office workers, which is why we ubiquitously version codebases, and not so much spreadsheets.
> This is why systems like hypercard and git are not mainstream and never will be.
We are moving towards a world where fewer humans are needed, and the humans that are needed are the most sophisticated operators in their respective domains. This means weaker network effects and less drag from unsophisticated users holding the tooling back. The worst drop off, and the population average increases.
I would not be surprised to see an understanding of version control and other sophisticated concepts become commonplace among the humans that still do knowledge work in the next few years.
I wouldn't frame it as "complexity", I would frame it as "cognitive load". You can lower cognitive load despite having high complexity. For example, you could (and many companies have done so) build a user-friendly version management system and UI on top of git, which on its surface is just "version 1", "version 2", "version 2 (final) (actually)" but under the hood is using commits and branches. You can have submenus expose advanced features to advanced users while the happy path remains easy to use.
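As a rough sketch of that idea (a hypothetical wrapper, not any shipping product): the surface exposes only "save" and "restore", while ordinary git commands do the work underneath.

    import subprocess

    # Hypothetical "friendly versioning" layer: the user sees numbered
    # versions; under the hood each save is an ordinary git commit.
    def save_version(message="new version"):
        subprocess.run(["git", "add", "-A"], check=True)
        subprocess.run(["git", "commit", "-m", message], check=True)

    def list_versions():
        log = subprocess.run(["git", "log", "--oneline"],
                             capture_output=True, text=True, check=True)
        return log.stdout.splitlines()  # newest first: "abc1234 new version"

    def restore_version(commit):
        # Restore the files from that commit without rewriting history.
        subprocess.run(["git", "checkout", commit, "--", "."], check=True)

Advanced users still have the full repository underneath; everyone else never has to learn the word "commit".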
I think the author's ideas are likely too complex for a wide audience, but they could be a game changer for those who can handle that kind of complexity.
Part of the reason why so many people are disillusioned by AI: we are attempting to tame complexity that shouldn't exist in the first place.
I'm guessing a lot of the code that was getting written was verbose boilerplate; automating all that doesn't move the productivity needle much, because it shouldn't have existed at all to start with.
When we model systems and data models, we model them for the problem we're solving today, and with enough experience you can anticipate some future problems... but not all. Business is dynamic and software data models are relatively rigid. That's why business users love Excel - it's the JavaScript of the database world: you can slap something together without too much fuss, and 60% of the time it'll work every time.
So I do a lot of backward-looking analysis and forward-looking planning. All my reusable models are SQL-based. I am certainly a top user in terms of the intersection of databases, programming, and financial modelling. I meet almost no FP&A people doing what I do.
But I still reach for Excel for exploration. And I also share these explorations with clients in Excel. Why? Because when you are exploring, you don't yet know the domain. There is no value in saving and building on top of that kind of data. "What if" questions from clients are ephemeral. My pile of scratch-pad queries is absolutely enormous. I don't need or want any of that stuff cluttering up my data sources.
My biggest challenge in complex analysis is keeping track of all of the branches and the data lineage across the models. Knowing what some metric I created "means", and if or how it can properly be used by something else.
I assert that the idea of "authoritative data" for a lot of hypothetical questions people would explore will itself be a hindrance. That is, for most hypothetical scenarios, you explicitly want modeled data. Ideally, you would know the parameters used in generating the data, but the entire point of many hypotheticals is you are projecting into an area where you don't have data. Or building a counterfactual world and predicting what it would have changed.
Similarly, "invariant preserving operations" is not necessarily what you want, either? You want to know what other parameters you would need to adjust to keep some conditions. But you want to be able to edit anything free form and then interact with it to get back to a "solved" state. That is to say, when interacting with a system, you explicitly want to allow bad or incomplete states. (This is, ultimately, what kills many "code as an AST" ideas.)
FreeCAD has a good example to consider on this. When drafting, you can pretty much freeform draw a part. But if you want to finish the session, you have to fully specify all parameters. This doesn't mean you can't add to the drawing while it is unsolved. It does mean that you can't take it to the next phase without fully solving.
This can be difficult if you don't have training material on how to use an application. And while it can be frustrating that you can save something in a state where you can't click "next," it is also frustrating to have a system where you can't save what you currently have to send to someone else to look at.
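A minimal sketch of that interaction pattern (a simplified constraint model of my own, nothing FreeCAD-specific): free-form edits and saving are always allowed, but advancing to the next phase requires a fully solved state.

    # Illustrative: a document that tolerates incomplete states but
    # gates "next" on being fully specified.
    class Sketch:
        REQUIRED = {"width", "height", "thickness"}

        def __init__(self):
            self.params = {}

        def set(self, name, value):   # free-form edits always accepted
            self.params[name] = value

        def unsolved(self):
            return self.REQUIRED - self.params.keys()

        def advance(self):
            missing = self.unsolved()
            if missing:
                raise ValueError(f"unsolved parameters: {sorted(missing)}")
            return "next phase"

    s = Sketch()
    s.set("width", 10)    # saveable and shareable, just not yet advanceable
    print(s.unsolved())   # {'height', 'thickness'} (order may vary)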
So this is just an "idea"? It uses a lot of formal-language terms, but doesn't specifically say how those are applied to the problem. Is this just "save all of the analysis state at every step, and allow subsequent steps to take different approaches"? I can see problems with data compatibility between steps, depending upon the processing done.
So… I agree with the problem, but the solution in my mind is that you want to treat the counterparties as first-class values: essentially interacting objects with user-space-defined capabilities, which winds up being a sort of transactional db as runtime environment. Transactions, plus first-class ownership and authorization, plus the ability to define incremental interactions between systems (which sounds like dependent session types but isn't quite, for a lot of little reasons).
I led building a first pilot system at JPMorgan in 2015-2018, but Ethereum et al. got the mind share for buzzword compliance at the time.
Sure. The top-level context/motivation was to build a db/modelling tool that could easily describe all the different administrative and economic workflows and resources across all of finance.
You actually want to view possible economic or administrative exchanges as literally being partially applied anonymous functions that only counterparties can import, partially apply, and, if not fully applied, re-export for further steps.
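A toy sketch of that framing (the two-party flow and the names are mine, purely illustrative): the exchange starts as a function of both parties' inputs, each counterparty applies its own argument, and the still-unapplied remainder is what gets re-exported for further steps.

    from functools import partial

    # Illustrative: a trade as a function awaiting each counterparty's input.
    def settle(buyer_sig, seller_sig, amount):
        return {"buyer": buyer_sig, "seller": seller_sig, "amount": amount}

    step1 = partial(settle, amount=100)           # terms fixed, no signatures
    step2 = partial(step1, buyer_sig="alice-ok")  # buyer applies her part...
    done = step2(seller_sig="bob-ok")             # ...seller completes it
    print(done)  # {'buyer': 'alice-ok', 'seller': 'bob-ok', 'amount': 100}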
You kinda want dependent types so you can express stuff like “this has been signed and accepted by Bob or some designated delegate on his/her behalf”
First-class signed values get a little subtle, because you've got to have a robust way to talk about a canonical binary serialization that plays nicely with cryptographic signatures. The simplest way is to require that every value signed by an identity carry a nonce from a counter kept per pubkey, unique for each new thing that identity signs. I'm saying that a little bit imprecisely, and I mean the weaker, per-pubkey sense of that. This is also a sane approach because it's the same as requiring per-identity sequential consistency OR explicitly coordinated nonce sharding.
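A minimal sketch of those two ingredients (canonical JSON and a plain in-memory counter are my stand-ins for whatever serialization and nonce scheme a real system would pin down):

    import json
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

    nonces = {}  # per-pubkey monotonic counters

    def canonical(value):
        # Deterministic bytes: sorted keys, no whitespace.
        return json.dumps(value, sort_keys=True, separators=(",", ":")).encode()

    def sign_value(private_key, value):
        pub = private_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
        nonce = nonces.get(pub, 0)
        nonces[pub] = nonce + 1  # per-identity sequential consistency
        payload = {"nonce": nonce, "value": value}
        return payload, private_key.sign(canonical(payload))

    key = Ed25519PrivateKey.generate()
    payload, sig = sign_value(key, {"amount": 100, "to": "bob"})
    key.public_key().verify(sig, canonical(payload))  # raises if tampered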
Basically: if you just add first-class identity and signatures as values to any dbms, and have strict validation for any db transaction to commit, the whole need for an api barrier around your application db kinda goes away!
AFAICT, pretty much every api wrapper around a db exists mostly because pretty much no db has a native way to model and enforce an application-specific identity, authorization, and resource-ownership model.
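As a toy sketch of that claim (SQLite plus the canonical()/sign_value() helpers from the sketch above; the ownership rule is a made-up example): the commit path itself checks the signature, so handing out the connection is no scarier than handing out an API endpoint.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, owner BLOB, body TEXT)")

    def commit_write(db, public_key, payload, signature):
        # The database-side rule: no valid signature over the canonical
        # bytes, no commit.
        public_key.verify(signature, canonical(payload))  # raises on failure
        owner = public_key.public_bytes(Encoding.Raw, PublicFormat.Raw)
        with db:  # transaction scope: commits only if we get this far
            db.execute("INSERT INTO docs (owner, body) VALUES (?, ?)",
                       (owner, payload["value"]["body"]))

    payload, sig = sign_value(key, {"body": "hello"})
    commit_write(db, key.public_key(), payload, sig)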
Allowing your full dbms to be exposed over an api while respecting application state and security is a pretty mind blowing perspective shift, and maybe I should revisit working on that.
There are a lot of other cool pieces that I've not touched on that make it pretty fun/interesting/useful, but I think that partially expresses the idea.
There are a lot of factions building git-backed or git-like systems out there already. LogRocket has a round-up of nine different CMSes, such as the excellent Tina CMS. #GitOps is incredibly well loved & respected for all manner of systems & infrastructure, with projects like ArgoCD being immensely popular.
https://blog.logrocket.com/9-best-git-based-cms-platforms/
https://tina.io/
https://argoproj.github.io/cd/
In my experience, the pushback (git is too complex, git is for programmers) is enormously fast to follow most suggestions of using git, and I remain enormously skeptical of it. There was a zeitgeist floating around a week or two ago that the worst thing you can do is make decisions on the assumption that people are stupid; git is a place where I absolutely believe we can outshine the rain, and will wonder why we ever let such doubt guide us away from such an amazing set of capabilities with such amazing tooling. We have such a start already with git tools! Once we try, we can keep humanizing this!!
My pet theory is that much of the weakness of git for everyone is that we do a terrible job of expressing data on the filesystem. It's popular, and reasonable, to be able to copy a document to someone, but in my mind a document should ideally be a file tree of sections or paragraphs that the editor kindly manages for us. We'd need a general container format (OCI?) to give us lots of small files still nicely wrapped into one.

A 9p everything-is-a-file philosophy is semi-known and has believers, but decomposing big data splats into smaller pieces, such that individual fields and bits and bobs of the data can be scripted, is a shift that exposes the data, and a necessary core accompaniment for any 9p-ification. Small, decomposed, visible data on the filesystem unlocks people's ability to see what stuff is without the crutch of one and only one special-purpose application. It's a bridge to making much more of humankind expert (or at least a capable dabbler!) at computing, and it opens up broader realms of scripting and interaction that we've kept locked away inside the process, only for developers to sometimes peer at. Git is weak because our data is encoded in countless bespoke ways, not visible on the file system. (It doesn't have to be that way.)
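A tiny sketch of the decomposition idea (the layout is invented; the point is just that each piece becomes an addressable, diffable file):

    from pathlib import Path

    # Hypothetical: an editor persists a document as a tree of small files
    # instead of one opaque blob, so git (or any tool) can see the pieces.
    doc = Path("report")
    (doc / "sections").mkdir(parents=True, exist_ok=True)
    (doc / "sections" / "01-intro.md").write_text("# Intro\nWhy we did this.\n")
    (doc / "sections" / "02-results.md").write_text("# Results\nIt worked.\n")
    (doc / "meta.json").write_text('{"title": "Q3 Report", "order": ["01", "02"]}')

    # Now "edit the intro" shows up as a one-file diff, scriptable by anything.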
Regardless, programming itself has a lot of gains to make. Making the object model in programming languages version-aware is a huge lift I really hope to see happen some day. This submission is about improving the rest of the world, but programmers and languages should do it for themselves first and lead the way. I started this post by citing lots of git-related efforts, but those are libraries, layers atop; pushing versioning down into the runtime, I feel, is where we really need version control: right from the start, in the base. (More intuitive sense here; I would love more developed, structured argumentation on this.)
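For flavor, a toy sketch of what "version aware" might mean at the object level (entirely my invention, nothing standard): every attribute write is retained, and past states stay addressable.

    # Illustrative: an object whose attribute history is first-class.
    class Versioned:
        def __init__(self):
            object.__setattr__(self, "_log", [])    # every (name, value) write
            object.__setattr__(self, "_state", {})  # current snapshot

        def __setattr__(self, name, value):
            self._log.append((name, value))
            self._state[name] = value

        def __getattr__(self, name):
            try:
                return object.__getattribute__(self, "_state")[name]
            except KeyError:
                raise AttributeError(name)

        def at(self, version):
            # Replay the first `version` writes to reconstruct a past state.
            state = {}
            for name, value in self._log[:version]:
                state[name] = value
            return state

    obj = Versioned()
    obj.x = 1
    obj.x = 2
    print(obj.x, obj.at(1))  # 2 {'x': 1}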
Globally addressable (beyond the process boundary, beyond the machine boundary) object spaces, with version control, feels like some omega point for computing, where we go from bespokely layering together complex systems and capabilities, to having a language that can express and work through the range of materials that actually make up the computing world as it is. A better compute is possible.
There are countless git-like data systems out there already: Dolt, LakeFS, and, notably, Apache Iceberg with its branching support.
https://github.com/dolthub/dolt
https://lakefs.io/
https://iceberg.apache.org/docs/latest/branching/