The Waymo World Model

(waymo.com)

583 points | by xnx 6 hours ago

48 comments

mattlondon 4 hours ago
Suddenly all this focus on world models by DeepMind starts to make sense. I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.
Google/Alphabet are so vertically integrated for AI when you think about it. Compare what they're doing: their own power generation, their own silicon, their own data centers, Search, Gmail, YouTube, Gemini, Workspace, Wallet, billions and billions of Android and Chromebook users, their ads everywhere, their browser everywhere, Waymo, probably buy back Boston Dynamics soon enough (they recently partnered), fusion research, drug discovery... and then look at ChatGPT's chatbot or Grok's porn. Pales in comparison.
- phkahler 3 hours ago
  Google has been doing more R&D and internal deployment of AI and less trying to sell it as a product. IMHO that difference in focus makes a huge difference. I used to think their early work on self-driving cars was primarily to support Street View in their maps.
  - brokencode 1 hour ago
    There was a point in time when basically every well known AI researcher worked at Google. They have been at the forefront of AI research and investing heavily for longer than anybody.
    It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.
    But they are in full gear now that there is real competition, and it’ll be cool to see what they release over the next few years.
    - hosh 1 hour ago
      I also think the presence of Sergey Brin has been making a difference in this.
      - hungryhobbit 1 hour ago
        Please, Google was terrible about using the tech they had long before Sundar, back when Brin was in charge.
        Google Reader is a simple example: Google had by far the most popular RSS reader, and they just threw it away. A single intern could have kept the whole thing running, and Google has literal billions, but they couldn't see the value in it.
        I mean, it's not like being able to see what a good portion of America is reading every day could have any value for an AI company, right?
        Google has always been terrible about turning tech into (viable, maintained) products.
        vinkelhake 44 minutes ago
        Is there an equivalent to Godwin's law wrt threads about Google and Google Reader?
        See also: any programming thread and Rust.
        burgreblast 23 minutes ago
        I never get the moaning about killing Reader. It was never about popularity or user experience.
        Reader had to be killed because it [was seen as] a suboptimal ad monetization engine. Page views were superior.
        Was Google going to support minimizing ads in any way?
        DiggyJohnson 25 minutes ago
        How is this relevant? At best it’s tangentially related and low effort
        jamespo 43 minutes ago
        Took a while but I got to the Google Reader post. Self-host tt-rss, it's much better.
      - refulgentis 1 hour ago
        Ex-Googler: I doubt it, but am curious about the rationale. (I know there was a round of PR re: him “coming back to help with AI,” but just between you and me, the word on him internally, over years and multiple projects, was that having him around caused chaos b/c he was a tourist flitting between teams, just spitting out ideas, which left you with unclear direction and multiple teams hearing the same “you should” and doing it.)
        pstuart 53 minutes ago
        That makes sense. A "secret shopper" might be a better way to avoid that but wouldn't give him the strokes of being the god in the room.
        LightBug1 33 minutes ago
        Oh ffs, we have an external investor who behaves like that. Literally set us back a year on pet nonsense projects and ideas.
    - smallnix 53 minutes ago
      > It’s kind of crazy that they have been slow to create real products and competitive large scale models from their research.
      I always thought they deliberately tried to contain the genie in the bottle as long as they could
  - AlfredBarnes 2 hours ago
    It has always felt to me that the LLM chatbots were a surprise to Google, not LLMs, or machine learning in general.
    - raphlinus 1 hour ago
      Not true at all. I interacted with Meena[1] while I was there, and the publication was almost three years before the release of ChatGPT. It was an unsettling experience, felt very science fiction.
      [1]: https://research.google/blog/towards-a-conversational-agent-...
      - hibikir 1 hour ago
        The surprise was not that they existed: there were chatbots in Google way before ChatGPT. What surprised them was the demand, despite all the problems the chatbots have. The big problem with LLMs was not that they could do nothing, but how to turn them into products that made good money. Even people in OpenAI were surprised about what happened.
        In many ways, turning tech into products that are useful, good, and don't make life hell is a more interesting issue of our times than the core research itself. We probably want to avoid the value-capturing platform problem, as otherwise we'll end up seeing governments using ham-fisted tools to punish winners in ways that aren't helpful either.
        diamondage 26 minutes ago
        The uptake forced the bigger companies to act. With image diffusion models too - no corporate lawyer would let a big company release a product that allowed the customer to create any image...but when stable diffusion et al started to grow like they did...there was a specific price of not acting...and it was high enough to change boardroom decisions
      - nasretdinov 1 hour ago
        Well, I must say ChatGPT felt much more stable than Meena when I first tried it. But, as you said, it was a few years before ChatGPT was publicly announced :)
  - AbstractH24 1 hour ago
    Google and OpenAI are both taking very big gambles with AI, with an eye towards 2036 not 2026. As are many others, but them in particular.
    It'll be interesting to see which pays off and which becomes Quibi
- xnx 4 hours ago
  > Suddenly all this focus on world models by Deep mind starts to make sense
  Google's been thinking about world models since at least 2018: https://arxiv.org/abs/1803.10122
  - anp 2 hours ago
    FWIW I understood GP to mean that it suddenly makes sense to them, not that there’s been a sudden focus shift at Google.
- mooktakim 4 hours ago
  Tesla built something like this for FSD training; they presented it many years ago. I never understood why they didn't productize it. It would have made a brilliant Maps alternative, which could automatically update from Tesla cars on the road. It could live update with speed cameras and road conditions. Like many things, they've fallen behind.
  - berryg 3 hours ago
    No Lidar anymore on the 2026 Volvo models ES60 and EX60. See for example: https://www.jalopnik.com/2032555/volvo-ends-luminar-lidar-20...
    - senordevnyc 2 hours ago
      I love Volvo, am considering buying one in a couple weeks actually, but they're doing nothing interesting in terms of ADAS, as far as I can tell. It seems like they're limited to adaptive cruise control and lane keeping, both of which have been solved problems for more than a decade.
      It sounds like they removed Lidar due to supplier issues and availability, not because they're trying to build self-driving cars and have determined they don't need it anymore.
      - ruszki 1 hour ago
        Is lane keeping really a solved problem? Just last year one of my brand new rented cars tried to kill me a few times when I tried it again, and so far not even the simple lane leaving detection mechanism worked properly in any of the tried cars when it was raining.
      - nfg 1 hour ago
        I’d suggest doing some research on software quality. Two years back I was all for buying one (I was considering an EX40), but I got myself into some Facebook groups for owners and was shocked at the dreadful reports of quality of the software and it completely put me off. I got an ID4 instead. Reports about the EX90 have been dreadful. I was very interested, and I still admire their look and build when they drive by - but it killed my enthusiasm to buy one for a few years until they get it right.
  - jellojello 4 hours ago
    Without Lidar, and with the terrible quality of Tesla's onboard cameras, street view would look terrible. The biggest L of Elon's career is the weird commitment to no-lidar. If you've ever driven a Tesla, it gives daily messages like "the left side camera is blocked", etc. Cameras and weather don't mix either.
    - ASalazarMX 4 hours ago
      At first I gave him the benefit of the doubt, like that weird decision of Steve Jobs banning Adobe Flash, which ran most of the fun parts of the Internet back then, that ended up spreading HTML5. Now I just think he refused LIDAR for purely aesthetic reasons. The cost is not even that significant compared to the overall cost of a Tesla.
      - ciberado 1 hour ago
        That one was motivated by the need to control the app distribution channel, just like they keep the web as a second-class citizen in their ecosystem nowadays.
      - iamtheworstdev 3 hours ago
        He didn't refuse it. Mobileye or whoever cut Tesla off because they were using the lidar sensors in a way he didn't approve of. From there he got mad and said "no more lidar!"
        semiquaver 1 hour ago
        Assuming what you say is true, are they the only LIDAR vendor?
        iknowstuff 3 hours ago
        False. Mobileye never used lidar. Lmao where do you all come up with this
        nerdsniper 3 hours ago
        I think Elon announced Tesla was ditching LIDAR in 2019.[0] This was before Mobileye offered LIDAR. Mobileye has used LIDAR from Luminar Technologies around 2022-2025. [1][2] They were developing their own lidar, but cancelled it. [3] They chose Innoviz Technologies as their LIDAR partner going forward for future product lines. [4]
        0: https://techcrunch.com/2019/04/22/anyone-relying-on-lidar-is...
        1: https://static.mobileye.com/website/corporate/media/radar-li...
        2: https://www.luminartech.com/updates/luminar-accelerates-comm...
        3: https://www.youtube.com/watch?v=Vvg9heQObyQ&t=48s
        4: https://ir.innoviz.tech/news-events/press-releases/detail/13...
        Fricken 30 minutes ago
        The original Mobileye EyeQ3 devices that Tesla began installing in their cars in 2013 had only a single forward facing camera. They were very simple devices, only intended to be used for lane keeping. Tesla hacked the devices and pushed them beyond their safe design constraints.
        Then that guy got decapitated when his Model S drove under a semi-truck that was crossing the highway and Mobileye terminated the contract. Weirdly, the same fatal edge case occurred 2 more times at least on Tesla's newer hardware.
        https://en.wikipedia.org/wiki/List_of_Tesla_Autopilot_crashe...
        nerdsniper 6 minutes ago
        Thank you!
        iknowstuff 2 hours ago
        Never with the product used by Tesla early on.
        agildehaus 3 hours ago
        https://www.mobileye.com/news/mobileye-to-end-internal-lidar...
        Um, yes they did.
        No idea if it had any relation to Tesla though.
        iknowstuff 2 hours ago
        Did not
      - smallmancontrov 3 hours ago
        His stated reason was that he wanted the team focused on the driving problem, not sensor fusion "now you have two problems" problems. People assumed cost was the real reason, but it seems unfair to blame him for what people assumed. Don't get me wrong, I don't like him either, but that's not due to his autonomous driving leadership decisions, it's because of shitting up twitter, shitting up US elections with handouts, shitting up the US government with DOGE, seeking Epstein's "wildest party," DARVO every day, and so much more.
        jellojello 3 hours ago
        Sensor fusion is an issue, one that is solvable over time and investment in the driving model, but sensor-can't-see-anything is a show stopper.
        Having a self-driving solution that can be totally turned off with a speck of mud, heavy rain, morning dew, bright sunlight at dawn and dusk.. you can't engineer your way out of sensor-blindness.
        I don't want a solution that is available to use 98% of the time, I want a solution that is always-available and can't be blinded by a bad lighting condition.
        I think he did it because his solution always used the crutch of "FSD Not Available, Right hand Camera is Blocked" messaging and "Driver Supervision" as the backstop to any failure anywhere in the stack. Waymo had no choice but to solve the expensive problem of "Always Available and Safe" and work backwards on price.
        smallmancontrov 2 hours ago
        LIDAR is notoriously easy to blind, what are you on about? Bonus meme: LIDAR blinds you(r iPhone camera)!
    - verelo 4 hours ago
      Yeah, it's absurd. As a Tesla driver, I have to say the autopilot model really does feel like what someone who's never driven a car before thinks it's like.
      Using vision only is so ignorant of what driving is all about: sound, vibration, vision, heat, cold... these are all clues on road condition. If the car isn't feeling all these things as part of the model, you're handicapping it. In a brilliant way, Lidar is the missing piece of information a car needs without relying on multiple sensors; it's probably superior to what a human can do, whereas vision only is clearly inferior.
      - smallmancontrov 3 hours ago
        The inputs to FSD are:
        7 cameras x 36fps x 5Mpx x 30s
        48kHz audio
        Nav maps and route for next few miles
        100Hz kinematics (speed, IMU, odometry, etc)
        Source: https://youtu.be/LFh9GAzHg1c?t=571
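For a rough sense of scale, the quoted input figures multiply out as below. This is a back-of-the-envelope sketch only: it assumes "x 30s" means a 30-second input window and that all seven cameras share the same frame rate and resolution, neither of which the comment spells out. The constant names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the comment.
# Assumption: "x 30s" is a 30-second input window shared by every stream.
CAMERAS = 7
FRAMES_PER_SECOND = 36
PIXELS_PER_FRAME = 5_000_000  # "5Mpx"
WINDOW_SECONDS = 30
AUDIO_SAMPLE_RATE_HZ = 48_000
KINEMATICS_RATE_HZ = 100

frames_in_window = CAMERAS * FRAMES_PER_SECOND * WINDOW_SECONDS
pixels_per_second = CAMERAS * FRAMES_PER_SECOND * PIXELS_PER_FRAME
pixels_in_window = pixels_per_second * WINDOW_SECONDS
audio_samples_in_window = AUDIO_SAMPLE_RATE_HZ * WINDOW_SECONDS
kinematics_samples_in_window = KINEMATICS_RATE_HZ * WINDOW_SECONDS

print(f"camera frames per window:     {frames_in_window:,}")              # 7,560
print(f"raw pixels per second:        {pixels_per_second:,}")             # 1,260,000,000
print(f"raw pixels per window:        {pixels_in_window:,}")              # 37,800,000,000
print(f"audio samples per window:     {audio_samples_in_window:,}")       # 1,440,000
print(f"kinematic samples per window: {kinematics_samples_in_window:,}")  # 3,000
```

Under those assumptions the cameras dominate the raw input by several orders of magnitude, which is the context for the fusion question in the replies.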
        ambicapter 3 hours ago
        So if they’re already “fusioning” all these things, why would LIDAR be any different?
        smallmancontrov 2 hours ago
        Tesla went nothing-but-nets (making fusion easy) and Chinese LIDAR became cheap around 2023, but monocular depth estimation was spectacularly good by 2021. By the time unit cost and integration effort came down, LIDAR had very little to offer a vision stack that no longer struggled to perceive the 3D world around it.
        Also, integration effort went down but it never disappeared. Meanwhile, opportunity cost skyrocketed when vision started working. Which layers would you carve resources away from to make room? How far back would you be willing to send the training + validation schedule to accommodate the change? If you saw your vision-only stack take off and blow past human performance on the march of 9s, would you land the plane just because red paint became available and you wanted to paint it red?
        I wouldn't completely discount ego either, but IMO there's more ego in the "LIDAR is necessary" case than the "LIDAR isn't necessary" at this point. FWIW, I used to be an outspoken LIDAR-head before 2021 when monocular depth estimation became a solved problem. It was funny watching everyone around me convert in the opposite direction at around the same time, probably driven by politics. I get it, I hate Elon's politics too, I just try very hard to keep his shitty behavior from influencing my opinions on machine learning.
        magicalist 1 hour ago
        > but monocular depth estimation was spectacularly good by 2021
        It's still rather weak and true monocular depth estimation really wasn't spectacularly anything in 2021. It's fundamentally ill posed and any priors you use to get around that will come to bite you in the long tail of things some driver will encounter on the road.
        The way it got good is by using camera overlap in space and over time while in motion to figure out metric depth over the entire image. Which is, humorously enough, sensor fusion.
        smallmancontrov 30 minutes ago
        It was spectacularly good before 2021, 2021 is just when I noticed that it had become spectacularly good. 7.5 billion miles later, this appears to have been the correct call.
        kanbara 2 hours ago
        Depth estimation is but one part of the problem: atmospheric and other conditions which blind optical visible-spectrum sensors, lack of ambient (sun)light, and more. Lidar simply outperforms (performs at all?) in these conditions, and provides hardware-backed distance maps, not software-calculated estimation.
        gibolt 1 hour ago
        Lidar fails worse than cameras in nearly all those conditions. There are plenty of videos of Tesla's vision-only approach seeing obstacles far before a human possibly could in all those conditions on real customer cars. Many are on the old hardware with far worse cameras
        kranke155 1 hour ago
        Always thought the case was for sensor redundancy and data variety - the stuff that throws off monocular depth estimation might not throw off a lidar or radar.
        7e 1 hour ago
        Monocular depth estimation can be fooled by adversarial images, or just scenes outside of its distribution. It's a validation nightmare and a joke for high reliability.
        gibolt 1 hour ago
        It isn't monocular though. A Tesla has 2 front-facing cameras, narrow and wide-angle. Beyond that, it is only neural nets at this point, so depth estimation isn't directly used; it is likely part of the neural net, but only the useful distilled elements.
        smallmancontrov 10 minutes ago
        I never said it was. I was using it as a lower bound for what was possible.
        verelo 3 hours ago
        Better than I expected. So this was 3 days ago; is this for all previous models or is there a cutoff date here?
        ChicagoDave 2 hours ago
        Fog, heavy rain, heavy snow, people running between cars or from an obstructed view…
        None of these technologies can ever be 100%, so we’re basically accepting a level of needless death.
        Musk has even shrugged off FSD related deaths as, “progress”.
        smallmancontrov 2 hours ago
        Humans: 70 deaths in 7 billion miles
        FSD: 2 deaths in 7 billion miles
        Looks like FSD saves lives by a margin so fat it can probably survive most statistical games.
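The arithmetic behind that claim can be sketched as follows. The counts and mileage are the commenter's figures taken at face value, not verified numbers, and the helper names are hypothetical. The sketch checks only sampling noise, using an exact Poisson interval on each count; it says nothing about whether the two sets of miles are comparable (road mix, supervision), which is what the replies go on to dispute.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(k + 1))

def solve_decreasing(f, target, lo, hi, iters=200):
    """Bisection for f(x) == target, where f is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_interval(k, alpha=0.05):
    """Exact two-sided interval for a Poisson mean, given an observed count k."""
    search_max = 4 * k + 20  # comfortably above the upper bound for these counts
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda lam: poisson_cdf(k - 1, lam), 1 - alpha / 2, 0.0, search_max)
    upper = solve_decreasing(
        lambda lam: poisson_cdf(k, lam), alpha / 2, 0.0, search_max)
    return lower, upper

MILES_BILLIONS = 7.0                  # figure quoted in the comment
HUMAN_DEATHS, FSD_DEATHS = 70, 2      # figures quoted in the comment

point_ratio = (HUMAN_DEATHS / MILES_BILLIONS) / (FSD_DEATHS / MILES_BILLIONS)
human_lo, human_hi = (b / MILES_BILLIONS for b in exact_poisson_interval(HUMAN_DEATHS))
fsd_lo, fsd_hi = (b / MILES_BILLIONS for b in exact_poisson_interval(FSD_DEATHS))

print(f"point estimate ratio: {point_ratio:.0f}x")
print(f"human deaths per billion miles, 95% interval: {human_lo:.2f} to {human_hi:.2f}")
print(f"FSD deaths per billion miles, 95% interval:   {fsd_lo:.2f} to {fsd_hi:.2f}")
```

With only two events the FSD interval is wide, but on these inputs its upper end still sits below the lower end of the human interval.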
        elgenie 44 minutes ago
        Isn't there a great deal of gaming going on with the car disengaging FSD milliseconds before crashing? Voila, no "full" "self" driving accident; just another human failing [*]!
        [*] Failing to solve the impossible situation FSD dropped them into, that is.
        smallmancontrov 23 minutes ago
        Nope. NHTSA's criteria for reporting is active-within-30-seconds.
        https://www.nhtsa.gov/laws-regulations/standing-general-orde...
        If there's gamesmanship going on, I'd expect the antifan site linked below to have different numbers, but it agrees with the 2 deaths figure for FSD.
        hn_acc1 1 hour ago
        Is that the official Tesla stat? I've heard of way more Tesla fatalities than that..
        simondotau 11 minutes ago
        There are a sizeable number of deaths associated with the abuse of Tesla’s adaptive cruise control with lane centering (publicly marketed as “autopilot”). Such features are commonplace on many new cars and it is unclear whether Tesla is an outlier, because no one is interested in obsessively researching cruise control abuse among other brands.
        There are two deaths associated with FSD.
        ChicagoDave 1 hour ago
        This is absolutely a Musk defender. FSD and Tesla related deaths are much higher.
        https://www.tesladeaths.com/index-amp.html
        smallmancontrov 29 minutes ago
        Autopilot is the shitty lane assist. FSD is the SOTA neural net.
        Your link agrees with me:
        > 2 fatalities involving the use of FSD
        Fricken 29 minutes ago
        I don't know what he's on about. Here's a better list:
        https://en.wikipedia.org/wiki/List_of_Tesla_Autopilot_crashe...
        smallmancontrov 22 minutes ago
        Autopilot is the shitty lane assist. FSD is the SOTA neural net.
        Your link agrees with me:
        > two that NHTSA's Office of Defect Investigations determined as happening during the engagement of Full Self-Driving (FSD) after 2022.
      - torginus 2 hours ago
        I quickly googled Lidar limitations, and this article came up:
        https://www.yellowscan.com/knowledge/how-weather-really-affe...
        Seeing how it's by a lidar vendor, I don't think they're biased against it. It seems Lidar is not a panacea: it struggles with heavy rain and snow much more than cameras do, and is affected by cold weather or any contamination on the sensor.
        So lidar will only get you so far. I'm far more interested in mmWave radar, which, while much worse in spatial resolution, isn't affected by light conditions or weather, and can directly measure properties of the thing it's illuminating, like material properties, the speed it's moving, and its thickness.
        Fun fact: mmWave-based presence sensors can measure your heartbeat, as the micro-movements show up as a frequency component. So I'd guess it would have a very good chance of detecting a human.
        I'm pretty sure even with much more rudimentary processing, it'll be able to tell if it's looking at a living being.
        By the way: what happened to the idea that self-driving cars will be able to talk to each other and combine each other's sensor data, so if there are multiple ones looking at the same spot, you'd get a much improved chance of not making a mistake?
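The "shows up as a frequency component" point about heartbeat sensing can be illustrated with a toy example. This is a minimal sketch on a synthetic signal, not a radar pipeline: the sample rate, amplitudes, frequencies and function names are all assumptions chosen for illustration, and a real sensor would first have to extract chest displacement from the radar return. It mixes a large slow "breathing" motion with a tiny faster "heartbeat" motion, then picks the strongest DFT bin inside a plausible heart-rate band.

```python
import cmath
import math

SAMPLE_RATE_HZ = 20.0              # hypothetical displacement sample rate
DURATION_S = 30
BREATH_HZ, BREATH_AMP = 0.3, 1.0   # large, slow chest motion
HEART_HZ, HEART_AMP = 1.2, 0.05    # tiny, faster motion (72 beats per minute)

def synthetic_displacement():
    """Synthetic chest displacement: breathing plus a much smaller heartbeat."""
    n = int(SAMPLE_RATE_HZ * DURATION_S)
    return [
        BREATH_AMP * math.sin(2 * math.pi * BREATH_HZ * i / SAMPLE_RATE_HZ)
        + HEART_AMP * math.sin(2 * math.pi * HEART_HZ * i / SAMPLE_RATE_HZ)
        for i in range(n)
    ]

def dominant_frequency(samples, sample_rate_hz, band_hz):
    """Return the frequency (Hz) of the largest DFT magnitude inside band_hz."""
    n = len(samples)
    low_hz, high_hz = band_hz
    best_freq, best_mag = None, -1.0
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        if not (low_hz <= freq <= high_hz):
            continue
        # Naive single-bin DFT; fast enough for a few hundred samples.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        if abs(coeff) > best_mag:
            best_freq, best_mag = freq, abs(coeff)
    return best_freq

# Restricting the search to 0.8-2.0 Hz skips the much stronger breathing peak.
heart_hz = dominant_frequency(synthetic_displacement(), SAMPLE_RATE_HZ, (0.8, 2.0))
print(f"estimated heart rate: {heart_hz * 60:.0f} bpm")  # 72 bpm
```

The heartbeat is 20 times smaller than the breathing motion in this toy signal, yet it is easy to isolate once the search is limited to the right band.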
      - ASalazarMX 3 hours ago
        Maybe vision-only can work with much better cameras, with a wider spectrum (so they can see thru fog, for example), and self-cleaning/zero upkeep (so you don't have to pull over to wipe a speck of mud from them). Nevertheless, LIDAR still seems like the best choice overall.
      - iknowstuff 3 hours ago
        Autopilot hasn’t been updated in years and is nothing like FSD. FSD does use all of those cues.
        verelo 3 hours ago
        I misspoke, I'm using Hardware 3 FSD.
    - 0xfaded 3 hours ago
      I have HW3, but FSD reliably disengages at this time of year with sunrise and sunset during commute hours.
      - jellojello 3 hours ago
        Yep, and won't activate until any morning dew is off the sensors.. or when it rains too hard.. or if it's blinded by a shiny building/window/vehicle.
        I will never trust 2d camera-only, it can be covered or blocked physically and when it happens FSD fails.
        As cheap as LIDAR has gotten, adding it to every new Tesla seems to be the best way out of this idiotic position. Sadly I think Elon got bored with cars and moved on.
      - iknowstuff 3 hours ago
        FSD14 on hw4 does not. Its dynamic range is equivalent or better than human.
    - kypro 3 hours ago
      From the perspective of viewing FSD as an engineering problem that needs solving I tend to think Elon is on to something with the camera-only approach – although I would agree the current hardware has problems with weather, etc.
      The issue with lidar is that many of the difficult edge-cases of FSD are all visible-light vision problems. Lidar might be able to tell you there's a car up front, but it can't tell you that the car has its hazard lights on and a flat tire. Lidar might see a human-shaped thing in the road, but it cannot tell whether it's a mannequin leaning against a bin or a human about to cross the road.
      Lidar gets you most of the way there when it comes to spatial awareness on the road, but you need cameras for most of the edge-cases because cameras provide the color data needed to understand the world.
      You could never have FSD with just lidar, but you could have FSD with just cameras if you can overcome all of the hardware and software challenges with accurate 3D perception.
      Given Lidar adds cost and complexity, and most edge cases in FSD are camera problems, I think camera-only probably helps to force engineers to focus their efforts in the right place rather than hitting bottlenecks from over depending on Lidar data. This isn't an argument for camera-only FSD, but from Tesla's perspective it does keep down costs and allows them to continue to produce appealing cars, which is obviously important if you're coming at FSD from the perspective of an automaker trying to sell cars.
      Finally, adding lidar as a redundancy once you've "solved" FSD with cameras isn't impossible. I personally suspect Tesla will eventually do this with their robotaxis.
      That said, I have no real experience with self-driving cars. I've only worked on vision problems and while lidar is great if you need to measure distances and not hit things, it's the wrong tool if you need to comprehend the world around you.
      - senordevnyc 2 hours ago
        This is so wild to read when Waymo is currently doing like 500,000 paid rides every week, all over the country, with no one in the driver's seat. Meanwhile Tesla seems to have a handful of robotaxis in Austin, and it's unclear if any of them are actually driverless.
        But the Tesla engineers are "in the right place rather than hitting bottlenecks from over depending on Lidar data"? What?
        smallmancontrov 2 hours ago
        Tesla has driven 7.5B autonomous miles to Waymo's 0.2B, but yes, Waymo looks like they are ahead when you stratify the statistics according to the ass-in-driver-seat variable and neglect the stratum that makes Tesla look good.
        The real question is whether doing so is smart or dumb. Is Tesla hiding big show-stopper problems that will prevent them from scaling without a safety driver? Or are the big safety problems solved and they are just finishing the Robotaxi assembly line that will crank out more vertically-integrated purpose-designed cars than Waymo's entire fleet every day before lunch?
        hn_acc1 1 hour ago
        Tesla's also been involved in WAY more accidents than Waymo - and has tried to silence those people, claim FSD wasn't active, etc.
        What good is a huge fleet of Robotaxis if no one will trust them? I won't ever set foot in a Robotaxi, as long as Elon is involved.
        kypro 59 minutes ago
        There are more Teslas on the road than Waymos by several orders of magnitude. Additionally, the types of roads and conditions Teslas drive under are completely incomparable to Waymo's.
        jamespo 35 minutes ago
        Yes that was accounted for above, but this isn't autonomous apples to apples
        jasondigitized 45 minutes ago
        semi autonomous
        kypro 1 hour ago
        I wasn't arguing Tesla is ahead of Waymo? Nor do I think they are. All I was arguing was that it makes sense from the perspective of a consumer automobile maker to not use lidar.
        I don't think Tesla is that far behind Waymo though given Waymo has had a significant head start, the fact Waymo has always been a taxi-first product, and given they're using significantly more expensive tech than Tesla is.
        Additionally, it's not like this is a lidar vs cameras debate. Waymo also uses and needs cameras for FSD for the reasons I mentioned, but they supplement their robotaxis with lidar for accuracy and redundancy.
        My guess is that Tesla will experiment with lidar on their robotaxis this year because design decisions should differ from those of a consumer automobile. But I could be wrong because if Tesla wants FSD to work well on visually appealing and affordable consumer vehicles then they'll probably have to solve some of the additional challenges with a camera-only FSD system. I think it will depend on how much Elon decides Tesla needs to pivot into robotaxis.
        Either way, what is undebatable is that you can't drive with lidar only. If the weather is so bad that cameras are useless then Waymos are also useless.
    - gambiting 2 hours ago
      >>The biggest L of elon's career is the weird commitment to no-lidar.
      I thought it was the Nazi salutes on stage and backing neo-nazi groups everywhere around the world, but you know, I guess the lidar thing too.
- schiffern 29 minutes ago
  > I've never really thought of Waymo as a robot in the same way as e.g. a Boston Dynamics humanoid, but of course it is a robot of sorts.
  So for the record, that's 3+ years behind Tesla.
  https://www.youtube.com/watch?v=ODSJsviD_SU&t=3594s
  - tapoxi 19 minutes ago
    Aren't they still using safety drivers or safety follow cars and in fewer cities? Seems Tesla is pretty far behind.
    - schiffern 13 minutes ago
      What do you think I said that you're contradicting?
      IMO the presence of safety drivers is just a sensible "as low as reasonably achievable" measure during the early rollout. I'm not sure that can (fairly) be used as a point against them.
      I'm comfortable with Tesla sparing no expense for safety, since I think we all (including Tesla) understand that this isn't the ultimate implementation. In fact, I think it would be a scandal if Tesla failed to do exactly that.
      Damned if you do and damned if you don't, apparently.
      - tapoxi 12 minutes ago
        I don't know if Tesla claiming they're doing something carries weight anymore.
        schiffern 6 minutes ago
        Setting aside your anti-Tesla bias, none of what I said relies on Tesla claims. The "chase vehicle" claims are all based on third-party accounts from actual rideshare customers.
- smeeth 4 hours ago
  I always understood this to be why Tesla started working on humanoid robots
  - ACCount37 3 hours ago
    Pretty much. They banked on "if we can solve FSD, we can partially solve humanoid robot autonomy, because both are robots operating in poorly structured real world environments".
  - jasondigitized 44 minutes ago
    I don't want a humanoid robot. I want a purpose built robot.
  - Fricken 26 minutes ago
    It's so they can stick a Tesla logo on a bunch of Chinese tech and call it innovation.
  - smt88 3 hours ago
    They started working on humanoid robots because Musk always has to have the next moonshot, trillion-dollar idea to promise "in 3 years" to keep the stock price high.
    As soon as Waymo's massive robotaxi lead became undeniable, he pivoted from robotaxis to humanoid robots.
    - senordevnyc 2 hours ago
      Yeah, that and running Grok on a trillion GPUs in space lol
- dmd 3 hours ago
  Which is why it's embarrassing how much worse Gemini is at searching the web for grounding information, and how incredibly bad Gemini CLI is.
  - xnx 1 hour ago
    Not my experience in either of those areas.
- coffeemug 3 hours ago
  The vertical integration argument should apply to Grok. They have Tesla driving data (probably much more data than Waymo), Twitter data, plus Tesla/SpaceX manufacturing data. When/if Optimus starts on the production line, they'll have that data too. You could argue they haven't figured out how to take advantage of it, but the potential is definitely there.
  - BoredPositron 3 hours ago
    Agreed. Should they achieve Google level integration, we will all make sure they are featured in our commentary. Their true potential is surely just around the corner...
  - jeffbee 2 hours ago
    "Tesla has more data than Waymo" is some of the lamest cope ever. Tesla does not have more video than Google! That's crazy! People who repeat this are crazy! If there was a massive flow of video from Tesla cars to Tesla HQ that would have observable side effects.
- thefounder 3 hours ago
  But somehow Google fails to execute. Gemini is useless for programming and I don’t even bother to use it as a chat app. Claude code + gpt 5.2 xhigh for coding and gpt as a chat app are really the only ones that are worth it (price and time wise).
  [-]
  - coffeemug 3 hours ago
    I've recently switched to Claude for chat. GPT 5.2 feels very engagement-maxxed for me, like I'm reading a bad LinkedIn post. Claude does a tiny bit of this too, but an order of magnitude less in my experience. I never thought I'd switch from ChatGPT, but there is only so much "here's the brutal truth, it's not x it's y" I can take.
    [-]
    - thechao 3 hours ago
      GPT likes to argue, and most of its arguments are straw man arguments, usually conflating priors. It's ... exhausting; akin to arguing on the internet. (What am I even saying, here!?) Claude's a lot less of that. I don't know if it tracks discussion/conversation better; but, for damn sure, it's got way less verbal diarrhea than GPT.
      [-]
      - mrlongroots 2 hours ago
        Yes, GPT5-series thinking models are extremely pedantic and tedious. Any conversation with them is derailed because they start nitpicking something random.
        But Codex/5.2 was substantially more effective than Claude at debugging complex C++ bugs until around Fall, when I was writing a lot more code.
        I find Gemini 3 useless. It has regressed on hallucinations from Gemini 2.5, to the point where its output is no better than a random token stream despite all its benchmark outperformance. I would use Gemini 2.5 to help write papers and all, but I can't seem to use Gemini 3 for anything. Gemini CLI also is very non-compliant and crazy.
    - thefounder 2 hours ago
      To me ChatGPT seems smarter and knows more. That’s why I use it. Even Claude rates gpt better for knowledge answers. Not sure if that itself is any indication. Claude seems superficial unless you hammer it to generate a good answer.
    - aschla 3 hours ago
      Experiencing the same. It seems Anthropic’s human-focused design choices are becoming a differentiator.
  - henryfjordan 2 hours ago
    Gemini works well enough in Search and in Meet. And it's baked into the products so it's dead simple to use.
    I don't think Google is targeting developers with their AI, they are targeting their product's users.
  - unsupp0rted 1 hour ago
    Gemini is by far the best UI/UX designer model. Codex seems to be the worst: it'll build something awkward and ugly, then Gemini will take 30-60 seconds to make it look like something that would have won a design award a couple years ago.
  - noelsusman 2 hours ago
    It is a bit mind boggling how behind they were considering they invented transformers and were also sitting on the best set of training data in the world, but they've caught up quite a bit. They still lag behind in coding, but I've found Gemini to be pretty good at more general knowledge tasks. Flash 3 in particular is much better than anything of comparable price and speed from OpenAI or Anthropic.
- spiderfarmer 1 hour ago
  Grok/xAI is a joke at this point. A true money pit without any hopes for a serious revenue stream.
  They should be bought by a rocket company. Then they would stand a chance.
- jasondigitized 2 hours ago
  The flywheel is starting to spin......
- Dig1t 1 hour ago
  >or grok's porn
  I know it’s gross, but I would not discount this. Remember why Blu-ray won over HD DVD? I know it won for many other technical reasons, but I think there are a few historical examples of sexual content being a big competitive advantage.
- uoaei 1 hour ago
  What an upsetting comment. I'm glad you came around but what did you think was going to be effective before you came around to world models?
- themafia 3 hours ago
  It's a 3500lb robot that can kill you.
  Boston Dynamics is working on a smaller robot that can kill you.
  Anduril is working on even smaller robots that can kill you.
  The future sucks.
  [-]
  - zzzeek 3 hours ago
    and they're all controlled by (poorly compensated) humans anyway [1] [2]
    [1] https://www.wsj.com/tech/personal-tech/i-tried-the-robot-tha...
    [2] https://futurism.com/advanced-transport/waymos-controlled-wo...
    [-]
    - themafia 2 hours ago
      They couldn't even make burger flipping robots work and are paying fast food workers $20/hr in California.
      If that doesn't make it obvious what they can and cannot do then I can't respect the tranche of "hackers" who blindly cheer on this unchecked corporate dystopian nightmare.
- sdf2erf 4 hours ago
  "Waymo as a robot in the same way"
  Erm, a dishwasher, washing machine, automated vacuum can be considered robots. I'm confused as to this obsession with the term - there are many robots that already exist. Robotics have been involved in the production of cars for decades.
  ......
  [-]
  - ASalazarMX 3 hours ago
    I think the (gray) line is the degree of autonomy. My washing machine makes very small, predictable decisions, while a Waymo has to manage uncertainty most of the time.
    [-]
    - sdf2erf 3 hours ago
      It's irrelevant. A robot is a robot.
      Dictionary def: "a machine controlled by a computer that is used to perform jobs automatically."
      [-]
      - mattlondon 3 hours ago
        No one is denying that robots existed already (but I would hardly call a dishwasher a robot FWIW)
        But in my mind a Waymo was always a "car with sensors"; more recently (especially after using them a bunch in California) I've come to think of them truly as robots.
      - saghm 3 hours ago
        A robot is a robot, and a human is a creature that won't necessarily agree with another human on what the definition of a word is. Dictionaries are also written by humans and don't necessarily reflect the current consensus, especially on terms where people's understanding might evolve over time as technology changes.
        Even if that definition were universally agreed upon though, that's not really enough to understand what the parent comment was saying. Being a robot "in the same way" as something else is even less objective. Humans are humans, but they're also mammals; is a human a mammal "in the same way" as a mouse? Most humans probably have a very different view of the world than most mice, and the parent comment was specifically addressing the question of whether it makes sense for an autonomous car to model the world the same way as other robots or not. I don't see how you can dismiss this as "irrelevant" because both humans and mice are mammals (or even animals; there's no shortage of classifications out there) unless you're completely having a different conversation than the person you responded to. You're not necessarily wrong because of that, but you're making a pretty significant misjudgment if you think that's helpful to them or to anyone else involved in the ongoing conversation.
      - ASalazarMX 3 hours ago
        TIL fuel injectors are robots. Probably my ceiling lights too.
        Maybe we need to nitpick about what a job is exactly? Or we could agree to call Waymos (semi)autonomous robots?
      - goatlover 3 hours ago
        In the same way people online have argued helicopters are flying cars, it doesn't capture what most people mean when they use the word "robot", any more than helicopters are what people have in mind when they mention flying cars.
SebastianSosa 2 minutes ago
Very concerned with this direction of training “counterfactual events such as whether the Waymo Driver could have safely driven more confidently instead of yielding in a particular situation.” Seems dicey. This could lead toward a less safe Waymo. Since the counterfactual will be generated, I suspect that the generations will be biased towards survivor situations, since most video footage in its training data will be from environments where people reacted well, not those that ended in tragedy. Emboldening Waymo on generated best case data. THIS IS DANGEROUS!!!
yummypaint 48 minutes ago
By leveraging Genie’s immense world knowledge, it can simulate exceedingly rare events—from a tornado to a casual encounter with an elephant—that are almost impossible to capture at scale in reality. The model’s architecture offers high controllability, allowing our engineers to modify simulations with simple language prompts, driving inputs, and scene layouts. Notably, the Waymo World Model generates high-fidelity, multi-sensor outputs that include both camera and lidar data.
How do you know the generated outputs are correct? Especially for unusual circumstances?
Say the scenario is a patch of road is densely covered with 5 mm ball bearings. I'm sure the model will happily spit out numbers, but are they reasonable? How do we know they are reasonable? Even if the prediction is ok, how do we fundamentally know that the prediction for 4 mm ball bearings won't be completely wrong?
There seems to be a lot of critical information missing.
[-]
- IMTDb 23 minutes ago
  The idea is that, over time, the quality and accuracy of world-model outputs will improve. That, in turn, lets autonomous driving systems train on a large amount of “realistic enough” synthetic data.
  For example, we know from experience that Waymo is currently good enough to drive in San Francisco. We don’t yet trust it in more complex environments like dense European cities or Southeast Asian “hell roads.” Running the stack against world models can give a big head start in understanding what works, and which situations are harder, without putting any humans in harm’s way.
  We don’t need perfect accuracy from the world model to get real value. And, as usual, the more we use and validate these models, the more we can improve them, creating a virtuous cycle.
- joshfee 40 minutes ago
  Isn't that true for any scenario previously unencountered, whether it is a digital simulation or a human? We can't optimize for the best possible outcome in reality (since we can't predict the future), but we can optimize for making the best decisions given our knowledge of the world (even if it is imperfect).
  In other words it is a gradient from "my current prediction" to "best prediction given my imperfect knowledge" to "best prediction with perfect knowledge", and you can improve the outcome by shrinking the gap between 1&2 or shrinking the gap between 2&3 (or both)
- ses1984 26 minutes ago
  You could train it in simulation and then test it in reality.
  [-]
  - inkysigma 25 minutes ago
    Would it actually be a good idea to operate a car near an active tornado?
- fooker 41 minutes ago
  > from a tornado to a casual encounter with an elephant
  A sims style game with this technology will be pretty nice!
- aaaalone 45 minutes ago
  They probably just look at the results of the generation.
  I mean would I like an in-depth tour of this? Yes.
  But it's a marketing blog article, what do you expect?
  [-]
  - parliament32 38 minutes ago
    > just look at the results of the generation
    And? The entire hallucination problem with text generators is "plausible sounding yet incorrect", so how does a human eyeballing it help at all?
    [-]
    - inkysigma 19 minutes ago
      I think that because there's no single correct answer here, the model is allowed to be fuzzier. You still mix in real training data and maybe more physics based simulation of course but it does seem acceptable that you synthesize extremely tail evaluations since there isn't really a "better" way by definition and you can evaluate the end driving behavior after training.
      You can also probably still use it for some kinds of evaluation as well since you can detect if two point clouds intersect presumably.
      In much the same way, LLMs are not perfect at translation but are widely used anyway for NMT.
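The point-cloud intersection check suggested above could be sketched roughly as follows. This is a toy illustration only, not anything Waymo has described: the voxel size, the function names, and the sample coordinates are all assumptions made for the example.

```python
from math import floor


def voxelize(points, voxel_size):
    """Map each (x, y, z) point to the integer index of the voxel containing it."""
    return {
        (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        for x, y, z in points
    }


def clouds_intersect(cloud_a, cloud_b, voxel_size=0.5):
    """Return True if the two clouds occupy at least one common voxel."""
    return not voxelize(cloud_a, voxel_size).isdisjoint(voxelize(cloud_b, voxel_size))


# Hypothetical data in meters: a simulated ego-vehicle footprint and two obstacles.
ego = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5), (2.0, 0.0, 0.5)]
far_obstacle = [(10.0, 3.0, 0.5), (10.5, 3.0, 0.5)]
overlapping_obstacle = [(1.05, 0.05, 0.55)]

print(clouds_intersect(ego, far_obstacle))          # no shared voxel
print(clouds_intersect(ego, overlapping_obstacle))  # shares a voxel with the ego cloud
```

A voxel test like this is coarse: two points that sit close together but on opposite sides of a voxel boundary are not reported, so a real check would likely also inspect neighboring voxels or use a nearest-neighbor distance.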
xnx 6 hours ago
> The Waymo World Model can convert those kinds of videos, or any taken with a regular camera, into a multimodal simulation—showing how the Waymo Driver would see that exact scene.
Subtle brag that Waymo could drive in camera-only mode if they chose to. They've stated as much previously, but that doesn't seem widely known.
[-]
- bonsai_spool 5 hours ago
  I think I'm misunderstanding - they're converting video into their representation which was bootstrapped with LIDAR, video and other sensors. I feel you're alluding to Tesla, but Tesla could never have this outcome since they never had a LIDAR phase.
  (edit - I'm referring to deployed Tesla vehicles, I don't know what their research fleet comprises, but other commenters explain that this fleet does collect LIDAR)
  [-]
  - smallmancontrov 5 hours ago
    They can and they do.
    https://youtu.be/LFh9GAzHg1c?t=872
    They've also built it into a full neural simulator.
    https://youtu.be/LFh9GAzHg1c?t=1063
    I think what we are seeing is that they both converged on the correct approach, one of them decided to talk about it, and it triggered disclosure all around since nobody wants to be seen as lagging.
    [-]
    - tfehring 4 hours ago
      I watched that video around both timestamps and didn't see or hear any mention of LIDAR, only of video.
      [-]
      - smallmancontrov 4 hours ago
        Exactly: they convert video into a world model representation suitable for 3D exploration and simulation without using LIDAR (except perhaps for scale calibration).
        [-]
        tfehring 4 hours ago
        My mistake - I misinterpreted your comment, but after re-reading more carefully, it's clear that the video confirms exactly what you said.
    - IhateAI_3 4 hours ago
      tesla is not impressive, I would never put my child in one
  - yakz 5 hours ago
    Tesla does collect LIDAR data (people have seen them doing it, it's just not on all of the cars) and they do generate depth maps from sensor data, but from the examples I've seen it is much lower resolution than these Waymo examples.
    [-]
    - justapassenger 5 hours ago
      Tesla does it to map the areas to come up with high def maps for areas where their cars try to operate.
      [-]
      - vardump 5 hours ago
        Tesla uses lidar to train their models to generate depth data out of camera input. I don’t think they have any high definition maps.
- ActorNightly 5 hours ago
  The purpose of lidar is to provide error correction when you need it most in terms of camera accuracy loss.
  Humans do this, just in the sense of depth perception with both eyes.
  [-]
  - robotresearcher 2 hours ago
    Human depth perception uses stereo out to only about 2 or 3 meters, after which the distance between your eyes is not a useful baseline. Beyond 3m we use context clues and depth from motion when available.
    [-]
    - aylons 2 hours ago
      Thanks, saved some work.
      And I'll add that in practice it is not even that much unless you're doing some serious training, like a professional athlete. For most tasks, the accurate depth perception from this fades around the length of the arms.
    - cyanydeez 2 hours ago
      ok, but a car is a few meters wide, isn't that enough for driving depth perception similar to humans?
      [-]
      - robotresearcher 2 hours ago
        The depths you are trying to estimate are to the other cars, people, turnings, obstacles, etc. Could be 100m away or more on the highway.
        [-]
        cyanydeez 18 minutes ago
        ok, but the point being made is based on humans' depth perception, whereas a car's basic limitation is the width of the vehicle, so there's missing information if you're trying to figure out whether a car can use cameras to do what human eyes/brains do.
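The baseline question in this subthread can be made concrete with the standard small-angle stereo relation: depth is roughly Z = B / θ for baseline B and disparity angle θ, so a fixed angular measurement error δθ gives a depth error of about Z² · δθ / B. The sketch below only illustrates that scaling. The 1 arcminute resolution, the 6.5 cm eye spacing, and the 1.5 m camera spacing are assumed placeholder values, not measurements of human vision or of any vehicle.

```python
import math


def stereo_depth_error(distance_m, baseline_m, angular_resolution_rad):
    """First-order stereo depth uncertainty: dZ ~= Z^2 * d_theta / B."""
    return distance_m ** 2 * angular_resolution_rad / baseline_m


# Assumed, illustrative values (see the note above).
ONE_ARCMIN = math.radians(1 / 60)
EYE_BASELINE_M = 0.065
CAR_BASELINE_M = 1.5

for z in (3, 30, 100):
    eye_err = stereo_depth_error(z, EYE_BASELINE_M, ONE_ARCMIN)
    car_err = stereo_depth_error(z, CAR_BASELINE_M, ONE_ARCMIN)
    print(f"{z:>4} m: eye-width baseline ~{eye_err:.2f} m, car-width baseline ~{car_err:.2f} m")
```

Under these assumptions the error grows with the square of distance and shrinks in proportion to the baseline, which is why a wider camera spacing helps at range but does not remove the need for other depth cues.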
  - dbt00 5 hours ago
    (Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance, which is why so many people get simulator sickness from stereoscopic 3d VR)
    [-]
    - wolrah 4 hours ago
      > Always worth noting, human depth perception is not just based on stereoscopic vision, but also with focal distance
      Also subtle head and eye movements, which is something a lot of people like to ignore when discussing camera-based autonomy. Your eyes are always moving around which changes the perspective and gives a much better view of depth as we observe parallax effects. If you need a better view in a given direction you can turn or move your head. Fixed cameras mounted to a car's windshield can't do either of those things, so you need many more of them at higher resolutions to even come close to the amount of data the human eye can gather.
    - CobrastanJorji 3 hours ago
      I keep wondering about the focal depth problem. It feels potentially solvable, but I have no idea how. I keep wondering if it could be as simple as a Magic Eye Autostereogram sort of thing, but I don't think that's it.
      There have been a few attempts at solving this, but I assume that for some optical reason actual lenses need to be adjusted and it can't just be a change in the image? Meta had "Varifocal HMDs" being shown off for a bit, which I think literally moved the screen back and forth. There were a couple of "Multifocal" attempts with multiple stacked displays, but that seemed crazy. Computer Generated Holography sounded very promising, but I don't know if a good one has ever been built. A startup called Creal claimed to be able to use "digital light fields", which basically project stuff right onto the retina, which sounds kinda hogwashy to me but maybe it works?
    - mikepurvis 4 hours ago
      My understanding is that contextual clues are a big part of it too. We see the pitcher wind up and throw a baseball at us more than we stereoscopically track its progress from the mound to the plate.
      More subtly, a lot of depth information comes from how big we expect things to be, since everyday life is full of things we intuitively know the sizes of, frames of reference in the form of people, vehicles, furniture, etc. This is why the forced perspective of theme park castles is so effective: our brains want to see those upper windows as full sized, so we see the thing as 2-3x bigger than it actually is. And in the other direction, a lot of buildings in Las Vegas are further away than they look because hotels like the Bellagio have large black boxes on them that group a 2x2 block of the actual room windows.
    - kevindamm 4 hours ago
      Actually the reason people experience vection in VR is not focal depth but the dissonance between what their eyes are telling them and what their inner ear and tactile senses are telling them.
      It's possible they get headaches from the focal length issues but that's different.
  - pants2 5 hours ago
    Another way humans perceive depth is by moving our heads and perceiving parallax.
  - menaerus 5 hours ago
    How expensive is their lidar system?
    [-]
    - hangonhn 5 hours ago
      Hesai has driven the cost into the $200 to $400 range now. That said, I don't know what the ones needed for driving cost. Either way we've gone from thousands or tens of thousands into the hundreds-of-dollars range now.
      [-]
      - bragr 4 hours ago
        Looking at prices, I think you are wrong and automotive Lidar is still in the 4 to 5 figure range. HESAI might ship Lidar units that cheap, but automotive grade still seems quite expensive: https://www.cratustech.com/shop/lidar/
        [-]
        tzs 3 hours ago
        Those are single unit prices. The AT128 for instance, which is listed at $6250 there and widely used by several Chinese car companies was around $900 per unit in high volume and over time they lowered that to around $400.
        The next generation of that, the ATX, is the one they have said would be half that cost. According to regulator filings in China BYD will be using this on entry level $10k cars.
        Hesai got the price down for their new generation by several optimizations. They are using their own designs for lasers, receivers, and driver chips which reduced component counts and material costs. They have stepped up production to 1.5 million units a year giving them mass production efficiencies.
        [-]
        bragr 2 hours ago
        That model only has a 120 degree field of view so you'd need 3-4 of them per car (plus others for blind spots, they sell units for that too). That puts the total system cost in the low thousands, not the 200 to 400 stated by GP. I'm not saying it hasn't gotten cheaper or won't keep getting cheaper, it just doesn't seem that cheap yet.
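The system-cost arithmetic in this exchange can be written out as a quick back-of-the-envelope calculation. It uses the per-unit prices quoted in the thread ($400 and $900 at volume, $6250 list) and the 120 degree field of view. The assumption of zero overlap between units is a simplification, and the blind-spot unit count and price in the last line are hypothetical since the thread gives no figure for them.

```python
from math import ceil


def units_for_full_coverage(fov_deg):
    """Minimum number of sensors to cover 360 degrees, assuming no overlap."""
    return ceil(360 / fov_deg)


def system_cost(unit_price, fov_deg, extra_units=0, extra_unit_price=0.0):
    """Total sensor cost: main units for 360-degree coverage plus optional extras."""
    return units_for_full_coverage(fov_deg) * unit_price + extra_units * extra_unit_price


print(units_for_full_coverage(120))   # main units needed at 120 degrees each
print(system_cost(400, 120))          # at the ~$400 high-volume price
print(system_cost(900, 120))          # at the earlier ~$900 high-volume price
print(system_cost(6250, 120))         # at the single-unit list price
# Hypothetical: four blind-spot units at an assumed $150 each.
print(system_cost(400, 120, extra_units=4, extra_unit_price=150))
```

With real-world overlap between adjacent sensors the unit count moves toward the "3-4" mentioned above, which raises each total accordingly.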
    - jmux 4 hours ago
      Waymo does their LiDAR in-house, so unfortunately we don’t know the specs or the cost
      [-]
      - nerdsniper 4 hours ago
        Otto and Uber and the CEO of https://pronto.ai do though (tongue-in-cheek)
        > Then, in December 2016, Waymo received evidence suggesting that Otto and Uber were actually using Waymo’s trade secrets and patented LiDAR designs. On December 13, Waymo received an email from one of its LiDAR-component vendors. The email, which a Waymo employee was copied on, was titled OTTO FILES and its recipients included an email alias indicating that the thread was a discussion among members of the vendor’s “Uber” team. Attached to the email was a machine drawing of what purported to be an Otto circuit board (the “Replicated Board”) that bore a striking resemblance to – and shared several unique characteristics with – Waymo’s highly confidential current-generation LiDAR circuit board, the design of which had been downloaded by Mr. Levandowski before his resignation.
        The presiding judge, Alsup, said, "this is the biggest trade secret crime I have ever seen. This was not small. This was massive in scale."
        (Pronto connection: Levandowski got pardoned by Trump and is CEO of Pronto autonomous vehicles.)
        https://arstechnica.com/tech-policy/2017/02/waymo-googles-se...
      - ra7 3 hours ago
        We know Waymo reduced their LiDAR price from $75,000 to ~$7500 back in 2017 when they started designing them in-house: https://arstechnica.com/cars/2017/01/googles-waymo-invests-i...
        That was 2 generations of hardware ago (4th gen Chrysler Pacificas). They are about to introduce 6th gen hardware. It's a safe bet that it's much cheaper now, given how mass produced LiDARs cost ~$200.
    - eptcyka 5 hours ago
      Less than the lives it saves.
    - xnx 5 hours ago
      Cheaper every year.
      [-]
      - hijnksforall 4 hours ago
        Exactly.
        Tesla told us their strategy was vertical integration and scale to drive down all input costs in manufacturing these vehicles...
        ...oh, except lidar, that's going to be expensive forever, for some reason?
  - SecretDreams 5 hours ago
    > Humans do this, just in the sense of depth perception with both eyes.
    Humans do this with vibes and instincts, not just depth perception. When I can't see the lines on the road because there's too much snow, I can still interpret where they would be based on my familiarity with the roads and my implicit knowledge of how roads work. We do similar things for heavy rain or fog, although sometimes those situations truly necessitate pulling over or slowing down and turning on your 4s - lidar might genuinely give an advantage there.
    [-]
    - pookeh 5 hours ago
      That’s the purpose of the neural networks
      [-]
      - array_key_first 4 hours ago
        Yes and no - vibes and instincts isn't just thought, it's real senses. Humans have a lot of senses; dozens of them. Including balance, pain, sense of passage of time, and body orientation. Not all of these senses are represented in autonomous vehicles, and it's not really clear how the brain mashes together all these senses to make decisions.
- mycall 5 hours ago
  That is still important for safety reasons in case someone uses a LiDAR jamming system to try to force you into an accident.
  [-]
  - etrautmann 5 hours ago
    It’s way easier to “jam” a camera with bright light than a lidar, which uses both narrow band optical filters and pulsed signals with filters to detect that temporal sequence. If I were an adversary, going after cameras is way way easier.
    [-]
    - sroussey 4 hours ago
      Oh yeah, point a q-beam at a Tesla at night, lol. Blindness!
  - Jyaif 5 hours ago
    If somebody wants to hurt you while you are traveling in a car, there are simpler ways.
- shihab 5 hours ago
  I think there are two steps here: converting video to sensor data input, and using that sensor data to drive. Only the second step will be handled by cars on road, first one is purely for training.
- sschueller 3 hours ago
  Autonomous cars need to be significantly better than humans to be fully accepted especially when an accident does happen. Hence limiting yourself to only cameras is futile.
- dooglius 4 hours ago
  They may be trying to suggest that, but that claim does not follow from the quoted statement.
- uejfiweun 5 hours ago
  I've always wondered... if Lidar + Cameras is always making the right decision, you should theoretically be able to take the output of the Lidar + Cameras model and use it as training data for a Camera only model.
  [-]
  - olex 5 hours ago
    That's exactly what Tesla is doing with their validation vehicles, the ones with Lidar towers on top. They establish the "ground truth" from Lidar and use that to train and/or test the vision model. Presumably more "test", since they've most often been seen in Robotaxi service expansion areas shortly before fleet deployment.
    [-]
    - bob_theslob646 5 hours ago
      Is that exactly true though? Can you give a reference for that?
      [-]
      - olex 5 hours ago
        I don't have a specific source, no. I think it was mentioned in one of their presentations a few years back, that they use various techniques for "ground truth" for vision training, among those was time series (depth change over time should be continuous etc) and iirc also "external" sources for depth data, like LiDAR. And their validation cars equipped with LiDAR towers are definitely being seen everywhere they are rolling out their Robotaxi services.
        [-]
        senordevnyc 2 hours ago
        > are definitely being seen everywhere they are rolling out their Robotaxi services
        So...nowhere?
  - __alexs 5 hours ago
    > you should theoretically be able to take the output of the Lidar + Cameras model and use it as training data for a Camera only model.
    Why should you be able to do that exactly? Human vision is frequently tricked by its lack of depth data.
    [-]
    - scarmig 5 hours ago
      "Exactly" is impossible: there are multiple Lidar samples that would map to the same camera sample. But what training would do is build a model that could infer the most likely Lidar representation from a camera representation. There would still be cases where the most likely Lidar for a camera input isn't a useful/good representation of reality, e.g. a scene with very high dynamic range.
  - dbcurtis 4 hours ago
    No, I don't think that will be successful. Consider a day where the temperature and humidity are just right to make tail pipe exhaust form dense fog clouds. That will be opaque or nearly so to a camera, transparent to a radar, and I would assume something in between to a lidar. Multi-modal sensor fusion is always going to be more reliable at classifying some kinds of challenging scene segments. It doesn't take long to imagine many other scenarios where fusing the returns of multiple sensors is going to greatly increase classification accuracy.
  - etrautmann 5 hours ago
    Sure, but those models would never have online access to information only provided in lidar data…
    [-]
    - tfehring 4 hours ago
      No, but if you run a shadow or offline camera-only model in parallel with a camera + LIDAR model, you can (1) measure how much worse the camera-only model is so you can decide when (if ever) it's safe enough to stop installing LIDAR, and (2) look at the specific inputs for which the models diverge and focus on improving the camera-only model in those situations.
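The shadow-model comparison described above might look something like this in miniature. Everything here is a hypothetical stand-in: the two "models" are toy functions returning a single depth estimate per frame, and the frame fields, bias values, and threshold are invented for illustration. The point is only the shape of the loop: run both models on the same input, measure the disagreement, and surface the frames where they diverge most.

```python
def divergence_report(frames, fused_model, camera_model, threshold_m):
    """Compare two depth estimators frame by frame.

    Returns the mean absolute disagreement in meters and the ids of frames
    whose disagreement exceeds threshold_m, worst first.
    """
    errors = []
    for frame_id, frame in frames:
        reference = fused_model(frame)   # camera + lidar estimate
        candidate = camera_model(frame)  # shadow camera-only estimate
        errors.append((abs(reference - candidate), frame_id))
    mean_error = sum(err for err, _ in errors) / len(errors)
    flagged = sorted([pair for pair in errors if pair[0] > threshold_m], reverse=True)
    return mean_error, [frame_id for _, frame_id in flagged]


# Toy stand-ins (hypothetical): the fused model returns the labeled depth,
# the camera-only model is slightly off normally and badly off under glare.
def fused_model(frame):
    return frame["depth_m"]


def camera_model(frame):
    return frame["depth_m"] + (8.0 if frame["glare"] else 0.5)


frames = [
    ("f1", {"depth_m": 20.0, "glare": False}),
    ("f2", {"depth_m": 40.0, "glare": True}),
    ("f3", {"depth_m": 10.0, "glare": False}),
]

mean_error, flagged = divergence_report(frames, fused_model, camera_model, threshold_m=2.0)
print(mean_error, flagged)
```

The mean covers point (1), how much worse the camera-only model is overall, and the flagged list covers point (2), the specific inputs worth targeting for improvement.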
caycep 1 hour ago
All this work is impressive, but I'd rather have better trains
[-]
- scoofy 1 hour ago
  As someone who lives in the Bay Area we already have trains, and they're literally past the point of bankruptcy because they (1) don't actually charge enough to maintain the variable cost of operations, (2) don't actually make people pay at all, and (3) don't actually enforce any quality of life concerns short of breaking up literal fights. All of this creates negative synergies that push a huge, mostly silent segment of the potential ridership away from these systems.
  So many people advocate for public transit, but are unwilling to deal with the current market tradeoffs and decisions people are making on the ground. As long as that keeps happening, expect modes of transit -- like Waymo -- that deliver the level of service that they promise to keep exceeding expectations.
  I've spent my entire adult life advocating for transportation alternatives, and at every turn in America, the vast majority of other transit advocates just expect people to be okay with anti-social behavior going completely unenforced, and expect "good citizens" to keep paying when the expected value for any rational person is to engage in freeloading. Then they point to "enforcing the fare box" as a tradeoff between money to collect vs cost of enforcement, when the actual tradeoff is the signalling to every anti-social actor in the system that they can do whatever they want without any consequences.
  I currently only see a future in bike-share, because it's the only system that actually delivers on what it promises.
  [-]
  - doctoboggan 1 hour ago
    > they (1) don't actually charge enough to maintain the variable cost of operations
    Why do you expect them to make money? Roads don't make money and no one thinks to complain about that. One of the purposes of government is to make investment in things that have more nebulous returns. Moving more people to public transit makes better cities, healthier and happier citizens, stronger communities, and lets us save money on road infrastructure.
    [-]
    - scoofy 55 minutes ago
      >Why do you expect them to make money?
      I don't.
      That's why I said "variable cost of operations."
      If a system doesn't generate enough revenue to cover the variable costs of operation, then every single new passenger drives the system closer to bankruptcy. The more "successful" the system is -- the more people depend on it -- the more likely it is to fail if anything happens to the underlying funding source, like a regular old local recession. This simple policy decision can create a downward economic spiral when a recession leads to service cuts, which leads to people unable to get to work reliably, which creates more economic pain, which leads to a bigger recession... rinse/repeat. This is why a public transit system should cover variable costs so that a successful system can grow -- and shrink -- sustainably.
      When you aren't growing sustainably, you open yourself up to the whims of the business cycle literally destroying your transit system. It's literally happening right now with SF MUNI, where we've had so many funding problems, that they've consolidated bus lines. I use the 38R, and it's become extremely busy. These busses are getting so packed that people don't want to use them, but the point is they can't expand service because each expansion loses them more money, again, because the system doesn't actually cover those variable costs.
      The public should be 100% completely covering the fixed capital costs of the system. Ideally, while there is a bit of wiggle room, the ridership should be covering 100% of the variable capital costs. That way the system can expand when it's successful, and contract when it's less popular. Right now in the Bay Area, you have the worst of both worlds, you have an underutilized system with absolutely spiraling costs, simply because there is zero connection between "people actually wanting to use the system" and "where the money comes from."
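      To put rough symbols on the argument above (the notation is only an illustration, not taken from any agency budget): let K be the fixed cost, c the variable cost per trip, f the average fare per trip, and n the number of trips. The subsidy S the system needs is then roughly:

      ```latex
      S(n) = K + n\,(c - f), \qquad \frac{dS}{dn} = c - f
      ```

      If f < c, every additional trip raises the required subsidy by c - f, so growth makes the funding gap worse. If f >= c, the subsidy never exceeds K no matter how ridership moves. With made-up numbers, c = $5 and f = $2 means each extra trip adds $3 to the gap.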
  - martinald 17 minutes ago
    You're definitely right on (2) and (3). I've used many transit systems across the world (including TransMilenio in Bogota and other latam countries "renowned" for crime) and I have never felt as unsafe as I have using transit in the SFBA. Even standing at bus stops draws a lot of attention from people suffering with serious addiction/mental health problems.
    1) is a bit simplistic though. I don't know of any European system that would cover even operating costs out of fare/commercial revenue. Potentially the London Underground - but not London buses. UK National Rail had higher success rates.
    The better way to look at it imo is to also account for the economic loss from congestion/abandoned commutes. To do a ridiculous hypothetical, London would collapse entirely if it didn't have transit. Perhaps 30-40% of inner London could commute by car (or walk/bike), so the economic benefit of that variable transit cost is in the hundreds of billions a year (compared to a small subsidy).
    It's not the same in SFBA so I guess it's far easier to just "write off" transit like that, it is theoretically possible (though you'd probably get some quite extreme additional congestion on the freeways as even that small % moving to cars would have an outsized impact on additional congestion).
  - caycep 27 minutes ago
    Maybe not BART but the new Caltrain electrification program seems to be a success and ridership and revenue are up
  - caycep 1 hour ago
    Well then invest in those things, then. It would probably cost less than the amount they're spending to make a Waymo World Model.
    [-]
    - scoofy 1 hour ago
      Lighting money on fire by funding an extremely expensive system that most people don't want to use is not an "investment." It's just a good way to make everyone much poorer and worse off than if we'd done nothing. The only way to change things is to convince the electorate that we actually do need rules and enforcement and a sustainable transportation system.
      This isn't just happening in America. Train systems are in rough shape in the UK and Germany too.
      Ebike shares are a much more sustainable system with a much lower cost, and achieve about 90% of the level of service in temperate regions of the country. Even the ski-lift guy in this thread has a much more reasonable approach to public transit, because they actually have extremely low cost for the level of service they provide. Their only real shortcoming is that they don't handle peak demand well, and are not flexible enough to handle their own success.
      [-]
      - caycep 43 minutes ago
        People want to use it everywhere in the world
        [-]
        scoofy 37 minutes ago
        People want to have their cake and eat it too.
- servo_sausage 48 minutes ago
  Trains need well behaved people, otherwise they are shit.
  I don't want to hear tiktok or full volume soap operas blasting at some deaf mouth breather.
  I don't want to be near loud chewing of smelly leftovers.
  I don't want to be begged for money, or interact with high or psychotic people.
  The current culture doesn't allow enforcement of social behaviour: so public transport will always be a miserable containment vessel for the least functional, and everyone with sense avoids the whole thing.
  [-]
  - neysofu 39 minutes ago
    > some deaf mouth breather
    I quite agree with the overall point but can we leave this kind of discourse on X, please? It doesn't add much, it just feels caustic for effect and engagement farming.
- chufucious 1 hour ago
  Me too but given our extensive car brain culture, Waymo is an amazing step to getting more drivers & cars off the road, and to further cement future generations not ever needing to drive or own cars
- joenot443 25 minutes ago
  I think future generations will resent us for bureaucratizing our way out of the California HSR.
- andoando 1 hour ago
  Ski lifts man, ski lifts all over the city
  [-]
  - bryan_w 35 minutes ago
    > Ski lifts man, ski lifts all over the city
    Don't they have those somewhere in South America?
  - underdeserver 1 hour ago
    What a glorious utopia we could have
- xnx 1 hour ago
  Isn't a vehicle that goes from anywhere to anywhere on your own schedule, safely, privately, cleanly, and without billions in subsidies better?
  [-]
  - anigbrowl 1 hour ago
    I don't think individual vehicles can ever achieve the same environmental economies of scale as trains. Certainly they're far more convenient (especially for short-haul journeys) but I also think they're somewhat alienating, in that they're engineering humans out of the loop completely which contributes to social atomization.
    [-]
    - xnx 1 hour ago
      > I don't think individual vehicles can ever achieve the same environmental economies of scale as trains.
      I think you'd be surprised. Look at the difference in cost per passenger mile.
  - appreciatorBus 1 hour ago
    Trains only require subsidies in a world where human & robot cars are subsidized.
    As soon as a mode of transport actually has to compete in a market for scarce & valuable land to operate on, trains and other forms of transit (publicly or privately owned) win every time.
  - Hikikomori 11 minutes ago
    >without billions in subsidies
    Is there a magic road wand?
  - kentiko 54 minutes ago
    Cars don't work in dense places.
  - g947o 1 hour ago
    Not necessarily, and your premise is incorrect.
  - kidk 1 hour ago
    Billions of subsidies? I'm confused: are you talking about cars or trains?
    [-]
    - xnx 1 hour ago
      No major US public transportation system is fully paid for by riders.
      [-]
      - semiquaver 56 minutes ago
        Yep. https://www.transitwiki.org/TransitWiki/index.php/Farebox_Re... is a sobering reminder that many cities’ public transportation would cost $20-50 per trip if paid entirely by riders and thus could not exist without subsidy.
      - JimmyBuckets 1 hour ago
        That includes cars on public roads.
      - caycep 29 minutes ago
        NYC congestion pricing seems to be working quite well though, and probably helps offset MTA costs.
        [-]
        xnx 1 minute ago
        NYC "congestion" pricing (actually cordon pricing) is a good idea. Would be great to see more road use fees proportional to use (distance, weight^3, etc.).
ra7 5 hours ago
The novel aspect here seems to be 3D LiDAR output from 2D video using post-training. As far as I'm aware, no other video world models can do this.
IMO, access to DeepMind and Google infra is a hugely understated advantage Waymo has that no other competitor can replicate.
[-]
- codexb 3 hours ago
  3d from moving 2d images has been a thing for decades.
  [-]
  - ra7 3 hours ago
    This is 3D LiDAR output (multimodal) from 2D images.
    [-]
    - promiseofbeans 2 hours ago
      LiDAR is the technology used to do spatial capture. The output is just point clouds of surfaces. So they’re generating surface point clouds from video
- moffkalast 19 minutes ago
  It's not unheard of, there are a handful [0] of metric monodepth methods that output data that's not unlike a really inaccurate 3D lidar, though theirs certainly looks SOTA.
  [0] https://github.com/YvanYin/Metric3D
joshuamerrill 3 hours ago
It’s impressive to see simulation training for floods, tornadoes, and wildfires. But it’s also kind of baffling that a city full of Waymos all seemed to fail simultaneously in San Francisco when the power went out on Dec 22.
A power outage feels like a baseline scenario—orders of magnitude more common than the disasters in this demo. If the system can’t degrade gracefully when traffic lights go dark, what exactly is all that simulation buying us?
[-]
- GoatOfAplomb 2 hours ago
  All this simulation buys a single vehicle that drives better. That failure was a fleet-wide event (overloading the remote assistance humans).
  That is, both are true: this high-fidelity simulation is valuable and it won't catch all failure modes. Or in other words, it's still on Waymo for failing during the power outage, but it's not uniquely on Waymo's simulation team.
- flutas 3 hours ago
  They've also been seen driving directly into flood waters, with one driving through the middle of a flooded parking lot.
  https://www.reddit.com/r/SelfDrivingCars/comments/1pem9ep/hm...
nightpool 2 hours ago
Interesting, but it feels like it's going to cope very poorly with actually safety-critical situations. Having a world model that's trained on successful driving data feels like it's going to "launder" a lot of implicit assumptions that would cause a car to get into a crash in real life (e.g. there's probably no examples in the training data where the car is behind a stopped car, and the driver pulls over to another lane and another car comes from behind and crashes into the driver because it didn't check its blindspot). These types of subtle biases are going to make AI-simulated world models a poor fit for training safety systems where failure cannot be represented in the training data, since they basically give models "free rein" to do anything that couldn't be represented in world model training.
[-]
- 420official 2 hours ago
  You're forgetting that they are also training with real data from the 100+ million miles they've driven on real roads with riders, and using that data to train the world model AI.
  > there's probably no examples in the training data where the car is behind a stopped car, and the driver pulls over to another lane and another car comes from behind and crashes into the driver because it didn't check its blindspot
  This specific scenario is in the examples: https://videos.ctfassets.net/7ijaobx36mtm/3wK6IWWc8UmhFNUSyy...
  It doesn't show the failure mode, it demonstrates the successful crash avoidance.
- MillionOClock 1 hour ago
  While there most likely is going to be some bias in the training of those kinds of models, we can also hope that transfer learning from other non-driving videos will at least help generate something close enough to the very real but unusual situations you are mentioning. We could imagine an LLM serving as some kind of fuzzer to create a large variety of prompts for the world model, which as we can see in the article seems pretty capable at generating fictive scenarios when asked to.
  As always tho the devil lies in the details: is an LLM based generation pipeline good enough? What even is the definition of "good enough"? Even with good prompts will the world model output something sufficiently close to reality so that it can be used as a good virtual driving environment for further training / testing of autonomous cars? Or do the kind of limitations you mentioned still mean subtle but dangerous imprecisions will slip through and cause too poor data distribution to be a truly viable approach?
  My personal feeling is that we will land somewhere in between: I think approaches like this one will be very useful, but I also don't think the current state of AI models mean we can have something 100% reliable with this.
  The question is: is 100% reliability a realistic goal? Human drivers are definitely not 100% reliable. If we come up with a solution 10x more reliable than the best human drivers, that maybe also has some hard proof that it cannot have certain classes of catastrophic failure modes (probably with verified code based approaches that for instance guarantee that even if the NN output is invalid the car doesn't try to make moves out of a verifiably safe envelope) then I feel like the public and regulators would be much more inclined to authorize full autonomy.
hazrmard 4 hours ago
cue the bell curve meme for learning autonomy:
```
                 ____.----.____
          ______/              \______
    _____/                            \_____
    ________________________________________

    (simulations)  (real world data)  (simulations)
```
Seems like it, no?
We started with physics-based simulators for training policies. Then put them in the real world using modular perception/prediction/planning systems. Once enough data was collected, we went back to making simulators. This time, they're physics "informed" deep learning models.
[-]
- crazygringo 2 hours ago
  That's a very interesting way of looking at it. Yes, you start with simulating something simpler than the real world. Then you use the real world. Then you need to go back to simulations for real-world things that are too rare in the real world to train with.
  Seems like there ought to be a name for this, like so-and-so's law.
  [-]
  - buddhistdude 2 hours ago
    hazrmard's law
mellosouls 5 hours ago
Deepmind's Project Genie under the hood (pun intended). Deepmind & Waymo both Alphabet(Google) subsidiaries obv.
https://deepmind.google/blog/genie-3-a-new-frontier-for-worl...
Discussed here, e.g.
Genie 3: A new frontier for world models (1510 points, 497 comments)
https://news.ycombinator.com/item?id=44798166
Project Genie: Experimenting with infinite, interactive worlds (673 points, 371 comments)
https://news.ycombinator.com/item?id=46812933
[-]
- paxys 4 hours ago
  Regardless of the corporate structure DeepMind is a lot more than just another Alphabet subsidiary at this point considering Demis Hassabis is leading all of Google AI.
ok_dad 2 hours ago
I'd like to see Waymo have a few of their Drivers do some sim racing training and then compete in some live events. It wouldn't matter much to me if they were fast at all, I'd like to see them go into the rookie classes in various games and see how they avoid crashes from inexperienced players. I believe that it would be the ultimate "shitty drivers vs. AI" test.
[-]
- JBorrow 2 hours ago
  Racing and street driving are completely different. Racing involves detailed knowledge of vehicle dynamics and grip. Street driving is mainly obstacle recognition and avoidance. No Waymo ever operates anywhere close to the limit of grip, which is where you are all the time when racing.
londons_explore 9 minutes ago
Do Waymo models really use side cameras at only like 4 FPS?
0xTJ 32 minutes ago
Interesting, but I am very sceptical. I'd be interested in seeing actual verified results of how it handles a road with heavy snow, where the only lane references are the wheel tracks of other vehicles, and you can't tell where the road ends and the snow-filled ditch begins.
joshfee 46 minutes ago
It is great being able to generate a much larger universe of possibilities than what they can gather from real world data collection, but I'd be curious to learn how they check that the generated data is a superset of the possibility-space seen in the real world (e.g. confirm that their models closely match what is seen in the real world too)
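One rough way to frame that check, as a sketch only: describe each scenario with a small feature vector and measure what fraction of real-world scenarios have a generated scenario nearby. The feature choice, the `coverage` helper, the tolerance, and the sample data below are all hypothetical and say nothing about how Waymo actually validates its model.

```python
import math

def nearest_distance(point, candidates):
    """Euclidean distance from `point` to its closest neighbour in `candidates`."""
    return min(math.dist(point, c) for c in candidates)

def coverage(real, generated, tolerance):
    """Fraction of real scenarios with a generated scenario within `tolerance`."""
    if not real:
        raise ValueError("need at least one real scenario")
    covered = sum(1 for r in real if nearest_distance(r, generated) <= tolerance)
    return covered / len(real)

# Hypothetical hand-made features: (ego speed in m/s, gap to lead vehicle in m)
real_scenarios = [(10.0, 20.0), (25.0, 5.0), (0.0, 2.0)]
generated_scenarios = [(10.5, 19.0), (0.0, 2.5), (30.0, 50.0)]

# Two of the three real scenarios have a generated neighbour within 2.0 units.
score = coverage(real_scenarios, generated_scenarios, tolerance=2.0)
print(f"coverage: {score:.2f}")
```

A score below 1.0 would flag real situations the generator never gets close to, which is the gap being asked about. A real pipeline would need far richer features than two numbers.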
AceJohnny2 3 hours ago
IIUC, there's a confusion of meaning for "World Model", between Waymo/Deepmind's which is something that can create a consistent world (for use to train Waymo's Driver), vs Yann LeCun/Advanced Machine Intelligence (AMI) which is something that can understand a world.
[-]
- xnx 1 hour ago
  I don't think there's a conflict. If you can predict the world you understand it.
jrm4 3 hours ago
1. Still hard not to think that this is a huge waste of time as opposed to something that's a little more like a public transport train-ish thing, i.e. integrate with established infrastructure.
2. No seriously, is the Filipino driver thing confirmed? It really feels like they're trying to bury that.
[-]
- airstrike 3 hours ago
  "The Filipino driver thing" is simply that there's a manual override ability when this profoundly complex and marvelously novel technology gets trapped in edge cases.
  Once it gets unstuck, it runs autonomously.
- frenchy 1 hour ago
  As someone who half-learned to drive in Manila, the idea that they would use Filipino drivers as backups is ironic.
  For context, my "driver's test" was going to the back of the office, and driving some old car backwards and forwards a few meters.
- anigbrowl 1 hour ago
  2. Yes, a Waymo exec described it in a Congressional hearing.
  https://news.ycombinator.com/item?id=46918043
- spaceywilly 3 hours ago
  My view on Waymo and autonomous taxis in general is they will eventually make public transit obsolete. Once there is a robotaxi available to pick up and drop off every passenger directly from a to b, the whole system could be made to be super efficient. It will take time to get there though.
  But eventually I think we will get there. Human drivers will be banned, the roads will be exclusively used by autonomous vehicles that are very efficient drivers (we could totally remove stoplights, for example. Only pedestrian crossing signs would be needed. Robo-vehicles could plug into a city-wide network that optimizes the routing of every vehicle.) At that point, public transit becomes subsidized robotaxi rides. Why take a subway when a car can take you door to door with an optimized route?
  So in terms of why it isn’t a waste of time, it’s a step along the path towards this vision. We can’t flip a switch and make this tech exist, it will happen in gradual steps.
  [-]
  - ianburrell 1 hour ago
    Automated taxis would still be stuck in traffic. Automation gets a couple of times more capacity, but the induced demand and extra cars looking for rides and parking will mean traffic.
    Automation makes public transit better. There will be automated minibuses that are more flexible and frequent than today's buses. Automation also means that buses get a virtual bus lane. Taxis solve the last mile problem, by taking a taxi to the station, riding a train with thousands of people, and then taking more transit.
    Also, we might discover the advantage of human powered transit. Ebikes are more efficient than cars and give health benefits. They will be much safer than automated cars. Could use the extra capacity for bike and bus lanes.
  - sagarm 2 hours ago
    If everyone in NYC tried to commute in a single-occupancy vehicle, there would be gridlock -- AVs or no.
  - rootusrootus 2 hours ago
    > Human drivers will be banned, the roads will be exclusively used by autonomous vehicles
    I basically agree with your premise that public transit as it exists today will be rendered obsolete, but I think this point here is where your prediction hits a wall. I would be stunned if we agreed to eliminate human drivers from the road in my lifetime, or the lifetime of anyone alive today. Waymo is amazing, but still just at the beginning of the long tail.
    [-]
    - xnx 1 hour ago
      > I would be stunned if we agreed to eliminate human drivers from the road in my lifetime
      It basically happened for horses.
      [-]
      - anigbrowl 1 hour ago
        Horses don't vote.
        [-]
        xnx 1 hour ago
        Neither do cars?
        [-]
        anigbrowl 1 hour ago
        Drivers, however, absolutely do. And I do not see enough drivers voting away their own ability to drive any time soon.
        [-]
        xnx 6 minutes ago
        Right, I was pointing out that at some point there was probably a horse-rider constituency as there is a driver constituency today.
        semiquaver 43 minutes ago
        A few years ago I would have considered (and did consider) the notion that manual programming was about to turn into a quaint relic, with computers writing 90%+ of code, preposterous. Once an alternative becomes obviously superior, things can change very fast.
    - Jblx2 2 hours ago
      Is that:
      - I would be stunned if we agree to eliminate human drivers from 100% of roads in the lifetime of anyone alive today.
      or
      - I would be stunned if we agree to eliminate human drivers from 10% of roads...
      ...or is there some other percentage to qualify this? I guess I wouldn't expect there to be a decree that makes it happen all at once for a country. Especially a large country like the U.S. More like, some really dense city will decide to make a tiny core autonomous vehicles only, and then some other cities also do so years later. And then maybe it expands to something larger than just the core after 5 or 10 years. And so on...
- smotched 3 hours ago
  America is not Europe. How would public transport work for the last 1/2 miles?
  [-]
  - goatlover 3 hours ago
    Walking, bikes and scooters.
- iknowstuff 3 hours ago
  Filipino driver is false. Filipino guidance person is true.
  [-]
  - jrm4 56 minutes ago
    The difference being?
- jeffbee 2 hours ago
  They are not trying to "bury" remote assistance at all. They wrote a white paper about it in 2020 and a blog post about it in 2024.
  Anyway you can think it's a waste but they're wasting their money, not yours. If you want a train in your town, go get one. Waymo has only spent, cumulatively, about 4 months of the budgets of American transit agencies. If you had all that money it wouldn't amount to anything.
  [-]
  - jrm4 53 minutes ago
    "At all?"
    Oh come on -- of course they are. That's precisely why you put it in a "white paper" and not, you know, ads.
- hiddencost 3 hours ago
  (2) I really don't understand why people are surprised that Waymo has fallbacks? The fact that they had a team ready to take over as necessary was well known. I've seen a bunch of comments about this and it seems like people are confused.
  [-]
  - anigbrowl 1 hour ago
    I think they're surprised to learn it's being done by a bunch of people on the other side of the world because they don't want to pay American wages.
  - jrm4 54 minutes ago
    I think you sort of fundamentally misunderstand the whole "steak vs sizzle" thing in capitalism?
    The technology "feels" way less cool knowing that there are human backups, which would absolutely in turn make its perceived value go down.
phailhaus 3 hours ago
Finally I understand the use case for Genie 3. All the talk about "you can make any videogame or movie" seems to have been pure distraction from real uses like this: limited, time-boxed simulated footage.
ActorNightly 3 hours ago
This is cool, but they are still not going about it the right way.
It's much easier to build everything into the compressed latent space of physical objects and how they move, and operate from there.
Everyone jumped on the end-2-end bandwagon, which then locks you into the input to your driving model being vision, which means that you have to have things like genie to generate vision data, which is wasteful.
[-]
- sagarm 2 hours ago
  The article is about using the world model to generate simulations, not for controlling the vehicle.
- senordevnyc 2 hours ago
  > This is cool, but they are still not going about it the right way.
  This is legit hilarious to read from some random HN account.
NullHypothesist 5 hours ago
I wonder if they can simulate the Beatles crossing the street at Abbey Road in the late '60s
[-]
- seanhunter 5 hours ago
  As a Londoner who used to have to ride up Abbey Road at least once per week there are people on that crossing pretty much all day every day reproducing that picture. So now Waymo are in Beta in London[1] they have only to drive up there and they'll get plenty of footage they could use for that.
  [1] I've seen a couple of them but they're not available to hire yet and are still very rare.
  [-]
  - permenant 3 hours ago
    Will Google finally fund Christopher Wren's post great fire "wide streets" rebuild of the City?
    [-]
    - ddalex 3 hours ago
      I think we might need another great fire to widen the streets at this point
999900000999 4 hours ago
It doesn't look like they're going to open-source it or anything, but I could imagine this would be great for city planning.
Or the most realistic game of SimCity you could imagine.
fabmilo 3 hours ago
Very impressive work from Waymo. The example of driving with a tornado on the horizon kind of struck my imagination; many people actually panic in such scenarios. I wonder, though, about the compute requirements to run these simulations and produce so many data points.
Kapura 5 hours ago
Interesting that this should come out right as lawmakers are beginning to understand that Waymos have overseas operators making major decisions.
[*] https://futurism.com/advanced-transport/waymos-controlled-wo...
[-]
- FL33TW00D 4 hours ago
  Completely false: https://x.com/i/status/2019213765506670738
  Listen to the statement.
  The operators help when the Waymo is in a "difficult situation".
  Car drives itself 99% of the time, long tail of issues not yet fixed have a human intervene.
  Everyone is making out like it's an RC car, completely false.
  [-]
  - ChadNauseam 4 hours ago
    Whenever something like this comes out, it's a good moment to find people with no critical thinking skills who can safely be ignored. Driving a Waymo like an RC car from the Philippines? You can barely talk over Zoom with someone in the Philippines without bitrate and lag issues.
    [-]
    - anigbrowl 1 hour ago
      Except that's not what the original posters said, rather 'operators making major decisions.' Don't strawman here, it wastes everyone's time.
    - hijnksforall 4 hours ago
      Hacker News has had some of the dumbest Tesla takes of all time. People should be embarrassed about some of the claims that were made here.
      And apparently some people still haven't caught on.
      Have a look if you don't believe me:
      https://hn.algolia.com/?dateRange=custom&page=0&prefix=false...
      [-]
      - Hikikomori 5 minutes ago
        Seems to be some ketamine on that boot.
  - MillionOClock 4 hours ago
    I haven't read anything about this but I would also suppose long distance human intervention cannot be done for truly critical situations where you need a very quick reaction, whereas it would be more appropriate in situations where the car has stopped and is stuck not knowing what to do. Probably just stating the obvious here but indeed this seems like something very different from an RC car kind of situation.
    [-]
    - sroussey 4 hours ago
      It’s not for that. It’s for things like the car drove into a protest area and people are surrounding the car. Or police blocked off an intersection and the car is stuck temporarily with people doing otherwise illegal u-turns or driving the wrong way on a one way road to get out of it.
- thethimble 5 hours ago
  Why is this relevant at all?
  Having humans in the loop at some level is necessary for handling rare edge cases safely.
  [-]
  - AlotOfReading 40 minutes ago
    The word "loop" here has multiple meanings. Only one is what you mean and the other person responding to you has understood another.
    The first is the DDT control loop, what a human driver does. Waymo's remote assistants aren't involved in that. The computer always has responsibility for the safety of the vehicle and decisionmaking while operating, which is why Waymo's humans are remote assistants and not remote drivers. Their safety drivers do participate in the DDT loop, hence the name.
    But there's also another "loop" of human involvement. Sometimes the vehicle doesn't understand the scene and asks humans for advice about the appropriate action to take. It's vaguely similar to captchas. The human will usually confirm the computer's proposed actions, but they can also suggest different actions. The computer uses the advice as a prior to continue operating instead of giving up the DDT responsibility. There's very likely a closely monitored SLA between a few seconds to a few minutes on how long it takes humans to start looking at the scene.
    If something causes the computer to believe the advice isn't safe, it will ignore it. There have been cases where Waymos have erroneously detected collisions and remote assistants were unable to override that decisionmaking. When that happens, a vehicle recovery team is physically sent out to the location. The SLA here is likely between tens of minutes and a couple hours.
  - mrcwinn 5 hours ago
    If that’s true the system isn’t finished. That’s what reasoning is for.
    [-]
    - sroussey 4 hours ago
      Who ever said they were finished? You think they laid off the team since everything is “done”?
pcurve 2 hours ago
Dumb question - Why would Waymo disclose this much information to public and competitors?
[-]
- FuckButtons 1 hour ago
  Given the announcement from a few days ago of Google trying to get external investment, this is their follow up, showing what that investment is good for. Also, it’s pretty light on details that are of much use to competitors. “We made an accurate simulation system to test our system in before deployment” would be pretty mundane if you were talking about any other field of engineering.
- xnx 1 hour ago
  It's easier to build trust for such a safety-critical service when you're more open about how it works and performs. For the complete opposite approach, see Tesla.
- wnevets 1 hour ago
  Maybe to distract from the story that they use remote drivers after one of their cars hit a kid? [1]
  [1] https://people.com/waymo-exec-reveals-company-uses-operators...
  edit: fixed kill -> hit
  [-]
  - docere 1 hour ago
    The child did not die, and suffered only minor injuries: https://abc7.com/post/california-teamsters-call-suspension-w...
    Under the same circumstances (kid suddenly emerging between two parked cars and running out onto the street), it could be debated that the outcome could have been worse if a human were driving.
  - dan-g 1 hour ago
    It’s awful a child was hit, but they only suffered minor injuries [1]. Nowhere in your linked article does it say they were killed.
    [1] https://people.com/waymo-car-hits-child-walking-to-school-du...
anigbrowl 1 hour ago
Seems relevant: Waymo exec admits remote operators in Philippines help guide US Robotaxis
https://news.ycombinator.com/item?id=46918043
KeyBoardG 1 hour ago
and I literally just saw the other headline "Waymo says its robotaxis get help from remote workers in the Philippines"
t1234s 2 hours ago
Could these world models be used to build some sort of endless GranTurismo type street racing game?
[-]
- crazygringo 2 hours ago
  It seems inevitable that they'll soon be used as the starting points for developing almost all video game environments.
  Not for the rendering (that's still way too expensive), but for the initial world generation that gets iteratively refined and then still ultimately gets converted into textured triangles.
LightBug1 35 minutes ago
Have been seeing Waymo test vehicles regularly around central London recently, operating at speed.
For shits and giggles, I did stop randomly while crossing the road and acted like a jerk.
The Waymo did, in fact, stop.
Kudos, Waymo
mgaunard 5 hours ago
Still needs to be trained on the final boss: dense cities with narrow streets.
[-]
- reluctant_dev 5 hours ago
  San Francisco isn't uniformly dense and narrow, but it does have both, and it's run remarkably well so far.
  [-]
  - elliotec 4 hours ago
    Another comment mentioned the Philippines as the manifest frontier. SF is not on the same plane of reality in terms of density or narrow streets as PH, I would argue in comparison it does not have both.
  - smallmancontrov 4 hours ago
    This is the craziest I've seen, but it was 10 months ago which is ~10 years in AI years
    https://www.youtube.com/watch?v=3DWz1TD-VZg
  - fragmede 4 hours ago
    On that specific count, not really. There's a skate park north end of the Mission, and Stevenson St is a two way road that borders it, but it's narrow enough that you need to drive up on the curb to get two vehicles side by side on the street. Waymos can't handle that on a regular basis. Being San Francisco and not London, you can just skip that road, but if you find yourself in a Waymo on that street and are unlucky to have other traffic on it, the Waymo will just have to back up the entire street. Hope there's no one behind you as well as in front of you!
    Anyway, we'll see how the London rollout goes, but I get the impression London's got a lot more of those kinds of roads.
    [-]
    - rootusrootus 4 hours ago
      > Stevenson St is a two way road
      That is extremely narrow, I wonder why the city has not designated it as a one-way street? They've done that for other similarly narrow sections of the same street farther north.
- xnx 5 hours ago
  What would be an example city? Waymo just announced they're ramping up in Boston: https://waymo.com/blog/?modal=short-back-to-boston
  "we’re excited to continue effectively adapting to Boston’s cobblestones, narrow alleyways, roundabouts and turnpikes."
  [-]
  - pants2 5 hours ago
    Any small city in Italy is going to be 10X more challenging than Boston
    [-]
    - threetonesun 5 hours ago
      Depends, which is harder: a narrow street or a three lane one with no obvious lane markers with people double parking?
    - chasd00 1 hour ago
      the absolute chaos of Paris would also be challenging.
    - micromacrofoot 4 hours ago
      and the failure mode for some of them is a steep drop off a cliff
  - zhengyi13 5 hours ago
    Various European cities come to mind: Narrow streets are something of a trope in certain movies/genres.
    [-]
    - jkaptur 5 hours ago
      To be fair, many of those films do not portray human drivers in the best light.
  - ginko 5 hours ago
    Not grandparent but I was rather thinking of medieval city centers in Italy or Spain.
    edit: Case in point:
    https://maps.app.goo.gl/xxYQWHrzSMES8HPL8
    This is an alley in Coimbra, Portugal. A couple years ago I stayed at a hotel in this very street and took a cab from the train station. The driver could have stopped in the praça below and told me to walk 15m up. Instead the guy went all the way up then curved through 5-10 alleys like that to drop me off right in front of my place. At a significant speed as well. It was one of the craziest car rides I've ever experienced.
    [-]
    - stackedinserter 3 hours ago
      Do we really need FSD cars (any cars, actually) in medieval city centers?
- breckinloggins 5 hours ago
  I live in such an area. The route to my house involves steep topography via small windy streets that are very narrow and effectively one-way due to parked cars.
  Human drivers routinely do worse than Waymo, which I take 2 or 3 times a week. Is it perfect? No. Does it handle the situation better than most Lyft or Uber drivers? Yes.
  As a bonus: unlike some of those drivers the Waymo doesn't get palpably angry at me for driving the route.
- dandaka 5 hours ago
  Yes, something like Ho Chi Minh or Mumbai in a peak hour! With lots of bike riders, pedestrians, and livestock at the same roundabout.
- throwaway_20357 4 hours ago
  Like London? https://www.youtube.com/watch?v=KvctCbVEvwQ
- pja 5 hours ago
  Waymo cars are driving around London right now.
  Not taking paying passengers yet though!
- chpatrick 3 hours ago
  They're being trialled in London right now.
- seydor 5 hours ago
  Napoli
- kylehotchkiss 4 hours ago
  Old Delhi is the final boss.
- renewiltord 4 hours ago
  Does it, though? Maybe Dhaka will never get Waymo. The same way you can’t get advanced gene therapy there.
threethirtytwo 1 hour ago
What if we put this mechanism of recording the world on people? We have mics listening to people talking to us and noises we hear.
Also we record body position, actuation and self speech as output. Then we put this on thousands of people to get as much data as Waymo gets.
I mean that’s what we need to imitate AGI, right? I guess the only thing missing is the memory mechanism. We train everything as if it’s an input and output function without accounting for memory.
cratermoon 1 hour ago
Meanwhile. https://eletric-vehicles.com/waymo/waymo-exec-admits-remote-...
LowLevelKernel 3 hours ago
Instructions to load it on WAYMAX simulator?
PeterStuer 5 hours ago
Imagine driving in a Waymo 'out of a raging fire'.
Talk about edge cases.
But, what would you do? Trust the Waymo, or get out (or never get in) at the first sign of trouble?
[-]
- breckinloggins 5 hours ago
  Interesting question. If the Waymo was driving aggressively to remove us from the situation but relatively safely I might stay in it.
  This does bring up something, though: Waymo has a "pull over" feature, but it's hidden behind a couple of touch screen actions involving small virtual buttons and it does not pull over immediately. Instead, it "finds a spot to pull over". I would very much like a big red STOP IMMEDIATELY button in these vehicles.
  [-]
  - bragr 4 hours ago
    >it's hidden behind a couple of touch screen actions involving small virtual buttons and it does not pull over immediately
    It was on the home screen when I've taken it, and when I tested it, it seemed to pull to the first safe place. I don't trust the general public with a stop button.
  - arctic-true 2 hours ago
    I feel like this ends with drunk morons accidentally creating Waymo barricades and totally ruining Mardi Gras
  - tensor 5 hours ago
    Can you not just unlock and open the door? Wouldn't that cause it to immediately stop? Or can you not unlock the door manually? I'd be surprised if there was not an emergency door release.
  - phainopepla2 2 hours ago
    Imagine how many drunk/careless passengers might press it. Stopping in the middle of the street or highway could be a serious safety hazard.
- kylehotchkiss 4 hours ago
  I can! If the Waymo got you into one on the way home because Google didn’t integrate with Watch Duty yet, that’s plausible
ge96 3 hours ago
What are the 5/3 tiles? Cameras?
[-]
- spaceywilly 3 hours ago
  The model generates camera and Lidar data, as if it were a Waymo car that drove through the simulated scenario with its cameras running. This synthetic training data can then be used to train the driving models.
  [-]
  - ge96 2 hours ago
    Wonder how it'll do. The trees change shape (presumably the Lidar patterns do too). I get the premise/why but it seems odd to me (armchair) to use fake data. Real trees don't change shape (in real time) although it can be windy.
    It probably doesn't matter though, "this general blob over there"
AndrewKemendo 4 hours ago
For whatever it’s worth, world models are going to be the dominant computing structure of the future.
I started working heavily on realizing them in 2016, and they are unquestionably (finally) the future of AI.
01100011 3 hours ago
Nvidia has had this for years. What am I missing?
tgrowazay 4 hours ago
This page crashes my browser.
Vivaldi 7.8.3931.63 on iOS 26.2.1 iPhone 16 pro
mempko 5 hours ago
One interesting thing from this paper is how big of a LiDAR shadow there is around the Waymo car, which suggests they rely on cameras for anything close (maybe they have radar too?). Seems LiDAR is only useful for distant objects.
[-]
- xnx 4 hours ago
  At least 6 radar units: https://support.google.com/waymo/answer/9190838?hl=en
jimt1234 5 hours ago
This might be relevant to the timing here: https://eletric-vehicles.com/waymo/waymo-exec-admits-remote-...
Vosporos 4 hours ago
The new frontier is manifestly the Philippines.
[-]
- elliotec 4 hours ago
  Can you explain? I lived in PH, and my guess is that you mean navigating and modeling the unending and constantly changing chaos of the street systems (and lack thereof) is going to be a monumental task which I completely agree with. It would be an impressive feat if possible.
  Edit: or are you talking about the allegations of workers in the Philippines controlling the Waymos: https://futurism.com/advanced-transport/waymos-controlled-wo... I guess both are valid.
m0llusk 5 hours ago
Seems interesting, but why is it broken? Waymo repeatedly directed multiple automated vehicles into the private alley off of 5th near Brannan in SF even after being told none of them have any business there ever, period. If they can sense the weather and stuff then maybe they could put out a virtual sign or fence that notes what appears to be a road is neither a through way nor open to the public? I'm really bullish on automated driving long term, but now that vehicles are present for real we need to start to think about potentially getting serious about finding some way to get them to comply with the same laws that limit what people can do.
[-]
- tanseydavid 4 hours ago
  >> get them to comply with the same laws that limit what people can do
  I think you meant, "Attempt" to limit what people can do.
  Driving in SF (for example) provides many opportunities to see "free will" exerted in the most extreme ways -- laws be damned.
selenajennifer 1 hour ago
[dead]
andrewmcwatters 5 hours ago
[dead]
xvxvx 3 hours ago
[flagged]
devmor 5 hours ago
Wow, interesting timing for this PR blast considering the admission in the Senate Commerce Committee hearing. Not transparent at all!
[-]
- WarmWash 5 hours ago
  What was the admission? That they use cheap labor to provide the Waymo clarity when it is confused? That has been known for a long time.
turtlesdown11 5 hours ago
How many Filipinos, who do not have US drivers licenses, does it take to drive this new model?
add-sub-mul-div 6 hours ago
"Autonomous"
https://cybernews.com/news/waymo-overseas-human-agents-robot...
[-]
- Rebelgecko 5 hours ago
  My understanding is that support is basically playing an RTS (point and click), not a 1P driving game. Which makes sense: if they were directly controlling the vehicles they'd put support in Central America for better latency, like the food delivery bot drivers
  [-]
  - jonas21 5 hours ago
    Yeah. Waymo described how this works a couple of years ago:
    https://waymo.com/blog/2024/05/fleet-response/
    [-]
    - turtlesdown11 5 hours ago
      Right, I totally believe Waymo, just like I totally believed Amazon's checkout-less stores.
- TulliusCicero 5 hours ago
  This isn't news, they've always acknowledged that they have remote navigators that tell the cars what to do when they get stuck or confused. It's just that they don't directly drive the car.
smcl 3 hours ago
The Waymo driving model: hire some guys in the Philippines: https://futurism.com/advanced-transport/waymos-controlled-wo...
[-]
- ASalazarMX 3 hours ago
  This is not false, but gives the wrong idea that foreigners are driving them in real time.
  > After being pressed for a breakdown on where these overseas operators operate, Peña said he didn’t have those stats, explaining that some operators live in the US, but others live much further away, including in the Philippines.
  > “They provide guidance,” he argued. “They do not remotely drive the vehicles. Waymo asks for guidance in certain situations and gets an input, but the Waymo vehicle is always in charge of the dynamic driving tasks, so that is just one additional input.”
- andreyk 2 hours ago
  This is quite misleading... From the article:
  “When the Waymo vehicle encounters a particular situation on the road, the autonomous driver can reach out to a human fleet response agent for additional information to contextualize its environment,” the post reads. “The Waymo Driver [software] does not rely solely on the inputs it receives from the fleet response agent and it is in control of the vehicle at all times.” [from Waymo's own blog https://waymo.com/blog/2024/05/fleet-response/]
  What's the problem with this?
- ddalex 3 hours ago
  Have you read the article? The guys in the Philippines are providing high level executive indications; they don't remotely drive the car or have any low level control of it.
- themafia 3 hours ago
  Dig deep enough into any "AI" idea and you'll find the bottom end of the scam looks exactly like this.
  We've simply relabeled the "Mechanical Turk" into "AI."
  The rest is built on stolen copyrighted data.
  The new corporate model: "just lie, the government clearly doesn't give a shit anymore."
OGEnthusiast 6 hours ago
What's going to happen to all the millions of drivers who will lose their job overnight? In a country with 100 million guns, are we really sure we've thought this through?
[-]
- 0x457 6 hours ago
  Yes, let's stop all progress and roll-back all automation to keep hypothetical angry people with guns happy.
  [-]
  - Phenomenit 5 hours ago
    Seems like a good description of current events.
  - runarberg 5 hours ago
    Autonomous private cars are not the technological progress you think they are. We’ve had autonomous trains for decades, and while it provides us with a more efficient and cost effective public transit system, it didn’t open the doors for the next revolutionary technology.
    Self driving cars is a dead end technology, that will introduce a whole host of new problems which are already solved with public transit, better urban planning, etc.
    [-]
    - sekai 5 hours ago
      > We’ve had autonomous trains for decades
      Trains need tracks; cars already have the infrastructure to drive on.
      > Self driving cars is a dead end technology, that will introduce a whole host of new problems which are already solved with public transit, better urban planning, etc.
      Self driving cars will literally become a part of public transit
      [-]
      - runarberg 4 hours ago
        > Self driving cars will literally become a part of public transit
        I’ve been hearing people say that for almost 15 years now. I believe it when I see it.
        [-]
        - tanseydavid 4 hours ago
          >> I believe it when I see it.
          I'm willing to wager that you might not actually believe it at that point either.
    - drewmate 3 hours ago
      Unfortunately, many of our urban areas have already been planned (for better or worse) for cars and not the density that makes public transit viable. Autonomous cars will solve a host of problems for the old, young, mobility limited, and just about everyone else.
      It will prove disruptive to the driving industry, but I think we’ve been through worse disruptions and fared the better for it.
    - xnx 4 hours ago
      > Self driving cars is a dead end technology
      I would be happy to bet on some strict definition of your claim.
    - pnut 5 hours ago
      Nope. Humans are statistically fallible and their attention is too valuable to be obliged to a mundane task like executing navigation commands. Redesigning and rebuilding city transportation infrastructure isn't happening, look around. Also personal agency limits public transportation as a solution.
      [-]
      - askl 3 hours ago
        > Redesigning and rebuilding city transportation infrastructure isn't happening, look around.
        The US already did it once (just in the wrong direction) by redesigning all cities to be unfriendly to humans and only navigable by cars. It should be technically possible to revert that mistake.
      - runarberg 4 hours ago
        Unlike autonomous driving, public transit is a proven solution employed in thousands of cities around the world, on various scales, economies, etc.
        > Redesigning and rebuilding city transportation infrastructure isn't happening, look around.
        We have been redesigning and rebuilding city transportation infrastructure since we had cities. Where I live (Seattle) they are opening a new light rail bridge crossing just next month (first rail over a floating bridge; which is technologically very interesting), and two new rail lines are being planned. In the 1960s the Bay Area completely revolutionized their transit system when they opened BART.
        I think you are simply wrong here.
        [-]
        - tanseydavid 4 hours ago
          >> In the 1960s the Bay area completely revolutionized their transit system when they opened BART.
          66 years later we see California struggling terribly with implementation of a high-speed rail system -- where the placement/location of the infrastructure largely is targeted for areas far less dense than the Bay Area.
          I don't think there is any single reason why this is so much more difficult now than it was in 1960 -- but clearly things have changed quite a lot in that time.
- paxys 5 hours ago
  Waymo has been operating since 2009 (17 years ago, counting its start as Google's self-driving car project), and replacing drivers on the road will take many more decades. Nothing is happening "overnight".
- skybrian 5 hours ago
  If Waymo's history is any guide, it's not going to happen overnight. Even in San Francisco, their market share is only 20-30%.
- sekai 5 hours ago
  > What's going to happen to all the millions of drivers who will lose their job overnight? In a country with 100 million guns, are we really sure we've thought this through?
  Same was said about electricity, or the internet.
  [-]
  - password54321 4 hours ago
    People keep referencing history but this really is unprecedented. We are approaching singularity and many people will become obsolete in all areas. There are no new hypothetical jobs waiting on the horizon.
- sroussey 4 hours ago
  Reminds me of the history of radio and the absolute uproar that someone played a record on the radio rather than live performances!!
- lanthissa 5 hours ago
  same thing that happened during the industrial revolution, you pay enough of them to 'protect the law' vs the rest.
- sigspec 5 hours ago
  UBI or war, or both
- VirusNewbie 5 hours ago
  I don't think Uber goes out of business. There is probably a sweet spot for Waymo's steady state cars, and you STILL might want 'surge' capabilities for part time workers who can repurpose their cars to make a little extra money here and there.
- gadflyinyoureye 5 hours ago
  Those are rookie numbers. The US has 400 million guns. https://www.theglobalstatistics.com/united-states-gun-owners...
  As to the revolt, America doesn't do that any more. Years of education have removed both the vim and vigor of our souls. People will complain. They will do a TikTok dance as protest. Some will go into the streets. No meaningful uprising will occur.
  The poor and the affected will be told to go to the trades. That's the new learn to program. Our tech overlords will have their media tell us that everything is ok (packaging it appropriately for the specific side of the aisle).
  Ultimately the US will go down hill to become a Belgium. Not terrible, but not the world dominating, hand cutting entity it once was.
  [-]
  - markvdb 5 hours ago
    > Ultimately the US will go down hill to become a Belgium.
    Sharing one's opinion in a respectful way is possible. Less spectacle, so less eyeballs, but worth it. Try it.
    [-]
    - nubg 5 hours ago
      What's wrong with his comparison? He explained what he meant by "a Belgium".
      [-]
      - tanseydavid 4 hours ago
        The entire side topic of guns and revolt seems misplaced in this thread.
        The original Luddite movement arose in response to automation in the textile industry.
        They committed violence. Violence was committed against them. All tragic events when viewed from a certain perspective.
        My rhetorical question is this: did any of this result in any meaningful impedance of the "march of technological progress"?
  - bonsai_spool 5 hours ago
    > Ultimately the US will go down hill to become a Belgium.
    I'm curious why you say this given you start by highlighting several characteristics that are not like Belgium (to wit, poor education, political media capture, effective oligarchy). I feel there are several other nations that may be better comparators, just want to understand your selection.