Want to learn how to USE AI technology to make money and/or your life easier? Join our FREE AI community here: https://www.skool.com/ai-with-apex/about
From Pokémon Go Scans to PDDL Plans: The Data Stack Behind “Physical AI”
Today’s thread is simple: machines get better at moving through the real world when perception becomes reusable infrastructure—and when planning becomes verifiable.
From crowdsourced city scans to formal solvers, the week’s research points to a new stack for robotics and embodied AI: sense, localize, plan, then act.
TL;DR
- Niantic Spatial says it’s training a Large Geospatial Model from 30B+ “posed images” captured across “millions of locations.”
- MIT introduced a dual vision-language approach that turns a single image into PDDL files + a solver-produced plan, reporting ~70% average success vs ~30% for its best baselines.
- An MIT Media Lab profile of Joseph Paradiso highlights sensing deployments from wearables to ecology, including acoustic monitoring for endangered honeybees in Patagonia.
- MIT’s Matthew Jones is building predictive models of tumor progression and treatment resistance, framing it as trying to anticipate cancer’s “moves” rather than reacting after resistance appears.
- NVIDIA’s guest post on Hugging Face pushes “open data for AI” and directs builders to datasets, recipes, and community tooling—signaling that “open” competition is increasingly about data, not just weights.
1) Niantic Spatial: Pokémon Go-era scans become a Large Geospatial Model
What happened
Niantic’s spatial computing spinout, Niantic Spatial, says it is training a “Large Geospatial Model” using 30+ billion “posed images” collected across “millions of locations.” The company frames the result as a way for machines to reconstruct, localize within, and understand real-world places.
Why it matters
This is one of the cleanest examples yet of consumer AR data collection turning into a foundation layer for autonomy: playful scanning becomes industrial-grade localization. It also sharpens the next big competition after text and image models—building “world models” that can anchor robots and devices in messy physical spaces.
Key details
- Niantic Spatial describes “posed images,” imagery paired with camera pose and geometry, as a core ingredient for its Large Geospatial Model and the format that makes 3D understanding scalable; see the record sketch after this list. (link)
- The company positions its services around reconstruction, localization, and understanding of real-world spaces. (link)
- Public background describes Niantic Spatial as a Niantic spinout and places it in the context of Niantic’s restructuring and shift in focus. (link)
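Niantic hasn’t published a schema, but it helps to make “posed image” concrete. Here’s a minimal sketch of what one such record could contain; every field name and unit below is an assumption, not Niantic’s actual format:

```python
from dataclasses import dataclass

@dataclass
class PosedImage:
    """Illustrative only: one plausible layout for a "posed image" record,
    an RGB frame paired with camera intrinsics and a world-space pose.
    Field names and units are assumptions, not Niantic's schema."""
    image_path: str   # RGB frame from a phone scan
    fx: float         # focal length x (pixels)
    fy: float         # focal length y (pixels)
    cx: float         # principal point x (pixels)
    cy: float         # principal point y (pixels)
    qw: float         # camera rotation as a unit quaternion (w, x, y, z)
    qx: float
    qy: float
    qz: float
    tx: float         # camera position in world coordinates (meters)
    ty: float
    tz: float
    lat: float        # coarse geolocation for anchoring into a global map
    lon: float
```

The pairing is the point: with intrinsics plus pose, every pixel becomes a ray in world coordinates, which is what lets billions of casual phone scans fuse into reusable 3D structure.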
Source links
https://www.nianticspatial.com/en/blog/niantic-spatial-day-one
https://en.wikipedia.org/wiki/Niantic_Spatial
2) MIT: A dual VLM that turns one image into a formal plan (PDDL + solver)
What happened
MIT researchers presented “VLM-guided formal planning (VLMFP),” a method that uses two vision-language models and a classical planner to generate step-by-step plans from a single image. In MIT’s description, a smaller model (SimVLM) describes the scene, simulates actions, and checks whether a goal is satisfied, while a larger model (GenVLM) converts that information into PDDL domain and problem files that a classical solver can turn into an executable plan.
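If you haven’t met PDDL, the shape of the artifact matters: a domain file declares predicates and actions, and a problem file declares objects, an initial state, and a goal. The toy pair below is written by hand (it is not GenVLM output) and shows what a classical planner actually consumes; the solver invocation assumes a local Fast Downward install:

```python
from pathlib import Path
import subprocess

# Hand-written toy PDDL, illustrative rather than the paper's actual domain.
DOMAIN = """
(define (domain table)
  (:requirements :strips)
  (:predicates (clear ?x) (on-table ?x) (holding ?x) (hand-empty))
  (:action pick-up
    :parameters (?x)
    :precondition (and (clear ?x) (on-table ?x) (hand-empty))
    :effect (and (holding ?x)
                 (not (on-table ?x)) (not (clear ?x)) (not (hand-empty)))))
"""

PROBLEM = """
(define (problem grab-a)
  (:domain table)
  (:objects a b)
  (:init (on-table a) (on-table b) (clear a) (clear b) (hand-empty))
  (:goal (holding a)))
"""

Path("domain.pddl").write_text(DOMAIN)
Path("problem.pddl").write_text(PROBLEM)

# Assumes Fast Downward is installed locally; any classical PDDL planner
# works. Fast Downward writes the resulting plan to ./sas_plan.
subprocess.run(
    ["fast-downward.py", "domain.pddl", "problem.pddl",
     "--search", "astar(blind())"],
    check=True,
)
```

On this toy problem the plan is a single action, pick-up a; the value of the formalism is that the solver either returns a legal plan or provably fails, which is the “verifiable” half of VLMFP’s pitch.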
Why it matters
The practical move here is forcing an AI system to externalize its intent into a formal representation that can be checked and solved, rather than relying on end-to-end neural reasoning for long-horizon tasks. It’s a credible bridge between perception (what’s in the image) and reliability (a plan a solver can validate and step through).
Key details
- MIT reports ~70% average success for VLMFP vs ~30% for the best baseline across the comparisons it describes. (link)
- MIT reports SimVLM achieved ~85% success at describing/simulating/checking goals in tests. (link)
- The workflow includes iterative refinement: solver output is compared with simulator expectations to improve the generated PDDL (a control-flow sketch follows this list). (link)
- MIT links the work to an open-access arXiv preprint for additional methodological details. (link, link)
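Since MIT’s interfaces aren’t public, here’s a hypothetical sketch of that refinement loop with the two models and the solver passed in as callables; names like describe and replay are stand-ins, not the paper’s API:

```python
from typing import Callable, Optional

def plan_with_refinement(
    describe: Callable[[bytes], str],                  # SimVLM role: image -> scene facts
    to_pddl: Callable[[str], tuple[str, str]],         # GenVLM role: facts -> (domain, problem)
    solve: Callable[[str, str], Optional[list[str]]],  # classical planner -> actions or None
    replay: Callable[[bytes, list[str]], tuple[bool, str]],  # SimVLM role: step-check a plan
    image: bytes,
    max_rounds: int = 3,
) -> Optional[list[str]]:
    """Hypothetical control flow paraphrasing the VLMFP description, not
    MIT's code: regenerate the PDDL whenever the solver's plan disagrees
    with the simulator's step-by-step expectations."""
    facts = describe(image)
    for _ in range(max_rounds):
        domain, problem = to_pddl(facts)
        plan = solve(domain, problem)
        if plan is None:  # unsolvable PDDL; ask for a rewrite
            facts += "\nsolver found no plan; revise the PDDL"
            continue
        ok, feedback = replay(image, plan)
        if ok:
            return plan   # plan survives simulated execution
        facts += "\nmismatch during replay: " + feedback
    return None
```

Capping the rounds matters: the loop should fail loudly rather than iterate forever on PDDL the solver keeps rejecting.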
Source links
https://news.mit.edu/2026/better-method-planning-complex-visual-tasks-0311
https://arxiv.org/abs/2510.03182
3) MIT Media Lab: Joseph Paradiso’s sensing work across wearables, art, and ecology
What happened
MIT professor Joseph Paradiso is profiled for decades of work on wearable and environmental sensing and the idea of “responsive environments.” The piece spans prototypes and field deployments, emphasizing sensing as a cross-disciplinary tool that can be worn, embedded, and carried into the world.
Why it matters
If “physical AI” is the story of machines acting in the world, sensing is the quieter prerequisite: what you can reliably measure determines what you can safely automate. Paradiso’s career is also a reminder that data collection isn’t only web-scale scraping—it can be embodied, ecological, and designed for specific real-world constraints.
Key details
- The profile describes collaborations with National Geographic Explorers using sensors for animal behavior and environmental monitoring, including work involving lions/hyenas in Botswana and goats in Chile. (link)
- The article highlights acoustic sensors with onboard AI used for monitoring endangered honeybees in Patagonia. (link)
- Paradiso was named an IEEE Fellow in January for contributions to wireless wearable sensing and mobile energy harvesting. (link)
Source links
https://news.mit.edu/2026/mit-professor-joseph-paradiso-sensing-innovations-0310
4) Predictive oncology: Modeling how tumors evolve—and how resistance emerges
What happened
In a Q&A, MIT Assistant Professor Matthew Jones describes research aimed at building predictive models of tumor progression and therapy resistance across genetic, epigenetic, metabolic, and microenvironmental factors. The framing is explicitly forward-looking: understanding how tumors change in space and time to anticipate failure modes earlier.
Why it matters
Across AI and biology, a shared theme is shifting from reactive pattern recognition to forecasting dynamics—what happens next, not only what happened. In cancer research, that means trying to predict which resistance pathways are likely to appear, and when, rather than waiting for treatments to stop working.
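To make “forecasting dynamics” concrete, here’s a deliberately toy two-clone model (illustrative, not from the Q&A): therapy shrinks the sensitive population while a small resistant subclone keeps growing, so total burden dips and then rebounds. The modeling ambition Jones describes is to see that rebound, and its mechanism, coming:

```python
import math

def tumor_burden(days: int, s0=1e9, r0=1e3, k_s=-0.05, k_r=0.03) -> list[float]:
    """Toy exponential mixture: a sensitive clone (s0 cells, decaying at
    rate k_s under therapy) plus a resistant clone (r0 cells, growing at
    rate k_r). Every number here is illustrative, not fit to data."""
    return [s0 * math.exp(k_s * t) + r0 * math.exp(k_r * t) for t in range(days)]

burden = tumor_burden(400)
rebound_day = next(t for t in range(1, 400) if burden[t] > burden[t - 1])
print(f"total burden starts rebounding around day {rebound_day}")  # ~day 180
```

The real problem is vastly harder (multiple biological layers, spatial structure), but that dip-then-rebound curve is the failure mode predictive models are trying to get ahead of.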
Key details
- Jones describes therapy resistance as a central challenge: treatments can work initially and then fail as tumors evolve. (link)
- He outlines a multi-level view of tumor evolution spanning genetic, epigenetic, metabolic, and microenvironmental layers. (link)
- The Q&A uses a “play chess with cancer” metaphor to emphasize anticipating tumor evolution rather than reacting after resistance appears. (link)
Source links
https://news.mit.edu/2026/3-questions-building-predictive-models-characterize-tumor-progression-0310
5) NVIDIA on Hugging Face: The “open” conversation shifts toward datasets
What happened
NVIDIA published a guest post on Hugging Face positioning “open data for AI” as a key lever for progress, pointing readers to datasets, recipes, and community resources. The post fits a broader NVIDIA message: openness isn’t only about releasing model weights; it’s also about how training and evaluation data is packaged, shared, and reproduced.
Why it matters
As foundation models commoditize, data becomes the differentiator—especially in robotics and “physical AI,” where collecting high-quality multimodal ground truth is expensive and slow. This also creates a clearer battleground for transparency: provenance, documentation, and repeatable training pipelines.
Key details
- The Hugging Face post describes NVIDIA’s framing around open datasets and invites builders to explore NVIDIA datasets on the platform; a minimal loading sketch follows this list. (link)
- NVIDIA’s broader blog framing emphasizes open models and open data as part of its ecosystem strategy. (link)
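In practice, “explore datasets on the platform” means the standard Hugging Face datasets workflow. A minimal sketch; the dataset ID below is a placeholder, not a specific NVIDIA release:

```python
from datasets import load_dataset  # pip install datasets

# "nvidia/example-dataset" is a placeholder ID; browse the real catalog at
# https://huggingface.co/nvidia. streaming=True lets you inspect records
# without downloading the full dataset first.
ds = load_dataset("nvidia/example-dataset", split="train", streaming=True)

for i, example in enumerate(ds):
    print(example)  # check schema and provenance fields before committing
    if i >= 2:
        break
```

The dataset card on the Hub is where provenance, documentation, and licensing claims live, which is exactly the transparency battleground described above.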
Source links
https://huggingface.co/blog/nvidia/open-data-for-ai
https://blogs.nvidia.com/blog/open-models-data-ai/
Closing takeaway
This week’s research lands on a shared lesson: the next leap in real-world AI won’t come from “smarter” models alone, but from better inputs and tighter guarantees—richer sensing, stronger localization priors, and planning layers that can be checked before anything moves.
—
Want to learn how to USE AI technology to make money and/or your life easier? Join our FREE AI community here: https://www.skool.com/ai-with-apex/about