Reminds me of Ecstatica [1], a 1994 game that had intense visuals with a very odd/different rendering engine built from 3D ellipsoids; in a way, really crude splats with Gouraud shading.
People have also converted some small sections of Unreal 5 demos into splats https://superspl.at/scene/692c4f91
Or perhaps use a real world scan - it was suggested this one would make an ideal setting for zombies https://superspl.at/scene/6359774f
The contributions of 3DGS lie in how fast you can render them on modern GPU hardware (tiling + sorting with threads), and in how to make the pipeline differentiable so that you can fit the Gaussian splats to photogrammetry data. Similar to the history of deep learning, it became technically feasible once the GPU hardware was powerful enough.
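To make the "tiling + sorting" part concrete, here is a rough CPU sketch in Python. It is not the paper's CUDA rasterizer: the splats are assumed to be already projected to the screen and are simplified to isotropic 2D Gaussians, and the names (Splat2D, bin_and_sort, shade_pixel, render) and thresholds are illustrative choices of mine.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

TILE = 16  # tile edge in pixels (illustrative choice)


@dataclass
class Splat2D:
    x: float        # screen-space centre, in pixels
    y: float
    sigma: float    # isotropic std-dev in pixels (real splats are anisotropic)
    depth: float    # view-space depth, smaller means closer
    opacity: float  # peak alpha in [0, 1]
    color: tuple    # (r, g, b), each in [0, 1]


def bin_and_sort(splats, width, height):
    """Add each splat to every tile its 3-sigma box touches, then depth-sort each tile."""
    tiles = defaultdict(list)
    max_tx, max_ty = (width - 1) // TILE, (height - 1) // TILE
    for s in splats:
        r = 3.0 * s.sigma
        tx0, tx1 = max(0, int((s.x - r) // TILE)), min(max_tx, int((s.x + r) // TILE))
        ty0, ty1 = max(0, int((s.y - r) // TILE)), min(max_ty, int((s.y + r) // TILE))
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                tiles[(tx, ty)].append(s)
    for tile_splats in tiles.values():
        tile_splats.sort(key=lambda s: s.depth)  # nearest first
    return tiles


def shade_pixel(px, py, tile_splats):
    """Front-to-back alpha compositing over one tile's depth-sorted splats."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for s in tile_splats:
        d2 = (px + 0.5 - s.x) ** 2 + (py + 0.5 - s.y) ** 2
        alpha = min(0.99, s.opacity * math.exp(-0.5 * d2 / (s.sigma ** 2)))
        if alpha < 1.0 / 255.0:
            continue  # contribution too small to matter
        for c in range(3):
            color[c] += transmittance * alpha * s.color[c]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:
            break  # pixel is effectively opaque, stop early
    return tuple(color)


def render(splats, width, height):
    """Return image[y][x] = (r, g, b) over a black background."""
    tiles = bin_and_sort(splats, width, height)
    return [[shade_pixel(x, y, tiles.get((x // TILE, y // TILE), []))
             for x in range(width)]
            for y in range(height)]
```

On a GPU each tile maps to a thread block and each pixel to a thread, which is where the speed comes from. Every step in shade_pixel is a smooth function of the splat parameters, which is what lets gradients flow back to them during fitting.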
It's honestly very hard to work with this stuff, because you ultimately need to be able to place meshes (seas of triangles) inside these scenes, and you need to do it in a way that plausibly fits in the world. You can't have unlit characters walking around a baked lit scene and have them fit in. That's just from a visual design perspective.
You also always want to have bounce light from your dynamic things onto the baked scene and depending on the tech, you might not even be able to spatially place a dynamic thing and have it properly occlude what splats it needs to occlude.
As is, it's a niche technology for games. That might change one day.
https://github.com/googlevr/seurat https://www.youtube.com/watch?v=Pf5Q3bvXj8E
If you mean the technique of splatting specifically, Dreams for PS4 [1] is prior art.
If you mean pre-rendering, there's Myst and games like the original FF7 for PS1.
I'm probably being a bit of a grinch about it, but the abstract doesn't address performance or hardware constraints either, so I guess I'm going to have to read the damn paper.
I captured a video on a smartphone camera, using the OpenCamera app. Specifically, this video was captured with exposure locked, framerate locked, focus locked, fairly high framerate and resolution. I walked slowly and carefully around an outdoor scene, trying to get fairly good coverage from multiple angles. I took roughly 20 minutes of video, weighing 19GB.
This video was sampled into individual image frames at about 5fps using ffmpeg. There's room for experimentation and improvement here; an adaptive, coverage-aware sampling strategy would be better. But fixed 5fps was Good Enough (tm). This resulted in roughly 8,000 images at 4k. This was a pretty hefty dataset for my limited 1080, but I made it work.
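For anyone wanting to reproduce the fixed-rate sampling step, a minimal sketch in Python that just builds the ffmpeg command line. The file names, the JPEG output format and the quality setting are my assumptions, not necessarily what was used above; the fps filter and -q:v are standard ffmpeg options.

```python
import subprocess
from pathlib import Path


def ffmpeg_sample_cmd(video, out_dir, fps=5):
    """Build an ffmpeg command that writes one frame every 1/fps seconds."""
    pattern = str(Path(out_dir) / "frame_%06d.jpg")  # hypothetical naming scheme
    return [
        "ffmpeg",
        "-i", str(video),
        "-vf", f"fps={fps}",  # resample the stream to a fixed frame rate
        "-q:v", "2",          # high JPEG quality (assumed setting)
        pattern,
    ]


# Usage (paths are hypothetical; the output directory must exist first):
#   Path("frames").mkdir(exist_ok=True)
#   subprocess.run(ffmpeg_sample_cmd("walkaround.mp4", "frames"), check=True)
```

An adaptive strategy would replace the fixed fps filter with some selection based on camera motion or coverage, which this sketch does not attempt.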
I then generated masks for these images, to ignore transient objects during the splat training (i.e. to cut out people who transiently walked through the scene). For this I used Cutie (https://github.com/hkchengrex/Cutie). For outdoor scenes, it can also make sense to mask out low-parallax areas like faraway mountains or especially the sky, as these are difficult to train correctly. If masks are generated for some images, you'll need at least placeholder masks for all of them. In the end I've got about 8,000 PNGs that are monochrome black/white masks.
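The placeholder-mask step can be scripted. Below is a stdlib-only Python sketch that writes a solid 8-bit grayscale PNG for every frame that has no mask yet. The assumptions are mine: white (255) means "keep this pixel", masks are named after the frame's stem with a .png extension, and all frames share one resolution. Check what your trainer actually expects before relying on any of these.

```python
import struct
import zlib
from pathlib import Path


def _chunk(tag, data):
    """Serialize one PNG chunk: length, tag, payload, CRC over tag + payload."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def solid_mask_png(width, height, value=255):
    """Return the bytes of a width x height 8-bit grayscale PNG filled with `value`."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    row = b"\x00" + bytes([value]) * width  # filter byte 0, then the pixels
    idat = zlib.compress(row * height, 9)
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


def fill_missing_masks(image_dir, mask_dir, width, height, keep_value=255):
    """Write a placeholder mask for each image that lacks one; return the new names."""
    image_dir, mask_dir = Path(image_dir), Path(mask_dir)
    mask_dir.mkdir(parents=True, exist_ok=True)
    blank = solid_mask_png(width, height, keep_value)
    created = []
    for img in sorted(image_dir.iterdir()):
        if img.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        mask = mask_dir / (img.stem + ".png")  # assumed naming convention
        if not mask.exists():
            mask.write_bytes(blank)
            created.append(mask.name)
    return created
```

Existing masks are never overwritten, so this is safe to rerun after adding more real masks.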
Then the images are handed to COLMAP (https://github.com/colmap/colmap), using the 'global mapper' option. This registers the camera positions in 3D space, and generates a crude point cloud that's good for sanity-checking. This step required a fair bit of iteration to get right. The full reconstructed output from COLMAP is not necessary, only the pose-estimate .bin files. The output directory here was about 500MB for this step for me.
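A quick way to sanity-check the registration is to count how many frames actually received a pose. The Python sketch below reads images.bin directly; the byte layout follows COLMAP's reference Python reader as I remember it (little-endian, 64-byte fixed record, null-terminated name, then 2D observations), so treat it as an assumption and verify it against your COLMAP version. The helper names are mine.

```python
import struct


def read_registered_images(path):
    """Return {image_name: (qvec, tvec)} parsed from a COLMAP images.bin file."""
    poses = {}
    with open(path, "rb") as f:
        (num_images,) = struct.unpack("<Q", f.read(8))
        for _ in range(num_images):
            # image_id, qw, qx, qy, qz, tx, ty, tz, camera_id
            fields = struct.unpack("<idddddddi", f.read(64))
            qvec, tvec = fields[1:5], fields[5:8]
            name = bytearray()
            while (ch := f.read(1)) not in (b"\x00", b""):
                name += ch
            (num_points2d,) = struct.unpack("<Q", f.read(8))
            f.seek(24 * num_points2d, 1)  # skip the (x, y, point3D_id) records
            poses[name.decode("utf-8")] = (qvec, tvec)
    return poses


def unregistered_frames(poses, frame_names):
    """Return the frame names that did not receive a pose, sorted."""
    return sorted(set(frame_names) - set(poses))
```

If a large share of frames comes back unregistered, that usually points at the capture or the matching settings rather than at the training step.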
With COLMAP registration done, the next step is the actual training. I found two useful pieces of software for this, with different tradeoffs.
Brush (https://github.com/ArthurBrussee/brush) was very straightforward to install and use, requiring very little in the way of external dependencies and setup. It was also pretty speedy on training, and gave good results. Minor modifications to the training process were possible by editing source, though I didn't get too wild here. Brush takes the *.bin files from COLMAP, plus the original images directory, and the masks directory if it exists. Run on its own, this could produce Gaussian splat .ply files, 500-800MB in size, containing 1-10M splats. More than that and my poor little 8GB of VRAM OOM'd.
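When keeping an eye on splat counts, the PLY header already says how many splats a file contains, so there is no need to load it. A small Python sketch, assuming the exporter stores one splat per vertex element (the common convention for splat .ply files) and, for the size estimate, that every property is a 4-byte float. The function names are mine.

```python
def ply_vertex_info(path):
    """Return (vertex_count, property_count) read from a PLY header."""
    count, props, in_vertex = None, 0, False
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            parts = raw.strip().split()
            if parts[:1] == [b"end_header"]:
                break
            if parts[:1] == [b"element"]:
                # Only count properties that belong to the vertex element.
                in_vertex = len(parts) == 3 and parts[1] == b"vertex"
                if in_vertex:
                    count = int(parts[2])
            elif parts[:1] == [b"property"] and in_vertex:
                props += 1
    if count is None:
        raise ValueError("no vertex element in header")
    return count, props


def approx_payload_bytes(count, props, bytes_per_property=4):
    """Rough payload size, assuming every property is a 4-byte float."""
    return count * props * bytes_per_property
```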
nerfstudio (https://github.com/nerfstudio-project/nerfstudio) was also useful, as many research papers get implemented in its framework. In particular, for this outdoor scene, I used wild-gaussians (https://github.com/jkulhanek/wild-gaussians/) to generate just a sky sphere (to help seed low-parallax areas in my particular dataset), stopped training, and used this as an init.ply to pass to Brush.
I then set up a very simple viewer website, using SuperSplat (https://github.com/playcanvas/supersplat). I used supersplat's editor to align the splat's coordinate system with the rotation and scaling that I wanted, and then exported an optimized .sog file, roughly 1/10th the size. .sog is nominally open-standards, though I'm not aware of any other projects using the format. This gave fairly good framerates and adequate controls across a variety of platforms.
As a little bit extra, supersplat's splat-transform CLI tool was used to generate a crude collision mesh for the scene, enabling a walking mode that respected object boundaries.
If there's interest I can post my results. I got a bit sidetracked with other projects and other splats, and this particular one still needs some cleanup that I've been fiddling with. I can get it up with a few more hours' work. But hopefully that's a good start: all of these tools are fully FOSS, and they resulted in a good-looking splat.
<3
Umm, on my machine it has a 560px margin on both sides, with the content being only a 474px sliver in the middle?
Personally I suspect they are getting a bit more attention than they "deserve"; people aren't talking about their weaknesses very much and I think that's resulting in some overexcitement. Some of the "we can replace everything with splats!" reminds me of the people who still don't understand why "if GPUs are thousands of times faster than CPUs why don't we run everything on GPUs?" is basically not even a sensible question. I don't see them as ever being the foundation of a graphics stack, but they definitely have a place as part of a well-rounded menu of techniques that can be brought to bear on a wide range of problems.
This is the big thing imho. Sure, you can do traditional photogrammetry to capture meshes and textures but getting the shaders exactly right is afaik non-trivial etc, and if you want real-time rendering then you likely need some further post-processing of the assets. With 3dgs you can pretty much bypass all that complexity and the whole pipeline from photos to rendered frame is much more straightforward.
I think future papers will probably continue improving on this method and focus on how to sample the points more efficiently while staying unbiased (similar to how ray tracing solved its performance issues). Or maybe... we can just add a deep-learning based denoiser and call it a day!
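For intuition on "unbiased" here, a toy single-pixel sketch in Python of the classic stochastic-transparency idea. This is not the paper's actual sampler, just an illustration under my own simplifications: each splat is a (depth, alpha, value) triple, the background is black, and all names are hypothetical. Each sample keeps every splat with probability alpha and takes the nearest survivor using only a depth test, so no sorting is needed; averaging samples converges to the sorted alpha-composited result.

```python
import random


def sorted_composite(splats):
    """Reference result: front-to-back alpha compositing of (depth, alpha, value)."""
    out, transmittance = 0.0, 1.0
    for _, alpha, value in sorted(splats):
        out += transmittance * alpha * value
        transmittance *= 1.0 - alpha
    return out


def stochastic_sample(splats, rng):
    """One sort-free sample: random survival test, then keep the nearest survivor."""
    nearest_depth, result = float("inf"), 0.0  # 0.0 is the black background
    for depth, alpha, value in splats:  # any order works
        if rng.random() < alpha and depth < nearest_depth:
            nearest_depth, result = depth, value
    return result


def stochastic_estimate(splats, num_samples, seed=0):
    """Average many samples; the expectation equals sorted_composite(splats)."""
    rng = random.Random(seed)
    total = sum(stochastic_sample(splats, rng) for _ in range(num_samples))
    return total / num_samples
```

With only a handful of samples per pixel the estimate is visibly noisy, which is the gap a denoiser or temporal accumulation would be covering.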
At least if it's progressive (so refines and resolves over time), this has been done with pointclouds in the VFX industry in GPU shaders for years in terms of stochastically drawing different points so eventually the whole point set gets rasterised to a fidelity threshold.
Or the per-pixel coord atomic I guess?
Kind of like Minecraft... but with user-generated gaussian-splat blocks.
Point splatting does introduce a lot of noise though, and their denoiser introduces ghosting, but they say a more sophisticated denoiser would give considerably better quality.
Really?! What OSs can handle that many native threads?
Also, this seems quite similar to stochastic progressive drawing of pointclouds for realtime that has been done for > 15 years in the VFX industry with GPU shaders in a tiled/bucketed fashion, unless this isn't progressive maybe? (The fact it's been accepted for Siggraph likely indicates it's slightly different).
Future proofing I guess...
Ordinarily I don't prefer video, but the visuals are helpful here.
Also, there's an online interactive version, but it seems to only work in Chrome: https://superspl.at/scene/ff1d0393