NeRF Hybrid representations
This is part of my journey of learning NeRF.
2.2. Hybrid representations
Tradeoffs of choosing a proper representation


Which representation is appropriate depends on your application.
1. Grid

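The raw input is too large to feed to a network directly: you would need an impractically large neural network. Storing features on a grid and interpolating between them therefore acts like a "positional encoding", which lifts the low-dimensional input into a higher-dimensional feature space.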

NeRFusion (CVPR 2022): runs online!
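The grid interpolation mentioned above can be sketched as a trilinear lookup. This is only a minimal illustration, assuming a dense grid over the unit cube with one feature vector per vertex; the function name `trilinear_lookup` and the grid layout `(R, R, R, C)` are my own choices, not taken from any particular paper.

```python
import numpy as np

def trilinear_lookup(grid, p):
    """Interpolate a feature vector at point p in [0, 1]^3.

    grid: array of shape (R, R, R, C), one C-dim feature per vertex (assumed layout).
    p:    three continuous coordinates in the unit cube.
    """
    res = grid.shape[0]
    # Scale to vertex index space and clamp so that the +1 neighbour always exists.
    x = np.clip(np.asarray(p, dtype=np.float64) * (res - 1), 0.0, res - 1 - 1e-9)
    i0 = np.floor(x).astype(int)   # lower corner of the enclosing cell
    t = x - i0                     # fractional offset inside the cell
    out = np.zeros(grid.shape[-1])
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Each corner's weight is the product of three 1D linear weights.
                w = ((t[0] if dx else 1 - t[0])
                     * (t[1] if dy else 1 - t[1])
                     * (t[2] if dz else 1 - t[2]))
                out += w * grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return out

rng = np.random.default_rng(0)
grid = rng.normal(size=(4, 4, 4, 8))            # 4^3 vertices, 8 features each
feat = trilinear_lookup(grid, [0.3, 0.5, 0.9])  # 3 input numbers become 8 features
```

The 8-dimensional `feat` is what a (small) network would consume instead of the raw 3D coordinate.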
2. Point cloud

Cons:
- To access local points efficiently, you need a purpose-built data structure. Otherwise, every lookup is O(n)!
- You have to choose a kernel to aggregate the features of nearby points. Usually the kernel is assumed to be local.
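Both points above can be sketched together. The example below is only one possible design, under my own assumptions: a uniform-cell index (the hypothetical `CellIndex` class) stands in for the "specifically designed data structure", and a Gaussian with `sigma = radius / 2` stands in for the local kernel. A KD-tree or another kernel would work just as well.

```python
import numpy as np
from collections import defaultdict

class CellIndex:
    """Uniform-cell spatial index: buckets point ids by the cell containing them."""

    def __init__(self, points, cell_size):
        self.points = np.asarray(points, dtype=np.float64)
        self.cell_size = float(cell_size)
        self.cells = defaultdict(list)
        for idx, p in enumerate(self.points):
            self.cells[self._key(p)].append(idx)

    def _key(self, p):
        return tuple(int(v) for v in np.floor(np.asarray(p) / self.cell_size))

    def query_radius(self, q, radius):
        """Ids of points within `radius` of q. Requires radius <= cell_size."""
        q = np.asarray(q, dtype=np.float64)
        cx, cy, cz = self._key(q)
        cand = []
        # Only the 27 cells around the query can hold points within the radius.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    cand.extend(self.cells.get((cx + dx, cy + dy, cz + dz), []))
        if not cand:
            return np.empty(0, dtype=int)
        cand = np.array(cand)
        dist = np.linalg.norm(self.points[cand] - q, axis=1)
        return cand[dist <= radius]

def aggregate_features(index, feats, q, radius):
    """Gaussian-weighted average of the features of the points near q."""
    q = np.asarray(q, dtype=np.float64)
    ids = index.query_radius(q, radius)
    if ids.size == 0:
        return np.zeros(feats.shape[1])   # no neighbours: fall back to zeros
    d2 = np.sum((index.points[ids] - q) ** 2, axis=1)
    w = np.exp(-d2 / (2 * (radius / 2) ** 2))   # assumed kernel, sigma = radius / 2
    return (w[:, None] * feats[ids]).sum(axis=0) / w.sum()

rng = np.random.default_rng(0)
points = rng.uniform(size=(1000, 3))
feats = rng.normal(size=(1000, 8))
index = CellIndex(points, cell_size=0.1)
feat = aggregate_features(index, feats, [0.5, 0.5, 0.5], radius=0.1)
```

Each query only touches the points in 27 cells instead of scanning all n points.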
3. Mesh
Unstructured grids. Compared with point clouds, meshes also carry connectivity information.


4. Multiplanar Images
Roughly: project a 3D grid along an axis to get a stack of planes.

Pros:
- Compact
- Very efficient, because hardware and software are already heavily optimized for these 2D operations, such as bilinear interpolation.
Cons:
- Resolution bias along the plane axis, because the representation is discrete between planes.
In my opinion this is not a very wise choice. It is just a temporary tradeoff given today's technology, because everything will be 3D in the future.
Generate 2D images from different camera views (perhaps). The key point is the tri-plane representation of 3D features.
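A tri-plane lookup can be sketched as three bilinear samples. This is a minimal illustration under my own assumptions: the three planes are axis-aligned (XY, XZ, YZ), the point lies in the unit cube, and the three sampled features are combined by summation. The names `bilinear` and `triplane_lookup` are hypothetical.

```python
import numpy as np

def bilinear(plane, u, v):
    """Bilinearly interpolate plane of shape (R, R, C) at (u, v) in [0, 1]^2."""
    res = plane.shape[0]
    x = np.clip(np.array([u, v], dtype=np.float64) * (res - 1), 0.0, res - 1 - 1e-9)
    i0 = np.floor(x).astype(int)
    t = x - i0
    # Interpolate along the second axis first, then along the first axis.
    lo = (1 - t[1]) * plane[i0[0], i0[1]] + t[1] * plane[i0[0], i0[1] + 1]
    hi = (1 - t[1]) * plane[i0[0] + 1, i0[1]] + t[1] * plane[i0[0] + 1, i0[1] + 1]
    return (1 - t[0]) * lo + t[0] * hi

def triplane_lookup(planes, p):
    """Sum the features sampled from the XY, XZ and YZ planes at point p."""
    x, y, z = p
    return (bilinear(planes["xy"], x, y)
            + bilinear(planes["xz"], x, z)
            + bilinear(planes["yz"], y, z))

rng = np.random.default_rng(0)
R, C = 64, 4
planes = {name: rng.normal(size=(R, R, C)) for name in ("xy", "xz", "yz")}
feat = triplane_lookup(planes, (0.2, 0.7, 0.4))

# Storage: three planes versus one dense 3D grid of the same resolution.
triplane_floats = 3 * R * R * C
dense_floats = R * R * R * C
```

The storage comparison at the end is where the "compact" advantage comes from: the planes grow with R squared, the dense grid with R cubed.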
5. Multiresolution grids

Pros:
- Stable, because you do need both low- and high-resolution information.
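One way to combine low- and high-resolution information is to query several grids and concatenate the results. The sketch below assumes dense grids over the unit cube and concatenation as the combination rule; the resolutions and the names `trilinear` and `multires_lookup` are illustrative only.

```python
import numpy as np

def trilinear(grid, p):
    """Trilinear interpolation in a dense grid of shape (R, R, R, C)."""
    res = grid.shape[0]
    x = np.clip(np.asarray(p, dtype=np.float64) * (res - 1), 0.0, res - 1 - 1e-9)
    i0 = np.floor(x).astype(int)
    t = x - i0
    out = np.zeros(grid.shape[-1])
    for corner in np.ndindex(2, 2, 2):
        c = np.array(corner)
        w = np.prod(np.where(c == 1, t, 1 - t))   # product of per-axis weights
        out += w * grid[tuple(i0 + c)]
    return out

def multires_lookup(grids, p):
    """Concatenate the features interpolated at every resolution level."""
    return np.concatenate([trilinear(g, p) for g in grids])

rng = np.random.default_rng(0)
# Coarse levels capture smooth, global structure; fine levels capture detail.
grids = [rng.normal(size=(r, r, r, 2)) for r in (4, 16, 64)]
feat = multires_lookup(grids, [0.3, 0.6, 0.9])   # 3 levels x 2 features
```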
6. Hash grids
\[
[x, y, z] \text{ coordinates} \rightarrow \text{hash function} \rightarrow \text{fixed-size codebook}
\]
Pros:
- No matter how large the original data is, you can use a fixed-size codebook as the input feature.
- Can be online!
Cons:
- May still need large codebooks
- Features are not spatially local. I don't think the hash grid is a good idea as long as this drawback exists. But isn't there a simple way to generate features that keep local information?
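The coordinate-to-codebook pipeline can be sketched as follows. This is a single-level illustration in the style of Instant-NGP's spatial hash (XOR of per-axis products with large primes, then modulo the table size); it assumes non-negative integer vertex coordinates, and the names `hash_index` and `hash_grid_lookup` are my own.

```python
import numpy as np

PRIMES = (1, 2654435761, 805459861)   # per-axis multipliers, as in Instant-NGP

def hash_index(ijk, table_size):
    """Map non-negative integer grid coordinates to a slot of a fixed-size table."""
    h = 0
    for coord, prime in zip(ijk, PRIMES):
        h ^= int(coord) * prime
    return h % table_size

def hash_grid_lookup(table, p, res):
    """Trilinear interpolation where vertex features live in a hashed table.

    table: (T, C) codebook, p: point in [0, 1]^3, res: virtual grid resolution.
    """
    x = np.clip(np.asarray(p, dtype=np.float64) * (res - 1), 0.0, res - 1 - 1e-9)
    i0 = np.floor(x).astype(int)
    t = x - i0
    out = np.zeros(table.shape[1])
    for corner in np.ndindex(2, 2, 2):
        c = np.array(corner)
        w = np.prod(np.where(c == 1, t, 1 - t))
        # The vertex is never stored explicitly; only its hashed slot is read.
        out += w * table[hash_index(i0 + c, table.shape[0])]
    return out

rng = np.random.default_rng(0)
table = rng.normal(size=(2 ** 14, 2))   # fixed size, independent of `res`
feat = hash_grid_lookup(table, [0.31, 0.62, 0.93], res=1024)

# Two adjacent vertices generally land in unrelated slots of the table.
slot_a = hash_index((10, 10, 10), table.shape[0])
slot_b = hash_index((10, 11, 10), table.shape[0])
```

Here a virtual grid with 1024^3 vertices is served by a table with only 2^14 rows, so many vertices necessarily collide. `slot_a` and `slot_b` illustrate the locality drawback: neighbouring vertices are scattered across the table, although the interpolation still blends the eight surrounding entries, so the lookup remains continuous in `p`.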
7. Codebook grids

Instead of storing point features directly in the grid, store an index to a code in a codebook. The codebook has a fixed size, so the overall size can be kept much smaller.
Cons:
- Making the indexing operation differentiable increases the computational complexity.
- The point of hashing is to get rid of complex data structures, but the indices bring them back.
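The cost of differentiable indexing can be illustrated with a softmax relaxation. This is one common way to do it and an assumption on my part, not necessarily the scheme the lecture had in mind: during training every vertex keeps K logits and mixes all K codes, and only at inference does it collapse to a single integer index. All names below are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_lookup(logits, codebook):
    """Training-time lookup: softmax-weighted mix of all K codes per vertex."""
    return softmax(logits) @ codebook        # (N, K) @ (K, C) -> (N, C)

def hard_lookup(logits, codebook):
    """Inference-time lookup: keep only the best code per vertex."""
    return codebook[np.argmax(logits, axis=-1)]

rng = np.random.default_rng(0)
N, K, C = 1000, 16, 8                 # vertices, codebook entries, feature dims
codebook = rng.normal(size=(K, C))
logits = rng.normal(size=(N, K))      # learnable scores, one row per vertex

train_feats = soft_lookup(logits, codebook)   # touches all K codes per vertex
infer_feats = hard_lookup(logits, codebook)   # touches one code per vertex

# Storage in floats or integers: training keeps N*K logits,
# inference needs only N indices plus the K*C codebook.
train_storage = N * K + K * C
infer_storage = N + K * C
```

The soft lookup is where the extra compute and memory during training come from, while the compact index-plus-codebook form is only available after training.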
8. Bounding Volume Hierarchies

A commonly used method in computer graphics.
9. Others (voxel)

- For dynamic NeRFs, is there a better hybrid representation? Sure.
- Is there an explicit bias in these hybrid representations that we can discover and then design regularization for? Sure.