A camera is a mapping between the 3D world (object space) and a 2D image.

In general, the camera projection matrix P has 11 degrees of freedom (5 intrinsic plus 6 extrinsic): $$P=K[R\,|\,t]$$

| Component | # DOF | Elements | Known As |
|---|---|---|---|
| K | 5 | $$f_x, f_y, s, p_x, p_y$$ | Intrinsic parameters; camera calibration matrix |
| R | 3 | $$\alpha,\beta,\gamma$$ | Extrinsic parameters |
| t (or $$\tilde{C}$$) | 3 | $$(t_x,t_y,t_z)$$ | Extrinsic parameters |

3D world frame ----- R, t ----> 3D camera frame ------ K -----> 2D image
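
For reference, the five intrinsic parameters are commonly arranged in K as shown below. This layout follows the usual textbook convention and is an assumption here, since the text above lists the parameters without spelling out where they sit in the matrix.

$$
K=\begin{bmatrix} f_x & s & p_x \\ 0 & f_y & p_y \\ 0 & 0 & 1 \end{bmatrix}
$$

The count of 11 is also consistent with P being a 3 x 4 homogeneous matrix: 12 entries, minus one for the overall scale.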

Explanation:

• P: Projective camera, maps 3D world points to 2D image points.

• K: Camera calibration matrix, 3 x 3. Given a 3D point $$X_{cam}$$ expressed in the camera coordinate frame, $$x=K[I|0]X_{cam}$$ projects it to the 2D image point $$x$$.

• R and t: Camera rotation and translation, which together form a rigid transformation. The point $$X_{cam}=(X,Y,Z,1)^T$$ is expressed in the camera coordinate frame. In general, 3D points are expressed in a different Euclidean coordinate frame, known as the world coordinate frame. The two frames are related by the rigid transformation (R, t).
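
The chain world frame → camera frame → image described above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the helper names `projection_matrix` and `project` are hypothetical, the numeric values of K, R, and t are made up, and K is assumed to use the common upper-triangular layout with zero skew.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t] (hypothetical helper)."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X_world):
    """Project a 3D world point to 2D pixel coordinates."""
    X_h = np.append(X_world, 1.0)   # homogeneous 4-vector (X, Y, Z, 1)
    x_h = P @ X_h                   # homogeneous image point
    return x_h[:2] / x_h[2]         # divide out the scale

# Made-up example values: focal lengths of 800 px, principal point (320, 240).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                       # camera axes aligned with the world axes
t = np.array([0.0, 0.0, 5.0])       # world origin sits 5 units in front of the camera

P = projection_matrix(K, R, t)
pixel = project(P, np.array([1.0, 0.5, 5.0]))
```

With these numbers the world point first becomes (1, 0.5, 10) in the camera frame, and K then maps it to the pixel (400, 280).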

### Some other terms you may see

• P: 3x4, homogeneous, camera projection matrix. In the simplest pinhole case, $$P=diag(f,f,1)[I|0]$$, which is $$K[I|0]$$ with $$K=diag(f,f,1)$$: a single focal length $$f$$, zero skew, and the principal point placed at the image origin, $$(p_x, p_y)=(0,0)$$.
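
As a quick check of this simplified form, the sketch below builds $$diag(f,f,1)[I|0]$$ with NumPy and confirms that it reduces to the familiar mapping $$(X,Y,Z)\mapsto(fX/Z,\ fY/Z)$$. The focal length and the point are arbitrary illustrative values, and the variable names are hypothetical.

```python
import numpy as np

f = 2.0                                          # arbitrary focal length
I0 = np.hstack([np.eye(3), np.zeros((3, 1))])    # the 3x4 block [I | 0]
P_simple = np.diag([f, f, 1.0]) @ I0             # diag(f, f, 1) [I | 0]

X_cam = np.array([1.0, 2.0, 4.0, 1.0])           # point in the camera frame
x_h = P_simple @ X_cam                           # homogeneous image point (fX, fY, Z)
x = x_h[:2] / x_h[2]                             # inhomogeneous image point

# Closed-form pinhole projection for comparison.
expected = f * X_cam[:2] / X_cam[2]
```

Here both `x` and `expected` come out as (0.5, 1.0), since no principal point offset is added.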