Commit 0a5087d5 authored by Olivier Bertrand's avatar Olivier Bertrand

Start adding dlt with more than 11 params

parent 182b80f0
%% Cell type:markdown id: tags:
# Recording animal trajectory
## Camera calibration
## Camera recording
A camera is an optical instrument for acquiring images. Cameras are composed of a sensor (converting photons to electrical signals) and a lens (focussing light rays on the sensor).
Many experimental paradigms require the recording of animal motion with one or more cameras.
The experimenter is more interested in the position of the animal in his or her arena (world coordinate system) than in the position of the animal within the image (camera coordinate system). Therefore, we need to transform the position of the animal in the image to its position in the world. To do so, we need to calibrate our camera. This calibration requires two steps:
* remove distortion from the lens (intrinsic calibration)
* determine the position and orientation (i.e. pose) of the camera in the environment (extrinsic calibration)
### Intrinsic calibration
### Direct Linear Transformation (DLT)
When we observe an animal through a camera, we only observe the projection of the animal. For example, the center of mass of the animal in three dimensions is projected in two dimensions on the camera, i.e. in the camera plane, and only these two dimensions are visible from the camera. Mathematically, the point in 3D space is transformed into the camera space. This transformation can be described as follows:
$$
\begin{align}
\begin{bmatrix} u-u_0 \\ v-v_0 \\ -d \end{bmatrix} &= T \begin{bmatrix} x-x_0 \\ y-y_0 \\ z-z_0 \end{bmatrix} \\
&=c \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}\begin{bmatrix} x-x_0 \\ y-y_0 \\ z-z_0 \end{bmatrix}
\end{align}
$$
with:
* $u$, $v$ the position of the animal on the camera
* $x$, $y$, $z$ the position of the animal in the environment
* $u_0$, $v_0$, the principal point (the center of the image)
* $x_0$, $y_0$, $z_0$ the center of the environment (origin)
* $d$, the principal distance (the distance between the camera's projection center and the image plane)
* $c$, a constant of collinearity
**Note** $u$, $v$, $u_0$, and $v_0$ have the same units as $x$, $y$, $z$. However, $u$ and $v$ will usually be measured in pixels, and thus we need to further transform $u$ and $v$ by constants of proportionality: $u-u_0 \Rightarrow \lambda_u(u-u_0)$, and $v-v_0 \Rightarrow \lambda_v(v-v_0)$.
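To make the projection concrete, here is a minimal numpy sketch of the equation above. All numerical values (the rotation angle, camera centre, and world point) are made-up illustration values, not values from the text:

```python
import numpy as np

# Illustration values only (not from the text): a camera rotated 30 degrees
# around the z axis, an arbitrary world origin and camera centre.
theta = np.deg2rad(30)
T = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
x0, y0, z0 = 0.0, 0.0, 0.0   # origin of the environment
u0, v0 = 320.0, 240.0        # centre of the camera
c = 1.0                      # constant of collinearity

xyz = np.array([1.0, 2.0, 5.0])           # point in the world
rhs = c * T @ (xyz - np.array([x0, y0, z0]))

u = u0 + rhs[0]              # position on the camera
v = v0 + rhs[1]
d = -rhs[2]                  # distance term from the third row
print(u, v, d)
```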
#### Calibration
We ultimately want to have the position of the animal within its environment from two or more cameras. To be able to reconstruct (triangulate) the animal's position, we first need a method to describe the transformation from camera to world and vice versa, i.e. to obtain the parameters of the transformation from known points. This process is called camera calibration.
We need to express $u$ and $v$ as functions of $x$, $y$, and $z$:
$$
u = \frac{L_1x+L_2y+L_3z+L_4}{L_9x+L_{10}y+L_{11}z+1}\text{, and }
v = \frac{L_5x+L_6y+L_7z+L_8}{L_9x+L_{10}y+L_{11}z+1}
$$
The coefficients $L_1$ to $L_{11}$ are the DLT parameters that reflect the relationship between the environment frame of our animal and the image frame.
The system of equations above can be rewritten as a product of matrices:
$$
\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x & y & z & 1 & 0 & 0 & 0 & 0 & -ux & -uy & -uz \\ 0 & 0 & 0 & 0 & x & y & z & 1 & -vx & -vy & -vz \end{bmatrix}\begin{bmatrix} L_1 \\ L_2 \\ . \\ L_{10} \\ L_{11} \end{bmatrix}
$$
We can expand this equation for $n$ points visible on the camera:
$$
\begin{bmatrix}
u_1 \\ v_1 \\
u_2 \\ v_2 \\
... \\
u_{n-1} \\ v_{n-1} \\
u_n \\ v_n \\
\end{bmatrix} =
\begin{bmatrix}
x_1 & y_1 & z_1 & 1 & 0 & 0 & 0 & 0 & -u_1x_1 & -u_1y_1 & -u_1z_1 \\
0 & 0 & 0 & 0 & x_1 & y_1 & z_1 & 1 & -v_1x_1 & -v_1y_1 & -v_1z_1 \\
x_2 & y_2 & z_2 & 1 & 0 & 0 & 0 & 0 & -u_2x_2 & -u_2y_2 & -u_2z_2 \\
0 & 0 & 0 & 0 & x_2 & y_2 & z_2 & 1 & -v_2x_2 & -v_2y_2 & -v_2z_2 \\
. \\
x_{n-1} & y_{n-1} & z_{n-1} & 1 & 0 & 0 & 0 & 0 & -u_{n-1}x_{n-1} & -u_{n-1}y_{n-1} & -u_{n-1}z_{n-1} \\
0 & 0 & 0 & 0 & x_{n-1} & y_{n-1} & z_{n-1} & 1 & -v_{n-1}x_{n-1} & -v_{n-1}y_{n-1} & -v_{n-1}z_{n-1} \\
x_n & y_n & z_n & 1 & 0 & 0 & 0 & 0 & -u_nx_n & -u_ny_n & -u_nz_n \\
0 & 0 & 0 & 0 & x_n & y_n & z_n & 1 & -v_nx_n & -v_ny_n & -v_nz_n \end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ . \\ L_{10} \\ L_{11} \end{bmatrix}
$$
With 11 or more equations we can determine the DLT parameters from this system. Since each point in the environment yields two equations (one for $u$ and one for $v$), we only need 6 points to obtain the 11 DLT parameters.
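A minimal sketch of this least-squares solution (the function name and array layout are illustrative):

```python
import numpy as np

def dlt_calibrate(xyz, uv):
    """Estimate the 11 DLT parameters from n >= 6 correspondences.

    xyz : (n, 3) array of world coordinates
    uv  : (n, 2) array of the matching camera coordinates
    """
    n = xyz.shape[0]
    matrix = np.zeros((2 * n, 11))
    for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
        matrix[2 * i] = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]
        matrix[2 * i + 1] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]
    # right-hand side: [u_1, v_1, ..., u_n, v_n]
    coeff, *_ = np.linalg.lstsq(matrix, uv.reshape(-1), rcond=None)
    return coeff
```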
#### Reconstruction
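Once each camera is calibrated, the two equations above can be rearranged into linear equations in $x$, $y$, $z$: $(L_1-uL_9)x + (L_2-uL_{10})y + (L_3-uL_{11})z = u - L_4$, and similarly for $v$. With $m \geq 2$ cameras this gives $2m$ equations for 3 unknowns, solvable in the least-squares sense. A minimal sketch (the function name and argument layout are illustrative, not the library's API):

```python
import numpy as np

def dlt_reconstruct(coeffs, uvs):
    """Triangulate one 3D point seen by m >= 2 calibrated cameras.

    coeffs : list of m arrays, each holding the 11 DLT parameters of one camera
    uvs    : list of m (u, v) pixel coordinates of the same physical point
    """
    m = len(coeffs)
    A = np.zeros((2 * m, 3))
    b = np.zeros(2 * m)
    for i, (L, (u, v)) in enumerate(zip(coeffs, uvs)):
        # u*(L9 x + L10 y + L11 z + 1) = L1 x + L2 y + L3 z + L4, idem for v
        A[2 * i] = [L[0] - u * L[8], L[1] - u * L[9], L[2] - u * L[10]]
        b[2 * i] = u - L[3]
        A[2 * i + 1] = [L[4] - v * L[8], L[5] - v * L[9], L[6] - v * L[10]]
        b[2 * i + 1] = v - L[7]
    xyz = np.linalg.lstsq(A, b, rcond=None)[0]
    return xyz
```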
#### Exercises
* Derive $L_1$ to $L_{11}$ from $r_{11}$ to $r_{33}$, $d$, and $c$.
#### 16 DLT parameters
The DLT method exposed above corrects neither for optical distortion nor for de-centering distortion. Both can be accounted for with five additional parameters, $L_{12}$ to $L_{16}$:
$$
\frac{1}{R}\begin{bmatrix} u \\ v \end{bmatrix} =
\frac{1}{R}\begin{bmatrix}
x & y & z & 1 & 0 & 0 & 0 & 0 & -ux & -uy & -uz & \zeta r^2R & \zeta r^4R & \zeta r^6R & (r^2+2\zeta^2)R & \zeta\eta R \\
0 & 0 & 0 & 0 & x & y & z & 1 & -vx & -vy & -vz & \eta r^2R & \eta r^4R & \eta r^6R & \zeta\eta R & (r^2+2\eta^2)R\\ \end{bmatrix}\begin{bmatrix} L_1 \\ L_2 \\ . \\ L_{15} \\ L_{16} \end{bmatrix}
$$
with
* $R=L_9x + L_{10}y + L_{11}z +1$,
* $r^2 = \eta^2 + \zeta^2$,
* $\zeta = u-u_0$, and
* $\eta = v-v_0$
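In this model $L_{12}$ to $L_{14}$ describe radial distortion and $L_{15}$, $L_{16}$ de-centering distortion. A minimal sketch of the resulting correction to an undistorted projection (the function name is illustrative):

```python
import numpy as np

def dlt_distortion(coeff, u, v, u_0, v_0):
    """Distortion offsets (delta_u, delta_v) for the 16-parameter model."""
    zeta = u - u_0
    eta = v - v_0
    rsq = zeta**2 + eta**2
    # radial terms (L12..L14), shared by both image axes
    radial = coeff[11]*rsq + coeff[12]*rsq**2 + coeff[13]*rsq**3
    # de-centering terms (L15, L16)
    delta_u = zeta*radial + coeff[14]*(rsq + 2*zeta**2) + coeff[15]*zeta*eta
    delta_v = eta*radial + coeff[15]*(rsq + 2*eta**2) + coeff[14]*zeta*eta
    return delta_u, delta_v
```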
%% Cell type:markdown id: tags:
### Alternative method
A camera is an optical instrument for acquiring images. Cameras are composed of a sensor (converting photons to electrical signals) and a lens (focussing light rays on the sensor).
Many experimental paradigms require the recording of animal motion with one or more cameras.
The experimenter is more interested in the position of the animal in his or her arena (world coordinate system) than in the position of the animal within the image (camera coordinate system). Therefore, we need to transform the position of the animal in the image to its position in the world. To do so, we need to calibrate our camera. This calibration requires two steps:
* remove distortion from the lens (intrinsic calibration)
* determine the position and orientation (i.e. pose) of the camera in the environment (extrinsic calibration)
#### Intrinsic calibration
* [With Matlab](https://de.mathworks.com/help/vision/ug/single-camera-calibrator-app.html)
* [With OpenCV](https://docs.opencv.org/3.3.1/dc/dbb/tutorial_py_calibration.html)
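As a rough sketch of the procedure described in these tutorials, using OpenCV with an assumed 9x6 checkerboard and a placeholder image path:

```python
import glob
import cv2
import numpy as np

# 3D coordinates of the checkerboard corners in the board's own plane (z = 0)
pattern = (9, 6)  # inner corners of the assumed checkerboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

objpoints, imgpoints = [], []
for fname in glob.glob('calibration_images/*.png'):  # placeholder path
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# camera matrix and distortion coefficients (the intrinsic parameters)
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)
```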
#### Extrinsic calibration
To obtain the pose of the camera (i.e. the extrinsic parameters), we need to find a transformation such that the projections of physical points match the point positions observed in the image. In other words, a ray emanating from a projected point should cross the corresponding physical point in 3D.
The camera pose is estimated by solving the PnP problem with RANSAC. Using RANSAC decreases the adverse effect that outliers may have on the pose estimation.
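A minimal OpenCV sketch, assuming `object_points` and `image_points` are matched 3D/2D `float32` arrays and `mtx`, `dist` come from the intrinsic calibration above:

```python
import cv2
import numpy as np

# object_points: (n, 3) float32 array of 3D points in the world
# image_points : (n, 2) float32 array of their projections in the image
success, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, mtx, dist)

# rvec/tvec map world coordinates into the camera frame;
# invert them to obtain the camera pose in the world.
R, _ = cv2.Rodrigues(rvec)
camera_position = -R.T @ tvec
```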
#### Building your own manhattan
The manhattan should contain at least 7 towers, and the towers should not all share the same plane or line. For best results it is recommended that the towers are visible across the entire recorded range of the image. Note, however, that not all towers need to be visible (at least 7 do, though).
......
......@@ -101,6 +101,53 @@ units are [u,v] i.e. camera coordinates or pixels
    return xyz, rmse
def dlt_principal_point(coeff):
    # the principal point (u_0, v_0) derived from the 11 DLT parameters
    normalisation = np.sum(coeff[8:11]**2)
    u_0 = np.sum(coeff[0:3]*coeff[8:11])/normalisation
    v_0 = np.sum(coeff[4:7]*coeff[8:11])/normalisation
    return u_0, v_0
def dlt_principal_distance(coeff):
    # the principal distance d derived from the 11 DLT parameters
    normalisation = np.sum(coeff[8:11]**2)
    return 1/np.sqrt(normalisation)
def dlt_scale_factors(coeff):
    # the scale factors (d_u, d_v) along the image axes
    u_0, v_0 = dlt_principal_point(coeff)
    normalisation = np.sum(coeff[8:11]**2)
    du = (u_0*coeff[8] - coeff[0])**2
    du += (u_0*coeff[9] - coeff[1])**2
    du += (u_0*coeff[10] - coeff[2])**2
    du /= normalisation
    dv = (v_0*coeff[8] - coeff[4])**2
    dv += (v_0*coeff[9] - coeff[5])**2
    dv += (v_0*coeff[10] - coeff[6])**2
    dv /= normalisation
    return du, dv
def dlt_transformation_matrix(coeff):
    # the transformation matrix T rebuilt from the DLT parameters
    u_0, v_0 = dlt_principal_point(coeff)
    d = dlt_principal_distance(coeff)
    du, dv = dlt_scale_factors(coeff)
    transform = np.zeros((3, 3))
    transform[0, 0] = (u_0*coeff[8]-coeff[0])/du
    transform[0, 1] = (u_0*coeff[9]-coeff[1])/du
    transform[0, 2] = (u_0*coeff[10]-coeff[2])/du
    transform[1, 0] = (v_0*coeff[8]-coeff[4])/dv
    transform[1, 1] = (v_0*coeff[9]-coeff[5])/dv
    transform[1, 2] = (v_0*coeff[10]-coeff[6])/dv
    transform[2, 0] = coeff[8]
    transform[2, 1] = coeff[9]
    transform[2, 2] = coeff[10]
    return d*transform
def dlt_inverse(coeff, frames):
"""
This function reconstructs the pixel coordinates of a 3D coordinate as
......@@ -123,10 +170,68 @@ def dlt_inverse(coeff, frames):
        coeff[5]+frames[:, 2]*coeff[6]+coeff[7]
    uv[:, 0] /= normalisation
    uv[:, 1] /= normalisation
    # Apply distortion corrections for models with more than 11 parameters
    delta_uv = np.zeros((frames.shape[0], 2))
    u_0, v_0 = dlt_principal_point(coeff)
    zeta = uv[:, 0] - u_0
    eta = uv[:, 1] - v_0
    rsq = zeta**2 + eta**2
    if coeff.shape[0] > 11:
        # 12th parameter: second-order radial distortion
        delta_uv[:, 0] += zeta*(coeff[11]*rsq)
        delta_uv[:, 1] += eta*(coeff[11]*rsq)
    if coeff.shape[0] > 13:
        # 13th and 14th parameters: higher-order radial distortion
        delta_uv[:, 0] += zeta*(coeff[12]*(rsq**2) + coeff[13]*(rsq**3))
        delta_uv[:, 1] += eta*(coeff[12]*(rsq**2) + coeff[13]*(rsq**3))
    if coeff.shape[0] > 15:
        # 15th and 16th parameters: de-centering distortion
        delta_uv[:, 0] += coeff[14]*(rsq + 2*(zeta**2)) + coeff[15]*eta*zeta
        delta_uv[:, 1] += coeff[15]*(rsq + 2*(eta**2)) + coeff[14]*eta*zeta
    uv += delta_uv
    return uv
def _dlt_matrices_calib(vframes, vcampts, nparams=11, l1to11=np.zeros(11)):
    # rearrange the frame matrix to facilitate the linear least
    # squares solution
    if nparams < 11:
        nparams = 11
    matrix = np.zeros((vframes.shape[0]*2, 16))  # 16 for the dlt
    u_0, v_0 = dlt_principal_point(l1to11)
    for num_i, index_i in enumerate(vframes.index):
        # eta, zeta, and R depend on L9 to L11
        zeta = vcampts.loc[index_i, 'u'] - u_0
        eta = vcampts.loc[index_i, 'v'] - v_0
        R = np.sum(l1to11[-3:] * vframes.loc[index_i, ['x', 'y', 'z']]) + 1
        rsq = eta**2 + zeta**2
        # populate matrix
        matrix[2*num_i, 0:3] = vframes.loc[index_i, ['x', 'y', 'z']]
        matrix[2*num_i+1, 4:7] = vframes.loc[index_i, ['x', 'y', 'z']]
        matrix[2*num_i, 3] = 1
        matrix[2*num_i+1, 7] = 1
        matrix[2*num_i, 8:11] = vframes.loc[index_i, ['x', 'y', 'z']] \
            * (-vcampts.loc[index_i, 'u'])
        matrix[2*num_i+1, 8:11] = vframes.loc[index_i, ['x', 'y', 'z']] \
            * (-vcampts.loc[index_i, 'v'])
        # 12th parameter (second-order radial distortion)
        matrix[2*num_i, 11] = zeta*rsq*R
        matrix[2*num_i+1, 11] = eta*rsq*R
        # 13th and 14th parameters (higher-order radial distortion)
        matrix[2*num_i, 12] = zeta*(rsq**2)*R
        matrix[2*num_i+1, 12] = eta*(rsq**2)*R
        matrix[2*num_i, 13] = zeta*(rsq**3)*R
        matrix[2*num_i+1, 13] = eta*(rsq**3)*R
        # 15th and 16th parameters (de-centering distortion)
        matrix[2*num_i, 14] = (rsq + 2*(zeta**2))*R
        matrix[2*num_i+1, 14] = eta*zeta*R
        matrix[2*num_i, 15] = eta*zeta*R
        matrix[2*num_i+1, 15] = (rsq + 2*(eta**2))*R
        matrix[2*num_i, :] /= R
        matrix[2*num_i+1, :] /= R
    return matrix[:, :nparams]
def dlt_compute_coeffs(frames, campts, nparams=11, niter=100):
    """
    A basic implementation of the DLT with 11 to 16 parameters
......@@ -142,36 +247,44 @@ single plane.
    # remove NaNs
    valid_idx = frames.dropna(how='any').index
    valid_idx = campts.loc[valid_idx, :].dropna(how='any').index
    # valid df
    vframes = frames.loc[valid_idx, :]
    vcampts = campts.loc[valid_idx, :]
    # Get the matrices for calib
    matrix = _dlt_matrices_calib(vframes, vcampts)
    vcampts_f = vcampts.values.flatten()  # [u_1, v_1, ..., u_n, v_n]
    # get the linear solution for the 11 parameters
    coeff = np.linalg.lstsq(matrix, vcampts_f, rcond=None)[0]
    # compute the position of the frame in u,v coordinates given the
    # linear solution from the previous line
    matrix_uv = dlt_inverse(coeff, vframes)
    # compute the rmse between the ideal frame u,v and the
    # recorded frame u,v
    rmse = np.sqrt(np.mean(np.sum((matrix_uv-vcampts)**2)))
    if nparams == 11:
        return coeff, rmse
    # Now we can try to estimate the remaining coefficients iteratively
    if nparams in [12, 14, 16]:
        for _ in range(niter):
            # the 9th to 11th parameters are at index 8 to 10
            # (index 0 being the 1st parameter)
            l1to11 = coeff[:11]
            # Get the matrices for calib
            matrix = _dlt_matrices_calib(vframes, vcampts,
                                         nparams=nparams, l1to11=l1to11)
            # normalise each observation by R = L9 x + L10 y + L11 z + 1
            vcampts_normed = vcampts_f.copy()
            for num_i, index_i in enumerate(vframes.index):
                normalisation = np.sum(
                    l1to11[-3:] * vframes.loc[index_i, ['x', 'y', 'z']]) + 1
                vcampts_normed[2*num_i] /= normalisation
                vcampts_normed[2*num_i + 1] /= normalisation
            coeff = np.linalg.lstsq(matrix, vcampts_normed, rcond=None)[0]
        # compute the position of the frame in u,v coordinates given the
        # linear solution from the previous lines
        matrix_uv = dlt_inverse(coeff, vframes)
        # compute the rmse between the ideal frame u,v and the
        # recorded frame u,v
        rmse = np.sqrt(np.mean(np.sum((matrix_uv-vcampts)**2)))
        return coeff, rmse
    else:
        raise ValueError('nparams must be one of [11, 12, 14, 16]')
......@@ -5,6 +5,10 @@ import cv2
import os
from navipy.arenatools.cam_dlt import dlt_inverse
from navipy.arenatools.cam_dlt import dlt_compute_coeffs
from navipy.arenatools.cam_dlt import dlt_principal_point
from navipy.arenatools.cam_dlt import dlt_principal_distance
from navipy.arenatools.cam_dlt import dlt_scale_factors
from navipy.arenatools.cam_dlt import dlt_transformation_matrix
keybinding = dict()
......@@ -42,6 +46,16 @@ def parser_dlt_calibrator():
    parser.add_argument('-s', '--scale',
                        default=0.5,
                        help=arghelp)
    arghelp = 'number of dlt parameters (coeff)'
    parser.add_argument('-c', '--ndltcoeff',
                        type=int,
                        default=11,
                        help=arghelp)
    arghelp = 'number of iterations for the calibration'
    parser.add_argument('-e', '--epoque',
                        type=int,
                        default=100,
                        help=arghelp)
    return parser
......@@ -71,6 +85,8 @@ def main():
    # The image may need to be scaled because the screen may have
    # fewer pixels than the image
    scale = args['scale']
    ndlt_coeff = args['ndltcoeff']
    niter = args['epoque']
    imageref = cv2.imread(args['image'])
    # define some constants
    showframe_ref = 50
......@@ -144,9 +160,17 @@ def main():
            break
        if key == ord("c"):
            print('calibrate')
            coeff, rmse = dlt_compute_coeffs(
                frames, campts, nparams=ndlt_coeff,
                niter=niter)
            print(rmse)
            print(coeff)
            print('principal point: {}'.format(dlt_principal_point(coeff)))
            print('principal distance: {}'.format(
                dlt_principal_distance(coeff)))
            print('scale factors: {}'.format(dlt_scale_factors(coeff)))
            print('transform:')
            print(dlt_transformation_matrix(coeff))
            coeff = pd.Series(data=coeff)
            coeff.to_csv(os.path.splitext(args['points'])[0]+'_coeff.csv')
......