Hand-Eye Calibration Problem

Frame Definitions and Observed Points

In the hand-eye calibration problem, we define the following coordinate frames:

  • Base Frame : The fixed coordinate frame of the robot arm, denoted as $ \mathbf{B} $
  • End-Effector Frame : The frame attached to the robot's end-effector, denoted as $ \mathbf{E} $
  • Camera Frame : The coordinate frame of the camera, which is rigidly mounted to the end-effector, denoted as $ \mathbf{C} $
  • World Frame : A global reference frame in which known points are defined, denoted as $ \mathbf{W} $

We assume that four known points are given in the world frame:

$$ p_W = \begin{bmatrix} -1 & -1 & 0 \\ -1 & 1 & 0\\ 1 & -1 & 0\\ 1 & 1 & 0 \end{bmatrix} $$

where each row represents the $(x, y, z)$ coordinates of a point in the world frame.

These points serve as reference points that can be used to estimate the transformation between different coordinate frames.

Hand-Eye Calibration Goal

In the hand-eye calibration problem, our goal is to estimate the transformation between the camera and the end-effector, as well as the camera's intrinsic parameters.

We define the transformation from the camera frame $\mathbf{C}$ to the end-effector frame $\mathbf{E}$ as: $\mathbf{T}_{EC} \in SE(3)$. This transformation allows us to convert a point $\mathbf{p}_C$ in the camera frame to the end-effector frame using: $\mathbf{p}_E = \mathbf{T}_{EC} \mathbf{p}_C$

In addition to the hand-eye transformation, we may also need to estimate the camera intrinsics, which describe how 3D points in the camera frame are projected onto the image plane. The camera intrinsic matrix is given by $ \mathbf{K} = \begin{bmatrix} f_x & \gamma & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$,

where:

  • $f_x, f_y$ are the focal lengths in pixels,
  • $c_x, c_y$ are the principal point coordinates,
  • $\gamma$ is the skew parameter.

Note:

  • In this tutorial, we assume there is no tangential or radial distortion.
  • See Optimal Hand-Eye Calibration by Klaus H. Strobl and Gerd Hirzinger for more details.

Install packages (on Google Colab) and import packages

In [14]:
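The install/import cells are not preserved in this export. A minimal set of imports sufficient for the illustrative sketches in this post (an assumption, not the notebook's original cell) might be:

```python
# Minimal imports assumed by the code sketches below.
import numpy as np
from scipy.spatial.transform import Rotation  # used later only to fabricate a test rotation

np.set_printoptions(precision=4, suppress=True)
```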

Steps to Solve the Hand-Eye Calibration Problem

References:

  • Robot Sensor Calibration: Solving AX = XB on the Euclidean Group, Frank C. Park and Bryan J. Martin
  • Optimal Hand-Eye Calibration, Klaus H. Strobl and Gerd Hirzinger
  • A flexible new technique for camera calibration, Zhengyou Zhang

Simulated Problem Definition

To validate the hand-eye calibration method, we define a simulated environment where a camera observes known 3D points in the world. This setup allows us to generate synthetic camera poses and project world points onto the image plane.

Defining 3D Points in the World Frame

We define four 3D points in the world frame $\mathbf{W}$ that lie on the $z=0$ plane:

$$ \mathbf{p}_W = \begin{bmatrix} -1 & -1 & 0 \\ 1 & -1 & 0 \\ 1 & 1 & 0 \\ -1 & 1 & 0 \end{bmatrix} $$

To work with homogeneous coordinates, we transpose the points into columns and append a row of ones:

$$ \tilde{\mathbf{p}}_W = \begin{bmatrix} -1 & 1 & 1 & -1 \\ -1 & -1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix} $$
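As an illustration (not the original notebook cell), the points and their homogeneous form can be constructed with NumPy:

```python
# World points on the z = 0 plane, one point per row.
p_W = np.array([
    [-1.0, -1.0, 0.0],
    [ 1.0, -1.0, 0.0],
    [ 1.0,  1.0, 0.0],
    [-1.0,  1.0, 0.0],
])

# Homogeneous form: points become columns, with a row of ones appended.
p_W_h = np.vstack([p_W.T, np.ones((1, p_W.shape[0]))])
```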

Camera Intrinsic Matrix

The camera intrinsics matrix $\mathbf{K}$ models how 3D points are projected onto the image plane:

$$ \mathbf{K} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} $$

where:

  • $f_x = 800, f_y = 800$ are the focal lengths (in pixels),
  • $c_x = 320, c_y = 240$ are the principal point coordinates,
  • $\gamma = 0$ in this case (no skew).

Generating Camera Poses

To simulate multiple views, we generate camera poses in spherical coordinates, where:

  • The camera moves along a sphere of varying radii ($r \in [10, 20]$),
  • The azimuth angle varies from $0^\circ$ to $360^\circ$,
  • The elevation angle varies from $30^\circ$ to $60^\circ$.

For a given azimuth $\theta$ and elevation $\phi$, the camera position in Cartesian coordinates is:

$$ \mathbf{t} = \begin{bmatrix} r \cos\phi \cos\theta \\ r \cos\phi \sin\theta \\ r \sin\phi \end{bmatrix} $$

The camera orientation is determined using a look-at transformation, where:

  • The forward vector points from the camera to the world origin,
  • The right vector is computed as the cross-product of the world $z$-axis and the forward vector,
  • The up vector is computed as the cross-product of the forward and right vectors.

The rotation matrix $\mathbf{R}_{WC}$ is then constructed as:

$$ \mathbf{R}_{WC} = \begin{bmatrix} \mathbf{r} & \mathbf{u} & \mathbf{f} \end{bmatrix} $$

where $\mathbf{r}, \mathbf{u}, \mathbf{f}$ are the right, up, and forward unit vectors, respectively.

The final transformation from the world to the camera frame is:

$$ \mathbf{T}_{WC} = \begin{bmatrix} \mathbf{R}_{WC} & \mathbf{t} \\ \mathbf{0}^\top & 1 \end{bmatrix} $$

This process generates a set of camera poses that will be used to simulate observations and test the hand-eye calibration process.

In [101]:
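The original cell is omitted in this export. A minimal sketch of the pose generation described above, using a hypothetical `look_at` helper (the name and sampling details are assumptions, not the notebook's code):

```python
def look_at(camera_pos, target=np.zeros(3), world_up=np.array([0.0, 0.0, 1.0])):
    """Build T_WC for a camera at camera_pos looking at target (hypothetical helper)."""
    forward = target - camera_pos
    forward = forward / np.linalg.norm(forward)
    right = np.cross(world_up, forward)   # assumes forward is not parallel to world_up
    right = right / np.linalg.norm(right)
    up = np.cross(forward, right)
    T_WC = np.eye(4)
    T_WC[:3, :3] = np.column_stack([right, up, forward])  # R_WC = [r u f]
    T_WC[:3, 3] = camera_pos
    return T_WC

# Sample camera positions on spheres: r in [10, 20], azimuth 0..360 deg, elevation 30..60 deg.
rng = np.random.default_rng(0)
poses = []
for _ in range(32):
    r = rng.uniform(10.0, 20.0)
    theta = rng.uniform(0.0, 2.0 * np.pi)              # azimuth
    phi = rng.uniform(np.radians(30), np.radians(60))  # elevation
    t = r * np.array([np.cos(phi) * np.cos(theta),
                      np.cos(phi) * np.sin(theta),
                      np.sin(phi)])
    poses.append(look_at(t))
```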

Camera Intrinsic and Extrinsic Estimation

To estimate both the camera intrinsics and extrinsics, we apply Zhang's method [A flexible new technique for camera calibration], which involves the following steps:

  1. Estimating Homographies: We compute the transformation between the world plane and the image plane for multiple camera views.
  2. Calibrating Camera Intrinsics: Using the homographies, we solve for the camera intrinsic matrix $\mathbf{K}$.
  3. Estimating Camera Extrinsics: Given $\mathbf{K}$ and the homographies, we recover the camera pose for each view.

Homography Estimation

A homography is a $3 \times 3$ matrix $\mathbf{H}$ that describes the transformation between two planes in projective space. In the context of camera calibration, it maps 3D world points that lie on a known plane (e.g., the $Z = 0$ plane in our simulated problem) to their corresponding 2D image points:

$$ \tilde{\mathbf{p}}_I = \mathbf{H} \tilde{\mathbf{p}}_W $$

where:

  • $\tilde{\mathbf{p}}_W = (x, y, 1)^\top$ is the homogeneous coordinate of a point in the world frame.
  • $\tilde{\mathbf{p}}_I = (u, v, 1)^\top$ is the corresponding image point in homogeneous coordinates.
  • $\mathbf{H}$ is the $3 \times 3$ homography matrix.

Since all world points lie on the $Z=0$ plane, their 3D coordinates simplify to:

$$ \mathbf{p}_W = (x, y, 0)^\top $$

The camera projection equation is given by:

$$ \tilde{\mathbf{p}}_I = \mathbf{K} [\mathbf{R} \ | \ \mathbf{t}] \tilde{\mathbf{p}}_W $$

where:

  • $\mathbf{K}$ is the intrinsic matrix, which defines how 3D points are projected onto the image plane.
  • $[\mathbf{R} \ | \ \mathbf{t}]$ is the extrinsic transformation, with $\mathbf{R} \in SO(3)$ as the rotation matrix and $\mathbf{t} \in \mathbb{R}^3$ as the translation vector.
  • Since the world points lie on the $Z=0$ plane, the third column of $\mathbf{R}$ is irrelevant in this mapping.

Thus, keeping only the first two columns of $\mathbf{R}$, we can rewrite the projection as a homography that holds up to scale:

$$ \mathbf{H} = \mathbf{K} [\mathbf{r}_1 \quad \mathbf{r}_2 \quad \mathbf{t}] $$

where:

  • $\mathbf{r}_1$ and $\mathbf{r}_2$ are the first two columns of the rotation matrix $\mathbf{R}$.
  • $\mathbf{t}$ is the translation vector.
  • The homography $\mathbf{H}$ encodes both the intrinsic ($\mathbf{K}$) and extrinsic ($\mathbf{R}, \mathbf{t}$) parameters of the camera.

Normalization

To improve numerical stability, we normalize the points before estimating the homography so that they are centered at the origin with unit mean distance (i.e., they lie roughly on a unit circle). This ensures that the coordinate values are well-conditioned for numerical computations.

Given a set of points $\mathbf{p}$, we compute a normalization transformation matrix $\mathbf{T}$:

$$ \mathbf{T} = \begin{bmatrix} s & 0 & -s \bar{x} \\ 0 & s & -s \bar{y} \\ 0 & 0 & 1 \end{bmatrix} $$

where $(\bar{x}, \bar{y})$ is the centroid of the points and $s$ is the scale factor (e.g., the inverse of the mean distance from the centroid).

Before estimating the homography, we apply this normalization process separately to the world points and image points , yielding two transformation matrices:

The transformation matrix $\mathbf{T}_{\text{world}}$ normalizes the 2D world points $(x, y)$:

$$ \mathbf{T}_{\text{world}} = \begin{bmatrix} s_W & 0 & -s_W \bar{x}_W \\ 0 & s_W & -s_W \bar{y}_W \\ 0 & 0 & 1 \end{bmatrix} $$

Applying this transformation:

$$ \tilde{\mathbf{p}}_W' = \mathbf{T}_{\text{world}} \tilde{\mathbf{p}}_W $$

Similarly, the transformation matrix $\mathbf{T}_{\text{pixel}}$ normalizes the 2D image points $(u, v)$:

$$ \mathbf{T}_{\text{pixel}} = \begin{bmatrix} s_I & 0 & -s_I \bar{u} \\ 0 & s_I & -s_I \bar{v} \\ 0 & 0 & 1 \end{bmatrix} $$

Applying this transformation:

$$ \tilde{\mathbf{p}}_I' = \mathbf{T}_{\text{pixel}} \tilde{\mathbf{p}}_I $$

After applying these transformations, the normalized world points $\tilde{\mathbf{p}}_W'$ and image points $\tilde{\mathbf{p}}_I'$ are used to compute the homography $\mathbf{H}_{\text{norm}}$. The final homography is then denormalized:

$$ \mathbf{H} = \mathbf{T}_{\text{pixel}}^{-1} \mathbf{H}_{\text{norm}} \mathbf{T}_{\text{world}} $$

This process ensures better numerical stability when solving for the homography using singular value decomposition (SVD).

In [102]:
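A sketch of this normalization, assuming the unit-mean-distance variant described above (Hartley's classic formulation scales the mean distance to $\sqrt{2}$ instead):

```python
def normalization_matrix(pts):
    """Similarity transform that centers 2D points at the origin and scales
    their mean distance from it to 1 (sketch)."""
    centroid = pts.mean(axis=0)
    mean_dist = np.linalg.norm(pts - centroid, axis=1).mean()
    s = 1.0 / mean_dist
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])
```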

DLT for homography estimation

Using normalized world points $\mathbf{p}_W$ and image points $\mathbf{p}_I$, we construct a direct linear transformation (DLT) system:

$$ \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$

$$ u = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}} $$

$$ v = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}} $$

$$ h_{11} x + h_{12} y + h_{13} - u h_{31} x - u h_{32} y - u h_{33} = 0 $$

$$ h_{21} x + h_{22} y + h_{23} - v h_{31} x - v h_{32} y - v h_{33} = 0 $$

Each point correspondence $(x, y) \rightarrow (u, v)$ contributes two equations. To solve for all elements of $\mathbf{H}$, we stack these equations for multiple points into a linear system:

$$ A \mathbf{h} = 0 $$

where $\mathbf{h}$ is the 9-element vector containing the entries of $\mathbf{H}$ in row-major order.

The system matrix $A$ for each point correspondence $(x, y) \rightarrow (u, v)$ is:

$$ A = \begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -u x & -u y & -u \\ 0 & 0 & 0 & x & y & 1 & -v x & -v y & -v \end{bmatrix} $$

By stacking these equations for multiple points and solving the homogeneous linear system using singular value decomposition (SVD), we obtain the best estimate of $\mathbf{H}$.

Note: Since $\mathbf{H}$ has 8 DoF (it is defined only up to scale), at least 4 points are required, as each point provides 2 constraints. Moreover, no three of the points may be collinear; otherwise the constraints become linearly dependent and the system is underdetermined.

In [103]:
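A compact DLT sketch following the equations above (it reuses the hypothetical `normalization_matrix` helper from the previous section):

```python
def estimate_homography(p_world_xy, p_image_uv):
    """DLT homography from 2D world points (z = 0 plane) to image points (sketch)."""
    n = p_world_xy.shape[0]
    T_w = normalization_matrix(p_world_xy)
    T_i = normalization_matrix(p_image_uv)
    pw = (T_w @ np.vstack([p_world_xy.T, np.ones(n)])).T  # normalized world points
    pi = (T_i @ np.vstack([p_image_uv.T, np.ones(n)])).T  # normalized image points
    A = []
    for (x, y, _), (u, v, _) in zip(pw, pi):
        A.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, -u])
        A.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H_norm = Vt[-1].reshape(3, 3)          # right singular vector of smallest singular value
    H = np.linalg.inv(T_i) @ H_norm @ T_w  # denormalize
    return H / H[2, 2]
```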

Camera Intrinsic Calibration

Once multiple homographies $\mathbf{H}$ are computed, we estimate the camera intrinsic matrix $\mathbf{K}$.

The homography relates to intrinsics and extrinsics:

$$ \mathbf{H} = \mathbf{K} [ \mathbf{r}_1 \quad \mathbf{r}_2 \quad \mathbf{t} ] $$

where $\mathbf{r}_1, \mathbf{r}_2$ are the first two columns of the rotation matrix, and $\mathbf{t}$ is the translation vector.


- Constraint 1: Orthogonality of $\mathbf{r}_1$ and $\mathbf{r}_2$
The rotation columns must be orthogonal, meaning:

$$ \mathbf{r}_1^\top \mathbf{r}_2 = 0 $$

Expanding in terms of $\mathbf{H}$:

$$ (\mathbf{K}^{-1} \mathbf{h}_1)^\top (\mathbf{K}^{-1} \mathbf{h}_2) = 0 $$

Rearranging:

$$ \mathbf{h}_1^\top \mathbf{K}^{-\top} \mathbf{K}^{-1} \mathbf{h}_2 = 0 $$

- Constraint 2: Equality of Norms of $\mathbf{r}_1$ and $\mathbf{r}_2$
Since the columns of a rotation matrix have equal (unit) norms:

$$ \mathbf{r}_1^\top \mathbf{r}_1 = \mathbf{r}_2^\top \mathbf{r}_2 $$

Expanding in terms of $\mathbf{H}$:

$$ \mathbf{h}_1^\top \mathbf{K}^{-\top} \mathbf{K}^{-1} \mathbf{h}_1 = \mathbf{h}_2^\top \mathbf{K}^{-\top} \mathbf{K}^{-1} \mathbf{h}_2 $$

These two constraints hold for each homography and will allow us to estimate $\mathbf{K}$.


To simplify notation, we define:

$$ \mathbf{B} = \mathbf{K}^{-\top} \mathbf{K}^{-1} $$

Since $\mathbf{B}$ is symmetric ($\mathbf{B} = \mathbf{B}^\top$), it has the form:

$$ \mathbf{B} = \begin{bmatrix} B_{00} & B_{01} & B_{02} \\ B_{01} & B_{11} & B_{12} \\ B_{02} & B_{12} & B_{22} \end{bmatrix} $$

Substituting $\mathbf{B}$ into our constraints:

  1. Orthogonality Condition:

    $$ \mathbf{h}_1^\top \mathbf{B} \mathbf{h}_2 = 0 $$

  2. Equality of Norms Condition:

    $$ \mathbf{h}_1^\top \mathbf{B} \mathbf{h}_1 = \mathbf{h}_2^\top \mathbf{B} \mathbf{h}_2 $$

These provide two linear equations for each homography $\mathbf{H}$.

Let the vector $\mathbf{b}$ represent the six unknown elements of the symmetric matrix $\mathbf{B}$:

$$ \mathbf{b} = \begin{bmatrix} B_{00} \\ B_{01} \\ B_{11} \\ B_{02} \\ B_{12} \\ B_{22} \end{bmatrix} $$

The term $v_{12}$ is derived from the orthogonality constraint on the rotation columns:

$$ \mathbf{h}_1^\top \mathbf{B} \mathbf{h}_2 = 0 $$

Expanding this gives the constraint vector:

$$ v_{12} = \begin{bmatrix} h_{11} h_{12} \\ h_{11} h_{22} + h_{21} h_{12} \\ h_{21} h_{22} \\ h_{11} h_{32} + h_{31} h_{12} \\ h_{21} h_{32} + h_{31} h_{22} \\ h_{31} h_{32} \end{bmatrix} $$

which contributes one row to the constraint matrix $V$ via $v_{12}^\top \mathbf{b} = 0$.

The terms $v_{11}$ and $v_{22}$ are derived from the equal-norm constraint:

$$ \mathbf{h}_1^\top \mathbf{B} \mathbf{h}_1 = \mathbf{h}_2^\top \mathbf{B} \mathbf{h}_2 $$

Expanding these gives:

$$ v_{11} = \begin{bmatrix} h_{11}^2 \\ 2 h_{11} h_{21} \\ h_{21}^2 \\ 2 h_{11} h_{31} \\ 2 h_{21} h_{31} \\ h_{31}^2 \end{bmatrix}, \quad v_{22} = \begin{bmatrix} h_{12}^2 \\ 2 h_{12} h_{22} \\ h_{22}^2 \\ 2 h_{12} h_{32} \\ 2 h_{22} h_{32} \\ h_{32}^2 \end{bmatrix} $$

which provides another row of the constraint matrix $V$ via $(v_{11} - v_{22})^\top \mathbf{b} = 0$.

We solve $V \mathbf{b} = 0$ using SVD. The solution is the right singular vector corresponding to the smallest singular value.

Finally, we extract the intrinsic parameters:

  • $v_0 = \frac{B_{01} B_{02} - B_{00} B_{12}}{B_{00} B_{11} - B_{01}^2}$
  • $\lambda = B_{22} - \frac{B_{02}^2 + v_0 (B_{01} B_{02} - B_{00} B_{12})}{B_{00}}$
  • $\alpha = \sqrt{\lambda / B_{00}}$ (focal length in $x$)
  • $\beta = \sqrt{\lambda B_{00} / (B_{00} B_{11} - B_{01}^2)}$ (focal length in $y$)
  • $\gamma = -B_{01} \alpha^2 \beta / \lambda$ (skew factor)
  • $u_0 = \gamma v_0 / \beta - B_{02} \alpha^2 / \lambda$ (principal point)

The intrinsic matrix is then:

$$ \mathbf{K} = \begin{bmatrix} \alpha & \gamma & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix} $$

Note: There may be degenerate cases, which is why we check the condition number when solving for $\mathbf{B}$.

In [104]:
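A sketch of the closed-form recovery, following the $v_{ij}$ rows and extraction formulas above (function names are assumptions; the degeneracy/condition-number checks mentioned in the note are omitted for brevity):

```python
def v_ij(H, i, j):
    """Constraint row v_ij built from columns i and j of H (0-indexed)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[1] * hj[1],
                     hi[0] * hj[2] + hi[2] * hj[0],
                     hi[1] * hj[2] + hi[2] * hj[1],
                     hi[2] * hj[2]])

def intrinsics_from_homographies(Hs):
    """Zhang-style closed-form K from a list of homographies (sketch)."""
    V = []
    for H in Hs:
        V.append(v_ij(H, 0, 1))                  # orthogonality constraint
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))  # equal-norm constraint
    _, _, Vt = np.linalg.svd(np.asarray(V))
    b = Vt[-1]
    if b[0] < 0:                                 # b is defined only up to sign
        b = -b
    B00, B01, B11, B02, B12, B22 = b
    v0 = (B01 * B02 - B00 * B12) / (B00 * B11 - B01 ** 2)
    lam = B22 - (B02 ** 2 + v0 * (B01 * B02 - B00 * B12)) / B00
    alpha = np.sqrt(lam / B00)
    beta = np.sqrt(lam * B00 / (B00 * B11 - B01 ** 2))
    gamma = -B01 * alpha ** 2 * beta / lam
    u0 = gamma * v0 / beta - B02 * alpha ** 2 / lam
    return np.array([[alpha, gamma, u0], [0.0, beta, v0], [0.0, 0.0, 1.0]])
```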

Camera Extrinsic Estimation

Given $\mathbf{K}$ and a homography $\mathbf{H}$, the extrinsic parameters (rotation $\mathbf{R}$ and translation $\mathbf{t}$) are computed as:

$$ \mathbf{M} = \mathbf{K}^{-1} \mathbf{H} $$

where:

  • $\lambda = \|\mathbf{M}_{:,1}\|$ is the scale factor (the columns of a rotation matrix have unit norm),
  • $\mathbf{r}_1 = \frac{\mathbf{M}_{:,1}}{\lambda}$
  • $\mathbf{r}_2 = \frac{\mathbf{M}_{:,2}}{\lambda}$
  • $\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2$ (ensuring orthogonality)
  • $\mathbf{t} = \frac{\mathbf{M}_{:,3}}{\lambda}$

Because of noise, the resulting matrix may not be exactly orthonormal, so a final orthonormalization step (e.g., via SVD) ensures $\mathbf{R}$ is a valid rotation matrix.

In [105]:
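A sketch of this recovery with a final SVD re-orthonormalization (the sign flip on $\mathbf{M}$ is a heuristic assumption to keep the scene in front of the camera, not necessarily the notebook's choice):

```python
def extrinsics_from_homography(K, H):
    """Recover world-to-camera rotation R and translation t from K and H (sketch)."""
    M = np.linalg.inv(K) @ H
    if M[2, 2] < 0:                   # heuristic: keep t_z > 0 (scene in front of camera)
        M = -M
    lam = np.linalg.norm(M[:, 0])     # columns of a rotation matrix have unit norm
    r1, r2, t = M[:, 0] / lam, M[:, 1] / lam, M[:, 2] / lam
    r3 = np.cross(r1, r2)
    R = np.column_stack([r1, r2, r3])
    U, _, Vt = np.linalg.svd(R)       # re-project onto SO(3) to correct numerical drift
    return U @ Vt, t
```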

Running the Camera Calibration Simulation

Using the simulated world points and camera poses, we generate image observations (a driver-loop sketch follows the cell output below):

  1. Project world points into the image plane using $\mathbf{K}$ and camera poses $\mathbf{T}_{WC}$.
  2. Compute homographies from world-image correspondences.
  3. Estimate camera intrinsics using multiple homographies.
  4. Recover camera extrinsics for each view.

To validate the calibration:

  • Intrinsic matrix $\mathbf{K}_{\text{tested}}$ should match the ground truth $\mathbf{K}$.
  • Estimated camera poses $\mathbf{T}_{WC}^{\text{tested}}$ should align with the expected ones.
In [106]:
Rank of V: 5
Singular values of V: [3.7186e+04 2.8990e+04 3.3617e+03 1.1681e+01 1.6096e+00 1.4826e-16]
Condition number of V: 2.508181506987467e+20
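For reference, a driver loop consistent with the steps above might look roughly like this; it reuses the illustrative helpers from the earlier sketches and is not the notebook's actual cell:

```python
# Ground-truth intrinsics from the simulated problem definition.
K_gt = np.array([[800.0, 0.0, 320.0],
                 [0.0, 800.0, 240.0],
                 [0.0, 0.0, 1.0]])

Hs = []
for T_WC in poses:
    T_CW = np.linalg.inv(T_WC)
    p_C = (T_CW @ p_W_h)[:3]        # world points expressed in the camera frame
    uv = K_gt @ p_C
    uv = (uv[:2] / uv[2]).T         # pixel coordinates, one row per point
    Hs.append(estimate_homography(p_W[:, :2], uv))

K_est = intrinsics_from_homographies(Hs)  # should be close to K_gt
```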

Hand-Eye Calibration

Hand-Eye Calibration using $AX = XB$

I primarily follow Robot Sensor Calibration: Solving AX = XB on the Euclidean Group, except that I use SVD to solve for the rotation, similar to the approach used in point-cloud alignment.

Hand-eye calibration is the process of estimating the transformation between a camera and a robot end-effector when the camera is rigidly mounted on the robot. This calibration helps in accurately relating observations from the camera to the robot's movement.

In this section, we implement the Park & Martin method to solve the classic hand-eye calibration equation:

$$ AX = XB $$

where:

  • $A_i$ represents the relative transformation between two camera poses, $T_{C_iC_j}$,
  • $B_i$ represents the relative transformation between two end-effector poses, $T_{E_iE_j}$,
  • $X$ is the unknown hand-eye transformation $T_{CE}$, which (following the convention above) maps points from the end-effector frame to the camera frame.

Estimating Rotation $\mathbf{R}_{CE}$ and Translation $\mathbf{t}_{CE}$

Extracting the rotation component from $T_{C_i C_j} X = X T_{E_i E_j}$, we obtain: $$ R_{C_i C_j} R_{CE} = R_{CE} R_{E_i E_j} $$

Taking the matrix logarithm of the rotation part (writing $R \equiv R_{CE}$):

$$ \log( R_{C_i C_j} ) = R \log( R_{E_i E_j} ) R^{-1} $$

Writing these matrix logarithms as skew-symmetric matrices of rotation vectors $\alpha$ and $\beta$, where $[\cdot]$ denotes the skew-symmetric (hat) operator:

$$ \log( R_{C_i C_j} ) = [\alpha] $$

$$ \log( R_{E_i E_j} ) = [\beta] $$

We obtain the relationship:

$$ [\alpha] = R [\beta] R^{-1} $$

Since $R [\beta] R^{-1} = [R \beta]$ for $R \in SO(3)$, this is equivalent in vector form to:

$$ \alpha = R \beta $$

Given multiple transformation pairs $\alpha_i, \beta_i$, we solve for $R$ by minimizing:

$$ \sum \| \alpha_i - R \beta_i \|^2 $$

which is solved in closed form using an SVD (the orthogonal Procrustes alignment, as in point-cloud registration).


Once $R_{CE}$ is known, equating the translation components of $T_{C_i C_j} X = X T_{E_i E_j}$ gives:

$$ R_{C_i C_j} t_{CE} + t_{C_i C_j} = R_{CE} t_{E_i E_j} + t_{CE} $$

Rearranging:

$$ (R_{C_i C_j} - I) \, t_{CE} = R_{CE} \, t_{E_i E_j} - t_{C_i C_j} $$

This forms an overdetermined linear system, which is solved using least-squares regression.

In [107]:
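A sketch of the solver described above (function names are assumptions; the rotation step is the SVD-based alignment of the $\alpha_i, \beta_i$ vectors, and the translation step stacks the rearranged equation):

```python
def log_SO3(R):
    """Matrix log of a rotation: R -> rotation vector (assumes angle away from pi)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def solve_ax_xb(As, Bs):
    """Park & Martin-style AX = XB solver, rotation via SVD alignment (sketch)."""
    alphas = np.array([log_SO3(A[:3, :3]) for A in As])
    betas = np.array([log_SO3(B[:3, :3]) for B in Bs])
    # Rotation: minimize sum ||alpha_i - R beta_i||^2 (orthogonal Procrustes).
    M = alphas.T @ betas
    U, _, Vt = np.linalg.svd(M)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # enforce det(R) = +1
    R = U @ S @ Vt
    # Translation: stack (R_Ai - I) t = R t_Bi - t_Ai and solve by least squares.
    C = np.vstack([A[:3, :3] - np.eye(3) for A in As])
    d = np.concatenate([R @ B[:3, 3] - A[:3, 3] for A, B in zip(As, Bs)])
    t, *_ = np.linalg.lstsq(C, d, rcond=None)
    X = np.eye(4)
    X[:3, :3], X[:3, 3] = R, t
    return X
```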

Run Hand-Eye Calibration Simulation

To validate the hand-eye calibration method, we generate a simulation where a camera is rigidly attached to a robot end-effector. The simulation involves the following steps (see the sketch after this list):

  1. Generating ground-truth transformations for both the camera and end-effector.
  2. Computing end-effector poses based on the expected transformations.
  3. Introducing noise to simulate real-world conditions.
  4. Extracting transformation pairs to solve for hand-eye calibration.
  5. Testing the estimated transformation against the ground truth.
In [108]:
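A sketch of steps 1, 2, and 4 under a hypothetical ground-truth hand-eye transform `X_gt` (noise injection omitted for brevity; this reuses `poses` and `solve_ax_xb` from the earlier sketches and is not the notebook's actual cell):

```python
# Hypothetical ground-truth hand-eye transform T_CE (stand-in values).
X_gt = np.eye(4)
X_gt[:3, :3] = Rotation.from_euler("xyz", [0.1, -0.2, 0.3]).as_matrix()
X_gt[:3, 3] = [0.05, -0.05, -0.10]

# Relative camera motions A_i and the consistent end-effector motions B_i.
T_CWs = [np.linalg.inv(T_WC) for T_WC in poses]
As, Bs = [], []
for i in range(len(T_CWs) - 1):
    A = T_CWs[i + 1] @ np.linalg.inv(T_CWs[i])  # T_{C_{i+1} C_i}
    As.append(A)
    Bs.append(np.linalg.inv(X_gt) @ A @ X_gt)   # B = X^{-1} A X, so AX = XB holds

X_est = solve_ax_xb(As, Bs)  # should recover X_gt
```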

Factor Graph based Hand-Eye Calibration Optimization

Several degenerate cases can arise in the previous processes, such as insufficient views to estimate $H$ or $K$ and inadequate movement to determine $T_{CE}$. Additionally, real-world measurements are imperfect, with errors in pixel observations, robot arm end-effector poses, and other factors.

When performing hand-eye calibration, it is ideal to capture a diverse range of calibration board views and execute various robot arm movements. Given the limited field of view of a single camera with a single calibration board, using multiple boards placed throughout the room can improve accuracy.

To address these issues once we have enough data, we formulate the problem as an optimization task to estimate the variables of interest.

Note: In this tutorial, we will not simulate the optimization of camera intrinsics.

Import Libraries

In [109]:

Define Residual Functions

The function reprojection_residual_from_camera computes the reprojection error of a landmark observed by a camera. It assumes that the camera pose is being optimized in the world frame.

  • $\mathbf{T}_{WC}^{-1} \cdot {}^{W}\mathbf{t}^{\text{landmark}}$ transforms the landmark position from the world frame into the camera frame.
  • $\mathbf{K} \cdot (\ldots)$ projects the transformed point into image coordinates.
  • The residual is the difference between the observed pixel location $uv_{\text{obs}}$ and the projected landmark position.

To enforce consistency among transformations, the function pose_loop_residual ensures that the transformation chain closes to the identity: $T_{BB} = T_{BW} \cdot T_{WC} \cdot T_{CE} \cdot (T_{BE_n} \cdot T_{E_nE})^{-1}$.

NOTE: We might also want a prior constraint on the error-correction term $T_{E_nE}$.

In [110]:
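The notebook implements these residuals for a factor-graph solver; a plain-NumPy illustration of both (the function names follow the text, the bodies are a sketch, not the original cell):

```python
def reprojection_residual_from_camera(T_WC, K, p_W_landmark, uv_obs):
    """Pixel residual for one landmark seen from camera pose T_WC (illustration)."""
    p_C = np.linalg.inv(T_WC) @ np.append(p_W_landmark, 1.0)  # world -> camera frame
    uvw = K @ p_C[:3]                                         # project with intrinsics
    return uvw[:2] / uvw[2] - uv_obs

def pose_loop_residual(T_BW, T_WC, T_CE, T_BEn, T_EnE):
    """Residual enforcing T_BW * T_WC * T_CE * (T_BEn * T_EnE)^-1 = I (illustration)."""
    T_loop = T_BW @ T_WC @ T_CE @ np.linalg.inv(T_BEn @ T_EnE)
    # A factor-graph implementation would map this to a 6-vector via a log map;
    # here we simply return the deviation of the top 3x4 block from identity.
    return (T_loop - np.eye(4))[:3, :].ravel()
```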

Generate Factors

In [111]:

Optimize Camera Poses

The pose estimates from camera calibration appear to be quite accurate: the initial residual is already near numerical precision, so the optimization has little left to reduce.

In [112]:
len(result.iterations)=3
Initial error: 3.4163439793428014e-21
Final error: 5.505113158739144e-25

Optimize End Effector Pose Corrections and $T_{CE}$

In [122]:
len(result.iterations)=11
Initial error: 0.005298482748792415
Final error: 1.2656542091601458e-13
========
T_CE(initial)  :  [ 0.0452 -0.0588 -0.0945]
T_CE(optimized):  [ 0.0448 -0.0458 -0.1041]
T_CE(expected) :  [ 0.0465 -0.05   -0.1017]
========
expected_errors : [array([0.0056, 0.007 , 0.0051]), array([0.0056, 0.0054, 0.006 ]), array([0.0035, 0.003 , 0.0036]), array([0.0058, 0.0054, 0.0066]), array([0.0048, 0.0043, 0.0067]), array([0.0049, 0.0062, 0.0038]), array([0.0063, 0.0036, 0.0057]), array([0.0032, 0.0061, 0.0059]), array([0.0061, 0.0037, 0.0032]), array([0.0066, 0.0037, 0.0061]), array([0.0041, 0.0056, 0.007 ]), array([0.0041, 0.0047, 0.0061]), array([0.004 , 0.0032, 0.0065]), array([0.004 , 0.0046, 0.0042]), array([0.0069, 0.0053, 0.0041]), array([0.0065, 0.0033, 0.0061]), array([0.0056, 0.0038, 0.006 ]), array([0.0068, 0.006 , 0.0057]), array([0.0054, 0.0038, 0.0047]), array([0.0062, 0.006 , 0.0058]), array([0.0034, 0.0063, 0.005 ]), array([0.0052, 0.005 , 0.0049]), array([0.0034, 0.0065, 0.0037]), array([0.0044, 0.0038, 0.0051]), array([0.0066, 0.0041, 0.0059]), array([0.0061, 0.003 , 0.0064]), array([0.0058, 0.0057, 0.0039]), array([0.0052, 0.0069, 0.0039]), array([0.0062, 0.0034, 0.0058]), array([0.0036, 0.0041, 0.0069]), array([0.0069, 0.0036, 0.0068]), array([0.0064, 0.0069, 0.0044])]
optimized_errors: [array([0.004 , 0.007 , 0.0002]), array([0.004 , 0.0046, 0.0012]), array([ 0.0019,  0.0014, -0.001 ]), array([0.0042, 0.0029, 0.0023]), array([0.0056, 0.0028, 0.0018]), array([ 0.0064,  0.0052, -0.001 ]), array([0.0085, 0.003 , 0.0011]), array([0.0062, 0.0059, 0.0017]), array([ 0.0069,  0.0052, -0.0017]), array([0.0066, 0.0055, 0.0013]), array([0.0035, 0.0079, 0.0023]), array([0.0028, 0.0073, 0.0018]), array([0.0024, 0.0032, 0.0016]), array([ 0.0024,  0.0038, -0.0006]), array([ 0.0053,  0.0036, -0.0005]), array([0.0049, 0.0009, 0.0018]), array([0.0039, 0.0038, 0.0011]), array([0.0052, 0.0052, 0.0009]), array([3.7438e-03, 2.0942e-03, 9.9781e-05]), array([0.0046, 0.0036, 0.0016]), array([4.2196e-03, 4.9197e-03, 9.1918e-05]), array([6.7145e-03, 3.9874e-03, 3.2584e-05]), array([ 0.0056,  0.0059, -0.0009]), array([0.0073, 0.0036, 0.0009]), array([0.0074, 0.0055, 0.001 ]), array([0.0062, 0.0049, 0.0016]), array([ 0.0052,  0.008 , -0.0007]), array([ 0.0039,  0.0096, -0.0003]), array([0.0046, 0.0034, 0.0009]), array([0.0019, 0.0033, 0.0021]), array([0.0053, 0.0019, 0.0022]), array([0.0048, 0.0044, 0.0002])]
========

Optimize End Effector Pose Corrections

Now, let's focus on optimizing only the end-effector pose corrections $T_{E_nE}$. The relative error reduction is significant, even though the absolute initial error is already quite small (the simulated measurements are otherwise perfect).

In [136]:
len(result.iterations)=8
Initial error: 0.005298482748792109
Final error: 3.1803870859392485e-13
========
expected_errors : [array([0.0056, 0.007 , 0.0051]), array([0.0056, 0.0054, 0.006 ]), array([0.0035, 0.003 , 0.0036]), array([0.0058, 0.0054, 0.0066]), array([0.0048, 0.0043, 0.0067]), array([0.0049, 0.0062, 0.0038]), array([0.0063, 0.0036, 0.0057]), array([0.0032, 0.0061, 0.0059]), array([0.0061, 0.0037, 0.0032]), array([0.0066, 0.0037, 0.0061]), array([0.0041, 0.0056, 0.007 ]), array([0.0041, 0.0047, 0.0061]), array([0.004 , 0.0032, 0.0065]), array([0.004 , 0.0046, 0.0042]), array([0.0069, 0.0053, 0.0041]), array([0.0065, 0.0033, 0.0061]), array([0.0056, 0.0038, 0.006 ]), array([0.0068, 0.006 , 0.0057]), array([0.0054, 0.0038, 0.0047]), array([0.0062, 0.006 , 0.0058]), array([0.0034, 0.0063, 0.005 ]), array([0.0052, 0.005 , 0.0049]), array([0.0034, 0.0065, 0.0037]), array([0.0044, 0.0038, 0.0051]), array([0.0066, 0.0041, 0.0059]), array([0.0061, 0.003 , 0.0064]), array([0.0058, 0.0057, 0.0039]), array([0.0052, 0.0069, 0.0039]), array([0.0062, 0.0034, 0.0058]), array([0.0036, 0.0041, 0.0069]), array([0.0069, 0.0036, 0.0068]), array([0.0064, 0.0069, 0.0044])]
optimized_errors: [array([0.0043, 0.0052, 0.0163]), array([0.0043, 0.0055, 0.0173]), array([0.0023, 0.0051, 0.0147]), array([0.0045, 0.0094, 0.0172]), array([0.007 , 0.0041, 0.0179]), array([0.0054, 0.005 , 0.0151]), array([0.0051, 0.0014, 0.0169]), array([0.0004, 0.003 , 0.0165]), array([0.0052, 0.0058, 0.0144]), array([0.0073, 0.0047, 0.0174]), array([0.0066, 0.0057, 0.0181]), array([0.0082, 0.0038, 0.0167]), array([0.0027, 0.0014, 0.0176]), array([0.0027, 0.0048, 0.0156]), array([0.0056, 0.0074, 0.0152]), array([0.0052, 0.0073, 0.0167]), array([0.0043, 0.002 , 0.0172]), array([0.0055, 0.0062, 0.017 ]), array([0.0041, 0.0059, 0.0158]), array([0.005 , 0.0101, 0.0164]), array([0.0056, 0.0061, 0.0162]), array([0.0057, 0.0038, 0.0162]), array([0.0022, 0.0043, 0.0148]), array([0.0016, 0.0006, 0.0157]), array([0.0057, 0.0061, 0.0171]), array([0.0069, 0.0041, 0.0177]), array([0.0083, 0.0058, 0.015 ]), array([0.0094, 0.006 , 0.0145]), array([0.0049, 0.0016, 0.017 ]), array([0.0023, 0.0043, 0.0182]), array([0.0056, 0.0057, 0.018 ]), array([0.0051, 0.0109, 0.015 ])]
========

Optimize End Effector Pose Corrections and $T_{CE}$ again

After obtaining the end-effector pose corrections $T_{E_nE}$, I try to optimize both $T_{E_nE}$ and $T_{CE}$. Interestingly, $T_{CE}$ does not move at all; only $T_{E_nE}$ changes.

In [137]:
len(result.iterations)=10
Initial error: 3.180387086041939e-13
Final error: 1.2656542081530388e-13
========
T_CE(initial)  :  [ 0.0452 -0.0588 -0.0945]
T_CE(optimized):  [ 0.0452 -0.0588 -0.0945]
T_CE(expected) :  [ 0.0465 -0.05   -0.1017]
========
expected_errors : [array([0.0056, 0.007 , 0.0051]), array([0.0056, 0.0054, 0.006 ]), array([0.0035, 0.003 , 0.0036]), array([0.0058, 0.0054, 0.0066]), array([0.0048, 0.0043, 0.0067]), array([0.0049, 0.0062, 0.0038]), array([0.0063, 0.0036, 0.0057]), array([0.0032, 0.0061, 0.0059]), array([0.0061, 0.0037, 0.0032]), array([0.0066, 0.0037, 0.0061]), array([0.0041, 0.0056, 0.007 ]), array([0.0041, 0.0047, 0.0061]), array([0.004 , 0.0032, 0.0065]), array([0.004 , 0.0046, 0.0042]), array([0.0069, 0.0053, 0.0041]), array([0.0065, 0.0033, 0.0061]), array([0.0056, 0.0038, 0.006 ]), array([0.0068, 0.006 , 0.0057]), array([0.0054, 0.0038, 0.0047]), array([0.0062, 0.006 , 0.0058]), array([0.0034, 0.0063, 0.005 ]), array([0.0052, 0.005 , 0.0049]), array([0.0034, 0.0065, 0.0037]), array([0.0044, 0.0038, 0.0051]), array([0.0066, 0.0041, 0.0059]), array([0.0061, 0.003 , 0.0064]), array([0.0058, 0.0057, 0.0039]), array([0.0052, 0.0069, 0.0039]), array([0.0062, 0.0034, 0.0058]), array([0.0036, 0.0041, 0.0069]), array([0.0069, 0.0036, 0.0068]), array([0.0064, 0.0069, 0.0044])]
optimized_errors: [array([0.0043, 0.0052, 0.0163]), array([0.0043, 0.0055, 0.0173]), array([0.0023, 0.0051, 0.0147]), array([0.0045, 0.0094, 0.0172]), array([0.007 , 0.0041, 0.0179]), array([0.0054, 0.005 , 0.0151]), array([0.0051, 0.0014, 0.0169]), array([0.0004, 0.003 , 0.0165]), array([0.0052, 0.0058, 0.0144]), array([0.0073, 0.0047, 0.0174]), array([0.0066, 0.0057, 0.0181]), array([0.0082, 0.0038, 0.0167]), array([0.0027, 0.0014, 0.0176]), array([0.0027, 0.0048, 0.0156]), array([0.0056, 0.0074, 0.0152]), array([0.0052, 0.0073, 0.0167]), array([0.0043, 0.002 , 0.0172]), array([0.0055, 0.0062, 0.017 ]), array([0.0041, 0.0059, 0.0158]), array([0.005 , 0.0101, 0.0164]), array([0.0056, 0.0061, 0.0162]), array([0.0057, 0.0038, 0.0162]), array([0.0022, 0.0043, 0.0148]), array([0.0016, 0.0006, 0.0157]), array([0.0057, 0.0061, 0.0171]), array([0.0069, 0.0041, 0.0177]), array([0.0083, 0.0058, 0.015 ]), array([0.0094, 0.006 , 0.0145]), array([0.0049, 0.0016, 0.017 ]), array([0.0023, 0.0043, 0.0182]), array([0.0056, 0.0057, 0.018 ]), array([0.0051, 0.0109, 0.015 ])]
========
In [138]:
len(result.iterations)=11
Initial error: 0.0020829916578865106
Final error: 1.2656542085960234e-13
========
T_CE(initial)  :  [ 0.0452 -0.0588 -0.0945]
T_CE(optimized):  [ 0.0464 -0.0503 -0.1015]
T_CE(expected) :  [ 0.0465 -0.05   -0.1017]
========
expected_errors : [array([0.0056, 0.007 , 0.0051]), array([0.0056, 0.0054, 0.006 ]), array([0.0035, 0.003 , 0.0036]), array([0.0058, 0.0054, 0.0066]), array([0.0048, 0.0043, 0.0067]), array([0.0049, 0.0062, 0.0038]), array([0.0063, 0.0036, 0.0057]), array([0.0032, 0.0061, 0.0059]), array([0.0061, 0.0037, 0.0032]), array([0.0066, 0.0037, 0.0061]), array([0.0041, 0.0056, 0.007 ]), array([0.0041, 0.0047, 0.0061]), array([0.004 , 0.0032, 0.0065]), array([0.004 , 0.0046, 0.0042]), array([0.0069, 0.0053, 0.0041]), array([0.0065, 0.0033, 0.0061]), array([0.0056, 0.0038, 0.006 ]), array([0.0068, 0.006 , 0.0057]), array([0.0054, 0.0038, 0.0047]), array([0.0062, 0.006 , 0.0058]), array([0.0034, 0.0063, 0.005 ]), array([0.0052, 0.005 , 0.0049]), array([0.0034, 0.0065, 0.0037]), array([0.0044, 0.0038, 0.0051]), array([0.0066, 0.0041, 0.0059]), array([0.0061, 0.003 , 0.0064]), array([0.0058, 0.0057, 0.0039]), array([0.0052, 0.0069, 0.0039]), array([0.0062, 0.0034, 0.0058]), array([0.0036, 0.0041, 0.0069]), array([0.0069, 0.0036, 0.0068]), array([0.0064, 0.0069, 0.0044])]
optimized_errors: [array([0.0056, 0.0069, 0.0054]), array([0.0056, 0.0054, 0.0064]), array([0.0035, 0.0031, 0.0039]), array([0.0058, 0.0055, 0.0069]), array([0.0048, 0.0043, 0.007 ]), array([0.0049, 0.0062, 0.0041]), array([0.0062, 0.0035, 0.0061]), array([0.0031, 0.006 , 0.0063]), array([0.0061, 0.0038, 0.0035]), array([0.0066, 0.0037, 0.0064]), array([0.0042, 0.0056, 0.0073]), array([0.0042, 0.0046, 0.0064]), array([0.004 , 0.0032, 0.0068]), array([0.004 , 0.0046, 0.0046]), array([0.0068, 0.0054, 0.0044]), array([0.0064, 0.0034, 0.0064]), array([0.0055, 0.0037, 0.0063]), array([0.0068, 0.006 , 0.006 ]), array([0.0053, 0.0038, 0.005 ]), array([0.0062, 0.0062, 0.0062]), array([0.0035, 0.0063, 0.0053]), array([0.0052, 0.0049, 0.0052]), array([0.0033, 0.0064, 0.004 ]), array([0.0043, 0.0037, 0.0054]), array([0.0066, 0.0042, 0.0062]), array([0.0061, 0.0031, 0.0067]), array([0.0059, 0.0057, 0.0042]), array([0.0054, 0.0069, 0.0043]), array([0.0062, 0.0034, 0.0061]), array([0.0035, 0.0042, 0.0072]), array([0.0068, 0.0037, 0.0072]), array([0.0064, 0.007 , 0.0048])]
========
This blog is converted from hand_eye.ipynb
Written on February 2, 2025