Sunday, October 26, 2008

OpenGL ES Picking using Ray-BoundingBox Intersection

One way to determine which object a player has picked, be it via mouse click or finger tap, is to generate a pick ray relative to the player's eye and transform it by the inverse of the OpenGL ModelView matrix. This works for any OpenGL 4x4 ModelView matrix, including on the iPhone.


In eye coordinates, the pick ray origin is simply (0, 0, 0). You can build the pick ray vector from the perspective projection parameters, for example, by setting up your perspective projection this way:

// assuming you created your own frustum setting function with a
// similar signature to this for the camera position.
void Set(float fFov, float fAspect, float fNear, float fFar);
// and you are familiar with the following camera settings.
#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 480
#define NEAR 1
#define FAR 100

Now make your x and y coordinates relative to the center of the screen and get the x and y components of the ray.

float centered_y = (SCREEN_HEIGHT - y) - SCREEN_HEIGHT/2;
float centered_x = x - SCREEN_WIDTH/2;
float unit_x = centered_x/(SCREEN_WIDTH/2);
float unit_y = centered_y/(SCREEN_HEIGHT/2);

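The variables unit_x and unit_y are scaled between -1.0 and 1.0. Use them to find the mouse location on your zNear clipping plane like so:

// FOV is the vertical field of view in degrees; FOV, ASPECT and PI are
// assumed to be defined alongside the camera settings above.
// FOV * PI / 360.0 is half the field of view converted to radians.
float near_height = NEAR * (float)tan( FOV * PI / 360.0 );
// OpenGL eye space looks down the negative z axis, so z is -NEAR.
float ray[4] = { unit_x*near_height*ASPECT, unit_y*near_height, -NEAR, 0 };
float ray_start_point[4] = {0.f, 0.f, 0.f, 1.f};

Now your pick ray vector is (x, y, -zNear, 0).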
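
The steps so far can be collected into one helper. This is only a sketch under assumptions: the function name build_pick_ray, the 60 degree field of view, and the macro names NEAR_PLANE, FOV_DEGREES, ASPECT_RATIO and PI_F are made up for illustration and are not from the original code. It also assumes the touch coordinates have their origin at the top-left corner of the screen.

```c
#include <math.h>

#define SCREEN_WIDTH  320
#define SCREEN_HEIGHT 480
#define NEAR_PLANE    1.0f   /* same value as NEAR in the post */
#define FOV_DEGREES   60.0f  /* hypothetical vertical field of view */
#define ASPECT_RATIO  ((float)SCREEN_WIDTH / (float)SCREEN_HEIGHT)
#define PI_F          3.14159265f

/* Fill ray[4] with the eye-space pick direction for a touch at (x, y),
   where (0, 0) is assumed to be the top-left corner of the screen. */
static void build_pick_ray(float x, float y, float ray[4])
{
    /* Center the coordinates and scale them to the range -1..1. */
    float unit_x = (x - SCREEN_WIDTH / 2.0f) / (SCREEN_WIDTH / 2.0f);
    float unit_y = ((SCREEN_HEIGHT - y) - SCREEN_HEIGHT / 2.0f)
                 / (SCREEN_HEIGHT / 2.0f);

    /* Half-height of the near plane: NEAR * tan(fov / 2), fov in radians. */
    float near_height = NEAR_PLANE * tanf(FOV_DEGREES * PI_F / 360.0f);

    ray[0] = unit_x * near_height * ASPECT_RATIO;
    ray[1] = unit_y * near_height;
    ray[2] = -NEAR_PLANE;  /* eye space looks down the negative z axis */
    ray[3] = 0.0f;         /* 0 marks this array as a vector, not a point */
}
```

A tap in the exact center of the screen, build_pick_ray(160, 240, ray), should give a ray pointing straight down the negative z axis.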

To transform this eye coordinate pick ray into object coordinates we multiply it by the inverse of the ModelView matrix. The pick ray is made up of a vector and a point, and vectors and points transform differently: you can translate and rotate points, but vectors only rotate. The way to guarantee that this works correctly is to define your point and vector as four-element arrays. The one or zero in the last element determines whether an array transforms as a point or a vector when multiplied by the inverse of the ModelView matrix.
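
To see the effect of that last element, here is a small sketch of a matrix-times-array routine. The name transform4 is hypothetical, and it assumes the matrix is stored as 16 floats in the order glGetFloatv returns them, with the translation in elements 12, 13 and 14.

```c
/* Multiply a four-element array by a 4x4 matrix stored as 16 floats in
   glGetFloatv order (translation in elements 12, 13, 14).
   `out` must not be the same array as `in`. */
static void transform4(const float m[16], const float in[4], float out[4])
{
    for (int row = 0; row < 4; ++row) {
        out[row] = m[row]      * in[0]
                 + m[row + 4]  * in[1]
                 + m[row + 8]  * in[2]
                 + m[row + 12] * in[3];  /* translation scaled by in[3] */
    }
}
```

With a matrix that only translates by 5 along x, the array {1, 2, 3, 1} should come out as {6, 2, 3, 1}, while {1, 2, 3, 0} should come out unchanged, because the translation terms are multiplied by the last element.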

Since my modelview matrix is built dynamically using something like gluLookAt or a sequence of glTranslate, glRotate, and glScale calls, I can use glGetFloatv to retrieve the current modelview matrix.

GLfloat the_modelview[16];
//Read the current modelview matrix into the array the_modelview
glGetFloatv(GL_MODELVIEW_MATRIX, the_modelview);

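Given our initial modelview matrix M, consisting of a 3x3 rotation submatrix R and a 3-element translation vector t, we can construct the inverse modelview using the transpose of the rotation submatrix and the camera's translation vector. This shortcut only holds when M contains rotation and translation alone (no glScale). The matrices below are written in the order glGetFloatv returns the 16 floats, four per row, so the translation appears in the last row:

M = {
{R11, R12, R13, 0},
{R21, R22, R23, 0},
{R31, R32, R33, 0},
{tx, ty, tz, 1}
}

Rt = {
{R11, R21, R31},
{R12, R22, R32},
{R13, R23, R33}
}

t' = t*Rt (with t taken as a row vector):
t'x = R11*tx + R12*ty + R13*tz
t'y = R21*tx + R22*ty + R23*tz
t'z = R31*tx + R32*ty + R33*tz

M-1 = {
{R11, R21, R31, 0},
{R12, R22, R32, 0},
{R13, R23, R33, 0},
{-t'x, -t'y, -t'z, 1}
}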
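
As a rough sketch of that construction in C: the function name invert_rigid is made up for this example, and it assumes the matrix holds only rotation and translation and is stored as 16 floats in glGetFloatv order.

```c
/* Invert a rotation + translation matrix (no scale) stored as 16 floats
   in glGetFloatv order. `out` must not be the same array as `m`. */
static void invert_rigid(const float m[16], float out[16])
{
    /* Transpose the 3x3 rotation block. */
    out[0] = m[0]; out[1] = m[4]; out[2]  = m[8];
    out[4] = m[1]; out[5] = m[5]; out[6]  = m[9];
    out[8] = m[2]; out[9] = m[6]; out[10] = m[10];

    /* Rotate the translation by the transposed block, then negate it. */
    out[12] = -(m[0] * m[12] + m[1] * m[13] + m[2]  * m[14]);
    out[13] = -(m[4] * m[12] + m[5] * m[13] + m[6]  * m[14]);
    out[14] = -(m[8] * m[12] + m[9] * m[13] + m[10] * m[14]);

    /* The remaining elements are the same as in any affine matrix. */
    out[3] = out[7] = out[11] = 0.0f;
    out[15] = 1.0f;
}
```

Usage would look like invert_rigid(the_modelview, inverse) after the glGetFloatv call, where inverse is a second GLfloat[16] array. If the modelview contains a glScale, this shortcut no longer gives the true inverse and a general 4x4 inversion is needed instead.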

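Now that we have the inverse modelview matrix we can use it to transform our eye-coordinate ray into object space. With the translation written in the last row, each four-element array is treated as a row vector on the left of the matrix (this is the same as the usual OpenGL column-vector product on the 16 floats in column-major order):

ray' = ray * M-1
ray_start_point' = ray_start_point * M-1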

Yeah cool, now iterate along this ray for a fixed distance no larger than your zFar value, creating a small bounding box at each step and testing it for collisions as you usually do in your game, to find out what the player has clicked or tapped on.


  1. Hi, I followed your instructions but couldn't make it work. Do you have a working Xcode project that you could share?

  2. Hi, how are you building up the ray vector once you got ray_start_point' and ray' ?

  3. Someone already asked this, but do you have any source for this that you can share?

  4. Hi, I would like to know the procedure for tapping one image on another image using OpenGL ES in Objective-C, as I'm working on a game for iPhone.

  5. Thanks for sharing.

    What about if the camera is not located at (0, 0, 0)??

  6. This comment has been removed by a blog administrator.

  7. Thanks for the tutorial. It's been very helpful.

    Cody Ng:
    The camera position in eye coordinates (camera coordinates, basically) is always (0,0,0). That's why the ray has to be transformed to world coordinates before it can be used to hit-test objects in the scene, for example.

    This tutorial is very ready-to-use, thanks again.

  8. Thanks.

    But I was asking how you find the ray vector if the camera position is not located at (0, 0, 0), in the case of using the gluLookAt() method??

    Assuming the eye point is placed anywhere but looking at the origin (0, 0, 0)??

  9. Excellent tutorial! I've already seen several people asking about this and I'm glad that you posted the code to be able to do it correctly.


  10. I'm getting frustrated because EVERYONE on the web gives info and tutorials, but nobody posts a decent piece of code as a real-world sample. And best of all, nobody writes out the last part of what they are explaining (now you simply do this and that... without explaining how).

  11. Good tutorial... for someone like me who is using it just to refresh some concepts. Thanks.

  12. Here is the code I used to get it working. NOTE: This assumes the translation values are stored in the far right column instead of the bottom most row.

    typedef GLfloat Matrix3D[16];

    static inline void Matrix3DInvert(Matrix3D matrix, Matrix3D result)
    {
        //Compute transpose of the rotation matrix
        result[0] = matrix[0]; result[4] = matrix[1]; result[8] = matrix[2];
        result[1] = matrix[4]; result[5] = matrix[5]; result[9] = matrix[6];
        result[2] = matrix[8]; result[6] = matrix[9]; result[10] = matrix[10];
        //Compute the inverse translate
        result[12] = (result[0]*matrix[12] + result[4]*matrix[13] + result[8]*matrix[14]) * -1.0f;
        result[13] = (result[1]*matrix[12] + result[5]*matrix[13] + result[9]*matrix[14]) * -1.0f;
        result[14] = (result[2]*matrix[12] + result[6]*matrix[13] + result[10]*matrix[14]) * -1.0f;
        //Fill in the bottom
        result[3] = result[7] = result[11] = 0;
        result[15] = 1;
    }

  13. Here is a working sample for iPhone implementing gluUnproject :

  14. Hi,
    I tried to implement your tutorial in my OpenGL view, but without success.
    I have loaded an obj file from 3DS and I want to "touch" a part of my object, drawing a circle for instance. What can I do?
    I have the x,y (2D) coordinates, but how do I convert them to my 3D object?


  15. Hi, I am not sure if the calculation of near_height is correct using the field of view. I may be mistaken, but I think the conversion to radians isn't correct?

    I figured that out because I wrote a post on this topic too:



  16. Please show me the source for inverting the matrix, and specifically for the last part.