I'm developing an AR app where I want an aim/crosshair as an overlay over the camera view.

When the aim/crosshair points at an object, the crosshair will activate and indicate "something in the aim". I then have a "shoot" button in the overlay that will "shoot" the object.

How can I detect that an object is in the aim, and which object it is?

They are simple geometric objects: cubes, planes and spheres with textures.

Hi, I'm not an expert on the topic, but here's something you could try. Once you retrieve the pose matrix for a trackable, translate it to a model-view matrix (convertPose2GLMatrix). Then compute ProjectionMatrix * ModelViewMatrix * (0, 0, 0, 1), the exact same thing OpenGL does to convert vertices into screen space. After this multiplication you should get (x, y) coordinates you can use to check whether the given trackable is close enough to the center of the screen (i.e. within the aim/crosshair).

Hi, you can follow harism's suggestion, i.e. project the center point of your 3D model to screen coordinates, so that you can then simply compare your 2D point with your aim/crosshair point in 2D.

However, note that multiplying a vector by the model-view and projection matrices will not deliver coordinates in pixels, but rather in the so-called clip space; you then need to convert those to Normalized Device Coordinates (NDC), which are coordinates in the range [-1, 1], and from there to screen coordinates.

In practice, the full chain of equations you can use in C++ with the QCAR API looks like the following code snippet

(note: copy SampleMath.h and SampleMath.cpp from the Dominoes sample):

QCAR::Vec4F objectClipCoords = SampleMath::Vec4FTransform(QCAR::Vec4F(0, 0, 0, 1), modelViewProjectionMatrix);

QCAR::Vec4F objectNDC;
objectNDC.data[0] = objectClipCoords.data[0] / objectClipCoords.data[3]; // divide by the "w" component to normalize to NDC
objectNDC.data[1] = objectClipCoords.data[1] / objectClipCoords.data[3]; // divide by the "w" component to normalize to NDC
objectNDC.data[2] = objectClipCoords.data[2] / objectClipCoords.data[3]; // divide by the "w" component to normalize to NDC
objectNDC.data[3] = 1.0f; // w is "1" in NDC

float x_pixels = (0.5f + 0.5f * objectNDC.data[0]) * view_width_pixels;
float y_pixels = (0.5f + 0.5f * objectNDC.data[1]) * view_height_pixels;

// flip the vertical coordinate, as OpenGL uses the bottom-left corner as origin
// while Android uses the top-left corner
y_pixels = view_height_pixels - y_pixels;

This is one possibility.

One problem with this is that if more than one object is in the aim (since they live in 3D, objects can line up behind each other), you'll need to sort them by depth (i.e. distance from the camera) and pick the closest object in the list.

This thread also discusses a similar approach, in that case using some QCAR utility functions:

https://ar.qualcomm.at/content/error-projecting-point-screen