
Identifying parts of a single image?

February 14, 2011 - 3:58am #1

Hey, I have a question.

Say I have a single image of a building/tower. What I need is for its different parts, i.e. floors/levels, to be identified uniquely from outside when I point my camera at them.

If I capture an image of each floor and create a trackable for each one, that would involve a lot of overhead.

Is there any way to accomplish this at the moment? Maybe something like creating a single trackable that has multiple trackables as parts of it, with each part returning a unique ID when it is identified?

Can multi image targets be of any help to me? But I guess for that we would also need images of each part.

Re: Identifying parts of a single image?

March 7, 2011 - 6:28am #10

The SDK really isn't designed for tracking real-world objects like buildings; it's designed for tracking print (2D images). So at the very least I would do your first round of testing off an image of the building, and once that works, try taking it to the real-world building and see how it does. I suspect tracking is going to be fairly poor, and there's really nothing you can do to compensate for this. If the building has a very flat face (no balconies or other 3D extrusions) it might work reasonably well. Ideally though, the face of the building would be evenly lit and there wouldn't be any randomly introduced features (cars in front, people in windows, etc.) that weren't in the original image. Of course these are things you can't control in the real world.

Yes, you can change the values in the config.xml to match the scale of your real-world object. Just be sure to keep the exact same aspect ratio as the original values.
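
Keeping the aspect ratio means multiplying both dimensions by the same factor. A small sketch of that arithmetic; `TargetSize` and `scaleToWidth` are hypothetical helpers (not part of the SDK), and the units are whatever you choose for your scene:

```cpp
// Hypothetical helper type: the width and height given for a target.
struct TargetSize {
    float width;
    float height;
};

// Rescales a target to a new width while preserving its aspect ratio.
// Both dimensions are multiplied by the same factor.
TargetSize scaleToWidth(TargetSize original, float newWidth) {
    const float scale = newWidth / original.width;
    TargetSize scaled = { newWidth, original.height * scale };
    return scaled;
}
```

For example, a target entered as 200 x 300 and rescaled to a width of 400 would get a height of 600.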

- Kim

Re: Identifying parts of a single image?

March 6, 2011 - 8:34pm #9

Hey Kim

You are right, I have to track the real building. And yes, you are correct that the user might stand anywhere with respect to the building. So what should I do to make this work?

One more thing, Kim, that I wanted to confirm: in the config.xml I was thinking of giving the actual size of the building, so that the points I get on the trackable plane are expressed in terms of the building's height and width and I can easily compute which floor the user has clicked. Is that OK? Should it make any difference to the projectScreenPointToPlane method? While going through the code I found that it calculates the viewport with that size, so will it make any difference if I give the actual width and height of the building rather than those of the image?

Re: Identifying parts of a single image?

March 4, 2011 - 6:06am #8

It sounds like you are on the right track. Once you project the screen coordinates to the 2D plane of the target, you do not have to worry about the position of the camera at all. The projectScreenPointToPlane function has already taken that into account, and it returns a point in 2D target space, regardless of how the camera is angled or which segment of the image you are looking at. The origin is always at the center of the target. Try it and see.
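
Once you have a point in target space, the floor lookup is a simple range check. A minimal sketch, assuming equal-height floors, an origin at the target center, and +Y pointing up the building; `floorAt` and its parameters are hypothetical names, not part of the SDK:

```cpp
#include <cmath>

// Returns the zero-based floor index (0 = lowest floor) that contains the
// target-space point (x, y), or -1 if the point lies outside the target.
// Assumes the origin is at the target center and all floors are equally tall.
int floorAt(float x, float y, float targetWidth, float targetHeight,
            int floorCount) {
    const float halfW = targetWidth * 0.5f;
    const float halfH = targetHeight * 0.5f;
    if (floorCount <= 0 || x < -halfW || x > halfW ||
        y < -halfH || y > halfH) {
        return -1;
    }
    const float floorHeight = targetHeight / floorCount;
    // Shift y so the bottom edge of the target is at 0, then bucket it.
    int index = static_cast<int>(std::floor((y + halfH) / floorHeight));
    if (index >= floorCount) {
        index = floorCount - 1;  // y exactly on the top edge
    }
    return index;
}
```

If the floors are not all the same height, the same idea works with a list of per-floor Y ranges instead of a single division.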

I'm curious, are you trying to track the real-world building that the image target represents? This won't work nearly as well as tracking off an image of a building. The problem is that the user could be standing anywhere in relation to the building, and since the building is 3D this skews the perspective. It might work moderately well if you always have the user stand in the same spot (the same place the image was taken from), but this isn't really what the SDK is designed for.

- Kim

Re: Identifying parts of a single image?

March 4, 2011 - 2:20am #7

Hi Kim

As you suggested, I am able to get the x,y coordinates with respect to the trackable plane, with its centre as the origin.

Wherever I tap on the screen, I capture those screen coordinates and calculate the corresponding x,y on the trackable plane.

Now please let me know whether my further approach and understanding for recognizing floors is correct. As you suggested, once I get the x,y coordinates I should divide my whole trackable (i.e. the building/tower) into 2D rectangles, each rectangle corresponding to a specific floor. I should be able to get the bounding coordinates of the rectangles from the width and height of the building, taking the trackable origin to be at its centre.

Then, once I have the bounding coordinates of the rectangles, I should check which of these rectangles contains the point I touched, and process accordingly.

Am I right with this approach, or have I misunderstood your explanation?

Secondly, say I am standing in front of a building, pointing my camera at it, and I touch the screen point corresponding to a particular window on that building (an assumption). With this I will get the coordinates of that window on the trackable plane, with the centre of the trackable as the origin. Now suppose I move my camera a little to the left, so that the building is still visible, and touch the same window on the camera screen. In this case, will the corresponding window coordinates change because the camera orientation has changed slightly?

If yes, then how can I check the point I touched on screen against the bounding rectangles? I am thinking of using fixed bounding coordinates for my rectangles based on the width and height of the building, but I guess those would also change as the camera moves.

Please help, as I'm totally confused by this. :(
Hope I'm clear.

Re: Identifying parts of a single image?

March 2, 2011 - 6:15am #6

"Right now I am not identifying any trackable. I'm just tapping any point on the screen and trying to get the corresponding point on the plane."

This is most likely the problem. The projectScreenPointToPlane function depends on the modelViewMatrix global variable being set. This happens in the renderFrame method, with this line of code:

// Get the model view matrix
modelViewMatrix = QCAR::Tool::convertPose2GLMatrix(trackable->getPose());

You'll need some sort of matrix set up here for the function to return a valid X,Y pair, and it will probably be easiest to use one returned from a trackable.
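
One way to keep NaN values from reaching the Java side is to remember whether a pose was actually set and to validate the result before returning it. A minimal sketch; `isUsablePoint` and the `poseValid` flag are hypothetical names, not part of the SDK or the sample:

```cpp
#include <cmath>

// Returns true only if a pose was available when the point was projected
// and both projected coordinates are finite (neither NaN nor infinite).
// poseValid is a hypothetical flag that the render loop would set to true
// when a trackable is found and reset to false otherwise.
bool isUsablePoint(bool poseValid, float x, float y) {
    return poseValid && std::isfinite(x) && std::isfinite(y);
}
```

The caller would skip the tap (or report "no target in view") whenever this returns false, rather than passing the coordinates on.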

- Kim

Re: Identifying parts of a single image?

March 1, 2011 - 11:11pm #5

Hi Kim

I'm back with some queries. You asked me to go through the projectScreenPointToPlane method in the Dominoes sample for projecting a screen point onto the plane.

If I'm not wrong, the vector intersection in that method holds the final x,y coordinates with respect to the centre of the trackable.

Now I have a simple doubt that I still can't figure out: when I return the float values of intersection.data[0] and intersection.data[1] to my Java code and display them using System.out.println, I get NaN (i.e. not a number).

Please tell me where I am going wrong. Am I taking the correct values, or is some conversion required? Is intersection the correct vector, the one that has the required x,y values?

Right now I am not identifying any trackable. I'm just tapping any point on the screen and trying to get the corresponding point on the plane.

Re: Identifying parts of a single image?

February 15, 2011 - 10:29am #4

Sure, although it might be easier if I could visualize your target. Is it an image of a building in profile? Do you want to render a different 3D object over each floor, or do you want to trigger an event when a particular floor comes into view?

If you are using a single image target, and have multiple models that you want to draw on the target, then you simply draw each model with a different translation value to spread them out correctly over your target. Look at the SampleUtils::translatePoseMatrix method, called in ImageTargets.cpp for example.
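
To illustrate what translating a pose matrix involves, here is a minimal sketch that applies an offset to a column-major 4x4 matrix in the style of glTranslatef. `translatePose` is a hypothetical stand-in written for this example, not the sample's own implementation:

```cpp
// Post-multiplies a column-major 4x4 matrix by a translation of (x, y, z),
// so the offset is applied in the target's own coordinate frame.
// Only the fourth column (elements 12..15) changes.
void translatePose(float x, float y, float z, float m[16]) {
    for (int row = 0; row < 4; ++row) {
        m[12 + row] += m[row] * x + m[4 + row] * y + m[8 + row] * z;
    }
}
```

Each model would start from a fresh copy of the trackable's model-view matrix and get its own offset, for example one Y offset per floor.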

Maybe I'm misunderstanding what you are trying to do though. If you want to discuss offline (with a concrete example) you can email me at


- Kim

Re: Identifying parts of a single image?

February 14, 2011 - 10:28pm #3

Hey Kim

Can you please elaborate on the first option? Since I can't capture separate images of the floors of a building and create trackables from them, I think the first option would suit me best, as the other two require images of the different parts of the building (floors).

Re: Identifying parts of a single image?

February 14, 2011 - 6:03am #2

So you have a few options:

1) Use a single image target. Translate each OpenGL object according to its position offset from the center of the target. If the target gets too big however, the user would have to stand back a ways to initiate tracking.

2) Use a multi image target. Create a bunch of image targets, and "stitch" them together by creating a multi image target in the config.xml (you will need to calculate the offsets by hand, the My Trackables system only handles cubes right now). This will improve tracking if you have a large target and the user is only focusing on one small section at a time. You cannot tell which part of a multi image target is currently visible, however, so you will have to use offsets from the target origin much like you would with a single image target.

For 1 and 2, if you need to know which section is currently visible, you will need to do some math on the pose matrix that is returned from the camera. For example, you could cast a ray from the center of the screen to the target plane, and find the X and Y values of the intersection point on the plane. The Dominoes sample has some code that should help with this.

3) Use a bunch of image targets, but don't make them part of a multi image target. If the user is only focusing on one section at a time this might be sufficient, but if he wants to pan from one section to another the multi image target will provide a smoother tracking experience. In this case you can easily tell which image target is currently visible.
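
The ray cast mentioned for options 1 and 2 comes down to a ray/plane intersection. A minimal sketch, assuming the ray has already been transformed into target coordinates (that transform is not shown here) and that the target lies in the plane z = 0; `Vec3`, `Vec2` and `intersectTargetPlane` are hypothetical names for this example only:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// Intersects a ray (origin + t * dir, t >= 0) with the target plane z = 0.
// Writes the X and Y of the hit point to 'hit' and returns true, or returns
// false if the ray is parallel to the plane or points away from it.
bool intersectTargetPlane(const Vec3& origin, const Vec3& dir, Vec2& hit) {
    const float kEpsilon = 1e-6f;
    if (std::fabs(dir.z) < kEpsilon) {
        return false;  // ray runs parallel to the target
    }
    const float t = -origin.z / dir.z;
    if (t < 0.0f) {
        return false;  // target is behind the ray origin
    }
    hit.x = origin.x + t * dir.x;
    hit.y = origin.y + t * dir.y;
    return true;
}
```

The resulting X and Y are offsets from the target origin, so they can be compared directly against per-section bounds.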

- Kim
