There are two main ways to do this:
First, you need a 3D model of the structure. For large structures, I suggest breaking it into multiple models to increase accuracy (especially if you want the application to work at different distances).
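To make the "multiple models for different distances" idea concrete, here is a rough Python sketch of how you might pick which model to use for a given viewing distance. This is not Vuforia's API; the `TargetModel` class, the `pick_model` helper, the model names and the distance thresholds are all made up for illustration, so tune them for your own structure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TargetModel:
    """One piece of the structure, modeled for a given viewing range (hypothetical)."""
    name: str
    max_distance_m: float  # farthest distance at which this model is meant to be used


def pick_model(models: Sequence[TargetModel], distance_m: float) -> Optional[TargetModel]:
    """Return the most detailed model that still covers the given distance.

    "Most detailed" is assumed to mean the smallest max_distance_m that is
    still greater than or equal to distance_m. Returns None if no model
    covers that distance.
    """
    candidates = [m for m in models if distance_m <= m.max_distance_m]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.max_distance_m)


# Illustrative values only: primary, secondary and tertiary models of one building.
MODELS = [
    TargetModel("entrance_detail", 10.0),
    TargetModel("facade", 50.0),
    TargetModel("whole_building", 200.0),
]
```

With these example values, `pick_model(MODELS, 30.0)` would give you the `facade` model, and anything past 200 m returns `None`, which is your cue that you need another, coarser model.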
Next, you have the easy method or the hard method.
Easy, but it costs money:
1- Join the developer program and get the object recognition application. All you have to do is load the object model into the application and convert it to targets. For large structures, I strongly suggest creating secondary and tertiary models, depending on how large the structure is.
2- Use a hack of the existing object recognition software.
Right now, the only option for us poor folks is to use the existing 3D scanning application. This is where you print an array and place it on the ground with the object inside it. Obviously, this doesn't work with buildings unless you have a drone or helicopter with a massive base target.
You can "fool" the application by instead loading a quad of the image target and then placing the scanned 3D model on top of it. You can build this model in any modeling software (e.g. Blender), but I suggest using Unity since it's easier to do.
Next, mount the phone so it is looking at a high-definition display rendering the model. Start recording and follow the instructions. It will work, but please note you are taking a scan of a scan, so the resolution won't be as high as converting the object directly.
I'd write a tutorial, but I'm still waiting for the Vuforia guys to get off their butts and give me the flipping object recog compiler already. I was supposed to get it months ago, before it was announced. If you need assistance, please feel free to message the devs and ask on my behalf. The reason I'm posting this is to say it is indeed possible to use Vuforia for large-scale objects; however, there are additional steps you need to take to properly calibrate the software to work. I gave 3 presentations on this functionality and I'm still ignored :/