Opening Thoughts

In the age of AI it is more important than ever to show process; otherwise it is assumed that all you did was type a few words into a prompt box. I like to compare AI to the invention of photography in a world of painters. Before photos, the act of creating a life-like representation was a prized skill. Nearly overnight, anyone could click a button to capture reality, which reinforced the value of ideas in painting. Anyone can create an image in photography, but not all images are good. It takes expertise in composition, lighting, color, and most importantly the idea to create compelling images. Now with AI anyone can create a beautiful image with great lighting, composition and color, but the value of the work will always go back to the ideas an artist is bringing into the world.

75 of the best from ~8,000 generations straight from Midjourney AI.

Process Overview

For the past 3 months I have been experimenting with generating architecture with Midjourney AI. In that time, I created nearly 8,000 skyscrapers (2,000 prompts that generate 4 options each), and narrowed them down to 20 final images to present. The act of curation is more important than ever in the world of AI art.

Starting Point

I spent 2 years documenting over 150 real art deco skyscrapers around the USA via high-res drone scans. This legwork provided me with high-quality source material to prompt the AI, yielding more consistent and detailed results. By using image prompting, I am able to use AI to create images inspired by my own personal style, which I have developed over 8 years of photography experience. In addition, the image prompts enable the AI to gain inspiration from specific examples of incredible architecture, rather than a general database of what a building is. The two generations below use the same prompt, one without an image, one with an image.

Prompt - a unique and iconic art deco skyscraper made of stone and steel with an iconic spire photographed at sunset --ar 2:3

Without image prompting, the AI presents several different composition styles and more variety in the designs. I included the word “photographed” in the prompt to push the AI to be more realistic, but it falls short of the image-prompted version.

With image prompting using the Genesee Valley Trust Building in Rochester NY, it is clear how the AI maintains the style of imagery that inspires me, and the composition of the images is more consistent.

The lack of an image prompt sometimes allows the AI to be more “creative” and sometimes to better respond to text prompts, but using an image prompt adds a level of realism and detail enabled by the quality of the high-res image I created with my own personal style.

Image Prompting

Through the nearly 2,000 prompts, I experimented with different combinations of text and image prompts. It is fascinating to see what the AI interprets from the image. To me, the most important elements of the image I wanted the AI to follow are the consistent rectilinear composition and the general lighting quality, but it also does a great job at understanding the design language of the structure and reinterpreting it. Below are four generations using different real-life buildings as image prompts while keeping the text prompt consistent.

Prompt - (Image) a unique art deco skyscraper with an iconic spire photographed at sunset with soft dramatic lighting --ar 2:3 --iw 2 (--ar sets the aspect ratio, and --iw changes the weight of the text vs image)

Carbide and Carbon Building, Chicago IL.

Eastern Columbia Building, Los Angeles CA.

Genesee Valley Trust Building, Rochester NY.

Buffalo City Hall, Buffalo NY.

Text Prompts

I tried a variety of prompting formats, ranging from short and simple to very descriptive. I found that thinking of architectural style, materiality and lighting yielded interesting results. It is also good to include words like “iconic” or “unique” to try to generate atypical results. Below are examples of using the same image prompt of the General Electric Building in New York but varying the text.

Prompt - (Image) an all white art nouveau skyscraper made from carved white marble with a shimmering aluminum panel spire. --ar 2:3

Often the text is not followed exactly. Here, the “aluminum panel spire” was not generated.

Prompt - (Image) a sinister all black skyscraper with red art deco details. The facade is made of all black metal with a unique black spire --ar 2:3

You can see here how the text was not explicitly followed, as there is some brown stone in some of the options.

Image Weight

The --iw parameter at the end of a prompt enables users to change the “weight” between the text and the image prompt, within a range of 0 to 2. A lower value is helpful for making sure the AI listens to your words, but very quickly things like composition from the image become lost. Below are examples with a real building image I created on the left (70 Pine in NYC) and the output on the right.

Prompt: (Image prompt) a sinister all black skyscraper with a dramatic design. There are red art deco details. --ar 2:3 --iw .5

You can see that with a lower IW, the AI follows the “all black” prompt better and diverges from the massing of the building, but the composition of the images is unsatisfactory, with converging vertical lines and no city in the background.

Prompt: (Image prompt) a sinister all black skyscraper with a dramatic design. There are red art deco details. --ar 2:3 --iw 2

With a higher image weight the AI doesn't follow the “all black” prompt, keeping the contrast in materials seen in the prompt image and staying truer to the massing. The composition, however, is perfect in the higher IW generations.
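The prompts in this section all follow the same pattern: an image reference, the text, then the --ar and --iw parameters. As a rough sketch of how that pattern could be assembled when preparing many prompts, here is a small Python helper. It only builds the prompt string to paste into Midjourney; it does not talk to Midjourney in any way. The function name `build_prompt` and the example URL are my own placeholders, not anything from Midjourney, and the 0 to 2 range check simply mirrors the range described above.

```python
def build_prompt(text, image_url=None, aspect_ratio="2:3", image_weight=None):
    """Assemble a prompt string in the form: [image URL] text --ar W:H [--iw N].

    Hypothetical helper for illustration; it only formats text.
    """
    parts = []
    if image_url:
        # The image reference goes first, ahead of the text prompt.
        parts.append(image_url)
    parts.append(text.strip())
    parts.append(f"--ar {aspect_ratio}")
    if image_weight is not None:
        if image_url is None:
            raise ValueError("image weight only applies when an image prompt is given")
        # Range taken from the description above (0 to 2); an assumption, not a spec.
        if not 0 <= image_weight <= 2:
            raise ValueError("image weight must be between 0 and 2")
        parts.append(f"--iw {image_weight:g}")
    return " ".join(parts)


TEXT = "a sinister all black skyscraper with a dramatic design. There are red art deco details."
IMAGE = "https://example.com/70-pine.jpg"  # placeholder URL, not a real image

# Same text and image, two different image weights, as in the comparison above.
for weight in (0.5, 2):
    print(build_prompt(TEXT, IMAGE, image_weight=weight))
```

Keeping the text and image fixed and changing only the weight makes it easier to see which differences in the output come from --iw alone.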

Photoshop

In creating the final presented images, I used Photoshop to fix any issues in the generations. Often I had to replace or manipulate the backgrounds of the buildings, and I cleaned up any obvious mistakes.

The output straight from Midjourney is on the left, and the refined Photoshopped version is on the right. The background was replaced, the composition was adjusted, windows were added, and color and tone were enhanced.

Closing Thoughts

I worked in an architecture office designing high-rise buildings for 3 years, and I think it offers an interesting analogy. At work, I was the one who did the nitty-gritty creation of design options that were prompted by my project architect / manager, albeit much slower than AI. Through this AI process I began to feel more like a manager, steering the ship and curating options. It was like I had my own team of entry-level designers producing designs based on my specifications, and it was my job to sift through their work to discover the nuggets of brilliance amidst the plethora of fails.