An isometric room, based on the screenshot. Qwen3.6-35B

Posted by k0setes@reddit | LocalLLaMA | 19 comments

I knew Qwen3.6-35B-A3B-UD-Q4_K_S was capable of generating 3D scenes, but I didn't expect this result. I found the original screenshot on r/OpenAI and asked Qwen to recreate it, then nudged it to round out the furniture and add some texture to the rug.

Chromix_@reddit

I also gave it (Qwen3.6-35B-A3B-UD-Q8_K_XL) the screenshot with a simple prompt: "Build a single-file HTML visualization of this screenshot using three.js. Include OrbitControls." 7k tokens later I got this, which is considerably less detailed:

cunasmoker69420@reddit

What does the image min tokens setting do?

Chromix_@reddit

It encodes the image into more tokens, giving the LLM (if it supports dynamic resolution) a more detailed representation of the image.

cunasmoker69420@reddit

Do you know what the default value is and what the practical options are?

Chromix_@reddit

Next iteration with --image-min-tokens 2048 and 9k tokens looks rather decent: "Build a single-file HTML visualization of this screenshot using three.js. Include OrbitControls. Ensure an accurate, detailed representation. Prevent Z-fighting."

I've also tried to push the quality with more prompting like this:

Build a single-file HTML visualization of this screenshot using three.js. Include OrbitControls. Create an accurate, detailed representation. Make the edges of most furniture visibly rounded, to make the objects look smoother and more realistic. Use high-quality lighting while ensuring visibility like in the screenshot. Prevent Z-fighting. Ensure code correctness.

Yet that led to unreliable results: clearly broken geometry, script errors such as defining a const twice, three.js usage errors, etc. Apparently it's too complex. It only did basic reasoning about materials and high-level choices, nothing about actual code. Occasionally, with a bit of hinting, it got something working, though not as nice as OP's images.

Gemma4 31B (Q4_K_XL) reliably creates working scenes with this more detailed prompt, but their quality is well below what Qwen produces with the "next iteration" prompt. The same holds when Gemma is given the "next iteration" prompt.
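
A rough sketch of why the "Prevent Z-fighting" line in the prompts above matters, assuming a perspective camera and a 24-bit depth buffer (both are assumptions; the thread does not say what the generated scenes use). `depthResolution` is a hypothetical helper, and the formula is a first-order approximation of the smallest gap between two surfaces that the depth buffer can still tell apart at distance `z`.

```javascript
// Hypothetical helper, not part of three.js.
// Approximate smallest depth separation (world units) that a perspective
// depth buffer with `bits` of precision can resolve at distance z.
function depthResolution(z, near, far, bits = 24) {
  const steps = 2 ** bits; // number of distinct depth values
  return (z * z * (far - near)) / (near * far * steps);
}

// Same scene distance, two different near planes.
const looseNear = depthResolution(10, 0.1, 1000);
const tightNear = depthResolution(10, 1, 1000);

console.log("near = 0.1:", looseNear.toExponential(2));
console.log("near = 1.0:", tightNear.toExponential(2));
console.log("ratio:", (looseNear / tightNear).toFixed(2));
```

Under these assumptions, moving the near plane from 0.1 to 1 shrinks the resolvable gap by roughly a factor of ten, so nearly coplanar surfaces such as a rug lying on a floor are less likely to flicker. In three.js, `near` and `far` are the third and fourth arguments of `PerspectiveCamera`.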

CircularSeasoning@reddit

I swear every time the model sees the word "ensure" this or that, it takes it as a sign of weakness and desperation on the prompter's part and basically ignores it.

Adding things like "ensure code correctness" right at the end rarely works out for me. Being more specific about what code correctness you actually expect seems to do much better, even if it's just a style expectation like single quotes vs. double quotes.

New_Comfortable7240@reddit

Not sure if it helps, but try pinning the three.js version. There was a big breaking change regarding lights in recent versions, so I pinned an older version and it worked.

Chromix_@reddit

You dare add actual knowledge to this vibe coding topic? 😉

Yes, it of course makes a lot of sense to pin dependencies to the knowledge cut-off date of the model. Well, either that or it needs to be provided with updated documentation on it.

The model explicitly includes version 0.160.0 which was released January 2024 - which is strange, as the knowledge cut-off seems to be around end of 2025. Maybe it just has a bit of outdated training data mixed in.
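
A minimal sketch of the pinning idea, assuming the scene is loaded as an ES module through an import map and served from the jsDelivr CDN. `buildPinnedPage` is a hypothetical helper, and the default version is just the 0.160.0 mentioned above; swap in whichever release the model's code was written against.

```javascript
// Hypothetical helper: wraps generated scene code in a single HTML page
// whose import map pins three.js (and its addons) to one exact version.
function buildPinnedPage(sceneCode, version = "0.160.0") {
  const base = `https://cdn.jsdelivr.net/npm/three@${version}`;
  const importMap = {
    imports: {
      three: `${base}/build/three.module.js`,
      "three/addons/": `${base}/examples/jsm/`,
    },
  };
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<body>",
    `<script type="importmap">${JSON.stringify(importMap, null, 2)}</script>`,
    '<script type="module">',
    sceneCode,
    "</script>",
    "</body>",
    "</html>",
  ].join("\n");
}

// Example: a tiny module that only reports which revision was loaded.
const page = buildPinnedPage(
  'import * as THREE from "three";\nconsole.log(THREE.REVISION);'
);
console.log(page);
```

With the version fixed in one place, imports such as `three/addons/controls/OrbitControls.js` resolve against the same release, so the generated code does not silently pick up a newer build with changed lighting behavior.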

Unknown_New_God@reddit

Prompt: Create a detailed isometric visualization of this room in Three.js. Hide the two foreground walls and the ceiling. Arrange the elements in the room logically so they don't overlap or intersect (e.g., the bookshelf and other furniture should have proper spacing). Add decorative panels to the walls and implement high-quality lighting for the scene.

Several-Tax31@reddit

Looks awesome! I love that some guy brags about a soon-to-come, nonexistent closed-source model while a smaller, existing open model already replicates that feat.

Complete_Instance_18@reddit

That's a genuinely impressive leap for a local model.

Skystunt@reddit

This looks like a cool way to play with models. Too bad most of them aren't trained to do this kind of task.

Tormeister@reddit

They used an elaborate prompt to generate the original 3D scene, didn't they? Surely recreating it from a screenshot is much more difficult than from a meticulous text description.

houchenglin@reddit

It seems the output mixed the scenes from two screenshots. Using only one target scene might show the difference between Qwen 3.6 and GPT 5.5 more clearly.

EatTFM@reddit

Could you share the exact prompt you used?

k0setes@reddit (OP)

I used the original image as a reference and sent these two prompts:

Prompt 1: "Create a detailed isometric visualization of this room in Three.js. Hide the two foreground walls and the ceiling. Arrange the elements in the room logically so they don't overlap or intersect (e.g., the bookshelf and other furniture should have proper spacing). Add decorative panels to the walls and implement high-quality lighting for the scene."

Prompt 2 (for the rounded edges): "Is it possible to modify the furniture—like the sofa, bed, and pillows—to have rounded edges instead of sharp corners? Use rounded primitives or beveled geometries to make the objects look smoother and more realistic."

tableball35@reddit

So, how do you generate stuff like this? Does it build the 3D models, or is it more a code output you plug into some other app?

k0setes@reddit (OP)

Qwen generated an HTML file based on the image and built the entire scene out of primitives using Three.js. There’s nothing stopping you from asking it to add an exporter to save the scene as an STL or OBJ file, so you can open it in other 3D software instead of just a browser.
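
To make the export idea concrete, here is a dependency-free sketch of what an ASCII STL writer does with a list of triangles. It is an illustration of the file format, not the code Qwen produced; `faceNormal` and `trianglesToAsciiStl` are hypothetical names. In an actual three.js page you would more likely ask for the `STLExporter` or `OBJExporter` addon that ships with the library.

```javascript
// Each triangle is three [x, y, z] vertices in counter-clockwise order.
// Returns the unit normal of the triangle via the cross product of two edges.
function faceNormal([a, b, c]) {
  const u = [b[0] - a[0], b[1] - a[1], b[2] - a[2]];
  const v = [c[0] - a[0], c[1] - a[1], c[2] - a[2]];
  const n = [
    u[1] * v[2] - u[2] * v[1],
    u[2] * v[0] - u[0] * v[2],
    u[0] * v[1] - u[1] * v[0],
  ];
  const len = Math.hypot(...n) || 1; // avoid dividing by zero on degenerate faces
  return n.map((x) => x / len);
}

// Serializes triangles into the plain-text STL layout.
function trianglesToAsciiStl(name, triangles) {
  const lines = [`solid ${name}`];
  for (const tri of triangles) {
    const [nx, ny, nz] = faceNormal(tri);
    lines.push(`  facet normal ${nx} ${ny} ${nz}`, "    outer loop");
    for (const [x, y, z] of tri) lines.push(`      vertex ${x} ${y} ${z}`);
    lines.push("    endloop", "  endfacet");
  }
  lines.push(`endsolid ${name}`);
  return lines.join("\n");
}

// A unit square (for example a floor tile) split into two triangles.
const floorQuad = [
  [[0, 0, 0], [1, 0, 0], [1, 1, 0]],
  [[0, 0, 0], [1, 1, 0], [0, 1, 0]],
];
const stl = trianglesToAsciiStl("room", floorQuad);
console.log(stl);
```

The resulting string can be saved with a `.stl` extension and opened in other 3D software. Because the scene is built from primitives, every box or cylinder ultimately reduces to triangles like these.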

Wise-Hunt7815@reddit

Maybe: HTML + three.js, that's easy.