5 for inpainting details. Rank 8 is a very low LoRA rank, barely above the minimum. SDXL is supposedly better at generating text, too, a task that’s historically. With 4 times more pixels, the AI has more room to play with, resulting in better composition and. SDXL is ready to turn heads. For example, the default value for HED is 512 and for depth 384; if I increase the value from 512 to 550, the image becomes a bit more accurate. (Interesting side note - I can render 4k images on 16GB VRAM. SDXL Control Net Models. SDXL or Stable Diffusion XL is an advanced model developed by Stability AI that allows high-resolution AI image synthesis and enables local machine execution. (As a sample, we have prepared a resolution set for SD1. There is still room for further growth compared to the improved quality in generation of hands. Replicate was ready from day one with a hosted version of SDXL that you can run from the web or using our cloud API. The total number of parameters of the SDXL model is 6. Static engines use the least amount of VRAM. Stable Diffusion XL has brought significant advancements to text-to-image and generative AI images in general, outperforming or matching Midjourney in many aspects. Stability AI’s SDXL 1. From these examples, it’s clear that the quality is now on par with Midjourney. The benefits of using the SDXL model are. 5) This capability, once restricted to high-end graphics studios, is now accessible to artists, designers, and enthusiasts alike. SDXL 1. Thank God, SDXL doesn't remove. The Stable Diffusion XL (SDXL) model is the official upgrade to the v1. 0 offers a variety of preset art styles ready to use in marketing, design, and image generation use cases across industries. (2) Even if you are able to train at this setting, note that SDXL is a 1024x1024 model, and training it with 512 images leads to worse results.
Feedback gained over weeks. 1's 860M parameters. The model also contains new CLIP encoders, and a whole host of other architecture changes, which have real implications. Run time and cost. For example, if the base SDXL is already good at producing an image of Margot Robbie, then. Also, when I use it to generate a 1024x1416 image, it takes up all 24GB of the VRAM on my 4090 and takes me over 5 minutes to make an image. Stability AI has released the latest version of its text-to-image algorithm, SDXL 1. While both videos involve inpainting resolutions of 768 or higher, the same 'trick' works perfectly for me on my laptop's 4GB GTX 1650 at 576x576 or 512x512. 0 and updating could break your Civitai LoRAs, which has happened to LoRAs when updating to SD 2. Stable Diffusion XL (SDXL) is the latest AI image generation model that can generate realistic faces, legible text within the images, and better image composition, all while using shorter and simpler prompts. 5 as the original set of ControlNet models were trained from it. Recommended graphics card: ASUS GeForce RTX 3080 Ti 12GB. Added support for generate forever mode (ported from SD web UI). SDXL was trained on a lot of 1024x1024 images so this shouldn't happen on the recommended resolutions. 256x512 1:2. Better prompt following, due to the use of dual CLIP encoders and some improvement in the underlying architecture that is beyond my level of understanding 😅. The AI model was trained on images of varying sizes, so you can generate results at different resolutions. Moreover, I will show how to do proper high resolution fix (Hires. 9 are available and subject to a research license. It's simply thanks to the higher native resolution so the model has more pixels to work with – if you compare pixel for. 5 model. Within those channels, you can use the following message structure to enter your prompt: /dream prompt: *enter prompt here*.
SDXL - The Best Open Source Image Model. The Stability AI team takes great pride in introducing SDXL 1. I would prefer that the default resolution was set to 1024x1024 when an SDXL model is loaded. Both I and RunDiffusion are interested in getting the best out of SDXL. 9 - How to use SDXL 0. Recommended graphics card: MSI Gaming GeForce RTX 3060 12GB. Additionally, I've included explanations directly. This means every image. Don't miss it! Today we will talk about SDXL, a latent diffusion model that has revolutionized the quality of images generated at high resolution. 2:1 to each prompt. For example: 896x1152 or 1536x640 are good resolutions. 0 is particularly well-tuned for vibrant and accurate colors, with better contrast, lighting, and shadows than its predecessor, all in native 1024x1024 resolution. Possibly deprecated now that the. Some notable improvements in the model architecture introduced by SDXL are: You don't want to train SDXL with 256x1024 and 512x512 images; those are too small. 5 models are (which in some cases might be a con for 1. 9: The base model was trained on a variety of aspect ratios on images with resolution 1024^2. 5 model. 3 (I found 0. To learn how to use SDXL for various tasks, how to optimize performance, and other usage examples, take a look at the Stable Diffusion XL guide. 10:51 High resolution fix testing with SDXL (Hires. sdxl-recommended-res-calc. Here's the code to generate your own custom resolutions: SDFX : New UI for Stable Diffusion. It is a more flexible and accurate way to control the image generation process. In addition to this, with the release of SDXL, StabilityAI have confirmed that they expect LoRAs to be the most popular way of enhancing images on top of the SDXL v1. Not really.
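The resolutions quoted above (896x1152, 1536x640) can be enumerated rather than memorized. The following is a minimal sketch, not the script the text refers to: the function name `sdxl_resolutions`, the 64-pixel step, the side limits, and the 10% area tolerance are all assumptions chosen for illustration.

```python
def sdxl_resolutions(target_pixels=1024 * 1024, step=64,
                     min_side=512, max_side=2048, tolerance=0.10):
    """List (width, height) pairs whose area is close to target_pixels.

    All defaults are illustrative assumptions, not values taken from
    an official SDXL tool.
    """
    sizes = []
    for width in range(min_side, max_side + 1, step):
        for height in range(min_side, max_side + 1, step):
            # Relative difference between this area and the 1024x1024 area.
            deviation = abs(width * height - target_pixels) / target_pixels
            if deviation <= tolerance:
                sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    for width, height in sdxl_resolutions():
        print(f"{width}x{height}  ratio {width / height:.2f}")
```

Tightening `tolerance` narrows the list toward sizes whose pixel count is nearly identical to 1024x1024; loosening it admits more extreme aspect ratios.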
Enter the following to activate the virtual environment: source venv/bin/activate. 45it/s. Thankfully, some people have made this much easier by publishing their own workflow and sharing them # SeargeSDXL. 0 model from Stability AI is a game-changer in the world of AI art and image creation. 5 for 6 months without any problem. To do img2img, you essentially do the exact same setup as text to image, but feed the first KSampler's latent output into the second KSampler's latent_image input. We present SDXL, a latent diffusion model for text-to-image synthesis. The release model handles resolutions lower than 1024x1024 a lot better so far. Guidelines for SDXL Finetuning: Set the Max resolution to at least 1024x1024, as this is the standard resolution for SDXL. That model architecture is big and heavy enough to accomplish that the. However, you can still change the aspect ratio of your images. The model’s visual quality, trained at 1024x1024 resolution compared to version 1. 0, renowned as the best open model for photorealistic image generation, offers vibrant, accurate colors, superior contrast, and detailed shadows at a native resolution of… VRAM consumption is surprisingly okay even at a resolution above the 1024x1024 default. 5 models. I’ve created these images using ComfyUI. I’ll create images at 1024 size and then will want to upscale them. But it seems to be fixed when moving on to 48G VRAM GPUs. License: SDXL 0. It’s in the diffusers repo under examples/dreambooth. Maybe you need to check your negative prompt; add everything you don't want, like "stains, cartoon". The default is "512,512". Notes. I still saw double and stretched bodies when going outside the 1024x1024 standard SDXL resolution. But this bleeding-edge performance comes at a cost: SDXL requires a GPU with a minimum of 6GB of VRAM, requires larger.
Stability AI published a couple of images alongside the announcement, and the improvement can be seen between outcomes (Image Credit). Model type: Diffusion-based text-to-image generative model. Compare that to fine-tuning SD 2. The SDXL 1. SDXL is a diffusion model for images and has no ability to be coherent or temporal between batches. Did you disable upscaling bucket resolutions? SDXL comes with an integrated Dreambooth feature. SDXL v0. SDXL is composed of two models, a base and a refiner. Additionally, it accurately reproduces hands, which was a flaw in earlier AI-generated images. With native 1024×1024 resolution, the generated images are detailed and visually stunning. It is convenient to use these presets to switch between image sizes. ai Discord server to generate SDXL images, visit one of the #bot-1 – #bot-10 channels. 5 method. You may want to try switching to the sd_xl_base_1. 1990s anime low resolution screengrab couple walking away in street at night. More Intelligent with Simpler Language. A very nice feature is defining presets. Stability. 9, which adds image-to-image generation and other capabilities. The new version generates high-resolution graphics while using less processing power and requiring fewer text inputs. 9 and SD 2. 9 Research License. Next (A1111 fork, also has many extensions) are the most feature rich. As the newest evolution of Stable Diffusion, it’s blowing its predecessors out of the water and producing images that are competitive with black-box.
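Since SDXL is composed of a base and a refiner, a sampling run is usually divided between the two. The sketch below only does the arithmetic of that division; the helper name `split_steps` and the example values (50 steps, a 0.8 share for the base model) are assumptions for illustration, and how the split is passed to a particular UI or library is not shown.

```python
def split_steps(total_steps, base_fraction):
    """Divide sampling steps between the base model and the refiner.

    base_fraction is the share of the schedule handled by the base model
    (the high-noise part); the refiner takes the remainder.
    """
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must be between 0 and 1")
    base_steps = round(total_steps * base_fraction)
    refiner_steps = total_steps - base_steps
    return base_steps, refiner_steps


if __name__ == "__main__":
    base, refiner = split_steps(50, 0.8)
    print(f"base: {base} steps, refiner: {refiner} steps")
```

A larger fraction leaves the refiner fewer steps, so it only touches fine detail; a smaller one lets it alter more of the image.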
Negative prompt: 3d render, smooth, plastic, blurry, grainy, low-resolution, anime. For the best results, it is. Added support for custom resolutions and custom resolutions list. json as a template). 9, so I guess it will do as well when SDXL 1. SDXL 1. 0 version. What I found works best for me is to generate at 1024x576, and then upscale 2x to get 2048x1152 (both 16:9 resolutions), which is larger than my monitor resolution (1920x1080). SDXL offers negative_original_size, negative_crops_coords_top_left, and negative_target_size to negatively condition the model on image resolution and. A new fine-tuning beta feature is also being introduced that uses a small set of images to fine-tune SDXL 1. Note: The base SDXL model is trained to best create images around 1024x1024 resolution. If the training images exceed the resolution. The comparison of SDXL 0. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L). According to SDXL paper references (Page 17), it's advised to avoid arbitrary resolutions and stick to. Compared to previous versions of Stable Diffusion, SDXL leverages a three times larger UNet backbone. 9 Model. model_id: sdxl. Using the SDXL base model on the txt2img page is no different from using any other models. That way you can create and refine the image without having to constantly swap back and forth between models. Detailed Explanation about SDXL sizes and where to use each size: When creating images with Stable Diffusion, one important consideration is the image size or resolution. Support for custom resolutions - you can just type it now in Resolution field, like "1280x640". A new architecture with 2. (Left - SDXL Beta, Right - SDXL 0. Any tips are welcome! For context, I've been at this since October, 5 iterations over 6 months, using 500k original content on a 4x A10 AWS server.
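The negative size-conditioning arguments named above can be collected into one set of keyword arguments. This is a hedged sketch: the three `negative_*` names come from the text, while `original_size`, `target_size`, `crops_coords_top_left`, and the (height, width) tuple order are my understanding of the diffusers SDXL pipeline and should be checked against its documentation. The helper `size_conditioning_kwargs` and the 512x512 negative size are hypothetical.

```python
def size_conditioning_kwargs(width=1024, height=1024, negative_size=(512, 512)):
    """Build keyword arguments that steer SDXL away from low-resolution looks.

    Tuples are written as (height, width), which is an assumption about
    the pipeline's convention rather than something stated in the text.
    """
    return {
        "width": width,
        "height": height,
        # Positive conditioning: ask for an uncropped image at full size.
        "original_size": (height, width),
        "target_size": (height, width),
        "crops_coords_top_left": (0, 0),
        # Negative conditioning: push away from the look of small images.
        "negative_original_size": negative_size,
        "negative_crops_coords_top_left": (0, 0),
        "negative_target_size": (height, width),
    }


if __name__ == "__main__":
    kwargs = size_conditioning_kwargs()
    # With a loaded SDXL pipeline object `pipe` (not created here), these
    # would be passed along the lines of: pipe(prompt="...", **kwargs)
    for name, value in kwargs.items():
        print(name, value)
```

Keeping the arguments in a plain dictionary makes it easy to log exactly which conditioning values produced a given image.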
I suspect that the dataset that was used for SDXL is the cause, but I'm no expert. Several models are available, from different providers, e. 704x384 ~16:9. Stable Diffusion XL (SDXL) is a powerful text-to-image generation model that iterates on the previous Stable Diffusion models in three key ways: the UNet is 3x larger and SDXL combines a second text encoder (OpenCLIP ViT-bigG/14) with the original text encoder to significantly increase the number of parameters. SDXL 1. Inpaint: Precise removal of imperfections. The speed hit SDXL brings is much more noticeable than the quality improvement. The higher base resolution mostly just means that it. The memory use is great too; I can work with very large resolutions with no problem. It can generate good images at different resolutions beyond the native training resolution without hires fix etc. 5. Best settings for Stable Diffusion XL 0. Here is the recommended configuration for creating images using SDXL models. Instead you have to let it VAEdecode to an image, then VAEencode it back to a latent image with the VAE from SDXL and then upscale. Inpainting Workflow for ComfyUI. ResolutionSelector for ComfyUI. 640x448 ~4:3. Negative Prompt: 3d render, smooth, plastic, blurry, grainy, low-resolution, anime, deep-fried, oversaturated. 1, not the 1. impressed with SDXL's ability to scale resolution!) --- Edit - you can achieve upscaling by adding a latent.
0 is particularly well-tuned for vibrant and accurate colors, with better contrast, lighting, and shadows than its predecessor, all in native 1024×1024 resolution,” the company said in its announcement. 9, produces visuals that are more realistic than its predecessor. DS games a resolution of 256x192. Now, let’s take a closer look at how some of these additions compare to previous Stable Diffusion models. 🧨 Diffusers. Introduction, Pre-requisites, Initial Setup, Preparing Your Dataset, The Model, Start Training, Using Captions, Config-Based Training, Aspect Ratio / Resolution Bucketing, Resume Training, Batches, Epochs… Due to the current structure of ComfyUI, it is unable to distinguish between SDXL latent and SD1. Firstly, we perform pre-training at a resolution of 512x512. Stability AI claims that the new model is “a leap. Edit the file resolutions. SDXL Resolution Cheat Sheet. A custom node for Stable Diffusion ComfyUI to enable easy selection of image resolutions for SDXL SD15 SD21. "AI image generation is as good as done," CEO Mostaque said in a Q&A on the official Discord server shortly after SDXL's announcement. 896 x 1152 - 7:9. Massive 4K Resolution Woman & Man Class Ground Truth Stable Diffusion Regularization Images Dataset. The train_instruct_pix2pix_sdxl.py script shows how to implement the training procedure and adapt it for Stable Diffusion XL. Kicking the resolution up to 768x768, Stable Diffusion likes to have quite a bit more VRAM in order to run well. The images being trained in a 1024×1024 resolution means that your output images will be of extremely high quality right off the bat. 9 architecture. 4 just looks better. Tips for SDXL training. With Stable Diffusion XL 1.
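Aspect ratio / resolution bucketing, mentioned above for training, amounts to assigning each training image to the supported size whose shape is closest to its own. The sketch below is one simple way to do that and is not the procedure of any specific trainer: the bucket list mixes sizes quoted in this text (such as 896 x 1152) with mirrored counterparts that are assumed, and `nearest_bucket` and `bucket_images` are hypothetical helper names.

```python
import math
from collections import defaultdict

# Candidate buckets as (width, height). Illustrative, not an official list.
BUCKETS = [
    (1024, 1024),
    (896, 1152), (1152, 896),
    (832, 1216), (1216, 832),
    (768, 1344), (1344, 768),
    (640, 1536), (1536, 640),
]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Pick the bucket whose aspect ratio is closest on a log scale."""
    ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ratio))


def bucket_images(sizes):
    """Group image names by bucket; sizes maps name -> (width, height)."""
    groups = defaultdict(list)
    for name, (width, height) in sizes.items():
        groups[nearest_bucket(width, height)].append(name)
    return dict(groups)


if __name__ == "__main__":
    dataset = {"a.png": (3000, 3000), "b.png": (1920, 1080), "c.png": (700, 900)}
    for bucket, names in bucket_images(dataset).items():
        print(bucket, names)
```

Comparing ratios on a log scale treats a 2:1 image and a 1:2 image symmetrically, which a plain difference of ratios would not.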
Some stupid scripting workaround to fix the buggy implementation and to make sure it redirects you to the actual full resolution original images (which are PNGs in this case), otherwise it. Conclusion: Diving into the realm of Stable Diffusion XL (SDXL 1. I'm super excited for the upcoming weeks and months on what the wider community will come up with in terms of additional fine tuned models. SDXL represents a landmark achievement in high-resolution image synthesis. 9 is run on two CLIP models, including one of the largest CLIP models trained to date (CLIP ViT-g/14), which beefs up 0.9’s processing power and ability to create realistic imagery with greater depth and a higher resolution of 1024x1024. Try to add "pixel art" at the start of the prompt, and your style at the end, for example: "pixel art, a dinosaur on a forest, landscape, ghibli style". Compact resolution and style selection (thx to runew0lf for hints). Dhanshree Shripad Shenwai. In our experiments, we found that SDXL yields good initial results without extensive hyperparameter tuning. 5 wins for a lot of use cases, especially at 512x512. 9 and Stable Diffusion 1. Varying Aspect Ratios. SD generations used 20 sampling steps while SDXL used 50 sampling steps. json file during node initialization, allowing you to save custom resolution settings in a separate file. 0. SDXL Resolutions: U don't need a calculator ;) Face Detailer: Refinement of facial features for lifelike results. Yeah, upscaling to a higher resolution will also bring out more detail with highres fix, or with img2img. Just like its predecessors, SDXL has the ability to generate image variations using image-to-image prompting, inpainting (reimagining of the selected. Official list of SDXL resolutions (as defined in SDXL paper).
This model operates through a two-step pipeline, leveraging a base model to generate latents of the desired output size and then utilizing a specialized high-resolution model and the SDEdit technique to transform these latents based on a given. Here are some examples of what I mean: Negative prompt: 3d render, smooth, plastic, blurry, grainy, low-resolution, anime. Run webui-user. 0 offers better design capabilities as compared to V1. My goal is to create a darker, grittier model. 0 base model. Some users have suggested using SDXL for the general picture composition and version 1. Gradient checkpointing enabled, adam8b, constant scheduler, 24 dim and. So I won't really know how terrible it is till it's done and I can test it the way SDXL prefers to generate images. 0, the flagship image model developed by Stability AI, stands as the pinnacle of open models for image generation. With its ability to generate high-resolution images from text descriptions and its built-in fine-tuning feature, SDXL 1. We can't use 1. For the record I can run SDXL fine on my 3060 Ti 8GB card by adding those arguments. RMSprop 8bit or Adagrad 8bit may work. Here's a simple script (also a Custom Node in ComfyUI thanks to u/CapsAdmin), to calculate and automatically set the recommended initial latent size for SDXL image. Select base SDXL resolution, width and height are returned as INT values which can be connected to latent image inputs or other inputs such as the CLIPTextEncodeSDXL width, height,. 1 even. 5 in sd_resolution_set. json - use resolutions-example. train_batch_size: Batch size (per device) for the training data loader. The fine-tuning can be done with 24GB GPU memory with the batch size of 1. According to many references, it's advised to avoid arbitrary resolutions and stick to this initial resolution, as SDXL was trained using this specific resolution.
5 to get their LoRAs working again, sometimes requiring the models to be retrained from scratch. In the AI world, we can expect it to be better. 0 enhancements include native 1024-pixel image generation at a variety of aspect ratios. 0 is an open-source diffusion model, the long-awaited upgrade to Stable Diffusion v2. 0 ComfyUI workflow with a few changes, here's the sample json file for the workflow I was using to generate these images. You generate the normal way, then you send the image to img2img and use the SDXL refiner model to enhance it. Max resolution. IMO do img2img in ComfyUI as well. 5 and SDXL. 0 is the latest SOTA text-to-image AI model, which gives ultra-realistic images at higher resolutions of 1024. 5 such as the better resolution and different prompt interpretation. VAE. 008/image: SDXL Fine-tuning: 500: N/A: N/A: $. I always use 3 as it looks more realistic in every model; the only problem is that to make proper letters with SDXL you need higher CFG. txt is updated to support SDXL training. Like the original Stable Diffusion series, SDXL 1. Specific Goals and Preferences: Not everyone is aiming to create MidJourney-like images. Stable Diffusion XL 0. The press release on SDXL 1. Our training examples use Stable Diffusion 1. After that, the bot should generate two images for your prompt. 9 Tutorial (better than Midjourney AI). Stability AI recently released SDXL 0. You really want to follow a guy named Scott Detweiler. 0, an open model representing the next evolutionary step in text-to-image generation models. ComfyUI is more optimized though. As usual, enter a prompt and a negative prompt, and feel free to tweak the parameters. 6, and now I'm getting 1 minute renders, even faster on ComfyUI.
Keep in mind the default resolution for SDXL is supposed to be 1024x1024, but people are using the refiner to generate images competently at 680x680, so maybe someone should try training smaller images on the refiner instead? SDXL 1. The below settings for width and height are optimal for use on SDXL 1. Pass that to another base KSampler. 9 especially if you have an 8GB card. The release went mostly under-the-radar because the generative image AI buzz has cooled. The chart above evaluates user preference for SDXL (with and without refinement) over SDXL 0. The SDXL base checkpoint can be used like any regular checkpoint in ComfyUI. 0, allowing users to specialize the generation to specific people or products using as few as five images. Traditional library with floor-to-ceiling bookcases, rolling ladder, large wooden desk, leather armchair, antique rug, warm lighting, high resolution textures, intellectual and inviting atmosphere ; 113: Contemporary glass and steel building with sleek lines and an innovative facade, surrounded by an urban landscape, modern, high resolution. I also tried different online services for SDXL and it had similar. SDXL 1. 5 checkpoints since I've started using SD. 0_0. Back in the terminal, make sure you are in the kohya_ss directory: cd ~/ai/dreambooth/kohya_ss. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. 0 is the new foundational model from Stability AI that’s making waves as a drastically-improved version of Stable Diffusion, a latent diffusion model (LDM) for text-to-image synthesis.
How to use the Prompts for Refine, Base, and General with the new SDXL Model. Here’s everything I did to cut SDXL invocation to as fast as 1. 5 generates good enough images at high speed. Stable Diffusion 2. 5 and 2. 9 to create realistic imagery with greater depth and a higher resolution of 1024x1024. Resolution Sampler Price; SDXL: 30: 1024x1024: DDIM (and any not listed below as premium) $. SDXL does support resolutions for higher total pixel values; however, results will not be optimal. 1 so AI artists have returned to SD 1. (I’ll see myself out. Dynamic engines generally offer slightly. Stability AI has now ended the beta-testing phase and announced a new version: SDXL 0. SDXL's VAE is known to suffer from numerical instability issues. Stable Diffusion XL (SDXL 1. 0. Support for multiple native resolutions instead of just one for SD1. What is SDXL 1. Here are the image sizes that are used in DreamStudio, Stability AI’s official image generator: 21:9 – 1536 x 640; 16:9 – 1344 x 768; 3:2 – 1216 x 832; 5:4 – 1152 x 896; 1:1 – 1024 x. Below are the presets I use. resolution: The resolution for input images; all the images in the train/validation datasets will be resized to this. 0: A step forward in AI image generation. SDXL consists of a two-step pipeline for latent diffusion: First, we use a base model to generate latents of the desired output size. 43 MRE ; Added support for Control-LoRA: Depth. It was developed by researchers. 7gb without generating anything.
For models, SDXL and custom models based on SDXL are the latest. 5 had. Initiate the download: Click on the download button or link provided to start downloading the SDXL 1. I was looking at that figuring out all the argparse commands. 0 natively generates images best in 1024 x 1024. You should use 1024x1024 resolution for 1:1 aspect ratio and 512x2048 for 1:4 aspect ratio. SDXL 1. When an SDXL model is selected, only SDXL LoRAs are compatible and the SD1. Multiples of 1024x1024 will create some artifacts, but you can fix them with inpainting. This is why we also expose a CLI argument namely --pretrained_vae_model_name_or_path that lets you specify the location of a better VAE (such as this one). On 26th July, StabilityAI released the SDXL 1. Thanks. Select the SDXL base model from the dropdown. Abstract. First of all, SDXL 1. Although the resolutions and ratios above are recommended, you can also try other variations. 2000 steps is fairly low for a dataset of 400 images. 8 (80%) High noise fraction. What does SDXL stand for? Here, SDXL stands for Stable Diffusion XL ("Schedule Data EXchange Language" is an unrelated expansion of the same acronym). A simple script to calculate the recommended initial latent size for SDXL image generation and its Upscale Factor based on the desired Final Resolution output. Author Stability. 0 can generate high-resolution images, up to 1024x1024 pixels, from simple text descriptions. Nodes are unpinned, allowing you to understand the workflow and its connections. 0 model is trained on 1024×1024 dimension images which results in much better detail and quality of images generated. You should either use exactly 1024x1024 res or multiples of it. A Faster and better training recipe: In our previous version, training directly at a resolution of 1024x1024 proved to be highly inefficient. 5 LoRAs I trained on this dataset had pretty bad-looking sample images, too, but the LoRA worked decently considering my dataset is still small.
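The idea of deriving an initial size and an upscale factor from a desired final resolution can be sketched as follows. This is not the script mentioned above, whose code is not shown here; the candidate size list, the function name `recommend_initial_size`, and the choice to scale by the larger of the two axis ratios are all assumptions.

```python
# Candidate initial sizes as (width, height). Illustrative assumption only.
SDXL_SIZES = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1536, 640), (640, 1536),
]


def recommend_initial_size(final_width, final_height, sizes=SDXL_SIZES):
    """Return (width, height, upscale_factor) for a desired final resolution."""
    target_ratio = final_width / final_height
    # Choose the candidate whose aspect ratio is closest to the target.
    width, height = min(sizes, key=lambda s: abs(s[0] / s[1] - target_ratio))
    # Scale so the upscaled image covers the final size on both axes;
    # any small overshoot on one axis can be cropped afterwards.
    upscale = max(final_width / width, final_height / height)
    return width, height, upscale


if __name__ == "__main__":
    w, h, factor = recommend_initial_size(3840, 2160)
    print(f"generate at {w}x{h}, then upscale by {factor:.3f}")
```

Because the candidate ratios rarely match the target exactly, taking the larger axis ratio trades a slight crop for never having to stretch the image.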
SDXL is not trained for 512x512 resolution, so whenever I use an SDXL model on A1111 I have to manually change it to 1024x1024 (or other trained resolutions) before generating. But the clip refiner is built in for retouches which I didn't need since I was too flabbergasted with the results SDXL 0. The situation SDXL is facing atm is that SD1. Like SDXL, Hotshot-XL was trained. 12700K CPU. For SDXL, I can generate some 512x512 pics, but when I try to do 1024x1024, I immediately run out of memory. Added MRE changelog. I extracted the full aspect ratio list from the SDXL technical report below. To prevent this from happening, SDXL accepts cropping and target resolution values that allow us to control how much (if any) cropping we want to apply to the generated images, and the level of. Switch (image,mask), Switch (latent), Switch (SEGS) - Among multiple inputs, it selects the input designated by the selector and outputs it. If you choose to use a lower resolution, such as (256, 256), the model still generates 1024x1024 images, but they'll look like the low resolution images (simpler patterns, blurring) in the dataset. Some models additionally have versions that require smaller memory footprints, which make them more suitable to be. Added Canny and Depth model selection. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Explained (summarized with GPT). Summary: SDXL (Stable Diffusion XL) is an improved latent diffusion model for high-resolution image synthesis, and it is open source. The model is effective, with many changes made to its architecture and not only to the data. Please see Additional Notes for a list of aspect ratios the base Hotshot-XL model was trained with. For best results, keep height and width at 1024 x 1024 or use resolutions that have roughly the same total number of pixels as 1024*1024 (1048576 pixels). Here are some examples: 896 x 1152; 1536 x 640. SDXL is often referred to as having a preferred resolution of 1024x1024.
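As a quick check of the pixel-count guideline above, the example sizes can be compared with 1024*1024 directly. The helper name `pixel_deviation` is hypothetical; the sizes are the ones quoted in the text, plus 512x512 for contrast.

```python
BASE_PIXELS = 1024 * 1024  # 1048576, the reference pixel count


def pixel_deviation(width, height):
    """Relative difference between width*height and 1024*1024."""
    return (width * height - BASE_PIXELS) / BASE_PIXELS


if __name__ == "__main__":
    for width, height in [(1024, 1024), (896, 1152), (1536, 640), (512, 512)]:
        pixels = width * height
        print(f"{width}x{height}: {pixels} pixels, "
              f"{pixel_deviation(width, height):+.2%} vs 1024x1024")
```

Both example sizes come out slightly below 1048576 pixels, so the guideline is best read as keeping the total close to that count; 512x512 has only a quarter of it.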