> Welcome to onnx-web, a Stable Diffusion tool designed for straightforward and versatile use in AI art. Whether you're
> running on an AMD or Nvidia GPU, on a Windows desktop, or a Linux server, onnx-web is compatible across various
> setups. It goes a step further by supporting multiple GPUs simultaneously, and with SDXL and LCM available for all
> platforms, users can harness its capabilities without constraints. The panorama mode stands out, enabling regional
> prompts without the need for additional plugins. Get ready to explore the technical aspects of onnx-web and discover
> how it seamlessly fits into your AI art toolkit.
OR
> Welcome to onnx-web, your gateway to exploring the cutting-edge realm of Stable Diffusion in AI art. This guide is tailored for new users seeking a seamless entry into the diverse and creative possibilities offered by onnx-web. Whether you're a seasoned AI art enthusiast or just dipping your toes into the world of generative art, onnx-web provides a user-friendly yet powerful environment to unleash your creativity.
> Installation and System Requirements:
> Before diving into the creative process, it's crucial to get onnx-web up and running on your system. We offer multiple installation methods to cater to users of varying technical expertise. For beginners on Windows, an all-in-one EXE bundle simplifies the installation process. Intermediate users can opt for a cross-platform installation using a Python virtual environment, providing greater customization. Server administrators can explore OCI containers for streamlined deployment. Keep in mind the minimum and recommended system requirements to ensure optimal performance, with options for optimizations tailored to low-memory users.
> Understanding onnx-web's Core Features:
> onnx-web introduces a set of core features that form the backbone of your AI art journey. The Stable Diffusion process, capable of running on both AMD and Nvidia GPUs, powers the image generation pipeline. Explore the diverse tabs in the web UI, each offering unique functionalities such as text-based image generation, upscaling, blending, and model management. Dive into the technical details of prompt syntax, model conversions, and the intricacies of parameters, gaining a deeper understanding of how to fine-tune the AI art creation process.
> Unlocking Specialized Pipelines:
> Delve into the specialized pipelines within onnx-web, such as the panorama and highres features, each designed to elevate your creative output. The panorama pipeline allows for the generation of large and seamless images, enhanced by the utilization of region prompts and seeds. Highres, on the other hand, provides a super-resolution upscaling technique, refining images with iterative img2img processes. Learn how to harness these features effectively, understanding the nuances of region-based modifications, tokens, and optimal tile configurations.
> Optimizing Performance and Memory Usage:
> Discover how onnx-web caters to users with varying hardware configurations. Uncover optimization techniques such as converting models to fp16 mode, offloading computations to the CPU, and leveraging specialized features like the panorama pipeline for efficient memory usage. Gain insights into the considerations and trade-offs involved in choosing parameters, tile configurations, and employing unique prompts for distinct creative outcomes.
> This guide sets the stage for your onnx-web journey, offering a balance of technical depth and user-friendly insights to empower you in your AI art exploration. Let's embark on this creative venture together, where innovation meets technical precision.
- > Role: The scheduler sets the noise schedule used during the diffusion process.
- > Explanation: It determines how the noise level is reduced from one step to the next, which affects how quickly the image converges and the character of the final result.
- > Role: CFG (classifier-free guidance) controls how strongly the prompt steers image generation.
- > Explanation: Raising the CFG value makes the output follow the prompt more closely, while lowering it gives the model more freedom at the cost of prompt adherence.
- > Role: Steps determine the number of diffusion steps applied to the image.
- > Explanation: More steps generally result in a more refined image but require additional computation. Users can fine-tune this parameter based on the desired trade-off between quality and computational resources.
- > Role: The seed initializes the randomization process, ensuring reproducibility.
- > Explanation: Setting a seed allows users to reproduce the same image by maintaining a consistent starting point for the random processes involved in the diffusion, facilitating result replication.
- > Role: Batch size sets the number of images generated at the same time.
- > Explanation: A larger batch size can expedite computation but may require more memory. It impacts the efficiency of the Stable Diffusion process, with users adjusting it based on available resources and computational preferences.
- > Role: The prompt provides the textual or visual input guiding the image generation.
- > Explanation: It serves as the creative input for the algorithm, shaping the direction of the generated content. Users articulate their artistic vision or preferences through carefully crafted prompts.
- > Role: The negative prompt counterbalances the prompt by describing what the image should not contain.
- > Explanation: By including a negative prompt, users can steer the output away from undesired elements and get a more controlled result.
- > Role: Width and height define the dimensions of the generated image.
- > Explanation: Users specify the width and height to control the resolution and aspect ratio of the output. This allows for customization based on the intended use or artistic preferences.
- > The UNet tile size sets the maximum size of each run of the UNet model. Lowering it reduces memory usage for panoramas and high-resolution processes, but caution is needed: reducing it below the image size in txt2img mode can produce repeated "totem pole" bodies, so match the tile size to the intended use case.
- > The UNet overlap parameter controls how much adjacent UNet tiles overlap. For most high-resolution applications, a value of 0.25 is recommended. For larger panoramas, values between 0.5 and 0.75 blend tiles more smoothly and significantly improve panorama quality.
- > The tiled VAE parameter controls whether the VAE (Variational Autoencoder) runs on the entire image or on smaller tiles. Tiling supports larger images and reduces VRAM usage without a substantial impact on image quality, which makes it a practical choice when memory is limited.
- > The VAE has two additional parameters: VAE tile size and VAE overlap. They mirror the UNet tile size and UNet overlap parameters, but apply to the VAE and only take effect when the tiled VAE is active.
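
The effect of tile size and overlap on the number of model runs can be sketched in a few lines of Python. This is a minimal illustration and not onnx-web's actual tiling code: the `tile_origins` helper is hypothetical, and it assumes that overlap is a fraction of the tile size and that tiles advance by `tile * (1 - overlap)` pixels along each axis.

```python
import math


def tile_origins(length: int, tile: int, overlap: float) -> list[int]:
    """Start offsets of tiles along one image axis.

    Hypothetical helper for illustration only. Assumes `overlap` is the
    fraction of each tile shared with its neighbor.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if tile >= length:
        # A single tile already covers the whole axis.
        return [0]
    # Distance between the starts of neighboring tiles.
    stride = max(1, int(tile * (1.0 - overlap)))
    count = math.ceil((length - tile) / stride) + 1
    # Clamp the last tile so it ends at the image edge.
    return [min(i * stride, length - tile) for i in range(count)]


if __name__ == "__main__":
    for overlap in (0.25, 0.5, 0.75):
        origins = tile_origins(2048, 512, overlap)
        print(f"overlap={overlap}: {len(origins)} tiles per axis")
```

Under these assumptions, a 2048-pixel axis with 512-pixel tiles needs 5 tiles at an overlap of 0.25, 7 at 0.5, and 13 at 0.75, so higher overlap values blend tiles more smoothly at the cost of more model runs.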