
PirateDiffusion Guide


Overview

Pirate Diffusion by Graydient AI is the most powerful bot on Telegram. It is a multi-modal bot, meaning it handles large language models, thousands of image models like FLUX, Pony, and SDXL, and also video models like LTX by Lightricks.

 

Incredible Value

Unlike other generative AI services, there are no “tokens” or “credits” required. PirateDiffusion is designed for unlimited use, royalty free, and it’s bundled with Graydient’s webUI.

 

Why use a bot?

It’s extremely fast and lightweight on mobile, and you can use it solo, with friends, or in groups of strangers. Our community is making chat macros for powerful ComfyUI workflows, so you get the full benefit of a desktop rendering result from a simple chat bot. It’s insane.

 

What else can it do?

 

Images, Video, and LLM chats. It can do just about anything. 

You can make 4k images in a few keystrokes without a GUI. You can use every major Stable Diffusion model via chat. Some features require a visual interface, and it pops up a web page for those (stable2go integration).

Create by yourself with your private bot, or join groups and see what other people are making. You can also create themed bots like /animebot, which connect our PollyGPT LLMs to Stable Diffusion models, and chat with them to help you create! Build a workflow that’s totally unique to you using loadouts, recipes (macros), widgets (visual GUI builder), and custom bots. It does a lot!

Origin Story

The name Pirate Diffusion comes from the leaked Stable Diffusion 1.5 model in October 2022, which was open sourced. We built the first Stable Diffusion bot on Telegram and thousands of people showed up, and that’s how we started. But to be perfectly clear: there’s no piracy going on here, we just loved the name. Still, enough people (and our bank) said the name was a little much, so we renamed our company to “Graydient” but we love Pirate Diffusion. It attracts fun, interesting people.

Happy New Year – we are testing video generation!  

If you’re a member of Graydient Plus, you can try our beta today. There are two video modes currently available – text to video, and image to video.  To use video, call the /workflow command or the shortcut /wf.  There are five video workflows at this time, which we’ll cover below:
 

Text to video

/wf /run:video cinematic low angle video of a ronald mcdonald clown eating a square hamburger, the restaurant ((sign text says "Wendys")), ronald mcdonald's clown costume has red afro hair and a red nose with white face paint, the restaurant is brown, the burger is pointy and square, the background has blur bokeh and people are walking around
 
The underlying model is called LTX, or Lightricks Video. In LTX, the structure of the prompt matters a lot. A short prompt will result in a static image. A prompt with too many actions and instructions will cause the video to pan to different random rooms or characters.
Best Practices: How to make your images move cohesively
 
We recommend a prompt pattern like this (a full example follows the list):
  1. First describe what the camera is doing or who it’s following. For example, a low angle camera zoom, an overhead camera, a slow pan, zooming out or away, etc.
  2. Next describe the subject and one action they are performing onto what or whom. This part takes practice! In the example above, notice that Ronald eating the burger came after the camera and scene setup.
  3. Describe the scene. This helps the AI “segment” the things that you want to see. In our example, we describe the clown costume and background.
  4. Give supporting reference material. For example, say “This looks like a scene from a movie or TV show.”
  5. You can specify a Flux lora to control the art direction or character likeness. Add this to the end of the prompt, like <muscularwoman-flux>.
  6. You can specify size like /size:468x832 and /guidance to control the “HDR” effect.
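Putting the pattern together, a text-to-video call might look like this sketch (the subject and scene are purely illustrative; the <muscularwoman-flux> lora and the 468x832 size come from the tips above):

/wf /run:video /size:468x832 low angle camera slowly zooming in on a muscular woman lifting a barbell overhead in a dim gym, chalk dust hangs in the air, racks of weights and a brick wall in the background, this looks like a scene from a sports documentary <muscularwoman-flux>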

Additional Workflows

These video workflows are available right now, we’re live!

video  =  Full sized LTX with Flux integration at the first step for loras.  Use case: beautiful videos but short.  Our best looking video workflow, hands down.

video-medium   = Well rounded. Handles motion well. Doesn’t use loras, but it’s fast and can make longer videos than the flux one.

video-turbo = The fastest, with a quality tradeoff. Best for still subjects and subtle animations. Can look nice at higher sizes and steps. I like to start my ideas here to see if they will even animate, then move my seed and other settings up the chain.
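To try one of these, swap the workflow name into the /run parameter. For example, a quick test on the turbo workflow (the prompt is just an illustration):

/wf /run:video-turbo slow pan across a candle flame flickering on a wooden table, warm bokeh in the dark background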

Image to video

You can also upload a photo and turn it into a video.  To do this, first paste the photo into the chat and click “Reply” as if you’re going to talk to the photo, and then give it a command like this:

/wf /run:animate camera video that slightly changes the angle, focused on a lovely girl smiling and looking at the camera, she looks curious and confident while maintaining eyes on the viewer, her hair is parted, she sits in front of a bookshelf and peeping gremlin eyes behind her, she is relaxing vibe

There are two levels of this video workflow, which are:

animate   = convert an image to video 

animate-turbo  = same, using small/fast models

Special parameters:

/slot1 = length of the video in frames. Safe settings are 89, 97, 105, 113, 121, 137, 153, 185, 201, 225, 241, 257. More is possible but unstable.

/slot2 = frames per second. 24 is recommended. Turbo workflows run at 18 fps but can be changed. 24 fps looks cinematic, 30 fps looks more realistic. 60 fps is possible at low frame counts, but it looks like chipmunk speed.
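For example, a longer clip at a cinematic frame rate with the animate workflow might look like this sketch (assuming slot values are passed with the same colon syntax as other parameters, e.g. /slot1:121; the prompt is illustrative):

/wf /run:animate /slot1:121 /slot2:24 slow camera push-in on a woman reading by a window, rain streaking the glass, warm lamplight on her face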

Limitations:

  • You must be a member of Graydient Plus to try the beta
  • Many video samples are available in the VIP and Playroom channel today.  Come hang out and send us your prompt ideas while we put the finishing touches on video.
 
Pictured below: the Flux “Atomix” workflow running direct prompt-to-video, no img2img step required.
Tip: Type /render #quick and your prompt – a macro that achieves this quality without typing negatives.

Guidance (CFG)

The Classifier-Free Guidance scale is a parameter that controls how closely the AI follows the prompt; higher values mean more adherence to the prompt. 

When this value is set higher, the image can appear sharper but the AI will have less “creativity” to fill in the spaces, so pixelation and glitches may occur. 

A safe default is 7 for the most common base models. However, there are special high efficiency models that use a different guidance scale, which are explained below.

SYNTAX

/render <sdxl> [[<fastnegative-xl:-2>]]
/guidance:7
/size:1024x1024
Takoyaki on a plate

How high or how low the guidance should be set depends on the sampler that you are using. Samplers are explained below. The amount of steps allowed to “solve” an image can also play an important role.

 

Exceptions to the rule 

Typical models follow this guidance and step pattern, but newer high efficiency models require far less guidance to function in the same way, between 1.5 and 2.5. This is explained below:

High Efficiency Models

Low Steps, Low Guidance

Most concepts require a guidance of 7 and 35+ steps to generate a great image. This is changing as higher efficiency models have arrived.

These models can create images in 1/4 of the time, only requiring 4-12 steps with lower guidance. You can find them tagged as Turbo, Hyper, LCM, and Lightning in the concepts system, and they’re compatible with classic models. You can use them alongside Loras and Inversions of the same model family. The SDXL family has the biggest selection (use the pulldown menu, far right). Juggernaut 9 Lightning is a popular choice.

Some of our other favorite Lightning models are <boltning-xl> and <realvis4light-xl> which look great with a guidance of 2, steps between 4-12, and Refiner (no fix) turned off. Polish it off with a good negative like [[<fastnegative-xl:-2>]].  Follow it up with an upscale, and the effects are stunning!
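As a sketch of those settings with the <boltning-xl> model mentioned above (the prompt is illustrative, and the /steps parameter is our assumption for how the step count is passed):

/render <boltning-xl> [[<fastnegative-xl:-2>]] /guidance:2 /steps:8 a bowl of takoyaki on a wooden counter, soft morning light, shallow depth of field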

Look into the notes of these special model types for more details on how to use them, like Aetherverse-XL (pictured below) with a guidance of 2.5 and 8 steps.

VASS (SDXL only)

Vass is an HDR mode for SDXL, which may also improve composition and reduce color saturation. Some prefer it, others may not. If the image looks too colorful, try it without Refiner (NoFix)

The name comes from Timothy Alexis Vass, an independent researcher who has been exploring the SDXL latent space and has made some interesting observations. His aim is color correction and improving the content of images. We have adapted his published code to run in PirateDiffusion.

/render a cool cat <sdxl> /vass

Why and when to use it: Try it on SDXL images that are too yellow, off-center, or the color range feels limited. You should see better vibrance and cleaned up backgrounds.

Limitations: This only works in SDXL. 

 
 

More tool (reply command)

The More tool creates variations of the same image.

To see the same subject in slightly different variations, reply to the image and use the More tool.
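A minimal sketch, assuming the tool is invoked as /more when replying to the image you want to vary:

/more

Reply to one of your generated images and send the command by itself; the bot returns variations of that image.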

DESCRIBE A PHOTO

Updated!  There are now two modes of describe: CLIP and FLORENCE2

Generate a prompt from any image using computer vision with Describe! It is a “reply” command, so reply to the image as if you were going to talk to it, and write:

/describe /florence

The additional Florence parameter gives you a much more detailed prompt. It uses the new Florence2 computer vision model. /describe by itself uses the CLIP model.

Example

Launch widgets within PirateDiffusion