🧪

[Bria-LoRa-FineTune]

🔥
Join our Discord community for more information, tutorials, tools, and to connect with other users!

💡

To use the links below, right-click them and open them in a new tab.


The following guide demonstrates Bria's best practices for fine-tuning our foundation models using the LoRA architecture.

Full implementation of the guide below:

huggingface.co
https://huggingface.co/spaces/Bar-Fin/lora-sdxl-finetuning/blob/main/lora_finetune.ipynb

Theory

DreamBooth

DreamBooth is a fine-tuning technique designed to personalize generative models like Stable Diffusion. It allows users to train the model on a small set of images (e.g., photos of a person, object, or style) and integrate the learned concept into the model's vocabulary.

LoRA architecture vs. regular fine-tuning

LoRA is a technique to efficiently fine-tune large machine learning models by reducing the number of trainable parameters. Instead of updating the full set of model weights during training, LoRA represents weight updates as the product of two smaller matrices (low-rank matrices).

This approach significantly reduces computational and memory requirements while maintaining high performance, making it particularly useful for large-scale models like transformers. LoRA is widely adopted in applications like NLP and computer vision where fine-tuning massive pre-trained models would otherwise be resource-intensive.
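The low-rank update described above can be sketched in plain Python. This is a minimal illustration, not Bria's or diffusers' implementation: the frozen weight `W` stays fixed and only the two small matrices `A` and `B` would be trained. The layer sizes, the `scale` argument, and the zero initialisation of `B` are assumptions chosen for illustration.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x): frozen weight plus low-rank update."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

def param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_out * d_in, rank * (d_in + d_out)

# Hypothetical tiny layer: 4 outputs, 6 inputs, rank 2.
random.seed(0)
d_out, d_in, rank = 4, 6, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable
B = [[0.0] * rank for _ in range(d_out)]                                 # trainable, starts at zero

x = [1.0] * d_in
# With B = 0 the adapter is a no-op, so the output equals the base layer's.
assert lora_forward(W, A, B, x) == matvec(W, x)

# Illustrative 1280 x 1280 layer at rank 16: 1,638,400 vs 40,960 trainable values.
full, lora = param_counts(1280, 1280, 16)
print(full, lora, full // lora)
```

For this illustrative layer, the adapter trains 40 times fewer values than full fine-tuning of the same matrix.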

Stochastic gradient descent

Analogy - Stochastic Gradient Descent (SGD) is like finding the lowest point in a bumpy valley by taking small steps downhill, but instead of looking at the whole valley at once, you only look at one random part of it each time to decide your step.
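To make the analogy concrete, here is a toy sketch of SGD fitting a single slope `w` so that `w * x` matches `y`. Each step looks at one randomly chosen sample rather than the whole dataset. The data, learning rate, and step count are hypothetical values picked for illustration and are unrelated to the training recipe in this guide.

```python
import random

def sgd_fit_slope(data, lr=0.01, steps=500, seed=0):
    """Fit y = w * x by SGD on the squared error, one random sample per step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)        # look at one random part of the "valley"
        grad = 2.0 * (w * x - y) * x   # derivative of (w*x - y)^2 with respect to w
        w -= lr * grad                 # take a small step downhill
    return w

# Toy dataset generated from the true slope 2.0.
data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
w = sgd_fit_slope(data)
print(round(w, 3))
```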

The diffusion process

Training script

In Bria production, we use the standard diffusers training code:

LoRA
https://huggingface.co/docs/diffusers/en/training/lora

However, we recommend evaluating the more advanced methods as well.

Recipe

This is the recommended recipe for our auto-training feature:


accelerate launch \
    --config_file accelerate_config.yaml \
    train_new.py \
    --caption_column="..." \
    --pretrained_model_name_or_path="briaai/BRIA-2.3" \
    --dataset_name=$DATASET_NAME \
    --resolution=1024 \
    --center_crop \
    --train_batch_size=1 \
    --gradient_accumulation_steps=4 \
    --gradient_checkpointing \
    --max_train_steps=1000 \
    --checkpointing_steps=200 \
    --use_8bit_adam \
    --learning_rate=1e-04 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --mixed_precision="bf16" \
    --validation_epochs=5 \
    --output_dir=$MODEL_DIR \
    --rank=16
💡
Using rank 256 consumes a lot of memory and increases the model size. We recommend experimenting with lower ranks, e.g. 64, 32, or 16.
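As a sanity check on the recipe's numbers, the sketch below works out the effective batch size, the number of samples processed, and the checkpoint steps implied by the flags. It assumes that `--max_train_steps` counts optimizer updates, as in the standard diffusers scripts, and it uses a hypothetical 20-image dataset. The last helper reflects that a LoRA adapter's size grows linearly with its rank.

```python
def recipe_summary(train_batch_size, grad_accum_steps, max_train_steps,
                   checkpointing_steps, num_images):
    """Derive basic training quantities from the recipe's flags."""
    effective_batch = train_batch_size * grad_accum_steps
    samples_seen = effective_batch * max_train_steps
    return {
        "effective_batch": effective_batch,
        "samples_seen": samples_seen,
        # Assumption: every image is drawn equally often.
        "approx_epochs": samples_seen / num_images,
        "checkpoints": list(range(checkpointing_steps,
                                  max_train_steps + 1,
                                  checkpointing_steps)),
    }

def relative_adapter_size(rank, baseline_rank=16):
    """LoRA parameters scale linearly with rank, so size ratios equal rank ratios."""
    return rank / baseline_rank

# Values taken from the recipe; num_images=20 is a hypothetical dataset size.
summary = recipe_summary(train_batch_size=1, grad_accum_steps=4,
                         max_train_steps=1000, checkpointing_steps=200,
                         num_images=20)
print(summary)
print(relative_adapter_size(256))  # rank 256 relative to the recipe's rank 16
```

Under these assumptions, a rank-256 adapter is 16 times larger than the rank-16 adapter used in the recipe.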

Model

Use Bria 2.3 as the "go-to" model. If needed, our HD model can fix quality issues for specific use cases.

briaai/BRIA-2.3 · Hugging Face
https://huggingface.co/briaai/BRIA-2.3

Best practices for Tailored Generation training datasets

Dataset description

Enhance training performance by providing a concise and clear description of your style or subject.

Examples of dataset descriptions:

Flat vector illustration
3D render
Watercolor
Pixel game art

Style

When training a model for a specific style type, it is crucial to provide images that contain the right information to guide the model. You should use around 20-60 images, and the dataset should consist of a clear style within a specific domain.

The images in your dataset should cover multiple perspectives and the background styles you aim to generate.

Examples of common use cases:

Share the same style:

Datasets can include a wide range of variations as long as they share the same artistic style.


Mixing image styles may lead to poor results.

Ensure your dataset contains images with uniform style, including color schemes and design techniques, to achieve the desired outcomes from the model.


Single subject

When training a model for a single-subject type, it is essential to provide images that include the right information to guide the model. The dataset should contain 10-20 images and should consist of a single subject type, such as a person, car, bottle, animated character, etc.

The images in your dataset should cover multiple perspectives and the background styles you aim to generate.

Here are some examples that demonstrate common use cases:

Multi-Perspective

If you aim for your model to generate images of a single subject from various angles or perspectives, ensure your dataset includes examples showcasing these perspectives.


Incorporating Backgrounds:

If you want the model to learn and replicate the surrounding scenery as well as the subject, include images with backgrounds in your dataset. This lets the model learn how the subject interacts with its environment, so it can generate more contextually rich images.


Transparent or solid background:

When the subject is presented against a transparent or solid-color background (such as white, black, or blue), make sure the subject fills most of the image. If necessary, crop the solid margins of the image to reduce the amount of transparency or solid color present.
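The cropping advice above can be sketched on a toy image represented as a grid of pixel values. This is an illustration only: the helper names `crop_solid_margins` and `subject_coverage` are hypothetical, and a real pipeline would more likely do this with an imaging library on actual image files.

```python
def crop_solid_margins(pixels, background):
    """Remove outer rows and columns that contain only the background value."""
    rows = [i for i, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return pixels  # the image is all background; leave it unchanged
    cols = [j for j in range(len(pixels[0]))
            if any(row[j] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]

def subject_coverage(pixels, background):
    """Fraction of pixels that are not background."""
    total = sum(len(row) for row in pixels)
    subject = sum(p != background for row in pixels for p in row)
    return subject / total

# Toy 4 x 6 image: 0 is the solid background, 1 is the subject.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cropped = crop_solid_margins(image, background=0)
print(subject_coverage(image, 0), subject_coverage(cropped, 0))
```

In this toy case, cropping raises the subject's share of the image from one sixth to the full frame.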


Consistent image style:

Ensure you don't mix styles within your dataset. For example, a dataset should not contain both animated cars and photo-realistic cars.


Group of subjects:

If your goal is to generate images featuring your subject in a group, it is advisable to include multiple examples of such groupings in the dataset.

Icons

When training a model for a specific icon style, it is crucial to provide images that contain the right information to guide the model. Users should upload 20-50 images, and the dataset should consist of a clear icon style within a specific domain.

The images in your dataset should consider multiple types of icons sharing the same style.

Examples of common use cases:

Share the same style:

Datasets can include a wide range of variations as long as they share the same icon style.


Define the style of the icons in detail:

Ensure the description of the icon style is as detailed as possible. For example: vector illustration, line art, very thick continuous outlines, minimalistic illustration, vector drawn strokes, continuous strokes.


For SVG images, use simple 2D images for training:

To create high-quality images in SVG format, use simple 2D images in your dataset. Images should not include many details, shading, or complex styling.
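The image-count guidance from the Style, Single subject, and Icons sections can be turned into a quick pre-training check. The sketch below is a hypothetical helper, not part of any Bria tooling. The ranges are copied from this guide, and the dataset-type keys are names made up for illustration.

```python
# Recommended image counts from this guide: (minimum, maximum).
RECOMMENDED_IMAGE_COUNTS = {
    "style": (20, 60),
    "single_subject": (10, 20),
    "icons": (20, 50),
}

def check_dataset_size(dataset_type, num_images):
    """Return 'too_few', 'ok', or 'too_many' for the given dataset type."""
    low, high = RECOMMENDED_IMAGE_COUNTS[dataset_type]
    if num_images < low:
        return "too_few"
    if num_images > high:
        return "too_many"
    return "ok"

print(check_dataset_size("style", 35))          # within 20-60
print(check_dataset_size("single_subject", 5))  # below 10
print(check_dataset_size("icons", 80))          # above 50
```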

Captions / Prompts

WIP

Compute

We run on an NVIDIA A10 GPU:


For any additional questions, please contact bar@bria.ai.