Generative AI Programming 3 == Work in progress ==

Building on the results verified so far, this page writes generative AI programs in Python.

※ Last updated: 2025/06/16

Getting Started with Stable Diffusion Using diffusers (Hands-on)

Writing image generation programs.

Step 40: Generating an image from text

"sd_txt2img.py"

## sd_txt2img.py [SD1.5]  Text-to-image (txt2img) template

import torch
from diffusers import StableDiffusionPipeline, logging
from translate import Translator

logging.set_verbosity_error()

# Path to the model file (a single .safetensors checkpoint)
model_path = "/StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors"

# Use "cuda" to run on the GPU, "cpu" otherwise
device = 'cuda'

# Seed value
seed = 12345678

# Create the pipeline
pipeline = StableDiffusionPipeline.from_single_file(model_path).to(device)

# Prompt (written in Japanese, then translated to English)
trans = Translator('en', 'ja').translate    # to_lang='en', from_lang='ja'
prompt_jp = '満開の蘭'
prompt = trans(prompt_jp)

# Create the Generator object so a given seed reproduces the same image
generator = torch.Generator(device).manual_seed(seed)

print(f'Seed: {seed}, Model: {model_path}')
print(f'prompt : {prompt_jp} → {prompt}')

# Generate the image
image = pipeline(
    prompt=prompt,
    num_inference_steps=30,
    guidance_scale=7.5,
    width=512,
    height=512,
    generator=generator,
).images[0]
image.save("results/sd_txt2img.png")    # generated image
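The listing saves to "results/sd_txt2img.png". PIL's `Image.save` does not create missing folders, so the script stops with an error if `results` does not exist yet. A minimal sketch of a guard follows; the helper name `prepare_output_path` is hypothetical and not part of the original script.

```python
from pathlib import Path


def prepare_output_path(directory: str, filename: str) -> Path:
    """Create `directory` (and any parents) if needed, then return the full output path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)  # no error if the folder already exists
    return out_dir / filename


# Possible use in the script above (assumption, not the author's code):
# image.save(prepare_output_path("results", "sd_txt2img.png"))
```

Calling the helper repeatedly is harmless, so it can sit directly in front of every `image.save` call.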

Run the program (execution time: about 3 seconds on an RTX 4070 Ti 12GB)

(sd_test) PS > python sd_txt2img.py

Fetching 11 files: 100%|████████████████████| 11/11 [00:00<00:00, 11048.21it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:00<00:00,  8.85it/s]
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors
prompt : 満開の蘭 → Orchid in full bloom
100%|██████████████████████████████████████████| 30/30 [00:03<00:00,  8.31it/s]

The image file "sd_txt2img.png" is generated in the results folder.
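The script above sends every prompt through the translator, including prompts that are already written in English. The sketch below skips the translation call for pure-ASCII prompts. The helper name `to_english_prompt` and the ASCII check are assumptions made for illustration; the translator is passed in as a plain callable so the helper does not depend on any particular translation library.

```python
from typing import Callable


def to_english_prompt(prompt: str, translate: Callable[[str], str]) -> str:
    """Return `prompt` unchanged if it is pure ASCII, otherwise translate it."""
    text = prompt.strip()
    if not text:
        raise ValueError("prompt must not be empty")
    if text.isascii():
        return text              # already ASCII: skip the translation call
    return translate(text)       # e.g. a Japanese prompt


# Possible use in the script above (assumption, not the author's code):
# prompt = to_english_prompt(prompt_jp, Translator('en', 'ja').translate)
```

The ASCII test is a rough heuristic: it treats any ASCII-only text as English, so it fits a workflow where prompts are written in either Japanese or English.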

Step 41: Generating an image from an image

img2img: image-to-image generation

"sd_img2img.py"

## sd_038.py  Modify only a specific region (inpaint)

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline, logging
from translate import Translator

logging.set_verbosity_error()

# Model ID and input image paths
model_path = 'runwayml/stable-diffusion-inpainting'         # model
image_path = 'images/sd_038_test.png'                       # source image
mask_path = 'images/sd_038_test_mask.png'                   # mask image

# Use "cuda" to run on the GPU, "cpu" otherwise
device = 'cuda'

# Seed value
seed = 12345678

# Create the pipeline (half-precision weights)
pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    variant='fp16',
).to(device)

# Prompt (written in Japanese, then translated to English)
trans = Translator('en', 'ja').translate
prompt_jp = 'こっちを見て微笑んでいる女の子'
prompt = trans(prompt_jp)

# Create the Generator object
generator = torch.Generator(device).manual_seed(seed)

# Load the source image and the mask image
img0 = Image.open(image_path)
img_mask = Image.open(mask_path)

print(f'Seed: {seed}')
print(f'prompt : {prompt_jp} → {prompt}')
print(f'Model  : {model_path}')
print(f'source : {image_path}')
print(f'mask   : {mask_path}')

# Generate the image
image = pipeline(
    prompt=prompt,
    image=img0,
    mask_image=img_mask,
    num_inference_steps=20,
    generator=generator,
).images[0]

image.save("results/sd_038.png")
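The diffusers inpaint pipeline repaints the white pixels of `mask_image` and preserves the black ones. The page does not say how "images/sd_038_test_mask.png" was made; the sketch below shows one way a rectangular mask could be drawn with Pillow. The helper name `make_rect_mask`, the 512x512 size, and the box coordinates are all illustrative assumptions.

```python
from PIL import Image, ImageDraw


def make_rect_mask(size, box):
    """Return a grayscale mask: white (255) inside `box`, black (0) elsewhere."""
    mask = Image.new("L", size, 0)                  # all black: keep every pixel
    ImageDraw.Draw(mask).rectangle(box, fill=255)   # white: region to repaint
    return mask


# Hypothetical values: repaint a 256x256 region of a 512x512 image
mask = make_rect_mask((512, 512), (128, 96, 384, 352))
# mask.save("images/my_mask.png")  # hypothetical file name
```

The mask should have the same dimensions as the source image so that the white region lines up with the area to be modified.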

Run the program (execution time: about 5 seconds on an RTX 4070 Ti 12GB)

(sd_test) PS > python sd_img2img.py

Fetching 11 files: 100%|███████████████████████████████| 11/11 [00:00<?, ?it/s]
Loading pipeline components...: 100%|████████████| 6/6 [00:01<00:00,  3.51it/s]
Seed: 12345678, Model: /StabilityMatrix/Data/Models/StableDiffusion/SD1.5/v1-5-pruned-emaonly.safetensors
prompt : 兎 → Domestic Rabbit
100%|██████████████████████████████████████████| 18/18 [00:04<00:00,  3.78it/s]

The image file "sd_img2img.png" is generated.
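The progress bar in the log above stops at 18 steps. In the img2img-style pipelines of diffusers, the number of denoising steps actually run is `num_inference_steps` scaled by `strength`, so a run can report fewer steps than were requested. The sketch below reproduces that arithmetic; the values 30 and 0.6 are assumptions chosen because they yield 18, not settings confirmed by this page.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps an img2img-style pipeline would actually run."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


print(effective_steps(30, 0.6))   # 18
print(effective_steps(20, 1.0))   # 20
```

A lower `strength` keeps the result closer to the source image and runs fewer steps; a `strength` of 1.0 runs all requested steps.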

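Notes

Update history

2025/06/16 First edition

References

Stable Diffusion

Books and magazines
- Nikkei Software, July 2025 issue: "ローカル生成AIプログラミング" (Local Generative AI Programming)
- Interface, March 2025 issue: "画像による異常検出＆ローカルLLM作り - 仕事のための生成AI" (Image-Based Anomaly Detection & Building a Local LLM: Generative AI for Work)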