GAN-e4e revision history (No. 2)

- 1 (2023-12-17 (Sun) 13:19:18)
- 2 (2024-08-07 (Wed) 14:23:32)


Image editing with StyleGAN: StyleGAN e4e

Edit images with StyleGAN.


※ Last updated: 2024/08/07


Verification of the article "StyleGANを使った画像編集をe4eで高速化する" (Speeding up StyleGAN image editing with e4e)


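Overview

In StyleGAN-based image editing, the "latent variable estimation" step takes a long time.
This page follows the article above to verify e4e (encoder4editing), a technique that performs this step quickly.
e4e uses a pretrained StyleGAN model to build a dedicated encoder that takes an image as input and directly outputs the target latent variables.

When an image is fed to the encoder, it outputs one latent variable w and N offsets Δ. These are combined into N latent variables, which are fed to the pretrained StyleGAN. In addition to a loss that measures the error between the original image and the output image, a loss that represents the variance of the offsets is defined, and the encoder parameters are trained to minimize the sum of the two losses.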
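
The composition and the combined loss described above can be sketched in plain Python. This is only an illustration under stated assumptions: the names `compose_latents`, `offset_regularization`, and `total_loss`, the use of the mean squared offset magnitude as the offset loss, and the weight `lambda_reg` are hypothetical and are not taken from the e4e code.

```python
from typing import List

Vector = List[float]


def compose_latents(w: Vector, offsets: List[Vector]) -> List[Vector]:
    """Return one latent per offset: w_i = w + delta_i."""
    return [[wj + dj for wj, dj in zip(w, delta)] for delta in offsets]


def offset_regularization(offsets: List[Vector]) -> float:
    """Mean squared magnitude of the offsets (assumed stand-in for the offset loss)."""
    total = sum(d * d for delta in offsets for d in delta)
    count = sum(len(delta) for delta in offsets)
    return total / count


def total_loss(reconstruction_loss: float, offsets: List[Vector],
               lambda_reg: float = 1.0) -> float:
    """Reconstruction loss plus the weighted offset loss (hypothetical weight)."""
    return reconstruction_loss + lambda_reg * offset_regularization(offsets)


# Toy example: a 3-dimensional w and N = 2 offsets.
w = [0.5, -1.0, 2.0]
offsets = [[0.0, 0.0, 0.0], [0.5, 0.0, -0.5]]
latents = compose_latents(w, offsets)
print(latents)
print(total_loss(0.25, offsets))
```

Keeping the offsets small keeps every w_i close to the shared w, which is the property the second loss is meant to encourage.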

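The author of that site has also published the following material on StyleGAN-based image editing. It is a useful reference, but it does not appear to be verifiable on the current Google Colaboratory
(the environment can no longer be built there because of issues such as TensorFlow 1.x and the CUDA version).
Paper: "Designing an Encoder for StyleGAN Image Manipulation" → https://arxiv.org/pdf/2102.02766.pdf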


Creating the execution environment on Google Colaboratory

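Open the demo site by the author of the article above and press the "Open in Colab" button ①
The "e4e_demo" notebook opens in Google Colab; choose "Save a copy in Drive" from the "File" menu
Carry out the remaining steps on the Google Colab page that opens with the title "Copy of e4e_demo"
Download and extract the data file (use the extracted "update/work/gan_e4e/")
　update_20231117.zip (18.3MB) <update data>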


Environment setup

Run the following cell ①

# --- Setup ---

import os
os.chdir('/content')
CODE_DIR = 'encoder4editing'

!git clone https://github.com/cedro3/encoder4editing.git $CODE_DIR
!wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
!sudo unzip ninja-linux.zip -d /usr/local/bin/
!sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
os.chdir(f'./{CODE_DIR}')

from argparse import Namespace
import time
import os
import sys
import numpy as np
from PIL import Image
import torch
import torchvision.transforms as transforms

sys.path.append(".")
sys.path.append("..")

from utils.common import tensor2im
from models.psp import pSp  # we use the pSp framework to load the e4e encoder.

%load_ext autoreload
%autoreload 2

# Download the pretrained parameters
! pip install --upgrade gdown
import os
import gdown
os.makedirs('pretrained_models', exist_ok=True)
gdown.download('https://drive.google.com/u/1/uc?id=1Du_8FzOPKJhk6aJmiOBhAWVe3_6vAyET', 'pretrained_models/e4e_ffhq_encode.pt', quiet=False)

# Download the landmark data
! wget http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
! bzip2 -dk shape_predictor_68_face_landmarks.dat.bz2

# Load the pretrained parameters into the model
model_path = 'pretrained_models/e4e_ffhq_encode.pt'  ####
ckpt = torch.load(model_path, map_location='cpu')
opts = ckpt['opts']
opts['checkpoint_path'] = model_path
opts= Namespace(**opts)
net = pSp(opts)
net.eval()
net.cuda()
print('Model successfully loaded!')

▼　- log -　GoogleColab Tesla T4

Cloning into 'encoder4editing'...
remote: Enumerating objects: 233, done.
remote: Counting objects: 100% (233/233), done.
remote: Compressing objects: 100% (151/151), done.
remote: Total 233 (delta 80), reused 203 (delta 78), pack-reused 0
Receiving objects: 100% (233/233), 35.00 MiB | 16.75 MiB/s, done.
Resolving deltas: 100% (80/80), done.
--2023-11-07 06:51:38--  https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
Resolving github.com (github.com)... 140.82.114.4
Connecting to github.com (github.com)|140.82.114.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/1335132/d2f252e2-9801-11e7-9fbf-bc7b4e4b5c83?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20231107%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231107T065138Z&X-Amz-Expires=300&X-Amz-Signature=4617d4529030a1f5d0ae22305ff687cda7088748788810590aac934fdc28c9cf&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=1335132&response-content-disposition=attachment%3B%20filename%3Dninja-linux.zip&response-content-type=application%2Foctet-stream [following]
--2023-11-07 06:51:38--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/1335132/d2f252e2-9801-11e7-9fbf-bc7b4e4b5c83?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20231107%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20231107T065138Z&X-Amz-Expires=300&X-Amz-Signature=4617d4529030a1f5d0ae22305ff687cda7088748788810590aac934fdc28c9cf&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=1335132&response-content-disposition=attachment%3B%20filename%3Dninja-linux.zip&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 77854 (76K) [application/octet-stream]
Saving to: ‘ninja-linux.zip’

ninja-linux.zip     100%[===================>]  76.03K  --.-KB/s    in 0.02s   

2023-11-07 06:51:38 (3.29 MB/s) - ‘ninja-linux.zip’ saved [77854/77854]

Archive:  ninja-linux.zip
  inflating: /usr/local/bin/ninja    
update-alternatives: using /usr/local/bin/ninja to provide /usr/bin/ninja (ninja) in auto mode
Requirement already satisfied: gdown in /usr/local/lib/python3.10/dist-packages (4.6.6)
Collecting gdown
  Downloading gdown-4.7.1-py3-none-any.whl (15 kB)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from gdown) (3.12.4)
Requirement already satisfied: requests[socks] in /usr/local/lib/python3.10/dist-packages (from gdown) (2.31.0)
Requirement already satisfied: six in /usr/local/lib/python3.10/dist-packages (from gdown) (1.16.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from gdown) (4.66.1)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from gdown) (4.11.2)
Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->gdown) (2.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (3.3.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (2023.7.22)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /usr/local/lib/python3.10/dist-packages (from requests[socks]->gdown) (1.7.1)
Installing collected packages: gdown
  Attempting uninstall: gdown
    Found existing installation: gdown 4.6.6
    Uninstalling gdown-4.6.6:
      Successfully uninstalled gdown-4.6.6
Successfully installed gdown-4.7.1
Downloading...
From (uriginal): https://drive.google.com/u/1/uc?id=1Du_8FzOPKJhk6aJmiOBhAWVe3_6vAyET
From (redirected): https://drive.google.com/uc?id=1Du_8FzOPKJhk6aJmiOBhAWVe3_6vAyET&confirm=t&uuid=c056dfae-dda2-4d7d-ab3a-3aa59c1f856d
To: /content/encoder4editing/pretrained_models/e4e_ffhq_encode.pt
100%|██████████| 1.20G/1.20G [00:13<00:00, 86.3MB/s]
--2023-11-07 06:53:26--  http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
Resolving dlib.net (dlib.net)... 107.180.26.78
Connecting to dlib.net (dlib.net)|107.180.26.78|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 64040097 (61M)
Saving to: ‘shape_predictor_68_face_landmarks.dat.bz2’

shape_predictor_68_ 100%[===================>]  61.07M  89.6MB/s    in 0.7s    

2023-11-07 06:53:27 (89.6 MB/s) - ‘shape_predictor_68_face_landmarks.dat.bz2’ saved [64040097/64040097]

Loading e4e over the pSp framework from checkpoint: pretrained_models/e4e_ffhq_encode.pt
Model successfully loaded!

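After the cell finishes running ②, press the "Files" button in the left sidebar
Confirm that there is an "images" folder ④ under "encoder4editing" ③

Add the face images you prepared to the "images" folder (dragging and dropping from the local machine works)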
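
Before running the next cells, it can help to check which files in the folder will actually be picked up. The following is a minimal sketch, assuming the folder holds JPEG or PNG files; the helper name `list_face_images` and the extension set are hypothetical and not part of the notebook.

```python
import os

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}  # assumed set of accepted extensions


def list_face_images(folder):
    """Return sorted image file names in `folder`, skipping non-image entries."""
    names = []
    for name in sorted(os.listdir(folder)):
        ext = os.path.splitext(name)[1].lower()
        if ext in IMAGE_EXTENSIONS and os.path.isfile(os.path.join(folder, name)):
            names.append(name)
    return names


# Only list the folder when it exists (it is created by the git clone above).
if os.path.isdir('./images'):
    print(list_face_images('./images'))
```

Entries such as `.ipynb_checkpoints` have no image extension, so they are filtered out without a special case.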


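Preparation

Cropping face images
・Crop the face from each sample image to match the face images StyleGAN was trained on (facial landmarks such as the eyes, nose, and mouth end up at fixed positions) and save it under the same file name in the "./align" folder
・Run the following cell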

# --- Crop face images ---

import os
import shutil
from tqdm import tqdm

if os.path.isdir('align'):
     shutil.rmtree('align')
os.makedirs('align', exist_ok=True)

def run_alignment(image_path):
  import dlib
  from utils.alignment import align_face
  predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
  aligned_image = align_face(filepath=image_path, predictor=predictor)
  return aligned_image

path = './images'
files = sorted(os.listdir(path))
for i, file in enumerate(tqdm(files)):
  if file=='.ipynb_checkpoints':
     continue
  input_image = run_alignment(path+'/'+file)
  input_image = input_image.resize((256, 256))  # resize() returns a new image, so keep the result
  input_image.save('./align/'+file)

▼　- log -　GoogleColab Tesla T4

100%|██████████| 9/9 [00:07<00:00,  1.23it/s]

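Estimating latent variables
・Estimate latent variables from the cropped face images and save the images generated from them under the same file names in "./vec_pic"
・Run the following cell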

# --- Estimate latent variables ---

if os.path.isdir('vec_pic'):
     shutil.rmtree('vec_pic')
os.makedirs('vec_pic', exist_ok=True)

if os.path.isdir('vec'):
     shutil.rmtree('vec')
os.makedirs('vec', exist_ok=True)

img_transforms = transforms.Compose([
        transforms.Resize((256, 256)),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])

path = './align'
files = sorted(os.listdir(path))
for i, file in enumerate(tqdm(files)):
  if file=='.ipynb_checkpoints':
     continue
  input_image = Image.open(path+'/'+file)
  transformed_image = img_transforms(input_image)
  with torch.no_grad():
     images, latents = net(transformed_image.unsqueeze(0).to('cuda').float(), randomize_noise=False, return_latents=True)
     result_image, latent = images[0], latents[0]
     tensor2im(result_image).save('./vec_pic/'+file) # save the generated image to vec_pic
     torch.save(latents, './vec/'+os.path.splitext(file)[0]+'.pt') # save the latents to vec

▼　- log -　GoogleColab Tesla T4

100%|██████████| 9/9 [00:07<00:00,  1.23it/s]
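
The naming used in this step (same base name, extension replaced with `.pt`) can be sketched as follows. The helper name `latent_path_for` is hypothetical; it uses `os.path.splitext` so that extensions of any length, such as `.jpeg`, are handled.

```python
import os


def latent_path_for(image_name, folder='./vec'):
    """Map an image file name to the path of its saved latent (.pt) file."""
    stem, _ext = os.path.splitext(image_name)
    return folder + '/' + stem + '.pt'


print(latent_path_for('03.jpg'))
print(latent_path_for('portrait.jpeg'))
```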

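Displaying the original and generated images
・The estimated latent variables are saved in "./vec" under the same file name (with the extension changed from jpg to pt)
・Add the following line at the top of the cell's code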

%matplotlib inline

・Run the modified cell below

# --- Display the original and generated images ---

%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
import os
def display_pic(folder):
    fig = plt.figure(figsize=(30, 40))
    files = os.listdir(folder)
    files.sort()
    for i, file in enumerate(files):
        img = Image.open(folder+'/'+file)
        images = np.asarray(img)
        ax = fig.add_subplot(10, 10, i+1, xticks=[], yticks=[])
        image_plt = np.array(images)
        ax.imshow(image_plt)
        ax.set_xlabel(folder+'/'+file, fontsize=15)
    plt.show()
    plt.close()

display_pic('align')
display_pic('vec_pic')

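・The top row shows the cropped face images and the bottom row shows the images generated from the latent variables
・Latent variables that reproduce nearly the same image as the cropped face were estimated quickly
・We now have latent variables that can be edited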


Image editing

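Latent variables provided for editing (folder "./editings/interfacegan_directions")
- age.pt      : change of the face with age
- pose.pt     : horizontal rotation of the face
- smile.pt    : make the face smile
- age+pose.pt : change of the face with age plus horizontal rotation of the face
Example settings for "03.jpg"
- latent is the corresponding latent variable, 03.pt
- direction is age, the latent variable for the change of the face with age
- min is -50 and max is 50
  With these settings, the age latent variable multiplied by -5 to 5 (the setting values are divided by 10 internally) is added to the latent variable 03.pt, and the result is fed to StyleGAN to generate images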
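
The arithmetic behind this setting can be sketched as follows. This is a plain-Python illustration, not the notebook's `apply_interfacegan` implementation; the function name `shift_latent`, the toy three-element vectors, and the explicit `scale=10` divisor (reflecting the 1/10 scaling described above) are assumptions.

```python
def shift_latent(latent, direction, factor, scale=10):
    """Return latent + (factor / scale) * direction, element by element."""
    step = factor / scale
    return [x + step * d for x, d in zip(latent, direction)]


latent = [1.0, 2.0, 3.0]      # toy stand-in for the contents of 03.pt
direction = [0.5, 0.0, -0.5]  # toy stand-in for the contents of age.pt

print(shift_latent(latent, direction, -50))  # multiplier of -5
print(shift_latent(latent, direction, 50))   # multiplier of +5
```

A factor of 0 leaves the latent unchanged, which is why the animation later in this page starts and ends at the unedited face.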

Settings

・Run the following cell

#@title Settings
latent = "03.pt"#@param {type:"string"}
direction = "pose" #@param ["age", "pose", "smile", "age+pose"] {allow-input: true}
min = -50 #@param {type:"slider", min:-50, max:0, step:10}
max = 50 #@param {type:"slider", min:0, max:50, step:10}

Generating still images
・Run the following cell

# --- Generate still images ---

import os
import shutil
if os.path.isdir('pic'):
     shutil.rmtree('pic')
os.makedirs('pic', exist_ok=True)

from editings import latent_editor
from tqdm import trange

folder = 'vec'
latents = torch.load(folder+'/'+latent)
editor = latent_editor.LatentEditor(net.decoder, False)

interfacegan_directions = {
        'age': 'editings/interfacegan_directions/age.pt',
        'smile': 'editings/interfacegan_directions/smile.pt',
        'pose': 'editings/interfacegan_directions/pose.pt',
        'age+pose':  'editings/interfacegan_directions/age+pose.pt'
    }

interfacegan_direction = torch.load(interfacegan_directions[direction]).cuda()
cnt = 0

for i in trange(0, min, -1, desc='0 -> min'):
     result = editor.apply_interfacegan(latents, interfacegan_direction, factor=i).resize((512,512))
     result.save('./pic/'+str(cnt).zfill(6)+'.jpg')
     cnt +=1

for i in trange(min, max, desc='min -> max'):
     result = editor.apply_interfacegan(latents, interfacegan_direction, factor=i).resize((512,512))
     result.save('./pic/'+str(cnt).zfill(6)+'.jpg')
     cnt +=1

for i in trange(max, 0, -1, desc='max -> 0'):
     result = editor.apply_interfacegan(latents, interfacegan_direction, factor=i).resize((512,512))
     result.save('./pic/'+str(cnt).zfill(6)+'.jpg')
     cnt +=1

▼　- log -　GoogleColab Tesla T4

0 -> min: 100%|██████████| 50/50 [00:07<00:00,  6.87it/s]
min -> max: 100%|██████████| 100/100 [00:13<00:00,  7.45it/s]
max -> 0: 100%|██████████| 50/50 [00:06<00:00,  7.48it/s]

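・The multiplier applied to the direction latent variable (age in the example above) is changed little by little, and each still image output by StyleGAN is saved.
・The multiplier goes from 0 to min, then from min to max, and finally from max back to 0.
・The still images are saved in "./pic" under six-digit sequential names (000000.jpg, 000001.jpg, ...)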
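
The sweep order and file naming described above can be reproduced without a GPU. The sketch below mirrors the three `trange` loops of the cell; the helper names `factor_schedule` and `frame_name` are hypothetical.

```python
def factor_schedule(min_factor, max_factor):
    """Factors in playback order: 0 -> min, min -> max, max -> 0."""
    schedule = list(range(0, min_factor, -1))
    schedule += list(range(min_factor, max_factor))
    schedule += list(range(max_factor, 0, -1))
    return schedule


def frame_name(index):
    """Six-digit, zero-padded file name for the frame at `index`."""
    return str(index).zfill(6) + '.jpg'


schedule = factor_schedule(-50, 50)
print(len(schedule))
print(frame_name(0), frame_name(len(schedule) - 1))
```

With min -50 and max 50 the schedule holds 50 + 100 + 50 = 200 factors, matching the progress bars in the log above.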

Generating the video
・Create an mp4 video from the still images saved in "./pic"
・Run the following cell

# --- Create the mp4 video ---

# Delete output.mp4 if it already exists
import os
if os.path.exists('./output.mp4'):
   os.remove('./output.mp4')

# Create the video from the still images in the pic folder
! ffmpeg -r 30 -i pic/%6d.jpg\
               -vcodec libx264 -pix_fmt yuv420p output.mp4

# Copy it to the movie folder under a descriptive name
import shutil
os.makedirs('movie', exist_ok=True)
shutil.copy('output.mp4', 'movie/'+direction+'_'+latent[:-3]+'.mp4')

▼　- log -　GoogleColab Tesla T4

ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Input #0, image2, from 'pic/%6d.jpg':
  Duration: 00:00:08.00, start: 0.000000, bitrate: N/A
  Stream #0:0: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/unknown), 512x512 [SAR 1:1 DAR 1:1], 25 fps, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (mjpeg (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[swscaler @ 0x5c700b908e40] deprecated pixel format used, make sure you did set range correctly
[libx264 @ 0x5c700b587380] using SAR=1/1
[libx264 @ 0x5c700b587380] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x5c700b587380] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 0x5c700b587380] 264 - core 163 r3060 5db6aa6 - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.mp4':
  Metadata:
    encoder         : Lavf58.76.100
  Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv420p(tv, bt470bg/unknown/unknown, progressive), 512x512 [SAR 1:1 DAR 1:1], q=2-31, 30 fps, 15360 tbn
    Metadata:
      encoder         : Lavc58.134.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=  200 fps= 61 q=-1.0 Lsize=     705kB time=00:00:06.56 bitrate= 879.8kbits/s speed=1.99x    
video:702kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.455133%
[libx264 @ 0x5c700b587380] frame I:1     Avg QP:24.63  size: 15243
[libx264 @ 0x5c700b587380] frame P:50    Avg QP:24.28  size:  9558
[libx264 @ 0x5c700b587380] frame B:149   Avg QP:28.18  size:  1511
[libx264 @ 0x5c700b587380] consecutive B-frames:  0.5%  0.0%  1.5% 98.0%
[libx264 @ 0x5c700b587380] mb I  I16..4:  3.7% 94.5%  1.8%
[libx264 @ 0x5c700b587380] mb P  I16..4:  2.4% 31.6%  0.5%  P16..4: 31.0% 16.9%  6.6%  0.0%  0.0%    skip:11.0%
[libx264 @ 0x5c700b587380] mb B  I16..4:  0.1%  0.7%  0.0%  B16..8: 22.9%  3.3%  0.9%  direct: 4.4%  skip:67.8%  L0:35.2% L1:53.5% BI:11.4%
[libx264 @ 0x5c700b587380] 8x8 transform intra:91.3% inter:81.5%
[libx264 @ 0x5c700b587380] coded y,uvDC,uvAC intra: 72.7% 50.1% 4.0% inter: 14.1% 9.0% 0.1%
[libx264 @ 0x5c700b587380] i16 v,h,dc,p: 25% 23% 14% 38%
[libx264 @ 0x5c700b587380] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 35% 10% 24%  3%  5% 10%  3%  7%  3%
[libx264 @ 0x5c700b587380] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 35%  4% 13%  4% 15% 17%  5%  6%  1%
[libx264 @ 0x5c700b587380] i8c dc,h,v,p: 52% 14% 31%  3%
[libx264 @ 0x5c700b587380] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x5c700b587380] ref P L0: 57.9% 23.8% 14.6%  3.7%
[libx264 @ 0x5c700b587380] ref B L0: 92.1%  6.3%  1.6%
[libx264 @ 0x5c700b587380] ref B L1: 98.5%  1.5%
[libx264 @ 0x5c700b587380] kb/s:861.87
movie/pose_03.mp4

・The video is saved as "output.mp4" and also copied to "./movie" as "<direction>_<latent name>.mp4" (for example, movie/pose_03.mp4)
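
The ffmpeg call and the output naming from the cell above can be expressed as small helpers, which is convenient when the step is scripted outside a notebook. This is a sketch: the function names are hypothetical, and the ffmpeg options are exactly the ones already used in the cell.

```python
import os


def ffmpeg_command(frame_pattern='pic/%6d.jpg', fps=30, output='output.mp4'):
    """Build the argument list for the ffmpeg call used in the cell above."""
    return ['ffmpeg', '-r', str(fps), '-i', frame_pattern,
            '-vcodec', 'libx264', '-pix_fmt', 'yuv420p', output]


def movie_name(direction, latent_file, folder='movie'):
    """Name of the copy in the movie folder: <direction>_<latent name>.mp4."""
    stem = os.path.splitext(latent_file)[0]
    return folder + '/' + direction + '_' + stem + '.mp4'


def duration_seconds(frame_count, fps=30):
    """Nominal playing time of `frame_count` frames at `fps` frames per second."""
    return frame_count / fps


print(' '.join(ffmpeg_command()))
print(movie_name('pose', '03.pt'))
print(round(duration_seconds(200), 2))
```

The argument list could be passed to `subprocess.run` in place of the `!` shell escape; that call is left out here because it needs ffmpeg and the frame files to be present.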

Playing the video
・Play the "output.mp4" that was created
・Run the following cell

# --- Play the mp4 video ---
from IPython.display import HTML
from base64 import b64encode

mp4 = open('./output.mp4', 'rb').read()
data_url = 'data:video/mp4;base64,' + b64encode(mp4).decode()
HTML(f"""
<video width="50%" height="50%" controls>
      <source src="{data_url}" type="video/mp4">
</video>""")

・To keep the generated video, download "./movie/<direction>_<latent name>.mp4"


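Editing another image

Repeating the steps from "Image editing" onward lets you edit any other image whose latent variables have already been created

Enter the latent variable file of the image to edit, "<image name>.pt", in latent
Select the direction to edit: age / pose / smile / age+pose
Set min / max (for smile, slightly weaker values work better, around min 0 and max 30)
From the Colab "Runtime" menu choose "Run after" to run the remaining cells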
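
The choices above can be checked before running the remaining cells. The following is a minimal sketch, assuming the slider ranges from the settings cell (min from -50 to 0, max from 0 to 50, in steps of 10) and the four direction files listed earlier; `validate_settings` is a hypothetical helper, not part of the notebook.

```python
DIRECTIONS = ('age', 'pose', 'smile', 'age+pose')


def validate_settings(latent, direction, min_factor, max_factor):
    """Return the settings as a dict, or raise ValueError if one is out of range."""
    if not latent.endswith('.pt'):
        raise ValueError('latent must be a .pt file name, got ' + repr(latent))
    if direction not in DIRECTIONS:
        raise ValueError('unknown direction: ' + repr(direction))
    if not (-50 <= min_factor <= 0 and min_factor % 10 == 0):
        raise ValueError('min must be between -50 and 0 in steps of 10')
    if not (0 <= max_factor <= 50 and max_factor % 10 == 0):
        raise ValueError('max must be between 0 and 50 in steps of 10')
    return {'latent': latent, 'direction': direction,
            'min': min_factor, 'max': max_factor}


# The weaker smile setting suggested above.
print(validate_settings('03.pt', 'smile', 0, 30))
```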

Example 1 (okegawa_m1.pt / yaoi_m1.pt / nitta_m2.pt / izutsu_m2.pt: smile, min 0, max 30)
Example 2 (okegawa_m1.pt / yaoi_m1.pt / nitta_m2.pt / izutsu_m2.pt: pose, min -50, max 50)
Example 3 (okegawa_m1.pt / yaoi_m1.pt / nitta_m2.pt / izutsu_m2.pt: age, min -50, max 50)
Example 4 (okegawa_m1.pt / yaoi_m1.pt / nitta_m2.pt / izutsu_m2.pt: age+pose, min -50, max 50)


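Finishing editing and running again after reconnecting

To finish editing, choose "Runtime" → "Disconnect and delete runtime" in Colab
・To keep GPU usage time low, it is best to disconnect once all work is finished
・The execution results in the notebook remain even after disconnecting and deleting the runtime
After reconnecting, repeat the same steps starting from "Environment setup" above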
