Patreon exclusive posts index

Join the Discord and tell me your Discord username to get a special rank: SECourses Discord

27 November 2023 Huge Update

  • The dataset has been improved and expanded to 5200 images for both the woman and man datasets
  • The cropping and resizing scripts have been further improved, and all images have been processed again
  • All images are now sorted by face quality
  • So the training scripts will use the very best ones first
  • File names start from man_10001 or woman_10001, so when the training script starts using reg images, the very best ones are used first
  • Please re-download all of the new images for best quality
  • Preparing all reg images took over 10 full days in total
  • Both the new woman and man datasets are added to the resources below
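
Because the files are named in quality order, a plain sort by file name is enough to get the best images first. Here is a minimal sketch of picking the first N regularization images that way. It assumes the images sit in one folder and keep their man_10001-style names. The function name `pick_best_reg_images` and the extension list are hypothetical and are not taken from the training scripts.

```python
from pathlib import Path


def pick_best_reg_images(folder, count, extensions=(".jpg", ".jpeg", ".png")):
    """Return the first `count` image paths in file-name order.

    Assumes names like man_10001.jpg, man_10002.jpg, ... so that sorting by
    name puts the highest face-quality images first.
    """
    images = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    ]
    # Five-digit numbering keeps lexicographic order equal to numeric order.
    images.sort(key=lambda p: p.name)
    return images[:count]
```

If a training run only needs, say, 1000 reg images, taking the first 1000 names from this sorted list would match the intent described above.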

20 September 2023 Massive Update

  • All of the images have been reprocessed with a newer face detection algorithm, RetinaFace
  • RetinaFace is much better at detecting and focusing on faces, but it is very slow
  • The newest processing scripts are shared here (YOLO V7 cropper and RetinaFace resizer): https://www.patreon.com/posts/sota-subject-and-88391247
  • With this update the datasets are much better. Please re-download them before using
  • Processing all of the datasets took about 6 days with a 13900K CPU and a 3090 Ti

How To Download All On A RunPod Or A Unix System

  • download_man_reg_imgs.sh downloads and automatically extracts the 512x512, 768x768 and 1024x1024 man images. You can edit the file and add other resolutions if you need them.
  • download_woman_reg_imgs.sh downloads and automatically extracts the 512x512, 768x768 and 1024x1024 woman images. You can edit the file and add other resolutions if you need them.
  • These files can be used on Unix and possibly on macOS systems as well. Don't forget to comment out (put # at the beginning of the line) the links that you don't want to download, and change the folder paths if you wish.
  • Upload them into the workspace folder of RunPod and execute the commands below
  • cd /workspace
  • chmod +x download_man_reg_imgs.sh
  • ./download_man_reg_imgs.sh
  • cd /workspace
  • chmod +x download_woman_reg_imgs.sh
  • ./download_woman_reg_imgs.sh
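
The contents of the two .sh files are not reproduced in this post, so the following is only a rough Python stand-in for the same idea: keep a list of links, skip the ones commented out with #, and download the rest. The function names `active_links` and `download_all` are hypothetical, and the sketch assumes each link ends in the archive's file name.

```python
import urllib.request
from pathlib import Path


def active_links(lines):
    """Yield links that are not blank and not commented out with a leading '#'."""
    for line in lines:
        link = line.strip()
        if link and not link.startswith("#"):
            yield link


def download_all(lines, dest_dir):
    """Download every active link into dest_dir and return the saved paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for link in active_links(lines):
        # Assumption: the last URL path segment is the archive's file name.
        target = dest / link.rstrip("/").rsplit("/", 1)[-1]
        urllib.request.urlretrieve(link, str(target))
        saved.append(target)
    return saved
```

Putting a # in front of a link in the list has the same effect as commenting out a line in the shell scripts: that archive is simply skipped.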

How Datasets Are Prepared

I gathered 40k images for each of the woman and man classes from unsplash.com, so the total number of gathered images is above 80k.

They are all real images. No AI-generated images are used.

Then I post-processed them with several AI models to clean the dataset. Finally, I checked each image manually. The whole process took about 70 (for woman) + 70 (for man) hours.

The final output is 5200 perfect images for woman and 5200 for man. The minimum resolution of the images is above 1536 x 1536 pixels, and the maximum resolution is up to 14999 x 9999 pixels.

The raw images and the versions resized to exact resolutions are shared below. If you need any other specific resolution, let me know and hopefully I will update this post.
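
If you would rather produce another resolution yourself from the raw images, the arithmetic for a crop that matches the target aspect ratio is short. The sketch below uses a plain center crop as a stand-in. It is not the face-aware YOLO V7 / RetinaFace cropping used to prepare these datasets, and the function name `center_crop_box` is hypothetical.

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) of the largest centered box
    that has the same aspect ratio as target_w x target_h."""
    if width * target_h > height * target_w:
        # Source is wider than the target ratio: trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is taller than (or equal to) the target ratio: trim top and bottom.
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h
```

The returned box uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so the cropped region can then be resized to the target size. A center crop can cut off faces that are not near the middle of the frame, which is the problem the face-detection based scripts are meant to avoid.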

To use them on Windows you only need to extract the zip files. If you can't extract them, install WinRAR from https://www.rarlab.com/

Man Dataset

Woman Dataset

How To Use On RunPod Or Other Cloud or Linux

To use these files on RunPod, first you need to install 7zip

  • yes | apt-get install p7zip-full

Then download them with wget. Right-click each download link, choose copy link, and pass the link to wget as below

wget

e.g. man:

Then use the command below to extract them

  • 7z x man_5200_imgs_512x512.zip
  • or another one
  • 7z x man_5200_imgs_1024x1024.zip

e.g. woman :

Then use the command below to extract them

  • 7z x woman_5200_imgs_512x512.zip
  • or another one
  • 7z x woman_5200_imgs_1024x1024.zip
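
Where 7z is not available, Python's standard zipfile module can extract the same archives, and counting the extracted images is a quick check that the download was complete. This is a minimal sketch. The function name `extract_and_count` is hypothetical, and the folder layout inside the archives and the image extensions are assumptions, which is why the count searches recursively.

```python
import zipfile
from pathlib import Path


def extract_and_count(zip_path, dest_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Extract zip_path into dest_dir and return the number of image files found."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively because the archive's internal folder layout is unknown.
    return sum(
        1 for p in dest.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
```

For example, `extract_and_count("man_5200_imgs_512x512.zip", "man_512")` extracts the archive named above into a folder called man_512. Since each dataset is described as having 5200 images, a noticeably lower count would suggest an incomplete download.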

Comments

Anonymous

This is great! Can you add the cropper script along with the versioned requirements.txt as a zip to the updates? I would love to run cropping on my own datasets and the cropper is a very cool tool!

Furkan Gözükara

yes i will share it. i was planning to share with a video but i will share right away hopefully today after several hours for you.

Meito

the woman and man ones, are the same links for the max res ones

Furkan Gözükara

no each one has different links. they are all shared in the post. you should download the specific size you are going to do training or you can download and resize yourself.

Anonymous

Unfortunately after training I'm ending up with very distorted and blurred faces (and some of the 'rainbowy stretch' distortion that seems to be characteristic of a problematic latent space). Any idea what I might be doing wrong? I didn't crop or resize my training images. Is that critical?

Furkan Gözükara

i think it is about your training images dataset. also which settings did you use for the training? you can message me more details from discord

Anonymous

i think you can recommend birme.net/ to resize images faster brother! thanks!

Anonymous

I am getting this error: huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'C:\Users\Ester\Desktop\stable-diffusion-webui\models\dreambooth\Monica5\working\tokenizer'. 0%| | 0/3823 [00:00

Anonymous

I've reinstalled and fixed the error, but now the following appears:

  File "C:\Users\Ester\Desktop\SD DREAMBOOTH\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 717, in forward
    causal_attention_mask = self._build_causal_attention_mask(
  File "C:\Users\Ester\Desktop\SD DREAMBOOTH\stable-diffusion-webui\venv\lib\site-packages\transformers\models\clip\modeling_clip.py", line 760, in _build_causal_attention_mask
    mask.triu_(1) # zero out the lower diagonal
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'

daniel mendoza

Can you place the images in 768x768 in png format, since jpeg is a format that introduces compression artifacts in the images?

Franco Acosta Diaz

Hello, I think there is a mistake in: How To Download All On A RunPod Or A Unix System. It should be: cd /workspace, then chmod +x download_man_reg_imgs.sh, then ./download_man_reg_imgs.sh