What models can I use for Image-to-Image?

The keras-io/low-light-image-enhancement, keras-io/super-resolution, lambdalabs/sd-image-variations-diffusers, mfidabel/controlnet-segment-anything, and timbrooks/instruct-pix2pix models can be used for Image-to-Image.

What datasets can I use for Image-to-Image?

The VIDIT and huggan/CelebA-faces datasets can be used for Image-to-Image.

What metrics can I use for Image-to-Image?

The PSNR, SSIM, and IS metrics can be used for Image-to-Image.

Image-to-Image

Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain. Image-to-image models cover a wide range of image manipulation and enhancement tasks.

[Diagram: Inputs → Image-to-Image Model → Output]

About Image-to-Image

Use Cases

Style transfer

One of the most popular use cases of image-to-image is style transfer. Style transfer models can convert a regular photograph into a painting in the style of a famous painter.

Task Variants

Image inpainting

Image inpainting is widely used in photo editing to remove unwanted objects, such as poles, wires, or sensor dust.

Image colorization

Old black-and-white images can be brought to life using an image colorization model.

Super Resolution

Super resolution models increase the resolution of an image, allowing for higher quality viewing and printing.

Inference

You can use the image-to-image pipelines in the 🧨 diffusers library to run image-to-image models. See the example for StableDiffusionImg2ImgPipeline below.

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load the pipeline in half precision and move it to the GPU
model_id_or_path = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# Load the source image and resize it to the resolution used for generation
init_image = Image.open("mountains_image.jpeg").convert("RGB").resize((768, 512))
prompt = "A fantasy landscape, trending on artstation"

# strength (0 to 1) controls how far the output may depart from init_image
images = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images
images[0].save("fantasy_landscape.png")

You can use Model Database.js to run inference with image-to-image models on the Model Database Hub.

import { HfInference } from "@huggingface/inference";

// HF_ACCESS_TOKEN is your access token; 'image' is a placeholder for the source image URL
const inference = new HfInference(HF_ACCESS_TOKEN);
await inference.imageToImage({
  data: await (await fetch('image')).blob(),
  model: "timbrooks/instruct-pix2pix",
  parameters: {
    prompt: "Deblur this image"
  }
})

ControlNet

Controlling the outputs of diffusion models with only a text prompt is a challenging problem. ControlNet is a type of neural network that adds image-based control to diffusion models. These controls can be, for example, edges or landmarks in an image.
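To make the idea of an edge-based control image concrete, the sketch below builds a binary edge map from a tiny grayscale image. It is only an illustration: it uses a plain Sobel filter in pure Python rather than the Canny detector that edge-conditioned ControlNet checkpoints are usually paired with, and the function name `sobel_edge_map` and its `threshold` default are assumptions made for this example.

```python
def sobel_edge_map(gray, threshold=128.0):
    """Return a binary edge map (0 or 255) for a grayscale image given as rows of numbers."""
    height, width = len(gray), len(gray[0])
    # Sobel kernels for horizontal and vertical intensity changes
    kx = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
    ky = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
    edges = [[0] * width for _ in range(height)]
    # Border pixels are left at 0 because the 3x3 window does not fit there
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = gray[r + dr - 1][c + dc - 1]
                    gx += kx[dr][dc] * pixel
                    gy += ky[dr][dc] * pixel
            # Mark the pixel as an edge when the gradient magnitude is large enough
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[r][c] = 255
    return edges


# 6x6 toy image: dark left half, bright right half
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
edges = sobel_edge_map(image)
for row in edges:
    print(row)
```

An edge map like this, saved as an image, is the kind of input a ControlNet model takes alongside the text prompt.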

Many ControlNet models were trained during our community event, the JAX Diffusers sprint. You can see the full list of the ControlNet models available here.

Most Used Model for the Task

Pix2Pix is a popular model for image-to-image translation tasks. It is based on a conditional GAN (generative adversarial network) in which a 2D image, rather than a noise vector, is given as input. More information about Pix2Pix, including the associated paper [1] and the GitHub repository, can be found at this link.

The paper shares several examples of what Pix2Pix can produce, and the model applies to a variety of cases. It handles relatively simple tasks, such as converting a grayscale image to a colored version. More importantly, it can generate realistic pictures from rough sketches (as in the purse example) or from painting-like images (as in the street and facade examples).
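For readers who want the conditional-GAN idea in symbols, the objective can be written as follows, using the notation of the Pix2Pix paper [1]: x is the input image, y is the target image, z is an optional noise source, G is the generator, D is the discriminator, and λ weights the L1 reconstruction term. This is a summary of the formulation in the paper, not a description of any particular implementation.

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z) \rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The discriminator sees the input image together with either the real target or the generated output, which is what makes the GAN conditional.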

Useful Resources

References

[1] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-Image Translation with Conditional Adversarial Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5967-5976, doi: 10.1109/CVPR.2017.632.

This page was made possible thanks to the efforts of Paul Gafton and Osman Alenbey.

Models for Image-to-Image

Browse Models (197)

keras-io/low-light-image-enhancement

Image-to-Image • Updated May 5, 2022 • 100 • 21

Note A model that enhances images captured in low light conditions.

keras-io/super-resolution

Image-to-Image • Updated May 5, 2022 • 16 • 13

Note A model that increases the resolution of an image.

lambdalabs/sd-image-variations-diffusers

Image-to-Image • Updated Feb 8 • 14.3k • 277

Note A model that creates a set of variations of the input image in the style of DALL-E using Stable Diffusion.

mfidabel/controlnet-segment-anything

Image-to-Image • Updated May 14 • 824 • 16

Note A model that generates images based on segments in the input image and the text prompt.

timbrooks/instruct-pix2pix

Image-to-Image • Updated Jul 5 • 92.9k • 668

Note A model that takes an image and an instruction to edit the image.

Datasets for Image-to-Image

Browse Datasets (70)

No example dataset is defined for this task.

Note Contribute by proposing a dataset for this task!

Spaces using Image-to-Image

🎆

keras-io/low-light-image-enhancement

Note Image enhancer application for low light.

🐠

keras-io/neural-style-transfer

Note Style transfer application.

😻

mfidabel/controlnet-segment-anything

Note An application that generates images based on segment control.

🌖

hysts/ControlNet

Note Image generation application that takes image control and text prompt.

💻

ioclab/brightness-controlnet

Note Colorize any image using this app.

🚀

timbrooks/instruct-pix2pix

Note Edit images with instructions.

Metrics for Image-to-Image

PSNR: Peak Signal-to-Noise Ratio (PSNR) compares the maximum possible pixel intensity with the mean squared error between the output and a reference image, and serves as a rough approximation of human perception of reconstruction quality. It is measured in dB, and a higher value indicates higher fidelity.
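As a rough illustration of how PSNR is computed, the sketch below implements the standard formula, 10 · log10(MAX² / MSE), on flat lists of pixel values. The function name `psnr`, the `max_value` default of 255 (8-bit images), and the sample pixel values are assumptions made for this example.

```python
import math


def psnr(reference, output, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(output) or not reference:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error between the reference and the output
    mse = sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have no noise
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values: a reference and a slightly noisy output
reference = [52, 55, 61, 59, 79, 61, 76, 61]
noisy = [54, 53, 60, 62, 78, 63, 75, 60]
print(f"PSNR: {psnr(reference, noisy):.2f} dB")
```

In practice the same computation is applied to all pixels and channels of the two images at once, usually with an array library.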

SSIM: Structural Similarity Index (SSIM) is a perceptual metric which compares the luminance, contrast and structure of two images. The values of SSIM range between -1 and 1, and higher values indicate closer resemblance to the original image.
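The sketch below shows how luminance (means), contrast (variances), and structure (covariance) combine into a single SSIM value. It is a simplification: it evaluates the formula once over the whole image, whereas common implementations average it over local sliding windows. The function name `global_ssim` and the sample data are assumptions for this example; the constants 0.01 and 0.03 are the commonly used stabilizing values.

```python
def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM between two equally sized pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and have the same length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small constants that stabilize the division when means or variances are near zero
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 8-bit pixel values and their photographic negative
original = [52, 55, 61, 59, 79, 61, 76, 61]
inverted = [255 - value for value in original]
print(f"SSIM with itself:   {global_ssim(original, original):.3f}")
print(f"SSIM with negative: {global_ssim(original, inverted):.3f}")
```

Comparing an image with itself gives a value of 1, while the negative of an image has inverted structure and scores below 0.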

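IS: Inception Score (IS) is computed from the labels predicted by an image classification model when presented with a sample of the generated images. A higher score indicates that individual images are classified confidently and that the predicted labels are diverse across the sample.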
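The sketch below computes the Inception Score from class probabilities using its usual definition: the exponential of the average KL divergence between each image's predicted label distribution and the marginal label distribution over the sample. In practice the probabilities come from a pretrained Inception classifier; here they are hand-written toy values, and the function name `inception_score` is an assumption for this example.

```python
import math


def inception_score(class_probs):
    """Inception Score from per-image class probability vectors."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal label distribution p(y), averaged over all generated images
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    kl_sum = 0.0
    for p in class_probs:
        # KL(p(y|x) || p(y)); terms with zero probability contribute nothing
        kl_sum += sum(pk * math.log(pk / marginal[k]) for k, pk in enumerate(p) if pk > 0)
    return math.exp(kl_sum / n)


# Toy predictions over two classes
confident_and_diverse = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]
for name, probs in [("diverse", confident_and_diverse), ("uncertain", uncertain), ("collapsed", collapsed)]:
    print(name, round(inception_score(probs), 3))
```

The score is bounded below by 1 and above by the number of classes, so confident and varied predictions push it up while uncertain or repetitive predictions keep it near 1.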