Transform your images and videos into stunning works of art! ArtFusion uses the power of Fast Neural Style Transfer to apply artistic styles to your content in near real-time, all through an easy-to-use web interface built with Streamlit.
Neural Style Transfer (NST) is a fascinating technique that emerged from deep learning research, allowing us to separate the content of one image from the style of another and combine them. The original approach, pioneered by Gatys et al., involved using a pre-trained Convolutional Neural Network (CNN, typically VGG) to extract content and style features. It then iteratively optimized a new image (starting from noise or the content image) to minimize a combined loss function:
- Content Loss: Ensures the output image retains the subject matter of the content image.
- Style Loss: Ensures the output image matches the textural patterns and color palettes of the style image across different network layers.
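The two loss terms above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the project's training code: it assumes the feature maps have already been extracted from the loss network as (H, W, C) arrays, and the function names and default weights are hypothetical. Normalization constants and layer weights vary between implementations.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Channel-by-channel correlations of one (H, W, C) feature map."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)      # one row per spatial position
    return flat.T @ flat / (h * w)         # (C, C), averaged over positions


def content_loss(generated: np.ndarray, content: np.ndarray) -> float:
    """Mean squared difference between two feature maps of the same layer."""
    return float(np.mean((generated - content) ** 2))


def style_loss(generated_layers, style_layers) -> float:
    """Sum over layers of the mean squared Gram-matrix difference."""
    return float(sum(
        np.mean((gram_matrix(g) - gram_matrix(s)) ** 2)
        for g, s in zip(generated_layers, style_layers)
    ))


def total_loss(content_term, style_term, content_weight=1.0, style_weight=1e-2):
    """Weighted sum of both terms; the default weights are placeholders."""
    return content_weight * content_term + style_weight * style_term
```

Because the Gram matrix sums over all spatial positions, it discards where a feature occurs and keeps only which features occur together, which is why it captures texture rather than layout.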
The Challenge with Traditional NST: While powerful and flexible (it can work with any content and style image pair), the optimization process is computationally expensive and slow. Generating a single stylized image can take minutes or even longer, making it unsuitable for real-time applications or video processing.
Enter Fast Neural Style Transfer: To overcome the speed limitation, researchers like Johnson et al. proposed a different approach. Instead of optimizing an output image directly, they trained a separate, feed-forward neural network for each specific style.
- Training: A dedicated "Style Transfer Network" is trained on a large dataset of content images. During training, it learns to transform any input image into the target artistic style while preserving content, using the same perceptual loss functions (content and style loss) derived from a fixed loss network (like VGG).
- Inference (Stylization): Once trained, applying the style is incredibly fast. You simply pass your content image through the specific trained network in a single forward pass.
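The inference step described above amounts to a thin wrapper around one model call. The sketch below is a minimal illustration under stated assumptions: `stylize` is a hypothetical helper, `model` is any callable that maps a float batch of shape (1, H, W, 3) to a batch of the same layout, and pixel values are assumed to be scaled to [0, 1]. A real trained network may expect a different input range or size.

```python
import numpy as np


def stylize(image: np.ndarray, model) -> np.ndarray:
    """Run one forward pass of `model` over an (H, W, 3) uint8 image."""
    # Preprocess: convert to floats in [0, 1] and add a leading batch axis.
    batch = image.astype(np.float32)[np.newaxis, ...] / 255.0
    # Inference: a single call, with no per-image optimization loop.
    output = np.asarray(model(batch))
    # Deprocess: drop the batch axis, clip, and convert back to 8-bit pixels.
    pixels = np.clip(output[0], 0.0, 1.0) * 255.0
    return np.rint(pixels).astype(np.uint8)
```

A loaded Keras model can be passed as `model` if it accepts this input scale; for a quick check, a plain function such as `lambda x: 1.0 - x` (color inversion) works as a stand-in.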
Comparison:
| Feature | Traditional NST | Fast NST |
|---|---|---|
| Speed | Slow (optimization per image) | Fast (single forward pass per image) |
| Flexibility | High (any style image) | Lower (requires a trained network per style) |
| Training | None (optimization at runtime) | Required (one network per style, offline) |
| Real-time | Impractical | Near real-time (hardware dependent) |
| Video | Very slow / Impractical | Feasible |
This project provides a user-friendly interface to experiment with Fast Neural Style Transfer:
- Backend: Uses TensorFlow and Keras to load and run pre-trained Fast Style Transfer models (.keras or .h5 format). Each model is trained for a specific artistic style.
- Frontend: A web application built with Streamlit allows users to:
  - Select from a list of available artistic styles (detected from models found in the Model folder).
  - View a preview of the selected style image.
  - Upload their own content image (JPG, PNG, JPEG) or video (MP4, AVI, MOV, MKV).
- Processing Pipeline:
  - Image Input: The uploaded image is preprocessed (resized, normalized) and fed into the selected style transfer model. The model outputs the stylized image tensor.
  - Video Input: The uploaded video is processed frame-by-frame. Each frame is extracted, preprocessed, passed through the style transfer model, deprocessed, and then re-encoded into a new output video file. OpenCV is used for video reading and writing.
  - Output: The stylized image or video is displayed in the web app, and a download button is provided.
- Model & Style Management: The application automatically discovers available styles by looking for model files (.keras, .h5) in the Model directory and corresponding style preview images (with the same base name) in the dataset/style_images directory.
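The style discovery described in the last item can be sketched with the standard library alone. This is an assumption-laden illustration rather than the app's actual code: `discover_styles` is a hypothetical helper, and the set of accepted preview image extensions is a guess.

```python
from pathlib import Path

MODEL_EXTENSIONS = {".keras", ".h5"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed preview formats


def discover_styles(model_dir, style_image_dir):
    """Map each style name to its model path and optional preview path."""
    # Index preview images by base name (file name without extension).
    previews = {
        p.stem: p
        for p in sorted(Path(style_image_dir).glob("*"))
        if p.suffix.lower() in IMAGE_EXTENSIONS
    }
    styles = {}
    for model_path in sorted(Path(model_dir).glob("*")):
        if model_path.suffix.lower() in MODEL_EXTENSIONS:
            styles[model_path.stem] = {
                "model": model_path,
                "preview": previews.get(model_path.stem),  # None if missing
            }
    return styles
```

Calling `discover_styles("Model", "dataset/style_images")` from the project root would return one entry per model file, keyed by the shared base name.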
Here's an example of ArtFusion transforming a content image using different artistic styles:
Original Content Image:
Transformed Image:
Follow these steps to set up and run the ArtFusion Streamlit application locally.
- Python: Version 3.10 is required. You can download it from python.org.
- pip: Python's package installer (usually comes with Python).
- Git: To clone the repository (optional, you can also download the code as a ZIP).
- Clone the Repository:
  git clone https://github.com/srrishtea/ArtFusion.git
  cd ArtFusion
- Install Dependencies: Install the following packages with pip:
  streamlit
  numpy
  tensorflow
  Pillow
  opencv-python
- Download and Place Models & Style Images:
  - Download the pre-trained models from the link .
  - Place the downloaded model files (e.g., starry_night.keras) inside the Model folder in the project's root directory.
  - Download the style images dataset from the link .
  - Ensure the style preview images (e.g., starry_night.jpg) are placed inside the dataset/style_images folder. The base name of the style image (without extension) must match the base name of its corresponding model file.
- Navigate to the Web App Directory: Make sure your terminal/command prompt is inside the project's root directory where you cloned the repository. Then change into the Web App folder:
  cd "Web App"
- Run the Streamlit Application:
  streamlit run app.py
- Streamlit will start the server, and the application should automatically open in your default web browser. You can also navigate to the local URL provided in the terminal (usually http://localhost:8501).
Beyond just using the app, understanding the underlying concepts is valuable.
The core losses for style transfer are Content Loss and Style Loss. An additional term, Total Variation (TV) Loss, is sometimes added, particularly when training Fast NST networks:
- Purpose: TV Loss acts as a spatial regularizer. It encourages smoothness in the generated image by penalizing large differences between adjacent pixel values.
- Effect: It helps to reduce high-frequency artifacts, noise, or pixelation in the output image, leading to a more visually coherent and smoother result. While it might slightly reduce fine details, it often improves the overall quality of the stylized output.
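As a rough sketch of the idea, TV Loss can be computed by differencing neighbouring pixels along each spatial axis. The function below is illustrative only and assumes an (H, W, C) image array; it uses squared differences, while other implementations use absolute differences instead.

```python
import numpy as np


def total_variation_loss(image: np.ndarray) -> float:
    """Sum of squared differences between adjacent pixels of an (H, W, C) image."""
    image = image.astype(np.float64)                  # avoid uint8 wrap-around
    vertical = image[1:, :, :] - image[:-1, :, :]     # each pixel vs. the one above
    horizontal = image[:, 1:, :] - image[:, :-1, :]   # each pixel vs. the one to its left
    return float(np.sum(vertical ** 2) + np.sum(horizontal ** 2))
```

A perfectly flat image scores zero, and the score grows with every abrupt jump between neighbours, so adding this term with a small weight nudges the network toward smoother outputs.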
The foundational work for the technique used in this project is:
- Title: "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"
- Authors: Justin Johnson, Alexandre Alahi, Li Fei-Fei
- Conference: European Conference on Computer Vision (ECCV), 2016
- Link: arXiv:1603.08155
This paper details the architecture of the style transfer network and the use of perceptual loss functions for training feed-forward networks capable of fast stylization.