AniDoc: Animation Creation Made Easier

Published 18 Dec 2024 in cs.CV | (2412.14173v2)

Abstract: The production of 2D animation follows an industry-standard workflow, encompassing four essential stages: character design, keyframe animation, in-betweening, and coloring. Our research focuses on reducing the labor costs in the above process by harnessing the potential of increasingly powerful generative AI. Using video diffusion models as the foundation, AniDoc emerges as a video line art colorization tool, which automatically converts sketch sequences into colored animations following the reference character specification. Our model exploits correspondence matching as an explicit guidance, yielding strong robustness to the variations (e.g., posture) between the reference character and each line art frame. In addition, our model could even automate the in-betweening process, such that users can easily create a temporally consistent animation by simply providing a character image as well as the start and end sketches. Our code is available at: https://yihao-meng.github.io/AniDoc_demo.

Abstract PDF HTML Upgrade to Chat

Citations (1)

View on Semantic Scholar

Summary

The paper introduces a two-stage training method using video diffusion models to automate the 2D animation colorization process.
The approach overcomes challenges such as pose variations and binarization by using explicit color correspondence and background augmentation.
Results show improved visual fidelity and temporal coherence, with quantitative gains in metrics like PSNR, SSIM, and LPIPS.

AniDoc: Animation Creation Made Easier

The paper "AniDoc: Animation Creation Made Easier" presents a novel approach to automate aspects of the 2D animation colorization process. The authors propose a method based on video diffusion models to efficiently colorize line art sketches, which fits seamlessly into existing animation workflows. This approach significantly lowers labor costs in animation production by minimizing manual steps in the coloring process.

Animation Production Workflow Overview

The 2D animation production workflow typically involves several stages: character design, keyframe animation, in-betweening, and coloring. The proposed method focuses on automating the coloring stage while maintaining temporal coherence and fidelity to character designs.

Figure 1: Illustration of the workflow of 2D animation production.

Method Overview

The authors introduce a two-stage training process leveraging video diffusion models to tackle colorization challenges in animation. The first stage involves dense-sketch training to establish correspondences between reference images and sketches. This step deals with extracting keypoints to form point maps, ensuring color consistency from the reference image throughout the animation.

Figure 2: Overview of pipeline. We adopt a two-stage training strategy...

Key Challenges Addressed

Mismatch Between Reference and Sketches: The method uses an explicit correspondence matching mechanism to align colors from reference images to sketch frames, overcoming differences in pose, angle, and scale.
Binarization and Background Augmentation: To simulate production conditions, sketches are binarized, which excludes hidden color information. The model trains with background augmentation to improve robustness and distinguish between character and background.
Sparse Sketch Training: The second stage of training removes intermediate sketches, relying instead on matching points and start/end frames to interpolate smooth transitions, requiring only minimal sketches.

Results

Qualitative results show superior visual fidelity and temporal coherence with fewer artifacts compared to previous methods. Quantitative assessments using metrics like PSNR, SSIM, and LPIPS reflect improved colorization accuracy, while FID and FVD demonstrate enhanced generation quality.

Figure 3: Visual comparison of reference-based colorization with four methods.

Flexibility and Robustness

The method showcases robustness to varying animation styles and scales, demonstrating its applicability across diverse scenarios. It efficiently colorizes character sequences with significantly different poses and scales using a single reference image.

Figure 4: Illustration of the flexible usage of our model...

Conclusion

AniDoc represents a step forward in automating animation workflows, particularly in the field of video colorization with reference images. Despite its advantages, there remain limitations such as color uncertainty in unseen objects and attire variations. Future work may focus on enriching the model's contextual understanding and broadening its applicability to more complex animation scenarios.