Purdue built its benchmark around the Political Deepfakes Incident Database (PDID), which focuses on deepfake incidents circulating on X/Twitter, YouTube, TikTok, and Instagram. Real-world distribution shifts are where detectors tend to fail, so Purdue designed the test to reflect what security teams encounter in practice. The dataset intentionally includes "messy" characteristics common in the wild:

- Heavy compression and re-encoding
- Sub-720p resolution
- Short, social-media-style clips
- Heterogeneous generation pipelines and post-processing

PDID contains 232 images and 173 videos. Detectors spanning academic, government, and commercial approaches were evaluated end-to-end using standard metrics (accuracy, AUC, and FAR). These realistic inputs reveal how models are likely to perform in production, not just in the lab.
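To make the metric names concrete, here is a minimal sketch of how such an evaluation could be scored. It assumes labels of 1 = fake and 0 = real, detector outputs as fake-probability scores, and interprets FAR as the false acceptance rate (the share of fakes the detector passes as real at a fixed threshold); the article does not specify these conventions, so treat them as illustrative assumptions rather than Purdue's actual protocol.

```python
# Illustrative scoring of a deepfake detector with accuracy, AUC, and FAR.
# Assumptions (not from the source): 1 = fake, 0 = real; `scores` are fake
# probabilities; FAR = fraction of fakes accepted as real at the threshold.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

def evaluate_detector(labels: np.ndarray, scores: np.ndarray, threshold: float = 0.5):
    preds = (scores >= threshold).astype(int)      # 1 = flagged as fake
    acc = accuracy_score(labels, preds)            # overall accuracy at the threshold
    auc = roc_auc_score(labels, scores)            # threshold-free ranking quality
    fakes = labels == 1
    far = float(np.mean(preds[fakes] == 0)) if fakes.any() else float("nan")
    return {"accuracy": acc, "auc": auc, "far": far}

# Toy usage with synthetic scores (illustrative only, not PDID data).
rng = np.random.default_rng(0)
labels = np.array([1] * 50 + [0] * 50)
scores = np.clip(np.where(labels == 1, 0.7, 0.3) + rng.normal(0, 0.2, 100), 0, 1)
print(evaluate_detector(labels, scores))
```

Reporting AUC alongside a thresholded accuracy and FAR is a common way to separate a detector's ranking ability from the operating point a security team actually deploys it at.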
Source: https://thehackernews.com/expert-insights/2025/12/purdue-universitys-real-world-deepfake.html