This seems like the kind of thing you could use machine learning to do. Use neural networks and computer vision to extract common visual motifs (Batman, some thot in a dollar-store Elsa costume, literal human stuff) from the literally millions of these videos that exist, then train a model with a fitness score based on which videos performed the best. Then hook it up to some kind of webcrawler script and have it spit out brand-new videos every minute. Instant millions.
But then you realize it's producing videos made of things that weren't in the original content. All of a sudden it's messages about the fall of civilization and the end of market-based economics. Elsa takes a scimitar and cuts Batman's head off. Batman's torso is labeled 'Bourgeoisie'. But wait, she's pulling stuff out of a burlap sack. Wait, it's Totino's Pizza Rolls government vouchers.
Suddenly there are brainwashed child soldiers marching through the streets of Washington. The government is getting beheaded in the streets. A marble statue of Elsa is hoisted onto the front lawn of the White House. The machines have won.