Well, I could have told you that. In fairness, I’ve taught a lot of intro poetry courses, and I’d put some of these at a solid B+ for fledgling writers.
For its training, GPT-2 was given a corpus of 8 million webpages, chosen with a quintessentially internet-y method of natural selection: “In order to preserve document quality,” OpenAI’s post states, “we used only pages which have been curated/filtered by humans—specifically, we used outbound links from Reddit which received at least 3 karma.” Through trial and error, GPT-2 learned how to predict the rest of a piece of text, given only the first few words or sentences. In turn, this gave it a general method for completing other texts, regardless of content or genre.
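If you strip away the Transformer machinery and the 8 million webpages, the underlying game is simple enough to sketch in a few lines. This is a toy illustration only, nothing like GPT-2's actual model: it uses bigram counts over a made-up three-sentence corpus to guess each next word, then chains those guesses to "complete" a prompt.

```python
from collections import defaultdict

# Toy version of the training objective: given the words so far,
# predict the next one. The corpus below is invented for illustration.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count how often each word follows each word in the corpus.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word`."""
    followers = counts[word]
    return max(followers, key=followers.get) if followers else "."

def complete(prompt, length=5):
    """Greedily extend a prompt, one predicted word at a time."""
    words = prompt.split()
    for _ in range(length):
        words.append(predict_next(words[-1]))
    return " ".join(words)

print(complete("the dog", length=4))
```

Where this toy looks only one word back, GPT-2 conditions on everything that came before, which is why it can keep a poem's meter and subject going rather than just babbling about cats and rugs.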
Wait a friggin minute — you used Reddit to help teach an AI? Sweet merciful baby Jesus. We’re all going to die.