Scientific sonnets: Duke team wins competition for poetry-generating algorithm

Courtesy of Wikimedia Commons

Computer science and poetry may be considered very different fields, but four Duke undergraduates have found a way to combine them.

In October, the Duke Data Science team won the 2018 PoetiX Literary Creative Turing Test competition for a computer algorithm that wrote the most "human-like" sonnets. The team was formed from an undergraduate course called Data Science Competition, taught by Cynthia Rudin, associate professor of computer science and electrical and computer engineering.

“Our undergraduates started with no prior experience in computer-generated poetry and won the competition after only one semester of work,” Rudin wrote in an email.

Judges for the competition are presented with a mix of human-written and machine-written sonnets and asked to label each one as the work of a human or a machine. The entry whose sonnets are hardest to distinguish from human-written ones is declared the winner.

The Duke Data Science team comprises seniors Peter Hase and John Benhart, junior Liuyi Zhu and Tianlin Duan, Trinity ’18. Like existing computer-generated poetry algorithms, the team’s program selects the rhyming end words for the sonnet’s 14 lines before filling in the rest of the syllables by writing backward.
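The end-words-first approach described above can be illustrated with a toy sketch. This is not the team's program: the rhyme groups, the filler vocabulary, the Shakespearean ABAB CDCD EFEF GG scheme and every function name here are assumptions made for illustration, and the sketch counts words rather than syllables and picks filler words at random instead of using a trained language model.

```python
import random

# Hypothetical rhyme groups; a real system would likely derive rhymes
# from a pronouncing dictionary rather than a hand-written list.
RHYME_GROUPS = [
    ["night", "light", "bright"],
    ["sea", "free", "tree"],
    ["day", "way", "stay"],
    ["heart", "part", "start"],
    ["love", "dove", "above"],
    ["rain", "pain", "again"],
    ["time", "rhyme", "climb"],
]

# Assumed rhyme scheme: one letter per line, 14 lines in total.
SCHEME = "ABABCDCDEFEFGG"

# Hypothetical filler vocabulary used to pad each line.
FILLER = ["the", "silent", "river", "under", "golden", "and", "softly", "over"]


def pick_end_words(rng):
    """Step 1: choose one rhyming end word for each of the 14 lines."""
    letters = sorted(set(SCHEME))
    groups = rng.sample(RHYME_GROUPS, len(letters))
    # Shuffle each group so repeated letters draw different words.
    pools = {letter: rng.sample(group, len(group))
             for letter, group in zip(letters, groups)}
    return [pools[letter].pop() for letter in SCHEME]


def build_line_backward(end_word, rng, length=6):
    """Step 2: start from the end word and prepend words right to left."""
    line = [end_word]
    while len(line) < length:
        line.insert(0, rng.choice(FILLER))
    return " ".join(line)


def toy_sonnet(seed=0):
    rng = random.Random(seed)
    return [build_line_backward(word, rng) for word in pick_end_words(rng)]


if __name__ == "__main__":
    print("\n".join(toy_sonnet()))
```

Fixing the end words first guarantees the rhyme scheme before any other word is chosen, which is why the rest of each line is then built from right to left.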

What gave the Duke team the edge was a series of novel features the students added to the algorithm, such as more rhythmic automated punctuation placement and basic rules that prevent certain part-of-speech errors. For example, a pronoun typically does not directly precede another pronoun in English, meaning ‘he it’ would not occur in the sonnets.

“We collected many [part-of-speech] sequences that could not occur, then used this knowledge to ensure that word sequences did not violate these basic rules,” reads a paper by the Duke team about the algorithm. 
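A rule of this kind can be sketched as a filter over adjacent part-of-speech tags. The tiny lexicon, the tag names and the function names below are hypothetical, and the only forbidden pair included is the pronoun-pronoun case mentioned above; the team's actual list of disallowed sequences is not reproduced here.

```python
# Hypothetical toy lexicon mapping words to coarse part-of-speech tags.
# A real system would use a trained tagger instead.
TOY_LEXICON = {
    "he": "PRON", "she": "PRON", "it": "PRON",
    "saw": "VERB", "sings": "VERB",
    "the": "DET", "moon": "NOUN",
}

# Tag pairs assumed never to appear side by side.
# Only the pronoun-pronoun rule comes from the article.
FORBIDDEN_PAIRS = {("PRON", "PRON")}


def violates_pos_rules(words):
    """Return True if any two adjacent words form a forbidden tag pair."""
    tags = [TOY_LEXICON.get(word.lower(), "UNK") for word in words]
    return any(pair in FORBIDDEN_PAIRS for pair in zip(tags, tags[1:]))


def filter_candidates(candidates):
    """Keep only the candidate word sequences that break no rule."""
    return [words for words in candidates if not violates_pos_rules(words)]


if __name__ == "__main__":
    candidates = [["he", "it", "saw"], ["he", "saw", "it"]]
    print(filter_candidates(candidates))
```

In this sketch, words missing from the lexicon get the tag `UNK` and never trigger a rule, so the filter only rejects sequences it can positively identify as invalid.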

The Duke team also utilized machine learning to teach the bot poetry. The team fed the bot a more sonnet-like corpus of texts, such as the works of Walt Whitman and even "The Hunger Games," in lieu of the traditionally used corpus of song lyrics. According to the paper, the texts better fit the meter and cadence that sonnets require.

The key difficulty was in figuring out how to program poetic command of a language in something that lacks even a basic understanding of the words’ meanings.

“Computers currently cannot learn from a large amount of text how to write a poem, they don’t understand how the words relate to each other and they do not understand the meaning of the words,” Rudin wrote.

Yet, computers do possess a valuable advantage in writing poems quickly. Unlike humans, algorithms are able to follow meter and rhyme flawlessly, Rudin noted. 

“I actually think there’s a lot of synergy between the aspects of the poetry that humans can generate and the other aspects that the computer can generate,” Rudin wrote.

Despite its success, the team is not done improving the algorithm. The bot did not convince the panel of judges that its poetry was written by a human, even though it was the most convincing of the entries.

According to Rudin, the team is trying to “imbue more meaning into the poems,” and they have already recruited several new members to help with this continued endeavor. 

The Turing Test, the thought experiment on which the PoetiX competition is based, has never been passed since Alan Turing proposed it in 1950. A computer possessing “poetic” abilities indistinguishable from those of a human could be an enormously consequential breakthrough in the philosophy of artificial intelligence.

Rudin is not so convinced. 

“[The computers] are repeating and combining patterns they find in the data, and the data in our case is just a large amount of poetry,” she wrote. “They are not intelligent.” 

