In the weeks after Mazurenko’s death, friends debated the best way to preserve his memory. One person suggested making a coffee-table book about his life, illustrated with photography of his legendary parties. Another friend suggested a memorial website. To Kuyda, every suggestion seemed inadequate.
As she grieved, Kuyda found herself rereading the endless text messages her friend had sent her over the years — thousands of them, from the mundane to the hilarious. She smiled at Mazurenko’s unconventional spelling — he struggled with dyslexia — and at the idiosyncratic phrases with which he peppered his conversation. Mazurenko was mostly indifferent to social media — his Facebook page was barren, he rarely tweeted, and he deleted most of his photos on Instagram. His body had been cremated, leaving her no grave to visit. Texts and photos were nearly all that was left of him, Kuyda thought.
For two years she had been building Luka, whose first product was a messenger app for interacting with bots. Backed by the prestigious Silicon Valley startup incubator Y Combinator, the company began with a bot for making restaurant reservations. Kuyda’s co-founder, Philip Dudchuk, has a degree in computational linguistics, and much of their team was recruited from Yandex, the Russian search giant.
As she read Mazurenko’s messages, it occurred to Kuyda that they might serve as the basis for a different kind of bot, one that mimicked an individual person’s speech patterns. Aided by a rapidly developing neural network, perhaps she could speak with her friend once again.
She set aside for a moment the questions that were already beginning to nag at her.
What if it didn’t sound like him?
What if it did?
[---]
Two weeks before Mazurenko was killed, Google released TensorFlow for free under an open-source license. TensorFlow is a kind of Google in a box: a flexible machine-learning system that the company uses to do everything from improving search algorithms to writing captions for YouTube videos automatically. The product of decades of academic research and billions of dollars in private investment was suddenly available as a free software library that anyone could download from GitHub.
Luka had been using TensorFlow to build neural networks for its restaurant bot. Using 35 million lines of English text, Luka trained a bot to understand queries about vegetarian dishes, barbecue, and valet parking. On a lark, the 15-person team had also tried to build bots that imitated television characters. It scraped the closed captioning on every episode of HBO’s Silicon Valley and trained the neural network to mimic Richard, Bachman, and the rest of the gang.
In February, Kuyda asked her engineers to build a neural network in Russian. At first she didn’t mention its purpose, but given that most of the team was Russian, no one asked questions. Using more than 30 million lines of Russian text, Luka built its second neural network. Meanwhile, Kuyda copied hundreds of her exchanges with Mazurenko from the app Telegram and pasted them into a file. She edited out a handful of messages that she believed would be too personal to share broadly. Then Kuyda asked her team for help with the next step: training the Russian network to speak in Mazurenko’s voice.
The project was tangentially related to Luka’s work, though Kuyda considered it a personal favor. (An engineer told her that the project would only take about a day.) Mazurenko was well known to most of the team, having worked out of Luka’s Moscow office, where the employees labored beneath a neon sign that quoted Wittgenstein: “The limits of my language are the limits of my world.” Kuyda trained the bot with dozens of test queries, and her engineers put on the finishing touches.
Only a small percentage of the Roman bot’s responses reflected his actual words. But the neural network was tuned to favor his speech whenever possible. Any time the bot could respond to a query using Mazurenko’s own words, it would. Other times it would default to generic Russian. After the bot blinked to life, Kuyda began peppering it with questions.
Who’s your best friend?, she asked.
Don’t show your insecurities, came the reply.
It sounds like him, she thought.
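The article does not say how the network was tuned to favor Mazurenko's own words, so the following is only an illustrative sketch of the general idea of "use the person's real words when a close match exists, otherwise fall back to a generic model." It is not Luka's method. The function names, the word-overlap similarity, the threshold value, and the sample exchange are all assumptions made up for the example.

```python
import re
from typing import Callable, List, Optional, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def reply(
    query: str,
    personal_pairs: List[Tuple[str, str]],
    generic_reply: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical cutoff, chosen only for illustration
) -> str:
    """Prefer a stored personal reply; otherwise defer to the generic model.

    personal_pairs holds (message, reply) exchanges taken from a chat log.
    generic_reply stands in for a general-purpose conversational model.
    """
    query_tokens = tokenize(query)
    best_score = 0.0
    best_answer: Optional[str] = None
    for prompt, answer in personal_pairs:
        score = jaccard(query_tokens, tokenize(prompt))
        if score > best_score:
            best_score, best_answer = score, answer
    if best_answer is not None and best_score >= threshold:
        return best_answer  # the person's own words
    return generic_reply(query)  # generic fallback


if __name__ == "__main__":
    # Hypothetical exchange, not taken from the real message archive.
    pairs = [("how are you today", "placeholder personal reply")]
    print(reply("How are you today?", pairs, lambda q: "generic reply"))
    print(reply("What is the weather", pairs, lambda q: "generic reply"))
```

A lookup like this can only return replies it has already seen, which is consistent with the article's note that only a small percentage of responses reflected Mazurenko's actual words. A trained neural network would generalize beyond exact word overlap.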