Sunday, September 25, 2016

Stealing an AI Algorithm & Its Underlying Data is a “High-School Level Exercise”

The researchers found that an algorithm's complexity mirrored how hard it was to steal. Simple yes-or-no algorithms, such as those used to predict whether a tumor is malignant or whether a sepsis patient is likely to die, could be copied in just 41 queries, costing less than $0.10 under Google’s payment structure. Complex neural networks, like those used in handwriting recognition, took 108,200 queries on average, but the copies achieved more than 98% accuracy when tested against the original algorithm.
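For intuition about why so few queries can suffice, here is a minimal sketch of one extraction technique described in the underlying paper (equation-solving), assuming the API returns confidence scores rather than bare yes/no labels: for a logistic regression model with d features, d + 1 well-chosen queries pin down the weights exactly. The toy model, dimensions, and `api` function below are illustrative, not any real provider's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "secret" server-side model: logistic regression, hidden weights.
d = 5
w_true = rng.normal(size=d)
b_true = 0.3

def api(X):
    """Stand-in for a prediction API that returns confidence scores."""
    return 1.0 / (1.0 + np.exp(-(X @ w_true + b_true)))

# Attacker: make d + 1 queries, invert the sigmoid, solve a linear system.
X = rng.normal(size=(d + 1, d))
p = api(X)
logits = np.log(p / (1.0 - p))            # sigmoid^-1(p) = w.x + b
A = np.hstack([X, np.ones((d + 1, 1))])   # unknowns: the d weights plus b
sol = np.linalg.solve(A, logits)
w_stolen, b_stolen = sol[:d], sol[d]      # exact recovery, up to float error
```

If the API only returned hard labels, the attacker would instead need many more queries to fit a surrogate model, which is where counts like 108,200 come from.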

These attacks are limited by a few practical factors: since APIs are typically monetized per use, this method can get expensive past 100,000 queries, and heavy querying can also raise red flags with the service provider. Ristenpart says that deeper neural networks remain vexing, especially if the deployed system is a conglomeration of several different algorithms.

Once they had stolen an algorithm, the team was also able to reveal the data that had been used to train it. They tested this attack on a public data set of faces, often used for facial recognition, and found that every face could be reconstructed. The algorithm had memorized each face to such an extent that it could generate each person’s likeness. If a company were to train its algorithm on private data, like health records or its users’ information, there’s no guarantee that data would be safe if the API were accessible.
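The face-reconstruction step is a model-inversion attack: once the attacker holds a copy of the model, they can gradient-ascend an input to maximize the model's confidence for a target class, and the result resembles what the model memorized for that class. A toy sketch, with a random vector standing in for a face image (the dataset, dimensions, and penalty weight are all illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Stand-in for a face dataset: the target "person" is a class whose images
# cluster around one template vector (a real attack uses pixel vectors).
d = 64
template = rng.normal(size=d)                  # the target's "average face"
X_target = template + 0.1 * rng.normal(size=(200, d))
X_other = rng.normal(size=(200, d))            # everyone else
X = np.vstack([X_target, X_other])
y = np.array([1] * 200 + [0] * 200)

clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]         # the attacker's stolen copy

# Model inversion: gradient ascent on the INPUT to maximize the target's
# score, with a small L2 penalty to keep the reconstruction bounded.
x = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad = (1.0 - p) * w - 0.01 * x            # d/dx [log p(target) - 0.005*|x|^2]
    x += 0.1 * grad

# The reconstruction points in roughly the same direction as the template.
cos = x @ template / (np.linalg.norm(x) * np.linalg.norm(template))
```

For a linear model the reconstruction simply recovers the weight direction; the striking result in the reported work is that the same idea recovers recognizable faces from richer models.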

- More Here
