The worship of big data downplays many issues, some profound. To make sense of all this data, researchers are using a type of artificial intelligence known as neural networks. But no matter their “depth” and sophistication, they merely fit curves to existing data. They can fail in circumstances beyond the range of the data used to train them. All they can, in effect, say is that “based on the people we have seen and treated before, we expect the patient in front of us now to do this”.
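A minimal sketch (not from the article) of the curve-fitting point: a small neural network trained on one region of a function predicts well inside that region but drifts badly outside it. The function, network size, and ranges below are arbitrary illustrative choices, assuming scikit-learn is available.

```python
# Sketch: a small neural network fit to y = sin(x) on x in [0, 6]
# interpolates well but extrapolates poorly beyond the training range.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 6, size=(500, 1))   # the data the model has "seen"
y_train = np.sin(x_train).ravel()

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=0)
net.fit(x_train, y_train)

x_inside = np.array([[3.0]])    # within the training range
x_outside = np.array([[12.0]])  # far beyond it

print("inside :", net.predict(x_inside)[0], "vs true", np.sin(3.0))
print("outside:", net.predict(x_outside)[0], "vs true", np.sin(12.0))
# Inside the range the fit is close; outside it the prediction typically
# bears little relation to sin(x): the network has only fit a curve to
# the data it was shown.
```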
Still, they can be useful. Two decades ago, one of us (Peter) used big data and neural networks to predict the thickening times of complex slurries (semi-liquid mixtures) from infrared spectra of cement powders. But even though this became a commercial offering, it has not brought us one iota closer to understanding what mechanisms are at play, which is what is needed to design new kinds of cement.
The most profound challenge arises because, in biology, big data is actually tiny relative to the complexity of a cell, organ or body. One needs to know which data are important for a particular objective. Physicists understand this only too well. The discovery of the Higgs boson at CERN’s Large Hadron Collider required petabytes of data; nevertheless, physicists used theory to guide their search. Nor do we predict tomorrow’s weather by averaging historic records of that day’s weather: mathematical models do a much better job with the help of daily data from satellites.
Some even dream of minting new physical laws by mining data. But the results to date are limited and unconvincing. As Edward put it: “Does anyone really believe that data mining could produce the general theory of relativity?”
- More Here