Saturday, July 29, 2017

Science Magazine special issue on AI and machine learning

It's very fitting that, as I worked to get this site up and running this month, Science released a special issue on artificial intelligence and machine learning titled 'The cyberscientist'. I think it's wonderful that the computational sciences (especially bioinformatics) are now part of the mainstream scientific discourse, and I take it as a sign that more and more positions will need to be filled by candidates with training in this space.

From the issue:
Big data has met its match. In field after field, the ability to collect data has exploded, overwhelming human insight and analysis. But the computing advances that helped deliver the data have also conjured powerful new tools for making sense of it all. In a revolution that extends across much of science, researchers are unleashing artificial intelligence (AI), often in the form of artificial neural networks, on these mountains of data. Unlike earlier attempts at AI, such “deep learning” systems don’t need to be programmed with a human expert’s knowledge. Instead, they learn on their own, often from large training data sets, until they can see patterns and spot anomalies in data sets far larger and messier than human beings can cope with.
Wait a second, that's a lot of buzzwords for someone not familiar with the field. What's the difference between artificial intelligence, machine learning, and deep learning? In short, artificial intelligence is human-like intelligence exhibited by a machine. Machine learning is a subset of AI in which algorithms parse data, learn from it, and then make some sort of prediction. Deep learning is a type of machine learning that uses neural networks with multiple hidden layers.
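To make the "parse data, learn from it, predict" loop concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid classifier, which I picked only because it fits in a few lines; it is not a method from the Science issue. The two-feature "gene expression" values and the control/case labels are made-up toy data, and the function names are my own.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Learn one mean point (centroid) per class from labeled examples."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    # Average each feature column within a class to get that class's centroid.
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Predict the class whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical toy data: (expression of gene A, expression of gene B).
samples = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["control", "control", "control", "case", "case", "case"]

centroids = fit_centroids(samples, labels)   # the "learning" step
print(predict(centroids, (0.9, 1.1)))        # the "prediction" step
print(predict(centroids, (3.0, 3.0)))
```

The "learning" here is just averaging, but the shape is the same as in far larger systems: fit parameters to training data, then apply them to data the model has not seen. A deep learning model swaps the centroids for many layers of learned weights.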

Before you go down the rabbit hole of neural networks and deep learning, you should ask yourself why you should care as a bioinformatician. Back to the issue in Science for a great example:
For geneticists, autism is a vexing challenge. Inheritance patterns suggest it has a strong genetic component. But variants in scores of genes known to play some role in autism can explain only about 20% of all cases. Finding other variants that might contribute requires looking for clues in data on the 25,000 other human genes and their surrounding DNA—an overwhelming task for human investigators. So computational biologists have enlisted the tools of artificial intelligence (AI), which can ask a trillion questions where scientists can ask only 10. First, these researchers combined hundreds of genomics data sets and used machine learning to build a map of gene interactions. They compared those of the few well-established autism risk genes with those of thousands of other unknown genes and last year flagged another 2500 genes likely to be involved in this disorder. Now they have developed a deep learning tool to find non-coding DNA that may also play a role in autism and other diseases.
Machine learning is an advanced but indispensable tool in the bioinformatician's toolbox. It just so happens that Andrew Ng, a world-renowned expert in machine learning, teaches an online Coursera course called Machine Learning. The course runs continuously, so if you missed the enrollment date for this session, you can sign up for the next one in a few weeks. If you're interested in self-teaching, I highly recommend the introductory text An Introduction to Statistical Learning: with Applications in R.
