Remember how, as a child, you learned to recognize fruits, animals, cars and just about any other object simply by looking at them?
Our brains are trained over the years to recognize these images and then classify them further as apple, orange, banana, cat, dog or horse. Then it gets even more interesting: aside from figuring out what to eat and what to avoid, we learn brands and their differences, such as Toyota, Honda, BMW and so on.
Inspired by these biological processes of the human brain, artificial neural networks (ANNs) were developed. “Deep learning” refers to artificial neural networks that are composed of many layers. It is one of the fastest-growing fields in machine learning. It uses many-layered deep neural networks (DNNs) to learn levels of representation and abstraction that make sense of data such as images, sound, and text.
So what makes it deep?
The name comes from the structure of those ANNs. Four decades ago, neural networks were only two layers deep because it was not computationally feasible to build larger ones. Now it is common to have neural networks with 10+ layers, and networks with 100+ layers are being tried as well.
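To make "depth" concrete, here is a minimal sketch in plain Python of a network as a stack of layers, where each layer feeds its output to the next. It is only an illustration: the helper names `make_network` and `forward`, the layer widths, the random weights and the tanh activation are all assumptions made for this example, and nothing is trained.

```python
import math
import random


def make_network(layer_sizes, seed=0):
    """Build random weights for a fully connected network (illustrative only).

    layer_sizes lists the width of each layer, input first, output last.
    The result holds one (weights, biases) pair per weight layer.
    """
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Pass inputs through every layer in turn and keep each intermediate output."""
    activations = [list(inputs)]
    for weights, biases in layers:
        previous = activations[-1]
        current = [
            math.tanh(sum(w * x for w, x in zip(row, previous)) + b)
            for row, b in zip(weights, biases)
        ]
        activations.append(current)
    return activations


# A "shallow" network with 2 weight layers and a "deep" one with 10.
shallow = make_network([4, 3, 1])
deep = make_network([4] + [8] * 9 + [1])

sample = [0.5, -0.2, 0.1, 0.9]  # made-up input values
shallow_acts = forward(shallow, sample)
deep_acts = forward(deep, sample)

print("shallow depth:", len(shallow))
print("deep depth:", len(deep))
```

The only difference between the two networks is how many layers the input passes through. Each extra layer transforms the output of the one before it, which is what gives a trained deep network room to build up more abstract representations.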
Using multiple levels of neural networks in deep learning, computers now have the capacity to see, learn, and react to complex situations, in some tasks as well as or better than humans.
Data scientists normally spend a lot of time on data preparation: extracting features or selecting the variables that are actually useful for predictive analytics. Deep learning learns many of these features directly from the data, which takes over much of that work and makes life easier.
To spur this development, many technology companies have open-sourced their deep learning libraries, such as Google’s TensorFlow and Facebook’s open source modules for Torch. Amazon released DSSTNE on GitHub, and Microsoft released CNTK, its open source deep learning toolkit, on GitHub as well.
As a result, we now see many examples of deep learning in everyday use, including:
- Google Translate uses deep learning and image recognition to translate not only speech but written text as well.
- With the CamFind app, you take a picture of any object and it uses mobile visual search technology to tell you what it is, with no typing necessary.
- Digital assistants such as Siri, Cortana, Alexa and Google Now use deep learning for natural language processing and speech recognition.
- Amazon, Netflix and Spotify use deep learning in their recommendation engines to suggest the next best offers, movies or music.
- Google PlaNet can look at a photo and tell where it was taken.
- DCGAN is used for enhancing and completing images of human faces.
- DeepStereo turns images from Street View into a 3D space that shows unseen views from different angles by figuring out the depth and color of each pixel.
- DeepMind’s WaveNet can generate speech that mimics a human voice and sounds more natural than the best existing text-to-speech systems.
- PayPal uses deep learning to prevent fraud in payments.
So far, deep learning has aided image classification, language translation and speech recognition. In principle it can be applied to many other pattern recognition problems, and it does so with little human intervention in choosing the features.
It is a disruptive digital technology that more and more companies are using to create new business models.