Artificial Intelligence: From Science Fiction to Alexa

For science fiction fans like me, Artificial Intelligence (AI) has always fired up the imagination. As a field of study, AI has been a part of academia since the mid-1950s.

Since then, AI has been hyped as the key to our civilization’s future, and panned as nothing more than entertainment for nerds.

Over the past few years, though, AI has started to gain real traction. Much of this has to do with the availability of cheaper, faster and more powerful computing, the emergence of the Internet of Things, and the explosion of data generated as images, text, messages, documents, transactions, maps and more.

Many companies are aggressively adopting AI, for instance, to free up highly skilled workers from routine, repetitive, low-skill tasks. In fact, International Data Corporation forecasts that spending on AI and machine learning will grow from $8B in 2016 to $47B by 2020.

All you have to do is look around to see the mainstreaming of AI applications, including broader technologies in speech and natural language, as well as industrial automation and robotics. And the use cases are numerous, including:

  • Financial services.
  • Customer service.
  • Healthcare.
  • Drug discovery.
  • Genomics.

Most recently, AI has begun to make inroads into knowledge-based work and creative areas like visual arts, music, images, and video and movie script generation.

In general terms, many of these AI applications fall into the category of “Narrow AI”: technologies that perform certain tasks as well as, or sometimes better than, humans, such as image classification on Pinterest and face recognition on Facebook.

These technologies expose some facets of human intelligence (that I will outline in a minute). But how? Where does that intelligence come from?

Back to the Basics – What is Artificial Intelligence?

John McCarthy, one of the fathers of AI, defined AI as

“the science and engineering of making intelligent machines, especially intelligent computer programs”.

In other words, a way of making hardware, or the software running on it, think intelligently, in a manner similar to the way humans think.
So, it all comes down to understanding how our brain works: how we learn, decide, and act while trying to solve a problem, and then using those insights to develop “intelligent” software and systems.

What is Intelligence?

For our purposes, we can use the definition of intelligence crafted in the Artificial Intelligence – Intelligent Systems article:

“The ability of a system to calculate, reason, perceive relationships and analogies, learn from experience, store and retrieve information from memory, solve problems, comprehend complex ideas, use natural language fluently, classify, generalize, and adapt to new situations”.

So, when we talk about intelligence in a very simplified form, we are actually involving several very complex functions:

  • Learning
  • Reasoning
  • Problem Solving
  • Perception
  • Linguistic Intelligence

That’s a lot. Today we tend to call a system “artificially intelligent” if it handles at least one of those “intelligences”. The more, the better, of course.

Learning is of special interest because we want to give these systems the capability to learn, on their own.

Machine Learning

Machine Learning, in its most basic form, is about designing “models.” Models are composed of algorithms that use data, learn from it, and produce an “inference” or prediction about something.

So, rather than coding software functions to accomplish a particular task, the intelligent machine is “trained” on data and models that give it the ability to learn how to deliver an inference, a determination, a prediction.

One of the most polished applications of machine learning to date is image detection and classification. Back in the day, this required a great deal of hand-coding to work. Programmers would write classifiers which, you guessed it, helped to classify images!

For example, they used edge detection code to identify where an object started and ended; shape detection code to determine if the object had a certain number of sides or segments; and other classifiers to recognize letters, numbers or symbols.

With the help of all those classifiers, programmers would write algorithms to try to make sense of an image and determine “this object is a STOP sign”. The outcome was good, but not great. An example of a not-so-great outcome? On a foggy day when the “STOP” sign isn’t clearly visible, or a tree branch covers part of it, the system would not produce the expected result. How scary is that?

But all of that was about to radically change.

Deep Learning – Neurons in layers

Neural Networks are based on our interpretation of the connections between neurons in our brains. While real neurons can connect to any other neuron close enough in any direction, artificial neural networks have specific layers, connections, and directions in which data travels.

In a neural network, data is fed into the first layer of neurons, and each individual neuron passes its output to a second layer. The second layer does its work, and so on, until the last layer produces the final output. Each neuron in the network assigns a “weight” to each of its inputs, indicating how important that input is to the task being performed. The final result is then determined by the aggregate of those weightings: the neural network comes up with a “probability vector”, an educated guess, based on the weightings.
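The layered flow just described can be sketched in a few lines of pure Python. The weights below are made up for illustration; a real network would learn them during training.

```python
import math

def sigmoid(z):
    # squashes any number into (0, 1), so outputs read like probabilities
    return 1 / (1 + math.exp(-z))

def layer(inputs, weights_per_neuron):
    # each neuron weights its inputs, sums them, and applies an activation
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
            for ws in weights_per_neuron]

# hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron
hidden_weights = [[0.5, -0.6], [0.8, 0.2]]
output_weights = [[1.0, -1.0]]

x = [1.0, 0.0]                       # data fed into the first layer
hidden = layer(x, hidden_weights)    # first layer passes its outputs on
out = layer(hidden, output_weights)  # last layer produces the final guess
print(out)                           # a single value between 0 and 1
```

Training, covered next, is the process of adjusting those weight lists until the final output is right most of the time.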

The network is “trained” as it comes up with right and wrong answers. In our “STOP” sign example above, it needs to see thousands or millions of images until the weightings of the neuron inputs are tuned finely enough that it produces the right answer most of the time, no matter if it’s raining or foggy. At that point, the neural network has learned, taught itself, what a STOP sign should look like; or how to recognize someone’s face on Facebook, or a dog, a fish or a cat, which is what Andrew Ng did at Google.

Ng made his neural networks really big, really “deep”, increasing the number of neurons and layers and running huge volumes of data (more than 10 million images from YouTube videos) through the system to train them. Hence, “Deep Learning.” The result? Image recognition that is, in some cases, as good as, if not better than, a human’s.

But working with images is just the tip of the iceberg. Deep Learning has enabled many other practical applications, making all kinds of “smart things” possible, including driverless cars, better preventive healthcare, movie recommendations, expert systems, chatbots and much more.

AI runs on data. Lots of data

It’s all about the data. AI’s Deep Learning power comes from its ability to learn patterns from large amounts of data. This makes understanding data sets critical.

Most major AI advances in recent times have been pushed by the availability of increased processing power, modeling and algorithm enhancements and huge improvements in data sets. We have access to massive amounts of data. That’s the good news.

The bad news? In the context of specific AI use cases and applications, there is good data and there is bad data (and in case you’re wondering, yes, there’s even “fake” data…). The thing is, collecting, classifying and labeling the datasets used to train the models is proving to be a very difficult and expensive part of the AI journey.

There are tools like Facets, an open source data visualization tool, that help you understand your machine learning datasets, their characteristics and the interactions between the different data elements.

Frequently, companies start their AI projects without realizing the importance of good data sets, which is not the same as having a lot of data. This can become a difficult problem to solve once you are deep into the project, and there are plenty of examples of failures due to poor data, even from the big-time players.

Getting your hands dirty

You can divide the typical AI process into a sequence of steps:

  1. data collection
  2. data preparation
  3. choosing the model
  4. training the model
  5. evaluation
  6. parameter tuning
  7. inference or prediction
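The seven steps above can be walked through end to end on toy data. Everything here is illustrative: a one-weight linear model standing in for “choosing the model”, and two candidate learning rates standing in for “parameter tuning”.

```python
import random

# 1. data collection: toy samples following the hidden rule y = 2x
data = [(x, 2 * x) for x in range(1, 21)]

# 2. data preparation: shuffle, then split into train and test sets
random.seed(0)
random.shuffle(data)
train, test = data[:15], data[15:]

# 3. choosing the model: y = w * x, a single learnable weight

def train_model(samples, lr):
    # 4. training: gradient descent on the squared error
    w = 0.0
    for _ in range(500):
        for x, y in samples:
            w -= lr * (w * x - y) * x
    return w

def evaluate(w, samples):
    # 5. evaluation: mean absolute error on held-out data
    return sum(abs(w * x - y) for x, y in samples) / len(samples)

# 6. parameter tuning: try two learning rates, keep the better model
best_w = min((train_model(train, lr) for lr in (0.001, 0.0025)),
             key=lambda w: evaluate(w, test))

# 7. inference: predict the label for a new, unseen input
print(round(best_w * 30, 1))  # about 60.0
```

Real projects swap in real data, richer models and proper tuning, but the sequence of steps stays the same.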

There’s a great video from Yufeng G describing this process as a series of 7 steps in simple terms. Watch it. His article on Machine Learning (and the included video) is also fun to read (and watch).

Tools and Frameworks

TensorFlow is probably the most popular deep learning framework today. It is an open source library originally developed by the Google Brain team, and the TensorFlow team has created a large number of models, many of which include trained model weights. Some great examples are the Object Detection API and tf-seq2seq for translation services. The TensorFlow Image Recognition Tutorial and TensorFlow Playground are good places to start pulling AI strings.

There are other powerful frameworks, of course, like Caffe, PyTorch, and BigDL. They keep getting better and are quickly gaining users and use cases.

New Deep Learning tools are being developed to simplify neural network architecture definition and parameter tuning, distributed training, and model deployment and management.

Also, simulators such as digital twins allow developers to accelerate the development of AI systems, along with the reinforcement learning libraries that integrate with them (check out the RL library that’s part of RISE Lab’s Ray – it’s an awesome example).

Machine learning tasks

In general terms, we can catalog the primary machine learning tasks into four groups:

  • Supervised learning: used to infer a function from labeled data. The goal is to find the best model parameters to accurately predict unknown labels on other objects. If the label is a number, the task is called regression. If the label comes from a relatively small, defined set of values, then it’s a classification task.
  • Unsupervised learning: used when we have limited information or labels about the objects. In this case, we put objects together in clusters by finding similarities among them.
  • Semi-supervised learning: used when we have both labeled and unlabeled objects. This approach can improve accuracy significantly, because of the mix of training data.
  • Reinforcement learning: this is what we would call “learning by adapting”. Think of it as a game where you get points for every move. You make your first move trying to maximize the points from that move. The move changes the environment, so you “adapt to it” and make your next move, again trying to maximize points. You repeat this until you have reached a particular point objective. This task goes pretty far down the rabbit hole. If you want to learn more, go ahead and take the red pill and click on the link – don’t say I didn’t warn you!
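The “move, get points, adapt” loop from reinforcement learning can be sketched with a toy two-move game. The payouts and the simple value-update rule are made up for illustration; real RL libraries are far more sophisticated.

```python
# A toy agent repeatedly picks one of two moves, collects points, and
# adapts its estimate of each move's value toward what it observes.

payout = {"left": 1.0, "right": 3.0}  # points per move, hidden from the agent
value = {"left": 5.0, "right": 5.0}   # optimistic starting estimates,
                                      # so both moves get tried

for step in range(200):
    move = max(value, key=value.get)   # pick the move that looks best
    reward = payout[move]              # get points for that move
    # adapt: nudge the estimate 10% of the way toward the observed reward
    value[move] += 0.1 * (reward - value[move])

print(max(value, key=value.get))  # prints "right", the higher-paying move
```

After enough rounds, the agent’s estimates settle near the true payouts, and it keeps choosing the better move; that feedback loop is the core of reinforcement learning.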


Over the years, different algorithms have been developed to address different types of use cases; they include decision tree learning, inductive logic programming, reinforcement learning, Bayesian networks and clustering, among many others.

Some of the most commonly used algorithms are:

  • Linear regression and linear classifier: simple and very useful in many cases where more complex algorithms might be overkill.
  • Logistic regression: the simplest non-linear classifier, combining a linear combination of the parameters with the nonlinear sigmoid function.
  • Decision tree: this algorithmic approach is similar in principle to how we humans make decisions.
  • K-means: a very easy-to-understand algorithm, originally derived from signal processing, that divides a number of objects into clusters where each object belongs to the cluster with the nearest mean.
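To make the last item concrete, here is a bare-bones k-means on one-dimensional data, written as a sketch in plain Python rather than a production implementation: assign each point to the nearest mean, recompute the means, repeat.

```python
import statistics

def kmeans(points, means, iterations=10):
    """Toy 1-D k-means: alternate assignment and mean-update steps."""
    for _ in range(iterations):
        # assignment step: each point joins the cluster with the nearest mean
        clusters = {m: [] for m in means}
        for p in points:
            nearest = min(means, key=lambda m: abs(p - m))
            clusters[nearest].append(p)
        # update step: each mean moves to the center of its cluster
        means = [statistics.mean(c) if c else m
                 for m, c in clusters.items()]
    return sorted(means)

data = [1.0, 1.2, 0.8, 10.0, 10.5, 9.5]
print(kmeans(data, means=[0.0, 5.0]))  # prints [1.0, 10.0]
```

Even from poor starting guesses, the two means migrate to the two obvious groups in the data, which is exactly the “nearest mean” behavior the bullet describes.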

For those interested in coding, Machine Learning Algorithms and Sample Code and AI Tools, Platforms, Libraries are good reads.

Applications of AI

AI is increasingly being integrated into business processes across a number of different areas. To name just a few:

  • Sales and CRM Applications
  • Payments and Payment Services
  • Logistics and Delivery
  • Customer Recommendations
  • Manufacturing

One example: The North Face uses IBM Watson to help consumers determine which outfit is best for them, based on location, gender and activity preferences. Great if you are thinking about hiking in northern Norway in November.

PayPal has leveraged fraud detection models to protect customers’ digital transactions. Using Deep Learning for transaction security, PayPal has reduced its fraud rate to 0.32%, about 1% lower than the average industry rate.

AI is gaining momentum almost everywhere, but a few areas are worth highlighting:

  • Vision. These systems “see”, understand and interpret visual input. Helping to diagnose a patient and recognizing the face of a “person of interest” are great examples.
  • Speech Recognition. Hearing and comprehending natural language, dealing with accents, noise, inflection, etc. Think about automated translation, voice to text conversion, voice input for devices.
  • Handwriting Recognition. Reading handwritten text on paper or a screen and converting it into editable text.
  • Expert Systems. Applications integrating AI and data providing explanation and advice to end users.

What’s next

There’s no question AI has a lot of momentum right now, in good part due to the confluence of enablers like tremendous, inexpensive processing power, access to huge amounts of data, virtually unlimited storage and progress made in models and algorithms.

Companies in all industries are rapidly finding use cases where AI can be successfully applied.

In the short term, though, it seems the applications and use cases with the highest rate of success and adoption are those that have a direct, measurable impact or return on investment. “Improving customer engagement” is great, but it is a softer benefit than “reducing lost packages by 5-10%” or “increasing upsell probability by 8%”.

Although there’s still a long way to go to achieve science-fiction-caliber AI, it’s evident that significant progress is being made and that real benefits are already being delivered to businesses and consumers alike.

Sorry, I have to go. Alexa is calling me.

In the meantime, if you are or want to be an early AI adopter, check out this on-demand webinar:

How to Help Your Business Become an AI Early Adopter

Happy Deep Learning.