The Two Ideas of AI
AI has attracted a lot of hype in recent years, along with a good deal of unnecessary fear. Both the overhype and the AI-related technophobia often stem from a common misconception about what AI actually is.
It’s important to understand that there are two different concepts (or ideas) surrounding AI:
1. Artificial Narrow Intelligence (ANI)
This type of Artificial Intelligence is focused on one narrow (specific) task, and one only. ANI systems are programmed to do, and in the case of machine learning to learn about, one thing.
For example, suppose an AI program is developed to predict the price of Bitcoin based on historical data, transaction volume, social media mentions, and so on. That same AI won’t be able to predict the price of gold.
Nowadays, we have many different applications of ANI, from social media analytics tools to various predictive and automation tools. We can also expect new inventions and implementations that will bring new value and benefits, and probably open up new fields in our society.
All the AI innovations and implementations we’ve heard about in the news in recent years are about ANI; so far, we have only made progress in ANI. Since an ANI system can only do a single task, it won’t catch up to human intelligence anytime soon.
2. Artificial General Intelligence (AGI)
As opposed to ANI, Artificial General Intelligence should be able to do many different tasks, just as a human could. AGI is the end goal of AI: a machine that can replace, or even be smarter than, the average human.
AGI is often what triggers fear in many people: the fear that AI will someday be smarter than humans and steal our jobs.
It’s important to know that AGI has not been realized yet, and we still need many breakthroughs before we can truly achieve it.
Important Terms Surrounding AI
Artificial Intelligence by itself is a very deep and complex subject with a lot of technical terms that can confuse anyone.
Here, we will discuss some of the most important terms around Artificial Intelligence, and the integral concepts that might be related to these terms.
Some of these terms are often used interchangeably, so here are their most common definitions:
Artificial Intelligence, in its purest definition, is a blanket term referring to an area of computer engineering focusing on the creation of intelligent machines that can think, work, react, and learn like humans—in short, possessing intelligence.
It’s important to understand that most people mean Artificial General Intelligence (AGI) when they use the term ‘AI’, as discussed above, and will use ‘narrow AI’ or ‘ANI’ when they specifically refer to Artificial Narrow Intelligence.
Machine learning is an implementation of Artificial Narrow Intelligence, and has developed as a subfield of AI.
The main concept of machine learning is that the AI can ‘learn’ by analyzing the data it gathers and adjusting its own behavior accordingly. This eliminates the need to reprogram the software every time the variables change.
For example, AI software designed to predict the trading price of Bitcoin can train itself on historical data and, for instance, adapt when a new cryptocurrency is launched. A machine learning AI can adjust itself as new variables are introduced and existing variables change, with minimal to no intervention from the programmer or user.
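The idea that a machine learning system adapts to new data without being reprogrammed can be illustrated with a minimal Python sketch. It fits a straight line to a short price history and then fits it again when new data arrives. The function names and the numbers are assumptions made up for illustration, and a straight-line fit is only a stand-in for whatever model a real price predictor would use.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned parameters to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical (day, price) history; these numbers are invented.
days = [1, 2, 3, 4]
prices = [10.0, 12.0, 14.0, 16.0]
model = fit_line(days, prices)

# New observations arrive. The same function is called again:
# only the learned parameters change, not the program.
days_updated = days + [5, 6]
prices_updated = prices + [20.0, 24.0]
updated_model = fit_line(days_updated, prices_updated)

print("before:", model)
print("after: ", updated_model)
```

The code that does the learning is identical in both calls. What differs is the data, and therefore the slope and intercept the program ends up with.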
If machine learning is a subfield of AI, then deep learning is a subfield of machine learning. The invention and implementation of deep learning is actually what caused all the hype surrounding AI (or more accurately, ANI) in the past few years.
The main concept of deep learning is still similar to machine learning: the AI can teach itself using labeled data.
However, deep learning (also called hierarchical learning) uses artificial neural networks (ANNs, discussed below) in contrast to the simpler algorithms of traditional machine learning.
Deep learning allows AI to learn more “naturally”, as humans do, and can now even outperform humans in certain tasks (keep in mind that, as ANI, it can still only do one specialized task).
Deep learning needs much more computational power, and highly skilled engineers to supervise it, but it can process much more data than traditional machine learning.
Artificial Neural Network
Artificial Neural Networks (or ANNs) were originally inspired by the human brain and nervous system (hence the name), but in reality, the way they work today is completely different from how the biological neural system works.
It’s important to note that the term Artificial Neural Network (or just Neural Network) is often used interchangeably with Deep Learning.
The key concept in an ANN is the artificial neuron. Neurons can transmit signals to each other: a neuron that receives data will process it and then signal the neurons connected to it.
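As a rough illustration of a neuron that receives data, processes it, and signals the next neuron, here is a minimal Python sketch. It assumes the most common textbook form of an artificial neuron (a weighted sum followed by a sigmoid activation); the weights, biases, and function names are arbitrary values chosen for the example, not learned ones.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Process incoming signals: weighted sum, then activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def tiny_network(inputs):
    """Two neurons receive the raw inputs and signal a third neuron."""
    hidden_1 = neuron(inputs, [0.5, -0.5], 0.0)
    hidden_2 = neuron(inputs, [-1.0, 1.0], 0.0)
    # The output neuron only sees the signals sent by the two neurons above.
    return neuron([hidden_1, hidden_2], [1.0, 1.0], -1.0)


print(tiny_network([0.2, 0.9]))
```

In a real network the weights and biases are not written by hand as they are here. They are adjusted automatically during training, which is the 'learning' part.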
While an Artificial Neural Network can be much more powerful than a traditional algorithm, it’s not always better. For instance, it’s more expensive, harder and more time-consuming to build, and harder to control.
Cognitive computing is another blanket term describing many different technologies based on AI and signal processing, including (but not limited to) machine learning, human-computer interaction, natural language processing, speech recognition, and data mining.
When people refer to “cognitive computing”, they often use the term interchangeably with AGI (or, incorrectly, with just ‘AI’) to describe any technology that mimics how the human brain functions to solve problems.
So, cognitive computing can be defined as a field pursuing artificial modelling of the human brain and neural system, much as artificial neural networks did before they evolved into their current state, which is focused on doing a single task.
Data Science, as the name suggests, is the science of deriving value from data: analyzing it to gain insights and knowledge. Computational statistical techniques are the main analysis approach, but other methods, such as data visualization or regression, are also used.
Depending on who you ask, some people will say AI is an aspect of data science, and some will say it’s the other way around. Here, however, we define data science as a field of science that uses many applications and tools from AI, especially machine learning and deep learning. It’s important to note that data science can also use tools and technologies from outside AI.
More About Machine Learning
Since machine learning is such an important concept for the current state of AI, arguably even its core technology today, it deserves a special discussion.
Machine learning, in a nutshell, uses computational statistical methods to gradually improve itself (its performance and its ability to understand data, among others) without being reprogrammed by a human programmer. In other words, it ‘learns’, hence the name.
Machine learning technology is the tool that has enabled most of the benefits we have from AI today, and can be divided into two main types of tasks:
- Supervised Learning
In supervised learning, both the input and output variables are given, and the main goal is to learn the function (relationship) between them so that the output can be predicted when new input data is given. Common examples of supervised learning tasks are classification and regression.
- Unsupervised Learning
As opposed to supervised learning, here only the input data is given, and the learning goal is to find hidden patterns, structure, and possible relationships in that data. Common examples of unsupervised learning tasks are clustering and association.
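To make the unsupervised case more concrete, here is a minimal Python sketch of clustering. It assumes a simplified one-dimensional version of the k-means method, and the order values and starting centers are invented for illustration. No labels are supplied: the program is only given the numbers and has to find the groups itself.

```python
def k_means_1d(values, initial_centers, iterations=10):
    """Group numbers by repeatedly assigning each one to its nearest
    center and then moving every center to the mean of its group."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical order values with no labels attached.
order_values = [12, 15, 14, 95, 102, 99]
centers, clusters = k_means_1d(order_values, initial_centers=[0, 50])
print(centers)
print(clusters)
```

The program is never told that there are 'small' and 'large' orders. The two groups emerge from the structure of the input data alone, which is the defining feature of unsupervised learning.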
Supervised learning is currently responsible for roughly 80% of all machine learning applications.
For example, let’s say we are going to train AI software to predict whether a user will click the ‘buy now’ button. Information about the user is the input variable, and whether they click is the output variable. Here, we feed the AI specific information about users (for example, referral source) along with the outcome. We give the AI thousands of these examples, and once it has ‘learned’ the pattern, we give it a completely new user’s information as input. The AI can now predict whether that user will make the purchase or not.
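The buy-button example can be sketched in a few lines of Python. This is a deliberately simple stand-in, not a real machine learning model: it assumes the only input is the referral source and 'learns' nothing more than the purchase rate observed for each source. The function names and the training examples are hypothetical.

```python
from collections import defaultdict


def train_purchase_model(examples):
    """Learn the observed purchase rate for each referral source.

    `examples` is a list of (referral_source, bought) pairs, i.e. an
    input variable together with its known output variable.
    """
    counts = defaultdict(lambda: [0, 0])  # source -> [purchases, total]
    for source, bought in examples:
        counts[source][0] += int(bought)
        counts[source][1] += 1
    return {source: p / t for source, (p, t) in counts.items()}


def predict_purchase(model, source, default_rate=0.0):
    """Predict a purchase if the learned rate for this source exceeds 50%."""
    return model.get(source, default_rate) > 0.5


# Hypothetical labeled examples: (input, output).
training_examples = [
    ("email", True), ("email", True), ("email", False),
    ("ads", False), ("ads", False), ("ads", True),
]
purchase_model = train_purchase_model(training_examples)

# A completely new user arrives; only the input is known.
print(predict_purchase(purchase_model, "email"))
print(predict_purchase(purchase_model, "ads"))
```

A practical system would use many input variables and a proper learning algorithm, but the shape is the same: labeled examples go in during training, and afterwards a new input produces a predicted output.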
Data In AI Applications
Artificial Intelligence cannot be separated from data, which serves as both its input and its output.
Data can come from many different sources: sensors (e.g. temperature data from a thermostat), images, audio, numbers in spreadsheets, and so on.
Data can be categorized into two different types: structured and unstructured.
Structured data, simply put, is data that is already stored in a structured format (mainly a database or spreadsheet). Because the data is already defined and structured, the software (in this case, the AI) can jump to the analysis right away.
As opposed to structured data, unstructured data refers to any type of data that is not (yet) structured with a predefined schema. The most common forms of unstructured data are images, videos, documents, audio, and so on. With unstructured data, the software must first recognize (define) the data before it can analyze it.
With regard to AI, data usually means labeled data. For example, we might have a dataset of 10,000 videos in which each video is “labeled” as either ‘positive review video’ or ‘negative review video’. Another example is customer data, where we might assign different labels for ‘under 25’ and ‘above 25’.
How To Label The Data
Data labeling is a very important aspect of AI. You can get various datasets (sets of labeled data) online; some are free and some might cost you a lot. However, in most cases we have to create our own dataset for the specific problem or application we are going to use the AI for. There are three main approaches to getting labeled data:
- Manual Approach
This approach is used when the data is fairly easy for humans to identify. For example, we have two different sets of photos: photos of cats and dogs, or of children and adults. We can assign a label to every photo ourselves, or hire people to do the job.
- Observation Approach
Here, we observe data coming from a sensor, analytics tool, or any other source. For example, we observe website visitor data from Google Analytics and define the variables used to label it: say, label 1 for visitors who click and label 2 for those who don’t. Another example is temperature data coming from thermostats, which we can label as, for example, label 1 for temperatures above a certain threshold and label 2 for those below.
- Get a Ready Data Set
You can use Google’s Dataset Search tool to find datasets that are ready to use. Some are free, but for popular ones you might need to spend some money. There are also platforms like Kaggle that offer downloadable datasets.
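Labeling observed data by rule, as in the thermostat example above, can be sketched in Python as follows. The readings and the threshold of 22.0 are invented for illustration, and the choice to give a reading exactly at the threshold label 2 is an assumption, since the text does not say which side it belongs to.

```python
def label_reading(temperature, threshold):
    """Return label 1 for readings above the threshold, label 2 otherwise.

    Assumption: a reading exactly equal to the threshold gets label 2.
    """
    return 1 if temperature > threshold else 2


def build_dataset(readings, threshold):
    """Pair every raw reading with its label to form a labeled dataset."""
    return [(t, label_reading(t, threshold)) for t in readings]


# Hypothetical thermostat readings in degrees Celsius.
readings = [18.5, 21.0, 25.3, 30.1]
dataset = build_dataset(readings, threshold=22.0)
print(dataset)
```

The same pattern applies to the website visitor example: replace the temperature comparison with a check on whether the visitor clicked.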
Challenges in Using Data for AI Applications
A lot of things can go wrong when acquiring data for an AI application, and at the same time, the AI’s performance is limited by its input data. If the quality of the input data is bad, you will also get bad results.
Here are some of the common challenges in acquiring and using data:
1. Filtering Irrelevant Data
Filtering out irrelevant data is one of the biggest challenges in AI. It is a common misconception that you can just hand the AI engineers a huge dataset accumulated over the years and assume the AI can always extract value from it. It’s very important to first filter out irrelevant data and create an optimal, high-quality dataset.
There is also the possibility that the data you possess is not useful at all. This is why it’s better to show the data to an AI expert as soon as possible, rather than continuing to accumulate it. The AI engineer can tell you which parts are useful and how to approach the next steps. This way, you avoid the risk of spending years accumulating totally useless data.
2. Incorrect Labeling
Incorrect labeling is a very common challenge, as it prevents the AI algorithm from properly learning about the different variables. A proper labeling process and re-checking the dataset for correct labels are both necessary.
Fortunately, the bigger the dataset, the less a given number of incorrect labels matters. For example, if we have 100 photos as data, 5 incorrect labels will be significant. However, if we have over 5 million photos, 5 incorrect labels won’t noticeably affect the AI’s learning performance.
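The arithmetic behind the photo example is simple enough to check directly. The short Python sketch below computes the share of incorrect labels in both cases; the function name is made up for illustration, and 5 million is used as a round figure for "over 5 million".

```python
def label_error_rate(incorrect, total):
    """Fraction of the dataset that carries a wrong label."""
    return incorrect / total


small_dataset_rate = label_error_rate(5, 100)
large_dataset_rate = label_error_rate(5, 5_000_000)

print(f"{small_dataset_rate:.4%}")  # 5 wrong labels out of 100 photos
print(f"{large_dataset_rate:.4%}")  # 5 wrong labels out of 5 million photos
```

The sketch only compares the share of wrong labels (5% versus 0.0001%). It does not measure the effect on a trained model, and in practice a larger dataset may also contain proportionally more labeling mistakes, so re-checking labels remains worthwhile.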
While AI is certainly a deep and complex field, it’s not impossible to learn, even for those of us without any technical background. The key is to first understand the high-level concepts surrounding AI and the important terms.