When I introduce myself as a data scientist, I often get asked “What is the difference between data science and machine learning?” or “Does that mean you work in artificial intelligence (AI)?” I have answered these questions often enough that I decided to write this article to explain the three terms: data science, machine learning and AI.
The fields overlap. Most experts have an intuitive understanding of how a specific piece of work would be classified as data science, machine learning or artificial intelligence, even if it is difficult to put into words. In this article I do not go deep into formal definitions of data science, machine learning or AI. I am only interested in how practitioners use the terms and in the kinds of work done in each of the three areas. So let me give three very short definitions:
Data science: creates value from data
Machine learning: learns from data
Artificial intelligence: takes action based on data
I should add that these definitions cannot capture everything about the fields in question, nor can they be relied on to define roles or job titles precisely. They are, however, a useful way to distinguish three different types of work.
1/ Data Science analyzes data to find solutions to complex problems
Data science is distinguished from the other two fields because its goal is a human one: to gain insight and understanding from data. Jeff Leek has a great definition of the types of insight data science can achieve, including descriptive (e.g. the average customer has a 70% chance of renewing), exploratory (different sales staff have different renewal rates) and causal (a randomized experiment shows that customers assigned to Alice are more likely to close orders than customers assigned to Bob).
Again, not everything that generates value from data qualifies as data science (the classic definition of data science involves a combination of statistics, software engineering and domain expertise). But we can use this definition to differentiate it from machine learning and AI. The main difference is that in data science there is always a human in the loop: someone is understanding the insight hidden in the data, looking at the figures, or benefiting from the conclusions of the analysis. It would not be right to say “Our chess algorithm uses data science to calculate its next move” or “Google Maps uses data science to recommend directions”.
The definition of Data Science here emphasizes:
Statistical inference
Data scientists may use simple tools when analyzing data: they can report percentages and draw line charts based on SQL queries. They may also use very complex methods, such as working with distributed data stores to analyze trillions of records, developing advanced statistical techniques, and building interactive visualizations. Whatever they use, the main goal is to understand their data better.
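The simple end of that range, reporting percentages, can be sketched in a few lines. This is only an illustration: the records, the names `renewal_rate` and `rate_by_salesperson`, and the numbers are all made up for the example, and a real analysis would more likely start from a SQL query.

```python
from collections import defaultdict

# Hypothetical records: (salesperson, renewed). Invented for illustration only.
records = [
    ("Alice", True), ("Alice", True), ("Alice", True), ("Alice", False),
    ("Bob", True), ("Bob", False), ("Bob", False), ("Bob", True),
    ("Alice", True), ("Bob", True),
]

def renewal_rate(rows):
    """Share of customers in `rows` who renewed."""
    rows = list(rows)
    return sum(renewed for _, renewed in rows) / len(rows)

def rate_by_salesperson(rows):
    """Renewal rate for each salesperson, returned as a dict."""
    groups = defaultdict(list)
    for person, renewed in rows:
        groups[person].append((person, renewed))
    return {person: renewal_rate(group) for person, group in groups.items()}

overall = renewal_rate(records)            # a descriptive insight
per_person = rate_by_salesperson(records)  # an exploratory insight

print(f"Overall renewal rate: {overall:.0%}")
for person, rate in sorted(per_person.items()):
    print(f"{person}: {rate:.0%}")
```

The output is a handful of numbers for a person to read and interpret, which is the point: nothing here predicts or acts on its own.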
2/ Machine learning generates predictions
I think of machine learning as the field of prediction: “given instance X with particular features, predict Y about it”. These predictions may be about the future (“predict whether this patient will develop an infection”), but they can also be about qualities that are not immediately apparent to a computer (“predict whether this image has a bird in it”). Almost all Kaggle competitions are machine learning problems: they provide some training data and then see whether competitors can make accurate predictions about new examples.
There is a lot of overlap between data science and machine learning. For example, logistic regression can be used to draw insights about relationships (“the richer the user, the more likely they are to buy our product, so we should change our marketing strategy”) and to make predictions (“this customer has an X% chance of buying the product, so we should recommend it to them”).
Most practitioners who understand both concepts switch easily between the two tasks. I use both machine learning and data science in my own work: I might fit a model on Stack Overflow traffic data to determine which users are likely to be looking for a job (machine learning), and then build summaries and visualizations to examine why the model works (data science). This is an important way to detect flaws in your model and to combat algorithmic bias. It is one reason that data scientists are often responsible for developing the machine learning components of a product.
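A minimal sketch of one model serving both purposes, assuming logistic regression as the model and a tiny invented data set. The function names `fit_logistic` and `predict_proba`, the learning rate and the income figures are illustrative assumptions, not taken from any real project.

```python
import math

# Hypothetical data: (income in tens of thousands, bought the product?).
# Invented for illustration only.
data = [(2, 0), (3, 0), (4, 0), (5, 1), (6, 0), (7, 1), (8, 1), (9, 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, lr=0.1, epochs=5000):
    """Fit p(buy) = sigmoid(b0 + b1 * income) by batch gradient descent."""
    mean_x = sum(x for x, _ in rows) / len(rows)  # centre x for stable training
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in rows:
            err = sigmoid(b0 + b1 * (x - mean_x)) - y
            g0 += err
            g1 += err * (x - mean_x)
        b0 -= lr * g0 / len(rows)
        b1 -= lr * g1 / len(rows)
    return b0 - b1 * mean_x, b1  # convert back to the uncentred intercept

def predict_proba(model, income):
    """Predicted probability that a user with this income buys."""
    b0, b1 = model
    return sigmoid(b0 + b1 * income)

model = fit_logistic(data)
intercept, slope = model

# Insight (data science): the sign of the slope describes the relationship.
print("Higher income raises purchase odds:", slope > 0)

# Prediction (machine learning): score an income not seen in training.
print(f"P(buy | income=10): {predict_proba(model, 10):.2f}")
```

The same fitted coefficients feed both uses. Reading the slope is something a person does to understand the data, while calling `predict_proba` on a new user is something a product can do without anyone looking.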
3/ Artificial intelligence creates action
AI is the intelligence exhibited by any artificial system. It is the future, the stuff of science fiction, and already a part of our daily lives. The term is by far the oldest and most widely recognized of the three, especially in the programming world, and as a result it is the hardest to define precisely.
A common thread in definitions of “artificial intelligence” is an autonomous agent that performs or recommends actions (e.g. Poole, Mackworth and Goebel 1998, Russell and Norvig 2003). Some of the systems I think should be mentioned when describing AI:
Game-playing algorithms (Deep Blue, AlphaGo)
Robotics and control theory (motion planning, walking a bipedal robot)
Optimization (Google Maps choosing a route)
Natural language processing (bots)
Reinforcement learning
Once again, we can see the overlap between AI, machine learning and data science when building a product. Deep learning has been a particularly active topic in AI discussions. It is a family of algorithms, loosely inspired by the human brain, that learn multiple layers of representation. It has driven progress in areas as diverse as perception, machine translation and speech recognition.
But there are also differences. If I analyze some sales data and discover that customers from specific industries are more likely to close contracts than other customers, the output is a number and a graph rather than a specific action. Managers can use those conclusions to change the sales strategy, but that action is not taken automatically. This means I would describe that work as data science.
4/ Case study: how are all three fields used in one project?
Let’s say we are building a self-driving car and are working on the specific problem of stopping at stop signs. We will need skills from all three areas:
+ Machine Learning: The car must recognize a stop sign using its camera. We build a data set of millions of street photos and train an algorithm to predict which ones contain stop signs.
+ AI: Once the car can recognize stop signs, it needs to decide when to apply the brakes. Braking too early or too late is dangerous, and we need the AI to handle different road conditions (for example, on a steep, slippery road it should slow down sooner). This is a problem of control theory.
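The braking decision can be sketched with a deliberately simplified rule. This is an assumption-heavy illustration, not how a real vehicle controller works: it assumes a level road (the slope is ignored), constant friction, and a fixed reaction delay, and the names `stopping_distance` and `should_brake` and all parameter values are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2

def stopping_distance(speed_mps, friction, reaction_time_s=0.2):
    """Distance needed to stop: reaction travel plus v^2 / (2 * mu * g)."""
    return speed_mps * reaction_time_s + speed_mps ** 2 / (2 * friction * G)

def should_brake(distance_to_sign_m, speed_mps, friction, margin_m=2.0):
    """Brake once the remaining distance no longer exceeds what is needed."""
    needed = stopping_distance(speed_mps, friction) + margin_m
    return distance_to_sign_m <= needed

# Same speed and distance, different road surfaces (friction values are
# rough illustrative guesses for dry and wet asphalt).
speed = 15.0     # m/s, roughly 54 km/h
distance = 30.0  # metres to the detected stop sign

print("Dry road, brake now?", should_brake(distance, speed, friction=0.7))
print("Wet road, brake now?", should_brake(distance, speed, friction=0.3))
```

The machine learning step supplies the input (a detected sign and its distance), and this step turns that input into an action, which is what makes it the AI part of the project under the definitions used here.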