Welcome to Data 2 Machine learning
Data has become the backbone of many, if not all, businesses. Through data we can gain insights into how to increase revenue, reduce costs and minimise risk. data2ml is a place where you learn how to apply data science and machine learning principles and techniques to get valuable insights from your data. We describe concepts in depth, showing when and when not to use them and demonstrating how to apply them in a project. We take you through the process of creating valuable data products and mining critical insights from data.
Who is a Data Scientist?
A data scientist is an individual who solves business problems, both simple and complex, using data and the scientific process. To solve these problems, data scientists need a particular set of skills. We usually refer to the Venn diagram below to describe a data scientist's skills:
From the Venn diagram above we see that data science is an inter-disciplinary field that draws on three other fields:
- Mathematics and Statistics. Mathematics is at the core of data science techniques. Most tools and approaches used in data science have mathematical underpinnings. The most important mathematical concepts in data science include, but are not limited to, calculus and algebra. Statistics helps us answer questions about data and interpret insights.
- Computer Science. Computer science gives us the capability to solve data problems at scale. It lets us encapsulate mathematical functions in computer programs that we can reuse across diverse use cases. With computer science we can create scalable systems that solve complex data problems efficiently.
- Domain Expertise. Business knowledge is critical to any data scientist, as it helps in clearly defining the problem and the constraints around it. Understanding the problem is what enables a data scientist to find the optimal solution.
Specialisations within Data
The field of data has many roles. Below are a few of the major ones:
- Data Engineers. These are data professionals who create scalable data pipelines. They design and maintain data management systems.
- Data Scientists. These are data professionals who solve business problems using data and scientific techniques.
- Machine Learning Engineers. These are professionals responsible for designing and developing Machine Learning systems.
- Data Architects. These are professionals who focus on designing and implementing an organisation's data strategy and architecture.
- Data Analysts. These are professionals who create analytics reports and dashboards and present business metrics to management.
Machine learning as a tool for solving Data science problems
Machine learning is a sub-field of artificial intelligence that enables computers to learn from data without being explicitly programmed (Wikipedia). It provides us with a set of tools (algorithms) that we apply to data to uncover hidden insights. A machine learning algorithm is a mathematical function that learns from data (training) to approximate the underlying patterns, producing a generalised function (the model) that can be applied to unseen data to make predictions. In short, we train a machine learning algorithm on data to obtain a model, which we then use for predictions and inference. Applications of machine learning include recommendation systems, natural language translation, speech recognition, computer vision, medical diagnosis, stock forecasting and self-driving cars.
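To make the train-then-predict idea concrete, here is a minimal sketch in plain Python. It assumes the simplest possible algorithm, a straight-line fit by ordinary least squares, and the function names (`train`, `predict`) and the study-hours data are made up purely for illustration.

```python
def train(xs, ys):
    """Learn a model (slope, intercept) from training data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned model to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours studied and the exam score obtained.
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]
exam_scores = [52.0, 61.0, 70.0, 79.0, 88.0]

model = train(hours_studied, exam_scores)  # training gives us the model
prediction = predict(model, 6.0)           # prediction on unseen data
print(model, prediction)
```

The algorithm is `train`, the model is the pair of numbers it returns, and `predict` applies that model to an input that was not in the training data. Real projects would normally rely on an established library rather than hand-written code like this.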
With machine learning we can quickly solve complex problems that would otherwise be difficult to solve manually or with heuristic techniques. Machine learning can be broadly categorised into three types:
1. Supervised learning. Uses labelled data for training the algorithm.
2. Unsupervised learning. Applied to unlabelled data to discover patterns in data.
3. Reinforcement learning. Employs a reward-based approach to self-learn.
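The difference between the first two categories can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the data is one-dimensional and made up, the supervised part uses a nearest-centroid classifier, and the unsupervised part uses k-means with two clusters. All function names are hypothetical, and reinforcement learning is left out to keep the example short.

```python
def fit_nearest_centroid(points, labels):
    """Supervised: use the labels to compute one mean (centroid) per class."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: sum(ps) / len(ps) for label, ps in groups.items()}


def classify(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def two_means(points, iterations=10):
    """Unsupervised: group unlabelled points into two clusters (k-means, k=2)."""
    centres = [min(points), max(points)]  # simple starting guess
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for point in points:
            nearest = 0 if abs(point - centres[0]) <= abs(point - centres[1]) else 1
            clusters[nearest].append(point)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters


points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

# Supervised: every training point comes with a label.
labels = ["small", "small", "small", "large", "large", "large"]
centroids = fit_nearest_centroid(points, labels)
predicted_label = classify(centroids, 3.0)

# Unsupervised: the same points without labels; the groups are discovered.
centres, clusters = two_means(points)
print(predicted_label, centres, clusters)
```

In the supervised case the labels tell the algorithm what the groups are, so it can name the group of a new point. In the unsupervised case the algorithm finds two groups on its own but has no names for them.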
Data is regarded as an asset in organisations. However, to get the maximum value from it we must manage and use it effectively. Data science is an inter-disciplinary field that helps solve business problems using data and uncovers hidden insights from it. In data science we use different tools to solve complex problems, and machine learning is one of them. Machine learning provides us with a set of tools (algorithms) that learn from data without being explicitly programmed. Visit our next blog post, where we discuss the tools and technologies used in the data science and machine learning process.