Data 2 Decision

With Machine Learning

Category: Data Pre-processing

Data Augmentation

January 11, 2022 (updated May 15, 2022)
Sammy Ongaya
Data Pre-processing

Data augmentation is a technique for generating extra data in order to improve the performance of a machine learning model. Most machine learning algorithms, especially neural networks, perform well with large and varied datasets; sometimes the challenge lies …

Data Leakage

January 11, 2022 (updated March 8, 2022)
Sammy Ongaya
Data Pre-processing

Data scientists and machine learning engineers often end up developing models that suffer from data leakage without readily noticing it. The model performs very well on the validation set but fails when deployed to production. Data leakage is …

Data Sampling

January 9, 2022 (updated January 11, 2022)
Sammy Ongaya
Data Pre-processing

In this era of big data, we often end up with large datasets that we need to analyse and model. Processing, analysing and modelling them can take too much time. To avoid this, we need to select a small …

Outliers in Data

January 7, 2022 (updated May 15, 2022)
Sammy Ongaya
Data Pre-processing

Outliers affect the quality of data. They are data points that lie far away from the majority of the other data points. They may result from measurement error or from variability in the population distribution. Some statistical measures such as mean …

Handling Imbalanced Data

January 7, 2022 (updated March 7, 2022)
Sammy Ongaya
Data Pre-processing

In machine learning classification problems, an imbalanced dataset affects the performance of the model. Data is imbalanced when the classes in the target variable have unequal distributions in the dataset. If the target variable has two classes A and B, if …

Handling Missing Data

January 5, 2022 (updated January 7, 2022)
Sammy Ongaya
Data Pre-processing

Missing data is a common data quality issue that every data practitioner has to deal with. Missing data affects the accuracy of analysis and the performance of the model. Most machine learning algorithms are sensitive to missing data. In the …

Introduction to Data Reduction

January 4, 2022 (updated March 7, 2022)
Sammy Ongaya
Data Pre-processing

Data reduction is the process of reducing the amount of data. The reduction can be in terms of the volume of the data or the number of features. A reduction in volume can result in storage efficiency. In machine learning modelling we …

Introduction to Data Transformation

January 4, 2022
Sammy Ongaya
Data Pre-processing

Data transformation is the process of converting data from one format to another format that is useful to business users for decision making. In data management, data transformation is a key component of data integration and pre-processing. Data transformation involves …

Introduction to Data Integration

January 3, 2022 (updated January 4, 2022)
Sammy Ongaya
Data Pre-processing

Data integration is the process of consolidating data from different (heterogeneous) sources into a unified dataset. Data integration can be a simple process within the data pre-processing phase of machine learning modelling or it can be a comprehensive …

Data Cleaning

January 3, 2022 (updated January 4, 2022)
Sammy Ongaya
Data Pre-processing

Data scientists and analytics specialists spend most of their time cleaning data. Data cleaning, also referred to as data cleansing, is the process of making raw, noisy, inaccurate and incomplete data correct, complete and useful. As a process in data …

Data 2 Decision
© 2023