Data 2 Decision

With Machine Learning

Category: Data Integration

Executing External Scripts

November 18, 2021
Sammy Ongaya
Data Integration

In Airflow, we can easily execute external scripts using BashOperator and PythonOperator. This provides flexibility in writing scalable data pipelines. In this post we will discuss how to execute an external script written in Python to perform a simple data

Read more
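
As a rough illustration of what "executing an external script" involves, the sketch below runs a Python script in a separate process and captures its output. It uses only the standard library, not Airflow itself. The function name `run_external_script` and the file name `transform.py` are hypothetical and chosen for this example only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_external_script(script_path: str) -> str:
    """Run a Python script in a separate process and return its stdout."""
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return result.stdout


def demo() -> str:
    # Write a throwaway script to a temporary directory so the sketch is
    # self-contained; "transform.py" is a hypothetical file name.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "transform.py"
        script.write_text("print(sum([1, 2, 3]))\n")
        return run_external_script(str(script))


if __name__ == "__main__":
    print(demo())
```

In a DAG, a function like this could be handed to PythonOperator through its `python_callable` argument, while BashOperator would take the equivalent shell command through `bash_command`.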

Creating Your First DAG

November 16, 2021 (updated November 18, 2021)
Sammy Ongaya
Data Integration

In Apache Airflow, a Directed Acyclic Graph (DAG) is a collection of all the tasks you want to run, organized according to their relationships and dependencies. This post assumes that you have installed, configured and tested Airflow; if not, visit our

Read more
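
To make the DAG idea concrete, here is a minimal sketch in plain Python (3.9 or later) that derives a valid run order from task dependencies using the standard library's `graphlib`. It is not Airflow's API; the task names `extract`, `transform` and `load` are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# The task names are hypothetical.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return the tasks in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is why the graph must be acyclic.
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(execution_order(dependencies))
```

In an Airflow DAG file, the same chain of dependencies is usually written with the bit-shift operator, as in `extract >> transform >> load`.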

Installing Airflow

November 16, 2021
Sammy Ongaya
Data Integration

Airflow is a powerful and flexible data engineering tool for programmatically authoring, scheduling, monitoring and managing data pipelines and workflows. It is open source, so it is freely available to use and extend to fit the business use case. There are various ways

Read more

Linux on Windows

November 15, 2021 (updated November 16, 2021)
Sammy Ongaya
Data Integration

Microsoft Windows 10 version 2004 and higher (Build 19041 and higher) allows users to natively run Linux on Windows 10 through Windows Subsystem for Linux (WSL). To check your Windows version, press Windows logo key + R, type winver and select OK. Currently WSL comes in

Read more

Introduction to Airflow

November 15, 2021
Sammy Ongaya
Data Integration

Apache Airflow is a platform for programmatically authoring, scheduling and monitoring workflows and data pipelines. Airflow is primarily written in Python and uses Directed Acyclic Graphs (DAGs) to manage workflow orchestration. It’s an open source data-pipeline orchestration tool that allows you to

Read more

Data Integration Tools

November 12, 2021 (updated November 15, 2021)
Sammy Ongaya
Data Integration

Data integration is the process of consolidating data from various source systems into a target system where the business can have a holistic view of what’s happening in the organisation. To enable data integration in an organisation we need data

Read more

Data Integration

November 11, 2021 (updated November 12, 2021)
Sammy Ongaya
Data Integration

For an organisation to be competitive in this data-driven era, it needs a proper data management structure as part of its digital transformation strategy. Data integration is the process of consolidating data from different sources (heterogeneous data sources) to a single

Read more
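
A small sketch of what consolidating heterogeneous sources can look like in practice: one source arrives as CSV, the other as JSON, and both are merged into one record per customer. The data, field names and the `integrate` function are all invented for this illustration and do not come from the post.

```python
import csv
import io
import json

# Two hypothetical sources describing customers in different formats.
CSV_SOURCE = "customer_id,name\n1,Asha\n2,Brian\n"
JSON_SOURCE = '[{"id": 2, "city": "Nairobi"}, {"id": 3, "city": "Kisumu"}]'


def integrate(csv_text, json_text):
    """Consolidate both sources into one record per customer id."""
    target = {}
    # Source 1: CSV rows keyed by "customer_id".
    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = int(row["customer_id"])
        target.setdefault(cid, {"customer_id": cid})["name"] = row["name"]
    # Source 2: JSON objects keyed by "id"; map them onto the same key.
    for item in json.loads(json_text):
        cid = int(item["id"])
        target.setdefault(cid, {"customer_id": cid})["city"] = item["city"]
    # Return records sorted by customer id for a stable result.
    return [target[cid] for cid in sorted(target)]


if __name__ == "__main__":
    for record in integrate(CSV_SOURCE, JSON_SOURCE):
        print(record)
```

Customers that appear in only one source keep whatever fields that source provides, which is one simple way to handle gaps; a real pipeline would need an explicit policy for missing and conflicting values.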

Data Collection

November 10, 2021 (updated November 18, 2021)
Sammy Ongaya
Data Integration

Data collection is an important phase in data science and machine learning project development. This phase assumes that the business problem statement is well stated, the project goal and success criteria are clearly defined, and you have structured the project

Read more

Data 2 Machine Learning

October 30, 2021 (updated November 9, 2021)
Sammy Ongaya
Data Integration

Data has become the backbone of many, if not all, businesses. Through data we can get insights on how to increase revenue, reduce cost and minimise risk. data2ml is a place where you learn how to apply data principles and techniques to get valuable insights from your data.

Read more

© 2023 Data 2 Decision