Analytics with Python

Python is widely used by data analysts and data scientists to explore, visualize, and analyze data. Python is open source, and most, if not all, of its data science capabilities come from a few key packages that add new data structures and visualization tools. These packages include NumPy, pandas, Matplotlib, Plotly, and scikit-learn, to name a few. Python is one of the most widely used general-purpose programming languages among data scientists.

Tech Kits

Tech Kits are part of the walk-in service provided by Innovate Labs. There are three difficulty levels, suited to users with different amounts of experience with the technologies. Many of the Tech Kits build on each other as you progress.

Beginner

Pandas

Length: 30-60 minutes

Description: Pandas is a Python package used by data analysts and other professionals to analyze and explore datasets. It provides two core data structures: the Series and the DataFrame. Pandas has functions to manipulate these structures, load datasets into them, and gain statistical insights from them.
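As a rough illustration of the two data structures, the sketch below builds a Series and a DataFrame by hand and pulls a few statistics from them. The values and the names `temperatures` and `visits` are made up for this example and are not taken from the Tech Kit itself.

```python
import pandas as pd

# A Series is a one-dimensional labeled array (hypothetical sample values).
temperatures = pd.Series(
    [21.5, 23.0, 19.8], index=["Mon", "Tue", "Wed"], name="temp_c"
)

# A DataFrame is a two-dimensional table; each column is itself a Series.
visits = pd.DataFrame(
    {
        "day": ["Mon", "Tue", "Wed"],
        "visitors": [12, 18, 9],
        "temp_c": [21.5, 23.0, 19.8],
    }
)

# Statistical insight from a Series: the average temperature.
mean_temp = temperatures.mean()

# Manipulating a DataFrame: find the day with the most visitors.
busiest_day = visits.loc[visits["visitors"].idxmax(), "day"]

# describe() summarizes the numeric columns (count, mean, std, min, max, ...).
summary = visits.describe()

print(mean_temp)
print(busiest_day)
print(summary)
```

In practice a DataFrame is more often loaded from a file, for example with `pd.read_csv`, than typed in by hand.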

Intermediate

Matplotlib and Plotly

Length: 30-60 minutes

Description: Plotly and Matplotlib are both data visualization libraries for Python. Matplotlib is very powerful: almost every plotting function accepts many options for customizing how pandas datasets are visualized. Plotly, meanwhile, produces polished output by default and supports interactive visualizations similar to those in Tableau. Both are important tools for visualizing data and, in turn, analyzing it.
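To give a feel for the kind of customization described here, the following minimal sketch plots one column of a pandas DataFrame with Matplotlib and adjusts the marker, color, title, and axis labels. It covers Matplotlib only. The `sales` data and the output filename `monthly_units.png` are hypothetical, and the non-interactive `Agg` backend is an assumption made so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumed: render to a file instead of opening a window

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data for illustration only.
sales = pd.DataFrame(
    {"month": ["Jan", "Feb", "Mar", "Apr"], "units": [120, 135, 98, 160]}
)

fig, ax = plt.subplots(figsize=(6, 4))

# Most plotting calls accept styling options such as marker, color, and label.
ax.plot(sales["month"], sales["units"], marker="o", color="tab:blue", label="Units sold")

# Titles, axis labels, and the legend are set on the Axes object.
ax.set_title("Monthly units sold")
ax.set_xlabel("Month")
ax.set_ylabel("Units")
ax.legend()

# Write the chart to an image file (hypothetical filename).
fig.savefig("monthly_units.png")
```

In a notebook environment such as Google Colab, the figure is typically displayed inline, so the backend line and the `savefig` call can be dropped.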

Advanced

Exploratory Data Analysis

Length: 30-60 minutes

Description: Exploratory data analysis brings together what you did in the previous two Tech Kits. It involves cleaning, analyzing, and visualizing data in order to arrive at a conclusion. The questions asked along the way are up to whoever is performing the analysis. This makes it the best practice for real-world data analysis, where analysts must make these decisions independently to reach a given goal.
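The cleaning and analysis steps might look something like the sketch below. The dataset, the column names `city` and `price`, and the question being asked (which city has the higher average price) are all invented for illustration; a real analysis would start from its own data and questions.

```python
import pandas as pd

# Hypothetical raw data containing some missing values.
raw = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston", "Boston", "Boston", None],
        "price": [250.0, None, 410.0, 390.0, 400.0, 300.0],
    }
)

# Clean: drop rows where either the city or the price is missing.
clean = raw.dropna(subset=["city", "price"])

# Analyze: average price per city, highest first.
avg_price = clean.groupby("city")["price"].mean().sort_values(ascending=False)

print(avg_price)
```

Dropping incomplete rows is only one possible cleaning choice; filling in missing values is another, and deciding between them is part of the analyst's judgment. The visualization step is left out here, but the resulting Series could be charted with `avg_price.plot(kind="bar")` when Matplotlib is installed.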

 

 

Resources

Visual Studio Code

Type: Application

Description: VSCode is a free, open source application that lets users edit code with the help of built-in programming features.

 

Python 3.5

Type: Programming Language

Description: Python is an interpreted, high-level, general-purpose programming language. Version 3.5 is one of many releases in the Python 3 series, which continues to receive new versions.

 

Google Colab

Type: Development Environment

Description: Google Colab is a code development environment that runs in the browser and executes code on Google Cloud computing resources.

Plotly Library

Type: Python library

Description: Plotly is a Python library that produces visually appealing, interactive visualizations similar to those in Tableau.

 

Pandas Library

Type: Python library

Description: Pandas is a Python package used by data analysts and other professionals to analyze and explore datasets.

 

Matplotlib Library

Type: Python library

Description: Matplotlib is a very powerful data visualization library for Python; almost every plotting function offers extensive customization for visualizing pandas datasets.