Author: David Suárez

Data Science

Dimensionality reduction – TSNE

t-SNE (t-distributed stochastic neighbor embedding): The goal is to take a set of points in a high-dimensional space and find a faithful representation of those points in a lower-dimensional space, typically the 2D plane. The algorithm is non-linear and

Read More »
Data Science

What is Dimensionality Reduction?

Many machine learning problems involve hundreds to thousands of features. Having such a large number of features poses certain problems, sometimes known collectively as the curse of dimensionality, and dimensionality reduction or dimension reduction

Read More »
Data Science

Getting Started with Matplotlib – Lesson 1

Introduction: Visualization is part of the analysis a data scientist performs in order to draw conclusions from a dataset. In today’s article we are going to go through the Matplotlib library. Matplotlib is a third-party library for

Read More »
Data Science

Getting Started with Numpy – Lesson 1

Introduction: NumPy is a third-party library for numerical computing, optimized for working with single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This library contains many routines for statistical analysis. Creating, Getting Info, Selecting and Util

Read More »
Data Science

Getting Started with Pandas – Lesson 4

Introduction: This is the fourth and final article of our training saga with Pandas. In this article we are going to summarize the different functions that are used in Pandas to perform the missing data

Read More »
Data Science

Getting Started with Pandas – Lesson 3

Introduction: This is the third post of our data science training saga with Pandas. In this article we are going to summarize the different functions that are used in Pandas to perform Iteration, Maps, Grouping and

Read More »
Data Science

Getting Started with Pandas – Lesson 2

Introduction: This is the second post of our training saga with Pandas. In this article we are going to summarize the different functions that are used in Pandas to perform Indexing, Selection and Filtering. Indexing, Selecting

Read More »
Data Science

Getting Started with Pandas – Lesson 1

Introduction: Today we would like to present one of our first internal training chapters, in which we introduce the fundamental data science tools: Pandas, NumPy and Matplotlib. Pandas is a third-party library for numerical computing

Read More »