What is a Dataset?

Collecting data is an early and important step in the data analysis process. During this phase you will work with a dataset, and for simplicity let's say it is tabular (which is the case when working with database tables). A dataset is simply a collection of data in a row-and-column structure, where each row is a unique observation and each column is a feature (also known as a dimension, field, or variable).

For example, say you work for an e-commerce company that stores its transactional data in a data warehouse. In an order-level transactional dataset, each row is a unique order, identified by its order ID. Each column holds a detail of that order, such as the customer ID, order date, units purchased, or sales amount. Please see the example below:
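To make the row-and-column idea concrete, here is a minimal sketch of an order-level dataset in plain Python. The column names (order_id, customer_id, and so on) and every value in it are made up for illustration; a real table in a data warehouse would have its own names and types.

```python
# Hypothetical order-level dataset: all names and values are illustrative only.
columns = ["order_id", "customer_id", "order_date", "units_purchased", "sales_amount"]

rows = [
    (1001, "C-001", "2023-01-05", 2, 39.98),
    (1002, "C-002", "2023-01-05", 1, 15.00),
    (1003, "C-001", "2023-01-07", 3, 74.97),
]

# Each row is one observation (an order); pair its values with the column names.
orders = [dict(zip(columns, row)) for row in rows]

# Every row should be a unique order, so no order ID may repeat.
order_ids = [order["order_id"] for order in orders]
is_unique = len(order_ids) == len(set(order_ids))

# A column is the same feature read across all rows.
sales_amounts = [order["sales_amount"] for order in orders]

print(f"{len(orders)} rows x {len(columns)} columns")
for order in orders:
    print(order)
```

Note that the same customer ID can appear on more than one row, because the unit of observation here is the order, not the customer.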
