---
description: In this tutorial, we will learn about the Pandas read_csv() Method and groupby() Method to Import and Read CSV data in Pandas with Examples.
title: Pandas read_csv() with Example
image: https://www.guru99.com/images/tensorflow/082918_0617_PandasTutor2.png
---

 

[Skip to content](#main) 

## Import CSV in Pandas

During the TensorFlow tutorial, you will use the adult dataset. It is often used with classification task. It is available in this URL <https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data>

The data is stored in a CSV format. This dataset includes eights categorical variables:

This dataset includes 8 categorical variables:

* workclass
* education
* marital
* occupation
* relationship
* race
* sex
* native\_country

And, 6 continuous variables:

* age
* fnlwgt
* education\_num
* capital\_gain
* capital\_loss
* hours\_week

## Pandas read\_csv() Method

To import a CSV dataset, you can use the object pd.read\_csv(). The basic argument inside is:

### Pandas read\_csv() Syntax

pandas.read_csv(filepath_or_buffer,sep=', ',`names=None`,`index_col=None`,`skipinitialspace=False`)

* **filepath\_or\_buffer**: Path or URL with the data
* **sep=’, ‘**: Define the delimiter to use
* **\`names=None\`**: Name the columns. If the dataset has ten columns, you need to pass ten names
* **\`index\_col=None\`**: If yes, the first column is used as a row index
* **\`skipinitialspace=False\`**: Skip spaces after delimiter.

For more information about read\_csv(), please check the official documentation:

<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read%5Fcsv.html>

### Pandas read\_csv() Example

## Import csv
import pandas as pd
## Define path data
COLUMNS = ['age','workclass', 'fnlwgt', 'education', 'education_num', 'marital',
           'occupation', 'relationship', 'race', 'sex', 'capital_gain', 'capital_loss',
           'hours_week', 'native_country', 'label']
PATH = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
df_train = pd.read_csv(PATH,
                       skipinitialspace=True,
                       names = COLUMNS,
                       index_col=False)
df_train.shape

**Output:**

(32561, 15)

## Pandas groupby() Method

An easy way to see the data is to use the groupby method. This method can help you to summarize the data by group. Below is a list of methods available with groupby() method:

* count: count
* min: min
* max: max
* mean: mean
* median: median
* standard deviation: sdt
* etc

Inside groupby(), you can use the column you want to apply the method.

Let’s have a look at a single grouping with the adult dataset. You will get the mean of all the continuous variables by type of revenue, i.e., above 50k or below 50k:

df_train.groupby(['label']).mean()

| label     | age       | fnlwgt       | education\_num | capital\_gain | capital\_loss | hours\_week |
| --------- | --------- | ------------ | -------------- | ------------- | ------------- | ----------- |
| **<=50K** | 36.783738 | 190340.86517 | 9.595065       | 148.752468    | 53.142921     | 38.840210   |
| **\>50K** | 44.249841 | 188005.00000 | 11.611657      | 4006.142456   | 195.001530    | 45.473026   |

You can get the minimum of age by type of household:

df_train.groupby(['label'])['age'].min()
label
<=50K    17
>50K     19
Name: age, dtype: int64

You can also group by multiple columns. For instance, you can get the maximum capital gain according to the household type and marital status.

df_train.groupby(['label', 'marital'])['capital_gain'].max()				
label  marital              
<=50K  Divorced                 34095
       Married-AF-spouse         2653
       Married-civ-spouse       41310
       Married-spouse-absent     6849
       Never-married            34095
       Separated                 7443
       Widowed                   6849
>50K   Divorced                 99999
       Married-AF-spouse         7298
       Married-civ-spouse       99999
       Married-spouse-absent    99999
       Never-married            99999
       Separated                99999
       Widowed                  99999
Name: capital_gain, dtype: int64

You can create a plot following groupby. One way to do it is to use a plot after the grouping.

To create a more excellent plot, you will use unstack() after mean() so that you have the same multilevel index, or you join the values by revenue lower than 50k and above 50k. In this case, the plot will have two groups instead of 14 (2\*7).

If you use [Jupyter Notebook](https://www.guru99.com/jupyter-notebook-tutorial.html), make sure to add %matplotlib inline, otherwise, no plot will be displayed

%matplotlib inline
df_plot = df_train.groupby(['label', 'marital'])['capital_gain'].mean().unstack()
df_plot

[](https://www.guru99.com/images/tensorflow/082918%5F0617%5FPandasTutor2.png)

## Summary

* To import a CSV dataset in [Pandas](https://www.guru99.com/python-pandas-tutorial.html), you can use the object pd.read\_csv().
* The groupby() method can help you to summarize the data by group.
* You can also group by multiple columns. For instance, you can get the maximum capital gain according to the household type and marital status.

#### Summarize this post with:

ChatGPT Perplexity Grok Google AI 

**Stay Updated on AI** **Get Weekly AI Skills, Trends, Actionable Advice.** 

##### Sign up for the newsletter

Subscribe for Free 

You have successfully subscribed.  
Please check your inbox. 

![AI-Newsletter]() Chosen by over **350,000+** professionals 

[Scroll to top ](#wrapper)Scroll to top 

× 

Toggle Menu Close 

Search for: 

Search

```json
{"@context":"https://schema.org","@graph":[{"@type":"Organization","@id":"https://www.guru99.com/#organization","name":"Guru99","sameAs":["https://www.facebook.com/Guru99Official","https://twitter.com/guru99com"],"logo":{"@type":"ImageObject","@id":"https://www.guru99.com/#logo","url":"https://www.guru99.com/images/guru99-logo-v1-150x59.png","contentUrl":"https://www.guru99.com/images/guru99-logo-v1-150x59.png","caption":"Guru99","inLanguage":"en-US"}},{"@type":"WebSite","@id":"https://www.guru99.com/#website","url":"https://www.guru99.com","name":"Guru99","publisher":{"@id":"https://www.guru99.com/#organization"},"inLanguage":"en-US"},{"@type":"ImageObject","@id":"https://www.guru99.com/images/pandas-read-csv.png","url":"https://www.guru99.com/images/pandas-read-csv.png","width":"248","height":"180","inLanguage":"en-US"},{"@type":"BreadcrumbList","@id":"https://www.guru99.com/pandas-read-csv.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":"1","item":{"@id":"https://www.guru99.com","name":"Home"}},{"@type":"ListItem","position":"2","item":{"@id":"https://www.guru99.com/tensorflow","name":"TensorFlow"}},{"@type":"ListItem","position":"3","item":{"@id":"https://www.guru99.com/pandas-read-csv.html","name":"Pandas read_csv() with Example"}}]},{"@type":"WebPage","@id":"https://www.guru99.com/pandas-read-csv.html#webpage","url":"https://www.guru99.com/pandas-read-csv.html","name":"Pandas read_csv() with Example","dateModified":"2024-06-12T19:41:01+05:30","isPartOf":{"@id":"https://www.guru99.com/#website"},"primaryImageOfPage":{"@id":"https://www.guru99.com/images/pandas-read-csv.png"},"inLanguage":"en-US","breadcrumb":{"@id":"https://www.guru99.com/pandas-read-csv.html#breadcrumb"}},{"@type":"Person","@id":"https://www.guru99.com/author/evelyn","name":"Evelyn Clarke","description":"I'm Evelyn Clarke, an AI Research Scientist specializing in machine learning and natural language processing, dedicated to advancing the field responsibly.","url":"https://www.guru99.com/author/evelyn","image":{"@type":"ImageObject","@id":"https://www.guru99.com/images/evelyn-clarke-author-120x120.png","url":"https://www.guru99.com/images/evelyn-clarke-author-120x120.png","caption":"Evelyn Clarke","inLanguage":"en-US"},"worksFor":{"@id":"https://www.guru99.com/#organization"}},{"@type":"NewsArticle","headline":"Pandas read_csv() with Example","keywords":"tensorflow, Data Warehousing, Tableau","dateModified":"2024-06-12T19:41:01+05:30","articleSection":"TensorFlow","author":{"@id":"https://www.guru99.com/author/evelyn","name":"Evelyn Clarke"},"publisher":{"@id":"https://www.guru99.com/#organization"},"description":"In this tutorial, we will learn about the Pandas read_csv() Method and groupby() Method to Import and Read CSV data in Pandas with Examples.","copyrightYear":"2024","copyrightHolder":{"@id":"https://www.guru99.com/#organization"},"name":"Pandas read_csv() with Example","@id":"https://www.guru99.com/pandas-read-csv.html#richSnippet","isPartOf":{"@id":"https://www.guru99.com/pandas-read-csv.html#webpage"},"image":{"@id":"https://www.guru99.com/images/pandas-read-csv.png"},"inLanguage":"en-US","mainEntityOfPage":{"@id":"https://www.guru99.com/pandas-read-csv.html#webpage"}}]}
```