Methods and Attributes
Remember
Methods end with parentheses, while attributes don’t
df.shape: Attribute
df.info(): Method
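One way to check the rule above for yourself is Python's built-in `callable()`. The sketch below uses a small made-up DataFrame (the column names and values are placeholders, not the IMDb data) and is meant only as an illustration of the distinction.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({"title": ["A", "B"], "duration": [142, 175]})

# Attributes are plain values: accessing them needs no parentheses
shape_is_callable = callable(df.shape)    # shape is a tuple
dtypes_is_callable = callable(df.dtypes)  # dtypes is a Series

# Methods are callable: the parentheses are what actually run them
head_is_callable = callable(df.head)
describe_is_callable = callable(df.describe)

print(shape_is_callable, dtypes_is_callable)
print(head_is_callable, describe_is_callable)

# Without parentheses, a method name evaluates to the bound method itself
bound = df.head
first_row = bound(1)  # calling it now returns a one-row DataFrame
```

Forgetting the parentheses on a method is not an error in itself: you simply get the method object back instead of its result.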
# import pandas
import pandas as pd
# read a dataset of top-rated IMDb movies into a DataFrame
movies = pd.read_csv('http://bit.ly/imdbratings')
# example method: show the first 5 rows
movies.head()
| | star_rating | title | content_rating | genre | duration | actors_list |
---|---|---|---|---|---|---|
0 | 9.3 | The Shawshank Redemption | R | Crime | 142 | [u'Tim Robbins', u'Morgan Freeman', u'Bob Gunt... |
1 | 9.2 | The Godfather | R | Crime | 175 | [u'Marlon Brando', u'Al Pacino', u'James Caan'] |
2 | 9.1 | The Godfather: Part II | R | Crime | 200 | [u'Al Pacino', u'Robert De Niro', u'Robert Duv... |
3 | 9.0 | The Dark Knight | PG-13 | Action | 152 | [u'Christian Bale', u'Heath Ledger', u'Aaron E... |
4 | 8.9 | Pulp Fiction | R | Crime | 154 | [u'John Travolta', u'Uma Thurman', u'Samuel L.... |
# example method: calculate summary statistics
movies.describe()
| | star_rating | duration |
---|---|---|
count | 979.000000 | 979.000000 |
mean | 7.889785 | 120.979571 |
std | 0.336069 | 26.218010 |
min | 7.400000 | 64.000000 |
25% | 7.600000 | 102.000000 |
50% | 7.800000 | 117.000000 |
75% | 8.100000 | 134.000000 |
max | 9.300000 | 242.000000 |
# example attribute: number of rows and columns
movies.shape
(979, 6)
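Because `shape` is an ordinary tuple of `(rows, columns)`, it can be unpacked like any other tuple. A minimal sketch, again with a made-up DataFrame rather than the movies data:

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({"title": ["A", "B", "C"], "duration": [142, 175, 200]})

# shape is a (rows, columns) tuple, so it unpacks into two integers
n_rows, n_cols = df.shape

# The same counts are available through len()
same_rows = len(df)
same_cols = len(df.columns)

print(n_rows, n_cols)
```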
# example attribute: data type of each column
movies.dtypes
star_rating float64
title object
content_rating object
genre object
duration int64
actors_list object
dtype: object
# use the optional 'include' parameter of the describe method to summarize only the 'object' columns
movies.describe(include='object')
| | title | content_rating | genre | actors_list |
---|---|---|---|---|
count | 979 | 976 | 979 | 979 |
unique | 975 | 12 | 16 | 969 |
top | True Grit | R | Drama | [u'Daniel Radcliffe', u'Emma Watson', u'Rupert... |
freq | 2 | 460 | 278 | 6 |
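The `include` parameter accepts other values as well. The sketch below compares the default call with `include='object'` and `include='all'` on a small made-up DataFrame. The data is hypothetical, and the text columns are cast to `object` explicitly so that they match the `object` dtype shown in the `dtypes` output above.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({
    "title": ["A", "B", "C"],
    "genre": ["Crime", "Crime", "Action"],
    "star_rating": [9.3, 9.2, 9.0],
    "duration": [142, 175, 152],
}).astype({"title": object, "genre": object})  # store text columns as object

# Default: only numeric columns are summarized
numeric_summary = df.describe()

# Only object columns: rows are count, unique, top and freq
object_summary = df.describe(include="object")

# Every column; statistics that do not apply to a column show as NaN
full_summary = df.describe(include="all")

print(object_summary)
```

For numeric columns the summary rows are statistics such as mean and quartiles, while for object columns they are `count`, `unique`, `top` and `freq`, which is why the two tables look so different.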