Practice 1-4: Structured Data in Python

📚 Practice 1.

Dictionaries are an extremely common data structure. Often data is easiest to store according to keys, where each row of data has a unique key.

For example, a dictionary of students might have a key for each student’s name or id, and then a set of values that are associated with that student.

Make a dictionary of your peers.

Come up with 3-4 questions and ask 5-6 of your peers to give answers to each one. The questions can be anything you want, but they should be questions that can be answered with a single word or number.

For example, you might ask:

  • What is your favorite color? (pretty weak question)
  • How many siblings do you have? (ok, but not great)
  • What is your favorite genre of film? (better)
  • On a scale of 1-10, how well do you feel that you understand Python dictionaries? (better still)

Make sure you have a key for each person, and then a set of values for each question.

Based on each person’s responses, build a dictionary in the cell below:

(note: You will end up with a dictionary of dictionaries)

# Example:
my_dict = {
    'person1': {
        'question1': 'answer1',
        'question2': 'answer2',
        'question3': 'answer3',
        'question4': 'answer4',
    },
    'person2': {
        'question1': 'answer1',
        'question2': 'answer2',
        'question3': 'answer3',
        'question4': 'answer4',
    },
    'person3': {
        'question1': 'answer1',
        'question2': 'answer2',
        'question3': 'answer3',
        'question4': 'answer4',
    },
}
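As a rough illustration of the nested structure, here is a minimal sketch with made-up data. The names, questions, and answers in `peers` are hypothetical placeholders, not real survey results; swap in your own.

```python
# Hypothetical survey answers: every name, question, and value is invented.
peers = {
    "ana": {"favorite_genre": "comedy", "siblings": 2, "dict_confidence": 7},
    "ben": {"favorite_genre": "horror", "siblings": 0, "dict_confidence": 4},
    "chloe": {"favorite_genre": "comedy", "siblings": 1, "dict_confidence": 9},
}

# The outer key selects the person, the inner key selects the question.
print(peers["ben"]["favorite_genre"])

# Loop over the people and pull one answer out of each inner dictionary.
confidence_by_person = {
    name: answers["dict_confidence"] for name, answers in peers.items()
}
print(confidence_by_person)
```

Storing numeric answers as numbers rather than strings (2, not '2') tends to make the later summary steps easier.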

Dictionary to dataframe

Now that you have a dictionary, you can convert it to a dataframe.

Convert your dictionary to a dataframe

Use the pd.DataFrame() constructor to convert your dictionary to a dataframe.

import pandas as pd

# Convert a dictionary to a dataframe
df = pd.DataFrame(my_dict)

Investigate your dataframe using the following functions:

  • df.head()
  • df.tail()
  • df.info()

Is your dataframe what you expected? If not, what is different, and why? How would you need to change your data structure to get the dataframe you expected?

df.transpose() might be helpful here. What does it do? Create a new dataframe that is a transposed version of your original dataframe.

# Transpose a dataframe
df_transposed = df.transpose()
# Transpose your dataframe so that the people are the rows and the questions are the columns.
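One possible way to see what is going on, sketched with a small hypothetical dictionary (`peers` and its contents are invented for illustration): a dictionary of dictionaries passed to pd.DataFrame() puts the outer keys in the columns, so the people end up as columns until you transpose. `pd.DataFrame.from_dict(..., orient="index")` is an alternative that builds the people-as-rows layout directly.

```python
import pandas as pd

# Hypothetical data: three people, two questions.
peers = {
    "ana": {"favorite_genre": "comedy", "siblings": 2},
    "ben": {"favorite_genre": "horror", "siblings": 0},
    "chloe": {"favorite_genre": "comedy", "siblings": 1},
}

# Outer keys (people) become columns, inner keys (questions) become rows.
df = pd.DataFrame(peers)
print(df.shape)

# Transposing swaps rows and columns, so each person becomes a row.
df_people = df.transpose()
print(df_people.shape)

# Alternative: treat the outer keys as the row index from the start.
df_alt = pd.DataFrame.from_dict(peers, orient="index")
print(df_alt.dtypes)
```

In this sketch the transposed dataframe stores every column with the generic `object` dtype, because each original column mixed text and numbers, while the `from_dict` version infers a numeric dtype for `siblings`. That difference matters once you start summarizing.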

Visualize/summarize your dataframe using one or more of the following functions:

  • df.describe()
  • df.plot()
  • df.hist()
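A minimal sketch of the summary step, again using invented data. It assumes the numeric answers need converting after the transpose, which is why `pd.to_numeric` appears; the column names are hypothetical.

```python
import pandas as pd

# Hypothetical survey answers, transposed so that people are the rows.
peers = {
    "ana": {"favorite_genre": "comedy", "siblings": 2, "dict_confidence": 7},
    "ben": {"favorite_genre": "horror", "siblings": 0, "dict_confidence": 4},
    "chloe": {"favorite_genre": "comedy", "siblings": 1, "dict_confidence": 9},
}
answers = pd.DataFrame(peers).transpose()

# Convert the numeric questions so that describe() reports mean, std, etc.
for col in ["siblings", "dict_confidence"]:
    answers[col] = pd.to_numeric(answers[col])

summary = answers[["siblings", "dict_confidence"]].describe()
print(summary)

# Text answers are usually better summarized by counting each value.
genre_counts = answers["favorite_genre"].value_counts()
print(genre_counts)

# Plotting needs matplotlib installed, so it is left commented out here:
# answers["dict_confidence"].plot(kind="bar")
```

Without the conversion, describe() on text-like columns reports counts and most frequent values instead of means, and df.plot() and df.hist() have no numeric data to draw.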

📚 Practice 2.

Structured data search: Find structured data on the internet and convert it to a dataframe.

You’re looking for data that is in a table format, like a spreadsheet, but is not available as an easy-to-download .csv or Excel (.xlsx) file. This turns out to describe a lot of data!

Often you can find this kind of data on Wikipedia, on government websites, or in research articles that contain tables of results.

Convert your data to a dataframe
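One approach, sketched under the assumption that you copied a table from a web page and pasted it as tab-separated text: wrap the text in `io.StringIO` and hand it to `pd.read_csv`. The table contents below are invented placeholders. For tables that live in HTML, `pd.read_html` is another option, though it needs an optional parser such as lxml installed.

```python
import io

import pandas as pd

# Hypothetical pasted table: columns separated by tabs, made-up values.
raw_table = """City\tCountry\tPopulation
Springfield\tExampleland\t120,000
Rivertown\tExampleland\t45,500
Lakeside\tSampleton\t8,250
"""

# sep="\t" splits on tabs; thousands="," turns "120,000" into the number 120000.
df_table = pd.read_csv(io.StringIO(raw_table), sep="\t", thousands=",")

print(df_table)
print(df_table.dtypes)
```

Checking `df_table.dtypes` right after loading is a quick way to spot columns that were read as text when you expected numbers.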

Explore your dataframe using the following functions:

  • df.describe()
  • df.plot()
  • df.hist()