Introduction to Python Data Visualization 6 - Exercises

힘센캥거루
November 14, 2023 (edited)
python

This time, we'll work through a data visualization exercise.

Wouldn't it be boring to use the same data again?

Let's use new data.

1. File and Problem

The file contains annual crime occurrence and arrest statistics from the public data portal.

Let's draw a graph according to the conditions below.

  • Visualize the occurrence and arrest counts for hacking (data breaches)

  • Set the title, x-axis label, y-axis label, and legend

  • Choose any line style you like

  • All other styling is up to you

2. Problem Solution

First, copy the setup code from the previous post. The only change is the file path.

import pandas as pd

# Import modules and set a Korean font
import matplotlib.pyplot as plt
import matplotlib

# Font settings for macOS
# matplotlib.rcParams["font.family"] = "AppleGothic"

# Font settings for Windows
matplotlib.rcParams["font.family"] = "Malgun Gothic"

# Set font size
matplotlib.rcParams["font.size"] = 13

# Keep minus signs from rendering as broken glyphs with the Korean font
plt.rcParams['axes.unicode_minus'] = False

crime = pd.read_excel("./crime_statistics.xlsx")
crime.head(3)

When you check the data, you can see that each year appears twice, once for each category (Occurrences and Arrests).

Use unique() to get each year only once for the x-axis.

Then use loc with a boolean condition on the Category column to extract the occurrence and arrest counts for hacking, and draw the graph.

x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1)
plt.plot(x,y2)
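
To see what unique() and the loc condition return, here is a minimal sketch on a small stand-in table. The name toy and all of its numbers are made up for illustration and are not taken from the crime file; only the column names follow the code above.

```python
import pandas as pd

# Hypothetical stand-in for the crime table: each year appears twice,
# once per category. The numbers are invented for illustration only.
toy = pd.DataFrame({
    "Year": [2020, 2020, 2021, 2021, 2022, 2022],
    "Category": ["Occurrences", "Arrests"] * 3,
    "Hacking (Data Breach)": [30, 10, 40, 15, 50, 20],
})

# unique() returns each year once, in order of first appearance.
years = toy["Year"].unique()

# A boolean condition in loc picks the matching rows;
# the second argument picks the column.
occurrences = toy.loc[toy["Category"] == "Occurrences", "Hacking (Data Breach)"]
arrests = toy.loc[toy["Category"] == "Arrests", "Hacking (Data Breach)"]

# The filtered Series keep their original row labels (0, 2, 4 and 1, 3, 5).
# plt.plot pairs x and y by position, so the lengths have to match.
print(years.tolist())
print(occurrences.tolist(), list(occurrences.index))
print(arrests.tolist(), list(arrests.index))
```

If the lengths of x and y differ, for example because a year is missing for one category, plt.plot raises an error, so comparing len(x) with len(y1) first is a cheap sanity check.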

Let's add a legend and title to the graph.

x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,label="Occurrences") # <-- added
plt.plot(x,y2,label="Arrests") # <-- added
plt.legend() # <-- added
plt.title("Hacking (Data Breach) Occurrences and Arrests") # <-- added

Next, set the axis labels.

x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,label="Occurrences")
plt.plot(x,y2,label="Arrests")
plt.legend()
plt.xlabel("Year") # <-- added
plt.ylabel("Count") # <-- added
plt.title("Hacking (Data Breach) Occurrences and Arrests")

Finally, specify the line style and add a grid.

x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,"^--",label="Occurrences") # <-- added
plt.plot(x,y2,"o-",label="Arrests") # <-- added
plt.legend()
plt.xlabel("Year")
plt.ylabel("Count")
plt.title("Hacking (Data Breach) Occurrences and Arrests")
plt.grid() # <-- added
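
The strings "^--" and "o-" are Matplotlib format strings that combine a marker and a line style. The sketch below, which uses made-up numbers rather than the crime file, shows one line styled with a format string and one styled with the equivalent keyword arguments. The Agg backend is only there so the sketch runs without a display; in a notebook you can leave it out.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

# Invented numbers for illustration only; not taken from the crime file.
years = [2020, 2021, 2022]
occurrences = [30, 40, 50]
arrests = [10, 15, 20]

# Format string "^--": triangle-up marker ("^") plus dashed line ("--").
(line_a,) = plt.plot(years, occurrences, "^--", label="Occurrences")

# Keyword arguments: circle marker plus solid line,
# which is the same styling as the format string "o-".
(line_b,) = plt.plot(years, arrests, marker="o", linestyle="-", label="Arrests")

plt.legend()
plt.grid()

# Each Line2D object reports the style it was given.
print(line_a.get_marker(), line_a.get_linestyle())
print(line_b.get_marker(), line_b.get_linestyle())
```

The format string is shorter, while the keyword form is easier to read and makes it simple to add further options such as color or linewidth.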

3. Conclusion

In the next post, I plan to discuss how to draw multiple graphs at once in Matplotlib.

I hope this helps you practice data visualization.
