This time, we'll work through a data visualization exercise.
Wouldn't it be boring to use the same data again?
Let's use a new dataset.
1. File and Problem
The file contains annual crime occurrence and arrest statistics downloaded from the public data portal.
Let's draw a graph that meets the conditions below.
Visualize the occurrence and arrest counts for hacking (data breaches)
Set the title, x-axis label, y-axis label, and legend
Choose any line style you like
Any other styling is up to you

2. Problem Solution
First, let's copy the setup code from the previous post. The only change here is the file path.
import pandas as pd
# Import matplotlib and configure a Korean font
import matplotlib.pyplot as plt
import matplotlib
# Font settings for MacOS
# matplotlib.rcParams["font.family"] = "AppleGothic"
# Font settings for Windows
matplotlib.rcParams["font.family"] = "Malgun Gothic"
# Set font size
matplotlib.rcParams["font.size"] = 13
# Keep minus signs from rendering as broken glyphs when a Korean font is set
plt.rcParams['axes.unicode_minus'] = False
crime = pd.read_excel("./crime_statistics.xlsx")
crime.head(3)

When you check the data, you can see that each year appears twice, once for each category.
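
A quick way to confirm that duplication is to count how often each year appears. The sketch below uses a small made-up table instead of the real file: the `toy` name and all numbers are hypothetical, and only the column names follow the ones used in this post.

```python
import pandas as pd

# Hypothetical stand-in for the real file: same column names, made-up numbers.
toy = pd.DataFrame({
    "Year": [2019, 2020, 2021, 2019, 2020, 2021],
    "Category": ["Occurrences"] * 3 + ["Arrests"] * 3,
    "Hacking (Data Breach)": [30, 45, 40, 10, 12, 15],
})

# Each year should appear once per category, so every count is 2 here.
year_counts = toy["Year"].value_counts().sort_index()
print(year_counts)

# The distinct categories show where the duplication comes from.
categories = toy["Category"].unique().tolist()
print(categories)
```

If every year shows the same count and that count equals the number of categories, the table is in "long" form with one row per year and category.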

Remove the duplicate years with unique().
Then use loc with a boolean condition to extract the hacking occurrence and arrest counts, and draw the graph.
x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1)
plt.plot(x,y2)
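
The code above assumes the rows appear in the same year order within each category, so that x, y1, and y2 line up by position. As an alternative sketch (not the approach used in this post), pivot can align the two series by year explicitly. The table below is made up and deliberately shuffled; `toy` and `wide` are hypothetical names.

```python
import pandas as pd

# Hypothetical data with the years intentionally out of order.
toy = pd.DataFrame({
    "Year": [2021, 2019, 2020, 2019, 2021, 2020],
    "Category": ["Occurrences", "Occurrences", "Occurrences",
                 "Arrests", "Arrests", "Arrests"],
    "Hacking (Data Breach)": [40, 30, 45, 10, 15, 12],
})

# One row per year, one column per category; sort_index makes the order explicit.
wide = toy.pivot(
    index="Year", columns="Category", values="Hacking (Data Breach)"
).sort_index()

x = wide.index.to_numpy()
y1 = wide["Occurrences"].to_numpy()
y2 = wide["Arrests"].to_numpy()
print(wide)
```

With this layout, x, y1, and y2 can be passed to plt.plot exactly as before, and each value is guaranteed to sit next to its own year.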
Let's add a legend and title to the graph.
x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,label="Occurrences") # <-- added
plt.plot(x,y2,label="Arrests") # <-- added
plt.legend() # <-- added
plt.title("Hacking (Data Breach) Occurrences and Arrests") # <-- added
Next, set the axis labels.
x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,label="Occurrences")
plt.plot(x,y2,label="Arrests")
plt.legend()
plt.xlabel("Year") # <-- added
plt.ylabel("Count") # <-- added
plt.title("Hacking (Data Breach) Occurrences and Arrests")
Finally, specify the line style and add a grid.
x = crime.loc[:,"Year"].unique()
y1 = crime.loc[crime.Category=="Occurrences","Hacking (Data Breach)"]
y2 = crime.loc[crime.Category=="Arrests","Hacking (Data Breach)"]
plt.plot(x,y1,"^--",label="Occurrences") # <-- added
plt.plot(x,y2,"o-",label="Arrests") # <-- added
plt.legend()
plt.xlabel("Year")
plt.ylabel("Count")
plt.title("Hacking (Data Breach) Occurrences and Arrests")
plt.grid() # <-- added
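
The format strings "^--" and "o-" pack a marker and a line style into one argument. As a minimal sketch with made-up numbers, the same style can also be spelled out with the marker and linestyle keywords, which may be easier to read. The Agg backend line is only there so the sketch runs without a display; it can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove when plotting interactively
import matplotlib.pyplot as plt

# Made-up values, not taken from the real file.
years = [2019, 2020, 2021]
counts = [30, 45, 40]

fig, ax = plt.subplots()

# Short form: triangle marker ("^") plus dashed line ("--") in one string.
(line_short,) = ax.plot(years, counts, "^--", label="format string")

# Long form: the same marker and line style as separate keyword arguments.
(line_long,) = ax.plot(years, counts, marker="^", linestyle="--", label="keywords")

ax.legend()
ax.grid()
print(line_short.get_marker(), line_short.get_linestyle())
print(line_long.get_marker(), line_long.get_linestyle())
```

Both calls produce lines with the same marker and line style, so which form to use is a matter of taste.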
3. Conclusion
In the next post, I plan to cover how to draw multiple graphs at once in Matplotlib.
I hope this exercise helps you practice data visualization.