Part 2: Data Analysis with powerful Python – How to get the best results from your data
Analyzing and visualizing data from a SQLite database in Python can be a powerful way to gain insights and present your findings. In this blog, I will walk you through the steps to retrieve data from a SQLite database file named gold.db
and display it in the form of a chart using Python. We'll use some essential tools and libraries for this task.
Tools you will need !
Before we start, make sure you have the following tools and libraries installed:
Python: Python is the core programming language for this task. You can download it from python.org.
SQLite Database Browser: You can use a tool like DB Browser for SQLite to explore the database and its structure. This is optional but can be helpful for understanding the database schema.
Jupyter Notebook (optional): Jupyter Notebook is an interactive environment that makes data analysis and visualization easier. You can install it using pip:
pip install jupyter
.Python Libraries:
sqlite3
: This is a built-in library for Python that allows you to work with SQLite databases.pandas
: A popular data manipulation library for Python.matplotlib
: A widely used library for creating charts and visualizations.
You can install the required libraries using pip:
pip install sqlite3 pandas matplotlib
Steps to analyze and visualize data from SQLite database
Now, let's dive into the steps to analyze and display data from the gold.db
database:
Step 1: Connect to the database
First, you need to connect to the SQLite database using the sqlite3
library. Here's how you can do it:
python import sqlite3 # Connect to the database conn = sqlite3.connect("gold.db")
Step 2: Query the database to retrieve the required data
query = """ SELECT strftime('%Y-%m', o.OrderDate) AS Month, p.MetalType AS ProductType, SUM(od.Quantity * od.PriceAtTimeOfPurchase) AS TotalSales FROM Orders o JOIN OrderDetails od ON o.OrderID = od.OrderID JOIN Product p ON od.ProductID = p.ProductID GROUP BY Month, ProductType ORDER BY Month """
Execute the query and fetch the data into a Pandas dataframe:
import pandas as pd df = pd.read_sql_query(query, db_connection)
Step 3: Pivot the data for plotting
pivot_df = df.pivot(index='Month', columns='ProductType', values='TotalSales') pivot_df.fillna(0, inplace=True)
Step 4: Create a chart
Now, it's time to create a chart using the matplotlib
library. Let's say you want to create a line chart to visualize the type product prices over time:
import matplotlib.pyplot as pltplt. # Create a line chartfigure(figsize=(12, 6)) for product_type in pivot_df.columns: plt.plot(pivot_df.index, pivot_df[product_type], marker='o', linestyle='-', label=product_type) plt.xlabel('sales Month and year') plt.ylabel('Total Sales') plt.title('Total Product Sales by Month') plt.legend(loc='upper left') plt.grid(True) # Rotate x-axis labels for better readability plt.xticks(rotation=90) plt.show()
Step 5: Save or display the chart
You can choose to save the chart to a file using plt.savefig("gold_prices_chart.png")
or display it within a Jupyter Notebook if you're using one.
That's it! You have successfully analyzed data from the SQLite database and displayed it as a chart in Python. You can adapt these steps to your specific data and charting requirements. Remember to close the database connection when you're done:
# Close the database connection db_connection.close()
In this blog post, we've covered the essential steps and tools to analyze and visualize data from a SQLite database in Python. Data analysis and visualization are crucial skills for various fields, and Python makes it accessible and powerful for these tasks.