What You’ll Build
You’ll build a CSV sales analyzer that reads store sales data and prints it in manageable chunks. By the end, you’ll have a working Python program and understand fundamentals such as variables, loops, error handling, and package management.
This tutorial takes about 45 minutes. You’ll need basic command line skills and a text editor.
Prerequisites
Before starting, you need:
- A command line terminal (Terminal on Mac/Linux, PowerShell on Windows).
- A text editor (VS Code, Sublime Text, or even Notepad).
- Internet connection (for downloading Python and packages).
- 45 minutes of focused time.
Don’t worry if you’re new to programming. I’ll explain each step.
Step 1: Install and Verify Python
First, check if Python is already installed.
Open your terminal and run:
python3 --version
You should see output like:
Python 3.9.7
If you see an error like “command not found,” install Python first. Download the installer from python.org, run it, and on Windows make sure to check “Add Python to PATH” during installation.
After installing, run python3 --version again to verify. On Windows, the command may be python or py rather than python3.
Step 2: Create Your Project Folder
Create a folder for your project.
mkdir csv-analyzer
cd csv-analyzer
Verify you’re in the right folder:
pwd
You should see a path ending in csv-analyzer, like:
/Users/yourname/csv-analyzer
This keeps your project organized and isolated.
Step 3: Say Hello to Python
Let’s verify Python works by creating a simple program.
Create a file called hello.py:
touch hello.py
In PowerShell, touch is not available; run New-Item hello.py instead, or create the file from your editor.
Open hello.py in your text editor and add this line:
print("Hello, Python!")
Save the file and run it:
python3 hello.py
You should see:
Hello, Python!
If you see this, Python is working. If you see an error, check that you saved the file and that you’re in the csv-analyzer folder.
What you learned: The print() function displays text. You just wrote and ran your first Python program.
Why Python?
Python is a programming language known for readable syntax and powerful libraries. It’s popular for data analysis, web development, and automation. Created by Guido van Rossum in 1991, it’s now one of the most-used languages in the world.
You’re learning Python by building something real. Let’s keep going.
Step 4: Create Sample Sales Data
Create a CSV file with sample sales data.
Create a file called example.csv:
touch example.csv
Open example.csv and add this data:
store,sales
Office A,7
Office B,3
Office C,9
Office D,100
Office E,4
Office F,96
Office G,56
Office H,34
Office I,37
Office J,7
Save the file. This is the data your analyzer will process.
What you learned: CSV (Comma-Separated Values) files store data in rows and columns. The first row is the header (store, sales), and each following row is a record.
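To make the header-and-records structure concrete, here is a minimal sketch using Python’s built-in csv module. It reads a two-row sample held in memory rather than example.csv, so the sample text and variable names are illustrative assumptions, not part of the analyzer you’ll build.

```python
import csv
import io

# A two-row sample in the same shape as example.csv (header row, then records).
sample = "store,sales\nOffice A,7\nOffice B,3\n"

# DictReader treats the first row as the header and yields one dict per record.
reader = csv.DictReader(io.StringIO(sample))
header = reader.fieldnames
records = list(reader)

print(header)   # ['store', 'sales']
print(records)  # [{'store': 'Office A', 'sales': '7'}, {'store': 'Office B', 'sales': '3'}]
```

Note that the csv module returns every value as a string, so '7' would need int() before you could add it to anything.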
Step 5: Read the CSV File
Now make Python read the file.
Create a file called analyze.py:
touch analyze.py
Open analyze.py and add this code:
filename = "example.csv"
print(f"Reading {filename}")
try:
    with open(filename, "r") as file:
        content = file.read()
        print(content)
except FileNotFoundError:
    print(f"Error: {filename} not found. Check that the file exists.")
Run it:
python3 analyze.py
You should see:
Reading example.csv
store,sales
Office A,7
Office B,3
Office C,9
Office D,100
Office E,4
Office F,96
Office G,56
Office H,34
Office I,37
Office J,7
What you learned:
- Variables store values. filename = "example.csv" creates a variable called filename.
- f-strings format text with variables. f"Reading {filename}" inserts the filename into the text.
- try/except handles errors. If the file doesn’t exist, the program prints an error instead of crashing.
- with open() opens files safely. Python automatically closes the file when the with block ends.
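If you want to see the last two points in action, the sketch below may help. It writes a throwaway temporary file (an assumption made so the example doesn’t depend on example.csv), shows that the file is closed once the with block ends, and then triggers FileNotFoundError on purpose by opening the file after deleting it.

```python
import os
import tempfile

# Write a small throwaway file so the example does not depend on example.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("store,sales\nOffice A,7\n")
    path = tmp.name

with open(path, "r") as file:
    first_line = file.readline().strip()
    open_inside = not file.closed   # the file is still open inside the block

closed_after = file.closed          # Python closed it when the block ended

# Delete the file, then try to open it again to trigger the error path.
os.remove(path)
try:
    with open(path, "r") as file:
        file.read()
    missing_handled = False
except FileNotFoundError:
    missing_handled = True

print(first_line, open_inside, closed_after, missing_handled)
# store,sales True True True
```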
Step 6: Set Up a Virtual Environment
Before installing packages, create a virtual environment. This keeps your project’s packages separate from your system Python.
Run this command in your csv-analyzer folder:
python3 -m venv venv
This creates a folder called venv that holds your project’s packages.
Now activate the environment:
On Mac/Linux:
source venv/bin/activate
On Windows:
venv\Scripts\activate
You should see (venv) at the start of your terminal prompt:
(venv) user@laptop:~/csv-analyzer$
If you see this, your virtual environment is active.
What you learned: Virtual environments prevent package conflicts. Each project gets its own isolated Python environment. This is critical for professional Python development.
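Besides looking for the (venv) prefix, you can ask Python itself whether it is running inside a virtual environment. This is a small sketch for environments created with python3 -m venv; the function name in_virtual_environment is made up for illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv folder, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Virtual environment active: {in_virtual_environment()}")
```

Run it once with the environment activated and once after deactivating to see the difference.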
Note: If the command hangs on a network drive, move your folder to a local disk.
Step 7: Install the Pandas Library
Pandas is a powerful library for working with data. Install it:
pip install pandas
You should see output like this (your version numbers will likely differ):
Successfully installed pandas-2.0.0 numpy-1.24.0 ...
Verify it’s installed:
pip freeze
You should see a list including:
pandas==2.0.0
numpy==1.24.0
...
What you learned: pip is Python’s package installer. pip install downloads and installs packages. pip freeze shows installed packages and versions.
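You can also check an installed version from inside Python. The sketch below uses the standard library’s importlib.metadata (available in Python 3.8 and later); the helper name installed_version is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints whatever is installed in the active environment.
for name in ("pandas", "numpy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If pandas shows as not installed, the virtual environment is probably not active.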
Step 8: Parse CSV Data with Pandas
Now use Pandas to read and analyze the CSV file.
Open analyze.py and replace the contents with this:
import pandas as pd
filename = "example.csv"
print(f"Reading {filename}")
print()
try:
    data = pd.read_csv(filename)
    print(data)
except FileNotFoundError:
    print(f"Error: {filename} not found.")
Run it:
python3 analyze.py
You should see formatted output:
Reading example.csv

      store  sales
0  Office A      7
1  Office B      3
2  Office C      9
3  Office D    100
4  Office E      4
5  Office F     96
6  Office G     56
7  Office H     34
8  Office I     37
9  Office J      7
What you learned:
- import loads libraries. import pandas as pd loads Pandas and gives it a short name (pd).
- pd.read_csv() parses CSV files into a structured format called a DataFrame.
- Pandas automatically formats the data into neat columns with row numbers.
Step 9: Chunk the Data
For large CSV files, reading everything at once can be slow and can use more memory than you have. Let’s process the data in chunks.
Open analyze.py and replace the contents with this:
import pandas as pd
filename = "example.csv"
chunksize = 3
print(f"Reading {filename} in chunks of {chunksize} rows")
print()
try:
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        print(chunk)
        print()
except FileNotFoundError:
    print(f"Error: {filename} not found.")
Run it:
python3 analyze.py
You should see:
Reading example.csv in chunks of 3 rows

      store  sales
0  Office A      7
1  Office B      3
2  Office C      9

      store  sales
3  Office D    100
4  Office E      4
5  Office F     96

      store  sales
6  Office G     56
7  Office H     34
8  Office I     37

      store  sales
9  Office J      7
The data is now processed in groups of 3 rows.
What you learned:
- for loops repeat actions. for chunk in ... processes each chunk one at a time.
- chunksize tells Pandas to read the file in pieces instead of all at once.
- This technique is essential for processing large files that don’t fit in memory.
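To build intuition for what chunked reading means, here is a rough standard-library analogue. It is not how Pandas implements chunksize internally; it only illustrates pulling a fixed number of rows at a time so the whole file never has to sit in memory. The function name read_in_chunks and the in-memory sample are assumptions for illustration.

```python
import csv
import io
from itertools import islice


def read_in_chunks(file, chunksize):
    """Yield lists of at most `chunksize` rows (as dicts) from an open CSV file."""
    reader = csv.DictReader(file)
    while True:
        # islice pulls only the next `chunksize` rows from the reader.
        chunk = list(islice(reader, chunksize))
        if not chunk:
            break
        yield chunk


sample = io.StringIO(
    "store,sales\n"
    "Office A,7\nOffice B,3\nOffice C,9\nOffice D,100\n"
)
sizes = [len(chunk) for chunk in read_in_chunks(sample, chunksize=3)]
print(sizes)  # [3, 1]
```

Four rows in chunks of three gives one full chunk and one leftover row, the same pattern as the final single-row chunk in the Pandas output.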
Step 10: Calculate Total Sales
Let’s add analysis. Calculate the total sales for each chunk.
Open analyze.py and replace the contents with this:
import pandas as pd
filename = "example.csv"
chunksize = 3
total_sales = 0
print(f"Analyzing {filename}")
print()
try:
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        chunk_total = chunk['sales'].sum()
        total_sales += chunk_total
        print(f"Chunk total: {chunk_total}")
        print(chunk)
        print()
    print(f"Total sales across all chunks: {total_sales}")
except FileNotFoundError:
    print(f"Error: {filename} not found.")
Run it:
python3 analyze.py
You should see:
Analyzing example.csv

Chunk total: 19
      store  sales
0  Office A      7
1  Office B      3
2  Office C      9

Chunk total: 200
      store  sales
3  Office D    100
4  Office E      4
5  Office F     96

Chunk total: 127
      store  sales
6  Office G     56
7  Office H     34
8  Office I     37

Chunk total: 7
      store  sales
9  Office J      7

Total sales across all chunks: 353
What you learned:
- Accessing columns: chunk['sales'] gets the sales column from the chunk.
- sum() adds all values in a column.
- Accumulating values: total_sales += chunk_total adds each chunk’s total to a running sum.
Step 11: Save Your Dependencies
Other people (or future you) will need to know which packages your project uses.
Run this command:
pip freeze > requirements.txt
This creates a file called requirements.txt with all installed packages and versions.
View it:
cat requirements.txt
You should see something like this (the exact packages and versions depend on when you install):
numpy==1.24.0
pandas==2.0.0
python-dateutil==2.8.2
pytz==2023.3
six==1.16.0
Anyone can now install the same packages with:
pip install -r requirements.txt
What you learned: requirements.txt is a standard file that lists project dependencies. This makes your project reproducible.
Step 12: Leave the Virtual Environment
When you’re done working, deactivate the virtual environment:
deactivate
The (venv) prefix should disappear from your prompt. You’re back to your system Python.
You can reactivate anytime with source venv/bin/activate (Mac/Linux) or venv\Scripts\activate (Windows).
What You Built
You created a CSV analyzer that:
- Reads sales data from a file.
- Processes the data in chunks.
- Calculates totals for each chunk and overall.
- Handles errors gracefully.
You learned:
- Variables: Store values like filename = "example.csv".
- f-strings: Format text with variables like f"Reading {filename}".
- Functions: Like print(), open(), and sum().
- Loops: Repeat actions with for chunk in ....
- Exception handling: Catch errors with try/except.
- Libraries: Import and use external packages like Pandas.
- Virtual environments: Isolate project dependencies.
- Package management: Install packages with pip and track them with requirements.txt.
Troubleshooting
“command not found: python3”
Python isn’t installed or isn’t in your PATH. Download Python and check “Add Python to PATH” during installation. Verify with python3 --version.
“No module named pandas”
Your virtual environment isn’t activated, or Pandas isn’t installed. Run source venv/bin/activate (Mac/Linux) or venv\Scripts\activate (Windows), then run pip install pandas.
“FileNotFoundError”
The program can’t find example.csv. Check:
- You’re in the csv-analyzer folder when running the program.
- The file is named exactly example.csv (case matters).
- The file is in the same folder as analyze.py.
“venv/bin/activate: No such file or directory”
You haven’t created the virtual environment yet. Run python3 -m venv venv first.
Virtual environment command hangs
This can happen on network drives. Move your project folder to a local disk (like your home directory) and try again.
Where to Go Next
You’ve learned Python basics by building a real tool. Here’s what to explore next:
More Python Concepts
You used variables, loops, functions, and exception handling. Here are more concepts to learn:
- Data types: Integers, floats, booleans, lists, dictionaries.
- Classes: Create custom objects like class Office:.
- List comprehensions: Shorthand for creating lists like [x * 2 for x in numbers].
- Lambda functions: Short anonymous functions like lambda x: x * 2.
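As a quick illustration of the list comprehension shorthand mentioned above, this sketch builds the same list twice, once with a loop and once with a comprehension. The numbers list is sample data chosen for the example.

```python
numbers = [7, 3, 9]

# The loop version: build the list one item at a time.
doubled_loop = []
for x in numbers:
    doubled_loop.append(x * 2)

# The comprehension version: same result in one expression.
doubled = [x * 2 for x in numbers]

# An optional condition filters items as the list is built.
large = [x for x in numbers if x > 5]

print(doubled)  # [14, 6, 18]
print(large)    # [7, 9]
```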
Python Use Cases
Data analysis and machine learning: Pandas, NumPy, TensorFlow, scikit-learn.
Web development: Django, Flask, FastAPI.
Automation and scripting: Automate repetitive tasks, process files, interact with APIs.
Less common but possible: Mobile apps (Kivy), desktop apps (PyQt), games (PyGame), embedded systems (MicroPython).
Learning Resources
Books:
- Python Crash Course by Eric Matthes (beginner-friendly, project-based).
- Python for Data Analysis by Wes McKinney (creator of Pandas).
- Learning Python, 5th Edition by Mark Lutz (comprehensive reference).
Videos:
- Learning Python on LinkedIn Learning.
- Python Essential Training on LinkedIn Learning.
- Complete Python Developer in 2020: Zero to Mastery on Udemy.
Online:
- The official Python tutorial for core language features.
- The Zen of Python for Python philosophy.
- Python Package Index to discover libraries.
Extend Your Project
Challenge yourself by adding features to your analyzer:
- Filter stores with sales above a threshold.
- Sort stores by sales amount.
- Calculate average sales per store.
- Read the filename from command line arguments.
- Export results to a new CSV file.
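If you’d like a starting point for the first three ideas, here is one possible sketch using Pandas. The function name summarize, the threshold value, and the in-memory sample are assumptions for illustration; with your own data you would pass "example.csv" instead of the sample.

```python
import io

import pandas as pd


def summarize(source, threshold):
    """Return (stores above threshold sorted by sales, average sales)."""
    data = pd.read_csv(source)
    # Boolean indexing keeps only rows whose sales exceed the threshold.
    above = data[data["sales"] > threshold]
    # Sort the remaining rows from highest to lowest sales.
    ranked = above.sort_values("sales", ascending=False)
    # The average is taken over every store, not just the filtered ones.
    average = data["sales"].mean()
    return ranked, average


sample = io.StringIO(
    "store,sales\nOffice A,7\nOffice B,3\nOffice D,100\nOffice F,96\n"
)
ranked, average = summarize(sample, threshold=50)
print(ranked)
print(f"Average sales per store: {average}")
```

To export the result, ranked.to_csv("top_stores.csv", index=False) would write it to a new file; the filename here is only a placeholder.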
Example Code Repository
View the complete code for this tutorial in my repository on GitHub:
git clone git@github.com:jeffabailey/learn.git
cd learn/programming/python
Appendix: Python Quick Reference
Here are common Python constructs you’ll encounter. Use this as a reference after completing the tutorial. You don’t need to memorize these now. Come back to this section when you need to look something up.
Variables
office_name = "Office A"
office_sales = 7
office_score = 7.5
office_is_active = True
Python uses snake_case for variable names. See the naming section of Google’s style guide for conventions.
Comments
# Single-line comment
"""
Multi-line comment
for longer explanations
"""
Control Structures
For loop:
offices = ["Office A", "Office B", "Office C"]
for office in offices:
    print(office)
While loop:
offices = ["Office A", "Office B", "Office C"]
while offices:
    print(offices.pop())
If-else statement:
office_a_sales = 7
office_b_sales = 3
if office_b_sales > office_a_sales:
    print("Office B has more sales")
elif office_a_sales > office_b_sales:
    print("Office A has more sales")
else:
    print("Sales are equal")
Functions
def calculate_total(sales_list):
    total = sum(sales_list)
    return total
result = calculate_total([7, 3, 9])
print(result)  # 19
Classes
class Office:
    def __init__(self, name, location, sales):
        self.name = name
        self.location = location
        self.sales = sales
office = Office("Office A", "Portland, Oregon", 7)
print(f"Name: {office.name}")
print(f"Sales: {office.sales}")
Exception Handling
file = None
try:
    file = open("data.csv", "r")
    content = file.read()
except FileNotFoundError:
    print("File not found")
finally:
    # file is still None if open() failed, so only close it when it was opened
    if file:
        file.close()
Lists (Arrays)
offices = ["Office A", "Office B", "Office C"]
# Access
print(offices[0]) # Office A
# Update
offices[0] = "Office Z"
# Length
print(len(offices)) # 3
# Add
offices.append("Office D")
# Remove
offices.remove("Office B")
# Loop
for office in offices:
    print(office)
Operators
Arithmetic:
addition = 1 + 1
subtraction = 2 - 1
multiplication = 3 * 3
division = 10 / 5
modulus = 6 % 3
exponent = 2 ** 3
Assignment:
x = 1
x += 1 # x is now 2
x -= 1 # x is now 1
x *= 5 # x is now 5
x /= 5 # x is now 1.0
Comparison:
a == b # Equal
a != b # Not equal
a > b # Greater than
a < b # Less than
a >= b # Greater than or equal
a <= b # Less than or equal
Note: For type comparisons, use isinstance() instead of operators.
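The note about isinstance() can be illustrated with a short sketch. The sample values are arbitrary; the point is that isinstance() accepts a tuple of types and respects subclasses, which an exact comparison with type() does not.

```python
value = 7

# isinstance() checks the type and also accepts a tuple of types.
is_int = isinstance(value, int)
is_number = isinstance(value, (int, float))
text_is_int = isinstance("7", int)

# bool is a subclass of int, so isinstance(True, int) is True,
# while an exact type comparison is not.
bool_is_int = isinstance(True, int)
bool_exactly_int = type(True) == int

print(is_int, is_number, text_is_int)  # True True False
print(bool_is_int, bool_exactly_int)   # True False
```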
Lambda Functions
offices = [
{'name': 'Office A', 'sales': 7},
{'name': 'Office B', 'sales': 3},
{'name': 'Office C', 'sales': 9}
]
# Find office with highest sales
top_office = max(offices, key=lambda x: x['sales'])
print(top_office) # {'name': 'Office C', 'sales': 9}
Use lambdas for simple operations. For complex logic, use regular functions.
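Here is a small sketch of that guideline: a lambda as a one-line sort key, and a regular function once the logic needs branches. The sales_tier function and its cutoffs are hypothetical, chosen only to show the contrast.

```python
offices = [
    {"name": "Office A", "sales": 7},
    {"name": "Office B", "sales": 3},
    {"name": "Office C", "sales": 9},
]

# A lambda works well as a short sort key.
by_sales = sorted(offices, key=lambda office: office["sales"], reverse=True)


def sales_tier(office):
    """Hypothetical rule: label an office by its sales figure."""
    if office["sales"] >= 9:
        return "high"
    if office["sales"] >= 5:
        return "medium"
    return "low"


names = [office["name"] for office in by_sales]
tiers = [sales_tier(office) for office in by_sales]
print(names)  # ['Office C', 'Office A', 'Office B']
print(tiers)  # ['high', 'medium', 'low']
```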
Related Content
- Python Package Index to search for Python packages.
- The Zen of Python for Python philosophy.
- W3Schools Python Tutorial for more examples.
