{{< partial "learn_x_header" >}}

## What You'll Build

You'll build a CSV sales analyzer that reads store sales data, processes it in manageable chunks, and totals the sales. By the end, you'll have a working Python program and understand the fundamentals: variables, loops, exception handling, libraries, and package management.

This tutorial takes about 45 minutes. You'll need basic command line skills and a text editor.

## Prerequisites

Before starting, you need:

* A command line terminal (Terminal on Mac/Linux, PowerShell on Windows).
* A text editor (VS Code, Sublime Text, or even Notepad).
* Internet connection (for downloading Python and packages).
* 45 minutes of focused time.

Don't worry if you're new to programming. I'll explain each step.

## Step 1: Install and Verify Python

First, check if Python is already installed.

Open your terminal and run:

```sh
python3 --version
```

You should see output like:

```sh
Python 3.9.7
```

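If you see an error like "command not found," [install Python] first. Download the installer, run it, and (on Windows) make sure to check "Add Python to PATH" during installation. On Windows, the command may be `python` instead of `python3`, so try `python --version` if `python3` isn't recognized.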

After installing, run `python3 --version` again to verify.

## Step 2: Create Your Project Folder

Create a folder for your project.

```sh
mkdir csv-analyzer
cd csv-analyzer
```

Verify you're in the right folder:

```sh
pwd
```

You should see a path ending in `csv-analyzer`, like:

```sh
/Users/yourname/csv-analyzer
```

This keeps your project organized and isolated.

## Step 3: Say Hello to Python

Let's verify Python works by creating a simple program.

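Create a file called `hello.py`. PowerShell has no `touch` command, so on Windows use `New-Item hello.py` here and wherever `touch` appears in later steps: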

```sh
touch hello.py
```

Open `hello.py` in your text editor and add this line:

```python
print("Hello, Python!")
```

Save the file and run it:

```sh
python3 hello.py
```

You should see:

```sh
Hello, Python!
```

If you see this, Python is working. If you see an error, check that you saved the file and you're in the `csv-analyzer` folder.

**What you learned:** The `print()` function displays text. You just wrote and ran your first Python program.

## Why Python?

Python is a programming language known for readable syntax and powerful libraries. It's popular for data analysis, web development, and automation. First released by [Guido van Rossum] in 1991, it's now one of the most-used languages in the world.

You're learning Python by building something real. Let's keep going.

## Step 4: Create Sample Sales Data

Create a CSV file with sample sales data.

Create a file called `example.csv`:

```sh
touch example.csv
```

Open `example.csv` and add this data:

```csv
store,sales
Office A,7
Office B,3
Office C,9
Office D,100
Office E,4
Office F,96
Office G,56
Office H,34
Office I,37
Office J,7
```

Save the file. This is the data your analyzer will process.

**What you learned:** CSV (Comma-Separated Values) files store data in rows and columns. The first row is the header (store, sales), and each following row is a record.
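
To make the header-and-records idea concrete, here is a minimal sketch that uses Python's built-in `csv` module. It reads a shortened, in-memory copy of the data (the `sample` string is my assumption, not your `example.csv`), so you can run it anywhere.

```python
import csv
import io

# Same layout as example.csv, shortened to three records for illustration.
sample = "store,sales\nOffice A,7\nOffice B,3\nOffice C,9\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)   # first row: the column names
records = list(reader)  # remaining rows: one list of strings per record

print(header)      # ['store', 'sales']
print(records[0])  # ['Office A', '7']
```

Notice that the `csv` module returns every value as text, so the sales figure comes back as `'7'` rather than the number `7`.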

## Step 5: Read the CSV File

Now make Python read the file.

Create a file called `analyze.py`:

```sh
touch analyze.py
```

Open `analyze.py` and add this code:

```python
filename = "example.csv"
print(f"Reading {filename}")

try:
    with open(filename, "r") as file:
        content = file.read()
        print(content)
except FileNotFoundError:
    print(f"Error: {filename} not found. Check that the file exists.")
```

Run it:

```sh
python3 analyze.py
```

You should see:

```sh
Reading example.csv
store,sales
Office A,7
Office B,3
Office C,9
Office D,100
Office E,4
Office F,96
Office G,56
Office H,34
Office I,37
Office J,7
```

**What you learned:**

* **Variables** store values. `filename = "example.csv"` creates a variable called `filename`.
* **f-strings** format text with variables. `f"Reading {filename}"` inserts the filename into the text.
* **try/except** handles errors. If the file doesn't exist, the program prints an error instead of crashing.
* **with open()** opens files safely. Python automatically closes the file when done.
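
If you want to see those last two points in action, this sketch checks that `with open()` really closes the file and that `try/except` catches a missing file. It writes a throwaway temporary file instead of touching `example.csv`, and the `missing` path is a made-up name used only for this illustration.

```python
import os
import tempfile

# Write a tiny throwaway CSV so the example does not depend on example.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("store,sales\nOffice A,7\n")
    path = tmp.name

with open(path, "r") as file:
    content = file.read()
    print(file.closed)  # False: still inside the with block

print(file.closed)  # True: closed automatically on leaving the block

missing = path + ".missing"  # hypothetical name for a file that does not exist
try:
    with open(missing, "r") as other_file:
        other_file.read()
    outcome = "read"
except FileNotFoundError:
    outcome = "not found"

print(outcome)   # not found
os.remove(path)  # clean up the temporary file
```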

## Step 6: Set Up a Virtual Environment

Before installing packages, create a virtual environment. This keeps your project's packages separate from your system Python.

Run this command in your `csv-analyzer` folder:

```sh
python3 -m venv venv
```

This creates a folder called `venv` that holds your project's packages.

Now activate the environment:

On Mac/Linux:

```sh
source venv/bin/activate
```

On Windows:

```sh
venv\Scripts\activate
```

You should see `(venv)` at the start of your terminal prompt:

```sh
(venv) user@laptop:~/csv-analyzer$
```

If you see this, your virtual environment is active.

**What you learned:** Virtual environments prevent package conflicts. Each project gets its own isolated Python environment. This is critical for professional Python development.
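
If you're ever unsure whether the environment is active, you can ask Python itself. This is a small sketch, and the function name `in_virtual_environment` is my own. It relies on the fact that inside an environment created with `venv`, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Virtual environment active: {in_virtual_environment()}")
```

Save it to a file and run it with the environment active and again after deactivating to compare the results.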

**Note:** If the command hangs on a network drive, move your folder to a physical disk.

## Step 7: Install the Pandas Library

Pandas is a powerful library for working with data. Install it:

```sh
pip install pandas
```

You should see output like:

```sh
Successfully installed pandas-2.0.0 numpy-1.24.0 ...
```

Verify it's installed:

```sh
pip freeze
```

You should see a list including:

```sh
pandas==2.0.0
numpy==1.24.0
...
```

**What you learned:** `pip` is Python's package installer. `pip install` downloads and installs packages. `pip freeze` shows installed packages and versions.
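
You can also check an installed version from inside Python. The sketch below uses the standard library's `importlib.metadata` (available in Python 3.8 and later). The helper name `installed_version` is my own invention for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if pandas is installed in this environment, else None.
print(installed_version("pandas"))
```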

## Step 8: Parse CSV Data with Pandas

Now use Pandas to read and analyze the CSV file.

Open `analyze.py` and replace the contents with this:

```python
import pandas as pd

filename = "example.csv"
print(f"Reading {filename}")
print()

try:
    data = pd.read_csv(filename)
    print(data)
except FileNotFoundError:
    print(f"Error: {filename} not found.")
```

Run it:

```sh
python3 analyze.py
```

You should see formatted output:

```sh
Reading example.csv

      store  sales
0  Office A      7
1  Office B      3
2  Office C      9
3  Office D    100
4  Office E      4
5  Office F     96
6  Office G     56
7  Office H     34
8  Office I     37
9  Office J      7
```

**What you learned:**

* **import** loads libraries. `import pandas as pd` loads Pandas and gives it a short name (`pd`).
* **pd.read_csv()** parses CSV files into a structured format called a DataFrame.
* Pandas automatically formats the data into neat columns with row numbers.
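
To get a feel for what a DataFrame holds, here is a short sketch that inspects one. It assumes Pandas is installed (Step 7) and reads three rows from an in-memory string rather than from `example.csv`.

```python
import io

import pandas as pd

# Three rows in the same layout as example.csv, held in memory for illustration.
sample = io.StringIO("store,sales\nOffice A,7\nOffice B,3\nOffice C,9\n")
data = pd.read_csv(sample)

print(data.shape)              # (3, 2): three rows, two columns
print(list(data.columns))      # ['store', 'sales']
print(data["sales"].tolist())  # [7, 3, 9]
```

Pandas reads the `sales` column as integers, so you can do math on it right away.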

## Step 9: Chunk the Data

For large CSV files, reading everything at once can be slow or use more memory than your computer has. Let's process the data in chunks.

Open `analyze.py` and replace the contents with this:

```python
import pandas as pd

filename = "example.csv"
chunksize = 3

print(f"Reading {filename} in chunks of {chunksize} rows")
print()

try:
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        print(chunk)
        print()
except FileNotFoundError:
    print(f"Error: {filename} not found.")
```

Run it:

```sh
python3 analyze.py
```

You should see:

```sh
Reading example.csv in chunks of 3 rows

      store  sales
0  Office A      7
1  Office B      3
2  Office C      9

      store  sales
3  Office D    100
4  Office E      4
5  Office F     96

      store  sales
6  Office G     56
7  Office H     34
8  Office I     37

   store  sales
9  Office J      7
```

The data is now processed in groups of 3 rows.

**What you learned:**

* **for loops** repeat actions. `for chunk in ...` processes each chunk one at a time.
* **chunksize** tells Pandas to read the file in pieces instead of all at once.
* This technique is essential for processing large files that don't fit in memory.
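
To build intuition for what chunking means, here is a rough sketch of the same idea using only the standard library. This is not how Pandas implements `chunksize`. The helper `read_in_chunks` is hypothetical and only shows the concept of pulling a few rows at a time.

```python
import csv
import io
from itertools import islice


def read_in_chunks(lines, chunksize):
    """Yield lists of at most `chunksize` records, skipping the header row."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row if there is one
    while True:
        chunk = list(islice(reader, chunksize))  # pull the next few rows only
        if not chunk:
            return
        yield chunk


sample = io.StringIO("store,sales\nA,7\nB,3\nC,9\nD,100\n")
sizes = [len(chunk) for chunk in read_in_chunks(sample, 3)]
print(sizes)  # [3, 1]
```

Only one chunk is held in memory at a time, which is why the technique scales to big files.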

## Step 10: Calculate Total Sales

Let's add analysis. Calculate the total sales for each chunk, and keep a running total across all chunks.

Open `analyze.py` and replace the contents with this:

```python
import pandas as pd

filename = "example.csv"
chunksize = 3
total_sales = 0

print(f"Analyzing {filename}")
print()

try:
    for chunk in pd.read_csv(filename, chunksize=chunksize):
        chunk_total = chunk['sales'].sum()
        total_sales += chunk_total
        print(f"Chunk total: {chunk_total}")
        print(chunk)
        print()

    print(f"Total sales across all chunks: {total_sales}")
except FileNotFoundError:
    print(f"Error: {filename} not found.")
```

Run it:

```sh
python3 analyze.py
```

You should see:

```sh
Analyzing example.csv

Chunk total: 19
      store  sales
0  Office A      7
1  Office B      3
2  Office C      9

Chunk total: 200
      store  sales
3  Office D    100
4  Office E      4
5  Office F     96

Chunk total: 127
      store  sales
6  Office G     56
7  Office H     34
8  Office I     37

Chunk total: 7
   store  sales
9  Office J      7

Total sales across all chunks: 353
```

**What you learned:**

* **Accessing columns:** `chunk['sales']` gets the sales column from the chunk.
* **sum()** adds all values in a column.
* **Accumulating values:** `total_sales += chunk_total` adds each chunk's total to a running sum.
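
A quick way to convince yourself that chunking doesn't change the answer is to compare the chunked total with the total from a single read. This sketch assumes Pandas is installed and uses an in-memory copy of the tutorial data.

```python
import io

import pandas as pd

csv_text = (
    "store,sales\n"
    "Office A,7\nOffice B,3\nOffice C,9\nOffice D,100\nOffice E,4\n"
    "Office F,96\nOffice G,56\nOffice H,34\nOffice I,37\nOffice J,7\n"
)

# Sum chunk by chunk, keeping a running total.
chunked_total = 0
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=3):
    chunked_total += chunk["sales"].sum()

# Sum the whole file in one read.
full_total = pd.read_csv(io.StringIO(csv_text))["sales"].sum()

print(chunked_total, full_total)  # 353 353
```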

## Step 11: Save Your Dependencies

Other people (or future you) will need to know which packages your project uses.

Run this command:

```sh
pip freeze > requirements.txt
```

This creates a file called `requirements.txt` with all installed packages and versions.

View it:

```sh
cat requirements.txt
```

You should see something like this (your package list and versions may differ):

```sh
numpy==1.24.0
pandas==2.0.0
python-dateutil==2.8.2
pytz==2023.3
six==1.16.0
```

Anyone can now install the same packages with:

```sh
pip install -r requirements.txt
```

**What you learned:** `requirements.txt` is a standard file that lists project dependencies. This makes your project reproducible.
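
The file format is simple enough to read yourself. As an illustration only, this sketch turns pinned lines shaped like `name==version` into a dictionary. The `parse_requirements` helper is hypothetical and ignores everything else the real format allows, so keep using `pip` for actual installs.

```python
def parse_requirements(text):
    """Map package names to pinned versions for lines shaped like name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, separator, pinned = line.partition("==")
        if separator:  # ignore lines that are not exact pins
            pins[name] = pinned
    return pins


sample = "numpy==1.24.0\npandas==2.0.0\n"
print(parse_requirements(sample))  # {'numpy': '1.24.0', 'pandas': '2.0.0'}
```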

## Step 12: Leave the Virtual Environment

When you're done working, deactivate the virtual environment:

```sh
deactivate
```

The `(venv)` prefix should disappear from your prompt. You're back to your system Python.

You can reactivate anytime with `source venv/bin/activate` (Mac/Linux) or `venv\Scripts\activate` (Windows).

## What You Built

You created a CSV analyzer that:

* Reads sales data from a file.
* Processes the data in chunks.
* Calculates totals for each chunk and overall.
* Handles errors gracefully.

You learned:

* **Variables:** Store values like `filename = "example.csv"`.
* **f-strings:** Format text with variables like `f"Reading {filename}"`.
* **Functions:** Like `print()`, `open()`, and `sum()`.
* **Loops:** Repeat actions with `for chunk in ...`.
* **Exception handling:** Catch errors with `try/except`.
* **Libraries:** Import and use external packages like Pandas.
* **Virtual environments:** Isolate project dependencies.
* **Package management:** Install packages with `pip` and track them with `requirements.txt`.

## Troubleshooting

**"command not found: python3"**

Python isn't installed or isn't in your PATH. [Download Python](https://www.python.org/downloads/) and check "Add Python to PATH" during installation. Verify with `python3 --version`.

**"No module named pandas"**

Your virtual environment isn't activated, or Pandas isn't installed. Run `source venv/bin/activate` (Mac/Linux) or `venv\Scripts\activate` (Windows), then run `pip install pandas`.

**"FileNotFoundError"**

The program can't find `example.csv`. Check:

* You're in the `csv-analyzer` folder when running the program.
* The file is named exactly `example.csv` (case matters).
* The file is in the same folder as `analyze.py`.
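
If you're still stuck, a few lines of Python can tell you where the program is looking. This is a small diagnostic sketch, and the function name `describe_file` is my own.

```python
from pathlib import Path


def describe_file(filename):
    """Return a short message saying whether `filename` exists in the current folder."""
    path = Path.cwd() / filename
    if path.is_file():
        return f"Found {path}"
    return f"Missing {path}"


print(f"Current folder: {Path.cwd()}")
print(describe_file("example.csv"))
```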

**"venv/bin/activate: No such file or directory"**

You haven't created the virtual environment yet. Run `python3 -m venv venv` first.

**Virtual environment command hangs**

This can happen on network drives. Move your project folder to a physical disk (like your home directory) and try again.

## Where to Go Next

You've learned Python basics by building a real tool. Here's what to explore next:

### More Python Concepts

You used variables, loops, functions, and exception handling. Here are more concepts to learn:

* **Data types:** Integers, floats, booleans, lists, dictionaries.
* **Classes:** Create custom objects like `class Office:`.
* **List comprehensions:** Shorthand for creating lists like `[x * 2 for x in numbers]`.
* **Lambda functions:** Short anonymous functions like `lambda x: x * 2`.
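
Here is a quick taste of a few of these, using made-up sales numbers:

```python
sales = [7, 3, 9]                           # list of integers
doubled = [amount * 2 for amount in sales]  # list comprehension
by_store = {"Office A": 7, "Office B": 3}   # dictionary: store name -> sales

print(doubled)               # [14, 6, 18]
print(by_store["Office A"])  # 7
print(type(7.5).__name__, type(True).__name__)  # float bool
```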

### Python Use Cases

**Data analysis and machine learning:** Pandas, NumPy, TensorFlow, scikit-learn.

**Web development:** Django, Flask, FastAPI.

**Automation and scripting:** Automate repetitive tasks, process files, interact with APIs.

**Less common but possible:** Mobile apps (Kivy), desktop apps (PyQt), games (PyGame), embedded systems (MicroPython).

### Learning Resources

**Books:**

* [Python Crash Course] by Eric Matthes (beginner-friendly, project-based).
* [Python for Data Analysis] by Wes McKinney (creator of Pandas).
* [Learning Python, 5th Edition] by Mark Lutz (comprehensive reference).

**Videos:**

* [Learning Python] on LinkedIn Learning.
* [Python Essential Training] on LinkedIn Learning.
* [Complete Python Developer in 2020: Zero to Mastery] on Udemy.

**Online:**

* [The official Python tutorial](https://docs.python.org/3/tutorial/) for core language features.
* [The Zen of Python](https://www.python.org/dev/peps/pep-0020/) for Python philosophy.
* [Python Package Index] to discover libraries.

### Extend Your Project

Challenge yourself by adding features to your analyzer:

* Filter stores with sales above a threshold.
* Sort stores by sales amount.
* Calculate average sales per store.
* Read the filename from command line arguments.
* Export results to a new CSV file.
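
If you'd like a nudge on the command line challenge, here is one possible starting point using the standard library's `argparse`. It's a sketch, not the only way to do it. The `--chunksize` option and the `parse_args` helper are my own additions.

```python
import argparse


def parse_args(argv=None):
    """Parse the CSV filename and an optional chunk size from the command line."""
    parser = argparse.ArgumentParser(description="Analyze a CSV file of store sales.")
    parser.add_argument("filename", nargs="?", default="example.csv",
                        help="CSV file to analyze")
    parser.add_argument("--chunksize", type=int, default=3,
                        help="number of rows to read at a time")
    # With argv=None, argparse reads the real command line arguments.
    return parser.parse_args(argv)


# Passing a list here simulates: python3 analyze.py sales.csv --chunksize 5
args = parse_args(["sales.csv", "--chunksize", "5"])
print(args.filename, args.chunksize)  # sales.csv 5
```

In your own script, call `parse_args()` with no arguments and use `args.filename` in place of the hard-coded `filename`.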

### Example Code Repository

View the complete code for this tutorial in [my repository on GitHub]:

```sh
git clone git@github.com:jeffabailey/learn.git
cd learn/programming/python
```

## Appendix: Python Quick Reference

Here are common Python constructs you'll encounter. Use this as a reference after completing the tutorial. You don't need to memorize these now. Come back to this section when you need to look something up.

### Variables

```python
office_name = "Office A"
office_sales = 7
office_score = 7.5
office_is_active = True
```

Python uses [snake_case] for variable names. See [the naming section of Google's style guide] for conventions.

### Comments

```python
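# Single-line comment

# Python has no dedicated multi-line comment syntax.
# Stack single-line comments for longer explanations.

"""
A bare triple-quoted string is sometimes used as a block comment,
but it is really a string literal that Python evaluates and discards.
"""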
```

### Control Structures

**For loop:**

```python
offices = ["Office A", "Office B", "Office C"]
for office in offices:
    print(office)
```

**While loop:**

```python
offices = ["Office A", "Office B", "Office C"]
while offices:
    print(offices.pop())
```

**If-else statement:**

```python
office_a_sales = 7
office_b_sales = 3

if office_b_sales > office_a_sales:
    print("Office B has more sales")
elif office_a_sales > office_b_sales:
    print("Office A has more sales")
else:
    print("Sales are equal")
```

### Functions

```python
def calculate_total(sales_list):
    total = sum(sales_list)
    return total

result = calculate_total([7, 3, 9])
print(result)  # 19
```

### Classes

```python
class Office:
    def __init__(self, name, location, sales):
        self.name = name
        self.location = location
        self.sales = sales

office = Office("Office A", "Portland, Oregon", 7)
print(f"Name: {office.name}")
print(f"Sales: {office.sales}")
```

### Exception Handling

```python
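file = None  # defined up front so the finally block can always check it
try:
    file = open("data.csv", "r")
    content = file.read()
except FileNotFoundError:
    print("File not found")
finally:
    # Runs whether or not an error occurred
    if file:
        file.close()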

### Lists (Arrays)

```python
offices = ["Office A", "Office B", "Office C"]

# Access
print(offices[0])  # Office A

# Update
offices[0] = "Office Z"

# Length
print(len(offices))  # 3

# Add
offices.append("Office D")

# Remove
offices.remove("Office B")

# Loop
for office in offices:
    print(office)
```

### Operators

**Arithmetic:**

```python
addition = 1 + 1        # 2
subtraction = 2 - 1     # 1
multiplication = 3 * 3  # 9
division = 10 / 5       # 2.0 (division always returns a float)
modulus = 6 % 3         # 0 (remainder)
exponent = 2 ** 3       # 8
```

**Assignment:**

```python
x = 1
x += 1  # x is now 2
x -= 1  # x is now 1
x *= 5  # x is now 5
x /= 5  # x is now 1.0
```

**Comparison:**

```python
a == b  # Equal
a != b  # Not equal
a > b   # Greater than
a < b   # Less than
a >= b  # Greater than or equal
a <= b  # Less than or equal
```

**Note:** To check an object's type, use [isinstance()][the isinstance built-in function] instead of comparing types with `==`.
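
A short illustration of `isinstance()`, with variable names chosen only for this example:

```python
sales = 7

is_int = isinstance(sales, int)              # True
is_number = isinstance(sales, (int, float))  # True: a tuple checks several types
text_is_int = isinstance("7", int)           # False: "7" is a string
flag_is_int = isinstance(True, int)          # True: bool is a subclass of int

print(is_int, is_number, text_is_int, flag_is_int)  # True True False True
```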

### Lambda Functions

```python
offices = [
    {'name': 'Office A', 'sales': 7},
    {'name': 'Office B', 'sales': 3},
    {'name': 'Office C', 'sales': 9}
]

# Find office with highest sales
top_office = max(offices, key=lambda x: x['sales'])
print(top_office)  # {'name': 'Office C', 'sales': 9}
```

Use lambdas for simple operations. For complex logic, use regular functions.
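
For comparison, here is the same kind of key written both ways. The named function `sales_of` is my own example. Both versions sort the offices from lowest to highest sales.

```python
offices = [
    {'name': 'Office A', 'sales': 7},
    {'name': 'Office B', 'sales': 3},
    {'name': 'Office C', 'sales': 9}
]


def sales_of(office):
    """Return the sales figure used as the sort key."""
    return office['sales']


by_lambda = sorted(offices, key=lambda office: office['sales'])
by_function = sorted(offices, key=sales_of)

print([office['name'] for office in by_function])  # ['Office B', 'Office A', 'Office C']
print(by_lambda == by_function)                    # True
```

A named function gives you a place for a docstring and is easier to test, which matters more as the logic grows.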

## Related Content

* [Python Package Index] to search for Python packages.
* [The Zen of Python](https://www.python.org/dev/peps/pep-0020/) for Python philosophy.
* [W3Schools Python Tutorial](https://www.w3schools.com/python/) for more examples.

[Guido van Rossum]: https://en.wikipedia.org/wiki/Guido_van_Rossum
[install Python]: https://www.python.org/downloads/
[Python Crash Course]: https://amzn.to/3d2s9kw
[Python for Data Analysis]: https://amzn.to/2TxtmZc
[Learning Python, 5th Edition]: https://amzn.to/3edZhFX
[Learning Python]: https://www.linkedin.com/learning/learning-python-25309312
[Python Essential Training]: https://www.linkedin.com/learning/python-essential-training-2?u=2130809
[Complete Python Developer in 2020: Zero to Mastery]: https://www.udemy.com/course/complete-python-developer-zero-to-mastery/
[my repository on GitHub]: https://github.com/jeffabailey/learn
[Python Package Index]: https://pypi.org/
[snake_case]: https://peps.python.org/pep-0008/#naming-conventions
[the naming section of Google's style guide]: https://google.github.io/styleguide/pyguide.html#316-naming
[the isinstance built-in function]: https://docs.python.org/3.7/library/functions.html#isinstance