Data¶
Unit 2 Vocabulary¶
1. Data Representation
| Term | Definition |
|---|---|
| Binary | A number system using only two digits: 0 and 1. It’s used to represent data in computers. |
| Bit | The smallest unit of data in a computer, represented as either 0 or 1. |
| Byte | A group of 8 bits. It is the basic unit of storage. |
| Character Encoding | A system for representing characters (letters, digits, symbols) as binary data. Examples include ASCII and Unicode. |
| Hexadecimal | A base-16 number system often used to represent binary data in a more human-readable form (using digits 0-9 and letters A-F). |
2. Encoding and Decoding
| Term | Definition |
|---|---|
| ASCII | A character encoding standard that represents text using 7-bit codes (128 characters), usually stored one character per 8-bit byte. |
| Decoding | The process of converting machine-readable data (binary) back into a human-readable format. |
| Encoding | The process of converting data from a human-readable format into a machine-readable format. |
| Unicode | A character encoding standard that represents a wider range of characters, including non-English letters, symbols, and emoji. |
3. Compression and Efficiency
| Term | Definition |
|---|---|
| Data Compression | The process of reducing the size of data in order to save storage space or transmission time. |
| Huffman Coding | A lossless data compression algorithm that assigns shorter binary codes to more frequent characters and longer codes to less frequent characters. |
| Lossless Compression | Data is compressed without losing any information (e.g., ZIP files, Huffman coding). |
| Lossy Compression | Some data is lost in the compression process, but the file size is significantly reduced (e.g., JPEG images, MP3 audio). |
4. Data Processing and Transformation
| Term | Definition |
|---|---|
| Data Abstraction | Simplifying complex data by providing only the essential details. |
| Data Transformation | The process of changing data from one format to another. |
| Data Visualization | The graphical representation of data (charts, graphs, maps) to make information easier to understand. |
5. File Formats and Storage
| Term | Definition |
|---|---|
| CSV | A simple text-based file format for storing tabular data, with values separated by commas. |
| File Format | A specification that defines how data is encoded and stored in a file. |
| JSON | A lightweight, text-based format for storing and exchanging data, commonly used in web development. |
| XML | A markup language used for encoding documents in a format that is both human-readable and machine-readable. |
6. Algorithms and Problem-Solving
| Term | Definition |
|---|---|
| Algorithm | A step-by-step procedure or formula for solving a problem or completing a task. |
| Data Structures | The organization and storage of data for efficient access and modification. |
| Efficiency | How well an algorithm performs, often measured in time and memory usage (e.g., big-O notation). |
7. Security and Privacy
| Term | Definition |
|---|---|
| Adware | Displays unwanted ads, often with tracking. |
| Brute Force | Trying all possible keys to decrypt a message. |
| Caesar Cipher | A substitution cipher that shifts letters by a fixed amount. |
| Cipher | A method for performing encryption or decryption. |
| Ciphertext | The encrypted (scrambled) message. |
| Cookies | Small files stored on your device to track activity and preferences. |
| Cryptanalysis | The art of decoding encrypted messages without the key. |
| Data Breach | When sensitive information is accessed or released without authorization. |
| Decryption | Converting encrypted data back into readable form. |
| Encryption | Converting information into a code to prevent unauthorized access. |
| Frequency Analysis | A method of breaking substitution ciphers by studying how often letters appear. |
| Hashing | Converting data into a fixed-size value (a hash) used to verify integrity. |
| Keylogger | A program that records everything you type, often used to steal passwords. |
| Keyword | A word used to vary the shifts in a polyalphabetic cipher such as the Vigenère cipher. |
| Malware | Malicious software designed to damage or gain unauthorized access to a system. |
| Modular Arithmetic | Arithmetic that wraps around after reaching a fixed number. Letter ciphers use modulo 26 (the number of letters in the alphabet). |
| Modulus | A number used to link the public and private keys in RSA. |
| Phishing | A cyber attack that tricks users into giving up personal or sensitive data. |
| Plaintext | The original readable message. |
| Polyalphabetic | Involving more than one substitution alphabet. |
| Private Key | Used to decrypt data in RSA. Kept secret. |
| Public Key | Used to encrypt data in RSA encryption. Shared with everyone. |
| Repeating Key | The keyword repeats to match the length of the message. |
| Ransomware | Locks data and demands payment. |
| Rogue Access Point | A fake Wi-Fi network set up to trick people into connecting so that an attacker can steal their data. |
| Rootkits | Hide malicious activities from detection. |
| Spyware | Secretly monitors user activity. |
| Substitution Cipher | A cipher that replaces each letter with another. |
| Trojans | Disguise themselves as legitimate software. |
| Viruses | Attach to clean files and spread. |
| Worms | Self-replicate and spread across networks. |
8. Data Representation in Networks
| Term | Definition |
|---|---|
| IP Address | A unique identifier assigned to each device on a network. |
| Packet | A small chunk of data sent over a network, which may be part of a larger message. |
| Protocol | A set of rules governing how data is transmitted over a network (e.g., HTTP, FTP, TCP/IP). |
9. Big Data Concepts
| Term | Definition |
|---|---|
| Big Data | Large and complex data sets that traditional data processing software can’t handle efficiently. |
| Cloud Computing | The delivery of computing services (storage, processing, etc.) over the internet, often used for managing big data. |
10. Data Ethics
| Term | Definition |
|---|---|
| Bias in Data | When data collection or processing methods lead to unfair or skewed results. |
| Personally Identifiable Information (PII) | Any data that could be used to identify an individual (e.g., name, address, SSN). |
| Privacy | The protection of individuals’ personal information, ensuring responsible collection and sharing (e.g., GDPR). |
Data Project 1 - Data Communication and Compression:¶
The project will simulate a “Data Communication and Compression” system where you process, compress, and extract information from a text-based dataset. The project will be divided into sections based on the core topics and you will build a program in Python to demonstrate these concepts.
Objectives
Students will be able to:
Understand the binary number system (base-2)
Convert decimal numbers to binary (and vice versa)
Understand how text is represented in binary using ASCII
Write a Python program to convert text into binary
Introduction
What is the smallest piece of data a computer can understand?
Why do you think computer engineers used this idea?
Binary Exploration
How many different values can we make using just one bit?
Each bit will double the number of possible values.
| Bits | Possible Values |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 8 |
| 8 | 256 |
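The doubling pattern in the table can be checked with a short Python sketch. The helper name `possible_values` is hypothetical and chosen only for illustration.

```python
def possible_values(bits):
    """Number of distinct values that `bits` bits can represent (2 to the power of bits)."""
    return 2 ** bits


# Print the same rows as the table above
for bits in (1, 2, 3, 8):
    print(bits, "bit(s) ->", possible_values(bits), "possible values")
```

Each extra bit multiplies the count by 2, which is why 8 bits give 256 values.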
What does 8 bits look like?
| bit 1 | bit 2 | bit 3 | bit 4 | bit 5 | bit 6 | bit 7 | bit 8 |
|---|---|---|---|---|---|---|---|
| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
| 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
The binary representation of 130 is 10000010 (128 + 2).
Part 1: Binary Number System¶
1. Decimal vs. Binary
Decimal (base-10) numbers are made using place values that are powers of 10 (ones, tens, hundreds…).
Binary (base-2) uses powers of 2.
| Binary | Power of 2 | Value |
|---|---|---|
| 1 | 2⁰ | 1 |
| 0 | 2¹ | 0 |
| 1 | 2² | 4 |
| 1 | 2³ | 8 |
| Total | | 13 |
Example: 1101 = 8 + 4 + 0 + 1 = 13
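The place-value idea can also be sketched in Python. The function names `binary_to_decimal` and `decimal_to_binary` are hypothetical, and repeated division by 2 is just one common way to go from decimal to binary, not necessarily the method used in class.

```python
def binary_to_decimal(bits):
    """Add up the power of 2 for every position that holds a 1."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        if digit == "1":
            total += 2 ** position
    return total


def decimal_to_binary(number):
    """Divide by 2 repeatedly; the remainders, read backwards, are the bits."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        digits.append(str(number % 2))
        number //= 2
    return "".join(reversed(digits))


print(binary_to_decimal("1101"))   # 8 + 4 + 0 + 1
print(decimal_to_binary(130))      # the 8-bit example from the introduction
```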
2. Activity: Convert Decimal ↔ Binary
Practice converting the following:
Decimal to binary: 5, 10, 23
Binary to decimal: 101, 1110, 10011
Part 2: How Text Becomes Binary¶
1. ASCII Encoding
Explain that each character (like “A” or “!”) has a number assigned by ASCII.
| Character | ASCII Code | Binary |
|---|---|---|
| A | 65 | 01000001 |
| a | 97 | 01100001 |
| ! | 33 | 00100001 |
In Python you can type ord('A') and chr(65) to see the conversion.
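As a quick illustration, the sketch below rebuilds the rows of the small table above. The helper name `ascii_row` is hypothetical.

```python
def ascii_row(character):
    """Return the ASCII code and 8-bit binary string for one character."""
    code = ord(character)           # character -> number
    bits = format(code, "08b")      # number -> 8-digit binary string
    return code, bits


for character in "Aa!":
    code, bits = ascii_row(character)
    # chr() goes the other way: number -> character
    print(character, code, bits, chr(code))
```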
ASCII Table
Binary/ ASCII/ and other things
| Dec | Char | Dec | Char | Dec | Char | Dec | Char |
|---|---|---|---|---|---|---|---|
| 0 | NUL (null) | 32 | SPACE | 64 | @ | 96 | ` |
| 1 | SOH (start of heading) | 33 | ! | 65 | A | 97 | a |
| 2 | STX (start of text) | 34 | " | 66 | B | 98 | b |
| 3 | ETX (end of text) | 35 | # | 67 | C | 99 | c |
| 4 | EOT (end of transmission) | 36 | $ | 68 | D | 100 | d |
| 5 | ENQ (enquiry) | 37 | % | 69 | E | 101 | e |
| 6 | ACK (acknowledge) | 38 | & | 70 | F | 102 | f |
| 7 | BEL (bell) | 39 | ' | 71 | G | 103 | g |
| 8 | BS (backspace) | 40 | ( | 72 | H | 104 | h |
| 9 | TAB (horizontal tab) | 41 | ) | 73 | I | 105 | i |
| 10 | LF (NL line feed, new line) | 42 | * | 74 | J | 106 | j |
| 11 | VT (vertical tab) | 43 | + | 75 | K | 107 | k |
| 12 | FF (NP form feed, new page) | 44 | , | 76 | L | 108 | l |
| 13 | CR (carriage return) | 45 | - | 77 | M | 109 | m |
| 14 | SO (shift out) | 46 | . | 78 | N | 110 | n |
| 15 | SI (shift in) | 47 | / | 79 | O | 111 | o |
| 16 | DLE (data link escape) | 48 | 0 | 80 | P | 112 | p |
| 17 | DC1 (device control 1) | 49 | 1 | 81 | Q | 113 | q |
| 18 | DC2 (device control 2) | 50 | 2 | 82 | R | 114 | r |
| 19 | DC3 (device control 3) | 51 | 3 | 83 | S | 115 | s |
| 20 | DC4 (device control 4) | 52 | 4 | 84 | T | 116 | t |
| 21 | NAK (negative acknowledge) | 53 | 5 | 85 | U | 117 | u |
| 22 | SYN (synchronous idle) | 54 | 6 | 86 | V | 118 | v |
| 23 | ETB (end of trans. block) | 55 | 7 | 87 | W | 119 | w |
| 24 | CAN (cancel) | 56 | 8 | 88 | X | 120 | x |
| 25 | EM (end of medium) | 57 | 9 | 89 | Y | 121 | y |
| 26 | SUB (substitute) | 58 | : | 90 | Z | 122 | z |
| 27 | ESC (escape) | 59 | ; | 91 | [ | 123 | { |
| 28 | FS (file separator) | 60 | < | 92 | \ | 124 | \| |
| 29 | GS (group separator) | 61 | = | 93 | ] | 125 | } |
| 30 | RS (record separator) | 62 | > | 94 | ^ | 126 | ~ |
| 31 | US (unit separator) | 63 | ? | 95 | _ | 127 | DEL |
2. Activity: Binary Message
Use the handout to practice binary and ASCII.
Part 3: Programming - Convert Text to Binary¶
Objective: Write a program that:
Takes a message as input
Converts each character to its ASCII number
Converts that number to 8-bit binary
Displays the final binary string
Step 1: Create the function
Start by creating a function that takes one input (a string message).
def text_to_binary(message):
Step 2: Make an empty list to hold binary codes
Inside the function, make an empty list to save binary versions of each character.
    binary_list = []
Step 3: Loop through each letter in the message
Use a for loop to go through each character in the message one by one.
    for char in message:
Step 4: Convert the character to ASCII code
Use the ord() function to get the ASCII number for that character.
        ascii_number = ord(char)
Step 5: Convert the ASCII number to binary (8 digits)
Use the format() function to change the number into 8-digit binary.
        binary_code = format(ascii_number, '08b')
Step 6: Add the binary code to the list
Now add the binary version to the list you created earlier.
        binary_list.append(binary_code)
Step 7: Join the binary codes into one string
Once the loop is done, combine everything into one string, with a space between each binary number.
    final_binary = ' '.join(binary_list)
Step 8: Return the final binary string
Return the complete string from the function.
    return final_binary
Step 9: Call the function and show the result
Try it out with a message like "Hello, World!" and print the result.
result = text_to_binary("Hello, World!")
print(result)
Example Output:
01001000 01100101 01101100 01101100 01101111 00101100 00100000 01010111 01101111 01110010 01101100 01100100 00100001
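Putting Steps 1 through 9 together, with the indentation Python needs, the finished program might look like the sketch below. One caveat: ord() returns a Unicode code point, which matches ASCII only for codes 0 to 127, so characters outside that range may need more than 8 bits.

```python
def text_to_binary(message):
    binary_list = []                                  # Step 2: holds one code per character
    for char in message:                              # Step 3: visit each character
        ascii_number = ord(char)                      # Step 4: character -> number
        binary_code = format(ascii_number, '08b')     # Step 5: number -> 8-digit binary
        binary_list.append(binary_code)               # Step 6: save it
    final_binary = ' '.join(binary_list)              # Step 7: join with spaces
    return final_binary                               # Step 8: hand back the result


# Step 9: call the function and show the result
result = text_to_binary("Hello, World!")
print(result)
```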
Extensions (Optional)
Encode and decode secret messages
Create a visual binary art project using black/white squares
Part 4: Programming - Convert Binary to Text¶
Learning Objectives
Students will be able to:
Explain how binary represents ASCII characters.
Convert binary strings into integers using int(binary, 2).
Translate integers into ASCII characters with chr().
Recall:
Binary: Base-2 number system (only 0 and 1).
ASCII: Each character has a decimal code (e.g., A = 65, a = 97).
Computers store text as binary numbers corresponding to these ASCII values.
Example:
Letter A → Decimal 65 → Binary 01000001.
Step 1: Understanding the Conversion
We need to go binary → decimal → ASCII character.
Binary to Decimal: int("01000001", 2) → 65
Decimal to ASCII: chr(65) → "A"
Step 2: Single Character Conversion
# Convert a single 8-bit binary number to ASCII
binary_num = "01000001" # Binary for 'A'
decimal_num = int(binary_num, 2) # Convert to decimal (65)
ascii_char = chr(decimal_num) # Convert to character ('A')
print(ascii_char)
Output: A
Step 3: Multiple Binary Characters
# Convert a string of binary numbers separated by spaces
binary_string = "01001000 01001001" # Binary for "HI"
# Split the string into a list of binary numbers
binaries = binary_string.split()
# Convert each binary number to ASCII
ascii_text = "".join([chr(int(b, 2)) for b in binaries])
print(ascii_text)
Output: HI
Step 4: Call the function and show the result
Wrap the Step 3 code in a function named binary_to_ascii(), then call it:
binary_to_ascii()
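Step 4 calls binary_to_ascii(), but the earlier steps never define it. A minimal sketch of how the Step 3 code could be wrapped into that function is shown below. The parameter name `binary_string`, its default value, and returning the text instead of printing it are all assumptions made for illustration.

```python
def binary_to_ascii(binary_string="01001000 01001001"):
    """Decode space-separated 8-bit binary codes into text."""
    binaries = binary_string.split()                     # one entry per character
    return "".join(chr(int(b, 2)) for b in binaries)     # binary -> decimal -> character


print(binary_to_ascii())             # uses the default example from Step 3
print(binary_to_ascii("01000001"))   # a single character
```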
Practice Activities
Decode a Message: Use this binary string: 01001000 01100101 01101100 01101100 01101111 → The output should decode to "Hello".
Encode/Decode Cycle: Encode "Cat" into binary using an ASCII table. Feed it back into the program to confirm it decodes correctly.
Debugging Challenge: Ask for the broken version (e.g., missing .split() or incorrect base) and fix it.
Extensions
Modify the program to ignore invalid inputs (skip anything that’s not 8 bits of 0/1).
Add an encode option (ASCII → Binary).
Wrap the program in a menu system so the user can choose Encode or Decode.
Adding & Subtracting Binary Numbers
Objectives
By the end of this lesson, students will be able to:
Explain how binary addition and subtraction follow rules similar to decimal arithmetic.
Apply the binary addition rules (carry-over with base 2).
Apply the binary subtraction rules (borrowing with base 2).
Perform binary addition and subtraction problems by hand and in Python.
Resource: How to Add & Subtract Binary Numbers
Background Knowledge
Binary uses only two digits: 0 and 1.
Just like base-10 arithmetic uses carries and borrows, binary does too, except instead of carrying over at 10, it carries over at 2 (binary 10).
Step 1: Binary Addition Rules
Binary addition has 4 possible single-bit cases, plus a fifth case when a carry comes in from the previous column:
| Binary | Result |
|---|---|
| 0 + 0 | 0 |
| 0 + 1 | 1 |
| 1 + 0 | 1 |
| 1 + 1 | 0 (carry 1) |
| 1 + 1 + 1 | 1 (carry 1) |
Example 1:
 1111     <- carried 1s
  1011    (decimal 11)
+ 1101    (decimal 13)
-------
 11000    (decimal 24)
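The addition rules can be followed column by column in code, the same way as by hand. This is a sketch only: the function name `add_binary` is hypothetical, and it assumes both inputs are strings made up of just 0s and 1s.

```python
def add_binary(a, b):
    """Add two binary strings column by column, right to left, carrying at 2."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)    # pad the shorter number with leading zeros
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column_sum = int(bit_a) + int(bit_b) + carry   # 0, 1, 2, or 3
        result.append(str(column_sum % 2))             # the digit written in this column
        carry = column_sum // 2                        # 1 if the column reached 2 or 3
    if carry:
        result.append("1")                             # a final carry adds a new column
    return "".join(reversed(result))


print(add_binary("1011", "1101"))   # Example 1
```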
Step 2: Binary Subtraction Rules
Binary subtraction has 4 possible single-bit cases:
| Binary | Result |
|---|---|
| 0 - 0 | 0 |
| 1 - 0 | 1 |
| 1 - 1 | 0 |
| 0 - 1 | 1 (borrow 1 from the next column) |
Note: When borrowing in binary, you borrow a 2 (binary 10).
Example 2:
  10101   (decimal 21)
- 00111   (decimal 7)
-------
  01110   (decimal 14)
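Borrowing can be traced in code as well. The sketch below uses a hypothetical function name, `subtract_binary`, and assumes the first number is at least as large as the second so the answer is never negative.

```python
def subtract_binary(a, b):
    """Subtract binary string b from a, column by column, borrowing a 2 when needed."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)    # pad the shorter number with leading zeros
    borrow = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        difference = int(bit_a) - int(bit_b) - borrow
        if difference < 0:
            difference += 2    # borrow a 2 from the next column
            borrow = 1
        else:
            borrow = 0
        result.append(str(difference))
    return "".join(reversed(result))   # leading zeros are kept, as in Example 2


print(subtract_binary("10101", "00111"))   # Example 2
```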
Step 3: Binary Math in Python
Python makes this easy: use int(string, 2) to convert a binary string to an integer, then bin() to convert back. bin() returns a string with a 0b prefix, which the [2:] slice removes.
Addition Example
a = "1011" # binary for 11
b = "1101" # binary for 13
# Convert to decimal
a_dec = int(a, 2)
b_dec = int(b, 2)
# Add and convert back to binary
result = bin(a_dec + b_dec)[2:]
print("Addition result:", result) # Output: 11000
Subtraction Example
a = "10101" # binary for 21
b = "00111" # binary for 7
# Convert to decimal
a_dec = int(a, 2)
b_dec = int(b, 2)
# Subtract and convert back to binary
result = bin(a_dec - b_dec)[2:]
print("Subtraction result:", result) # Output: 1110
Practice
Add 1010 + 0011. (Check: 10 + 3 = 13 → 1101.)
Subtract 1101 - 0101. (Check: 13 - 5 = 8 → 1000.)
Add:
1110 + 0111
10101 + 11001
Subtract:
10010 - 00101
11111 - 01111
Extensions
Write a Python program that:
Asks the user for two binary numbers.
Asks whether they want to add or subtract.
Returns the binary result.
Part 5: Bitmaps & Binary¶
Part 1: What is a Bitmap?
A bitmap is a type of digital image made of small squares called pixels.
Each pixel has information stored in binary.
Black-and-white images (1 bit per pixel):
0 = white
1 = black
Grayscale images (8 bits per pixel):
Each pixel stores a number from 0 (black) to 255 (white).
Color images (24 bits per pixel / RGB):
8 bits for Red, 8 bits for Green, 8 bits for Blue.
Together, they can make over 16 million colors.
Example:
(255, 0, 0) = pure red
(0, 255, 0) = pure green
(0, 0, 255) = pure blue
(255, 255, 255) = white
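To see where "over 16 million colors" comes from, and what a 24-bit pixel looks like in binary, here is a small sketch. The helper name `rgb_to_bits` is hypothetical, and it assumes each channel value is between 0 and 255.

```python
def rgb_to_bits(red, green, blue):
    """Join three 8-bit channel values into one 24-bit binary string."""
    return format(red, "08b") + format(green, "08b") + format(blue, "08b")


print(2 ** 24)                     # how many different 24-bit colors exist
print(rgb_to_bits(255, 0, 0))      # pure red: eight 1s followed by sixteen 0s
print(rgb_to_bits(255, 255, 255))  # white: all twenty-four bits are 1
```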
Create Your Own Binary Picture
Use the 8×8 grid below.
Fill each square with a 0 (white) or 1 (black).
Shade the boxes to match your binary.
Challenge: Using Google Sheets, make a smiley face, initials, or a pixel heart using an 8×8 grid.
| Row/Col → | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | | | | | | | | |
| 2 | | | | | | | | |
| 3 | | | | | | | | |
| 4 | | | | | | | | |
| 5 | | | | | | | | |
| 6 | | | | | | | | |
| 7 | | | | | | | | |
| 8 | | | | | | | | |
Decode a Binary Image
Below is an 8×8 binary picture.
| Row/Col → | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 |
| 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| 8 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 |
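After decoding the picture by hand, a short program can check the result. The sketch below stores each row of the grid as a string and prints a filled block for every 1. The names `rows` and `render` are hypothetical.

```python
# Each string is one row of the 8×8 grid above, read left to right
rows = [
    "00100100",
    "00000000",
    "00000000",
    "00000000",
    "10011001",
    "10000001",
    "01000010",
    "00111100",
]


def render(rows):
    """Turn each 1 into a filled block and each 0 into a blank space."""
    return "\n".join(
        "".join("█" if bit == "1" else " " for bit in row) for row in rows
    )


print(render(rows))
```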
Color Bitmaps
Below is a 4×4 grid with grayscale values (0–9), where 0 = black and 9 = white.
Example: 5 would be gray.
| Row/Col → | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 9 | 9 | 9 | 9 |
| 2 | 9 | 5 | 5 | 9 |
| 3 | 9 | 5 | 5 | 9 |
| 4 | 9 | 9 | 9 | 9 |
Use Google Sheets to create a “color” image that uses the shades listed above (light shades for higher numbers, darker for lower).
Resolution & File Size
Resolution = number of pixels (e.g., 1920×1080 has over 2 million pixels).
Color depth = number of bits per pixel.
File size formula = pixels × bits per pixel ÷ 8 (to convert to bytes).
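The file size formula can be tried out in Python. This is a sketch for uncompressed pixel data only: real image files also store a small header, which is ignored here. The function name `bitmap_size_bytes` is hypothetical, and the integer division assumes the total number of bits is a multiple of 8.

```python
def bitmap_size_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes: pixels × bits per pixel ÷ 8."""
    pixels = width * height
    return pixels * bits_per_pixel // 8


print(bitmap_size_bytes(100, 100, 24))     # small 24-bit color image
print(bitmap_size_bytes(1920, 1080, 24))   # full HD 24-bit color image
print(bitmap_size_bytes(8, 8, 1))          # the 8×8 black-and-white grid
```

Doubling either the resolution or the color depth doubles the result, which is why high-resolution, full-color images produce large files.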
| Feature | Bitmap (.bmp) | JPEG (.jpg) | PNG (.png) |
|---|---|---|---|
| Compression | ❌ None (Uncompressed) | ✅ Lossy Compression | ✅ Lossless Compression |
| File Size | Very Large | Small | Medium |
| Image Quality | High (exact pixel values) | Lower (some detail lost) | High (no quality loss) |
| Supports Transparency | ❌ No | ❌ No | ✅ Yes |
| Best For | Raw image storage, editing | Photos, web use | Graphics with transparency or text |
| Color Depth | Supports 24-bit & more | 24-bit color | 24-bit color + alpha channel |
| Speed | Fast to decode, slow to transfer | Fast to transfer | Balanced |
| Common Uses | Image processing, printing, archives | Web photos, social media | Logos, icons, UI elements |
Bitmap (.bmp)
Every single pixel is saved exactly — no shortcuts!
Super accurate, but that makes the file huge.
Example: A 100×100 pixel image with 24-bit color = ~30 KB
Great for image editing where every detail matters.
JPEG (.jpg or .jpeg)
Uses lossy compression — it throws away some data to shrink file size.
Smaller files = faster uploads/downloads
Best for photographs, but not great for text, sharp edges, or repeated editing.
If you keep saving a JPEG, the quality degrades each time.
PNG (.png)
Uses lossless compression — it compresses without losing data.
Keeps sharp edges and text looking clean.
Supports transparency (great for web design, overlays, UI).
Bigger than JPEG, but better for logos, screenshots, and pixel art.
Complete the Reflection Questions Below
How does binary connect to digital images?
What’s the difference between black-and-white, grayscale, and color bitmaps?
Why do higher resolution and more colors make bigger files?
Can you think of a situation where low resolution is actually useful? (hint: emojis, icons, pixel art)
Data Project 2: “Data Detectives – Investigating Real-World Trends”¶
Objective
Investigate and research a real-world issue using data. Collect or access a dataset, clean and organize the data, analyze it, and present your findings with a chart in Google Sheets and a written explanation.
Big Idea 2 Concepts
Data Acquisition: Students obtain or access a dataset (e.g., from open data portals or surveys).
Data Representation: Students convert raw data into usable formats.
Data Analysis: Students filter, sort, and compute statistics to identify patterns or trends.
Visualization: Students create graphs or infographics to communicate insights clearly.
Project Steps
Pick a Topic (in pairs or solo)
Students choose a topic of interest such as:
Climate change
COVID-19 statistics
Music or movie popularity
Social media usage
Sports performance
Gaming trends
A survey you conduct with classmates
Find or Collect Data
Sources can include:
Survey data from peers (e.g., favorite apps, screen time, etc.)
Prepare the Data
Open Google Sheets and paste or import your data.
To import: File → Import → Upload (CSV/Excel).
Clean the data:
Delete duplicate rows (Data → Data cleanup → Remove duplicates).
Fill in or remove missing values.
Reformat text into numbers if needed (ex: percentages, dates, scores).
Analyze the Data
Ask yourself:
What trend is increasing the fastest?
Is there a correlation between the variables?
What is the average, maximum, or minimum?
How do categories compare (percentages, rankings, etc.)?
Use basic Google Sheets functions:
=AVERAGE(range) → mean
=MEDIAN(range) → median
=MODE(range) → most common value
=MAX(range) → largest number
=MIN(range) → smallest number
=COUNT(range) or =COUNTIF(range, condition) → counts
Create a Chart in Google Sheets
Instructions:
Highlight the data you want to visualize.
Go to Insert → Chart.
In the Chart Editor (appears on the right):
Under Chart type, choose the best chart for your data:
Column/Bar chart → compare categories
Line chart → show trends over time
Pie chart → show percentages
Scatter plot → show relationships between two variables
Customize your chart:
Add a descriptive title.
Label the x-axis and y-axis.
Use clear colors that make the data easy to read.
Present Your Findings
Deliverables:
A chart or graph created in Google Sheets
A written explanation (at least 1 paragraph) that explains:
Why you chose this dataset
What your chart shows
Why the data is meaningful
What insights you gained
(Optional) A short presentation or recorded video
Rubric (suggested categories)
| Category | Points |
|---|---|
| Relevant topic and data source | 1 |
| Clean and organized data | 1 |
| Meaningful analysis (insights/statistics) | 2 |
| Visualization quality and clarity | 2 |
| Written explanation | 2 |
| Creativity and effort | 2 |
| Total | 10 pts |
Data Project 3 - Phishing¶
Objective:
Students will be able to:
Define phishing and related cybersecurity vocabulary
Identify real-life phishing threats
Understand how their personal information is collected and used
Apply basic to advanced protection measures
Create a digital artifact to demonstrate their learning
Introduction
Discussion: “Have you ever received a suspicious email or message asking you to click a link or enter your password?”
Learn more about phishing emails and websites.
What clues make them suspicious?
Instructions:
Explain phishing using a real-world analogy (e.g., someone pretending to be your friend to get your house key)
Review all vocabulary words with examples
Activity 1: PII Collection Table
Instructions: What is PII? Click on the link to learn more about Personally Identifiable Information. Research and create a table like the one below. Identify three websites or apps that you have used recently (e.g., Instagram, Google).
Click Here for evidence relating to PII breaches
Big Business Breaches
TransUnion (Credit Reporting Giant)
In July 2025, TransUnion suffered a breach affecting over 4.4 million Americans, including Social Security numbers, names, birth dates, email addresses, and more—despite initial claims downplaying the severity. The breach was linked to a compromised Salesforce account. Affected individuals are being offered 24 months of free identity theft protection. (TechRadar, IT Pro)
Allianz Life (Insurance Firm)
In late July 2025, about 1.1 million U.S. customers had their personal information—including names, addresses, phone numbers, and emails—compromised in a cyberattack. The company plans to offer two years of identity monitoring to those affected. (Reuters)
OnTrac (Delivery Service)
Between April 13–15, 2025, OnTrac had sensitive data from over 40,000 individuals exposed, including full names, birth dates, Social Security numbers, driver’s license or state ID numbers, and even medical or health insurance details. (Tom’s Guide)
Mass Credential Leak
An enormous breach exposed 16 billion login credentials—usernames, passwords, and associated URLs—from platforms like Apple, Google, Facebook, and many government and corporate systems. This aggregation stems from multiple sources, frequently due to malware-based theft of credentials. (Tom’s Guide)
Medical & Healthcare Industry Breaches
Change Healthcare (UnitedHealth Group)
In February 2024, a ransomware attack (by ALPHV/BlackCat) on Change Healthcare—processing medical and billing data for around a third of Americans—resulted in the theft of sensitive personal and health data from over 100 million individuals. (TechCrunch)
Frederick Health
A ransomware attack on January 27, 2025, exposed data of 934,326 individuals—including identifying data, insurance info, clinical records, and more. (TechTarget, Healthcare Dive)
Medusind (Medical Billing Vendor)
Discovered in December 2023 and disclosed in early 2025, this breach affected 701,475 individuals, exposing health insurance details, medical histories, prescription data, Social Security numbers, and contact information. (TechTarget, Healthcare Dive)
Kelly & Associates Insurance Group
The December 2024 breach affected 553,332+ individuals, exposing names, SSNs, tax IDs, medical/insurance info, and financial account data. (TechTarget)
Community Health Systems, UCLA Health, Premera, Excellus, Labcorp, etc.
Historically, massive breaches have affected millions in the healthcare sector—for instance, Community Health Systems (4.5 million), UCLA Health (~4.5 million), Premera (~11 million), Excellus (~10 million), and Labcorp (~10 million) via earlier events. (Breachsense, Digital Guardian)
Government-Related Breaches
Office of Personnel Management (OPM), 2015
State-sponsored hackers (allegedly from China’s MSS) stole 22.1 million records of federal employees and others undergoing background checks—including Social Security numbers, birth data, and addresses—making it one of the largest U.S. government data breaches ever. (Wikipedia)
Texas Department of Transportation (TxDOT)
In May 2025, hackers downloaded crash report records affecting data for 423,391 individuals, including sensitive personal data (names, addresses, driver’s license numbers, insurance details). (San Antonio Express-News)
Social Security Administration (SSA) Cloud Warning
Not a breach per se, but in late August 2025, the SSA Chief Data Officer resigned, triggering alarm after whistleblower allegations that data on over 300 million Americans had been uploaded to an insecure cloud environment improperly. Though no breach was confirmed, the potential risk was enormous. (The Washington Post, The Times of India)
Summary Table
| Sector | Incident | Scope / Individuals Affected | Sensitive Data Exposed |
|---|---|---|---|
| Big Business | TransUnion, Allianz, OnTrac, 16B credentials | Millions (4.4M, 1.1M, 40K, 16B creds) | SSNs, PII, contact data, IDs |
| Healthcare | Change Healthcare, Frederick Health, Medusind, Kelly & Associates, others | Hundreds of thousands to 100M+ | Medical records, SSNs, insurance, billing |
| Government | OPM (2015), TxDOT (2025), SSA cloud exposure | Millions to hundreds of millions | SSNs, IDs, addresses, crash/health data |
| Website or Application | PII Collected | How the information is used (Will they share it? With whom? Will they keep it forever?) |
|---|---|---|
| Amazon | Email, phone number, location | For ad targeting, may be shared with advertisers |
| Website/ App Name 2 | | |
| Website/ App Name 3 | | |
What is Malware?
Malware is short for malicious software. It refers to any software or code that is intentionally designed to harm, exploit, or otherwise compromise a computer system, network, or device without the user’s informed consent.
Purpose: Steal data, damage systems, disrupt operations, spy on users, or gain unauthorized access.
Forms of Malware:
Viruses – Attach to clean files and spread.
Worms – Self-replicate and spread across networks.
Trojans – Disguise themselves as legitimate software.
Ransomware – Locks data and demands payment.
Spyware – Secretly monitors user activity.
Adware – Displays unwanted ads, often with tracking.
Rootkits – Hide malicious activities from detection.
How It Spreads: How does malware spread, and how does it affect your computer? Click on the link to learn more about malware.
Protection Measures
Instructions: In your own words, list and explain at least 5 protection measures from the list below:
Be suspicious of links and attachments
Use strong, unique passwords
Enable two-factor authentication (2FA)
Install and update antivirus software
Avoid public Wi-Fi for sensitive tasks
Check URLs carefully
Review app permissions
Regularly clear cookies and browser history
Monitor your digital footprint
Reflection Questions
Students should respond to the following questions in writing or in a small group discussion:
How does storing your information online facilitate convenience?
How is convenience online related to data loss?
What are some ways you can protect yourself online?
Digital Artifact
Objective: Demonstrate what you’ve learned by creating a digital poster, infographic, video, or slideshow that includes:
Definitions of phishing and at least 3 other vocabulary words
Examples of phishing (screenshots, skits, or drawings)
5+ protection tips in student-friendly language
One takeaway message or slogan (e.g., “Think Before You Click!”)
Tools Suggestions:
Canva (infographic/poster)
Google Slides (presentation)
Adobe Spark or Powtoon (video)
Flip (recorded video skit)
Assessment Criteria:
| Component | Points |
|---|---|
| Completed PII Table | 2 |
| Protection Measures List | 2 |
| Reflection Questions | 3 |
| Digital Artifact | 3 |
| Total | 10 |
Data Project 4: Encryption & Privacy¶
Key concepts to learn
What encryption is and why it’s essential for secure communication.
How Caesar cipher works as a basic form of substitution cipher.
The concept of brute force attacks.
Introduction to frequency analysis and basic cryptanalysis.
Understanding of public-key cryptography (RSA) and private/public key pairs.
Vocabulary relevant to digital security and cryptography.
Why is Frequency Important in Cryptography?¶
Frequency analysis is one of the oldest and most powerful tools in breaking substitution ciphers. Here’s how it helps:
Languages Have Predictable Patterns
In English, letters like E, T, A, O, and N appear most often.
If an encrypted message shows one letter (like “X”) appearing frequently, it might represent “E” or “T”.
Cracks Simple Ciphers
Substitution ciphers (like Caesar or cryptograms) can often be broken by comparing letter frequencies in the ciphertext to known English frequency patterns.
Foundation for Modern Cryptanalysis
While modern encryption is more complex, the principle of pattern recognition and analysis is still used in detecting weak encryption algorithms.
Common English Letter Frequencies
| Letter | Frequency (%) | Letter | Frequency (%) |
|---|---|---|---|
| A | 8.2% | N | 6.7% |
| B | 1.5% | O | 7.5% |
| C | 2.8% | P | 1.9% |
| D | 4.3% | Q | 0.1% |
| E | 12.7% | R | 6.0% |
| F | 2.2% | S | 6.3% |
| G | 2.0% | T | 9.1% |
| H | 6.1% | U | 2.8% |
| I | 7.0% | V | 1.0% |
| J | 0.2% | W | 2.4% |
| K | 0.8% | X | 0.2% |
| L | 4.0% | Y | 2.0% |
| M | 2.4% | Z | 0.1% |
Python Program: Letter Frequency Analyzer¶
def letter_frequency(text):
    text = text.lower()  # normalize to lowercase
    frequency = {}
    for char in text:
        if char.isalpha():  # count only letters
            if char in frequency:
                frequency[char] += 1
            else:
                frequency[char] = 1
    total_letters = sum(frequency.values())
    print(" Letter Frequencies:\n")
    for letter in sorted(frequency):
        percent = (frequency[letter] / total_letters) * 100
        print(f"{letter.upper()} : {frequency[letter]} times ({percent:.2f}%)")
# Example usage
input_text = input("Enter a message to analyze: ")
letter_frequency(input_text)
How to Use
Copy and paste the code into any Python environment.
When prompted, paste your encrypted or regular text.
The program will print a breakdown of how often each letter appears.
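The idea that the most frequent ciphertext letter probably stands for "E" can be turned into a small shift guesser. The sketch below is illustrative only: the function names `guess_caesar_shift` and `shift_text` and the sample sentence are assumptions, not part of the course code, and the guess is only reliable when the message is long enough for "E" to dominate.

```python
from collections import Counter

def shift_text(text, shift):
    """Shift every letter by `shift` positions, leaving other characters alone."""
    result = ""
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char
    return result

def guess_caesar_shift(ciphertext, assumed_plain_letter="e"):
    """Guess the shift by assuming the most common ciphertext letter is 'e'."""
    letters = [c for c in ciphertext.lower() if c.isalpha()]
    if not letters:
        return None  # nothing to analyze
    most_common_letter, _ = Counter(letters).most_common(1)[0]
    return (ord(most_common_letter) - ord(assumed_plain_letter)) % 26

# Hypothetical sample message in which 'e' is clearly the most common letter
plaintext = "meet me here at seven every evening"
ciphertext = shift_text(plaintext, 3)
guess = guess_caesar_shift(ciphertext)

print("Ciphertext:   ", ciphertext)
print("Guessed shift:", guess)
print("Decrypted:    ", shift_text(ciphertext, -guess))
```

If the guess produces gibberish, try assuming the most common letter is "t" or "a" instead; short messages often do not follow the average English frequencies.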
Python Code: Caesar Cipher Program
Copy and paste this into a working Python environment.
def caesar_encrypt(text, shift):
    result = ""
    for char in text:
        if char.isalpha():
            # 65 is ord('A'), 97 is ord('a')
            offset = 65 if char.isupper() else 97
            result += chr((ord(char) - offset + shift) % 26 + offset)
        else:
            result += char  # leave spaces, digits, and punctuation unchanged
    return result

def caesar_decrypt(text, shift):
    return caesar_encrypt(text, -shift)

def brute_force_caesar(text):
    # Try every possible key and print each candidate plaintext
    for key in range(1, 26):
        print(f"Key {key}: {caesar_decrypt(text, key)}")

# Example usage:
print("1. Encrypt")
print("2. Decrypt")
print("3. Brute Force")
choice = input("Enter your choice: ")
if choice == "1":
    message = input("Enter message to encrypt: ")
    key = int(input("Enter shift key (1-25): "))
    print("Encrypted:", caesar_encrypt(message, key))
elif choice == "2":
    message = input("Enter message to decrypt: ")
    key = int(input("Enter shift key (1-25): "))
    print("Decrypted:", caesar_decrypt(message, key))
elif choice == "3":
    message = input("Enter message to brute-force: ")
    brute_force_caesar(message)
Complete Caesar decryption for “Vkliw wkuhh” using Caesar shift of 3.
Discuss why certain messages need encryption online.
Practice
Try encrypting and decrypting a sentence using the Python code.
Paste encrypted sentence into class doc with your key.
Brute Force
Use the brute force option in the script above to decrypt:
Guvf vf n rapelcgrq zrffntr. Lbh jvyy arire or noyr gb ernq vg!
Discuss if the original message is easy to spot among the outputs.
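Spotting the readable line by eye works for 25 candidates, but a program can also rank them. The following is a minimal sketch, assuming a tiny hand-picked set of common English words (`COMMON_WORDS`) and hypothetical helper names; it simply counts how many recognizable words each candidate contains and keeps the best key.

```python
# Small, hand-picked word list (an assumption for this sketch, not a real dictionary)
COMMON_WORDS = {"the", "and", "is", "a", "you", "this", "will", "to", "be", "it", "of", "in"}

def shift_text(text, shift):
    """Shift every letter by `shift` positions, leaving other characters alone."""
    result = ""
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            result += chr((ord(char) - base + shift) % 26 + base)
        else:
            result += char
    return result

def score(candidate):
    """Count how many words in the candidate appear in COMMON_WORDS."""
    words = [w.strip(".,!?;:'\"") for w in candidate.lower().split()]
    return sum(1 for w in words if w in COMMON_WORDS)

def best_caesar_key(ciphertext):
    """Return the key (0-25) whose decryption contains the most common words."""
    return max(range(26), key=lambda k: score(shift_text(ciphertext, -k)))

ciphertext = "Guvf vf n rapelcgrq zrffntr. Lbh jvyy arire or noyr gb ernq vg!"
key = best_caesar_key(ciphertext)
decoded = shift_text(ciphertext, -key)
print(f"Best key: {key}")
print(decoded)
```

A larger word list or a letter-frequency score would be more robust for short or unusual messages; this version only illustrates the idea of scoring candidates automatically.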
Vigenère Cipher¶
What Is the Vigenère Cipher?
The Vigenère Cipher is a polyalphabetic substitution cipher that uses a keyword to encrypt a message.
Unlike Caesar Cipher (which uses the same shift for every letter), Vigenère changes the shift for each letter, based on a repeating keyword.
This makes it much harder to crack using simple frequency analysis.
How It Works – Step-by-Step
Core Idea:
Each letter of the plaintext is shifted by an amount based on the corresponding letter of the keyword.
Alphabet Reference
Each letter in the alphabet is assigned a number:
Letter | A | B | C | D | E | F | G | H | I | J | K | L | M | N |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Number | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
Letter | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Number | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |
Keyword:
A keyword is used to determine how much each letter in your message gets shifted.
Example keyword: KEY
We convert it to numbers using the alphabet reference:
K = 10
E = 4
Y = 24
Repeating the Keyword
If your message is longer than the keyword, repeat the keyword to match the message length.
Example:
Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Message: | A | T | T | A | C | K |  | A | T |  | D | A | W | N |
Keyword: | K | E | Y | K | E | Y |  | K | E |  | Y | K | E | Y |
(Spaces are just for clarity; they’re not encrypted.)
Encryption Formula
To encrypt a message:
EncryptedLetter = (PlainLetter + KeyLetter) % 26
Each letter of the message is converted to a number, and the number of the matching keyword letter is added to it. Taking the result % 26 (modulo 26) keeps it between 0 and 25.
Example: Encrypt “A” with key letter “K”
A = 0
K = 10
Encrypted = (0 + 10) % 26 = 10
10 = K
So A becomes K
Decryption Formula
To decrypt a message:
PlainLetter = (EncryptedLetter - KeyLetter + 26) % 26
You subtract the keyword number from the encrypted letter number. Add 26 before the % 26 to avoid negative numbers.
Example: Decrypt “K” with key letter “K”
K = 10
K = 10
Decrypted = (10 - 10 + 26) % 26 = 0
0 = A
So K becomes A
Example: Encrypting “ATTACKATDAWN” with Key “KEY”
Position | Message | Key | Plain (Num) | Key (Num) | Encrypted (Num) | Encrypted Letter |
---|---|---|---|---|---|---|
1 | A | K | 0 | 10 | 10 | K |
2 | T | E | 19 | 4 | 23 | X |
3 | T | Y | 19 | 24 | 17 | R |
4 | A | K | 0 | 10 | 10 | K |
5 | C | E | 2 | 4 | 6 | G |
6 | K | Y | 10 | 24 | 8 | I |
7 | A | K | 0 | 10 | 10 | K |
8 | T | E | 19 | 4 | 23 | X |
9 | D | Y | 3 | 24 | 1 | B |
10 | A | K | 0 | 10 | 10 | K |
11 | W | E | 22 | 4 | 0 | A |
12 | N | Y | 13 | 24 | 11 | L |
Encrypted Message: KXRKGIKXBKAL
def vigenere_encrypt(text, key):
    result = ""
    key = key.lower()
    key_index = 0  # advances only on letters, so spaces do not use up key letters
    for char in text:
        if char.isalpha():
            shift = ord(key[key_index % len(key)]) - 97  # 'a' -> 0, 'b' -> 1, ...
            base = 65 if char.isupper() else 97
            result += chr((ord(char) - base + shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

def vigenere_decrypt(text, key):
    result = ""
    key = key.lower()
    key_index = 0
    for char in text:
        if char.isalpha():
            shift = ord(key[key_index % len(key)]) - 97
            base = 65 if char.isupper() else 97
            # Python's % always returns a non-negative result, so no "+ 26" is needed
            result += chr((ord(char) - base - shift) % 26 + base)
            key_index += 1
        else:
            result += char
    return result

# Example use
msg = input("Enter your message: ")
key = input("Enter the keyword: ")
mode = input("Encrypt (e) or Decrypt (d)? ").lower()
if not key.isalpha():
    # An empty keyword or one with digits/symbols would give wrong shifts or crash
    print("The keyword must contain letters only.")
elif mode == 'e':
    print("Encrypted:", vigenere_encrypt(msg, key))
else:
    print("Decrypted:", vigenere_decrypt(msg, key))
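To check the worked ATTACKATDAWN example without typing at a prompt, the two formulas can be folded into one non-interactive function. This is a compact sketch, not the course's program: the name `vigenere` and its `decrypt` flag are assumptions made for illustration.

```python
def vigenere(text, key, decrypt=False):
    """Shift each letter by the matching keyword letter; non-letters pass through."""
    shifts = [ord(k) - ord("a") for k in key.lower()]
    sign = -1 if decrypt else 1
    result = []
    key_index = 0  # only advances on letters
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")
            shift = sign * shifts[key_index % len(shifts)]
            result.append(chr((ord(char) - base + shift) % 26 + base))
            key_index += 1
        else:
            result.append(char)
    return "".join(result)

ciphertext = vigenere("ATTACK AT DAWN", "KEY")
print(ciphertext)                                  # KXRKGI KX BKAL
print(vigenere(ciphertext, "KEY", decrypt=True))   # ATTACK AT DAWN
```

The output matches the table row by row: the spaces are copied through unchanged and do not consume keyword letters.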
RSA Educational Encryption and Decryption Program¶
⚠️ This version is simplified for educational purposes only and not secure for real cryptographic use. It shows the core concepts of RSA: key generation, encryption, and decryption.
RSA (named after its inventors Rivest, Shamir, and Adleman) is a widely used asymmetric encryption algorithm. It allows secure communication over insecure channels.
Asymmetric Encryption
RSA uses two keys:
A public key for encryption
A private key for decryption
These keys are mathematically related, but knowing the public key does not make it feasible to compute the private key (assuming large enough primes).
How RSA Works:
1. Key Generation
RSA starts by generating two large prime numbers, p and q.
a. Compute the modulus:
In RSA, the modulus is a large integer n, the product of the two primes:
n = p × q
This n is part of both the public and the private key. Encryption and decryption are performed modulo n, that is, within the set of integers {0, 1, …, n−1}. In short, n defines the finite arithmetic system in which RSA computes.
b. Compute Euler’s totient function:
$$ \phi(n) = (p - 1) \times (q - 1) $$
c. Choose a public exponent e:
e must be relatively prime to φ(n)
Common choice: e = 65537 (it’s fast and secure in practice)
d. Compute the private exponent d:
$$ d \equiv e^{-1} \pmod{\phi(n)} $$
In other words, d is the modular inverse of e modulo φ(n)
The keys:
Public key: (n, e)
Private key: (n, d)
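The key generation steps can be traced with small numbers. The sketch below assumes the toy primes p = 61 and q = 53 and picks e = 17 purely for illustration; real keys use primes that are hundreds of digits long. It relies on Python 3.8+, where `pow(e, -1, phi)` computes a modular inverse.

```python
from math import gcd

# Toy primes, chosen only to keep the numbers readable (not secure)
p, q = 61, 53

n = p * q                    # a. modulus
phi = (p - 1) * (q - 1)      # b. Euler's totient

e = 17                       # c. public exponent (assumed for this example)
assert gcd(e, phi) == 1      # e must be relatively prime to phi(n)

d = pow(e, -1, phi)          # d. private exponent: modular inverse of e mod phi(n)

print("n   =", n)            # 3233
print("phi =", phi)          # 3120
print("Public key  (n, e):", (n, e))
print("Private key (n, d):", (n, d))   # d = 2753
```

Multiplying e by d and reducing modulo φ(n) gives 1, which is exactly the property that makes decryption undo encryption.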
2. Encryption
To encrypt a message m (as a number), use the public key:
$$ c = m^e \bmod n $$
c is the ciphertext.
m must be an integer less than n. Real applications also apply a padding scheme such as PKCS#1 v1.5 or OAEP before encrypting.
3. Decryption
To decrypt the ciphertext c, use the private key:
$$ m = c^d \bmod n $$
You recover the original message m.
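A round trip with small numbers shows the two formulas undoing each other. This sketch assumes the toy key n = 3233, e = 17, d = 2753 (derived from the primes 61 and 53) and the message m = 65; all of these values are illustrative, and no padding is applied.

```python
# Toy key pair (assumed values for illustration; far too small to be secure)
n, e, d = 3233, 17, 2753

m = 65                      # the message as a number, must be less than n
c = pow(m, e, n)            # encryption: c = m^e mod n
recovered = pow(c, d, n)    # decryption: m = c^d mod n

print("Ciphertext:", c)          # 2790
print("Recovered: ", recovered)  # 65
```

Python's three-argument `pow` performs modular exponentiation efficiently, so it never builds the enormous intermediate number m^e.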
Security of RSA
The security of RSA relies on the difficulty of the integer factorization problem:
Given n = p × q, it is computationally infeasible to find p and q if they are large (e.g., 2048-bit primes).
If someone could factor n, they could compute φ(n), and then d, breaking RSA.
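With a tiny modulus, this attack takes only a few lines. The sketch below is a demonstration under stated assumptions: the function names are hypothetical, the public key (n = 3233, e = 17) is a toy value, and trial division only works because n is so small.

```python
def factor_semiprime(n):
    """Find p and q by trial division. Only practical because n is tiny."""
    i = 2
    while i * i <= n:
        if n % i == 0:
            return i, n // i
        i += 1
    raise ValueError("n has no non-trivial factors")

def recover_private_exponent(n, e):
    """Rebuild d from the public key alone by factoring n."""
    p, q = factor_semiprime(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)   # modular inverse (Python 3.8+)

# Toy public key (assumed values)
print(recover_private_exponent(3233, 17))   # 2753
```

For a modulus of realistic size the loop in `factor_semiprime` would not finish in any practical amount of time, which is the whole basis of RSA's security.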
RSA is foundational to modern cryptography, but in practice it is usually used to encrypt symmetric keys rather than large data directly, because it is slow and padding limits how much data fits in one operation.
import ast

def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

def modinv(e, phi):
    # Extended Euclidean Algorithm
    d_old, r_old = 0, phi
    d_new, r_new = 1, e
    while r_new != 0:
        quotient = r_old // r_new
        d_old, d_new = d_new, d_old - quotient * d_new
        r_old, r_new = r_new, r_old - quotient * r_new
    return d_old % phi

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def generate_keys():
    print("\n Choose two prime numbers (p and q) such that their product n = p * q is greater than 255.")
    p = int(input("Enter prime p (e.g. 61): "))
    q = int(input("Enter prime q (e.g. 53): "))
    if not (is_prime(p) and is_prime(q)):
        print("Error: Both numbers must be prime.")
        return None, None, None
    if p == q:
        # phi = (p - 1) * (q - 1) is only valid for two different primes
        print("Error: p and q must be different primes.")
        return None, None, None
    n = p * q
    if n <= 255:
        print("Error: n must be greater than 255 to support all ASCII characters.")
        return None, None, None
    phi = (p - 1) * (q - 1)
    # Pick the smallest odd e that is relatively prime to phi
    e = 3
    while gcd(e, phi) != 1:
        e += 2
    d = modinv(e, phi)
    print(f"\n Public Key (e, n): ({e}, {n})")
    print(f" Private Key (d): {d}")
    return e, d, n

def encrypt(message, e, n):
    # Encrypt one character at a time (educational only; real RSA never does this)
    encrypted = [pow(ord(char), e, n) for char in message]
    return encrypted

def decrypt(cipher, d, n):
    decrypted = ''.join([chr(pow(char, d, n)) for char in cipher])
    return decrypted

def main():
    print("RSA Cryptosystem")
    while True:
        print("\nMENU:")
        print("1. Generate Keys")
        print("2. Encrypt Message")
        print("3. Decrypt Message")
        print("4. Exit")
        choice = input("Choose an option (1-4): ")
        if choice == "1":
            e, d, n = generate_keys()
        elif choice == "2":
            msg = input("Enter your message: ")
            e = int(input("Enter recipient's public key e: "))
            n = int(input("Enter recipient's public key n: "))
            encrypted_msg = encrypt(msg, e, n)
            print("Encrypted message:", encrypted_msg)
        elif choice == "3":
            try:
                cipher_input = input("Paste the encrypted message list (e.g. [123, 456]): ")
                # literal_eval only parses literals; eval() would run arbitrary code
                cipher = ast.literal_eval(cipher_input)
                d = int(input("Enter your private key d: "))
                n = int(input("Enter your modulus n: "))
                decrypted_msg = decrypt(cipher, d, n)
                print("Decrypted message:", decrypted_msg)
            except (ValueError, SyntaxError, TypeError):
                print("Invalid input. Try again.")
        elif choice == "4":
            print("Exiting RSA program. Goodbye!")
            break
        else:
            print("Invalid option. Please choose 1, 2, 3, or 4.")

main()
How to use the RSA Program
Generate Keys
Run the program and choose option 1 to generate RSA keys.
Choose larger primes (e.g. 61 and 53).
Share public key (e, n).
Keep private key (d) secret.
Encrypt
Select option 2.
Input your message.
Input recipient’s public key (e, n).
Share the encrypted list with the recipient.
Decrypt
Select option 3.
Paste the encrypted list.
Enter your private key (d) and modulus (n).
Get the decrypted message.
Data Project 5: Password Strength¶
⚠️ This is a simulation, not a real-time brute-force on a login system, which would be illegal and unethical without permission.
Below is a Python program to simulate the average time it would take to brute-force different types of passwords:
One-word password (e.g., apple)
Two-word password (e.g., applebanana)
Two words with a digit (e.g., applebanana7)
What the program would do:
Define the password space based on the type (word length, number of combinations).
Use a mock brute-force attack where it tries all combinations until it finds the target.
Measure the time taken for a number of trials and average them.
Key Assumptions:
Word list comes from a dictionary (e.g., 1,000 common English words).
Digits are from 0–9 (10 possible digits).
Passwords are guessed in a brute-force (sequential or random) manner.
Password Strength Code:
import time
import random

# Simulated dictionary of 1,000 words (for example purposes)
word_list = [f"word{i}" for i in range(1000)]

def brute_force_simulation(password, candidates):
    attempts = 0
    start_time = time.perf_counter()  # high-resolution timer for short runs
    for guess in candidates:
        attempts += 1
        if guess == password:
            break
    end_time = time.perf_counter()
    return end_time - start_time, attempts

def generate_password(word_count=1, add_digit=False):
    words = random.sample(word_list, word_count)
    if add_digit:
        digit = str(random.randint(0, 9))
        return ''.join(words) + digit
    return ''.join(words)

def create_candidates(word_count, add_digit):
    if word_count == 1 and not add_digit:
        return list(word_list)
    elif word_count == 2 and not add_digit:
        return [w1 + w2 for w1 in word_list for w2 in word_list]
    elif word_count == 2 and add_digit:
        # 10,000,000 strings: this needs noticeable memory and time
        return [w1 + w2 + str(d) for w1 in word_list for w2 in word_list for d in range(10)]
    raise ValueError("Unsupported password type")

def average_time(word_count, add_digit=False, trials=1):
    total_time = 0
    total_attempts = 0
    candidates = create_candidates(word_count, add_digit)
    for _ in range(trials):
        password = generate_password(word_count, add_digit)
        random.shuffle(candidates)  # simulate guessing randomness
        time_taken, attempts = brute_force_simulation(password, candidates)
        total_time += time_taken
        total_attempts += attempts
    avg_time = total_time / trials
    avg_attempts = total_attempts / trials
    return avg_time, avg_attempts

# Run simulations
print("Running brute-force simulation...")
for wc, desc, digit in [
    (1, "One-word", False),
    (2, "Two-word", False),
    (2, "Two-word with digit", True)
]:
    t, a = average_time(wc, digit, trials=1)
    print(f"{desc} password:")
    print(f" Average Time: {t:.4f} seconds")
    print(f" Average Attempts: {int(a)}\n")
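Example output format (timings depend on system performance). The attempt counts shown here are illustrative only: with a 1,000-word list, attempts cannot exceed 1,000, 1,000,000, and 10,000,000 for the three password types, so expect smaller numbers when you run the code: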
One-word password:
Average Time: 0.0023 seconds
Average Attempts: 4521
Two-word password:
Average Time: 0.3876 seconds
Average Attempts: 49230221
Two-word with digit password:
Average Time: 1.0542 seconds
Average Attempts: 69811022
Two-Factor Authentication (2FA)
What is 2FA?
Two-Factor Authentication is a security process where a user provides two separate forms of identification before gaining access to an account or system.
Traditionally, logging in requires one factor — usually a password (something you know). 2FA adds a second factor, typically one of these:
Something you know → Password, PIN, or security question
Something you have → Phone, hardware token, security key, smart card
Something you are → Biometrics (fingerprint, face, iris scan)
When you combine two of these, even if a hacker steals your password, they can’t log in without that second factor.
Common Types of 2FA
Method | How It Works | Security Level | Pros | Cons |
---|---|---|---|---|
SMS Codes | A code is sent via text message | Low-Med | Easy to use, no setup | Vulnerable to SIM-swaps, phishing |
Authenticator Apps (e.g., Google Authenticator, Authy) | Time-based codes generated on your phone | High | No internet or SMS needed, more secure than SMS | Requires phone; if lost, recovery needed |
Push Notifications (e.g., Duo, Microsoft Authenticator) | App sends a prompt to approve/deny login | High | Very convenient, less typing | Can be tricked by “push fatigue” attacks |
Hardware Security Keys (e.g., YubiKey) | Physical device is tapped or inserted | Very High | Extremely secure, phishing-resistant | Costs money, must be carried |
Biometric | Fingerprint, Face ID, etc. | Varies | Convenient, hard to steal remotely | Privacy concerns, device-specific |
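The "time-based codes" in the authenticator-app row can be illustrated with a short sketch. It follows the general shape of the TOTP standard (RFC 6238, built on HOTP from RFC 4226) using only Python's standard library. The secret below is the published RFC test secret, not a real account secret, and the function names are assumptions for this example; real apps receive a base32-encoded secret when you scan the setup QR code.

```python
import hashlib
import hmac
import struct

def hotp(secret, counter, digits=6):
    """HMAC-based one-time password for a given counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                        # dynamic truncation offset
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)

def totp(secret, unix_time, step=30, digits=6):
    """Time-based code: the counter is the number of 30-second steps so far."""
    return hotp(secret, int(unix_time) // step, digits)

# Test secret from the RFCs (illustrative only)
secret = b"12345678901234567890"
print(totp(secret, 59))   # 287082
print(totp(secret, 60))   # a different code: the 30-second window has rolled over
```

Because both the phone and the server know the secret and the current time, they can compute the same code independently, which is why no internet or SMS connection is needed at login.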
Should People Use It?
Security experts consistently recommend enabling 2FA wherever possible — especially on:
Email accounts (often the gateway to everything else)
Banking and financial accounts
Social media accounts (to prevent identity hijacking)
Cloud storage (e.g., Google Drive, Dropbox, iCloud)
It dramatically reduces the chance of your account being hacked. Most account breaches occur because passwords get stolen, guessed, or reused. 2FA blocks almost all of these attacks.
Risks & Limitations
While 2FA is much safer than a password alone, no security measure is perfect. Key risks include:
Account Recovery Risks: If you lose your second factor (e.g., phone, hardware key), recovery can be difficult, and sometimes impossible without backup codes or a recovery method.
Phishing: Some phishing attacks can trick users into providing both their password and 2FA code (e.g., real-time relay attacks).
SIM-Swap Attacks (SMS-based 2FA): Hackers can socially engineer your phone company into transferring your number to them, letting them intercept SMS codes.
Push Fatigue / MFA Bombing: Attackers spam push notifications to trick users into approving a login out of annoyance or by mistake.
Biometric Risks: Biometric data (like fingerprints) can’t be “changed” if compromised, though compromise is rare.
Best Practices When Using 2FA
Prefer Authenticator Apps or Hardware Security Keys over SMS codes.
Save backup codes somewhere safe (offline or in a secure password manager).
Never approve a 2FA prompt unless you’re actively logging in.
Keep your phone account secure (PIN-protect your SIM and phone carrier account).
Consider multi-factor (more than two) for highly sensitive accounts.
Summary:
Assumptions for the password brute-force estimates in the table below:
Average adult vocabulary: 30,000 words
10 digits (0–9)
10 symbols (e.g., !@#$%^&*())
Guess speed: 100,000 guesses per second (TIME_PER_GUESS = 0.00001 seconds/guess)
Average attempts ≈ half of all possibilities
Password Type | Total Combos | Avg Attempts | Avg Time (s) | Hours | Days | Years |
---|---|---|---|---|---|---|
One-word | 30,000 | 15,000 | 0.15 | 0.00 h | 0.00 d | 0.00 y |
Two-word | 900,000,000 | 450,000,000 | 4,500.00 | 1.25 h | 0.05 d | 0.00014 y |
Two-word + digit | 9,000,000,000 | 4,500,000,000 | 45,000.00 | 12.50 h | 0.52 d | 0.0014 y |
Two-word + digit + symbol | 90,000,000,000 | 45,000,000,000 | 450,000.00 | 125.00 h | 5.21 d | 0.014 y |
NOTES:
Password complexity grows multiplicatively: every extra element (word, digit, symbol) multiplies the possibilities.
Attackers use faster hardware: if guesses per second go up, time goes down proportionally.
If you scale up words, add digits, symbols, case-sensitivity, or length, cracking times very quickly go from seconds → years → centuries.
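The estimates can be reproduced with a few lines of arithmetic rather than an actual brute-force loop. This sketch uses the stated assumptions (30,000 words, 10 digits, 10 symbols, 100,000 guesses per second); the names `estimate` and `password_types` are illustrative, not taken from the course code.

```python
WORDS = 30_000
DIGITS = 10
SYMBOLS = 10
GUESSES_PER_SECOND = 100_000   # same as TIME_PER_GUESS = 0.00001 seconds/guess

def estimate(total_combos):
    """Return (average attempts, average seconds) for a given search space."""
    avg_attempts = total_combos / 2          # on average, half the space is searched
    seconds = avg_attempts / GUESSES_PER_SECOND
    return avg_attempts, seconds

password_types = {
    "One-word": WORDS,
    "Two-word": WORDS ** 2,
    "Two-word + digit": WORDS ** 2 * DIGITS,
    "Two-word + digit + symbol": WORDS ** 2 * DIGITS * SYMBOLS,
}

for name, combos in password_types.items():
    avg_attempts, seconds = estimate(combos)
    print(f"{name}: {combos:,} combos, {avg_attempts:,.0f} avg attempts, "
          f"{seconds:,.2f} s = {seconds / 3600:,.2f} h = {seconds / 86400:,.2f} d")
```

Changing `GUESSES_PER_SECOND` shows how faster hardware shrinks every row proportionally, while adding one more element to the password multiplies the search space.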
Data Project 6: Be Professional¶
Introduction
Writing code is not just about “making it run.” It’s about making it solve the intended problem reliably.
Program code that uses incorrect logical conditions, unclear structure, or poor data handling may appear to work in some situations but will often:
Produce wrong results when certain inputs are used (hidden bugs).
Confuse other programmers (including your future self), making it harder to maintain or improve.
Break silently — meaning it doesn’t crash, but gives answers that look right but aren’t.
One of the most important skills in programming is rewriting messy code into clear, logically correct, and readable code. This process is called refactoring. By practicing fixing poorly written code, you’re learning how to:
Think logically about what a program should do.
Detect logical errors in code that still runs.
Write code that other people can trust and understand.
In this project, you will practice rewriting bad code to be correct, clear, and professional.
print("Welcome to the age checker!")
age = input("Enter your age: ")
if age >= "18" or age < "0":
    print("You are an adult or a time traveler.")
elif age < "18" and age > "0" and age == "17" or age == "16":
    print("You are almost an adult but also maybe a kid.")
elif not age == "15" and not age == "14" and not age == "13":
    print("You are a teenager?")
else:
    print("Invalid input or something went wrong maybe.")
There are 5 things that need to be corrected within this program. Can you identify all 5?
Check your understanding here
age is a string, never converted to int → numeric comparisons (>=, <) are unreliable.
Logical operators are inconsistent: or and and are mixed with no parentheses → leads to unintended precedence.
Some conditions are redundant or nonsensical (e.g., "almost an adult but maybe a kid").
The not logic is confusing and hard to read.
Default else message is unclear.
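The first issue is easy to see in isolation. Strings are compared character by character, not by numeric value, so a short demonstration (the variable names are illustrative) gives results that look wrong for ages:

```python
# Text comparison: "9" vs "18" compares the characters '9' and '1'
text_nine_is_adult = "9" >= "18"
text_hundred_is_minor = "100" < "18"

# Numeric comparison after converting with int()
number_nine_is_adult = int("9") >= 18
number_hundred_is_minor = int("100") < 18

print(text_nine_is_adult)        # True  (wrong for ages)
print(text_hundred_is_minor)     # True  (wrong for ages)
print(number_nine_is_adult)      # False
print(number_hundred_is_minor)   # False
```

The same trap applies to any numeric input read with `input()`, including the scores, temperatures, and dice rolls in the exercises.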
What you should do:
Convert input to integer.
Write clear, correct logical conditions with proper ranges.
Simplify and reorder checks (e.g., negative age should be caught early).
Remove conflicting conditions and clarify outputs.
print("Welcome to the age checker!")
age_input = input("Enter your age: ")

# Convert the text to a whole number; anything else is treated as invalid
try:
    age = int(age_input)
except ValueError:
    age = None

if age is None:
    print("Invalid input. Please enter a number.")
elif age < 0:
    print("You entered an impossible age.")
elif age < 13:
    print("You are a child.")
elif age < 18:
    print("You are a teenager.")
else:
    print("You are an adult.")
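One way to make corrected logic like this easy to test is to move it into a function that returns the message instead of printing it. The sketch below is a suggestion rather than a required structure: the name `classify_age` and the list of test inputs are assumptions chosen to cover each boundary.

```python
def classify_age(age_input):
    """Return the message for a given piece of user input (a string)."""
    try:
        age = int(age_input)
    except ValueError:
        return "Invalid input. Please enter a number."
    if age < 0:
        return "You entered an impossible age."
    elif age < 13:
        return "You are a child."
    elif age < 18:
        return "You are a teenager."
    else:
        return "You are an adult."

# Inputs chosen around each boundary, plus two invalid cases
test_inputs = ["-5", "0", "12", "13", "17", "18", "abc", ""]
for value in test_inputs:
    print(f"{value!r:>6} -> {classify_age(value)}")
```

Testing the values just below and just at each boundary (12 and 13, 17 and 18) is the quickest way to catch an off-by-one mistake in a range check.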
Project:
Fix the poorly written examples below (Grade Checker, Temperature Warning, Dice Game Result).
You must rewrite each to be logically correct and easy to read.
After rewriting, test your code with different inputs.
Trade with a partner to verify if the code works logically.
Reflection Questions (use comments to answer these questions)
“What was the hardest bug to fix, and why?”
“How did testing help you find logical problems?”
“What is one thing you’ll do in future projects to avoid writing sloppy code?”
1 — Grade Checker
print("Grade Checker")
grade = input("Enter your test score: ")
if grade > "90" or grade < "0":
    print("A or maybe error??")
elif grade <= "90" and grade >= "80" or grade == "85":
    print("You got B but maybe A??")
elif grade < "80" and grade >= "70" or not grade == "60":
    print("C?")
else:
    print("I don't know what your grade is sorry")
2 — Temperature Warning
print("Temperature Warning System")
temp = input("Enter temperature: ")
if temp < "0" and temp > "100":
    print("Too cold or too hot??")
elif temp >= "0" or temp <= "100" and temp == "50":
    print("Perfect maybe?")
else:
    print("Weather broken I guess")
3 — Dice Game Result
print("Dice Game Result Checker")
roll = input("Enter dice roll (1-6): ")
if roll == "6" or roll < "1" and roll > "6":
    print("You win big!")
elif roll == "3" and roll == "4" or roll == "5":
    print("You win something small.")
else:
    print("Lose maybe?? idk")
Enhancements
Testing Table
Provide a small input-output table for each corrected program (e.g., show 3–5 different inputs and what the output should be).
Style Bonus
Additional point(s) may be awarded for clean formatting, good variable naming convention, and helpful comments (introduces code readability as a skill).
Version Control
Professionals use tools (like Git) to track changes, review code, and prevent sloppy code from reaching production.
Grading Rubric
Category | Excellent (4) | Proficient (3) | Developing (2) | Beginning (1) |
---|---|---|---|---|
Logical Correctness | All code works for all valid inputs | Minor logical mistakes remain | Many logical mistakes remain | Code does not run or does not solve the task |
Clarity & Readability | Code is well-formatted and easy to follow | Mostly clear with some confusing parts | Hard to follow, poor naming or formatting | Very unclear / unreadable |
Testing / Validation | Multiple test cases included and passed | Some test cases provided | Minimal testing shown | No testing |
Comments / Reflection | Clear explanation of fixes and lessons learned | Basic explanation | Minimal notes | No reflection |