Base64 Encoding Demystified
To the uninitiated, SGVsbG8gV29ybGQ= looks like gibberish. To a developer, it says "Hello World". This is Base64, the ubiquitous encoding scheme of the internet. It is not encryption; it is a survival mechanism for binary data in a text-based world.
The Problem: Binary vs. Text Channels
Computers store data as bytes (sequences of 0s and 1s). Images, PDFs, and executable files are all just sequences of bytes. However, many early internet protocols (like Email/SMTP) were designed to carry only plain 7-bit ASCII text. If you paste the raw bytes of an image into an email body, the message is likely to arrive corrupted: the binary data contains control characters (like 'Null' or 'End of File') and bytes outside the ASCII range, which software along the way may misinterpret, alter, or drop.
The Solution: Base64
Base64 solves this by translating any binary data into a safe subset of 64 printable ASCII characters: A-Z (26), a-z (26), 0-9 (10), +, and /.
How it works (The Math):
1. Take 3 bytes of input data (3 * 8 = 24 bits).
2. Divide these 24 bits into 4 chunks of 6 bits each.
3. Map each 6-bit chunk to a character from the Base64 alphabet.
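The three steps above can be sketched in Python as follows. This is an illustrative hand-rolled encoder, not how the standard library implements it; the names ALPHABET and encode_base64 are made up for this example. It also assumes the usual convention for inputs whose length is not a multiple of 3: the last group is zero-filled and the unused output positions are written as =.

```python
import base64  # used only to cross-check the sketch

# The 64-character alphabet: A-Z, a-z, 0-9, +, /
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Encode bytes as Base64 text, three input bytes at a time."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        missing = 3 - len(group)  # 0, 1, or 2 bytes short in the last group

        # Step 1: pack the (zero-filled) group into one 24-bit integer.
        n = int.from_bytes(group + b"\x00" * missing, "big")

        # Step 2: split the 24 bits into four 6-bit chunks.
        # Step 3: map each chunk to a character of the alphabet.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]

        # Assumed padding rule: positions that hold no real data become "=".
        if missing:
            chars[4 - missing:] = ["="] * missing

        out.append("".join(chars))
    return "".join(out)


print(encode_base64(b"Hello World"))  # SGVsbG8gV29ybGQ=
print(encode_base64(b"Hello World") == base64.b64encode(b"Hello World").decode("ascii"))  # True
```

In real code you would call the built-in library instead; the sketch is only meant to make the bit shuffling visible.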
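Since 3 input bytes become 4 output characters, Base64 increases the size of the data by roughly 33%. When the input length is not a multiple of 3, the final group is filled out with zero bits and the output is padded with = so that its length stays a multiple of 4, which is why the example at the top ends in =. This is the cost of compatibility.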
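To make the overhead concrete, here is a small sketch that predicts the encoded length and compares it with what Python's base64 module produces. The helper name encoded_length is made up for this example, and the formula assumes standard padded Base64 with no line breaks inserted.

```python
import base64


def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes: 4 chars per 3-byte group."""
    return 4 * ((n + 2) // 3)  # integer form of 4 * ceil(n / 3)


for n in (1, 2, 3, 10, 1000):
    actual = len(base64.b64encode(bytes(n)))  # bytes(n) is n zero bytes
    print(n, encoded_length(n), actual)
```

For 1000 input bytes the output is 1336 characters, an increase of about 33.6%; the extra fraction comes from padding the final group.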
Use Cases
1. Data URIs: You can embed small icons directly into CSS or HTML to avoid an HTTP request.
background-image: url('data:image/png;base64,iVBORw0KGgo...');
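2. Email Attachments: Behind the scenes, attachments you send from a client like Gmail are typically Base64 encoded under the MIME standard.
3. HTTP Basic Auth: The Authorization header carries username:password as a Base64 string. This is encoding, not protection, so anyone who sees the header can decode it.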
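Two of the use cases above are easy to sketch in Python. The helper names to_data_uri and basic_auth_header are made up for this example, and the inputs are illustrative: the eight bytes are the standard PNG file signature (not a complete image), and the credentials are the sample pair used in the HTTP Basic Auth specification.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data URI of the form data:<mime>;base64,<payload>."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic Auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Every PNG file starts with these 8 bytes, which is why PNG data URIs
# begin with the familiar "iVBORw0KGgo".
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
print(to_data_uri(PNG_SIGNATURE, "image/png"))
# data:image/png;base64,iVBORw0KGgo=

print(basic_auth_header("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Because the header value can be reversed with a single decode call, Basic Auth is normally only used over an encrypted connection.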
Coding with Base64
- JavaScript: btoa("hello") encodes, atob("...") decodes.
- Python: import base64; base64.b64encode(data).
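As a slightly fuller Python sketch, here is a round trip through the standard library. Note that b64encode takes bytes and returns bytes, so text has to be encoded to bytes first and the result decoded to a string if you want to print or embed it. The variable names are just for illustration.

```python
import base64

# Arbitrary binary data, including a NUL byte and bytes outside ASCII.
raw = bytes([0x00, 0xFF, 0x10, 0x80])

encoded = base64.b64encode(raw)       # bytes containing only Base64 characters
decoded = base64.b64decode(encoded)   # back to the original bytes

print(encoded.decode("ascii"))
print(decoded == raw)  # True

# Text goes through bytes on the way in and on the way out.
message = base64.b64decode("SGVsbG8gV29ybGQ=").decode("utf-8")
print(message)  # Hello World
```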
Base64 is a bridge. It allows complex media to travel safely across simple infrastructure.