The moment you tap “Log In,” something small but important happens. Not on the screen, but underneath it. A quiet departure.
A piece of you leaves.
It does not leave as a photograph or a sentence or a neatly packed file. It leaves as fragments. Tiny, structured scraps. A device identifier here. A timestamp there. A cookie that says you were here before. Together, these fragments make up a travelling version of you, one that rarely waits for permission. This is your datafolk in motion.
To understand how datafolk move, it helps to stop thinking about the internet as a cloud. Clouds are vague and polite. The internet is neither. It is closer to a city filled with couriers, routes, checkpoints, warehouses, and rules. Data does not float. It is carried.
And there is no better way to understand how it’s carried than to start with a lunchbox.
The Dabbawalas of Mumbai
Every morning in Mumbai, around 5,000 dabbawalas pick up freshly cooked lunches from homes across the city.1 Each tiffin box, each dabba, travels through a chain of hands, changes trains at Churchgate or Dadar, and arrives at the right office desk by 12:30 PM. The system moves 200,000 lunches daily with an error rate so low that Harvard Business School studied it. One mistake per six million deliveries. Your favourite food delivery app could never.
Here’s the crucial detail: no single dabbawala sees the entire journey. Most cannot read the addresses they carry. Yet the system works, because everyone follows the same code.
The markings on each dabba tell the story. A combination of colours, symbols, and numbers painted on the lid encodes the origin station, the destination station, the building, and the floor. Each dabbawala reads only the part relevant to their leg of the journey. At the origin station, one reads the colour that indicates which train to board. At the destination station, another reads the symbol that points to the neighbourhood. At the final building, someone reads the number that specifies the floor. The dabba itself carries its routing instructions.
This is what happens to your datafolk when they travel through the internet. And understanding dabbawalas is the first step to understanding what happens to your data every time you tap a button on your phone.
The comparison is not decorative. It is structurally precise. Both systems break a delivery into legs, use coded labels for routing, rely on intermediaries who don’t need to understand the full journey, and achieve remarkable efficiency through standardisation. The dabbawala network and the internet are, at their core, the same design pattern, just operating at different scales and in different centuries.
What Is a Packet?
When your phone sends data across the internet, it doesn’t open a direct connection and pour the data through like water through a hose. Instead, the data is broken into small chunks, typically no larger than about 1,500 bytes each. Each chunk is wrapped in some extra information and sent out as an independent unit.
That unit is a packet.
Think of it this way: if you wanted to send a 300-page book through the dabbawala system, you wouldn’t send the whole book as one package. It’s too heavy, too unwieldy, and if it gets lost, you’ve lost everything. You’d also lose a friend, because nobody appreciates being handed a 300-page book and told “deliver this by lunch.”
Instead, you’d tear out each page, put it in its own envelope with instructions on the cover, and send them separately. Each envelope can take a different route. Each one arrives independently. The recipient reassembles the pages in order.
That’s packet switching. And it’s the reason the internet works.
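The tear-out-the-pages idea can be sketched in a few lines of Python. This is a toy illustration, not how a real network stack is written: the function names `packetise` and `reassemble` are made up for this example, and it assumes the full 1,500 bytes are available for content, whereas in practice part of that budget goes to headers.

```python
import random

CHUNK_SIZE = 1500  # bytes per chunk, the typical figure from the text


def packetise(data: bytes, size: int = CHUNK_SIZE) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Sort by sequence number and join the chunks back together."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"page " * 1000      # 5,000 bytes of stand-in content
packets = packetise(message)
random.shuffle(packets)        # packets may arrive in any order
restored = reassemble(packets)

print(len(packets), restored == message)
```

The shuffle stands in for envelopes taking different routes. Because every chunk carries its own number, the arrival order does not matter to the recipient.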
Anatomy of a Packet
Every packet has two parts: a header and a payload. The header is the label on the lunchbox. The payload is the lunch.
The Header
The header is the metadata that tells the network how to handle this packet:
Source address: your device’s IP address, the return address on the envelope. It tells the network where this packet came from, so the response knows where to go back to. Like writing your home address on the dabba lid, except considerably less colourful.
Destination address: the IP address of wherever this packet is headed. When you watch YouTube, the destination is one of Google’s video servers. When you send a WhatsApp message, it’s Meta’s servers. When you’re arguing with a stranger on Twitter, it’s, well, probably better not to think about the infrastructure supporting that particular activity.
Sequence number: which piece of the original message this packet represents. If your video was broken into 10,000 packets, this number tells the receiver “I am packet #4,827, please put me between #4,826 and #4,828 and do try not to lose me.” (Strictly, this label belongs to TCP, which numbers bytes rather than whole packets, but the idea is the same.)
Protocol: the rules this packet follows. The two main protocols are:
- TCP: reliable delivery. If a packet gets lost, TCP requests it again. Used for web pages, messages, file downloads. The responsible older sibling of internet protocols.
- UDP: fast delivery, no guarantees. If a packet gets lost, too bad, it’s gone, we’ve moved on. Used for video calls, online gaming, live streaming. The younger sibling who shows up late with no explanation and somehow gets away with it.
TTL (Time to Live): a countdown that starts at a number (often 64) and decreases by one at each hop. When it hits zero, the packet is discarded. This prevents lost packets from bouncing around the network forever, like a dabba that missed every handoff and is now just riding the local train indefinitely, confusing everyone.
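To make the label concrete, here is a minimal Python sketch of the header fields listed above, plus the TTL countdown. It is a simplified model rather than the real wire format: the `Header` class, the `forward` and `travel` helpers, and the example addresses (taken from ranges reserved for documentation) are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Header:
    source: str        # return address
    destination: str   # where the packet is headed
    sequence: int      # position in the original message
    protocol: str      # "TCP" or "UDP"
    ttl: int = 64      # hops remaining


def forward(header: Header) -> Optional[Header]:
    """Simulate one router hop: decrement TTL, discard the packet at zero."""
    remaining = header.ttl - 1
    if remaining <= 0:
        return None  # packet discarded
    return replace(header, ttl=remaining)


def travel(header: Header, hops_needed: int) -> bool:
    """Return True if the packet survives hops_needed router hops."""
    current: Optional[Header] = header
    for _ in range(hops_needed):
        current = forward(current)
        if current is None:
            return False
    return True


header = Header("192.0.2.10", "198.51.100.7", sequence=4827, protocol="TCP")
print(travel(header, 12))   # a typical short route
print(travel(header, 64))   # one hop too many for a TTL of 64
```

With a starting TTL of 64, the sketch lets a packet pass 63 routers; the 64th would bring the counter to zero and drop it.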
The Payload
The payload is the actual data: a fragment of your message, a slice of an image, a piece of a video frame. On its own, a single packet’s payload is often meaningless. It’s one page torn from a book. But combined with all the other packets, it reconstructs the complete message.
Your entire internet experience, every video, every message, every payment, every doom-scroll session at 2 AM, is packets. All the way down.
Why Packets? A Brief History of Not Reserving the Wire
Before the internet, we had the telephone network. When you called someone, the system created a dedicated circuit, an actual physical path of wires, between you and the other person. That circuit was held open for the entire call, reserved exclusively for you.
This worked fine for voice calls. But it was spectacularly wasteful. Think about a phone conversation: there are pauses, silences, moments when you’re listening, moments when you’re wondering why your aunt called you at this hour. During all that, the dedicated wire is sitting idle. You’re paying for a reserved lane on the highway while your car is parked and you’re at the dhaba having tea.
If you tried to run the internet on circuit switching, you’d need a dedicated wire between every pair of communicating devices. That’s billions of simultaneous connections. It’s physically and economically impossible. It’s also an infrastructure engineer’s anxiety dream.
In the 1960s, researchers including Paul Baran and Donald Davies independently came up with a radical idea: don’t reserve the wire.2 Instead, break the data into small packets, label each one with its destination, and let them share the wires.
Packets from different senders interleave on the same cable. Your YouTube video packets share the same fibre optic cable as someone else’s WhatsApp messages and a third person’s UPI payment. Each packet finds its way to the right destination based on its header, just like dabbas from different households sharing the same train and the same dabbawalas.
This is why the internet is efficient. This is also why it’s chaotic: packets can arrive out of order, get lost, or take different routes. The protocols (TCP and UDP) exist to handle this chaos, which is fundamentally the same job as a very patient project manager.
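The difference between the two protocols' attitudes to that chaos can be shown with a toy Python sketch. It is deliberately simplified: real TCP relies on acknowledgements, timers, and byte-level sequence numbers, none of which appear here, and the function names and the `resend` callback are hypothetical.

```python
def missing_sequences(received: list[int], total: int) -> list[int]:
    """List the sequence numbers that never arrived."""
    seen = set(received)
    return [seq for seq in range(total) if seq not in seen]


def deliver_reliable(arrivals, total, resend):
    """TCP-like: ask again for the gaps, then hand everything over in order."""
    packets = dict(arrivals)
    for seq in missing_sequences(list(packets), total):
        packets[seq] = resend(seq)
    return [packets[seq] for seq in range(total)]


def deliver_best_effort(arrivals):
    """UDP-like: hand over whatever arrived, in the order it arrived."""
    return [chunk for _, chunk in arrivals]


original = ["the", "chai", "is", "ready"]
arrivals = [(2, "is"), (0, "the"), (3, "ready")]  # packet 1 was lost

print(deliver_reliable(arrivals, len(original), lambda seq: original[seq]))
print(deliver_best_effort(arrivals))
```

The reliable path pays for its completeness with an extra round trip for the missing piece, which is why live video and games often prefer the best-effort path.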
The Journey Begins: A UPI Payment
Let’s follow a single action through the system. You are standing at a chai stall in Koramangala, Bangalore. You scan the shopkeeper’s PhonePe QR code and enter ₹20. You tap “Pay.” Your phone tells you the transaction is complete. The chai tastes the same as always. Nothing seems to have happened except that your bank balance dropped by twenty rupees.
But beneath that simple tap, your datafolk just took a journey longer and more complex than any dabbawala’s route.
The moment you tapped “Pay,” your phone created a payment request and split it into packets. Together they carried your UPI ID (a kind of address), the shopkeeper’s UPI ID, the amount, a timestamp, and your phone’s device identifier. Each packet was wrapped in a header that routers along the way could read, just as each dabba carries painted codes that dabbawalas interpret.
These packets did not travel directly to your bank. They could not. The internet does not work that way. Instead, your phone’s operating system handed them to your mobile network. The network’s nearest tower received them and passed them to the telecom operator’s regional gateway. From there, they entered the internet backbone, the digital equivalent of Mumbai’s railway network, where they hopped between routers operated by multiple companies until they reached the National Payments Corporation of India (NPCI), which operates the UPI system.
At NPCI, the request was authenticated, checked against your bank’s records, and routed to both your bank and the shopkeeper’s bank. Multiple packets travelled back: confirmations, balance updates, transaction logs. Your phone’s “Success” screen appeared only after this entire round trip was completed, typically in under two seconds.
In those two seconds, your datafolk split into multiple copies, passed through at least a dozen different systems operated by different organisations, and left traces in each one. The chai stall’s QR code provider logged the transaction. Your phone’s UPI app recorded it. Your telecom operator logged the data packets. NPCI stored the transaction details. Both banks updated their records. The shopkeeper’s payment provider noted the incoming payment.
A single tap. At least seven copies of your datafolk created. And you noticed nothing but the chai cooling in your hand.
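The count of seven can be reproduced with a small Python sketch that passes one payment to each record keeper named above. It is only a bookkeeping illustration: the `RecordKeeper` class and the UPI IDs are invented for this example, and in reality each organisation stores different fields rather than an identical copy.

```python
from dataclasses import dataclass, field


@dataclass
class RecordKeeper:
    """A system that keeps its own copy of whatever passes through it."""
    name: str
    log: list = field(default_factory=list)

    def handle(self, transaction: dict) -> None:
        self.log.append(dict(transaction))  # store a separate copy


# The record keepers named in the text, in no particular order.
KEEPERS = [RecordKeeper(name) for name in (
    "QR code provider",
    "UPI app",
    "telecom operator",
    "NPCI",
    "your bank",
    "shopkeeper's bank",
    "shopkeeper's payment provider",
)]

# Hypothetical UPI IDs; only the amount comes from the example.
payment = {
    "payer": "you@examplebank",
    "payee": "chaistall@examplebank",
    "amount_inr": 20,
}

for keeper in KEEPERS:
    keeper.handle(payment)

copies = sum(len(keeper.log) for keeper in KEEPERS)
print(copies)  # 7
```

Deleting the original `payment` dictionary afterwards would change nothing in any of the seven logs, which is the point of the section.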
What Each Intermediary Sees
Here’s what makes this interesting, and a little unsettling.
Even when the payload is encrypted (and it usually is, thanks to HTTPS), the header is visible. It has to be: the network needs to read the destination address to route the packet, just like a dabbawala needs to read the lid to know where the dabba goes.
This means everyone between you and the destination can see:
| Who | What They See |
|---|---|
| Your Wi-Fi router | Every IP address you connect to, when, and how much data you exchange. It’s the nosiest device in your home, and you gave it admin access. |
| Your ISP (Jio, Airtel, BSNL) | Same as the router, plus your account identity. They may not know you watched a specific YouTube video, but they know you connected to YouTube’s servers for 45 minutes at 11 PM. |
| Internet exchange points | Aggregated traffic patterns. Not individual content, but flow data, what’s going where and how much of it. |
| NPCI | Both sides of every UPI transaction in India. Sender, receiver, amount, time, banks, apps. The most complete picture of Indian financial activity that has ever existed. |
This is metadata, data about data. And metadata is often more revealing than the content itself.
Consider: we don’t need to read your messages to know that you called a divorce lawyer at 2 AM, then a real estate agent the next morning. The pattern of connections tells the story. We also don’t need to read your UPI payloads to know that you make a payment to the same wine shop every Friday evening. The metadata (IP addresses, timestamps, frequency) is enough to write your biography. An unflattering one.
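How little it takes to spot such a habit can be shown with a short Python sketch. The connection log below is entirely made up, as are the `.example` destinations and the `weekly_habits` helper; the only inputs are timestamps and destinations, with no content at all.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical connection log: (timestamp, destination) pairs only.
log = [
    (datetime(2024, 3, 1, 19, 5), "wine-shop.example"),
    (datetime(2024, 3, 4, 9, 30), "news.example"),
    (datetime(2024, 3, 8, 19, 20), "wine-shop.example"),
    (datetime(2024, 3, 15, 18, 50), "wine-shop.example"),
    (datetime(2024, 3, 16, 11, 0), "news.example"),
    (datetime(2024, 3, 22, 19, 10), "wine-shop.example"),
]


def weekly_habits(entries, min_count=3):
    """Return (weekday, destination) pairs seen at least min_count times."""
    counts = Counter(
        (WEEKDAYS[ts.weekday()], dest) for ts, dest in entries
    )
    return {key: n for key, n in counts.items() if n >= min_count}


print(weekly_habits(log))
```

Six rows of timestamps are enough for the Friday pattern to surface. An ISP or payment network sees far more than six rows.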
The Cables Under the Sea
Your ₹20 UPI payment probably stayed within India, bouncing between data centres in Mumbai, Chennai, and Bangalore. But most of your internet activity doesn’t.3
When you open YouTube, your packets might travel to Google’s nearest edge server, possibly in Mumbai or Singapore. When you use Instagram, your data may cross the ocean to one of Meta’s data centres, many of which are in the United States. When you search for something on Google, depending on what you’re searching and where the relevant index is cached, your query might hop through an undersea fibre optic cable.
There are currently over 500 submarine cables crisscrossing the world’s oceans, carrying approximately 99% of intercontinental data traffic. Some of these cables are as thin as a garden hose, sitting on the ocean floor, carrying the collective internet activity of entire nations. The idea that your 2 AM Wikipedia deep-dive about capybara social behaviour travels through a tube on the literal ocean floor is, frankly, one of the more humbling facts about modern technology.
India’s international connectivity lands primarily at submarine cable landing stations in Mumbai and Chennai. Eleven of India’s submarine cables come ashore at Versova, a six-kilometre stretch of coastline in suburban Mumbai that carries the vast majority of India’s international data traffic. Your datafolk’s journey abroad almost certainly passes through this unremarkable patch of beach.
Data Localisation and the Border Problem
Packets don’t have passports. They don’t respect borders. A packet from Bangalore to Delhi might route through Singapore if that’s the most efficient path. Your data is a cosmopolitan traveller who never asked for a visa.
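A small Python sketch shows how a detour abroad can win on cost alone. It uses a textbook shortest-path search (Dijkstra's algorithm) over a made-up map; the link costs are invented numbers, not measurements, and real inter-network routing also depends on commercial agreements between operators, which this ignores.

```python
import heapq

# Hypothetical link costs (think milliseconds, including congestion).
LINKS = {
    "Bangalore": {"Chennai": 8, "Mumbai": 60},
    "Chennai": {"Bangalore": 8, "Singapore": 30},
    "Mumbai": {"Bangalore": 60, "Delhi": 40},
    "Singapore": {"Chennai": 30, "Delhi": 35},
    "Delhi": {"Mumbai": 40, "Singapore": 35},
}


def cheapest_path(links, start, goal):
    """Dijkstra's algorithm: return (total_cost, path) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in links[node].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + weight, neighbour, path + [neighbour])
                )
    return None


print(cheapest_path(LINKS, "Bangalore", "Delhi"))
```

With these assumed costs the domestic route through Mumbai adds up to 100, while the route through Chennai and Singapore adds up to 73, so the packet leaves the country without anyone deciding that it should.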
This creates a problem. The physical location of a server determines which country’s laws apply to the data on it. A server in Mumbai is subject to Indian law. A server in Virginia is subject to American law, even if it stores data about Indian citizens, collected by an Indian company, about transactions that happened in India.
In April 2018, the Reserve Bank of India issued a circular that changed this equation for financial data: all payment system operators must store their data exclusively in India.4 Not merely a copy: the primary data. On Indian servers. Under Indian law.
This is data localisation, and it’s one of the defining debates of our time. We’ll explore it in depth in a later chapter. But the seed is here: your datafolk, once born and set in motion, can end up anywhere. The question of who controls where they come to rest, and who gets to look at them once they do, is ultimately a question of power.
Cookies, Trackers, and the Passengers You Didn’t Invite
So far, we’ve talked about datafolk you knowingly created: a UPI payment, a YouTube video, a login. But your datafolk also pick up hitchhikers.
When you visit a website, it often drops a cookie on your device, a small file that remembers you. First-party cookies are generally useful: they keep you logged in, remember your language preference, save your shopping cart. They’re the friendly neighbourhood shopkeeper who remembers your usual order.
Third-party cookies are different. These are placed not by the website you’re visiting, but by advertisers, analytics companies, and data brokers who have code embedded on that website. When you visit a news article, the article’s website might drop 30 to 60 third-party cookies. Each one is a tiny datafolk birth, a new tracker that can follow you across the internet, building a profile of your browsing habits across different websites.
You read a health article about migraines. A tracker notes it. You search for headache medication. A different tracker notes it. You visit a pharmacy’s website. A third tracker notes it. Now three different companies know you have migraines, and your personalised ad experience is about to become very, very specific. You will see migraine medication advertisements for the next three months. You will see them everywhere. You will begin to wonder if the ads are causing the migraines.
This is cross-site tracking, and it’s the engine that powers most of the internet’s advertising economy. Your datafolk didn’t just travel from your phone to a server. They multiplied at every stop, and each copy wandered off to tell someone about your headache.
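The mechanism can be sketched in Python with a single tracker whose script is assumed to be embedded on all three sites from the migraine example. Everything here is hypothetical: the `Tracker` class, the cookie ID, and the `.example` domains. Real trackers are considerably more elaborate.

```python
from collections import defaultdict


class Tracker:
    """A third party whose script is embedded on many unrelated sites."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.profiles = defaultdict(list)  # cookie id -> pages seen

    def observe(self, cookie_id: str, site: str, topic: str) -> None:
        """Called each time a page carrying the tracker's script loads."""
        self.profiles[cookie_id].append((site, topic))


tracker = Tracker("ads.example")

visits = [
    ("health-news.example", "migraine article"),
    ("search.example", "headache medication"),
    ("pharmacy.example", "pain relief"),
]

# The same cookie ID is sent to the tracker from every site you visit.
for site, topic in visits:
    tracker.observe("cookie-123", site, topic)

profile = tracker.profiles["cookie-123"]
sites_linked = {site for site, _ in profile}
print(len(profile), len(sites_linked))
```

None of the three sites shared anything with each other. The link between them exists only in the tracker's profile, keyed by the cookie it placed on your device.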
The Chain Reaction
This is the takeaway of this chapter, and it’s worth stating plainly: sharing data is rarely a single action. It is a chain reaction.
When you tapped “Pay” for a ₹20 chai, you didn’t send one piece of data to one place. You triggered a cascade of packets across multiple networks, creating copies at every intermediary, generating metadata that reveals your patterns, and leaving traces that persist long after the chai is finished and you’ve moved on with your day.
When you visited a website, you didn’t just load a page. You announced your presence to dozens of trackers, each of which sent your information to different servers in different countries operated by different companies under different privacy laws.
When you installed an app and tapped “Allow” on the permissions popup, you didn’t just grant access to your camera or contacts. You opened a channel that can keep generating datafolk for as long as the app is installed, often whether you’re using it or not.
Every digital action is a departure. Every departure creates copies. Every copy has a destination you didn’t choose and a lifespan you don’t control.
Your datafolk are out there, riding the cables, bouncing between routers, sitting in databases you’ll never see. They are the dabbawalas who never come home.
The next question is: where do they end up, and who’s waiting for them when they arrive?
Try It Yourself
The Packet Tracer experiment in the lab lets you send a packet from your phone to a web server and watch it hop through each network node. Toggle the packet header to see what metadata each intermediary can read, even when the payload is encrypted.
Previously: How Datafolk Are Born, from census takers to Aadhaar, the origins of data collection. Next chapter: The Invisible Bazaar, where your datafolk are valued, traded, and profiled.