TL;DR: Onchain data lives directly on the blockchain, making it permanent, transparent, and verifiable by anyone. Offchain data is stored outside the blockchain, typically on traditional servers, decentralized storage networks like IPFS, or private databases. Most real-world applications use a combination of both, keeping critical state onchain while storing larger or less essential data offchain to save cost and improve performance.
The Simple Explanation
Block space is scarce on every blockchain. Storing data onchain means writing it into a transaction or smart contract state that every full node in the network records permanently. This makes the data tamper-proof and universally accessible, but it comes at a cost. On Ethereum, writing a new 32-byte storage slot costs roughly 20,000 gas, which can amount to several dollars when gas prices are high. Storing an entire image or document onchain would be prohibitively expensive.
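To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes the EVM charge of about 20,000 gas for writing a previously empty 32-byte storage slot, and it ignores the base transaction fee and other overhead. The gas price and ETH price in the usage lines are illustrative assumptions, not current market figures, and `storage_cost_usd` is a hypothetical helper name.

```python
# Rough estimate of what it costs to write fresh data into Ethereum contract storage.
# Assumption: each new 32-byte slot costs ~20,000 gas (SSTORE from empty to non-empty);
# base transaction cost and cold-access surcharges are ignored for simplicity.
SSTORE_NEW_SLOT_GAS = 20_000
SLOT_BYTES = 32
WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18


def storage_cost_usd(num_bytes, gas_price_gwei, eth_price_usd):
    """Approximate USD cost of storing num_bytes in new storage slots."""
    slots = -(-num_bytes // SLOT_BYTES)  # ceiling division: partial slots cost a full slot
    gas = slots * SSTORE_NEW_SLOT_GAS
    cost_wei = gas * gas_price_gwei * WEI_PER_GWEI
    return cost_wei * eth_price_usd / WEI_PER_ETH


# Illustrative inputs (assumptions): 50 gwei gas price, 3,000 USD per ETH.
print(storage_cost_usd(32, 50, 3000))         # 3.0      (one slot)
print(storage_cost_usd(1_000_000, 50, 3000))  # 93750.0  (a 1 MB image)
```

Under these assumed prices a single slot costs about three dollars, while a one-megabyte image would run to tens of thousands of dollars, which is why large files are kept offchain.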
That is where offchain data comes in. Instead of storing a full image on the blockchain, a developer might store the image on IPFS (a decentralized file storage network) and then store only the IPFS content hash onchain. The blockchain record proves what the data should be, and the offchain storage actually holds it. This hybrid approach gives you the verifiability of blockchain with the scalability of traditional or decentralized storage systems.
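The hash-pointer pattern described above can be sketched in a few lines. This is a simplified illustration, not how IPFS or any particular contract works internally: plain SHA-256 stands in for a real IPFS content identifier (which uses its own encoding and chunking), and the two dictionaries `onchain_record` and `offchain_store` are hypothetical stand-ins for contract storage and an offchain storage network.

```python
import hashlib

# Hypothetical stand-ins: a dict playing the role of contract storage,
# and a dict playing the role of IPFS or a file server.
onchain_record = {}
offchain_store = {}


def content_hash(data: bytes) -> str:
    """Hash of the content (SHA-256 here as a stand-in for an IPFS CID)."""
    return hashlib.sha256(data).hexdigest()


def publish(item_id: int, data: bytes) -> str:
    """Store the full data offchain and only its hash 'onchain'."""
    digest = content_hash(data)
    offchain_store[digest] = data      # large payload lives offchain
    onchain_record[item_id] = digest   # small fixed-size commitment lives onchain
    return digest


def fetch_verified(item_id: int) -> bytes:
    """Fetch offchain data and check it against the onchain hash."""
    expected = onchain_record[item_id]
    data = offchain_store[expected]
    if content_hash(data) != expected:
        raise ValueError("offchain data does not match the onchain hash")
    return data


publish(1, b"pretend these are image bytes")
assert fetch_verified(1) == b"pretend these are image bytes"
```

If whoever hosts the file swaps or corrupts it, the recomputed hash no longer matches the onchain value and `fetch_verified` raises an error. The onchain record cannot stop the file from disappearing, though; it only proves whether the data you received is the data that was committed to.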
Think of it like a property deed. The deed itself (a small, critical record of ownership) gets recorded in the county clerk's office. But the actual house, the blueprints, the inspection reports, and the photos all exist elsewhere. The deed points to the property. The blockchain works the same way: it stores the critical proofs and pointers while larger datasets live offchain.

What Counts as Onchain Data
Onchain data includes everything that is permanently recorded in the blockchain's state or transaction history. This covers transaction records (sender, receiver, amount, and the timestamp of the block that included the transaction), smart contract code and state variables (token balances, ownership records, governance votes), event logs emitted by smart contracts, and block metadata like timestamps and validator information.

