Zero-knowledge: the solution to privacy problems on Blockchain?
- Zero-knowledge is a concept used in cryptocurrency which allows us to prove that we hold certain information, or that a statement about it is true, without disclosing the information itself.
- This technology can be distinguished from encryption, where all data is accessible as soon as the decryption key is available.
- The two best known zero-knowledge proof protocols in the blockchain world are zk-STARKs and zk-SNARKs.
- These protocols have the advantage of producing proofs that are short and fast to verify.
- Research is far from finished: work continues on making proofs cheaper to compute and quicker to verify.
In the world of blockchains and cryptocurrencies, the concept of zero-knowledge is often mentioned as an efficient solution to confidentiality and privacy issues. The principle of this technology is to prove the existence of certain information without having to disclose it. This is a major breakthrough, although it has not put an end to research into improving existing systems.
Is zero-knowledge proof the future of Blockchain?
This technology has many practical applications. For example, it can be used to verify a fact about a person’s identity without revealing their name. Someone who wants to prove that they are over 18 could show their driving licence, but this reveals not only their age but also their name, date of birth and several other personal details. A zero-knowledge proof can establish that the person is over 18 without revealing any of the information on the driving licence.
Zero-knowledge can also prove that someone has made a transaction without revealing the amount. One of the building blocks involved is the so-called cryptographic hash function. A hash function, which has no exact equivalent in the physical world, is an algorithm that transforms any digital data (an image, a text file, etc.) into a fixed-size value, such as a 256-bit sequence. For example, the SHA-256 standard is widely used in blockchains, as it offers a high level of security, and it always produces a 256-bit (32-byte) value, usually written as a string of 64 hexadecimal characters. If it is given the same file a second time, it will give exactly the same answer. These algorithms are standardised, and only a handful are in widespread use worldwide.
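To make the sizes concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `fingerprint` and the sample messages are made up for illustration; only `hashlib.sha256` is a real library call.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical piece of data to fingerprint.
message = b"Alice pays Bob 5 coins"
digest = fingerprint(message)

print(len(digest))                     # number of hexadecimal characters
print(len(bytes.fromhex(digest)))      # number of bytes in the raw digest
print(fingerprint(message) == digest)  # same input, same hash
print(fingerprint(b"Alice pays Bob 6 coins") == digest)  # one character changed
```

The digest has 64 hexadecimal characters, which is 32 bytes or 256 bits, whatever the size of the input. Hashing the same message twice gives the same digest, while the slightly altered message gives a different one.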
This hash can be compared to a fingerprint: it is much smaller than the original information, yet in practice it identifies that information precisely and uniquely. This fingerprint, or minimal trace, is recorded on a blockchain, and it is on the basis of this fingerprint that facts about the information can be proven without revealing the information itself. What is important to understand is that this technology differs from encryption, which uses a cryptographic algorithm to make data unintelligible, and where the data becomes fully accessible to anyone holding the decryption key. Encryption is the all-or-nothing solution: if you don’t have the key, you can’t know anything, and if you do have it, you know everything.
How does this work in cryptocurrencies?
Zero-knowledge proofs work differently, as they allow the veracity of hidden data to be proven without revealing it. In the Bitcoin network, for example, Merkle trees are used for data verification: this method makes it possible to check one piece of data using only a few hashes instead of the whole data set. In concrete terms, blocks contain transactions, and the header of each block contains the root of a Merkle tree built from those transactions. This root allows a large amount of data to be “pledged”, or committed to, with a single very short hash, and each piece of data can then be individually certified. There is a so-called “thin client” version of Bitcoin (also known as a light or SPV client), which in effect only downloads the block headers.
This pledge, or commitment, is a certification seal. If I want to prove that I have put a document in the blockchain, I produce the document, and anyone can hash it and compare the result with the recorded fingerprint. In the case of a Merkle tree, you must also produce the chain of hashes that links the document to the root. Bitcoin is not the only system to use Merkle trees: Ethereum, for example, makes use of three of them (for state, transactions and receipts). They are essential for reducing the amount of data that needs to be kept in a blockchain for verification purposes.
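The idea of certifying one item against a Merkle root can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's exact construction: it uses a single SHA-256 where Bitcoin hashes twice, the function names are hypothetical, and the transactions are placeholder byte strings. Duplicating the last node on odd-sized levels is also just one possible convention.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Single SHA-256 (a simplification; Bitcoin applies SHA-256 twice)."""
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index].

    The proof is the "hash chain": a list of (sibling_hash, sibling_is_left)
    pairs, one per tree level, from the leaf up to the root.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf to the root using only the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Placeholder transactions, purely illustrative.
txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root, proof = merkle_root_and_proof(txs, 2)
print(verify(b"tx-c", proof, root))    # genuine transaction
print(verify(b"forged", proof, root))  # data that was never committed
```

The verifier needs only the root, the item and a number of hashes that grows with the logarithm of the data set, which is why a client holding block headers alone can still check that a given transaction belongs to a block.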
This technology also brings a second important advance: zero-knowledge makes it possible to provide proof that calculations have been carried out correctly, without having to redo them, and without revealing all the necessary information. This is an undeniable saving of time and resources.
Which protocols are involved?
The two most common zero-knowledge proof protocols in the blockchain world are known as zk-STARKs and zk-SNARKs. zk-STARK stands for “zero-knowledge Scalable Transparent Argument of Knowledge”, and zk-SNARK stands for “zero-knowledge Succinct Non-interactive Argument of Knowledge”. What they have in common is that they are non-interactive: the prover publishes the proof once, and anyone can verify it afterwards without any further exchange with the prover. The submission and verification of proofs is usually done in batches covering many transactions. As a result, one proof can vouch for a whole batch while remaining much smaller, and much faster to verify, than the data it covers.
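What “non-interactive” means can be illustrated with a much older and simpler construction than either protocol: a Schnorr proof of knowledge made non-interactive with the Fiat-Shamir technique. The sketch below is neither a zk-SNARK nor a zk-STARK, and its parameters (`P`, `Q`, `G`) are toy values chosen for readability, far too small to be secure. The prover shows that it knows a secret number behind a public key without revealing that number.

```python
import hashlib
import secrets

# Toy group parameters (illustrative assumption, NOT secure sizes).
P = 2039  # prime modulus, P = 2 * Q + 1
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup modulo P


def challenge(public_key: int, commitment: int) -> int:
    """Fiat-Shamir: a hash of the public values replaces the verifier's question."""
    data = f"{G}|{public_key}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int):
    """Prove knowledge of `secret` such that public_key = G^secret mod P."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1  # fresh random value in [1, Q-1]
    commitment = pow(G, nonce, P)
    c = challenge(public_key, commitment)
    response = (nonce + c * secret) % Q   # the nonce masks the secret
    return public_key, (commitment, response)


def verify(public_key: int, proof) -> bool:
    """Check G^response == commitment * public_key^c (mod P)."""
    commitment, response = proof
    c = challenge(public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P


public_key, proof = prove(secret=123)  # 123 is a made-up secret
print(verify(public_key, proof))       # the verifier never sees 123
```

The proof is a single pair of numbers that the prover can publish once and anyone can check later, which is the sense in which such proofs can be deployed and act autonomously.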
zk-SNARKs have already been in use for several years through the Zcash protocol, which leverages them to provide a blockchain experience that respects the privacy of exchanges, while providing sufficient proof that each transaction is valid.
The zk-STARKs, on the other hand, appeared more recently, in 2018. In addition to privacy and confidentiality issues, zk-STARKs are positioned as a solution to the problem of scaling, i.e. the capacity of the blockchain to handle a growing number of transactions, because they allow computation and storage to be moved off-chain. zk-STARKs are also presented as more secure in the long term: they rely on hash functions, which are believed to withstand attacks by quantum computers. Both zk-STARKs and zk-SNARKs thus pave the way for faster verification.
Both protocols use advanced cryptography, and zero-knowledge technology is already widely used by some startups. Yet its potential to solve crucial industrial problems in the blockchain world has not brought research to a halt. Researchers still have a lot to say about performance, and want proofs to be as short as possible, quick to verify, and quick to compute.