
FoCS 08 - Security

Digital security is built on mathematics.

About the cover image: the inventors of the RSA algorithm.

# CIA

CIA stands for three fundamental concepts in security: confidentiality, integrity, and availability. When a security issue arises, it means that one or more of these has been compromised. The goal of security is to keep all three intact.

# Confidentiality

Confidentiality means that data or a message is known only to the people who are supposed to know it. For example, if you answer a phone call on a bus, the only person who should hear your response is the one on the other end of the call. However, the people around you can also hear your reply, even if they do not intend to listen, so confidentiality is potentially compromised.

Two kinds of attack threaten confidentiality: sniffing and traffic pattern analysis. As in the case above, sniffing is like a person near you who observes what you do and tries to extract valuable information from your messages. Today, most information is encrypted in transit, which helps defend against this kind of attack. Traffic pattern analysis works differently: the attacker collects and analyzes the victim's behaviour patterns and draws reasonable inferences from the statistics. For instance, although the attacker does not know the password of your bank account, he knows that you always go to the bank at 4 pm.

# Integrity

Integrity means that data has not been modified by people or actions without permission. Like distortion in a conversation, a modified message can no longer convey its original meaning and may even introduce errors. As with confidentiality, a breach can lead to serious consequences. Integrity covers not only the message content itself but also its metadata and related properties, including the source of the message, replay protection (anti-replay), and non-repudiation (anti-denial).

A replay attack is an attempt to illegitimately repeat an earlier operation. The attacker captures a piece of traffic and resends it to the target, impersonating the legitimate sender. For example, suppose the victim has transferred money to the attacker for some reason, and the attacker wants more. By replaying the captured request, the attacker can impersonate the victim and initiate a new transaction. In contrast, a denial (repudiation) attack is carried out by a legitimate party in the communication (the sender or the receiver). Consider an e-commerce store that receives payment from a buyer but refuses to ship, claiming it never received the money.
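One common way to resist replay is to attach a fresh random value (a nonce) to every request and to authenticate the nonce together with the content. The following is only a minimal sketch under assumptions that are not part of this post: both parties already share `SHARED_KEY`, and the hypothetical helpers `make_request` and `accept_request` stand in for a real protocol.

```python
import hashlib
import hmac
import secrets

SHARED_KEY = b"demo-shared-key"  # assumption: both parties already hold this key

def make_request(payload):
    """Sender side: attach a fresh nonce and a MAC over nonce + payload."""
    nonce = secrets.token_bytes(16)
    tag = hmac.new(SHARED_KEY, nonce + payload, hashlib.sha256).digest()
    return nonce, payload, tag

seen_nonces = set()  # receiver side: every nonce accepted so far

def accept_request(nonce, payload, tag):
    """Receiver side: reject modified requests and repeated nonces."""
    expected = hmac.new(SHARED_KEY, nonce + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # content or nonce was modified in transit
    if nonce in seen_nonces:
        return False  # same request seen before: treat it as a replay
    seen_nonces.add(nonce)
    return True

request = make_request(b"transfer 100 to the shop")
first_attempt = accept_request(*request)   # the genuine request
replayed = accept_request(*request)        # the captured copy, sent again
```

The genuine request is accepted once, while the captured copy is refused because its nonce has already been used.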

# Availability

Availability means that users can always access the service they need. The most common way to attack availability is a denial-of-service (DoS) attack, which exhausts the resources of the service provider so that the service is no longer accessible. The usual countermeasures are to filter out illegitimate incoming requests and to scale up the service or cluster. Since availability is not the main point of this post, I am going to skip this part.

# Encryption

Before introducing encryption, there are two simple concepts we need to grasp:

  • plaintext: the original content of the data, without any encryption.
  • ciphertext: the plaintext after encryption.

Traditional encryption methods are character-based, while modern methods operate on bits so that they can support any kind of media. In this section, we only discuss modern cryptography. In addition, ciphertext can be restored to plaintext with the right key, so encryption and decryption are inverse operations of each other.

# Work Mode

The work mode is another algorithmic concept; it defines how the data is organized for encryption. In this post, we will discuss two primary work modes: stream and block. In stream mode, the plaintext is treated as a bit stream (a long sequence of 0s and 1s) and is encrypted bit by bit. The algorithm generates a keystream of the same length as the plaintext, and each bit of plaintext is combined with the corresponding keystream bit, typically with XOR, to produce one bit of ciphertext.

stream cipher: XOR
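The XOR step can be sketched in a few lines. This is a toy illustration, not a production cipher: the `keystream` helper is an assumption of this sketch and simply stretches the key with SHAKE-256 from the standard library, and it works on bytes rather than single bits for readability.

```python
import hashlib

def keystream(key, length):
    """Toy keystream: expand the key into `length` pseudo-random bytes."""
    return hashlib.shake_256(key).digest(length)

def xor_stream(data, key):
    """XOR every byte of data with the matching keystream byte."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

key = b"shared secret"
plaintext = b"attack at dawn"

ciphertext = xor_stream(plaintext, key)   # encrypt
recovered = xor_stream(ciphertext, key)   # decrypt: the very same operation
```

Because XOR is its own inverse, applying the same keystream twice returns the plaintext. A real stream cipher also mixes in a per-message nonce so that the same keystream is never reused for two messages.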

In block mode, encryption operates on blocks of the plaintext, and the block size depends on the specific algorithm. If the last part is not long enough to fill a block, it is padded, for example with zeros. In a chained mode such as CBC, the input to each step is a block of plaintext, the key, and the ciphertext of the previous block, and the output is one block of ciphertext. The first block, which has no predecessor, uses an initialization vector (IV) instead. Stream ciphers are usually faster than block ciphers.

block cipher
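The chaining idea can be sketched as follows. Everything here is an illustrative assumption: `toy_block_encrypt` is a deliberately insecure stand-in for a real block cipher such as AES, the block size is 8 bytes instead of 16, and the padding is PKCS#7-style (n bytes of value n) rather than plain zeros, because it can be removed unambiguously.

```python
BLOCK = 8  # toy block size in bytes (AES uses 16)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def pad(data):
    """PKCS#7-style padding: append n bytes, each with value n."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n

def unpad(data):
    return data[:-data[-1]]

def toy_block_encrypt(block, key):
    """Insecure placeholder for a real block cipher: XOR, then reverse."""
    return xor(block, key)[::-1]

def toy_block_decrypt(block, key):
    return xor(block[::-1], key)

def cbc_encrypt(plaintext, key, iv):
    prev, out = iv, []
    data = pad(plaintext)
    for i in range(0, len(data), BLOCK):
        # Mix the previous ciphertext block into the current plaintext block.
        prev = toy_block_encrypt(xor(data[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(ciphertext, key, iv):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_block_decrypt(block, key), prev))
        prev = block
    return unpad(b"".join(out))

key = b"8bytekey"
iv = b"initvect"
message = b"hello, block mode"
encrypted = cbc_encrypt(message, key, iv)
decrypted = cbc_decrypt(encrypted, key, iv)
```

Because each block depends on the previous ciphertext block, two identical plaintext blocks do not produce identical ciphertext blocks.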

# Symmetric Encryption

This kind of encryption is better known to the public because it is more straightforward: encryption and decryption use the same key. From today's perspective, some symmetric encryption algorithms are considered practically unbreakable, and they are used across many industries. For example, AES-256-CBC uses a 256-bit key, so a brute-force search would have to cover up to 2^256 possible keys, which is far beyond any realistic amount of computing time. The key is generated from a seed provided by the users (the two or more sides of a communication). For every user to encrypt and decrypt successfully, all of them must hold the same seed and therefore generate the same key, so a safe way to synchronize the seed between users is a prerequisite for using the algorithm.

symmetric encryption
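To illustrate how the same seed leads to the same key, here is a small sketch that derives a 256-bit key with PBKDF2 from Python's standard library. The seed strings, the salt, and the iteration count are illustrative assumptions, not values from any particular system.

```python
import hashlib

def derive_key(seed, salt):
    """Derive a 32-byte (256-bit) key from a shared seed."""
    return hashlib.pbkdf2_hmac("sha256", seed, salt, 100_000, dklen=32)

salt = b"public-demo-salt"  # the salt does not need to be secret

alice_key = derive_key(b"our shared seed", salt)
bob_key = derive_key(b"our shared seed", salt)
eve_key = derive_key(b"a wrong guess", salt)

print(alice_key == bob_key)  # both sides end up with the same key
print(alice_key == eve_key)  # a different seed gives a different key
```

The derivation is deterministic, so the only thing the two sides must agree on in secret is the seed itself.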

# Asymmetric Encryption

Asymmetric encryption is a solution for secure seed exchange. Unlike symmetric encryption, the sender and the receiver each have their own key pair, consisting of a private key (stored only locally and never exposed to the public) and a public key (exposed to the public). Anything encrypted with a public key can only be decrypted with the matching private key.

asymmetric encryption
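The relationship between the two keys can be shown with textbook RSA and tiny primes. This is a toy sketch for intuition only: real RSA keys use primes that are hundreds of digits long and add padding, and the numbers below are a classic classroom choice rather than anything from this post.

```python
# Toy key generation with tiny textbook primes (never use sizes like this).
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: inverse of e mod phi (Python 3.8+)

public_key = (e, n)
private_key = (d, n)

def encrypt(m, key):
    exponent, modulus = key
    return pow(m, exponent, modulus)

def decrypt(c, key):
    exponent, modulus = key
    return pow(c, exponent, modulus)

message = 65                              # a message encoded as a number < n
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
```

Anyone who knows `public_key` can produce `ciphertext`, but only the holder of `private_key` can turn it back into `message`.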

Asymmetric encryption can also verify the identities of both sides, as long as the private keys are kept safe. In fact, a key pair can do more than this, which I will introduce later. Before the end of this part, maybe you have another question: if asymmetric encryption is so good, why do we still need symmetric encryption? The reason is that asymmetric algorithms run much slower than symmetric ones, which makes them unsuitable for encrypting and decrypting large amounts of content.

# Steganography

Today, the biggest issue with encryption is no longer confidentiality but the fact that encrypted traffic is so easy to spot: it is a stream without any recognizable pattern. Although encryption algorithms protect the confidentiality of a message, that does not solve everything. A gateway or firewall sitting upstream of the communication can still block it because the traffic pattern cannot be recognized. To avoid this, the sender should minimize the telltale characteristics of encryption and make the traffic appear as ordinary as possible. For example, you can use the color codes of the pixels in a specific area of an image to carry a message.

steganography
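One simple way to hide a message in pixel values is to overwrite only the least significant bit of each value, which changes the color by at most one step. The sketch below makes some assumptions for brevity: the "image" is just a list of 0-255 integers, and the hypothetical helpers `embed` and `extract` require the receiver to know the message length in advance.

```python
def embed(pixels, message):
    """Hide the bits of message in the lowest bit of each pixel value."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image is too small for this message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the lowest bit, then set it
    return stego

def extract(pixels, length):
    """Read `length` bytes back from the lowest bits of the pixel values."""
    out = bytearray()
    for b in range(length):
        byte = 0
        for bit_index in range(8):
            byte = (byte << 1) | (pixels[b * 8 + bit_index] & 1)
        out.append(byte)
    return bytes(out)

cover = [(i * 37) % 256 for i in range(64)]  # stand-in for real pixel data
stego = embed(cover, b"hi")
hidden = extract(stego, 2)
```

To an observer, `stego` is still an ordinary list of pixel values that differs from `cover` by at most 1 in any position.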

# Hash Function

A hash function is a commonly used function in many fields. It generates a fixed-length characteristic code (a digest) from the original content. Any change to the original file changes the code dramatically. A cryptographic hash function has three traits:

  • deterministic: the same input always gives the same output
  • irreversible (one-way): you cannot infer the original content from the characteristic code
  • collision-resistant: it is very unlikely for different inputs to produce the same value

It can help check the integrity of data, generate a unique key for quick lookup, and so on.
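These traits are easy to observe with SHA-256 from Python's standard library. SHA-256 is used here only as one example of a hash function; the input strings are arbitrary.

```python
import hashlib

first = hashlib.sha256(b"hello world").hexdigest()
again = hashlib.sha256(b"hello world").hexdigest()
changed = hashlib.sha256(b"hello world!").hexdigest()

print(first == again)    # same input, same output
print(first == changed)  # one extra character gives a different digest
print(len(first), len(changed))  # the digest length is fixed
```

A SHA-256 digest is always 256 bits (64 hexadecimal characters), whether the input is one byte or a whole file, which is what makes it convenient for integrity checks.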

# Digital Signature

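A digital signature is a technique based on a hash function and a key pair. The signer hashes the data and transforms the hash with the private key, and anyone can check the result with the public key. Just like a physical signature, it proves two things: the identity of the signer, and that the operation or data has not been modified.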

digital signature
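The hash-then-sign idea can be sketched with textbook RSA. This is a toy under loud assumptions: the key is built from two known Mersenne primes and is far too small for real use, the digest is reduced modulo `n` to fit, and no signature padding scheme is applied. The helper names are hypothetical.

```python
import hashlib

# Toy key pair built from two known Mersenne primes (far too small for real use).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

def digest_as_int(data):
    """Hash the data and squeeze the digest into the range of the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data):
    """Signer: transform the hash with the private key."""
    return pow(digest_as_int(data), d, n)

def verify(data, signature):
    """Anyone: undo the transform with the public key and compare hashes."""
    return pow(signature, e, n) == digest_as_int(data)

document = b"pay 10 coins to the shop"
signature = sign(document)

genuine = verify(document, signature)
tampered = verify(b"pay 99 coins to the shop", signature)
```

Verification succeeds only for the exact data that was signed, so `genuine` holds while `tampered` does not.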
