What is a business phone system and how does it work?

A business phone system is a solution that business subscribers use to keep in touch by phone with their customers and partners. Traditional private branch exchanges (PBXs) run on on-site servers and copper cabling, whereas more modern business phone systems (IP PBX and VoIP) use the internet connection to carry calls.

Is a VoIP-based business phone system better than a traditional landline one?

VoIP-based business phone systems offer numerous advantages over traditional ones, provided that the business has an internet connection of adequate quality. Business VoIP systems are cost-effective, flexible and scalable, and they can be integrated almost without limits with the company's other systems, which increases working efficiency. Remote work is no obstacle either: data stored in the cloud is accessible from anywhere, at any time, to staff with the appropriate permissions.

How can a traditional phone exchange be converted into a VoIP-based system?

Converting a traditional phone exchange into a VoIP-based system is not simple, because wired phones lack the hardware that could convert voice signals into digital packets that can be sent over the internet. There is of course a solution, but it is complicated and therefore time-consuming and costly. It is more worthwhile to introduce a virtual VoIP phone exchange, which requires no capital investment and can be implemented in a short time.

What are the advantages of the ArenimTel business phone system?

The ArenimTel business phone system is compatible with the new fibre-optic-based technology, and its capabilities can be flexibly changed and scaled. It lets you operate your phone exchange more cost-effectively. It requires no capital investment: the service is available through a monthly subscription, and the subscription fee at any given time depends on the features used and the number of staff using the program.

Security White Paper

1. Introduction

1.1 Purpose of document

The purpose of this document is to give an overview of the security solutions of the KvantPhone application.

KvantPhone is a communication system that runs in an isolated environment and uses end-to-end encryption and identification. It allows voice calls and private and group messaging over an encrypted channel.

1.2 Target audience of document

The document is primarily intended for the technical and operational decision-makers participating in the project. Secondarily, it provides a summary sufficient for analysis by experts performing security audits.

2. General description of the system

KvantPhone is a closed, highly secure, electronic communication solution.

2.1 Basic functionality

Users can use the system with iOS and Android clients developed specifically for this system. KvantPhone is not compatible with external third-party messaging or VoIP clients due to the end-to-end encryption provided.

Because the system is closed, KvantPhone users cannot send or receive messages, or initiate or receive voice calls, outside the system: a KvantPhone user can only communicate with other KvantPhone users. For security reasons, opening the system in the future is not expected.

The system includes an administration interface that allows you to create and edit organizations and users, manage contacts, manage licenses, and perform basic debugging functions.

2.2 End-to-end encryption

All messages in the system are transmitted encrypted from the client to the server. The server sees only the metadata necessary for transmission. The messages cannot be decrypted on the server without the encryption key, and only the recipient can do this if they know the key.

Messages are also stored on the client in an encrypted database, the opening and closing of which will be discussed later in this chapter.

Call encryption is also performed by the client, and the client always first attempts to establish the voice stream directly between clients. Only if this is not possible due to network conditions does the encrypted voice stream flow through a relay server. Voice data, like messages, cannot be decrypted on the server in the absence of the encryption key.

2.3 Controlled use

Controlled use ensures that messages and contact lists cannot be accessed by unauthorized persons when the device is unlocked.

On client devices, the account can only be accessed after the user has been identified. The identification process may differ depending on the type of device used. This identification provides protection in addition to the username and password. It can be a passcode or, if supported by the client, biometric identification.

2.4 Service activation

An end user can be created in two ways: in the application or by a KvantPhone administrator. A user created in the application (a personal user) can pay for and manage subscriptions in the application and in their own application store. A business user makes an agreement with Arenim Infosec Ltd., and the administrator creates the user in the administration interface.

When a user is created by the administrator, the system sends the activation code to the user’s email address. If this is not the preferred way to send the activation code, the system will allow the administrator to display the activation code. The administrator can also deliver the displayed activation code to the user by an alternative preferred method (e.g. SMS, printout – however, this functionality is not part of the system).

The first step in the activation process is for the user to enter the email address registered in the system.

The second step is to enter the activation code, which is included in the activation email.

In the third step, the process is almost the same for both types of users: the user sets a passcode consisting of 6 digits. This passcode will then be used to access the application on this device. Important: a passcode is only valid on one device. If the device supports it and the administrator has allowed it, the user can enable biometric identification (e.g. fingerprint, facial recognition) after entering the passcode.

In the last step, users who are allowed to make backups can choose whether they want their data backed up regularly on the server. If so, they need to create a password for backup/restore. The user can use this password to restore their data during a new activation, even on another device.

2.5 Using accounts on multiple devices

A user account can be used on one device at a time.

As soon as the user activates his client on another device, the previously activated application will receive a notification of the new activation and will be automatically logged out.

2.6 Backup and restore data

The KvantPhone system allows you to back up your data at certain intervals and to store this backup on the KvantPhone servers. Of course, before the upload, the data will be encrypted using the strong access password provided during activation. The data stored on the server cannot be read without the key. Since the client application requires the use of a strong password, the backup can be considered properly protected.

2.7 Search visibility

A personal user can control who can find them in the application by KvantPhone ID. If the user denies the permission request, they will be completely invisible to other users in the system.

A business user has no such setting in the application; their administrator can allow the search feature on the administration website.

3. General security solutions

3.1 Random number generation

Random values are needed in many places in the application, most often for passwords and keys. The following sections describe what is used to generate random numbers on each platform.

3.1.1 On iOS devices

The preferred form of random number generation on iOS is the SecRandomCopyBytes()[1] function. The function is passed the number of random bytes and the destination buffer. Apple regularly publishes documentation[2] on its cryptographic solutions.

3.1.2 On Android devices

On Android, the SecureRandom.getInstanceStrong() method is used, which selects the strongest random number generator algorithm available on the device.

3.1.3 On backend servers

The server side uses SecureRandom()[3] to generate random numbers. It is invoked without a parameter, in which case it uses the NativePRNG algorithm. /dev/urandom is used as the entropy source, so that calls do not block while the random number is generated.

3.2 Jailbreak/Root detection

A bad actor could root or jailbreak the victim’s device and, with some extra effort, make this potentially undetectable to the application layer. For this reason, it cannot be guaranteed that this risk can be mitigated completely. At the same time, a security-conscious user should not get into a situation where their device can be rooted or jailbroken without their knowledge, since this would require active access to the device passcode.

KvantPhone does its best to detect such modifications. It is up to the administrator to define whether jailbreaking is allowed with a warning or whether it leads to a complete blocking of the user. This property can be defined at organizational, group and individual user level.

3.2.1 iOS jailbreak detection

On an iOS device, for verification purposes, we attempt to run code that can only succeed on a jailbroken device. The successful execution of these functions indicates that the device is jailbroken. If the client detects this, it sends this information to the server hidden in a regular request and, depending on the user/group/organizational configuration, this may block the user’s account to prevent further attempts.

3.2.2 Android rooting detection

On Android, the Play Integrity[4] component is used to determine whether the app or the device has been modified and re-signed, whether the device is rooted, etc. If any security-related problems are detected, the server can warn or block the user, depending on the configuration.

3.3 Android play integrity protection

For KvantPhone Android clients, Google Play Integrity has been implemented. This communicates with Google Play Services and our backend server to ensure the security of the client and the application. Play Integrity checks the security of the client side on several levels:

It checks whether the device is trustworthy and original. It excludes rooted devices with custom ROM.
It checks whether the application signature and the application itself are valid.
It checks whether the app has been installed from the Play Store or from some other or unlicensed source.

In case of any error or vulnerability, the user can be warned or blocked depending on the configuration.

3.4 iOS app attestation

Apple app attestation has been introduced for KvantPhone iOS clients. This component communicates with Apple and the KvantPhone backend server to ensure the security of the client and the application. If an application binary is broken or modified, the app attestation component detects this, and the server will immediately block the user.

3.5 Push notification security

The payload of the push messages is encrypted on the server in a form that clients can decrypt. The payload is encrypted using the AES-256-GCM algorithm. The scheme uses 3 values: a key, a nonce and a salt:

The key is derived from the user’s push token. The derivation is performed using PBKDF2 with 4096 iteration steps and the HmacSHA256 algorithm.

The value of the nonce is 12 random bytes, which is generated on each transmission. It is generated using the SecureRandom class.

The value of the salt is 32 random bytes, which is also generated on every sending operation. The SecureRandom class is used for generation.

The encrypted payload is combined with the nonce and salt values and sent in the push notification. The content of the push notification looks as follows:

Base64(encrypted_payload).Base64(nonce).Base64(salt)
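
The key derivation and the dot-separated format above can be sketched in Python as follows. This is a minimal illustration, not the KvantPhone implementation: it assumes that the random salt is the PBKDF2 salt, that the push token is encoded as UTF-8, and that standard Base64 is used (whose alphabet does not contain a dot). The function names are hypothetical, and the AES-256-GCM step itself is omitted because it is not part of the Python standard library.

```python
import base64
import hashlib
import secrets
from typing import Tuple

PBKDF2_ITERATIONS = 4096  # iteration count stated in the text
KEY_LEN = 32              # 256-bit key for AES-256-GCM
NONCE_LEN = 12            # nonce length stated in the text
SALT_LEN = 32             # salt length stated in the text


def new_nonce_and_salt() -> Tuple[bytes, bytes]:
    """Generate a fresh nonce and salt for one transmission."""
    return secrets.token_bytes(NONCE_LEN), secrets.token_bytes(SALT_LEN)


def derive_push_key(push_token: str, salt: bytes) -> bytes:
    """Derive the key with PBKDF2-HMAC-SHA256.

    Assumption: the salt sent with the notification is the PBKDF2 salt.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", push_token.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=KEY_LEN
    )


def pack_notification(encrypted_payload: bytes, nonce: bytes, salt: bytes) -> str:
    """Build Base64(encrypted_payload).Base64(nonce).Base64(salt)."""
    parts = (encrypted_payload, nonce, salt)
    return ".".join(base64.b64encode(p).decode("ascii") for p in parts)


def unpack_notification(content: str) -> Tuple[bytes, bytes, bytes]:
    """Split the notification content and validate the field lengths."""
    parts = content.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated fields")
    payload, nonce, salt = (base64.b64decode(p, validate=True) for p in parts)
    if len(nonce) != NONCE_LEN or len(salt) != SALT_LEN:
        raise ValueError("unexpected nonce or salt length")
    return payload, nonce, salt
```

A receiving client would call unpack_notification(), derive the same key from its own push token and the received salt, and then decrypt the payload with the received nonce.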

3.5.1 Security of the information displayed in push notifications

The content of push notifications may be displayed on the client device while it is locked. For this reason, the KvantPhone application includes a setting to specify whether the name of the caller or the sender of the message should be displayed in the notification. If this setting is turned off, all you see when a message arrives is “You have a new message.”, whereas if you allow the setting, you will see the name of the sender. Setting this value is the responsibility of the user.

3.6 Password strength measurement

In the mobile apps, users must choose a strong password to configure the backup function and to perform the restore operation. The application measures the strength of the entered password and only accepts it if the complexity is above a certain level. The strength of the password is measured with the zxcvbn[5] password strength meter.

[1] https://developer.apple.com/documentation/security/1399291-secrandomcopybytes
[2] https://support.apple.com/library/APPLE/APPLECARE_ALLGEOS/HT209637/APPLEFIPS_GUIDE_CO_ARM.pdf
[3] https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/SecureRandom.html
[4] https://developer.android.com/google/play/integrity
[5] https://github.com/dropbox/zxcvbn

4. Account activation

A precondition for account activation is that the business user must be created by an administrator in the system, or the account must be registered in the application by a personal user. During the creation or registration process, the server sends an activation/verification email to the email address that the user entered in the first part of the activation/registration. If the user has access to the email account provided, they can continue the process by copying the activation code from the email into the application. If the user enters invalid activation codes several times, the server will block the IP address of the client for 24 hours.

4.1 PIN code setting

A central security element of the mobile application is the PIN code, which is provided and entered by the user. The PIN must be a complex 6-digit number and the following rules apply when the PIN is chosen:

You cannot have the same number three or more times in a row (e.g. 111222)
The numbers cannot be in ascending order (e.g. 123456)
The numbers cannot be in descending order (e.g. 654321)
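
The three rules above can be sketched as a small Python check. This is an illustration only: the function name is hypothetical, and it assumes that "ascending" and "descending" refer to a PIN whose six digits form one consecutive run (each digit exactly one higher or one lower than the previous one), as in the examples given.

```python
def is_acceptable_pin(pin: str) -> bool:
    """Return True if the PIN satisfies the three rules listed above."""
    # The PIN must consist of exactly six ASCII digits.
    if len(pin) != 6 or not (pin.isascii() and pin.isdigit()):
        return False
    digits = [int(ch) for ch in pin]

    # Rule 1: the same digit may not appear three or more times in a row.
    for i in range(len(digits) - 2):
        if digits[i] == digits[i + 1] == digits[i + 2]:
            return False

    # Differences between neighbouring digits.
    steps = [b - a for a, b in zip(digits, digits[1:])]

    # Rule 2: no consecutive ascending run (assumed reading, e.g. 123456).
    if all(step == 1 for step in steps):
        return False

    # Rule 3: no consecutive descending run (assumed reading, e.g. 654321).
    if all(step == -1 for step in steps):
        return False

    return True
```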

The PIN code has a key role in many places. It is required for:

Opening the local encrypted database.
Storing the OTP key on the device.
Accessing the backup key.

4.2 Use of biometrics

Most current mobile devices already have some form of biometric sensor that can identify the user by recognizing their fingerprint, face, or retina. Biometric authentication is not considered as secure as PIN protection, but for convenience this feature has been implemented in KvantPhone.

During activation, the user can choose whether she/he wants to use biometric identification. This decision can be changed at any time in the application settings.

When using biometric authentication, the passcode is stored on the device in such a way that it can only be retrieved after a successful biometric authentication. In essence, the passcode value is stored encrypted, and the decryption key is retrieved from the Secure Enclave (iOS) or the Secure Element (Android). On iOS this can only be achieved by biometric authentication; on Android, entering the corresponding device password also gives access to the passcode if the reuse time is also specified. The use of biometrics can be turned off in the application settings.

There is one more convenience setting related to biometrics, the biometric identification reuse time. This can be used to set how long after the last successful biometric device unlock the device will accept biometric identification (so you don’t have to authenticate twice in a row, e.g. when opening a notification).

4.3 Backup

A key element of the account activation is configuration of the backup/restore process. It is very important to remember that the data stored in our application is always stored in a locally-encrypted database. If something happens to your phone, or your data gets corrupted, or in the event of a phone replacement, without a proper backup, all your locally-stored data (messages, call history) will be lost. This is what KvantPhone Backup is designed to remedy.

The backup should be sent to the server to be used in a possible later recovery. The backup is encrypted on the device with a generated strong password of 32 bytes. We cannot expect the user to remember or enter such a password during activation. Instead, this so-called master key is encrypted with a strong password chosen by the user and sent to the server in encrypted form along with the secured backup data.

The solution is described in detail below:

4.3.1 Setting up a backup

In the first step, the user creates an OPAQUE session with the server, which verifies the correctness of the data provided. The algorithm uses the libsodium library and the elliptic curve Curve25519.

The following figure shows the representation of the OPAQUE session creation process:

4.3.2 Restore from backup

Restoring from backup can only be done during application activation using the strong password you specified during backup. This process is also done using the OPAQUE procedure previously mentioned.

After successful recovery identification, the user will receive the encrypted recovery master key and a link to download the backup. As a first step of successful application activation, the data stored in the backup will be moved to the local database before the database is opened.

4.3.3 Store and delete backups

Encrypted backups will be stored on the server for a predefined period. This is a system configuration value, and its choice depends on the number of users and the available storage space. This setting can always be changed later according to the current requirements and environment.

The user can decide at each reactivation whether to use his backup. If not, the backup will be permanently deleted from the server. If the user disables the backup function on the client, the backup will also be deleted from the server.

5. Protecting local data

For mobile clients, sensitive data such as profiles, keys, call logs, contacts, and messages are stored in a local SQLite database, encrypted with the SQLCipher module. To encrypt the database, a corresponding key is required, which cannot be obtained from the device. For the generation of the key, a successful server-side authentication and the passcode known only by the user are both required. If biometric authentication is enabled in the application, the passcode can be retrieved from the keychain after successful authentication, where it is stored encrypted and protected by biometrics.

The database key is derived as follows:

The database can only be opened after a successful server authentication since the serverDBKey parameter is returned in a successful authentication response and is never stored on the user’s device.

We take the 16-byte Argon2 hash of the passcode, using a salt value stored in the application preferences. This hash will be the salt value for the final database key generation.

We take the 64-byte Argon2 hash of serverDBKey, where the previous 16-byte hash is used as the salt value.

The base64 encoded form of the resulting byte array will be the database key.
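
The data flow of these steps can be sketched in Python as follows. Argon2 is not part of the Python standard library, so the sketch takes the hash function as a parameter and ships a clearly labelled stand-in based on PBKDF2 that is NOT Argon2 and serves only to make the example runnable. The function names, the ASCII encoding of the passcode and the stand-in parameters are assumptions.

```python
import base64
import hashlib
from typing import Callable

# A KDF takes (secret, salt, output_length) and returns the raw hash.
Kdf = Callable[[bytes, bytes, int], bytes]


def standin_kdf(secret: bytes, salt: bytes, out_len: int) -> bytes:
    """Stand-in for Argon2 (assumption: used here only to show the data flow)."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, 10_000, dklen=out_len)


def derive_database_key(
    passcode: str,
    preference_salt: bytes,
    server_db_key: bytes,
    kdf: Kdf = standin_kdf,
) -> str:
    """Chain the two hashes described above and return the base64 database key."""
    # Step 1: 16-byte hash of the passcode, salted with the stored preference salt.
    passcode_hash = kdf(passcode.encode("ascii"), preference_salt, 16)
    # Step 2: 64-byte hash of serverDBKey, salted with the hash from step 1.
    raw_key = kdf(server_db_key, passcode_hash, 64)
    # Step 3: the base64 form of the result is the database key.
    return base64.b64encode(raw_key).decode("ascii")
```

Because serverDBKey enters the chain only in the second step, the key cannot be computed from data on the device alone, which matches the requirement stated above.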

The database key is generated based on the passcode chosen by the user. If the wrong passcode is entered, no serverDBKey will be returned and the server will block the user after N failed attempts.

5.1 Database key format

SQLCipher[1] accepts database keys in several formats. In our case, the base64 value of the key derived from server and local data is passed to the module as a “string” password. SQLCipher then hashes this with PBKDF2 key derivation with 64000 iterations to form the final database key. This is the default behaviour and we have not changed it, so if you know the base64 format key, you can open the encrypted database file with an external application th

[1] https://www.zetetic.net/sqlcipher/sqlcipher-api/#PRAGMA_key

6. Protecting network communications

The client application communicates with three servers:

ABS server: provides the backend services needed to run the application
SIP server: used for VoIP communication (such as call setup and client registration)
TURN server: relays voice data when network traversal issues prevent a direct connection

Similar security mechanisms have been built into the application for both servers.

6.1 Client certificate-based authentication

The KvantPhone client establishes an encrypted channel with the ABS server using TLSv1.2. On the server, the list of acceptable ciphers is limited to the following:

TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 (0xc02c)

The server also supports TLSv1.3 with the following ciphers:

TLS_AES_256_GCM_SHA384 (0x1302)
TLS_CHACHA20_POLY1305_SHA256 (0x1303)
TLS_AES_128_GCM_SHA256 (0x1301)

In addition, the server requires client certificate-based authentication. A password-protected certificate is embedded in the clients securely.

6.2 Public key “pinning”

In addition to client certificate authentication, the other means of protection is to check the server’s public key. The client will only communicate with the server if its certificate chain sent during connection establishment contains a public key whose fingerprint is accepted by the client.

The client “pins” the public key of the server certificate. When the server certificate expires, a signing request generated from the same key pair is sent to the issuer, so the public key does not change and the renewal does not require a client update. It is also important that only the hostname associated with the fingerprint is accepted. To this end, hostname and public key fingerprint pairs are included in the KvantPhone application, and the client only connects to hosts whose pinned fingerprint matches the fingerprint calculated from the received certificate.

These two methods significantly reduce the risk of MitM attacks.
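
The hostname and fingerprint matching described in section 6.2 can be sketched as follows. This is a simplified illustration: it assumes SHA-256 fingerprints over the DER-encoded public key of each certificate in the chain, the function names are hypothetical, and extracting the public keys from the TLS handshake is left out.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Set


def public_key_fingerprint(public_key_der: bytes) -> str:
    """Fingerprint of a DER-encoded public key (assumption: SHA-256, hex)."""
    return hashlib.sha256(public_key_der).hexdigest()


def is_pinned(
    hostname: str,
    chain_public_keys: Iterable[bytes],
    pins: Dict[str, Set[str]],
) -> bool:
    """Accept the connection only if a key in the chain matches a pin for this host."""
    accepted = pins.get(hostname.lower(), set())
    for public_key_der in chain_public_keys:
        fingerprint = public_key_fingerprint(public_key_der)
        # Constant-time comparison against each pin stored for this hostname.
        if any(hmac.compare_digest(fingerprint, pin) for pin in accepted):
            return True
    return False
```

Keying the pins by hostname means that a valid fingerprint presented under a different hostname is still rejected.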

6.3 OTP authentication

After the application has been deployed, the client communicates with the server in the form of authenticated One-Time-Password (OTP) requests in the so-called activated mode.

6.3.1 OTP key generation

For the OTP authentication process, the server needs to generate an OTP key and send it to the client. The algorithm for generating the OTP key is as follows:

The server generates a 64-byte random byte array using the nextBytes() method of the SecureRandom() class. On Java 8, SecureRandom() is initialized without any parameters, so the platform’s preferred algorithm is always used. On later Java versions, the getInstanceStrong() method can be used, which selects the most suitable algorithm available.
This generated random 64 bytes will be the OTP key itself.
The OTP key is sent to the client in base64 encoded form.
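
For illustration, the same three steps look as follows in Python. This is an analogous sketch rather than the server code, which according to the text uses Java's SecureRandom; the function names are hypothetical.

```python
import base64
import secrets

OTP_KEY_LEN = 64  # key length stated in the text


def generate_otp_key() -> bytes:
    """Generate the 64 random bytes that form the OTP key."""
    # secrets draws from the operating system's cryptographic random source.
    return secrets.token_bytes(OTP_KEY_LEN)


def encode_otp_key(otp_key: bytes) -> str:
    """Encode the OTP key in base64 for transmission to the client."""
    return base64.b64encode(otp_key).decode("ascii")
```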

6.3.2 OTP mechanism

The essence of the OTP mechanism is as follows:

At the end of the activation, the server sends to the client the parameters required for OTP authentication: OTP key, OTP counter initial value.
The client saves the OTP key on the client’s file system, encrypted with AES-256 in ECB mode.
Despite its drawbacks, ECB mode AES encryption was chosen for the following reasons:
Whatever passcode is entered by the user, if the OTP key is encrypted in ECB mode then the decryption will always produce some key value (even in case of a faulty passcode).
With this method, we can generate an OTP value in all cases, and the server will check the correctness of this value.
This way, the client is protected against brute force passcode attacks, since a valid decrypted OTP key is only returned if the correct passcode is provided. The validity of the passcode is checked against the correct OTP value by the server, which blocks the user after 5 failed attempts.
If a non-ECB mode AES (e.g. GCM, CBC) were used, every time the user enters an incorrect passcode, the procedure would return an error, making brute force guessing of the passcode possible.

The OTP key (OTPK) is encrypted using the hash from the passcode in the following way:

Generate a 16 byte long random salt value => S
From the 6-digit passcode, we use the salt to generate a 32-byte hash using Argon2. => H
The OTP key is encrypted using the hash obtained in previous step as the symmetric key.
The encrypted key is saved to a file.
The salt and OTP counter values are saved to keychain (iOS) or encrypted secured preference (Android)

The passcode is required to read back the OTP key. Even in case of an incorrect passcode, we will get a result which will result in an incorrect OTP value.

OTP is generated by HOTP using the OTP key and counter parameters. After generation, the counter value must be incremented.
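
The HOTP step can be sketched in Python following RFC 4226. The text does not state the hash function or the number of digits, so the RFC defaults (HMAC-SHA-1, 6 digits) are assumed here, and the helper next_otp() is hypothetical.

```python
import hashlib
import hmac
import struct
from typing import Tuple


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value as defined in RFC 4226 (HMAC-SHA-1 assumed)."""
    # The counter is encoded as an 8-byte big-endian integer.
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select the offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def next_otp(key: bytes, counter: int) -> Tuple[str, int]:
    """Return the OTP for the current counter and the incremented counter."""
    return hotp(key, counter), counter + 1
```

With the RFC 4226 test key b"12345678901234567890", counters 0 and 1 yield 755224 and 287082. A key decrypted with a wrong passcode produces a different value for the same counter, which is what lets the server detect the wrong passcode.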

7. Voice call security

7.1 Voice data encryption

The voice stream is encrypted according to the SRTP specification (RFC3711). The selected symmetric encryption algorithm is the AES-256 GCM mode, implemented as described in RFC7714 with associated key derivation functionality. The default key derivation interval is 2³¹ packets, corresponding to approximately 37 000 hours. This is considered safe enough by all recommendations.

The above encryption package provides the highest security for SRTP as described in the RFC.

7.2 Key exchange

The purpose of the key exchange for the communicating parties is to agree on the cryptographic algorithm to be used for SRTP and to exchange the master key and salt used in SRTP in a secure way.

The key exchange is based on the ECDHE_ECDSA defined in RFC8422 for TLS. This is simplified and modified for UDP operation with ideas taken from the ZRTP and DTLS specifications.

Several algorithms can be used for key exchange; in the initial phase of the protocol, the caller and callee can choose algorithms to use for signing, key exchange, and hashing.

In the quantum-resistant solution, we combine the classical Diffie-Hellman procedure with Kyber key encapsulation methods.

7.2.1 Key exchange process

The key exchange process is illustrated in the following figure. Key exchange happens in the media stream as soon as the call is established.

The caller sends a “Hello” message, initiating the key exchange process. The message includes the encryption algorithms known to the caller and a securely generated random number.
The callee selects an encryption algorithm it supports from the offered list, generates its own random number, and generates the hash of the two random numbers, its ECDH public key and its KEM public key. It signs the hash with both the classic and the post-quantum private keys and sends the CalleeKeyExchange message.
The caller performs the same steps (hash calculation, signature verification, generation) and sends its key exchange message to the callee: CallerKeyExchange.

At this point, both parties can calculate the ECDH shared secret from the ECDH keys and the KEM shared secret from the KEM keys and ciphertext. From the ECDH shared secret and the KEM shared secret, a common master secret can be derived using the HKDF key derivation function. The key of the HKDF algorithm is the KEM shared secret concatenated with the random numbers of the caller and the callee, the salt is the ECDH shared secret, and the string “master key” is used as the label.

The result is a master key agreed upon by both parties.
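
The master secret derivation can be sketched with a standard-library HKDF (RFC 5869) as follows. This is an illustration under assumptions: SHA-512 is taken from the default algorithm table in section 7.2.3, the order of concatenation (KEM shared secret, caller random, callee random) follows the wording above, and the 64-byte output length and the function names are hypothetical.

```python
import hashlib
import hmac


def hkdf(ikm: bytes, salt: bytes, info: bytes, length: int,
         hash_name: str = "sha512") -> bytes:
    """HKDF extract-and-expand as defined in RFC 5869."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    # Extract: PRK = HMAC(salt, input keying material).
    prk = hmac.new(salt, ikm, hash_name).digest()
    # Expand: T(n) = HMAC(PRK, T(n-1) || info || n).
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_master_secret(kem_secret: bytes, ecdh_secret: bytes,
                         caller_random: bytes, callee_random: bytes,
                         length: int = 64) -> bytes:
    """Combine the KEM and ECDH shared secrets into one master secret."""
    # Key: KEM shared secret concatenated with both random numbers (assumed order).
    ikm = kem_secret + caller_random + callee_random
    # Salt: ECDH shared secret. Label: the string "master key".
    return hkdf(ikm, salt=ecdh_secret, info=b"master key", length=length)
```

Because both shared secrets feed into the derivation, an attacker would have to break both the classical and the post-quantum exchange to recover the master secret.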

The master key is used to derive the SRTP master key and master salt, which are sent in the Finished message.

After calculating the common key, the caller sends the Finished message, which contains a 128-bit initialization vector and the hashes of the master key, a label, and the hash of the handshake messages, encrypted with the AES-256-GCM algorithm using the hash of the SRTP master key and a label as the symmetric key, together with a random initialization vector.
The callee sends their own Finished message after the caller’s Finished message is received.
Once the caller receives the Finished message, he sends FinishAck in response.

After the key exchange is over, SRTP can be activated for both parties:

at the calling party, when the FinishAck message has arrived.
at the called party, when the Finished message has arrived.

7.2.2 Message fragmentation

With the incorporation of quantum-safe encryption algorithms, the size of the key exchange packets has also greatly increased. Key exchange is still done in the RTP stream, but packets have an upper size limit of 1500 bytes. This limit includes the IP header and the UDP header, 28 bytes in total, so RTP packets of around 1470 bytes can be expected. Without quantum-safe algorithms this was sufficient, as none of the key exchange packets exceeded this size. In the new version, however, packets even larger than 6000 bytes are possible. The solution is to split large packets into several packets and, on the receiving side, to reassemble the data structures and interpret the current step of the key exchange procedure after all the parts have arrived.

Care must be taken when fragmenting packets, and one must also be prepared for possible attacks where the attacker tries to access unauthorized memory space by modifying the packet. The basic concept is to chop a long message into fragments with a maximum length of 1400 bytes. The size of the last fragment can vary.
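
Fragmentation with defensive reassembly can be sketched as follows. The 1400-byte fragment limit comes from the text; everything else is an assumption made for illustration, in particular the 6-byte header layout (message ID, fragment index, fragment count), the cap on the number of fragments, and the function names.

```python
import struct
from typing import Dict, List, Optional, Tuple

MAX_FRAGMENT_PAYLOAD = 1400      # fragment limit stated in the text
MAX_FRAGMENTS = 12               # hypothetical cap on fragments per message
HEADER = struct.Struct(">HHH")   # hypothetical: message id, index, total


def fragment(message_id: int, message: bytes) -> List[bytes]:
    """Split a message into fragments of at most 1400 payload bytes."""
    if not message or len(message) > MAX_FRAGMENTS * MAX_FRAGMENT_PAYLOAD:
        raise ValueError("message size out of bounds")
    chunks = [message[i:i + MAX_FRAGMENT_PAYLOAD]
              for i in range(0, len(message), MAX_FRAGMENT_PAYLOAD)]
    return [HEADER.pack(message_id, index, len(chunks)) + chunk
            for index, chunk in enumerate(chunks)]


def reassemble(fragments: List[bytes]) -> bytes:
    """Rebuild the message, rejecting any inconsistent or oversized input."""
    parts: Dict[int, bytes] = {}
    expected: Optional[Tuple[int, int]] = None  # (message id, total)
    for frag in fragments:
        payload_len = len(frag) - HEADER.size
        if payload_len < 1 or payload_len > MAX_FRAGMENT_PAYLOAD:
            raise ValueError("fragment size out of bounds")
        message_id, index, total = HEADER.unpack_from(frag)
        # Never trust header fields: bound them before using them.
        if total < 1 or total > MAX_FRAGMENTS or index >= total:
            raise ValueError("inconsistent fragment header")
        if expected is None:
            expected = (message_id, total)
        elif expected != (message_id, total):
            raise ValueError("fragment belongs to a different message")
        if index in parts:
            raise ValueError("duplicate fragment")
        parts[index] = frag[HEADER.size:]
    if expected is None or len(parts) != expected[1]:
        raise ValueError("message incomplete")
    total = expected[1]
    # Only the last fragment may be shorter than the maximum.
    if any(len(parts[i]) != MAX_FRAGMENT_PAYLOAD for i in range(total - 1)):
        raise ValueError("unexpected short fragment")
    return b"".join(parts[i] for i in range(total))
```

The receiver validates every header field against fixed bounds before using it, so a modified packet cannot cause an out-of-range index or an unbounded allocation.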

7.2.3 Default algorithms

The following table summarizes the default, currently chosen algorithms which are used in key exchange process:

Hash	SHA-512
Classic signature	Ed25519
Quantum-resistant signature	Dilithium5
Symmetric encryption	AES-256-GCM
DH key exchange	x25519
Quantum-resistant key exchange	Kyber1024

7.3 Receiving calls on locked devices

In a recent iOS update, Apple introduced a change that requires VoIP applications to use the so-called CallKit for call handling. This change required VoIP application developers to implement call answering on the native screen while the application is in the background or the device is locked. From the user’s point of view, the experience is great because they don’t have to tap a notification, open the app, authenticate, and receive the call in the app: they can receive the call in one go. From a security perspective, this requires a new approach, since the protected database of KvantPhone cannot be accessed before the application authenticates the user.

Meanwhile, Android has also added the ability to show the user a full-screen notification (incoming call screen) on a locked device without unlocking, where they can answer the call.

The original philosophy of the application was that calls can only be answered with an unlocked database, because this is where the user’s primary keys are stored, and these are required to set up a secure call. Each participant of the call has to sign the key exchange packet with its own private key, which is verified by the other side with the signer’s public key. Since a call can only be initiated from within the application (no Siri or other integration is enabled), the calling party’s private key is always available. The challenge for the receiving party lies in the closed state of the database during call establishment while the application is not in the foreground.

KvantPhone solves this situation the following way:

During application activation, two pairs of keys are generated for the user for each algorithm. One is the primary key pair which is kept in the protected database, and the other is a secondary key pair, which is saved to a part of the device that is secured by the operating system itself or the device hardware: the keychain for iOS and the Secure Element for Android. Both keys are signed by the server and the public key of both key pairs is sent to the users in a query, indicating which is the primary and secondary key.

During the encryption handshake of call setup, the caller receives flags in the RTP packets that indicate the callee’s state, i.e. whether the callee’s application is in the foreground or in the background when accepting the call and the encryption. The caller then uses the remote public key that corresponds to these flags. We indicate to the user on the call screen whether the call is with an “identified” party, i.e. the call was built using the primary key, or an “unidentified” party, i.e. the packet verification was successful using the secondary key.
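The key selection on the caller side can be sketched as follows. This is a minimal illustration under stated assumptions, not KvantPhone’s implementation: the names `CalleeKeys` and `select_verification_key` and the boolean flag are hypothetical, and the keys are treated as opaque bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalleeKeys:
    """Public keys published for one user (opaque bytes in this sketch)."""
    primary: bytes    # private half is kept in the protected database
    secondary: bytes  # private half is kept in OS/hardware-backed storage


def select_verification_key(keys: CalleeKeys, callee_in_foreground: bool):
    """Pick the public key the caller verifies with, plus the label to display.

    `callee_in_foreground` stands in for the state flag received during the
    handshake; the real encoding of that flag is not specified here.
    """
    if callee_in_foreground:
        return keys.primary, "identified"
    return keys.secondary, "unidentified"
```

The caller would then verify the callee’s signed key exchange packet with the returned key and show the returned label on the call screen.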

This way, we can ensure that calls are always established in a user-friendly way, while the emphasis shifts to the identified/unidentified caller display.

The encryption and confidentiality of the call stream itself are not affected; this approach affects only the identification of the remote party.

8. Instant message security

The KvantPhone Diffie-Hellman (KPDH) key exchange protocol enables two parties to generate a shared secret key by using their respective public keys for mutual authentication. This protocol ensures forward secrecy.

In general, such key exchange protocols require both parties to be available at the time of sending, which is not always the case for mobile applications. To handle the situation when the receiving party is not running the application in unlocked state, KvantPhone uses the concept of so-called pre-keys.

8.1 Safety considerations

Before sending a message, a DH session extended with post-quantum key encapsulation must be established.
The session is maintained indefinitely, but the shared secrets are hashed after each message is sent or received, so the “forward secrecy” property applies to messages already sent.
Normally, the session will not be terminated, but the session must be reestablished if one of the parties does an account reset for whatever reason, be it an application reinstallation, device change, or similar.
The session data and messages waiting to be sent are stored in the local protected database.
The session ID and statuses are saved in the unencrypted database as well, as in some cases it may be necessary to detect rebuildable sessions with applications not running in foreground.
To keep the hashed session states in sync, the number of messages sent and received is saved and included in each message, encrypted with an AES key derived during session setup. This key is not protected by a PIN code, so it can be accessed at any time and a message can be decrypted as soon as it arrives.
Messages that can be kept for a long time are stored in a database with maximum security.
To send messages to partners whose application is not running in the foreground, each party must have pre-built initial pre-keys for the compound DH and KEM key exchange, which are synchronized to the server, so that the sending party can retrieve such a pre-key and immediately establish the session.

8.2 Protocol participants

In the KPDH protocol, we distinguish four participants: Alice, Bob, and two servers (SIP and ABS).

Alice initiates the communication, aiming to send a secure message to Bob. She starts the messaging session by setting up an encrypted channel with Bob, allowing for ongoing message exchange.

Bob is the receiving party. He can receive messages from Alice because he has shared key information beforehand, which allows Alice to derive a shared key with him and thus establish an encrypted session. A very important aspect is that Bob may be offline when the message is sent and may receive Alice’s messages days later.

The SIP server is responsible for getting messages from Alice to Bob and back. In addition to relaying messages, it also temporarily stores messages sent to offline users. For messages sent to a chat room (group chat), it also checks whether the message goes from an authorized group member to an authorized group member.

The ABS server stores the certificates of users, which are needed to validate signatures. In addition, the client synchronizes the pre-keys to the ABS server, which are necessary to build the message session.

8.3 Applied algorithms

The following algorithms are used in the KPDH procedure:

No.	Algorithm	Parameters	Description, usage
1	AES	256bit, GCM mode	Encrypt message text, encrypt message metadata
2	Ed25519[1]		Classic signing of key exchange structures, signature verification.
3	SHA2	Length: 512bit	Deriving counters, sending and receiving keys, control hashes
4	ECDH	Curve: x25519	For key exchange of message session
5	Dilithium	Dilithium5	Post-quantum signing of key exchange structures, signature verification.
6	Kyber	Kyber1024	For key exchange of message session

[1] The RSA algorithm is used only to validate the sender.

8.4 How the KPDH works

The KPDH procedure consists of the following phases:

Bob uploads his custom public key and generated pre-keys to the server.
Alice will download Bob’s public key and a pre-key from the server, which will be used to generate the initial message.
Bob will receive and process Alice’s startup message.
The conversation continues in the established channel.

8.4.1 Uploading keys to the server

When the application is activated, each user generates an Ed25519 and a Dilithium5 key pair, whose public keys are signed by KvantPhone’s Certification Authority server. The server saves the public keys in the user’s profile and returns the certificates to the client. The private keys never leave the client device, where they are stored in the protected database. In case of an account reset, the client generates new key pairs and uploads the new public keys to the server, and the old key pairs are deleted.

Additionally, each user generates a stack of N single-use pre-keys during activation, which is also uploaded to the server. The value of N can be configured based on the number of users, the number of messages exchanged, the number of sessions, and the number of groups.

8.4.1.1 Sending the first (initial) message

When a user attempts to send a message to a partner with whom he has not yet exchanged messages, he must set up a new session before sending the first message. The session setup consists of the following steps:

Alice fetches Bob’s Ed25519 and Dilithium5 public keys from the server.
Alice fetches a one-time pre-key for Bob. This pre-key is deleted on the server immediately, to avoid re-use.
Alice verifies the signatures on Bob’s pre-key. If the verification fails, Alice aborts the protocol.
Alice generates an ephemeral x25519 DH keypair and a Kyber KEM keypair.
Alice derives the DH shared secret using Bob’s pre-key and her own DH keypair. Additionally, Alice performs a KEM encapsulation with Bob’s KEM public key to obtain the KEM shared secret.
Alice derives the session key from the DH shared secret and KEM shared secret using an HKDF algorithm.
Alice derives the initial sending and receiving session keys with counters.
Finally, Alice encrypts the message and sends it with a responder key exchange part.
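The key derivation steps above (combining the DH and KEM shared secrets with an HKDF, then deriving the initial sending and receiving keys with counters) can be sketched as follows. This is only an illustration: the choice of SHA-512 as the HKDF hash, the concatenation order of the two secrets, the label strings, and the names `hkdf_sha512` and `derive_session` are all assumptions that are not stated in this document.

```python
import hashlib
import hmac


def hkdf_sha512(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-512: extract, then expand."""
    if length > 255 * 64:
        raise ValueError("requested output too long")
    # Extract: an empty salt is replaced by HashLen zero bytes.
    prk = hmac.new(salt or b"\x00" * 64, ikm, hashlib.sha512).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha512).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session(dh_secret: bytes, kem_secret: bytes, is_initiator: bool) -> dict:
    """Combine both shared secrets and derive the initial directional keys."""
    # Hypothetical labels; the real KPDH context strings are not given.
    session_key = hkdf_sha512(dh_secret + kem_secret, b"", b"KPDH-session", 32)
    a_to_b = hkdf_sha512(session_key, b"", b"KPDH-initiator-to-responder", 32)
    b_to_a = hkdf_sha512(session_key, b"", b"KPDH-responder-to-initiator", 32)
    send_key, recv_key = (a_to_b, b_to_a) if is_initiator else (b_to_a, a_to_b)
    return {"session_key": session_key, "send_key": send_key,
            "recv_key": recv_key, "send_counter": 0, "recv_counter": 0}
```

Because both shared secrets feed into the session key, an attacker would have to break both the classical DH and the post-quantum KEM exchange to recover it. Alice would call `derive_session(..., is_initiator=True)` and Bob the same function with `is_initiator=False`, so that Alice’s sending key equals Bob’s receiving key.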

8.4.2 Receiving the initial message

Bob will receive a message with the information needed to set up the session. Processing takes place in the following steps:

Bob checks that the pre-key can be found in his own set of pre-keys. If not, the message cannot be decrypted and an error is returned to Alice. Bob will be notified of the message, but its contents will not be visible until Alice resends it in a new session.
1. This case can occur if Alice sends an initial message to Bob, but Bob performs a reset/recovery before actually receiving it, during which the set of pre-keys is deleted and new ones are generated.
If the session ID exists, the key exchange structure and the encrypted message are saved in the unprotected database, but in a fully encrypted format as sent from the other side.
1. If the application is running in the background, the protected database is locked and will only be opened when the application comes to the foreground (after successful authentication).
2. Messages can be received in the background, but they will be saved in encrypted state in the plain database until the user unlocks the application and opens the protected database.
After successful authentication, Bob opens the protected database and processes the received messages in the following steps:
Bob checks the signatures of the key exchange structure. If the verification fails, he aborts the procedure.
Bob computes the DH shared secret, and obtains the KEM shared secret by decapsulating the ciphertext contained in the key exchange data.
Bob derives the session key from the shared secrets, and then the sending and receiving keys with counters.
Finally, Bob decrypts the message and increments the message counters.
Decrypted messages are saved in the protected database.

8.4.3 Sending and receiving of subsequent messages

Sending and receiving all further messages in an already established session is done in a similar way as for the first message, except that session establishment is no longer needed. Instead, the existing session information and data structures need to be kept in sync on both sides.
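A hash-based key chain of this kind can be sketched as follows. This is a minimal illustration, assuming SHA-512 as the hash and hypothetical domain-separation prefixes (`b"msg"`, `b"chain"`); the names `ChainState` and `next_message_key` are not taken from KvantPhone.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ChainState:
    key: bytes        # current chain key for one direction
    counter: int = 0  # number of messages processed in this direction


def next_message_key(state: ChainState) -> bytes:
    """Return the key for the next message and advance the chain by one step.

    The previous chain key is overwritten, so earlier message keys cannot be
    recomputed from the stored state.
    """
    message_key = hashlib.sha512(b"msg" + state.key).digest()[:32]
    state.key = hashlib.sha512(b"chain" + state.key).digest()
    state.counter += 1
    return message_key
```

As long as both sides start from the same chain key and perform one step per message, they derive identical message keys and their counters stay equal.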

8.4.4 Ratcheting

The solution described above ensures Perfect Forward Secrecy: because of the hashing, the “ratchet” mechanism can only be operated forward in time, not backwards. However, if the user’s data is compromised without his knowledge, the attacker can perform the same operations as the legitimate client, since no new unknown secret is involved. With the acquired data and the encrypted messages in hand, the attacker can still follow the plaintext content of subsequent message exchanges.

This attack is addressed by regenerating and exchanging keys during message exchanges. For each message, an ECDHE key exchange could be carried inside the already signed and encrypted message, which the recipient could then use to encrypt the next message. The same could be done for the PQ key exchange, but since its ciphertext and public key sizes are significant, it has to be determined how often (after how many messages) this re-keying should take place.

This solution is not part of the initial version, but it will be included in a later version.

8.4.5 Message delivery errors

Messaging is based on establishing a session between two parties with the first message, and then sending and receiving further messages in that session without having to constantly exchange keys. However, during the use of the application, a number of unexpected events may occur that affect the processing of messages.

For instance, there may be occasions when one, the other or both parties reset / restore the application. In such a case, new key pairs, and new pre-keys are generated and the old ones are deleted. In a case where there is already an established session between two parties and one party changes its keys, the other party will not be able to decrypt the messages received and vice versa.

In such cases, the application will send an “undecipherable” delivery notification back to the sender, who will then resend the message in a new session.

It is important to note that resending is only possible if the sender is in the foreground with an open database.

This “undecipherable” message indication is delivered using push notifications. As there is no actual message content in such a case, it would be a strange user experience to receive a message notification but not see any message after logging into the application. To overcome this issue, the KvantPhone app displays a “broken” message bubble (without actual content) in such cases, indicating that something went wrong. In addition, the receiving party can manually initiate a message resend request, so that the sending party is immediately notified that there is a message to be resent.

8.4.6 Session counters deviation

The send/receive process relies heavily on the fact that messages are sent to a queue and can only be received/decrypted in the exact order in which they were sent. As can be seen from the previous explanations, the message key changes every time a message is sent, as it is derived from the previous key. These keys are kept in sync with the message counters.

In a normal case the sender sends the message with the same counter as the receiver is expecting based on the previous message in that session. However, other cases may occur:

1. The incoming message counter is smaller than the current session counter.
2. The incoming message counter is greater than the current session counter.

Case (1) means that we received an older message while we were already processing newer messages, while case (2) means that we received a message that is not the next message in the queue, but a newer message. In this case, there would be a “hole” in the queue number and the messages received in case (1) could not be decrypted.

In case (2), the keys for the missing messages are temporarily generated and stored in the protected database. If we then receive an earlier message, as in case (1), we check whether a generated key is available for it. If so, we decrypt the message and delete the stored intermediate message key. If no key was generated for it, the message cannot be decrypted.

This solution also ensures that missing messages do not make subsequent messages undecipherable.
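The handling of both cases can be sketched as follows. This is an illustration only: the hash-based derivation, the `MAX_SKIP` bound, and the in-memory dictionary (standing in for the protected database) are assumptions, and the name `ReceivingChain` is hypothetical.

```python
import hashlib


class ReceivingChain:
    """Receiver-side key chain that tolerates out-of-order message counters."""

    MAX_SKIP = 100  # hypothetical bound on how many keys may be pre-generated

    def __init__(self, chain_key: bytes):
        self.chain_key = chain_key
        self.counter = 0   # counter of the next expected message
        self.skipped = {}  # counter -> stored key of a missing message

    def _step(self) -> bytes:
        """Derive the key for the current counter and advance the chain."""
        key = hashlib.sha512(b"msg" + self.chain_key).digest()[:32]
        self.chain_key = hashlib.sha512(b"chain" + self.chain_key).digest()
        self.counter += 1
        return key

    def key_for(self, msg_counter: int):
        """Return the message key for `msg_counter`, or None if unavailable."""
        if msg_counter < self.counter:
            # Case (1): older message; usable only if its key was stored.
            # pop() also deletes the stored key after use.
            return self.skipped.pop(msg_counter, None)
        if msg_counter - self.counter > self.MAX_SKIP:
            raise ValueError("gap between counters is too large")
        # Case (2): store a key for every message that has not arrived yet.
        while self.counter < msg_counter:
            missed = self.counter
            self.skipped[missed] = self._step()
        return self._step()
```

If message 2 arrives before messages 0 and 1, the keys for 0 and 1 are stored; when message 0 arrives later, its stored key is returned once and then removed.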

8.4.7 File sending

The KvantPhone application provides the possibility to send/receive pictures/files in a conversation or chat room. File sending is based on messaging. The process of sending a file is as follows:

It is checked whether the selected file format is supported and whether its size does not exceed the maximum allowed size. This limit can be configured on the server.
A strong, unique and random password is generated to encrypt the file using the AES-256-GCM symmetric algorithm.
The file is uploaded to the server, where it is stored under a random filename. This filename is returned to the client upon successful upload.
We create a message containing the file metadata (returned filename, name, size, type) and the key, nonce values needed to decrypt the file.
The message is sent to the remote party over the encrypted channel (in session) already established or established with this message.
In addition to returning the usual notifications (received, read), the remote party also sends feedback if it has downloaded the file from the server.
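The key generation and the metadata message from the steps above can be sketched as follows. The actual AES-256-GCM encryption and the upload are omitted, and the field names, the hex encoding, and the 96-bit nonce size are assumptions for illustration; `new_file_key` and `build_file_message` are hypothetical names.

```python
import json
import secrets


def new_file_key():
    """Generate a fresh random key and nonce for encrypting one file."""
    key = secrets.token_bytes(32)    # 256-bit key for AES-256-GCM
    nonce = secrets.token_bytes(12)  # 96-bit nonce, a common size for GCM
    return key, nonce


def build_file_message(server_filename: str, name: str, size: int,
                       file_type: str, key: bytes, nonce: bytes) -> str:
    """Build the message body that travels over the encrypted session."""
    return json.dumps({
        "file": {"server_filename": server_filename, "name": name,
                 "size": size, "type": file_type},
        "key": key.hex(),
        "nonce": nonce.hex(),
    })
```

Since the key and nonce travel only inside the end-to-end encrypted message, the server stores the file without being able to decrypt it.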

The files are stored on the server for a pre-configured period (1 year by default), which can be configured in the same way as for backups, according to the characteristics of the system and environment.

9. Chat room security

Fundamentally, there are two basic methods for sending a message to a group of contacts: the client sends the message to all group members one by one, or the client sends the message to a server, which distributes it to the group members. The KvantPhone application uses the first method, i.e. the client sends the message to all group members individually.

The chatroom messaging logic is very similar to the private messaging described above. An individual session is established with each group member by the sender based on the methods described in the private messaging. As additional information, the group ID is included in each message that a client sends to a group member.

As a result, a single user may have several different live sessions with the same other user: one private session, plus one group session for each group in which both are members.

The security of the communication with each member is therefore equivalent to the security level of private messaging, as described in the previous chapter.

In the case of a chat room, on the one hand, the administrative functions must be controlled and protected against attacks, and on the other hand, it must be ensured that only those who are actually members can post to a particular room.

9.1 Chat room protection – when sending messages

The server has information about the members of the chat room. When a message is sent, the server can read the identifier of the room from the message, as well as the identifiers of the sender and receiver. The server can therefore decide unambiguously whether to deliver the message to the recipient. Four different scenarios are possible:

1. The sender and the receiver are both members of the room: in this case, the server transmits the encrypted message.
2. The sender is a member of the room, but the recipient is not: in this case the server will not deliver the message. Such a situation may occur during normal operation if a member has been removed from the room, but the sender does not yet have this information. Another case is a potential attack scenario, where an unauthorized party wants to intercept messages not originally intended for him.
3. The sender is not a member of the room, but the receiver is. This is like case 2, and the message should not be delivered.
4. Neither the sender nor the receiver is a member of the room. Such a message cannot be initiated from the client; it is definitely a sign of some malicious attempt, and the server will not deliver the message.
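The server-side decision for the four scenarios reduces to a single membership check, which can be sketched as follows. The function name and the use of plain string identifiers are hypothetical.

```python
def should_deliver(room_members: set, sender: str, receiver: str) -> bool:
    """Deliver a room message only if both parties are members of the room."""
    return sender in room_members and receiver in room_members
```

Only the first scenario returns `True`; in the other three the message is dropped.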

9.2 Chat room protection – when changing room details

Another way to potentially compromise the chat room security is for an attacker to pretend to be a legitimate room member, perhaps by gaining unauthorized access to the room by simulating a change of membership (adding himself)

In the chat room, the following administrative actions are available:

1. Create a room
2. Delete a room with history kept for members
3. Delete a room for everyone
4. Change room name
5. Exit a room
6. Add a member to an existing room
7. Remove a member from an existing room
Actions 1, 2, 3, 4 and 7 can only be performed by the room creator, while the other actions can be performed by any of the members.
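The permission rule can be sketched as follows. The enum and function names are hypothetical, and the sketch covers only the role check, not the signature on the room modification.

```python
from enum import Enum


class RoomAction(Enum):
    CREATE = 1
    DELETE_KEEP_HISTORY = 2
    DELETE_FOR_EVERYONE = 3
    RENAME = 4
    EXIT = 5
    ADD_MEMBER = 6
    REMOVE_MEMBER = 7


# Actions 1, 2, 3, 4 and 7 are reserved for the room creator.
CREATOR_ONLY = {
    RoomAction.CREATE,
    RoomAction.DELETE_KEEP_HISTORY,
    RoomAction.DELETE_FOR_EVERYONE,
    RoomAction.RENAME,
    RoomAction.REMOVE_MEMBER,
}


def is_action_allowed(action: RoomAction, user: str, creator: str, members: set) -> bool:
    """Check whether `user` may perform `action` on the room."""
    if action in CREATOR_ONLY:
        return user == creator
    return user in members
```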

The protection of the actions is based on the fact that a room modification must be signed by the user who performs the action, and this signature is attached to the room. The signature of a room is therefore always generated by the last user who modified the properties of the room.

Group members are immediately informed about these actions, and the changes will be synchronised. During synchronisation, the room signature is checked and, in case of a bad signature, the modified room data is not accepted or applied.

10. Administrative interface

The system includes an Administration interface that is available for corporate subscribers. This interface allows administrators to create users and check their status, as well as to assist users in certain cases.

The Administration interface can only be accessed from Trusted networks.

10.1 Identification of administrators

The administrator can access the interface from trusted networks using two-factor authentication. When the account is created, the administrator is notified by an email that contains the password required for the first login. Upon first login, the administrator also receives an email with a verification code. After providing the code, the administrator must change the password before logging into the interface. The administrator’s password must be a strong one: at least 6 characters long, containing a lowercase letter, an uppercase letter and a number.
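The password rule stated above can be expressed as a simple check. This is a minimal sketch with a hypothetical function name; it encodes only the rules listed in this section.

```python
def is_acceptable_password(password: str) -> bool:
    """Check the stated policy: 6+ characters, lowercase, uppercase and a digit."""
    return (
        len(password) >= 6
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
    )
```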

The administrator’s password is hashed on the server using the bcrypt procedure. The validation code is valid for 10 minutes.

The administrator session is valid for 10 minutes and is identified on the server by a session cookie with the Secure, HttpOnly and SameSite=Strict attributes and a 10-minute lifetime. The administrator is automatically logged out after 10 minutes of inactivity.
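A cookie with these attributes could be produced as in the following sketch, which uses Python’s standard `http.cookies` module. The cookie name `admin_session` and the token format are assumptions; the real server-side implementation is not described in this document.

```python
import secrets
from http.cookies import SimpleCookie


def build_session_cookie(token: str) -> str:
    """Return the value of a Set-Cookie header for a 10-minute admin session."""
    cookie = SimpleCookie()
    cookie["admin_session"] = token               # hypothetical cookie name
    cookie["admin_session"]["secure"] = True      # sent over HTTPS only
    cookie["admin_session"]["httponly"] = True    # not readable from scripts
    cookie["admin_session"]["samesite"] = "Strict"
    cookie["admin_session"]["max-age"] = 600      # 10 minutes, in seconds
    return cookie["admin_session"].OutputString()


header_value = build_session_cookie(secrets.token_urlsafe(32))
```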

Apart from the two-factor authentication, all requests to the server contain a HOTP token. For HOTP generation, the key is derived from the activation code used for two-factor authentication.
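For reference, HOTP generation as defined in RFC 4226 can be sketched as follows. How the key is derived from the activation code and how the counter is synchronised between client and server are not specified in this document, so the sketch simply takes the key and counter as inputs.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value (RFC 4226) using HMAC-SHA-1."""
    # The counter is encoded as an 8-byte big-endian integer.
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select the offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 4226 test key `b"12345678901234567890"`, counters 0 and 1 yield `755224` and `287082`. The server would recompute the value for the expected counter and compare it with the token in the request, preferably with a constant-time comparison such as `hmac.compare_digest`.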

When the administrator logs out, or after 10 minutes of inactivity, the administrator’s session is deleted on the server, including the OTP identification details.