23 Jun Protecting your data against unauthorised reads on SQL Server
The need to protect data from being read by unauthorized people has always been there. In the last years, this need has even increased. One of the first things that come to mind is e.g. GPDR but there are many other cases why you need to protect your data.
Here are some examples(*):
- 2012: LinkedIn announced that 6.5 million unassociated passwords (unsalted SHA-1 hashes) were stolen by attackers and posted onto a Russian hacker forum. However, it wasn’t until 2016 that the full extent of the incident was revealed. The same hacker selling MySpace’s data was found to be offering the email addresses and passwords of around 165 million LinkedIn users for just 5 bitcoins (around $2,000 at the time).
- 2013: roughly 153 million user records were stolen from Adobe. The hack had also exposed customer names, IDs, passwords and debit and credit card information.
- 2014: eBay reported that an attack exposed its entire account list of 145 million users in May 2014, including names, addresses, dates of birth and encrypted passwords. The online auction giant said hackers used the credentials of three corporate employees to access its network and had complete access for 229 days—more than enough time to compromise the user database.
- 2016: 421.2 million user records were stolen from Adult Friend Finder. The stolen data spanned 20 years on six databases and included names, email addresses and passwords.
- 2019: 137 million accounts were stolen from Canva. The attack exposed email addresses, usernames, names, cities of residence, and salted and hashed with bcrypt passwords (for users not using social logins — around 61 million) of 137 million users.
- 2019: a spectacular case of data theft was done by an employee of the “UBS Group AG”: https://www.bloomberg.com/news/articles/2019-01-07/ex-ubs-employee-charged-with-data-theft-on-trial-in-switzerland
- September 2020: Thousands of MSSQL Databases infected by new MrbMiner malware
As the last example shows, this was not an external attack, but a data theft done by an employee working for the company. So not only external threats need to be managed, also internally the necessary security measures need to be taken. And don’t forget: also external attackers get smarter!
Running your databases in the backend is a good starting point, but with previous examples in mind, do you think it will really avoid any data to be stolen? What do you do when your vendor asks a copy of the data for testing purposes? And what about your Development/Acceptance and Testing databases? Is the data copied from Production without any further action, so it is fully visible for your developers and testers (which possibly are external consultants)?
Let’s have a look at some available solutions on both client and database level so that you can decide which level of protection your data needs.
Start with the basics
The next 2 solutions should be implemented correctly by default, it’s a starting point before you go any further.
Encrypt the storage of your client devices
This can be done by encrypting the hard disk(s) on your laptop/desktop. Many new PCs that ship with Windows 10 will automatically have “Device Encryption” enabled. This feature was first introduced in Windows 8.1, and there are specific hardware requirements for this. Not every PC will have this feature, but some will.
If Device Encryption isn’t enabled—or if you want a more powerful encryption solution that can also encrypt removable USB drives, for example—you’ll want to use BitLocker. Microsoft’s BitLocker encryption tool has been part of Windows for several versions now, and it’s generally well regarded. However, Microsoft still restricts BitLocker to Professional, Enterprise, and Education editions of Windows 10. An alternative is third party software such as TrueCrypt, DiskCryptor, Comodo, …
On Apple computers one can use FileVault.
For Linux we have: Tomb, Cryptmount, CryFS, GnuPG, VeraCrypt, …
It’s recommended to enforce a policy which encrypts portable external storage (e.g. USB sticks/drives) if the intention is to use them for write operations.
If the device is lost/stolen, storage encryption prevents the data from being read.
Limit the accounts and rights on the server and databases
Giving access to SQL Server is a 2 step procedure:
- Create the account on the server level.
- Grant this account access to one or more databases.
As you don’t want the whole company to have access to the accounting database, only add the accounts which need this. E.g.: do not give the NT-group ‘MyDomain\AllEmployees’ access to the server dedicated for accounting and financial applications.
On the database level, clearly define which accounts have access. E.g.: do not enable the “guest” account.
Be extremely careful with the high privileged roles on the server level:
- Use a dedicated NT-group which contains the users needing sysadmin rights. Review this group at least once every year.
- Give the “sa” account a random password and then disable it.
- No end-user applications should run with sysadmin rights.
Verify in every database the membership of the “db_owner” role: no end-user application should be member of this role.
Go to the next level
These statements have been around since the dawn of time. They are used for fine tuning how and by whom an object can be accessed. This is a simple solution and will work just fine for “normal” end-users.
- Easy to implement and maintain.
- Members of the db_owner database role and members of the sysadmin server role have access to all objects.
- Network sniffers can read the data as it passes through the network.
Users which have NT-administrator rights, can copy the backup of the database and restore it on another server/desktop/laptop where they have sysadmin rights.
Use Transparent Database Encryption (TDE)
As the name suggests, TDE encrypts the complete database but this process is transparent for the end-users. In other words: you don’t have to modify your application.
Although the data files are encrypted, the decryption is done when the result is sent to the client. This makes the data visible for network sniffers.
Before SQL Server 2019 this was only possible using the Enterprise Edition, starting from SQL Server 2019 it is available for Standard Edition users too.
Of course, to prevent access to the data, you still have to restrict access by selecting the accounts which have access and by using GRANT, REVOKE, DENY on the database objects.
Note that your backup files are protected as well: without the correct certificate it will not be possible to restore the backup on another machine.
- Relatively easy to implement.
- Transparent to end-users (no software modifications needed).
- Impossible to restore on another server without the correct certificate.
- Now also available with the Standard Edition of SQL Server 2019.
- Members of the sysadmin server role have access to all objects.
- Network sniffers can read the data as it passes through the network.
- Only available on the Enterprise edition of SQL Server (if using a version older than 2019).
- Extra overhead when restoring the DB on another server.
Transport Layer Security (TLS)
The “Use GRANT/REVOKE/DENY” and “Use Transparent Database Encryption (TDE)” methods have the same nagging problem coming back: network sniffers. To close this security hole, we can encrypt the communication between the client & SQL Server using TLS.
It’s worth mentioning that enabling encrypted connections can have a performance impact. This is due to the extra network round trip when establishing a secure connection, as well as the encryption and decryption overhead which is handled by the .NET-libraries on the client and server.
- Network sniffers only see encrypted data.
- Only the connection string of the client application needs to be modified.
- Encryption while communicating with SQL Server can be enforced.
- There might be an overhead in encrypting/decrypting the data.
- Members of the sysadmin role can still access every object.
Encrypt data programmatically
You can either devise your own encryption code, use the “Encrypt/Decrypt” methods in .NET or buy third-party libraries to encrypt/decrypt your data.
This will incur a lot of work as every piece of code which communicates with SQL Server will have to encrypt/decrypt data. If somewhere in your application you forget to apply an encrypt/decrypt operation, this can result in serious data corruption.
- As the decryption is done on the client’s computer, even members of the sysadmin role won’t be able to read your data.
- Network sniffing will only show encrypted data.
- You have to modify & maintain your application so that the encryption/decryption is done in all places where it is needed. This might be quite time consuming and error prone.
- Most probably you won’t be able to do comparisons on the encrypted fields, nor will you be able to use them in a JOIN statement.
Dynamic data masking
This technique is available since SQL Server 2016. Instead of encrypting a field, you mask either a part or the complete field using a function.
E.g.: Here we mask an email address
By default, only the owner of table will see the unmasked data. If you grant SELECT to the table to other accounts/roles, they’ll only see the masked data.
Giving access to the unmasked data in done via the GRANT command.
GRANT UNMASK TO [ro_Accountants];
- You’re protected against unauthorized reads in the application and against exposure to developers or DBAs who need access to the production DB.
- Data is masked on the fly. No encryption of the underlying data.
- Transparent to the application.
- Applied according to user privilege (GRANT UNMASK TO …)
- Network sniffing will only show the masked data.
- When casting/converting the column to another format, the complete field is masked.
- You can guess data. E.g. this query will return the correct result:
SELECT Lastname, Name, email FROM tb_Persons WHERE email@example.com'
If you take your data seriously then you should have a strict separation between PROD and non-PROD: PROD should only be accessible for the business. Developers should only have access to non-PROD.
However, before releasing a new version in PROD you want to test it thoroughly in non-PROD with enough relevant data. So you have to copy your PROD DB to non-PROD, don’t you?
The answer is yes, but you should scramble the sensitive data so that it becomes impossible to link the data with the real information in PROD.
Setting up a good scrambling method is far from easy but fortunately there are some good third-party tools available to make this a smooth process. E.g.: Red Gate Provisioning & Data Masker, CA test data manager, Net2000 Data Masker, …
- Your developers/testers can use a representative amount of data.
- Sensitive data can no longer be linked in non-PROD.
- Setting up the scrambling process for the first time can be time consuming.
- Must be reviewed each time new tables/columns are added.
Before this is feature was only available on the enterprise edition of SQL Server (versions prior to SQL Server 2016).
Since version 2016, this has been made available on all editions.
In this model the data is encrypted on the server and, in contrast with TDE, stays encrypted until it arrives at the client.
- Data remains encrypted on the server and remains encrypted until it arrives at the client.
- Only clients which have the correct certificate installed can decrypt the data.
- In comparison with programmatically encryption methods, there are less modifications to be done on the application level.
- Extra overhead to distribute the certificates to the correct clients.
- The application must use SqlParameter objects when passing plaintext data to the server with Always Encrypted columns.
- Encrypted columns do not allow range-like operations such as ‘>, <‘ , ‘LIKE’, etc.
- Passing encrypted values to functions, user-defined or otherwise, is not allowed (the database doesn’t have access to the unencrypted values).
- Equality comparisons can only be performed on columns that use deterministic encryption.
- Indexes can only be applied to deterministic encryption columns
- Need same column encryption key for columns that are joined
- Constant expressions that refer to encrypted columns not allowed
WHERE SSN = ‘111-11-1111’, but WHERE SSN = @SSN is because the driver works with the SqlParameter class
- Unsupported data types: xml, rowversion, image, ntext, text, sql_variant, hierarchyid, geography, geometry, and UDFs
- Currently the only driver that supports this feature is .NET 4.6 and higher
- Encrypted columns take significantly more space
- String-based columns that are encrypted with deterministic encryption must use a _BIN2 collation (e.g. Latin1_General_BIN2)
- The following data types are NOT supported as encrypted columns, per the documentation: text, ntext, image, XML, hierarchyid, geography, geometry, alias types, user-defined data types, SQL_VARIANT, rowversion (timestamp)
- These features are not supported with Always Encrypted:
- Sparse column set (sparse columns are okay, provided the table doesn’t contain a column set)
- Built-in alias types, e.g. SYSNAME
- Identity columns
- Computed columns
- Temporal tables
- Triggers are partially supported
- Full-text search
- Replication (need more research)
- In-Memory OLTP
- Stretch Database
As you can see there are quite a few limitations in the current version of Always Encrypted. On the other hand, this is the most secure way to encrypt your data.
Which setup is the best for you really depends. Monin can help you to evaluate this case by case. We can provide a tailored Microsoft SQL Security Awareness session based on your environment. First we’ll perform a high-level security assessment, afterwards we’ll go through all the security aspects on Microsoft SQL Server level which you could implement in your environment. Our specialist will discuss all advantages and disadvantages of each security solution.
Interested? Just ask us for an offer to better security via phone: +32 3 450 67 89 or via e-mail: firstname.lastname@example.org