Skip to main content Microsoft Intelligent Data Platform Azure Arc Azure databases Power BI SQL Server 2025 SQL Server BI SQL Server 2022 SQL Server 2019 SQL Server 2017 SQL Server 2016 SQL Server 2005 - 2014 Downloads Community SQL End of Support Data Security - SQL Server Encryption SQL Server blog SQL Server and Azure SQL workshops Browse Microsoft Solutions Hub SQL Server Tech Community Azure Databases Tech Community Azure Synapse Analytics Tech Community Developer Find a partner Become a partner Partner resources Try SQL Server 2025 Microsoft Security Azure Dynamics 365 Microsoft 365 Microsoft Teams Windows 365 Microsoft AI Azure Space Mixed reality Microsoft HoloLens Microsoft Viva Quantum computing Sustainability Education Automotive Financial services Government Healthcare Manufacturing Retail Find a partner Become a partner Partner Network Microsoft Marketplace Marketplace Rewards Software development companies Blog Microsoft Advertising Developer Center Documentation Events Licensing Microsoft Learn Microsoft Research View Sitemap

Introducing UTF-8 support in SQL Server 2019 preview

With the first public preview of SQL Server 2019, we announced support for the widely used UTF-8 character encoding as an import or export encoding, and as database-level or column-level collation for string data. This is an asset for companies extending their businesses to a global scale, where the requirement of providing global multilingual database applications and services is critical to meet customer demands, and specific market regulations. The benefits of introducing UTF-8 support extend to scenarios where legacy applications require internationalization and use inline queries: the amount of changes and testing involved to convert an application and underlying database to UTF-16 can be costly, by requiring complex string processing logic that affect application performance.

To limit the amount of changes required for the above scenarios, UTF-8 is enabled in existing the data types CHAR and VARCHAR. String data is automatically encoded to UTF-8 when creating or changing an object’s collation to a collation with the “UTF8” suffix, for example from LATIN1_GENERAL_100_CI_AS_SC to LATIN1_GENERAL_100_CI_AS_SC_UTF8. Refer to Set or Change the Database Collation and  Set or Change the Column Collation for more details on how to perform those changes. NCHAR and NVARCHAR remain unchanged and only allow UTF-16 encoding.

UTF-8 is only available to Windows collations that support supplementary characters, as introduced in SQL Server 2012. You can see all available UTF-8 collations by executing the bellow command in your SQL Server 2019 CTP:

SELECT Name, Description FROM fn_helpcollations() 
WHERE Name like '%UTF8';

Additionally, if your dataset uses primarily Latin characters, significant storage savings may also be achieved as compared to UTF-16 data types. For example, changing an existing column data type from NCHAR(10) to CHAR(10) using an UTF-8 enabled collation, translates into nearly 50 percent reduction in storage requirements. This is because NCHAR(10) requires 22 bytes for storage, whereas CHAR(10) requires 12 bytes for the same Unicode string.

Getting started

English (United States)
Your Privacy Choices Opt-Out Icon Your Privacy Choices
Consumer Health Privacy Sitemap Contact Microsoft Privacy Manage cookies Terms of use Trademarks Safety & eco Recycling About our ads