ykhabins, I think Pedro meant to say that "UTF-8 is not supported in the XML datatype". And by "XML datatype is stored in binary format", he means that the XML datatype is an optimized type that reduces overall size by doing several things, such as:
1. place strings into a dictionary so that they only exist once (and yes, here they will be stored as UTF-16)
2. remove insignificant whitespace (there is a quick example of this after the spec link below)
3. if using an XSD / XML Schema Collection, attributes and elements designated as specific datatypes are stored in the native binary representation for that type (when no XML Schema Collection is used, all values are stored as UTF-16 strings)
Here is one example of this format, as found in the overall protocol specification (MS-BINXML) for the XML datatype:
https://docs.microsoft.com/en-us/openspecs/sql_server_protocols/ms-binxml/d5bd1f42-8643-435c-a0df-0ba8680a19ee
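As a quick illustration of item 2 above (the element names here are just placeholders): assign some indented XML to an XML variable and serialize it back out, and the line breaks and indentation between the elements are gone because they were never stored:

DECLARE @Formatted XML = N'<root>
    <item>a</item>
    <item>b</item>
</root>';

-- Serialize back to a string to see what was actually kept:
SELECT CONVERT(NVARCHAR(MAX), @Formatted) AS [SerializedBack];
-- Returns: <root><item>a</item><item>b</item></root>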
And here is a section of one of my posts where I show that, for XML data containing repeated strings, the overall data size (in bytes) is smaller in the native XML datatype than it is even for VARCHAR data containing the same XML. Of course, if the XML data contains mostly unique strings, then it's possible that the VARCHAR representation would be smaller, though that doesn't always help since VARCHAR can't represent all code points. Either way, take a look:
https://sqlquantumleap.com/2019/11/22/how-many-bytes-per-character-in-sql-server-a-completely-complete-guide/#xml
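If you want a quick (purely illustrative) way to test that size claim yourself, compare DATALENGTH() of the raw VARCHAR text against DATALENGTH() of the same data stored as XML; with heavily repeated element names, the XML datatype typically comes out smaller (the element name below is just a made-up placeholder):

DECLARE @AsText VARCHAR(MAX) =
    '<root>'
    + REPLICATE(CONVERT(VARCHAR(MAX),
        '<ElementWithARatherLongName>val</ElementWithARatherLongName>'), 1000)
    + '</root>';
DECLARE @AsXml XML = @AsText; -- implicit conversion into the binary XML format

SELECT DATALENGTH(@AsText) AS [VARCHAR bytes],
       DATALENGTH(@AsXml)  AS [XML bytes];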
ALSO, if by UTF-8 support in XML you mean the encoding of it going in and/or coming out, then that might be a different story. The XML datatype currently supports (and always has supported, starting in SQL Server 2005) converting most encodings into UTF-16. The reason the <?xml ?> declaration is stripped off is that it isn't necessary: the only encoding the data can be internally is UTF-16 (and there would be a conflict if the encoding specified in the declaration stated otherwise). HOWEVER, if you pass in valid bytes of UTF-8 (or even Windows-1252, etc.) and provide an <?xml ?> declaration that states the encoding used for the bytes you are passing in (and if you do NOT prefix the string literal with an upper-case "N" or use an NVARCHAR variable), then the XML datatype will convert from that encoding into UTF-16.
Here is a StackOverflow answer of mine in which I provide examples of doing this conversion from VARCHAR data encoded as UTF-8 or Windows-1252 into XML:
https://stackoverflow.com/a/39922862/577765
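For convenience, here is a minimal sketch of that conversion (the "test" element is just a placeholder): the two CHAR() calls inject the bytes 0xC3 and 0xA9, which are the UTF-8 encoding of "é", and the encoding declaration tells the parser how to interpret those bytes:

-- No upper-case "N" prefix: this must remain VARCHAR, not NVARCHAR
DECLARE @Utf8Bytes VARCHAR(100) =
    '<?xml version="1.0" encoding="utf-8"?><test>'
    + CHAR(195) + CHAR(169)   -- bytes 0xC3 0xA9 = UTF-8 "é"
    + '</test>';

SELECT CONVERT(XML, @Utf8Bytes) AS [ConvertedToUtf16];
-- Returns: <test>é</test>  (the bytes were decoded as UTF-8, then stored internally as UTF-16)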
Of course, getting the data out as UTF-8 encoded bytes is not supported natively until SQL Server 2019, where you can achieve it by converting to VARCHAR using one of the new "_UTF8" collations. Prior to SQL Server 2019, you can create a T-SQL or SQLCLR scalar function (i.e. UDF) that returns a VARBINARY(MAX) value containing the UTF-8 encoded bytes of whatever was passed in.
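In case it helps, here is a minimal sketch of that SQL Server 2019 approach, assuming one of the "_UTF8" collations such as Latin1_General_100_CI_AS_SC_UTF8 is available; the outer conversion to VARBINARY is only there to make the resulting UTF-8 bytes visible:

DECLARE @Data XML = N'<test>é</test>';

SELECT CONVERT(VARBINARY(MAX),
               CONVERT(VARCHAR(MAX), @Data) COLLATE Latin1_General_100_CI_AS_SC_UTF8
       ) AS [Utf8EncodedBytes];
-- Returns: 0x3C746573743EC3A93C2F746573743E  (the "é" comes back as the two bytes 0xC3 0xA9)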
Take care,
Solomon....