1. Background
The following exception was found in our production deployment:
- MySql.Data.MySqlClient.MySqlException (0x80004005): Authentication to host '172.16.0.203' for user '******' using method 'mysql_native_password' failed with message: Reading from the stream has failed. ---> MySql.Data.MySqlClient.MySqlException (0x80004005): Reading from the stream has failed. ---> System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
- at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
- at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
- --- End of inner exception stack trace ---
- at MySql.Data.Common.MyNetworkStream.HandleOrRethrowException(Exception e) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\common\MyNetworkStream.cs:line 95
- at MySql.Data.Common.MyNetworkStream.Read(Byte[] buffer, Int32 offset, Int32 count) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\common\MyNetworkStream.cs:line 112
- at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
- at System.Net.Security._SslStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
- at System.Net.Security._SslStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
- at System.Net.Security._SslStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
- at System.Net.Security.SslStream.Read(Byte[] buffer, Int32 offset, Int32 count)
- at MySql.Data.MySqlClient.TimedStream.Read(Byte[] buffer, Int32 offset, Int32 count) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\TimedStream.cs:line 208
- at System.IO.BufferedStream.Read(Byte[] array, Int32 offset, Int32 count)
- at MySql.Data.MySqlClient.MySqlStream.ReadFully(Stream stream, Byte[] buffer, Int32 offset, Int32 count) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlStream.cs:line 183
- at MySql.Data.MySqlClient.MySqlStream.LoadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlStream.cs:line 204
- at MySql.Data.MySqlClient.MySqlStream.LoadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlStream.cs:line 228
- at MySql.Data.MySqlClient.MySqlStream.ReadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlStream.cs:line 143
- at MySql.Data.MySqlClient.NativeDriver.ReadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\NativeDriver.cs:line 137
- at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.ReadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Authentication\MySQLAuthenticationPlugin.cs:line 171
- at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.AuthenticationFailed(Exception ex) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Authentication\MySQLAuthenticationPlugin.cs:line 92
- at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.ReadPacket() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Authentication\MySQLAuthenticationPlugin.cs:line 177
- at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.Authenticate(Boolean reset) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Authentication\MySQLAuthenticationPlugin.cs:line 131
- at MySql.Data.MySqlClient.NativeDriver.Authenticate(String authMethod, Boolean reset) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\NativeDriver.cs:line 505
- at MySql.Data.MySqlClient.NativeDriver.Open() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\NativeDriver.cs:line 309
- at MySql.Data.MySqlClient.Driver.Open() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Driver.cs:line 240
- at MySql.Data.MySqlClient.Driver.Create(MySqlConnectionStringBuilder settings) in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Driver.cs:line 227
- at MySql.Data.MySqlClient.MySqlPool.CreateNewPooledConnection() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlPool.cs:line 172
- at MySql.Data.MySqlClient.MySqlPool.GetPooledConnection() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlPool.cs:line 155
- at MySql.Data.MySqlClient.MySqlPool.TryToGetDriver() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlPool.cs:line 244
- at MySql.Data.MySqlClient.MySqlPool.GetConnection() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\MySqlPool.cs:line 257
- at MySql.Data.MySqlClient.MySqlConnection.Open() in E:\GitHubRepos\mysql-connector-net-oracle\Source\MySql.Data\Connection.cs:line 505
- at MySqlConsole.Program.Main(String[] args) in E:\GitHubRepos\mysql-connector-net-oracle\Samples\MySqlConsole\Program.cs:line 19
Our application contains an NTService and several IIS web sites. The .NET application runs on multiple windows servers (have 2012 and 2016). The MySQL database server runs on a separate CentOS machine (172.16.0.203). We're using Connector/NET 6.9.8.0 with MySQL 5.7.14. The connection string looks like:
- server=172.16.0.203;database=MatchTiming;user=***;password=***;charset=utf8;
The test code with the same issue looks like:
- try
- {
- using (MySqlConnection conn = new MySqlConnection(Properties.Settings.Default.ConnString))
- {
- MySqlCommand cmd = new MySqlCommand($"SELECT ID,Name FROM MatchTiming.`Match` LIMIT 0, 10;", conn);
- conn.Open();
- MySqlDataReader reader = cmd.ExecuteReader(System.Data.CommandBehavior.SingleResult);
- while (reader.Read())
- {
- Console.WriteLine(reader.GetString(1));
- }
-
- reader.Close();
- }
- }
- catch (Exception ex)
- {
- Console.WriteLine(ex.ToString());
- }
2. Findings
After some investigation, I have the following findings:
The exception doesn't occur on all the servers.
- Both NTService and IIS sites have this exception. So does a windows console application.
- The exception will occur 100% on the first database connection attempt for every process/WebAppPool. Then the following database connections (in the same process/WebAppPool) can be established successfully.
- The exception will be thrown about 15 seconds after the first database connection attempt. Looks like a timeout issue.
- The exception is never seen in development environment.
- MySQL Workbench can see a connection with no username from the same IP address. Then it disappears after 15 seconds.
- The servers with this exception don't have internet access.
Some similar MySQL bug reports are found but no consistant steps to repeat the exception. A number of workarounds were provided in the discussion. But the reason is still unknown.
- https://bugs.mysql.com/bug.php?id=76597
- https://bugs.mysql.com/bug.php?id=86056
I've tried all the following workarounds with no luck:
- Upgrading the Connector/NET to 6.9.9.0.
- Upgrading the MySQL Server to 5.7.19.
- Setup a new MySQL Server in the production environment.
- Use Percona MySQL Server instead of official MySQL Community.
- Replace Oracle Connector/NET with Async MySQL Connector for .NET and .NET Core (https://github.com/mysql-net/MySqlConnector)
In some perticular cases the issue got resolved with some of the workarounds but not my issue. Then I have to earn my own living.
3. Investigation
I dived into the Connector/NET source code (https://github.com/mysql/mysql-connector-net) and found something interesting here:
https://github.com/mysql/mysql-connector-net/blob/5864e6b21a8b32f5154b53d1610278abb3cb1cee/Source/MySql.Data/NativeDriver.cs#L290
Line 309 is the source of exception. But line 290-299 attracted me. It says the SSL is enabled by default. This is unexpected to me since I've never configured the SSL certificate. I don't need SSL because my production database servers are in a managed LAN without internet access. But this gives me an important clue. Then I found the server provided a self-signed SSL cert to the client. The Windows PKI (Public Key Infrastructure) API (Crypto API) will validate the SSL cert immediately after it received the SSL cert from server. This self-signed cert is not in the certificate trust list (CTL) by default. If the current CTL is too old, Windows needs to update the latest CTL from the internet. But this server doesn't have internet access.
Here comes the network packet capture:
The Wireshark capture confirms the above cert validation process. And this is the root cause.
4. Solutions
Knowing the root cause, the solutions become easy.
Solution 1: If SSL is not required. Since it is caused by SSL, we can turn off SSL by appending "SslMode=None" to the connection string.
Solution 2: If SSL is required, server identity is important and needs to be verified. The server needs a internet connection to do the cert verification. Please note the crypto API doesn't update CTL for every process. The CTL is maintained at operating system level. Once you connect the server to internet and make an SSL database connection to the server, the CTL will be updated automatically. Then you may disconnect the internet connection. Note again the CTL has its expiration date and after that the Windows needs to update it again. This will occur probably after several months.
Solution 3: If SSL is required but the server identity is not important. Typically SSL is only used to encrypt the network transport in this case. We can turn off CTL update:
- Press Win+R to open the "Run" dialog
- Type "gpedit.msc" (without quotes) and press Enter
- In the "Local Group Policy Editor", expand "Computer Configuration", expand "Administrative Templates", expand "System", expand "Internet Communication Management", and then click "Internet Communication settings".
- In the details panel, double-click "Turn off Automatic Root Certificates Update", clickEnabled, then click OK. This change will be effective immediatelly without restart.
5. References
Configuring Certificate Revocation: https://technet.microsoft.com/en-us/library/cc771079(v=ws.11).aspx
How Certificate Revocation Works: https://technet.microsoft.com/en-us/library/ee619754(v=ws.10).aspx
Windows XP: Certificate Status and Revocation Checking https://social.technet.microsoft.com/wiki/contents/articles/4954.windows-xp-certificate-status-and-revocation-checking.aspx
Should clients be able to access ctldl.windowsupdate.com when using WSUS? https://serverfault.com/questions/714637/should-clients-be-able-to-access-ctldl-windowsupdate-com-when-using-wsus
An automatic updater of untrusted certificates is available for Windows Vista, Windows Server 2008, Windows 7, and Windows Server 2008 R2 https://support.microsoft.com/en-us/help/2677070/an-automatic-updater-of-untrusted-certificates-is-available-for-window
Certificate Support and Resulting Internet Communication in Windows Vista https://technet.microsoft.com/en-us/library/cc749331(v=ws.10).aspx