Forum Discussion
W10-1903 UNC path failing 0x80070043
- Jun 13, 2019
Hi everyone. We got access to the Dell KB and can see the issue in Dell EMC's 'Unity' CIFS implementation.
From the Dell EMC KB attached to the thread (below), the Unity SMB Server implementation is failing on the "SMB2_NETNAME_CONTEXT" and "SMB2_COMPRESSION_CAPABILITIES" we added in 1903. These were changes designed to add some new capabilities to SMB; we make some variant of these in most OS releases. If an SMB/CIFS server doesn't recognize capabilities, it should ignore them, not fail. Otherwise Dell would have to update their SMB implementation every time we released a new SMB capability that didn't also include a protocol dialect revision (like "SMB 3.1.2"), forever and ever.
Error Messages in the Unity c4_safe_ktrace.log:
sade:SMB: 3:[nas_serverx] Unrecognized SMB2 negotiate context type 0003
sade:SMB: 3:[nas_serverx] Unrecognized SMB2 negotiate context type 0003
The SMB client sends the compression context before the netname context, so the server encounters the compression context first. The Unity server would probably encounter the same problem with the netname context. Instead of failing when their SMB Server version doesn't support more advanced capabilities, it should be ignoring those capabilities. This is what Windows and other third-party SMB products do.
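For illustration, the behavior described above (skip negotiate contexts you don't recognize instead of failing the NEGOTIATE) could look roughly like this in Python. This is only a sketch, not Dell's or Microsoft's code: the function names and the `supported` set are hypothetical, and the context layout (2-byte type, 2-byte data length, 4 reserved bytes, then data, with each context 8-byte aligned) and the type IDs are taken from the public MS-SMB2 specification.

```python
import struct

# Context type IDs from the public MS-SMB2 specification.
PREAUTH_INTEGRITY = 0x0001
ENCRYPTION = 0x0002
COMPRESSION = 0x0003   # the "0003" in the Unity log above
NETNAME = 0x0005


def build_context(ctx_type, data):
    """Hypothetical helper: encode one negotiate context, padded to 8 bytes."""
    raw = struct.pack("<HHI", ctx_type, len(data), 0) + data
    return raw + b"\x00" * (-len(raw) % 8)


def parse_negotiate_contexts(buf, count, supported):
    """Walk the context list; accept known types, skip unknown ones."""
    accepted, ignored = [], []
    offset = 0
    for _ in range(count):
        ctx_type, data_len, _reserved = struct.unpack_from("<HHI", buf, offset)
        data = buf[offset + 8 : offset + 8 + data_len]
        if len(data) != data_len:
            raise ValueError("truncated negotiate context")
        if ctx_type in supported:
            accepted.append((ctx_type, data))
        else:
            # Unknown capability: note it and move on, do not fail the request.
            ignored.append(ctx_type)
        # The next context starts at the next 8-byte boundary.
        offset = (offset + 8 + data_len + 7) & ~7
    return accepted, ignored


# A pretend 1903-style request: preauth, then compression, then netname.
request = (
    build_context(PREAUTH_INTEGRITY, b"\x01" * 38)
    + build_context(COMPRESSION, b"\x00" * 10)
    + build_context(NETNAME, "nas".encode("utf-16-le"))
)
accepted, ignored = parse_negotiate_contexts(
    request, 3, supported={PREAUTH_INTEGRITY, ENCRYPTION}
)
```

A server that only knows the two older context types ends up with one accepted context and two ignored ones, and can carry on negotiating 3.1.1 without compression.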
kurgan thanks for opening this techcommunity item, I'm sorry I didn't see it until now.
Ned Pyle | Principal Program Manager, MS | @nerdpyle on twitter
Not sure why most people are trying to blame EMC / FluidFS for the SMB implementation.
I think Microsoft should be the one to fix the issue.
SMB 3.1.1 is not new. Our EMC NAS has been on SMB 3.1.1 for years, and previous Windows 10 releases (1703, 1709, 1803, 1809) were able to connect to our EMC NAS servers using SMB 3.1.1. Why did the same SMB 3.1.1 stop working in 1903?
Try connecting a Windows 10 PC on one of those builds (1703, 1709, 1803, 1809) to your EMC NAS and run Get-SMBConnection; it should show that the connection is using SMB 3.1.1. There is no issue with EMC's SMB implementation.
IronRolia Because Dell was not following the SMB specification. An SMB server is not supposed to fail when sent unsupported capabilities during the NEGOTIATE phase, regardless of its maximum supported dialect. 1903 added a Compression flag to SMB capabilities (which was also publicly documented so that vendors could support it if they wished), and if it does not support compression, the Dell device is supposed to respond that it doesn't support that flag, not return an error and tear down the session.
This is why Dell already fixed their other product with this symptom. Other manufacturers were unaffected by the addition of SMB Compression support in 1903 because they followed the spec.
Note: I own SMB and its specification.
- ghwright3Oct 19, 2022Copper Contributor
fcabj fixed with FluidFS Version 6.0.600004
https://www.dell.com/support/manuals/en-us/dell-compellent-fs8600/fluidfs-v6-rn/fixed-issues-in-fluidfs-version-60600004?guid=guid-efb74913-18ef-4b00-a55b-c793e124c3a9&lang=en-us
- ghwright3Sep 30, 2021Copper Contributor
MikeCrowley
I got Win11 working by limiting it to SMB 3.0.2, not ideal but I don't see us rolling out Win11 in production any time soon.
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters]
"MaxSMB2Dialect"=dword:00000302
- MikeCrowleySep 17, 2021Iron Contributor
sigh...
- ghwright3Sep 17, 2021Copper Contributor
confirmed, not functioning on win 11, see you back here
- MikeCrowleyAug 17, 2021Iron Contributor
Excellent! We also applied the latest Dell update, which resolved the issue. Let's hope they actually fixed the root cause this time, and we're not meeting here again at the Windows 11 launch!
- adecroce24Aug 13, 2021Copper Contributor
MikeCrowley They released an update to fix this just recently. We applied it over the weekend and it appears to have resolved the SMB 3.1.1 issues.
- MikeCrowleyAug 13, 2021Iron Contributor
We're seeing the same thing - this issue seems to have returned sometime this year. Setting Fluid FS to the below seems to have stabilized the issue, at the cost of downgrading to SMB2.
Frame 448: 131 bytes on wire (1048 bits), 131 bytes captured (1048 bits) on interface \Device\NPF_{[redacted]}, id 0
Ethernet II, Src: [redacted] ([redacted], Dst: VMware_[redacted] ([redacted])
Internet Protocol Version 4, Src: [redacted], Dst: [redacted]
Transmission Control Protocol, Src Port: 445, Dst Port: 55205, Seq: 1, Ack: 273, Len: 77
NetBIOS Session Service
SMB2 (Server Message Block Protocol version 2)
SMB2 Header
ProtocolId: 0xfe534d42
Header Length: 64
Credit Charge: 0
NT Status: STATUS_INVALID_PARAMETER (0xc000000d)
Command: Negotiate Protocol (0)
Credits granted: 17
Flags: 0x00000001, Response
Chain Offset: 0x00000000
Message ID: 0
Process Id: 0x0000feff
Tree Id: 0x00000000
Session Id: 0x0000000000000000
Signature: 00000000000000000000000000000000
[Response to: 446]
[Time from request: 0.000216000 seconds]
Negotiate Protocol Response (0x00)
[Preauth Hash: [redacted]…]
StructureSize: 0x0009
Error Context Count: 0
Reserved: 0x00
Byte Count: 0
Error Data: 34
I'm betting Dell's "fix" was just to once again hard-code the new values and not use the dynamic behavior Ned mentioned above. Hopefully they will issue a new patch.
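If you want to check your own captures for the same failure without clicking through Wireshark, the fields in the trace above map onto the fixed 64-byte SMB2 header. Here is a rough Python sketch that unpacks it. The field layout follows the public MS-SMB2 specification (synchronous header form); the function name and the sample bytes are made up to mirror the dump above, and the status table only covers the two values relevant here.

```python
import struct

# Synchronous SMB2 header: 64 bytes, little-endian, per MS-SMB2.
SMB2_HEADER = struct.Struct("<4sHHIHHIIQIIQ16s")

NT_STATUS_NAMES = {
    0x00000000: "STATUS_SUCCESS",
    0xC000000D: "STATUS_INVALID_PARAMETER",
}


def parse_smb2_header(raw):
    """Unpack the fixed SMB2 header and return the fields of interest."""
    if len(raw) < SMB2_HEADER.size:
        raise ValueError("need at least 64 bytes for an SMB2 header")
    (protocol_id, structure_size, credit_charge, status, command, credits,
     flags, next_command, message_id, process_id, tree_id, session_id,
     signature) = SMB2_HEADER.unpack_from(raw)
    if protocol_id != b"\xfeSMB":
        raise ValueError("not an SMB2 packet")
    return {
        "status": status,
        "status_name": NT_STATUS_NAMES.get(status, f"0x{status:08x}"),
        "command": command,                # 0 is Negotiate
        "credits": credits,
        "is_response": bool(flags & 0x1),  # server-to-client flag
        "message_id": message_id,
        "process_id": process_id,
    }


# Sample bytes built to match the header values in the capture above.
sample = SMB2_HEADER.pack(b"\xfeSMB", 64, 0, 0xC000000D, 0, 17,
                          0x00000001, 0, 0, 0x0000FEFF, 0, 0, bytes(16))
header = parse_smb2_header(sample)
```

A negotiate response (command 0, response flag set) carrying STATUS_INVALID_PARAMETER is the signature of the server rejecting the request outright rather than ignoring what it doesn't understand.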
- SAinslieJun 24, 2021Copper Contributor
Not happy with Dell at all on this. After 1903 they simply added the 2 extra flags to the FluidFS SMB implementation. SMB 3.1.1 stopped working on FluidFS again after 2004. NedPyle, were there additional SMB flags added then?
Unfortunately we have had to run on SMB 2 since then. However, I see Dell has just released a new update for FluidFS, now in June 2021. I have not tried it yet but will do so shortly.
- DarrenMillerJun 13, 2021Copper Contributor
We also had similar problems - they started again when Windows 2004 came out. We had just started migrating to a different platform so didn't bother contacting Dell.
I heard Dell had stopped developing FluidFS and had to get some developers in to fix 1903 issues, no idea if they are still around.
- NedPyleJun 11, 2021
Microsoft
Thanks for this update. Ouch, that's a super disruptive workaround. Disabling SMB 3.1.1 turns off a bunch of security and performance capabilities, never mind the functionality that will stop. Is the Fluid FS system just past end of life and they are winding down support?
- tomattheJun 11, 2021Copper Contributor
There are continuing issues with the Dell Fluid FS system (firmware 6.0.400016) and SMB shares. I don't recall the exact update, but I believe it was all the way back in May 2020 or so, and basically end users started seeing dropped or non-working shares. I worked on it for a bit and eventually contacted Dell. They acknowledged there was now an issue with SMB 3.1.1 and the Fluid FS, and that the only solution was to disable SMB 3.1.1 on the client end. They haven't fixed this, and according to everyone I have spoken to, they are not going to fix it.
We ended up pushing out a GPO to disable SMB 3.1.1 for all Windows users. We are still under Dell support with this system until March 2022, but they have completely stopped updating it as far as I can tell. I've contacted our support rep and got nowhere on this.
We have since got new hardware in from a different vendor and will no longer be using Dell storage.
- NedPyleJun 11, 2021
Microsoft
I had not heard of any issues after Dell issued their own fixes, and they have not contacted me through the partner channel about these symptoms you're describing. I'm afraid you will need to contact them and find out if there's been some new regression on their or our end. They will be able to engage us pretty fast if they need to.
- DragonsfireJun 11, 2021Copper Contributor
Do you know if Dell properly remedied this issue? (or how they went about doing so?)
We're running their latest version now, which helped fix SMB 3.1.1 since it didn't work at all before, but we have always noticed oddities here and there. And now we've started seeing an uptick in random drive mounting and UNC access issues even with version 6.0.400016 on our cluster. I believe so far this is mostly on Win10 2004 and higher. We're attempting to reach out to Dell but figured I'd pick your (or anyone's) brain on this.
Message Analyzer Dump
- tomattheOct 10, 2019Copper Contributor
DarrenMiller They eventually said the LUNs were dropped during the restarts because there was a RAID rebalance going on at that time. I'm not sure why that would cause it to drop connections, but that's what I was told. I had replaced a drive in one of the sc2080's under the NAS head a few days before doing the restarts. From the end user side I don't believe there is any way to actually see that the system is still balancing data after replacing a disk.
These arrays have dropped about 15% of their disks in 2.5 years of use, which is about 10% worse than any other array I run, fwiw.
- DarrenMillerSep 26, 2019Copper Contributor
tomatthe They requested rolling reboots before the upgrade and logs, but not rolling reboots after.
I will try rolling reboots now that the upgrade is done, to check it doesn't cause us any issues. We normally have to do rolling reboots once a month anyway as the controllers regularly run out of cache (long standing issue, not related to this).
I was a bit surprised by the extra steps - I've never had to do that before so it does seem they are being extra cautious with this one.
- tomattheSep 26, 2019Copper Contributor
DarrenMiller finally got this installed on our DR side yesterday. Did Dell ask you to do a rolling restart of both controllers on the FS prior to updating, and then to restart both controllers again after updating the firmware? They asked that I do that, then send them the gen diags after doing all of that. The 2nd set of rolling restarts did create some problems.
They are also asking for access to the storage arrays, which makes me a bit hesitant to think this firmware is production ready. That being said I haven't seen any issues with it so far, and 1903 does work properly.
- LColladoSep 26, 2019Copper Contributor
Having the same issue with Unity 300.
This Dell EMC Article worked for me.
https://emcservice.force.com/CustomersPartners/kA43a00000003w4CAA
- tomattheSep 25, 2019Copper Contributor
DarrenMiller Still trying to get Dell to review the diags I sent them and get me the actual firmware.
FWIW a friend of mine who also runs this system sent me the info below. Hopefully it's unlikely that anyone else will run into this, but it sounds like a bad situation.
"
The problem that occurred on our end occurred during the initial diagnostics process. We were working with the individual assigned to help get us prepped for the update, and so we pulled the ISO, and were running the initial tests. Part of that process – they had asked us to update the support password. When we did, the process tossed an error. They didn’t think anything about it. However, that process created a seed that didn’t exist, and the system literally hung during the diagnostic process trying to sync that file between the two nodes. It was difficult to diagnose and we had to have someone from the team that actually wrote the queuing system for the product determine the problem and write a small utility to stop the cascade of badness. We were actually at a point of essentially re-imaging both nodes because no one could figure out all Thursday and Friday what the hell was going on.
Once that was fixed, the firmware update took 25-30 minutes – tops."
- DarrenMillerSep 24, 2019Copper Contributor
tomatthe I installed the new version on our DR FluidFS this morning. It all went smoothly and it now works with SMB3 and Windows 10 1903!
I'm going to give it a few more days testing before installing it on the production servers but it looks ok so far.
- tomattheSep 23, 2019Copper Contributor
ftp://customer:Y3V2s-uH@ftp.compellent.com/SOFTWARE/FluidFS/680-115-002_FluidFS_v6_Release_Notes.pdf
Pretty minimal changes:
Fixed Issues in FluidFS Version 6.0.400016
The following issues were fixed in FluidFS 6.0.400016:
Area | Description
NAS Volumes, Shares, and Exports | After installing Windows 10 update 1903, SMB shares are not accessible using the UNC path.
System Functionality | The system incorrectly reports that duplicate IP addresses are in use for nodes on the cluster.
System Functionality | The system incorrectly reports battery failures during the battery calibration process.