Forum Discussion
Root bot, skill bot and scaling
- Dec 14, 2022
We had James check his data and found this. See if it helps. In the root bot:
- Double check and be 100% sure that you're using the SkillConversationIdFactory that is part of the MS chatbot framework (NOT one that you may have created). It should have an IStorage constructor parameter that lets you pass in whatever storage you want to use to persist the ids used for skills communication. You probably need to use the class that the framework gives you (i.e. the SkillConversationIdFactory that inherits from "Microsoft.Bot.Builder.Skills.SkillConversationIdFactoryBase").
- For the IStorage object used by SkillConversationIdFactory, if you are using some kind of in-memory-only storage (i.e. a ConcurrentDictionary or other MemoryStorage-type object), that might be a problem. The code in SkillConversationIdFactory might not be persisting the conversation/skill ID lookup data (needed to talk with skills) to a place that other apps can read.
I have found some old MS examples that give a "demo" of how to use skills and show a SkillConversationIdFactory that uses in-memory storage...which of course won't scale or work across different apps.
https://learn.microsoft.com/en-us/dotnet/api/microsoft.bot.builder.skills.skillconversationidfactory?view=botbuilder-dotnet-stable
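To make the storage point concrete, here is a rough TypeScript simulation of what I think is happening. It does not use the Bot Framework SDK at all: KeyValueStore, MapStore, ConversationIdFactory, handleSkillCallback and simulate are all made-up names that only stand in for IStorage, the id factory and the root bot's skill callback endpoint. The assumption is that a root instance answers 404 when it cannot look up the skill conversation id it is handed.

```typescript
// Illustrative stand-in for the framework's IStorage (hypothetical, not the SDK type).
interface KeyValueStore {
  read(key: string): string | undefined;
  write(key: string, value: string): void;
}

class MapStore implements KeyValueStore {
  private readonly items = new Map<string, string>();

  read(key: string): string | undefined {
    return this.items.get(key);
  }

  write(key: string, value: string): void {
    this.items.set(key, value);
  }
}

// Hypothetical factory: maps a generated skill conversation id
// back to the original user conversation id.
class ConversationIdFactory {
  private readonly store: KeyValueStore;
  private readonly instanceName: string;
  private counter = 0;

  constructor(store: KeyValueStore, instanceName: string) {
    this.store = store;
    this.instanceName = instanceName;
  }

  createSkillConversationId(originalConversationId: string): string {
    this.counter += 1;
    const skillConversationId = `${this.instanceName}-skill-${this.counter}`;
    this.store.write(skillConversationId, originalConversationId);
    return skillConversationId;
  }

  getOriginalConversationId(skillConversationId: string): string | undefined {
    return this.store.read(skillConversationId);
  }
}

// Status a root instance would return when a skill posts back to it
// (assumption: an unknown skill conversation id results in a 404).
function handleSkillCallback(factory: ConversationIdFactory, skillConversationId: string): number {
  return factory.getOriginalConversationId(skillConversationId) === undefined ? 404 : 200;
}

// Root instance A starts the skill conversation; the skill's reply may land on A or B.
function simulate(useSharedStore: boolean): { sameInstance: number; otherInstance: number } {
  const shared = new MapStore();
  const rootA = new ConversationIdFactory(useSharedStore ? shared : new MapStore(), "rootA");
  const rootB = new ConversationIdFactory(useSharedStore ? shared : new MapStore(), "rootB");
  const skillConversationId = rootA.createSkillConversationId("user-conversation-1");
  return {
    sameInstance: handleSkillCallback(rootA, skillConversationId),
    otherInstance: handleSkillCallback(rootB, skillConversationId),
  };
}

const perInstanceResult = simulate(false);
const sharedResult = simulate(true);
console.log(perInstanceResult, sharedResult);
```

With per-instance stores, only the instance that created the id can resolve it. With one shared store, either instance can. In a real deployment the shared store would be something external such as a database rather than an object in one process.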
We're having this exact scaling issue on 4.10. If we try to scale the root bot instances above one, communication made by a skill back to a root bot instance that is NOT the originating root bot produces a 404 (and the skill errors out).
Any luck with finding out whether this is a version issue?
We found this delivery mode option (ExpectReplies), which will tie the call and the response to the same root bot instance, but it seems like it might just be an alternate workaround.
https://docs.microsoft.com/en-us/azure/bot-service/skills-about-skill-consumers?view=azure-bot-service-4.0#using-a-delivery-mode-of-expect-replies
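As we understand it, the difference between the two delivery modes looks roughly like the sketch below. This is a plain TypeScript simulation, not SDK code: callSkill, recordCallback and the Activity/SkillResponse shapes are hypothetical, and only the "expectReplies" value comes from the activity spec. It shows why the reply cannot land on a different root instance in that mode: the replies come back in the response to the original call instead of through a separate callback request.

```typescript
type DeliveryMode = "normal" | "expectReplies";

interface Activity {
  text: string;
  deliveryMode?: DeliveryMode;
}

interface SkillResponse {
  status: number;
  // Filled in only when the caller asked for expectReplies.
  activities: Activity[];
}

// Hypothetical skill: either posts each reply back through `postBack`
// (a separate request that a load balancer may route to any root instance)
// or returns the replies inline in the response to the original call.
function callSkill(request: Activity, postBack: (reply: Activity) => void): SkillResponse {
  const replies: Activity[] = [{ text: `echo: ${request.text}` }];
  if (request.deliveryMode === "expectReplies") {
    return { status: 200, activities: replies };
  }
  replies.forEach(postBack);
  return { status: 200, activities: [] };
}

// Records replies that arrived as separate callback requests.
const callbacks: Activity[] = [];
const recordCallback = (reply: Activity): void => {
  callbacks.push(reply);
};

const normalResponse = callSkill({ text: "hi", deliveryMode: "normal" }, recordCallback);
const inlineResponse = callSkill({ text: "hi", deliveryMode: "expectReplies" }, recordCallback);
console.log(callbacks.length, normalResponse.activities.length, inlineResponse.activities.length);
```

In the normal case the reply only shows up in callbacks, which is the path that can hit the wrong root instance. In the expectReplies case it is carried in the response itself, so the instance that made the call is the one that receives it.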
- voonsionglum, Jan 19, 2022, Brass Contributor
Good to know we are not the only ones having this issue 🙂 We upgraded our root bot and skill dialog bot to use the latest 4.15.0 npm packages. We scaled out both the root and skill dialog bots to 2 instances. Sadly, we still face the same error.
Our plan was to redeploy the 4.15 dialog root bot and dialog skill bot samples and scale out the instances to 2. We have been having some trouble overwriting the existing dialog root bot's web app with the 4.15 sample. I'll update again when we get this resolved and test out scaling.
We were not aware of the delivery mode option. Thank you for bringing that to our attention. A workaround is better than nothing 🙂
- Slacked2737, Jan 19, 2022, Copper Contributor
Yup, misery loves company 🙂
We also upgraded to 4.15.1 but it did not solve the problem (within our own application); we still see skills getting 404s when scaling above 1 root instance. We're going to try and see if server affinity or some kind of root bot shared state (i.e. db) options are even available within the framework, but perhaps delivery mode is the only multi-root option. It's also not yet clear whether ExpectReplies is built into the sender and receiver framework (i.e. handled automatically by the middleware or other libraries) or is something that will have to be manually coded to keep things synchronous.
FYI, a couple more (semi-useful) references about delivery mode:
https://github.com/microsoft/botbuilder-dotnet/pull/5142
https://github.com/microsoft/botbuilder-dotnet/pull/5162
https://github.com/microsoft/botframework-sdk/blob/main/specs/botframework-activity/botframework-activity.md#delivery-mode
- voonsionglum, Jan 19, 2022, Brass Contributor
By server affinity, are you referring to the ARR affinity setting in the app service? If so, we have already tried setting it to ON for both the root bot and the dialog skill bot. It did not have any effect.
We initially thought the problem could be due to the way our bots store the conversation state. We were using memory as the default storage. We thought that because each scaled bot instance refers to its own memory storage, the instances might not have references to the required conversation states, causing replies to get lost. We switched to persistent storage and now store all of the bots' conversation states in CosmosDB. However, that did not have any impact either.
Maybe we are doing it incorrectly. If you could try it out on your end, we would like to compare notes and see if we both get the same results.
- HunaidHanfee-MSFT, Jan 18, 2022, Iron Contributor
Slacked2737 - Hello, did you check by installing the manifest I shared? Also, can you elaborate on the repro steps, to make sure we are not missing anything?
- voonsionglum, Jan 21, 2022, Brass Contributor
HunaidHanfee-MSFT, would it be possible to have access to the actual web apps behind the manifest you have shared? We would like to view the code via Kudu console and examine the scale out settings that have been applied to both the root bot and skill dialog bot.
- Slacked2737, Jan 21, 2022, Copper Contributor
Thank you for the update.
I reviewed most of the botbuilder-dotnet code and came to a few conclusions:
- There does not seem to be much code related to ARR at all. My guess is that cookie affinity is not something the framework uses to pin root-to-skill calls.
- Root to skill using DeliveryMode.ExpectReplies looks like it should work (and it sounds like you may have tried it already; details would be great :-). Check out the code example below; it's a good template for how it works.
https://github.com/microsoft/botbuilder-dotnet/blob/f28cad18948298f30cb7fc4973c143cf08ad7341/tests/Skills/Parent/ParentBot.cs#L77
Also, check out how it's handled in the SendActivitiesAsync call:
https://github.com/microsoft/botbuilder-dotnet/blob/f28cad18948298f30cb7fc4973c143cf08ad7341/libraries/Microsoft.Bot.Builder/TurnContext.cs#L373
This does not explain why the 404 occurs in the first place. It would be nice to get a definitive answer on whether root bot instances are able to share the skill bot responses across instances.