Deleting Actor is not freeing up the Disk space

Published Aug 23 2020 11:42 PM
Microsoft

Issue:

 

With reference to the article https://docs.microsoft.com/hi-in/azure/service-fabric/service-fabric-reliable-actors-delete-actors , after calling DeleteActorMethod and enumerating the actor list, the disk space is not cleaned up.

 

What can we analyze from our end:

 

We can check the size of the partition folder after connecting to the node over RDP. Below is an example of the size of one partition folder (e.g., a16a1c07-1468-4664-bf4f-483436dcbda0) before calling DeleteActorMethod:

 

Pranjal_Gupta_0-1598251195357.jpeg

 

             

Pranjal_Gupta_1-1598251195362.jpeg

 

 

After calling DeleteActorMethod, the disk space remains the same:

 

Pranjal_Gupta_2-1598251195364.jpeg
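
The before/after folder-size check described above can also be scripted instead of read from the folder properties dialog. The following is a minimal sketch, assuming Python is available on the node; the function names are illustrative and are not part of Service Fabric, and the partition folder path has to be supplied by you. It sums file sizes as reported by the file system, which may differ slightly from the "size on disk" value shown in Explorer.

```python
import os
import sys


def folder_size_bytes(path):
    """Sum the sizes of all files under `path`, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                total += os.path.getsize(full_path)
            except OSError:
                # A file may be locked or removed while the replica is running.
                pass
    return total


def to_mb(size_bytes):
    """Convert a byte count to megabytes (1 MB = 1024 * 1024 bytes)."""
    return size_bytes / (1024 * 1024)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the partition folder path as the first argument.
    print(f"{to_mb(folder_size_bytes(sys.argv[1])):.1f} MB")
```

Running it once before and once after the delete call gives two numbers to compare. As the notes below explain, the two values are expected to be the same or very close.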

 

 

Points to Note:

 

  • Actual disk usage depends on numerous factors in the underlying store, so we don’t see an immediate reduction in disk usage right after actor deletion.

 

  • Deleting data does not shrink the physical size of the DB; it only shrinks the logical size (the size of the data) of the DB. However, the freed space is reused when more data is added.

 

For example, imagine the execution of the following operations:

  1. Inserting 10GB of data.
  2. Deleting 7GB of data.
  3. Inserting 3GB of data.

 

The physical size of the DB is still 10GB after all of the above operations, because, as stated above, the physical size does not go down when data is deleted. However, during step 3 the space freed in step 2 is reused instead of allocating additional physical space.
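
The three steps above can be traced with a toy model. This is only an illustrative sketch of the logical-versus-physical distinction, not how the underlying store is implemented; the class and attribute names are made up for the example.

```python
class StoreFile:
    """Toy model of a database file whose physical size never shrinks on delete."""

    def __init__(self):
        self.logical_gb = 0   # size of the live data
        self.physical_gb = 0  # size of the file on disk

    def insert(self, gb):
        self.logical_gb += gb
        # The file grows only when the live data no longer fits in it;
        # otherwise previously freed space is reused.
        self.physical_gb = max(self.physical_gb, self.logical_gb)

    def delete(self, gb):
        # Deleting frees space inside the file; the file itself keeps its size.
        self.logical_gb -= min(gb, self.logical_gb)


store = StoreFile()
store.insert(10)  # step 1: logical 10, physical 10
store.delete(7)   # step 2: logical 3,  physical 10
store.insert(3)   # step 3: logical 6,  physical 10 (freed space reused)
print(store.logical_gb, store.physical_gb)  # prints "6 10"
```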

 

  • If we are interested in bringing the physical size of the db down, we can perform compaction. Proactive shrinking of the db file is not supported because it impacts write latency.

The recommended approach is to test with workloads that use large volumes of real data. Disk space is rarely a bottleneck in general workloads.

 

  • For compacting the partitions, we have added settings under LocalEseStoreSettings:

CompactionThresholdInMB = set to the maximum data size that the customer expects to store plus a delta, e.g. 5 GB + delta

FreePageSizeThresholdInMB = a threshold below which compaction is skipped because the bloat is too small, e.g. 500 MB

CompactionProbabilityInPercent = 5 or 10%

 

These settings make sure that compaction of the partitions happens at an appropriate time to reduce bloating of the db files.

These can be set in settings.xml under the “<ActorName>LocalStoreConfig” section, for example “GameActorServiceLocalStoreConfig”.
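
As a rough illustration of how the three values might be derived and laid out, the sketch below builds a settings section with Python's standard XML library. It relies on several assumptions: the Section/Parameter layout is the usual Settings.xml structure, the parameter names are assumed to match the LocalEseStoreSettings property names listed above, and the 512 MB delta and the helper function name are made up for the example. Verify the names against your Service Fabric version before using them.

```python
import xml.etree.ElementTree as ET


def local_store_section(section_name, max_data_gb, delta_mb,
                        free_page_mb, probability_pct):
    """Build a <Section> element holding the three compaction settings."""
    section = ET.Element("Section", {"Name": section_name})
    params = {
        # Expected maximum data size plus a safety delta, expressed in MB.
        "CompactionThresholdInMB": max_data_gb * 1024 + delta_mb,
        # Compaction is skipped when the bloat is below this size.
        "FreePageSizeThresholdInMB": free_page_mb,
        "CompactionProbabilityInPercent": probability_pct,
    }
    for name, value in params.items():
        ET.SubElement(section, "Parameter", {"Name": name, "Value": str(value)})
    return section


# 5 GB of expected data plus a hypothetical 512 MB delta gives 5632 MB.
section = local_store_section("GameActorServiceLocalStoreConfig",
                              max_data_gb=5, delta_mb=512,
                              free_page_mb=500, probability_pct=10)
if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
    ET.indent(section)
print(ET.tostring(section, encoding="unicode"))
```

The printed element can be pasted into the Settings element of the service's settings.xml; the section name must match the one your actor service actually uses.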

 

  • The minimum db file size with no data is 4 MB. After that, the file grows as data is written. Deletion behaves as described above: generally, the freed file space is reused.

 

  • The current compaction steps are deprecated in favor of a new automatic compaction feature that we are working on as a priority; we will update the Release Notes when it is public.

 

  • A related question arises when a node in the cluster fails for some reason and the cluster automatically moves the service replicas to an available node to maintain availability.

In this scenario, is the disk space reclaimed?

 

The answer is no, because the db file is copied as-is from another node (one that is in the UP state) to the new node.

 

For completeness: the replica folder and files are deleted on the old node once the replica is no longer needed there. The new replica/node gets its db files from the current primary.
