SOLVED

Seedling distribution

Copper Contributor

I am looking for some Excel help to help us organize a tree planting plan for a project we are doing. We have base data that includes farmers' names, the species of seedlings they want to plant and the qty of each species. We need to sort this data into bags of seedlings. Each bag can have a maximum of 25 seedlings and we try to keep the species together, but sometimes two or more species can share the same bag. I have worked out some formulas to assign bag numbers but I am stumbling over the task of allocating the species to the bags.I have attached a sample of what the data looks like and how we are trying to organize it.

If there is a clever formula or function or some VBA that someone could help me with, I will be very grateful. I can usually manage fine with formulas but my VBA skills are copy and paste. We tried sorting this data manually but there are around 1.3 million seedlings and 8000 farmers and planting season begins next week.

19 Replies

@ms_mit I am not sure if I had solved your problem. Please check sheet 2. I have used Pivot.

Hi Jitin, Thanks for the help. Pivot tables are a wonderful thing but I also tried using them and couldn't get it. The problem I am working on is finding a formula (or set, or some code), to parcel these gross amounts (i.e. 80 mahogany for a particular farmer) into individual bags of 25 seedlings, or some limit like that. So in other words, I'd end up with 3 bags of 25 mahogany each, with a remainder of 5 which would go in the next bag of 5 mahogany and 20 teak. Does that make sense?

let me reach home and look at your post again so I can help..I am on the road taxi reading this Hope you have the excel file attached so I can take a logical look

@ms_mit 

Do you want to calculate the number of bags and the number of remaining seedlings? I applied the following formulas to do this:

=SUMIFS($D$6:$D$34,$C$6:$C$34,J4,$A$6:$A$34,I4)

=ROUNDDOWN(K4/$B$3,0)

=MOD(K4,$B$3)

 

Could you communicate the formulas to assign bag numbers?

Hey @OliverScheurich, thanks for helping

My formula for assigning bag numbers is

=COUNTIF($A$6:A6,A6)&"/"&ROUNDUP(SUMIF($I$3:$K$14,A6,$K$3:$K$14)/$B$3,0.1)

But it is just counting instances of the Farmer name. This only works until I have a bag with more than one species in it, then it breaks.

 

Your formulas are, I think, approaching it the other way around. The data I showed is example only. What I am trying to calculate is the lists in columns A:D. I only have the data in the base data columns I:K. Nearly 20,000 rows of it that I have to parcel out into a list like A:D.

I am sure there is a way, it's just really testing my excel skills.

@ms_mit I have created a vba for sheet2. I could not fill BagNo. Please check once.

Hi@Donald_Genes_ I have put a fresh copy of my Excel file here if you want to give it a try. As I mentioned below, My goal is to assemble the list of bags of tree seedlings on the left side of this worksheet from the data included in the 'Base Data' table to the right.I am still stumped about how to enumerate a list of Farmer Names according to the qty of seedlings each one gets and then further assign tree species in the correct number to that list and finally the qty of seedlings of each species that goes in eat bag. It becomes more complicated when there are multiple species in one bag. Any help you can give will be very much appreciated.

Ultimately, I have to run the final list through a mail merge to do a whole lot of labels but that is another subject.

Hi Jitinm, Thanks for that. It calculates the full bags (of 25) and the remainder correctly. Do you have any idea how to enumerate a list with the individual line items? So for example Farmer1 + Mahogany = 3 bags of 35 seedlings each becomes:
Farmer1 | Mahogany | 25
Farmer1 | Mahogany | 25
Farmer1 | Mahogany | 25
.....?
Can you share working in excel for example how it should look. That would help much.

@ms_mit Since you mention to have 1.3 million seedling to distribute to 8000 farmers, you should look into Power Query. Played around a bit with your example and came up with a "solution" that needs refinement. Half of it is in PQ. For the other half I cheated a bit by adding some manual columns in Excel. Should be possible to do those in PQ as well but have the inspiration right now.

Hi@jitinm I have cleaned up the formating a bit to hopefully make it easier to understand. Here is the idea basically. I have the 'Base Data' you see in the table on the right. I have to transform this data to a list like what you see in the 'Results Data' on the left. The transform goes roughly like this: The first line of the Base data is:

Farmer NameTree SpeciesQty of Seedlings
Farmer 1Mahogany75

 

I need to find formulas or VBA to transform this into a list like this:

Farmer NameSpeciesSeedling qty per bagBag No.
Farmer 1Mahogany251/10
Farmer 1Mahogany252/10
Farmer 1Mahogany253/10

 

But then it gets more complicated. Some species are not divisible by 25 (max seedlings per bag) so there is a remainder. This means we have to group one of more species of seedlings into a single bag, so it means one bag has more than one row of data (but it still has the same bag number). Does that make sense? Let me know if you have another idea. The data could also be arranged with the multi species bags in more than one column. At the end, this results data will be used to print labels for all the bags so workers can go and fill the bags with the correct seedlings

Hey Riny, That is exactly what I need. Thank you!!! Helper columns work fine for me. This is a thing I need immediately so, if it works, it's good. I can figure out the bits with the helper columns. Can you please point me toward where to open your bits of PQ? I need to study how this works.
Thanks again!
Understood, there must be some additional column in your base data which specifies which one has single bag or multiple bag. I mean the vba I shared added multiple columns and I used some logic to distribute the qty as per 25 rule( I hope my vba solved your first/earlier query?). So to create additional logics there must be some input in your base data to do.

Hi@Riny_van_Eekelen I had a look at your PQ setup and I have to admit that for now it is over my head but I will come back to it and learn it. In the mean time, I wonder If I could ask you if you have any ideas for an alternate solution. I am using this data as the basis for printing labels for all the bags of seedlings and it would be more workable to put the species in the shared bags in a single row, rather than more rows of the same bag number.  I have attached a file with the structure I am talking about. If it you have a quick idea how to do this, I would very much appreciate it.

@ms_mit Well, if you are completely unfamiliar with PQ, this is not very straight-forward. The file I uploaded has quite some "non-stadard" coding, and is not really a finished product. Not sure if you are going to master it if you are in a hurry. What you could try is take the attached file and dump your real data in the structured table called "Table1" with exactly the same column names. Then press Refresh on the Data ribbon and see what happens.

 

More about the PQ basic here:

https://exceloffthegrid.com/power-query-introduction/ 

 

 

 

@Riny_van_Eekelen 

Thanks Riny, This gets me on the road to the solution. I have used PQ in the past for some simple aggregation of data from multiple worksheets but nothing like the transforms you are doing here. Admittedly, I don't understand the queries you created, especially the steps in the Append1 and what the GRT function does but I will eventually figure it out.

I did run this query with a full dataset. Takes about 15 minutes to run even with a relatively powerful computer. But it works. Now I just have to figure out how to transpose the species and seedling count for the mixed bags.

Thanks also for the link to the PQ tutorial. It looks like a good one.

best response confirmed by allyreckerman (Microsoft)
Solution

@ms_mit Great you could get it to work. And what's 15 minutes for a job that would otherwise take perhaps a whole day of copying and pasting. Now that I look back at the query, I see I was over-thinking the matter by adding the Grouped Running Total. We can do without and it will probably speed up the process. See attached.

 

 

 

 

@Riny_van_EekelenYes it works great.For some reason it is filling a lot of seemingly redundant rows with empty Qtys but this is easily filtered out.

For sure it would have been more than just one day of cutting and pasting to assemble this list. The finished list has around 70,000 rows. It has also showed me that we have considerable non-unique Farmer Names. But this is also easily fixed.

Thanks again for your help with this Riny.

@OliverScheurichFor some strange reason, I received your latest solution by email notification but it is not shown in the comments of this thread. And no sample file is here to download. Anyway I did paste in your "Seedling" code and the formula for doing the bag numbers. I found that it works for all the seedling quantities that are 25 or more. Any quantities below 25, it ignores....which also throws off the bag number formula. If you have a solution for that, I would really appreciate it.

 

Another strange thing is that When I run this on an almost full data set (31,000 rows) the code throws an overflow error on the line "z = z + 1". Alternately, I I run the full data set (33,764 rows), after changing the line "For i = 4 To 33764", it immediately gives an overflow error. I guess I could break the data set up and run separate smaller sets but it would be interesting to know why this happens.

 

I have attached a file with some actual data in it, so you can see what happens with the problem of ignoring the <25 quantities.

1 best response

Accepted Solutions
best response confirmed by allyreckerman (Microsoft)
Solution

@ms_mit Great you could get it to work. And what's 15 minutes for a job that would otherwise take perhaps a whole day of copying and pasting. Now that I look back at the query, I see I was over-thinking the matter by adding the Grouped Running Total. We can do without and it will probably speed up the process. See attached.

 

 

 

 

View solution in original post