Dec 25 2021 08:24 AM
Merry Xmas to you all!
I am using the time to get a long-standing wish fixed, but with no luck at all. Excel is not my "world" :)
My problem is this:
Every month I get a CSV from one of our "SaaS Services". I now need to import it into a blank workbook (or a template, if that would help) and do the following:
- take all email addresses ending in @XXX.ch and put them on the tab XXX.ch
- take all email addresses ending in @XXX.de and put them on the tab XXX.de
- take all email addresses ending in @XXX.com and put them on the tab XXX.com
These are just 3 examples. I have to do this for a lot of domains, every month.
Again, I have no idea how to use Excel for more than my expense report :)
Can someone help me here?
Dec 25 2021 10:41 PM
@PaddyB Connect to the CSV file with Power Query (PQ), do some transformations, sort, and load the result back to Excel. Sounds easy! And it is, though you need to learn the basics of PQ first. The link below would be a good starting point.
Dec 25 2021 11:06 PM - edited Dec 25 2021 11:08 PM
You might need a macro to create the worksheets based on the unique domain names, and then another macro to copy the relevant data for each domain. You might need to separate the domain name from the email addresses first. Something like this will help:
Once you separate the names, a macro can do the rest.
Side note: If I were to do the work, I would not create separate sheets for each domain. I would lump them in one formatted table and filter them to display only those I need to work on.
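For readers who find code easier to follow than worksheet functions, here is a minimal Python sketch of the "separate the domain name from the email address" step. It is not the formula from this post, only an illustration of the logic; the function name `email_domain` is made up, and lower-casing the result is my own assumption so that XXX.ch and xxx.ch end up as the same domain.

```python
def email_domain(address: str) -> str:
    """Return the lower-cased text after the last '@', or '' if there is no '@'."""
    # rpartition splits on the LAST '@'; sep is '' when no '@' was found.
    _, sep, domain = address.strip().rpartition("@")
    return domain.lower() if sep else ""


if __name__ == "__main__":
    for sample in ["anna@XXX.ch", " bob@xxx.de ", "not-an-address"]:
        print(repr(sample), "->", repr(email_domain(sample)))
```

Once every address has its domain next to it, grouping or filtering by that value is straightforward, whichever tool does it.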
Dec 26 2021 05:59 AM
Dec 26 2021 10:24 AM
> find a solution what more or less is working
I have combined a few things into something attached that could be suitable for you.
Column A holds the email addresses and F2 visualizes the different email domains from A2 and down.
Sheet2 and Sheet3 hold the email addresses from two different domains.
Similar to @NowshadAhmed's suggestion, each of those sheets shows the mail addresses related to the chosen domain.
The selection list shows domains not already used on another sheet.
Since xxx.com is still selectable, there is no sheet for that domain yet. Copy Sheet2 (right-click the tab of Sheet2: Move or Copy: [x] Create a copy and (move to end)) and select xxx.com in A1.
Sheet PowerPage has a list of the workbook's sheets in column A. Each entry is also linked to its sheet.
If you'd like to know more about how the automatic sheet list works, you may read about version 4 macros at https://exceloffthegrid.com/using-excel-4-macro-functions/.
Column B shows the content from cell A1 in each sheet using the function INDIRECT.
F2 lists the unique domains using
=LET( data; OFFSET(Input!A:A;1;0;COUNTA(Input!A:A)-1;1); output; UNIQUE(MID(data;FIND("@";data)+1;4711)); output )
H2 lists the domains that do not yet have a sheet of their own;
Finally, cell A1 on each domain sheet has a data validation list that suggests any missing domain, as listed in H2 and its spill range (H2#).
Using F2 instead would be like what NowshadAhmed suggested.
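To make the logic of F2 and H2 easier to follow, here is a rough Python sketch of the same two ideas: list each domain once, then list the domains that no sheet uses yet. It is not a translation of the workbook. The function names are made up, and skipping addresses without an "@" and lower-casing the domains are my own assumptions.

```python
def unique_domains(addresses):
    """Domains in order of first appearance, each listed once."""
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for address in addresses:
        at = address.find("@")  # first '@', like FIND("@"; data)
        if at >= 0:
            seen.setdefault(address[at + 1:].strip().lower(), None)
    return list(seen)


def domains_without_sheet(domains, sheet_domains):
    """Domains that are not yet used in cell A1 of any domain sheet."""
    used = {d.lower() for d in sheet_domains}
    return [d for d in domains if d not in used]


if __name__ == "__main__":
    column_a = ["a@xxx.ch", "b@xxx.de", "c@xxx.ch", "d@xxx.com"]
    all_domains = unique_domains(column_a)
    print(all_domains)
    print(domains_without_sheet(all_domains, ["xxx.ch", "xxx.de"]))
```

With the sample data, the second list contains only xxx.com, which matches the situation in the attached file where xxx.com is still selectable.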
The attached file contains version 4 macros and has to be saved as .xlsm.
To activate them, it now seems to be necessary to include a VBA macro so that Excel asks whether to enable macros.
To get the sheet list updated, you may have to save, close and reopen the file (with macros enabled without being asked).
The VBA macro (which triggers the yellow enable bar) also serves as a shortcut to activate the index sheet/first sheet.
Sub activateSheet1()
'
' Keyboard Shortcut: Ctrl+Shift+A
'
    ThisWorkbook.Activate
    Sheets(1).Select
End Sub
Finally, using Data: From Text/CSV to connect to the CSV file could be worthwhile as a first step into Power Query. You will get the CSV data into a new sheet, and you just have to cut that new table and paste it at cell Input!A2 to have it all update quite automatically with Ctrl+Alt+F5.
Dec 27 2021 04:18 PM
I would go with what Mr Van_Eekelen suggested, because you set up Power Query once and it repeats the process when you press the Refresh All button. If you set up a query by folder, all you have to do is drop the new file into the folder, open the PQ workbook and press Ctrl+Alt+F5, and it will update the workbook with the new data.
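The "drop the new file into a folder and refresh" idea can be sketched outside Excel as well. The Python below is only an illustration and not how Power Query works internally. It assumes the CSV has a header row and the addresses in the first column, and it picks only the most recently modified file, which is a simplification. The function names are made up.

```python
import csv
from pathlib import Path


def newest_csv(folder):
    """Return the most recently modified *.csv in the folder, or None."""
    files = sorted(Path(folder).glob("*.csv"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None


def read_addresses(csv_path, column=0):
    """Read one column of a CSV, skipping the header row and empty cells."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    return [row[column] for row in rows[1:] if len(row) > column and row[column]]
```

Calling `read_addresses(newest_csv("some_folder"))` each month would then play the role of the refresh, with `some_folder` standing in for wherever the exports are saved.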
Dec 29 2021 12:07 AM
@Yea_So IMHO Power Query (PQ) is a nice alternative when connecting to slowly changing data, since it requires an explicit refresh in Excel.
PQ also needs a completely different set of skills compared to worksheets, but it may pay off by providing a bunch of nice features like unpivot.
To get the transformed data, in this case email addresses sliced and separated by domain name onto different sheets, PQ needs a separate query for each domain. They can all be based on the same connection to the source file, but they still need to be split into one query for XXX.ch, another for XXX.de and a third for XXX.com, since the wish is to have the results load to different sheets.
With PQ alone, you have to prepare a sheet for each domain and manually add a new query and a new sheet whenever a new domain appears.
I do not see any smooth way around that except using a macro to loop through the month's domains and set up a sheet for each one found. And macros do not seem to be an option for this user.
Loading the CSV file via the Data menu command From Text/CSV and then using a worksheet formula like @NowshadAhmed's ought to be a pretty good solution, as described above (Dec 26 2021 10:24 AM).
Dec 29 2021 06:08 AM
Yes, it requires a one-time query setup for each domain/sheet, but after that the task becomes automatic: pressing Ctrl+Alt+F5 refreshes all queries in the workbook. So the benefit more than justifies the initial setup, whereas with VBA the user might not be familiar with it, which makes maintenance more complicated. As you have mentioned:
"To get the transformed data, in this case email addresses sliced and separated by domain name onto different sheets, PQ needs a separate query for each domain. They can all be based on the same connection to the source file, but they still need to be split into one query for XXX.ch, another for XXX.de and a third for XXX.com, since the wish is to have the results load to different sheets.
With PQ alone, you have to prepare a sheet for each domain and manually add a new query and a new sheet whenever a new domain appears."
If the user is not familiar with VBA, how would the user maintain the code and add the VBA statements for the new domains?
Dec 29 2021 07:16 AM
Dec 29 2021 07:59 AM - edited Dec 29 2021 08:09 AM
How is creating one Power Query against a CSV file, and then creating a reference query that filters for each domain sheet, over-engineering?
Come on, now spell out the steps for updating each domain sheet every time there is a CSV update.
With Power Query by folder:
1. Set up each domain sheet once (the initial setup).
2. After that, all the OP has to do is drop the new CSV file into the queried folder,
3. then press Ctrl+Alt+F5, which updates the main query and all the reference queries.
4. If there are new domain sheets to set up, just create new reference queries for the new domains, filter for the appropriate domain within each reference query, then Close & Load to a new sheet and save the workbook. The next time there is another CSV update, all the OP has to do is steps 2 and 3.
Now spell out the procedure for updating each domain sheet using a formatted structured table, and let the OP decide which solution they would prefer to use.
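To illustrate what "one main query plus one filtered reference query per domain" amounts to, here is a small Python sketch that splits one address list into one group per domain. It is only an analogy, not Power Query code. The function name is made up, and dropping malformed addresses is my own assumption.

```python
from collections import defaultdict


def split_by_domain(addresses):
    """Group addresses by lower-cased domain; each group is like one domain sheet."""
    groups = defaultdict(list)
    for address in addresses:
        cleaned = address.strip()
        local, sep, domain = cleaned.rpartition("@")
        if sep and local and domain:  # skip entries without a usable '@'
            groups[domain.lower()].append(cleaned)
    return dict(groups)


if __name__ == "__main__":
    main_list = ["a@xxx.ch", "b@xxx.de", "c@XXX.ch", "d@xxx.com"]
    for domain, members in split_by_domain(main_list).items():
        print(domain, members)
```

The difference from the Power Query setup is that a new domain shows up here as a new group without anyone adding it by hand, which is the maintenance question this thread is debating.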
Dec 29 2021 10:44 PM
I interpret 'lot of domains' as between 4 and 100 and assume that quite a few of them are new from month to month.
As described 26 Dec 2021 10:24 AM above, the attached file calculates the domains in the input addresses. Column A may be pasted or loaded via Data: (Get & Transform): From Text/CSV.
__/ Sheet3 has one of the domains entered into A1 and uses FILTER to fetch the addresses in that domain.
__/ The first sheets;
Click the button to generate the missing sheets.
__/ The macro will generate sheets for each missing domain as follows:
Sub domains2sheets()
    While Not IsError(Range("domains_without_sheet").Value)
        Sheets(Sheets.Count).Select
        If Not Range("a1").Value Like "?*.?*" Then
            MsgBox "Last sheet seems to not be a valid template?"
            Exit Sub
        Else
            Sheets(Sheets.Count).Copy After:=Sheets(Sheets.Count)
            Sheets(Sheets.Count).Name = "Sheet" & Sheets.Count
            Range("a1").Value = Range("domains_without_sheet").Value
        End If
    Wend
    Sheets(1).Select
End Sub
The defined name domains_without_sheet refers to =PowerPage!$H$2
Row 9 could just as well be
Sheets(Sheets.Count).Name = Range("domains_without_sheet").Value
but would be misleading if cell A1 is changed.
Since it's just once a month, I think a paste and a click could be automatic enough.
It is also possible to select Sheet4, Shift+click the last sheet and delete them, so that only the current month's domains remain.
For this purpose, or rather to get an overview of the number of items in each domain, I also added a counter in PowerPage!D:D, minus one to exclude the sheet's title.
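The per-domain counter can be pictured with this short Python sketch. It counts addresses straight from the input list rather than from the sheets, so there is no title cell to subtract. That difference, the function name and the lower-casing are my own assumptions for illustration.

```python
from collections import Counter


def count_per_domain(addresses):
    """Number of addresses per lower-cased domain; entries without '@' are ignored."""
    return Counter(
        address.strip().rpartition("@")[2].lower()
        for address in addresses
        if "@" in address
    )


if __name__ == "__main__":
    counts = count_per_domain(["a@xxx.ch", "b@xxx.de", "c@xxx.ch"])
    for domain, number in counts.items():
        print(domain, number)
```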
Dec 30 2021 04:25 AM - edited Dec 30 2021 04:32 AM
You might want to look into Merge Queries with the join kind Left Anti or Right Anti, depending on which list is selected first in the dialog box, to isolate the domains in the current CSV list that do not match the current master list and create a new-domains list from them.
If the new CSV Get & Transform list is selected as the first list, use Left Anti (rows only in the first list) to get the new-domains list.
If the new CSV Get & Transform list is selected as the second list, use Right Anti (rows only in the second list) to get the new-domains list.
After getting the new-domains list, use Append Queries to add it to the master list.
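In plain terms, the anti join keeps the rows of one list that have no match in the other, and the append then adds them to the master list. A minimal Python sketch of those two steps, with made-up function names and sample domains, could look like this. It is an analogy only, not Power Query code.

```python
def new_domains(csv_domains, master_domains):
    """Anti join: domains in the CSV list that have no match in the master list."""
    known = set(master_domains)
    fresh = []
    for domain in csv_domains:
        if domain not in known:
            known.add(domain)  # also avoids listing a new domain twice
            fresh.append(domain)
    return fresh


def append_to_master(master_domains, fresh):
    """Append: the master list followed by the newly found domains."""
    return list(master_domains) + list(fresh)


if __name__ == "__main__":
    master = ["xxx.ch", "xxx.de"]
    this_month = ["xxx.de", "xxx.com", "xxx.ch", "xxx.com"]
    fresh = new_domains(this_month, master)
    print(fresh)
    print(append_to_master(master, fresh))
```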
Dec 30 2021 11:26 AM
Dec 30 2021 06:09 PM - edited Dec 30 2021 06:15 PM
Personally, I tend not to separate data sets unless there are certain processes that need specific data.
Maybe the OP needs to give some context on which process needs a separate data set.
The best way to separate a data set without using PQ is the FILTER function, applied for the specific process that needs that part of the data.