Forum Discussion
Excel is adding extra digits behind the scenes for a report
dailyjm wrote: ``it did not wipe out all decimals, only some.``
I cannot imagine any reasonable way that could happen. (An example of an "unreasonable" way is an event macro that changes the data. Not likely!)
If you can attach a csv/txt file and resulting Excel file that can be used to demonstrate the problem, I could take a look at it.
(We only need the product IDs in the csv/txt and Excel files.)
Otherwise, the only thing that I can suggest is for you to use formulas of the form =ISTEXT(A1) to confirm that all of the product IDs are, in fact, text after you import the csv/txt file in the manner that I described.
But even if some of the product IDs are not text, I cannot imagine how only some of their "decimal fraction" could be dropped in the process.
Final XLSX outcome after following your instructions above: https://1drv.ms/x/s!AnS3coMvk-0jgbFQ-00Y5zb-g43C_g?e=E18QYr
Original File received from business :
https://1drv.ms/x/s!AnS3coMvk-0jgbFPhtSGvWrowBVBEg?e=qKgG9b
XLSX that I opened and resaved as a CSV file
https://1drv.ms/u/s!AnS3coMvk-0jgbFRi3S14IUQ_6EOww?e=8BAaMD
- JoeUser2004Apr 19, 2022Bronze Contributor
dailyjm wrote: ``I tried your method which did work for all of those with the extra digits but now it has cut off trailing zeros that were needed. For instance 3140339.020 has become 3140339.2 [sic; errata: 3140339.02] and 31300071.000 has become 31300071``
You are right: my work-around is flawed in that way.
But how do you know the project ID should be 3140339.020?
My point is: if you have access to the correct data, you should use that as your "raw data".
Once you permit text like 3140339.020 to be stored as a number in Excel, it is impossible to know whether 3140339.02 should be 3140339.020 -- or 3140339.02000, for that matter (unless you know that the data is limited 3 decimal places at most).
The point is: the information is already lost.
An equivalent analogy is: suppose you use a word processor that autocorrects what it perceives to be a typo. You purposely type the sentence "is was weird that the user typed wierd", but the autocorrected result is "is was weird that the user typed weird".
It is now impossible to know how the sentence was typed initially.
Bottom line: The only way to prevent the project IDs from being changed is to enter them as text into Excel. Since Excel provides a way to do that, the problem is with your process, not with Excel.
- dailyjmApr 15, 2022Copper Contributor
Apologies for sending the complete set of records and not just the ones that had issues. That was an oversight on my part, I was trying to fix other issues at work between sending the files over so was in a bit of a hurry. I thoroughly appreciate all of your help and time that you have devoted to this.
For the .xls files, I am only provided those from my colleague as xls files. I had then opened the file and saved as a csv and then pulled into excel using the text import option for those. Unfortunately our company uses an archaic version of Oracle right now. The reports are coming from another application that is fed the data from Oracle and just to get those reports she had been working with our IT department for months and months (sadly). If I did request csvs it would probably take another year to get those. I believe the data is an issue from those reports because another colleague mentioned he had the same issue with separate reports that he had pulled from Oracle itself.
I tried your method which did work for all of those with the extra digits but now it has cut off trailing zeros that were needed. For instance 3140339.020 has become 3140339.2 and 31300071.000 has become 31300071
I'm attempting to create a new column in BI based off of a little logic to add the extra zeros back in and combine the columns to create a new project number but it is still not matching up exactly for some reason.
But at least I'm down from 7000 entries being wrong to 4000 so that is progress! I might just tell the team that what they're doing can't be done because of the data issue.. but I do not like to give up easily and admit I can't find a solution.
- JoeUser2004Apr 14, 2022Bronze Contributor
I'm afraid those attachments are not very useful to me.
First, you provide files with more than 50,000 rows of data. I have no idea which rows create problems for you. At a minimum, you could tell me which rows to look at.
Second, all of the files that you provide are xlsx files. I thought the problem arises when you take a csv or txt file created by Oracle, and you open or import the csv file into Excel.
If I understand your process correctly, I would need the initial csv file and the resulting Excel file.
But perhaps I misunderstand the process. Perhaps you start with the "orig" xlsx file from Oracle, save it as csv, then (following my instructions) import the csv file into an xlsx file.
All of your problems might be remedied if you can coerce Oracle to save to a csv file directly, not to an xlsx file that you save as csv.
-----
That said, here is one observation. I cannot say for certain whether or not it is relevant to your problem, for the reasons noted above. But it might give you some insight.
Note: I will suggest a work-around after I explain the problem that I observe.
Looking at the "orig" file, consider two examples: A7 and A65.
Both cells are formatted as Text, but the date type of the cell values is numeric. Note the difference between cell format and cell value data type.
This can happen when we enter a numeric value into a cell that is initially formatted as General, then the cell format is changed to Text. Formatting alone does not change the cell value data type. So, it remains type numeric.
(Aside.... It seems incredulous that Oracle per se would do that. But let's move on....)
More importantly, A7 displays only 30900921, but its actual value is 30900921.001 in the Formula Bar. Note the difference between displayed value and actual value.
In contrast, for A65, both the displayed and actual values are 31401967.01.
If we save the "orig" file as csv, the displayed values, not the actual values, are usually written into the csv file.
Consequently, for A7, the csv file contains the value 30900921; precision is lost.
But for A65, the csv file contains the value 31401967.01; precision is not lost.
The difference in displayed values is coincidental, due to the rules for the General format. I am not aware of any explanation of those rules (sigh). But obviously, they depend on the precision of the decimal fraction. They also depend on the number of significant digits.
-----
Suggested work-arounds....
The best work-around is to coerce Oracle to save as csv directly. Presumably, the results will be consistent, with no loss in precision.
Alternatively, do the following.
1. Open the "orig" xlsx file in Excel.
2. Create a new worksheet.
3. In the new worksheet, enter A4:A50116 in the Name Box (upper-left) to select the range.
4. Type (do not press Enter) the formula ='Source Data'!a4 & "" , then press ctrl+Enter (not ctrl+shift+Enter) to copy the formula down the column.
5. Save the new worksheet as csv.
Now you can follow my previous instructions to import, not open, the saved csv file into an xlsx file.
Recall that in the third dialog box of the Import Text Wizard, we must select the first column of the imported csv file, and select Text for the column format.
That step is still necessary to prevent Excel from interpreting the text as numeric whenever possible.