Search for words in your images in Office 365

Naomi Moneypenny · ‎Dec 13 2017

Unlock content inside of images easily with this new search capability in Office 365.

Earlier this year, we rolled out automatic detection of images that are uploaded to SharePoint and OneDrive. This intelligence identifies whether an image is a whiteboard, a receipt, outdoors, a business card, an X-ray and many other types. You can then search for ‘whiteboard’ and you’ll see all the whiteboard photos you’ve captured and uploaded.

Now, as we announced at Ignite, any printed words in an image are automatically detected, extracted and made searchable. When you upload an image, the location data from the photograph (if available, such as Oslo, Norway) and any text that computer vision technology identifies and extracts automatically become searchable. You can search in SharePoint, OneDrive or Office.com to find your captures.

Use visual content intelligence to simplify your work life

Many people complete expense reports for travel. While at a restaurant, snap a photo of the receipt. You can do this directly from the OneDrive mobile app, Office Lens mobile app, or just upload a photo you’ve taken with your device. Later on, when you go to file your expenses, you don’t have to remember where you stored it, but instead can search for something that you remember about the expense, for example ‘sushi’ or a location.
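For readers who would rather script this kind of lookup than use the search box, here is a minimal sketch of how the ‘sushi’ search might be expressed as a Microsoft Graph search request. Using the Graph search endpoint for this is an assumption, not something this post describes, and whether text extracted from images is returned this way in your tenant is also an assumption. The helper name build_search_request is hypothetical. The sketch only builds the request body; sending it would require an authenticated POST with a bearer token.

```python
import json

# Microsoft Graph search endpoint (assumption: image text is reachable through it).
GRAPH_SEARCH_URL = "https://graph.microsoft.com/v1.0/search/query"


def build_search_request(term: str, size: int = 25) -> dict:
    """Build a Graph search request body for files matching `term`.

    Hypothetical helper for illustration; it does not send anything.
    """
    return {
        "requests": [
            {
                # driveItem covers files stored in OneDrive and SharePoint.
                "entityTypes": ["driveItem"],
                "query": {"queryString": term},
                "from": 0,
                "size": size,
            }
        ]
    }


if __name__ == "__main__":
    # Print the JSON body that would be POSTed to GRAPH_SEARCH_URL.
    print(json.dumps(build_search_request("sushi"), indent=2))
```

The same body works for a location term such as ‘Oslo’; only the queryString value changes.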

We’re excited to bring you this new capability and would love to hear how you use it and what ideas you have to make the service better. Let us know in the comments, or submit new ideas to onedrive.uservoice.com.

Frequently asked questions

What kinds of images can be made searchable?

UPDATED 4/5/21

We create previews and thumbnails for the types listed below:

"bmp", "png", "jpeg", "jpg", "gif", "raw", and also "arw", "cr2", "crw", "erf", "mef", "mrw", "nef", "nrw", "orf", "pef", "rw2", "rw1", "sr2".

We only tag and run OCR on items containing text for files of type: "jpg", "jpeg", "png".
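As a quick way to check a file against the two lists above, here is a minimal sketch. The extension sets are copied from this answer; the helper name image_support is hypothetical and is not part of any Office 365 API.

```python
# Extensions copied from the FAQ answer above (as updated 4/5/21).
PREVIEW_TYPES = {
    "bmp", "png", "jpeg", "jpg", "gif", "raw", "arw", "cr2", "crw", "erf",
    "mef", "mrw", "nef", "nrw", "orf", "pef", "rw2", "rw1", "sr2",
}
OCR_TYPES = {"jpg", "jpeg", "png"}


def image_support(filename: str) -> dict:
    """Report whether a file name's extension is on the preview and OCR lists.

    Hypothetical helper: it only inspects the extension, not the file content.
    """
    # Take the text after the last dot, case-insensitively; no dot means no extension.
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return {"preview": ext in PREVIEW_TYPES, "ocr": ext in OCR_TYPES}


if __name__ == "__main__":
    for name in ("receipt.JPG", "landscape.nef", "scan.pdf"):
        print(name, image_support(name))
```

Under these lists, a camera raw file such as a .nef gets a thumbnail but its text is not extracted, and a PDF is on neither list.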

There’s a great range with 21 different file formats including common ones such as "bmp", "png", "jpeg", "jpg", "gif", "tif", "tiff", "raw", and also "arw", "cr2", "crw", "erf", "mef", "mrw", "nef", "nrw", "orf", "pef", "rw2", "rw1", "sr2".

What languages are supported?

Text extracted from an image is in the language captured from the image and is searchable in that language.

Detection of the image type is currently English-only: for example, ‘receipt’, ‘business card’ or ‘whiteboard’. In the future, we’ll automatically look at the language set on the SharePoint site that the image was uploaded to and translate the type into that language. In the case of OneDrive, we’ll translate it to the language you have set in your preferences.

What other features do you have planned?

We really want to connect your captures to workflows. The goal is to look at what the object is and take action based on it, via Flow or PowerApps, so we can help you move your work forward. We also will learn from patterns you have with types of objects – personalized learning, as part of the Microsoft Graph, to suggest actions and perform them automatically for you after the pattern is established.

Happy finding!

Naomi

Allan With Sørensen · ‎Dec 14 2017

Sounds good! So will it become possible to upload images automatically from the OneDrive mobile app to OneDrive for Business? Will the Windows Photos app support images stored in OneDrive for Business? And so on.

And yes, I did vote for it on UserVoice, feedback hub and so on :)

Martin Laplante · ‎Dec 14 2017

"we’ll automatically look at the language set on the SharePoint site"
That may not be a useful indicator of the desired language of the image type. The current language of the user is closer, like in OneDrive. Why do the types need to be translated, why not let the MUI do it?

Having the text itself extracted using language detection is good, but your receipts might be in a variety of languages, so you'll have to remember to also search for kaffe or kahvi.

Timothy · ‎Dec 14 2017

Any support for on-premise installations?

Benjamin Haynes · ‎Dec 15 2017

Not sure what the rollout situation is from this article. Is this an expansion on the announcement from Ignite, in Targeted Release or at General Availability?

Kuba Borkowski · ‎Dec 18 2017

Benjamin Haynes - the rollout is stated in the blogs.office.com article - see below:

Text in image search is currently rolling out to Office 365 commercial subscribers and will be available worldwide by the end of 2017.

Alex Fernandez de Jauregui · ‎Dec 18 2017

In the past, if a PDF was uploaded as an image (scanned document), SharePoint did not OCR the PDF document and the text was not searchable. With this implementation, will PDF's be searchable, if scanned as an image?

Mehmet Demirörs · ‎Dec 20 2017

You mention 21 different file formats but unfortunately not PDF, is this going to be implemented as well or are PDFs excluded from this search feature at all?

Many of our users do not understand the difference between searchable OCR PDFs and non OCR PDFs, therefore they are mostly disappointed if search does not show all expected files.

In a global organization where you have different scanners with and without OCR, it is a nightmare to explain to people that it is the document, not the search, that is the reason they won’t find non-OCR PDFs in the search results.

Mark Gibbs · ‎Dec 26 2017

I really don't care too much about pictures, but PDFs are huge; we now have to go through and fix all the PDFs that were uploaded without being OCR'd first.

Kaushal Khamar · ‎Jan 01 2018

Hello,

I am not able to use this functionality. Is there any pre-configuration required to use it? I have uploaded my business card to an asset library in SharePoint and am not able to search using any word from my business card. I have also uploaded the business card to OneDrive, and the same thing happens there.

Christopher Gilbert · ‎Jan 02 2018

Has there been any response on this with regards to non-OCR PDFs, images of scanned documents within a PDF, being processed and becoming text-searchable?

Lewis Eigen · ‎Jan 03 2018

It looks like this would be very helpful to me.

But I could not figure out from your article how I activate the capability. Is it a new app? An addition to an existing Office program?

Please advise.

Marnix Van de Kauter · ‎Jan 08 2018

This photo intelligence feature doesn't work with my OneDrive for Business account. Is it already implemented? I took a JPEG photo with really clear text in it and uploaded it to my OneDrive folder, but afterwards the search function in OneDrive couldn't find my photo.

Oliver Sahlmann · ‎Jan 17 2018

Where is the data analyzed and stored for OCR? Is it in the tenant?

Mehmet Demirörs · ‎Feb 09 2018

Any Microsoft representatives here?

Unfortunately there are more questions than answers in this blog right now.

Is this search feature also available for PDF files, or is it at least on Microsoft's development roadmap?

How can we find out if this Feature is already rolled out on our tenant?

Best Regards,

Mehmet

Aleksandra Żurawska · ‎Feb 19 2018

Agree with Mehmet.

The article is confusing rather than informing. It is mid-February and the feature is not working.

Would be good if anyone would have answered here...

Dean Gross · ‎Mar 15 2018

@Naomi Moneypenny could you get someone to answer these questions

@Michael Holste

Michael Hunsberger · ‎Mar 22 2018

It would be useful to know how this will work with an organization's compliance rules. For example, if a user uploads a bunch of photos that turn out to have PII (Social Security numbers, credit card numbers, etc.) in them, will that get flagged in some way? Will the user be notified? Will an admin be notified?

Mehmet Demirörs · ‎Mar 22 2018

@Naomi Moneypenny

@anyone from Microsoft who takes care for customers

What is the intention of your post? Why has this been added to this panel?

You announced a new feature in a tech community panel, but you are neither here to discuss it with us nor answering any of our questions - this is very frustrating.

Happy to get an answer at least on this question.

Regards Mehmet

Naomi Moneypenny · ‎Mar 22 2018

Hello everyone, I've been trying to gather some answers to your questions. This feature was completely rolled out at the end of last year. PDF files are generated by many different applications, which has consequences for how those documents are made searchable. Even though, to an end user, a PDF appears to be one format, how the PDF is created makes a big difference in how to make it searchable. In SharePoint there is already a search function that makes many types of PDFs searchable. There are currently no plans to extend the work of the image recognition team to PDFs imminently. Engineering is aware that this is a concern, but there are many nuances to making this cover every situation. The data extracted is processed and lives wherever the data is stored, which includes geo support for data sovereignty. Hope this helps.

Lewis Eigen · ‎Mar 22 2018

The previous Microsoft announcement stated that there was now a way in which we could take an image that contained text and produce the text in digital form.

Is this a capability that exists today?

If so, where and how do we find out how to utilize it?

Thank you.

Dean Gross · ‎Mar 23 2018

@Lewis Eigen one way to do that is within OneNote, see https://support.office.com/en-us/article/copy-text-from-pictures-and-file-printouts-using-ocr-in-one...

Lewis Eigen · ‎Mar 23 2018

Thanks much. This is a great capability. Publicize it more.

Frank Saia · ‎May 07 2018

Is the Intelligent search for images in OneDrive also available for OneDrive for Business, if not do you know when it will be? Is this feature on the roadmap for business users?

Thank you

Joost Koopmans · ‎May 24 2018

This feature is still not working for us. Is it working for others?

Lewis Eigen · ‎May 24 2018

If what you want to do is take a graphic file (a photo for example) where there is text on a sign or a screen shot of text or any graphic that contains text and extract the text from the photo, this works very well and we use it all the time.

The instructions are a little vague however.

You MUST use OneNote for this function, however. At the moment it does not work in any other Office Program.

Once you realize that you have to use OneNote, the rest is easy.

Copy the graphic from anywhere -- any application -- with Control C.
Go to a page in OneNote
Paste the graphic into that page using Control V
Right Click on the Graphic You have just pasted
Click on the Command "Copy Text from Picture"
(This extracts the text from the photo and places it in the Clipboard)
Go to any application you wish and Control V will paste the text from the picture (now in the clipboard) into the application where you want it.

John Huschka · ‎Aug 24 2018

So, after research, digging yet further, I have an escalated support ticket at Microsoft (#10638094, unsolved) and there are conversations at https://techcommunity.microsoft.com/t5/Intelligent-Search-Discovery/Search-for-words-in-your-images-..., https://techcommunity.microsoft.com/t5/Microsoft-SharePoint-Blog/Enrich-your-SharePoint-Content-with..., and https://stackoverflow.com/questions/51934105/does-office-365-image-search-work-if-so-how/51999323#51.... I have yet to hear of this functionality working for anyone. I will keep digging, and I will certainly post if I hear anything. J

Christopher Gilbert · ‎Aug 24 2018

I think I have seen this work in SharePoint Online in a limited fashion in the last 2 months - but have not had a chance to test thoroughly, which is why I say 'limited'. I have been using SharePoint Wiki pages to document a simple process and inserting PNG files as screenshots of the application. When I perform a search, the PNG files I inserted, which are stored in the Site Assets library are being returned as results. I'm fairly certain it is not returning the Wiki Page and that the PNG has no metadata that would facilitate it being found as a match. I have yet to see this work with non-OCR'd PDFs, but I am also exploring the use of Muhimbi's PDF converter for use with Flow that I think could be configured to do the trick. Of course, there is a price for that, but in theory, I should be able to use Flow and have it monitor a specific SharePoint Library (or several) and when a file is added, use Muhimbi to OCR it and put it back in the same spot (or email it, or move it somewhere else).

Jason Morse · ‎Oct 25 2018

I was excited about this feature until I discovered that OneDrive, as referenced in the article, has since been changed to only generate PDFs.

Dave Baldwin · ‎Nov 01 2018

It's been about a year for this feature to "mature". Has anyone been able to clarify if the PDF functionality is supposed to work or if there is another feature that covers that?

Jason Morse · ‎Nov 01 2018

I reached out to support on this issue and it looks bleak.

I checked the roadmap (you can visit the site here: https://office.com/roadmap) and filtered out SharePoint that is in development or rolling out but I was not able to find any. We are very interested to know what features you would like for all of our products and you can post your ideas here: https://office365.uservoice.com/. Office 365 User Voice is where we ask feedback from our customers. You can post your ideas here and other customers that want the same feature can vote up your idea. Developers can then include the most wanted feature in future updates.

I manually generated a searchable PDF from a scan and it works great once uploaded. The only other option I've run across is paying for a separate service and building the plumbing to SharePoint.

Dave Baldwin · ‎Nov 01 2018

Jason, thanks for the quick response. I was afraid that was the case. Our scanning solution does have OCR capability, so I guess we'll be doing some testing to see how well that works. Was hoping not to need any third-party functionality though.

Lesly Verry · ‎Nov 06 2018

We have recently moved to OneDrive, but are now regretting the move. Nothing is searchable, either from a scan from a phone (android or iPhone), nor an upload. In all scenarios they are PDFs, but we checked png/jpg as well. Nothing.

Well, I shouldn't say nothing: we have one document that was scanned with 4 words, and that document occasionally shows up in search results for words it does not contain. Out of 20+ word searches, only two showed a result, and it was the one PDF that has 4 words, of which the two "matches" were not even in that document.

Microsoft simply replied with "there are many issues we are facing from the OneDrive update in summer." I suppose I was wanting to confirm if any one else is having 100% success rate?

John Huschka · ‎Dec 09 2018

Seems like the functionality has matured recently. I have been testing it more thoroughly, and I have documented the results in my blog at http://www.collaboration-foundry.com/SharePointImageAnalysis.

Bottom line: It works for me in OneDrive and SharePoint (modern and classic), but I've only seen it work on the out-of-the-box Document content type--which limits custom solutions somewhat.

It's cool functionality when it works. Looking forward to seeing Microsoft build on this.

John

Daniel Moerland · ‎Jun 07 2019

Just want to see if we can get any additional insight on this thread regarding image-only PDFs. @Naomi Moneypenny You mentioned this at the SharePoint Conference Keynote and for our company this is one of the most interesting bullets :) My understanding is that text in image-only PDFs (PDFs without a text layer) will soon be indexable by the search engine in SharePoint without going through a 3rd party OCR process. Is this correct? If so, you said "Coming Soon"; while I don't need an exact date, does that mean in the coming months or more like a 2020 thing? Thanks again and awesome updates at the SharePoint Conference. Keep it up!

Amal_PS · ‎Jan 20 2020

Text on image is working fine. But what if the image is in a powerpoint slide or in a PDF/Doc file ?

Pracyal · ‎Dec 09 2020

Is there any way to search images embedded in PDF? We are not planning to use 3rd party applications.

Pobblebonk · ‎Dec 17 2020

We are using a third party solution (Aquaforest) for handling the OCR'ing of PDF's but it does not handle images and stores the results in PDF.

Our storage requirements are huge and growing quickly so need to look at size reduction alternatives so was wondering if searchable TIFF's will be brought to on-premise?

Also no matter what format, word, excel, notepad, pdf is there a OOTB way of converting to TIFF or would we still require a third party solution for this?

HenrivdM · ‎Mar 10 2021

Unfortunately Microsoft still lags MILES behind Evernote in terms of searching for OCR text in images. It is a real pity. I would love to switch fully to OneNote, but the search keeps me on Evernote.

SharePointBlackBelt · ‎May 10 2021

Why isn't this made available in SharePoint Server 2019/on-prem?
