Searchable PDFs not searchable within SharePoint document library

alex_k60 · ‎Sep 29 2021

We are scanning invoices and creating searchable PDFs. I've searched the contents of the invoices in a PDF reader and the text is searchable. However, when I upload them to the document library and search the library for the contents, it doesn't find any results. Why not?

I've confirmed the document library settings described in https://docs.microsoft.com/en-us/sharepoint/troubleshoot/search/search-results-missing :

Allow this site to appear in search results: Yes
Allow items from this document library to appear in search results
Any user who can read draft items

Any other ideas as to what the problem could be?

Vertebre85 · ‎Oct 03 2021

@alex_k60

Hi. Are you talking about SharePoint Online?

Just to be sure, when you are in a library, could you type the following in the search box:

content:<a word present in your pdf> filetype:pdf

It will look specifically for the content in all the PDF files. If you see results, your PDFs are searchable and the issue is somewhere else.

Are you using Information Rights Management? I think that in that case the content is not searchable yet (it's something on the roadmap).

alex_k60 · ‎Oct 03 2021

@Vertebre85

Thanks for answering. We're using SharePoint Online. We're not using Information Rights Management. What else could it be? I've even saved a Word document as a PDF and still can't search the contents.

Thanks

Vertebre85 · ‎Oct 03 2021

@alex_k60

Could you go to the main page of SharePoint Online (https://<yourdomain>.sharepoint.com) and try the search there with the keyword "content:..." to see whether it's a domain-wide issue or a site-specific one?

If it has never worked, I would advise contacting Microsoft support. On SharePoint Server, this is often due to an issue with the search crawl.

If you have PnP PowerShell, you can look at the crawl log: https://www.sharepointdiary.com/2019/07/get-search-crawl-log-in-sharepoint-online-using-powershell.h...

Officially, it's not possible to start or restart a crawl manually in SharePoint Online. I've seen some "hacks" but have never tested them.

alex_k60 · ‎Oct 04 2021

I've done the above and seen a few items in the log file, but nothing for the document library in particular at https://companyname.sharepoint.com/Finance/company1_invoices

However, there is the entry below. Does it cover the whole of Finance, or only the root of the site and not the document library "finance/company1_invoices"?

"Url : https://companyname.sharepoint.com/Finance
CrawlTime : 03/10/2021 15:02:38
ItemTime : 01/01/0001 00:00:00
LogLevel : Success
Status :
ItemId : 11257
ContentSourceId : 1"
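
To check whether any crawl log entry covers the library itself, the entries can be filtered by URL prefix. This is only a minimal Python sketch, assuming the log has been exported as text in the "Key : Value" shape shown above; `parse_entries` and `entries_under` are hypothetical helper names, not part of PnP PowerShell.

```python
def parse_entries(text):
    """Parse blank-line-separated blocks of 'Key : Value' lines into dicts."""
    entries = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            if current:
                entries.append(current)
                current = {}
            continue
        # Split on the first ':' only, so URLs and times keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def entries_under(entries, prefix):
    """Return entries whose Url is the prefix itself or lies below it."""
    base = prefix.rstrip("/").lower()
    hits = []
    for e in entries:
        url = e.get("Url", "").rstrip("/").lower()
        if url == base or url.startswith(base + "/"):
            hits.append(e)
    return hits


# Sample entry copied from the log output in this post.
sample = """\
Url : https://companyname.sharepoint.com/Finance
CrawlTime : 03/10/2021 15:02:38
ItemTime : 01/01/0001 00:00:00
LogLevel : Success
Status :
ItemId : 11257
ContentSourceId : 1
"""

log = parse_entries(sample)
library = "https://companyname.sharepoint.com/Finance/company1_invoices"
print(len(entries_under(log, library)))  # 0: no entry at or below the library URL
```

An empty result only says that the exported entries do not mention the library URL; it does not by itself show that the files were skipped by the crawler.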

Vertebre85 · ‎Oct 04 2021

@alex_k60

Hi,
Judging by your URL, "Finance" seems to be a subsite and not a separate site.
Could you open https://companyname.sharepoint.com, click the cog wheel, and check "Search and offline availability"? If the search option is turned off there, it also applies to the subsites.

So verify all the settings on the parent sites first.

Small note: subsites are not recommended anymore. Unless you have a specific requirement, it's best to create separate sites and link them to a hub (hubs of hub sites are also currently rolling out).

alex_k60 · ‎Oct 04 2021

@Vertebre85

All sites are enabled.

Thanks for the tips on subsites :)

alex_k60 · ‎Oct 07 2021

OCR is now working within SharePoint. There was a search option within the scanning software that I enabled. The confusing thing is that before the option was enabled, I was already able to search the PDF for content using Foxit. The other weird problem is that even after the option was enabled, I could search in the SharePoint search box for, say, "ABC company" and it would find the text, but if I used "content: ABC company filetype:pdf" it wouldn't find anything.
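
One thing that may be worth ruling out for the second query: KQL property restrictions do not allow whitespace between the property name, the colon, and the value, and a multi-word phrase has to be wrapped in double quotes to be matched as a phrase. Below is a minimal Python sketch of composing such a query string. `build_query` is a hypothetical helper, and only the `filetype` restriction is taken from this thread.

```python
def build_query(phrase, filetype=None):
    """Compose a KQL string: a quoted free-text phrase plus an optional filetype restriction."""
    # Double quotes make KQL match the words as an exact phrase.
    cleaned = phrase.replace('"', "").strip()
    parts = ['"{}"'.format(cleaned)]
    if filetype:
        # No whitespace is allowed around the colon in a property restriction.
        parts.append("filetype:{}".format(filetype.strip().lstrip(".")))
    return " ".join(parts)


print(build_query("ABC company", "pdf"))  # "ABC company" filetype:pdf
```

The resulting string can be pasted into the library search box as-is.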

Vertebre85 · ‎Oct 07 2021

@alex_k60 Glad you solved it. I didn't expect the cause to be in the OCR; I'm not familiar with PDFs, to be honest.

As for the properties of the PDF, SharePoint is able to find them.

For example, all the properties are treated as metadata, so if you have entered your company as metadata, you can search for it and SharePoint will return results.
