How to extract text from pdf on Windows? It is a scanned PDF

Question

Hi everyone, I have several scanned PDF files that I need to extract text from. Some PDFs let me select and copy the text directly, but others seem to be scanned documents or locked in a way that makes copying difficult. Could you please suggest a reliable way to extract text from a PDF document and save it as Word, TXT, or another editable format? I would prefer a method that works offline because I don't want to upload private PDF files to online converters. What tools or built-in Windows options do you recommend for this on a Windows 10 PC?

rhysin · Answer

The most direct and private method is to use the built-in OCR engine that comes free with Windows. The engine itself needs no extra download; you only need a small utility to reach it.

How it works: Windows has a native OCR engine, exposed through the Windows.Media.Ocr API, that can extract text from images. It works entirely offline, meaning your documents never leave your computer.

The Tool: You can access this engine via Microsoft PowerToys, a free, open-source utility officially published by Microsoft for power users.

How to extract text from PDF:

1. Install PowerToys: Download and install Microsoft PowerToys.

2. Open your Scanned PDF: Use any PDF viewer (like Microsoft Edge or Adobe Reader) to open the scanned document on your screen.

3. Activate Text Extractor: Press the activation shortcut Win + Shift + T. A transparent overlay will appear on your screen.

4. Select the Text: Click and drag your mouse to draw a box over the area of text you want to copy.

5. Paste the Text: The text is automatically copied to your clipboard. You can now paste it (Ctrl + V) into any document or text editor.
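Text grabbed from a screen region usually keeps the hard line breaks of the scan. If you want a clean TXT file, a small script can tidy the pasted text first. This is only a minimal Python sketch: `reflow_ocr_text` and `save_as_txt` are hypothetical helper names of my own, not part of PowerToys.

```python
import re
from pathlib import Path


def reflow_ocr_text(raw: str) -> str:
    """Join hard-wrapped OCR lines into paragraphs.

    Blank lines are treated as paragraph breaks. A word split with a
    trailing hyphen at a line end is rejoined (this also joins genuine
    hyphenated compounds that happen to break there).
    """
    paragraphs = re.split(r"\n\s*\n", raw.strip())
    cleaned = []
    for para in paragraphs:
        # Rejoin words hyphenated across a line break: "docu-\nment" -> "document"
        para = re.sub(r"(\w)-\r?\n(\w)", r"\1\2", para)
        # Collapse remaining line breaks and runs of whitespace into one space
        para = re.sub(r"\s+", " ", para).strip()
        if para:
            cleaned.append(para)
    return "\n\n".join(cleaned)


def save_as_txt(text: str, path: str) -> None:
    """Write the cleaned text as UTF-8 so non-English characters survive."""
    Path(path).write_text(text, encoding="utf-8")
```

Paste the clipboard contents into a string (or a scratch file you read back in), pass it through `reflow_ocr_text`, and hand the result to `save_as_txt`.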

Start with Microsoft PowerToys if you want a quick way to extract text from a PDF. It is an official Microsoft tool, works entirely offline, and is well suited to grabbing text from any scanned document visible on your screen.

zoeiur · Answer

If you are looking for a legitimate, safe, and completely free way to extract text from a PDF on Windows without installing sketchy software, ShareX is an excellent choice. Just open your PDF, point, click, and paste.

Because ShareX works by looking at your screen, the process is slightly different from a standard PDF converter. However, it is very straightforward. Here are the steps:

Open your scanned PDF: First, use any PDF viewer to open the scanned document on your screen.
Activate ShareX's OCR: Instead of taking a regular screenshot, you will use ShareX's text recognition tool. You can find this by opening the ShareX main window and navigating to the Tools menu, where you will see an option for Text Recognition.
Select the text region: Your cursor will change, allowing you to click and drag a box directly over the text in the scanned PDF that you want to copy. This is very precise.
Get your text: ShareX will process the image inside your selected box, recognize the letters and words, and automatically copy that text to your clipboard. You can then paste it into any document, email, or text file.

ShareX is a fantastic tool for this task, but understanding its small quirks will help you use it most effectively.

Excellent for Short or Medium Extracts: This method is perfect when you need to copy a few paragraphs, a recipe, a quote, or a technical command from a PDF. It is much faster than re-typing everything.

Not for Whole Book Conversion: It is not designed to automatically process all 300 pages of a scanned novel. The tool works best as an on-demand text grabber for the specific sections you select on your screen.

holtsawye · Answer

gImageReader is a free, open-source tool that uses the Tesseract OCR engine, and it can extract text from scanned PDFs offline without any internet connection.

It lets you recognize text in scanned documents with high accuracy and multiple language support.

First, download and install the software from the official website.

Next, open the program, click File > Open, and select the PDF file you scanned.

Click the Recognize button, select the desired language, and wait for the OCR process to complete.

You can then copy the recognized text directly or export it as a TXT file.

This method is excellent for accurate OCR with many language options, so it works well for users who need to extract text from scanned PDFs in different languages.

The software is open-source, so there are no hidden fees. If you work in a language other than English, you'll need to install the matching language data for it.

If you're looking for a highly accurate offline OCR solution, this is a reliable choice, although its interface is more technical than that of applications designed for general users.

P.S.

Poor scan clarity and low contrast can significantly reduce recognition accuracy; we recommend optimizing scan quality in advance.
Processing large or multi-page PDF files may be slow and could result in program response delays.
When exporting to a TXT file, the original document’s formatting will be lost, and only plain text content will be retained.
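If the GUI feels slow on long documents, the same Tesseract engine can be driven from a script, one page image at a time. The following is a rough Python sketch, not how gImageReader itself works. It assumes the `tesseract` command-line program is installed and on your PATH, and that you have already exported the PDF pages as PNG files into a folder (for example with Poppler's `pdftoppm -r 300 -png input.pdf page`). The function names are my own.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path: str, output_base: str, lang: str = "eng") -> list[str]:
    """Build the CLI call: tesseract <image> <output base> -l <lang>.

    Tesseract appends ".txt" to the output base itself.
    """
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_page_images(image_dir: str, out_txt: str, lang: str = "eng") -> int:
    """OCR every PNG in image_dir (sorted by name) into one text file.

    Returns the number of pages processed.
    """
    pages = sorted(Path(image_dir).glob("*.png"))
    chunks = []
    for page in pages:
        base = page.with_suffix("")  # e.g. page-001.png -> page-001
        # check=True raises an error if Tesseract fails on a page
        subprocess.run(build_tesseract_command(str(page), str(base), lang), check=True)
        chunks.append(Path(f"{base}.txt").read_text(encoding="utf-8"))
    Path(out_txt).write_text("\n\n".join(chunks), encoding="utf-8")
    return len(pages)
```

Calling `ocr_page_images("pages", "book.txt", lang="deu")` would process a German scan, provided the German language data is installed. Everything runs locally, so nothing is uploaded.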

elenorp · Answer

Let me explain both situations so you know exactly how to extract text from a PDF using Edge on your Windows machine.

Situation 1: Standard PDFs

If you open a PDF and can already highlight the text with your mouse cursor, you're looking at a standard text‑based PDF. In this case, extracting text is extremely straightforward:

1. Open the PDF in Microsoft Edge (it's the default PDF viewer on Windows)

2. Select the text by clicking and dragging your mouse over the content you want

3. Copy the text using either:

Right‑click and select "Copy" from the menu
The keyboard shortcut Ctrl + C

4. Paste it anywhere with Ctrl + V

Edge even provides a convenient mini‑menu that pops up when you select text, giving you quick access to copy, highlight, or add comments. It's fast, intuitive, and requires no extra software.

Situation 2: Scanned PDFs

This is where things get interesting, and where Edge's hidden superpower comes into play. Scanned PDFs are essentially images of pages, not actual text. Normally, you can't select or copy anything from them. However, Microsoft has been testing a feature that solves exactly this problem.
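If you have a whole folder of files and don't want to open each one to see which situation applies, you can check for an embedded text layer with a script. Here is a minimal Python sketch, assuming the third-party `pypdf` package is installed (`pip install pypdf`). The helper names and the 20-character threshold are my own choices, and the check is only a heuristic.

```python
def looks_scanned(page_texts: list[str], min_chars_per_page: int = 20) -> bool:
    """Heuristic: treat the PDF as scanned when most pages yield almost no text."""
    if not page_texts:
        return True
    empty = sum(1 for t in page_texts if len(t.strip()) < min_chars_per_page)
    return empty > len(page_texts) / 2


def extract_page_texts(pdf_path: str) -> list[str]:
    """Read the embedded text layer of each page with pypdf (third-party)."""
    from pypdf import PdfReader  # imported here so the heuristic works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

Calling `looks_scanned(extract_page_texts("file.pdf"))` returns True when the file probably needs OCR. If it returns False, the text from `extract_page_texts` can be saved directly and no OCR is needed.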

The Experimental OCR Feature

Microsoft Edge is currently testing an "OCR for PDF" feature that integrates Windows 11's built‑in OCR engine directly into the browser's PDF reader. Here's what you need to know:

How to enable it:

1. Type edge://flags into Edge's address bar and press Enter

2. Search for msPdfWindowsOcrCoverage

3. Change the setting from "Default" to "Enabled"

4. Restart Microsoft Edge

yatesgriffin · Answer

Microsoft Word includes built-in OCR functionality that can extract text from scanned PDF files, so you can recognize text in scanned documents without additional software.

How to Extract Text from Scanned PDF

Open Microsoft Word
Click the File menu, select Open, and then choose the scanned PDF file you want to convert
Confirm the prompt displayed by the system
The program will perform automatic OCR analysis on the file's content
Edit the recognized text, or save the file in Word format

Once the file is loaded, Word can run offline and extract text from scanned PDFs. Please note that Microsoft Office must be installed. Some Windows PCs come with a one-month trial version, but Office is not free.

Pros

Built-in OCR functionality; no need to install additional tools
Text extraction can be performed offline after loading
Recognized text can be edited directly

Cons

Requires Microsoft Office to be installed
Recognition results are subpar for PDFs with complex layouts
