Convert PDF to text file using PHP



If you need to extract the text of a PDF file using PHP, the following example shows how. It is short and easy to follow.

This example uses the pdfparser library (https://www.pdfparser.org) to read text and images from PDF files.

1. Install pdfparser library

In your source folder, create a composer.json file with the following content:

{
    "require": {
        "smalot/pdfparser": "*"
    }
}

Make sure Composer is installed on your computer, then run the following command to install the smalot/pdfparser library:

composer update smalot/pdfparser

 

2. Read the PDF file

2.1 Create the file selection form, index.php

<form action='result.php' method='post' enctype='multipart/form-data'>
    <label>Select File:</label>
    <p>
        <input type='file' name='file' accept='application/pdf'>
    </p>
    <p><input type='submit' name='submit' value='Submit'></p>
</form>

2.2 Read and output the PDF content, result.php

 All content on one page:

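<?php
require_once __DIR__ . '/vendor/autoload.php';

if (isset($_POST['submit'])) {
    // Uploaded files arrive in $_FILES, not $_POST
    // (the form must use enctype='multipart/form-data').
    $file = $_FILES['file']['tmp_name'] ?? '';

    if (empty($file)) {
        // No file selected: send the user back to the form
        header('Location: index.php');
        exit;
    }

    echo '<p>Result contents: </p>';

    $parser = new \Smalot\PdfParser\Parser();
    $pdf = $parser->parseFile($file);

    // All content on one page
    echo '<p><b>All content on one page:</b> </p>';
    $text = $pdf->getText();
    // Escape the text so it is shown literally, and keep its line breaks
    echo nl2br(htmlspecialchars($text));
}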
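The empty check above only catches a missing file. As a minimal sketch, the upload could also be checked before it is handed to the parser. The helper name validatePdfUpload is hypothetical (it is not part of pdfparser), and the sketch assumes the form is submitted with enctype='multipart/form-data' so that the upload appears in $_FILES:

```php
<?php
// Hypothetical helper, not part of pdfparser.
// Returns an error message, or null when the upload looks usable.
function validatePdfUpload(array $upload): ?string
{
    // PHP reports upload problems through the 'error' field.
    if (($upload['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return 'No file was uploaded, or the upload failed.';
    }

    $path = $upload['tmp_name'] ?? '';
    if ($path === '' || !is_file($path) || !is_readable($path)) {
        return 'The uploaded file cannot be read.';
    }

    // A PDF file starts with the marker "%PDF-"; read only the first 5 bytes.
    $header = (string) file_get_contents($path, false, null, 0, 5);
    if ($header !== '%PDF-') {
        return 'The selected file does not look like a PDF.';
    }

    return null;
}
```

In result.php this could be called as validatePdfUpload($_FILES['file'] ?? []) before parseFile(), printing the returned message and stopping when it is not null. For a real upload it is also worth confirming the path with PHP's is_uploaded_file().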

 

Multiple pages, the content of each page (add this inside the if block, after the PDF has been parsed):

echo '<p><b>Multi page: </b></p>';
// Multi page
$pages = $pdf->getPages();

// Loop over each page to extract text. $key starts at 0, so add 1 for display.
foreach ($pages as $key => $page) {
    echo '<p>Content of Page ' . ($key + 1) . ': ' . $page->getText() . '</p>';
}
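The snippets here print the text to the browser. To end up with an actual text file, the page texts can be joined and written to disk. The following is a minimal sketch: buildTextOutput and saveTextFile are hypothetical helper names, and the output file name is an assumption.

```php
<?php
// Hypothetical helper: joins per-page text into one string,
// with a heading line before each page.
function buildTextOutput(array $pageTexts): string
{
    $parts = [];
    foreach (array_values($pageTexts) as $index => $text) {
        // Number pages from 1 for readability.
        $parts[] = '--- Page ' . ($index + 1) . " ---\n" . trim($text);
    }
    return implode("\n\n", $parts) . "\n";
}

// Hypothetical helper: writes the text to $targetPath, returns true on success.
function saveTextFile(string $targetPath, string $content): bool
{
    return file_put_contents($targetPath, $content) !== false;
}

// Example usage with pdfparser (assumes $pdf holds the parsed document):
// $texts = array_map(function ($page) { return $page->getText(); }, $pdf->getPages());
// saveTextFile(__DIR__ . '/output.txt', buildTextOutput($texts));
```

Keeping the formatting separate from the file write makes it easy to change the page separator, or to send the same string as a download instead of saving it.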

 

Read images:

// Read images
echo '<p><b>All images:</b> </p>';
$images = $pdf->getObjectsByType('XObject', 'Image');

// Note: the data URI below assumes each image is stored as JPEG data.
foreach ($images as $image) {
    echo '<img src="data:image/jpeg;base64,' . base64_encode($image->getContent()) . '" />';
}
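Not every image in a PDF is stored as JPEG data, so a hard-coded MIME type can produce broken images in the browser. One possible approach, sketched below, is to look at the first bytes of the content and skip anything that is not a recognised image file. The helper names detectImageMime and imageTag are hypothetical, and the sketch assumes that getContent() returns the raw file bytes for JPEG-encoded images.

```php
<?php
// Hypothetical helper: guesses a MIME type from the first bytes of the data.
// Returns null when the data is not a recognised image file.
function detectImageMime(string $data): ?string
{
    // JPEG files start with the bytes FF D8 FF.
    if (strncmp($data, "\xFF\xD8\xFF", 3) === 0) {
        return 'image/jpeg';
    }
    // PNG files start with a fixed 8-byte signature.
    if (strncmp($data, "\x89PNG\r\n\x1A\n", 8) === 0) {
        return 'image/png';
    }
    return null;
}

// Hypothetical helper: builds an <img> tag with a data URI,
// or returns an empty string for unrecognised data.
function imageTag(string $data): string
{
    $mime = detectImageMime($data);
    if ($mime === null) {
        return '';
    }
    return '<img src="data:' . $mime . ';base64,' . base64_encode($data) . '" />';
}
```

Inside the foreach loop, echo imageTag($image->getContent()); could then replace the hard-coded img tag.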

 

Document details (metadata):

echo '<p><b>Detail page: </b></p>';
// Retrieve all details from the pdf file.
$details = $pdf->getDetails();

// Loop over each property to extract values (string or array).
foreach ($details as $property => $value) {
    if (is_array($value)) {
        $value = implode(', ', $value);
    }
    echo '<p>' . $property . ' => ' . $value . '</p>';
}

 

3. Result

The result of converting a PDF to text using PHP:

[Image: Result after converting PDF to text using PHP]

 

 

★ Conversely, you can refer to how to convert text to PDF using PHP here
