A few days ago, I got an interesting question about my post on using the AWS SDK for PHP with Amazon Textract. The question was: “How can I do this with a PDF stored in S3? I know you need to use analyzeDocumentAsynch but unsure how to then get the results of the Asynch operation”.

It turns out to be pretty easy once you’ve got the synchronous Textract example from that previous blog post running.

Here are the code changes you need to make. Keep the earlier source code as it is, but replace everything from the call to analyzeDocument onward with this code:

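$promise = $client->analyzeDocumentAsync($options);
$promise->then(
    // $onFulfilled
    function ($value) {
        echo "The promise was fulfilled.\n";
        processResult($value);
    },
    // $onRejected
    function ($reason) {
        // $reason is normally an exception, so print its message to show what failed.
        $message = $reason instanceof Throwable ? $reason->getMessage() : print_r($reason, true);
        echo "The promise was rejected: " . $message . "\n";
    }
)->wait(); // Block until the promise settles so the callbacks run before the script exits.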

// Print every WORD and LINE block in the Textract response.
function processResult($result) {
    // If debugging:
    // echo print_r($result, true);
    $blocks = $result['Blocks'];
    // Loop through all the blocks:
    foreach ($blocks as $block) {
        // Skip blocks without a type or text (PAGE blocks, for example).
        // Comparing against '' keeps a literal "0", which a plain truthiness check would drop.
        if (empty($block['BlockType']) || !isset($block['Text']) || $block['Text'] === '') {
            continue;
        }
        $text = $block['Text'];
        if ($block['BlockType'] == 'WORD') {
            echo "Word: " . $text . "\n";
        } elseif ($block['BlockType'] == 'LINE') {
            echo "Line: " . $text . "\n";
        }
    }
}

When you run your PHP code from the command line, you’ll notice a small wait while the asynchronous code processes, and then you’ll see the same output as before.
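
If you want to check the block filtering without calling AWS at all, here is a minimal sketch that runs the same WORD and LINE logic over a hand-made array. The helper name collectText and the sample data are made up for illustration; only the Blocks, BlockType and Text keys come from the code above.

```php
<?php
// Hypothetical helper: same filtering as processResult, but it returns the
// labelled strings instead of echoing them, which makes it easy to test.
// $result is left untyped so that both a plain array and an SDK result work.
function collectText($result)
{
    $lines = [];
    foreach ($result['Blocks'] ?? [] as $block) {
        $type = $block['BlockType'] ?? '';
        $text = $block['Text'] ?? '';
        if ($text === '') {
            continue; // blocks without text (PAGE blocks, for example)
        }
        if ($type === 'WORD') {
            $lines[] = "Word: $text";
        } elseif ($type === 'LINE') {
            $lines[] = "Line: $text";
        }
    }
    return $lines;
}

// Made-up sample data, shaped like the response the code above reads.
$sample = [
    'Blocks' => [
        ['BlockType' => 'PAGE'],
        ['BlockType' => 'LINE', 'Text' => 'Hello world'],
        ['BlockType' => 'WORD', 'Text' => 'Hello'],
        ['BlockType' => 'WORD', 'Text' => 'world'],
    ],
];

echo implode("\n", collectText($sample)), "\n";
```

Returning an array rather than echoing is only a convenience for testing; the filtering itself is the same idea as in processResult.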

Here’s a link to the Guzzle Promises project to give you an idea of how to use Promises in PHP.
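
As a rough illustration of the promise pattern used above, here is a small sketch with a hand-made promise instead of a Textract call. It assumes guzzlehttp/promises is installed through Composer (the AWS SDK for PHP pulls it in as a dependency) and that the autoloader lives at vendor/autoload.php; the value 'done' and the $log array are made up for the example.

```php
<?php
// Assumption: Composer's autoloader is in ./vendor relative to this script.
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Promise\Promise;

$log = [];

// The function passed to the constructor runs when someone waits on the
// promise; here it simply resolves the promise with a fixed value.
$promise = new Promise(function () use (&$promise) {
    $promise->resolve('done');
});

$chained = $promise->then(
    // $onFulfilled
    function ($value) use (&$log) {
        $log[] = "fulfilled: $value";
        return $value;
    },
    // $onRejected
    function ($reason) use (&$log) {
        $log[] = 'rejected';
    }
);

// wait() blocks until the chain settles and returns the final value.
$result = $chained->wait();
echo $result, "\n";
```

The callbacks only run once the promise settles, which is why the sketch ends with a call to wait().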

And here’s the full source for the analyzeDocumentAsync example.
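
Since the original question was about a document stored in S3, here is a minimal sketch of what the $options array could look like when it points at an S3 object instead of inline bytes. The helper name buildS3Options, the bucket name and the object key are placeholders, and the FeatureTypes values are just one possible choice.

```php
<?php
// Hypothetical helper: builds the request array for a document stored in S3.
function buildS3Options(string $bucket, string $key): array
{
    return [
        'Document' => [
            'S3Object' => [
                'Bucket' => $bucket, // placeholder bucket name
                'Name'   => $key,    // object key of the document
            ],
        ],
        // Ask Textract to analyze tables and forms in addition to plain text.
        'FeatureTypes' => ['TABLES', 'FORMS'],
    ];
}

$options = buildS3Options('my-example-bucket', 'scans/invoice.pdf');
```

One caveat: analyzeDocumentAsync is the promise-returning form of Textract's synchronous AnalyzeDocument operation, which handles single-page documents. For multi-page PDFs, the AWS documentation describes the StartDocumentAnalysis and GetDocumentAnalysis operations instead.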