Tips Extract Text from Images with Amazon Textract

kientrung470 · Sep 28, 2023

Amazon Textract is a machine learning service that extracts text from scanned documents, photos, and other types of images. It accepts images in a variety of formats, including JPEG, PNG, and TIFF.

To extract text from an image with Amazon Textract through its API, you can use the following steps:

1. Upload the image to an Amazon S3 bucket, or read it into memory to send as raw bytes.
2. Call a text detection operation on the image. There is no language setting to select in the request.
3. Read the result. The API returns a JSON response; if you want a text file or a CSV file, convert the response yourself.
4. Collect the `Text` field from the returned blocks.

Amazon Textract returns the detected text in the JSON response as a list of `Blocks` covering pages, lines, and words.
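As a rough sketch of that flow for a local file, the snippet below sends the image bytes with the synchronous `detect_document_text` call and keeps only the line-level text. The helper names `extract_lines` and `detect_text_from_file` are made up for illustration, and the client is assumed to be `boto3.client('textract')` passed in by the caller; the sample response is trimmed by hand, not real service output.

```python
from pathlib import Path
from typing import Any


def extract_lines(response: dict) -> list:
    """Return the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text_from_file(client: Any, path: str) -> list:
    """Send a local image to Textract and return its lines of text.

    Assumption: `client` is boto3.client('textract') or any object
    exposing the same detect_document_text method.
    """
    image_bytes = Path(path).read_bytes()
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response)


# Hand-trimmed response shape, for illustration only.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "1"},
        {"BlockType": "LINE", "Id": "2", "Text": "Hello world"},
        {"BlockType": "WORD", "Id": "3", "Text": "Hello"},
        {"BlockType": "WORD", "Id": "4", "Text": "world"},
    ]
}
print(extract_lines(sample_response))
```

With real credentials configured, something like `detect_text_from_file(boto3.client('textract'), 'my-image.jpg')` should return the same kind of list. Filtering on `LINE` avoids printing each word twice, since `WORD` blocks repeat the text of their parent line.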

Here is an example of how to extract text from an image using Amazon Textract:

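```
import time

import boto3

# Create a client for Amazon Textract.
client = boto3.client('textract')

# Start an asynchronous text detection job on an image stored in S3.
job = client.start_document_text_detection(
    DocumentLocation={
        'S3Object': {
            'Bucket': 'my-bucket',
            'Name': 'my-image.jpg'
        }
    }
)

# Poll until the job is no longer in progress.
while True:
    response = client.get_document_text_detection(JobId=job['JobId'])
    if response['JobStatus'] != 'IN_PROGRESS':
        break
    time.sleep(2)

# Print each detected line. The first block is the PAGE block, which has no 'Text'.
if response['JobStatus'] == 'SUCCEEDED':
    for block in response['Blocks']:
        if block['BlockType'] == 'LINE':
            print(block['Text'])
else:
    print('Job ended with status:', response['JobStatus'])
```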

The output of this code will be the text that is extracted from the image.
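For larger documents, `get_document_text_detection` can return the blocks across several pages of results, with a `NextToken` in each response that has more to follow. Here is a minimal sketch of reading them all, assuming the job has already finished; `iter_job_blocks` is a hypothetical helper name and the client is passed in by the caller.

```python
from typing import Any, Iterator


def iter_job_blocks(client: Any, job_id: str) -> Iterator[dict]:
    """Yield every block of a finished job, following NextToken across pages.

    Assumption: `client` is boto3.client('textract') and the job identified
    by `job_id` has already reached the SUCCEEDED status.
    """
    kwargs = {"JobId": job_id}
    while True:
        page = client.get_document_text_detection(**kwargs)
        yield from page.get("Blocks", [])
        token = page.get("NextToken")
        if not token:
            break
        # Ask for the next page of results on the following call.
        kwargs["NextToken"] = token


def job_lines(client: Any, job_id: str) -> list:
    """Collect the text of all LINE blocks for a finished job."""
    return [
        block["Text"]
        for block in iter_job_blocks(client, job_id)
        if block.get("BlockType") == "LINE"
    ]
```

A generator keeps memory use flat when a job has many result pages, and `job_lines` is just one way to consume it.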

For more information on how to use Amazon Textract to extract text from images, please refer to the [Amazon Textract documentation](https://docs.aws.amazon.com/textract/latest/dg/).

### Related articles

* [How to Extract Text from Images with Google Cloud Vision API](https://cloud.google.com/vision/docs/text-extraction)
* [How to Extract Text from Images with Microsoft Azure Cognitive Services](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/tutorial-read-text-from-images)
