CompressPdf #466

Integrate PDF Content Percentage Extraction Function

Added by Zahid Hassan over 1 year ago. Updated over 1 year ago.

Status:
Complete
Priority:
High
Assignee:
Category:
feature
Target version:
Start date:
09/25/2024
Due date:
09/25/2024
% Done:
100%

Estimated time:
8:00 h

Description

We need to implement a function that calculates what percentage of a given PDF file's content is text and what percentage is images. The function should take the path to a PDF file as input and return both percentages, along with the number of pages in that PDF.

Acceptance Criteria

Function: extract_content_percentages(file_path: str)

Input: Path to the PDF file as a string.

Output: The text percentage, the image percentage, and the number of pages, in that order.

# Sample use case

text_percentage, image_percentage, no_of_pages = extract_content_percentages(file_path)

# then populate the respective fields of the file_info object  

Functionality

  • The function should analyze the PDF and calculate the percentage of text and the percentage of images.
  • It should return the number of pages in the PDF.
  • It should return the text percentage and the image percentage.
  • Ensure the function handles edge cases, such as empty PDFs or PDFs without text/images.
Actions #1

Updated by Redmine Admin over 1 year ago

  • Status changed from To Do to Complete