id: 11980
summary: Extract text from large PDFs for indexing
reporter: jballanco-x
owner: jballanco-x
description: A PDF file may contain multiple images, causing it to exceed the large-file-size rejection limit (see #11979), even though the text content of the file is below the limit. We need a mechanism for extracting text from such files and checking the text against the size limit independent of the parent file. If the text alone is below the cut-off, we should still index it.
type: task
status: new
priority: major
milestone / component: Asynchronous Search
version: 4.4.10
resolution:
keywords: search, full text indexing
cc:
drp_resources:
i_links:
o_links:
remaining_time:
sprint:
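
The check described in this ticket could be sketched roughly as follows. This is only an illustration of the intended decision flow, not the project's actual code: `decide_indexing`, `IndexDecision`, and all parameter names are hypothetical, the size limit is passed in because the ticket does not state its value (see #11979), and the text extractor is a caller-supplied function so that any PDF library can be plugged in. The sketch also assumes that text size is measured in UTF-8 bytes and that a size exactly equal to the limit is accepted.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IndexDecision:
    """Outcome of the size check (hypothetical type, for illustration only)."""
    index: bool           # True if something should be sent to the indexer
    text: Optional[str]   # extracted text to index instead of the whole file
    reason: str           # short explanation, e.g. for logging


def decide_indexing(file_size: int,
                    extract_text: Callable[[], str],
                    size_limit: int) -> IndexDecision:
    """Decide whether a file, or only its extracted text, should be indexed.

    file_size    -- size of the parent file in bytes
    extract_text -- caller-supplied function returning the file's text
    size_limit   -- rejection limit in bytes (value not given in the ticket)
    """
    # Files within the limit follow the normal path; no extraction is needed.
    if file_size <= size_limit:
        return IndexDecision(True, None, "file within limit")

    # The file is too large, possibly only because of embedded images.
    # Measure the text on its own, independent of the parent file.
    text = extract_text()
    text_size = len(text.encode("utf-8"))  # assumption: limit applies to UTF-8 bytes

    if text_size <= size_limit:
        return IndexDecision(True, text, "text within limit")
    return IndexDecision(False, None, "text exceeds limit")
```

Because the extractor is only called when the parent file exceeds the limit, the cost of text extraction is paid only for files that would otherwise be rejected.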