Brokers! Please make your CIMs OCR/text-extraction friendly!


April 28, 2026

by a searcher in Orlando, FL, USA

Calling all brokers. Your buyers are using LLMs to review your CIMs - and that's not a bad thing. But all of your image-heavy PDF layouts, PowerPoints embedded in websites, and infographics fail when your buyer drops them into the LLM - it extracts what text it can, and that's more often than not _wrong_, so the LLM has to hallucinate to fill in the gaps. And your buyers are using that information to guide their process. Please consider a more text-friendly format (even if it's a secondary format) - maybe even drop your PDF into an LLM yourself and see what it tells you and how that compares to what you know. Not everyone goes to the trouble of extracting your graphics into images and feeding them into a vision model - at best they're relying on OCR, which often does a REALLY BAD JOB of extracting your CIM! Thanks. ...your buyers
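For anyone who wants to try the "see how that compares to what you know" check, here is a minimal Python sketch. It assumes you have already pulled the text out of your CIM with whatever extractor or OCR tool you use; the function names and the sample figures are made up for illustration. It just reports which facts you know are in the CIM did not survive extraction.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop thousands separators between digits, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"(?<=\d),(?=\d{3})", "", text)
    return re.sub(r"\s+", " ", text).strip()


def missing_facts(extracted_text: str, known_facts: list[str]) -> list[str]:
    """Return the known facts that do not appear in the extracted text.

    This is plain substring matching, so a short fact such as a year
    can also match inside a longer number. Treat it as a rough check.
    """
    haystack = normalize(extracted_text)
    return [fact for fact in known_facts if normalize(fact) not in haystack]


# Made-up extractor output: the revenue figure was misread (letter O for zero)
# and the location, which lived in a graphic, never came through at all.
extracted = "Revenue  $4,2OO,000\nEBITDA $1,100,000\nFounded 1998"
known = ["$4,200,000", "$1,100,000", "1998", "Orlando"]

print(missing_facts(extracted, known))
```

Anything that shows up in the returned list is something your buyer's LLM never saw correctly and may fill in on its own.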
Replies
3
Reply by an intermediary
from The University of Michigan in Bonita Springs, FL, USA
99.9% of the time you are going to receive PDFs -- and you should. There is a good reason for this: brokers are tasked with maintaining the confidentiality and security of the documents they produce.
1. In spite of the NDA, brokers don't always know where their documents are going. PDFs are fixed-format and not meant to be edited, and it is harder to tamper with their content without leaving traces. Word documents, for example, are inherently editable and designed for collaboration. A confidential document needs to be secure -- a CIM is NOT for collaboration.
2. Word documents retain hidden metadata: author names, revision history, tracked changes, comments, and even deleted text. This can leak sensitive information unintentionally. PDFs can also contain metadata, but it's easier to strip and there's no revision history baked into the format.
3. PDFs have built-in encryption, password protection, and permission controls to prevent printing, copying text, or editing. Word only supports password protection with no additional controls.
You would be shocked at the number of times someone has shared an altered document with me. I've seen my name, logos, and marks stripped from the CIM and replaced with another firm's. I've seen significant changes to the numbers and the text in the CIM. Because nobody can be trusted, PDF is one of the most secure and traceable means of transferring confidential information.
Reply by an admin
from Massachusetts Institute of Technology in Portland, OR, USA
^redacted might be able to comment here. Press @ to tag someone else! :-)