PDF Metadata Stripper

A PDF metadata stripper removes hidden author, title, creator, producer, creation date, modification date, custom properties, and XMP metadata from a PDF so the file no longer reveals who wrote it. This stripper runs entirely in the browser using pdf-lib, shows every field before removing it, and includes a verify-clean re-upload pass so authors can confirm a double-blind submission is anonymous before sending it to the journal or conference system.

Calculated locally · nothing uploaded · 0 requests sent

The PDF you drop is parsed and rewritten inside your browser tab. The library code is bundled into this page; no part of your file is sent anywhere.

What is a PDF metadata stripper?

A PDF metadata stripper removes the hidden fields a PDF carries about who wrote it and how it was made: Author, Title, Subject, Keywords, Creator (the application that produced the source), Producer (the library that wrote the PDF), CreationDate, ModDate, any custom properties left over from a Word template, and the XMP stream that often duplicates and extends those fields in XML. For a double-blind manuscript, every one of those fields is a potential leak. This stripper shows you what is in the file, clears both the Info dictionary and the XMP block in your browser, and lets you re-upload the cleaned file to verify it is empty before you submit.

How to anonymize your PDF for double-blind submission

Drag your manuscript into the upload zone. The left pane lists every populated metadata field the file actually contains. Anything that names you, your co-authors, your institution, your university LaTeX template, or your Overleaf project carries identifying information; reviewers will see all of it if you submit without stripping.

The right pane previews the same fields cleared. Click Strip and download and the browser writes a new PDF with the Info dictionary emptied and the XMP stream removed. The file is offered as a download from the same tab; no upload happens.

Underneath the widget is a verify zone. Drop the cleaned PDF back in and every field should render empty. That second pass is your evidence that the downloaded file, not just the preview, is clean, and it shows what a venue-side metadata check would see. Screenshot the verify state for your advisor or attach it to your submission notes.
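
As a rough, independent cross-check on the verify pass described above, the sketch below scans the raw bytes of a PDF for XMP packet markers and for populated Info-dictionary strings. It is a heuristic, not how this page's verify zone is implemented: the function name scanForResidue is hypothetical, the scan only sees objects stored uncompressed (entries inside compressed object streams are invisible to it), and a key such as /Title can also match bookmark entries.

```javascript
// Info-dictionary keys named in the text above.
const INFO_KEYS = ['Title', 'Author', 'Subject', 'Keywords',
                   'Creator', 'Producer', 'CreationDate', 'ModDate'];

// Decode bytes one-to-one into characters so regexes see the raw file content.
function toBinaryString(bytes) {
  let out = '';
  const CHUNK = 0x8000; // keep apply() argument lists small
  for (let i = 0; i < bytes.length; i += CHUNK) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return out;
}

// Hypothetical helper: reports XMP packets and non-empty Info strings found by byte scan.
function scanForResidue(bytes) {
  const text = toBinaryString(bytes);

  // XMP packets start with an "<?xpacket begin=" processing instruction.
  const xmpPackets = (text.match(/<\?xpacket begin=/g) || []).length;

  const infoFields = {};
  for (const key of INFO_KEYS) {
    // "/Key" followed by a literal string "(...)" or a hex string "<...>".
    const re = new RegExp(
      '/' + key + '\\s*(?:\\(((?:\\\\.|[^\\\\)])*)\\)|<([0-9A-Fa-f\\s]*)>)'
    );
    const m = re.exec(text);
    if (!m) continue;
    const literal = m[1];
    const hex = m[2] === undefined ? undefined : m[2].replace(/\s+/g, '').toUpperCase();
    // "()" and "<>" are empty; "<FEFF>" is a UTF-16 byte-order mark with no text after it.
    const empty = literal !== undefined ? literal === '' : (hex === '' || hex === 'FEFF');
    if (!empty) infoFields[key] = literal !== undefined ? literal : '<' + hex + '>';
  }

  const clean = xmpPackets === 0 && Object.keys(infoFields).length === 0;
  return { xmpPackets, infoFields, clean };
}
```

Calling scanForResidue(new Uint8Array(await file.arrayBuffer())) on the downloaded file gives a second opinion; a result with clean set to false is worth investigating, while a clean result still leaves compressed objects unchecked.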

What gets exposed when you do not strip PDF metadata

Any PDF viewer, and any submission system that inspects uploads, can read every one of these:

  • Author. LaTeX \hypersetup{pdfauthor=...} writes your full name here. Word writes whatever is in File → Options → User Information, which usually carries the author of the very first document the template was based on. Overleaf inherits its account-holder name.
  • Title. Sometimes carries an internal working title that names the lab or grant.
  • Creator. Identifies the source application (Microsoft Word 16.79, LyX 2.3.7, Pages 13.2). Combined with the Producer field it narrows the author pool fast.
  • Producer. The library that wrote the PDF (pdfTeX-1.40.25, Skia/PDF m119, Acrobat Distiller). Often paired with a build identifier that ties back to a specific OS install.
  • Last Modified By. Survives a Word → Save As PDF if revision metadata is preserved; carries the most recent co-author’s account name.
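  • CreationDate / ModDate. Timestamps accurate to the second, usually with a time-zone offset. The Info dictionary stores them in the PDF date format (D:YYYYMMDDHHmmSS followed by the offset); the XMP copies use ISO 8601. Cross-referenced against an Overleaf commit log, they identify the project.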
  • Custom properties. Hidden Word-template fields named Company, Manager, Owner that propagate from the first document opened on the machine.
  • XMP block. Duplicates the above as XML inside the document catalog, and adds extras: xmpMM:DocumentID, xmpMM:InstanceID, history of derived-from documents. Stripping only the Info dictionary leaves XMP in place, which is the failure mode that catches careful authors out.
  • Comments and annotations. Reviewer comments written in Acrobat carry the commenter’s name. Flatten the PDF before submission, or remove comments via Tools → Comment → Delete All.
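
To make the CreationDate / ModDate exposure concrete, here is a minimal sketch that converts an Info-dictionary date string into an ISO 8601 timestamp. The helper name parsePdfDate is hypothetical, and the sketch assumes the standard PDF date layout (an optional D: prefix, a four-digit year, then optional two-digit fields and an optional offset); it returns null for anything else.

```javascript
// Hypothetical helper: turns "D:YYYYMMDDHHmmSS+HH'mm'" into an ISO 8601 string.
// Missing fields fall back to the defaults the PDF date format defines
// (month and day 01, time 00:00:00). A missing offset is left off rather than guessed.
function parsePdfDate(raw) {
  const re = /^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?(?:(Z)(?:\d{2}'?(?:\d{2}'?)?)?|([+-])(\d{2})'?(\d{2})?'?)?$/;
  const m = re.exec(String(raw).trim());
  if (!m) return null;

  const [, y, mo = '01', d = '01', h = '00', mi = '00', s = '00',
         z, sign, oh, om = '00'] = m;

  let offset = '';
  if (z) offset = 'Z';
  else if (sign) offset = `${sign}${oh}:${om}`;

  return `${y}-${mo}-${d}T${h}:${mi}:${s}${offset}`;
}

console.log(parsePdfDate("D:20240315142233+01'00'")); // prints 2024-03-15T14:22:33+01:00
```

The offset is part of the leak: alongside the second-level timestamp it narrows down the time zone the exporting machine was set to.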

When venues actually check your PDF metadata

Treat every submission system as if it preserves the file exactly as uploaded. Behavior in the wild:

  • OpenReview. Strips a small set of fields server-side and warns on others. A safe baseline, not a guarantee; do not rely on it for fields the venue does not explicitly enumerate.
  • CMT (Microsoft). Preserves the file as uploaded. Reviewer downloads carry whatever you sent.
  • EasyChair. Preserves the file as uploaded.
  • Scholastica. Author guidance tells you to strip metadata yourself before upload and links out to Word and Pages instructions. No server-side stripping.
  • Nature Research double-blind option. The author checklist requires metadata removal, enforced by self-attestation. If a reviewer notices a populated Author field, that is grounds for a desk reject.
  • ACM, IEEE, ICLR, NeurIPS, ACL, CHI. Conference-specific submission instructions vary year over year. The constant: assume nothing is stripped for you.

The cost of being wrong is a desk reject for breach of double-blind, which usually means waiting a full cycle to resubmit. Stripping locally takes thirty seconds.

How this stripper works (and how to verify the privacy claim yourself)

The file is read into a Uint8Array via the browser’s FileReader API. pdf-lib, vendored as a same-origin asset on this page, parses the PDF, clears the document-information fields (setTitle(''), setAuthor(''), setSubject(''), setKeywords([]), setProducer(''), setCreator('')), and removes the XMP metadata stream from the document catalog. The output is a fresh Uint8Array piped into a Blob and offered as a download via an object URL. There is no fetch, no XMLHttpRequest, no WebSocket.
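
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not this page's actual source: the function name stripMetadata is hypothetical, the pdf-lib module is passed in as a parameter so the sketch makes no assumption about how the page bundles it, and the updateMetadata: false load option and the removal of the orphaned stream object are additions the text does not mention.

```javascript
// `pdfLib` is the pdf-lib module object, for example from
// `import * as pdfLib from 'pdf-lib'` (how this page loads it is an assumption).
async function stripMetadata(pdfLib, inputBytes) {
  const { PDFDocument, PDFName, PDFRef } = pdfLib;

  // Assumption: updateMetadata: false keeps pdf-lib from stamping its own
  // Producer and ModDate onto the document when it is loaded.
  const doc = await PDFDocument.load(inputBytes, { updateMetadata: false });

  // The setters named in the text. They overwrite the values with empty
  // strings; they do not delete the keys themselves.
  doc.setTitle('');
  doc.setAuthor('');
  doc.setSubject('');
  doc.setKeywords([]);
  doc.setProducer('');
  doc.setCreator('');
  // CreationDate and ModDate are not covered by these setters; how the page
  // clears them is not stated in the text, so this sketch leaves them alone.

  // Remove the XMP stream: unlink it from the catalog and, when it is an
  // indirect object, drop the object too so its bytes are not written out.
  const metadataKey = PDFName.of('Metadata');
  const metadataRef = doc.catalog.get(metadataKey);
  doc.catalog.delete(metadataKey);
  if (metadataRef instanceof PDFRef) {
    doc.context.delete(metadataRef);
  }

  return doc.save(); // resolves to a Uint8Array
}
```

In a browser the result could be wrapped as new Blob([bytes], { type: 'application/pdf' }) and handed to URL.createObjectURL for the download link. Dropping the stream object as well as the catalog entry matters because a writer that serializes every object it knows about would otherwise keep the XMP bytes in the file as an unreferenced object.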

The privacy badge at the top of this page carries a live counter that monkey-patches window.fetch and XMLHttpRequest.prototype.open after the page-load event. Each request bumps the counter. When you use the stripper, the counter stays at zero. To prove it for yourself: open browser DevTools, switch to the Network tab, clear the log, and drop a PDF. No new entries appear.
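
A counter of the kind described above might look like the sketch below. The names installRequestCounter, scope, and onChange are hypothetical, and the page's real badge code may differ; the sketch wraps only fetch and XMLHttpRequest.prototype.open, as the text says, so other channels such as WebSocket are outside what it counts.

```javascript
// Wraps fetch and XMLHttpRequest.prototype.open on `scope` and counts every call.
// `scope` defaults to the global object; passing it in keeps the sketch testable.
function installRequestCounter(scope = globalThis, onChange = () => {}) {
  let count = 0;
  const bump = () => {
    count += 1;
    onChange(count); // e.g. update the badge text
  };

  if (typeof scope.fetch === 'function') {
    const originalFetch = scope.fetch;
    scope.fetch = function (...args) {
      bump();
      return originalFetch.apply(scope, args); // forward unchanged
    };
  }

  if (typeof scope.XMLHttpRequest === 'function') {
    const proto = scope.XMLHttpRequest.prototype;
    const originalOpen = proto.open;
    proto.open = function (...args) {
      bump();
      return originalOpen.apply(this, args); // forward unchanged
    };
  }

  return { getCount: () => count };
}
```

On a page this would be installed from a load listener, for instance window.addEventListener('load', () => installRequestCounter(window, updateBadge)), where updateBadge is a placeholder for whatever renders the number. Because the wrappers only observe and then forward, they cannot block a request, which is why the DevTools Network tab remains the stronger check.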

What this tool does not do (and what to use instead)

Honest scope, because trust here matters more than feature breadth:

  • Body text scanning. The stripper does not search the paper for your name, your co-authors, or your institution. Use Find and Replace in your editor for that pass before exporting to PDF.
  • Comment-author scrubbing. Annotation and comment authors written by Acrobat are preserved (the stripper only touches document-level metadata). For comment scrubbing use Acrobat’s Sanitize Document command, or delete all comments before exporting.
  • Track Changes residue. If you exported from Word with revision metadata still enabled, the Last Modified By field can return on the next save. Run Inspect Document in Word and remove personal information before exporting to PDF.
  • Image EXIF stripping. Figures embedded in academic PDFs almost never carry EXIF, but if your manuscript embeds raw camera output, run it through an EXIF cleaner first.
  • Encrypted or password-protected PDFs. Decrypt first, strip second.

Frequently asked questions

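Stripping leaves the visible paper untouched. The stripper only rewrites the document-information dictionary and removes the XMP metadata stream; the page tree, fonts, and images are preserved. One caveat: a PDF/A file declares its conformance level inside the XMP packet (pdfaid:part and pdfaid:conformance), so a file whose XMP stream has been removed may no longer identify itself as PDF/A. If your venue runs a strict PDF/A-1 validator, run that check on the stripped file before you submit.
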
This tool does not anonymize the body of your paper. It removes file metadata only. Body text, footnote authorship, self-citations written in the first person, and acknowledgments are not touched. The Nature double-blind checklist and most CS-conference policies require both passes; use this tool for the metadata pass and search-replace your draft for the body-text pass.

You need to strip XMP as well as the Info dictionary. PDFs carry two parallel metadata stores: the Info dictionary in the trailer (older, key-value) and the XMP stream in the document catalog (newer, XML, often duplicated and extended). Stripping only the Info dictionary leaves XMP in place, which is the exact failure mode that has caught published authors out. This tool clears both.

Strip locally even if your venue uses OpenReview. OpenReview rewrites a small set of fields and will warn on others, but the safe assumption for every submission system is that whatever you upload is exactly what reviewers see. CMT and EasyChair preserve the file as uploaded. Stripping locally before submission is the only way to be certain.

Your file is not uploaded anywhere. The PDF is read via the FileReader API into a Uint8Array held in tab memory, parsed and rewritten by pdf-lib in place, then offered as a download via a Blob URL. The privacy badge above carries a live counter that watches fetch and XMLHttpRequest after page load; it stays at zero. Open browser DevTools, switch to the Network tab, and watch for yourself.

Acrobat’s Sanitize is more aggressive: it also removes form fields, scripts, attached files, hidden layers, and overlapping objects, which can break a paper that uses any of those features. This stripper targets the narrow case that desk-rejects double-blind submissions: the author-identifying metadata. Use Acrobat Sanitize when you need to scrub a confidential contract; use this when you need a clean submission PDF that still works.

Batch processing is not supported for now. The widget handles one file per pass to keep the verify-clean step unambiguous. Most double-blind submissions involve a single manuscript file plus a supplement, and each pass takes under a minute. If you have a batch of dozens, run them through Acrobat’s Action Wizard locally instead.