Who we are
This Privacy Policy is published by Renav Limited (“PDFLM”, “we”, “us”), registered in England and Wales (company number 08758164) with its registered address at 449 Honeypot Lane, Stanmore, England HA7 1JJ.
PDFLM provides a WordPress plugin and a managed API at api.pdflm.com that let our customers index PDF documents and embed an AI chat widget on their websites. This policy explains what data we process, why, who else sees it, and what choices you have.
Throughout this policy we use the word tenant for the website owner who buys a PDFLM subscription, and end user for an anonymous visitor who types a question into the chat widget on a tenant's website. The two roles get different protections because they have different relationships with us.
Our two roles (Controller and Processor)
Under UK GDPR and EU GDPR, PDFLM plays two distinct legal roles depending on which data is in question. The distinction is not cosmetic — it changes who is responsible for what.
We are the Controller of:
- Tenant account data — signup email, organisation name, billing details, API usage logs, server access logs. We decide the purposes and means of this processing. The rights and obligations described in this policy apply to us as Controller for this data.
We are a Processor for:
- Indexed PDF content and end-user question text. The tenant is the Controller of this data: they decide what to index and what data appears in their visitors' questions. We process it on their behalf according to their instructions, as expressed through their use of the API.
What this means in practice: tenants are responsible for having a lawful basis to process the personal data inside the PDFs they upload (consent, contract, legitimate interest, etc.), and for telling their own end users that their questions are processed by an AI service. Data subject requests about PDF content (deletion, access, correction) reach us through the tenant, who instructs us as their Processor.
A formal Data Processing Agreement (DPA) under Article 28 UK GDPR is available to all paid customers and is incorporated automatically into the service contract for paying tenants. Free-trial tenants may request a DPA at any time. See section 16 for how to request one.
What data we collect and store
We try to keep this list specific. Vague privacy policies are a regulatory risk signal, not a comfort.
From tenants (the website owners)
- Email address — used as the login identifier and for service communications. Retained for the lifetime of the account plus 30 days after deletion.
- Organisation name — displayed in admin tools. Same retention.
- Website URL — informational only; we never fetch or validate the URL. Same retention.
- Hashed API key — the API key shown at signup is hashed (one-way) before storage. We cannot recover the original; if lost, a new key must be issued.
- Billing identifiers from Stripe — Stripe customer ID, subscription ID, and the last four digits of the payment card. We never see or store full card numbers, CVV codes, full billing addresses, or any other PCI-scope payment data. Retained for 7 years from the last transaction, as required by HMRC under UK tax law.
- API usage logs — for each query: tenant ID, question text (up to 500 characters), response time, top match score, input/output token counts, and estimated cost. Retained for 12 months for billing reconciliation, security audit, and accuracy regression detection.
- Server access logs — IP address, user agent, requested endpoint, timestamp. Retained for 30 days for security investigations and operational troubleshooting, then automatically deleted.
From indexed PDFs (we are Processor, not Controller)
- Full text content — extracted by our PDF parser and stored as ~300-token chunks in our vector database, plus searchable copies in our keyword index. Retained until the tenant deletes the document or, if the account is cancelled, for 30 days after cancellation.
- Embeddings — numerical vector representations of each chunk, stored alongside chunks in the vector database. Same retention as the underlying content.
- Page-level summaries — short AI-generated summaries of each PDF page, stored as additional chunks. Same retention.
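For readers who want a concrete picture of what "~300-token chunks" means, here is a minimal sketch. It assumes whitespace-separated words as a stand-in for tokens, and the function name `chunk_text` is hypothetical. It illustrates the general idea only and does not describe our production parser or tokenizer.

```python
from typing import List


def chunk_text(text: str, max_tokens: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens words.

    Assumption: whitespace-separated words stand in for tokens.
    A real tokenizer would count differently.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in windows of max_tokens words.
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

Under these assumptions, a 650-word page would be stored as three chunks, each of which gets its own embedding.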
From end users (anonymous visitors)
- Question text — the literal words a visitor types into the chat widget. Retained for 12 months as part of usage logs (above), for billing and accuracy review.
- Conversation history within a single session — kept in the visitor's browser sessionStorage and sent back to us with subsequent questions for context. Never stored on our servers across sessions; cleared automatically when the visitor closes the widget or navigates away.
What we explicitly do not collect
We think it's worth stating the absences as positively as the presences.
From end users (chat widget visitors)
- No names, email addresses, phone numbers, or postal addresses.
- No IP addresses. We do not log visitor IPs for chat queries. (Our marketing site at pdflm.com uses standard server logs that include IPs, as described in section 3.)
- No cookies set by the chat widget, ever. No tracking pixels, no browser fingerprinting, no third-party analytics SDKs.
- No location data, no device identifiers, no browsing history, no ad-targeting data.
- No account or login data — unless the tenant explicitly configures the widget to forward an authenticated user ID. That's the tenant's choice and their responsibility under their own privacy notice.
Card and payment data
We never see or store full card numbers, CVV codes, or full billing addresses. Payment instruments are held by Stripe; we receive only an opaque customer reference and the last four digits of the card.
Biometric, voice, and special category data
We do not collect biometric data, voice recordings, or device sensor data from end users.
If a tenant uploads PDFs that contain special category data under Article 9 UK GDPR (health, religion, ethnicity, sexual orientation, political views, etc.), the tenant is responsible for the lawful basis for processing it. Our handling as their Processor is incidental and technical; we do not analyse or surface this data outside the chat-response context the tenant has configured.
On the marketing site
- No third-party advertising cookies, no Google Analytics, no Meta Pixel.
- Privacy-respecting analytics only, configured to anonymise IPs (the current provider is named in our Cookies Policy).
- Marketing emails are sent only to people who have opted in explicitly; we don't use bought or shared contact lists.
Why we process it (lawful basis)
For each kind of processing where we are the Controller, we rely on one of the lawful bases set out in Article 6 UK GDPR:
- Account data — signup, login, billing. Performance of a contract (Article 6(1)(b)). Necessary to provide the service the tenant signed up for.
- API usage logs and server / security logs. Legitimate interest (Article 6(1)(f)) in operating, securing, and improving the service. We've balanced this against tenants' interests; the logs contain minimal personal data and are retained only as long as needed.
- Marketing emails (where they exist). Consent (Article 6(1)(a)) — always opt-in, with an easy unsubscribe link in every email.
- Tenant-uploaded PDF content. We are Processor here, not Controller. The tenant is responsible for the lawful basis; our processing follows their instructions under the DPA.
Who else processes data on our behalf
We use a small number of third parties to operate the service. They fall into three groups depending on how sensitive the data they touch is.
Named explicitly
These services materially process either content or money. We name them up front because procurement teams will ask anyway and we'd rather pre-empt the question.
| Service | Purpose | Data sent | Location |
|---|---|---|---|
| OpenAI | Embeddings, chat generation, OCR, page summarisation | Indexed PDF text, user questions, conversation history | USA |
| Pinecone | Vector storage for indexed documents | Chunked PDF text and embeddings | USA (AWS us-east-1) |
| Stripe | Subscription billing, hosted checkout, customer portal, webhook events | Customer email, organisation name, payment details (last 4 only — full card never reaches us), subscription state | USA / EU (SCCs in place) |
| Resend | Transactional email delivery (welcome, password reset, payment-failed, trial-ending reminders, internal waitlist notifications) | Recipient email address, email body content (e.g. your reset link, your trial expiry date) | USA (SCCs in place) |
| Supabase | PostgreSQL hosting for customer accounts, audit log, and authentication state | Email, hashed password, organisation name, plan + subscription state, Stripe customer IDs, audit log | UK (AWS eu-west-2, London region) |
| Vercel | Hosts pdflm.com (the marketing site, customer dashboard, signup flow) and the serverless functions behind it | All HTTP requests + responses; Vercel Analytics (cookieless) for anonymous pageview counts | EU edge with US fallback (SCCs in place) |
| Railway | Hosts api.pdflm.com (the Python backend that performs PDF indexing and answer generation) | API requests from your WordPress plugin, indexed PDF content, query logs | USA (AWS us-east-1) |
OpenAI does not train its models on data submitted via its API — this is stated in OpenAI's API terms as of 2025. Your indexed documents and visitor questions are not used to train any AI model. See section 7 for the one important caveat about OpenAI's short-term retention.
Disclosed by category
These are commodity infrastructure providers. We describe them by function so that we're free to swap vendors without rewriting this policy — but every one is bound by a data processing agreement requiring GDPR-equivalent protections.
- Cloud hosting providers — run our backend servers and primary database.
- Error monitoring services — receive automated error reports. Configured to exclude personal data from stack traces.
- Email delivery providers — send transactional emails (signup confirmations, billing receipts, password resets).
- Cross-encoder reranking providers — optionally process user questions alongside document excerpts to improve answer relevance. Only used when enabled by tenant configuration.
Sub-processor list available on request
A complete, current list of all sub-processors — including the categorical providers above — is available on request at support@pdflm.com. We will notify customers by email at least 30 days before adding any new processor that handles substantive personal data, so that they have a chance to object before the change takes effect.
OpenAI — an honest disclosure
OpenAI retains API request data, including the questions and document content we send for processing, for up to 30 days to monitor for misuse, after which it is automatically deleted. OpenAI does not train its models on data submitted via its API (per OpenAI's API terms). This retention is the most common privacy objection from sophisticated buyers, so we state it explicitly here rather than leave it buried.
For tenants who require zero data retention by OpenAI, an enterprise tier using OpenAI's Zero Data Retention agreement is available on request.
International data transfers
Some of our processors are located outside the UK and EEA, primarily in the USA — OpenAI, Pinecone, Stripe, and our hosting provider. Where personal data crosses those borders, we rely on one or more of the safeguards permitted by Articles 44–49 UK GDPR:
- EU-US Data Privacy Framework (DPF) and its UK Extension (the UK-US data bridge) for processors that are DPF-certified (including Stripe and most major US cloud providers).
- Standard Contractual Clauses (SCCs) approved by the European Commission, together with the UK Addendum issued by the Information Commissioner's Office, for transfers to processors that are not DPF-certified.
- Supplementary technical safeguards — TLS 1.2+ in transit and AES-256 at rest.
A breakdown of which transfer mechanism applies to each named processor is available with the sub-processor list at support@pdflm.com.
How long we keep data
We keep different categories of data for different lengths of time. Here's the table — concrete, not the usual “as long as necessary”.
| Data category | Retention period | Reason |
|---|---|---|
| Account data (email, org name, website URL, hashed API key) | Lifetime of account + 30 days post-cancellation | Recovery window in case the tenant reactivates |
| Billing data (Stripe IDs, card last-4, invoice records) | 7 years from last transaction | HMRC requirement — UK tax law |
| API usage logs (question text, response time, token counts) | 12 months | Billing reconciliation, accuracy review |
| Server access logs (IP, user agent, endpoint) | 30 days | Security and operational troubleshooting |
| Indexed PDF content (chunks, embeddings, page summaries) | Until tenant deletes it, or 30 days post-cancellation | Provide the service the tenant signed up for |
| Error reports (Sentry-style) | 90 days, then auto-purged | Debug regressions without keeping data indefinitely |
| OpenAI temporary retention (out of our control) | Up to 30 days at OpenAI | Abuse monitoring by OpenAI — see section 7 |
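As an illustration of how the fixed-length rows in this table translate into purge dates, here is a minimal sketch. The category labels and function names are hypothetical, and 12 months is approximated as 365 days. Rows that depend on an event (account cancellation, last transaction, tenant deletion) are left out because they need that event date as input. This is not a description of our actual purge jobs.

```python
from datetime import date, timedelta

# Fixed-length retention periods from the table above, in days.
# The keys are hypothetical labels, not real field or table names.
RETENTION_DAYS = {
    "api_usage_log": 365,      # 12 months, approximated as 365 days
    "server_access_log": 30,
    "error_report": 90,
}


def purge_date(category: str, created: date) -> date:
    """Return the first date on which a record is due for deletion."""
    return created + timedelta(days=RETENTION_DAYS[category])


def is_expired(category: str, created: date, today: date) -> bool:
    """True once the retention period for the record has fully elapsed."""
    return today >= purge_date(category, created)
```

For example, under these assumptions a server access log line written on 1 January 2025 would be due for deletion on 31 January 2025.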
Your rights
UK GDPR and EU GDPR give you the following rights. They apply to personal data we hold about you as a Controller. For data we process as Processor (indexed PDF content), the rights are exercised through the tenant who is the Controller of that data.
- Right of access — receive a copy of all personal data we hold about you.
- Right to rectification — have inaccurate data corrected.
- Right to erasure (“right to be forgotten”) — have your data deleted, subject to legal retention requirements like HMRC's 7-year rule.
- Right to restrict processing — pause processing while a complaint is being resolved.
- Right to data portability — receive your data in a structured, machine-readable format (JSON export of account data and indexed-document URLs).
- Right to object — to processing based on legitimate interest, including direct marketing.
- Right to withdraw consent — for any processing where consent was the lawful basis (e.g. marketing emails).
- Right not to be subject to automated decision-making with legal or similarly significant effects. We don't make decisions about you using automation alone — there's a human in the loop for any account-impacting decision.
To exercise any of these rights, email support@pdflm.com with a description of the request. We respond within 30 days (extendable to 60 days for complex requests, with notice).
Right to lodge a complaint
If you believe we have processed your personal data unlawfully, you have the right to lodge a complaint with a supervisory authority:
- UK: Information Commissioner's Office (ico.org.uk).
- EU: the data protection authority of your country of residence — a directory is at edpb.europa.eu.
We'd appreciate the chance to address your concerns first — you can write to us at support@pdflm.com — but you are not required to contact us before going to a supervisory authority.
If a data breach happens
In the event of a personal data breach that is likely to result in a risk to the rights and freedoms of data subjects, we will notify the ICO within 72 hours of becoming aware of the breach, in line with Article 33 UK GDPR. Where the breach is likely to result in a high risk to data subjects, we will also notify affected tenants and end users without undue delay (Article 34 UK GDPR), with a clear description of the nature of the breach, its likely consequences, and the remedial steps we've taken.
Children's data
PDFLM is not directed at children under 16 (the default age of digital consent under EU GDPR) or under 13 (the age that applies under UK data protection law and US standards). We do not knowingly collect personal data from children directly.
If a tenant indexes content that contains children's personal data, the tenant is responsible — under their own privacy policy — for having appropriate consent (typically from a parent or guardian) before doing so. If you believe we have inadvertently collected children's data, please contact us at support@pdflm.com and we will delete it.
How we protect your data
Our concrete security measures, in order of how much they matter:
- TLS 1.2+ for all data in transit. HTTPS only, with HSTS enforced on both the marketing site and the API.
- AES-256 encryption at rest for all databases and vector stores.
- API keys are hashed using a one-way cryptographic hash function. We cannot recover lost API keys; you regenerate them.
- Access to production systems is restricted by single sign-on plus two-factor authentication, with audit logging.
- Regular dependency vulnerability scanning against our backend code and infrastructure.
- Annual third-party security review (planned for the first 12 months after general availability; tenants who need evidence of an independent review before then should ask us).
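To show why a hashed API key cannot be recovered, here is a minimal sketch of one-way key storage. It assumes SHA-256 as the hash function, which this policy does not specify, and the function names are hypothetical. It illustrates the general technique and is not our production code.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> str:
    """Create a random key that is shown to the tenant once, at signup."""
    return secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    """One-way hash of the key; only this value would be stored.

    Assumption: SHA-256. The policy does not name the hash function.
    """
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)
```

Because only the output of `hash_api_key` is kept, a lost key cannot be looked up; a new key has to be generated and its hash stored in place of the old one.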
Data Processing Agreement
A Data Processing Agreement (DPA) under Article 28 UK GDPR is available to all paid customers. For paying tenants, the DPA is incorporated automatically into the service agreement — you don't need to do anything extra. Free-trial tenants may request a counter-signed copy at any time.
To request a copy of the current DPA, email support@pdflm.com and we'll send the latest version within one business day.
Records of processing
We maintain a record of processing activities as required by Article 30 UK GDPR. This record is available to supervisory authorities on request.
Changes to this policy
We may update this policy as the service evolves or as the law requires. For any material change (for example, adding a new explicitly named processor, lengthening a retention period, or introducing a new category of data we collect), we will notify tenants by email at least 30 days before the change takes effect.
Non-material clarifications (typo fixes, link updates, structural reorganisation) may be made without notice. The “Last updated” timestamp at the top of this page always reflects the current version.
Contact us
For privacy questions, deletion requests, DPA copies, or anything else covered in this policy, email support@pdflm.com.
Postal: Renav Limited, 449 Honeypot Lane, Stanmore, England HA7 1JJ.