Privacy Policy
Plain-language summary: We log every API request you send so we can (1) keep the service running, (2) build private, internal datasets we use to evaluate and benchmark LLM performance, and (3) — only if you opt in — train and fine-tune our models, which unlocks premium models for you. Opting in to training is voluntary and reversible, but content already incorporated into a trained model cannot be removed from that model (see §2c). We do not sell, license, or otherwise distribute your data — or any derivative of it — to anyone. We may publish benchmark results (model rankings, scores, methodology) and open-source the benchmark code, but the underlying dataset built from your prompts is never published or shared. IP addresses, User-Agent strings, and HTTP headers are used only for security and operations. Best-effort PII scrubbing runs on every request at ingest, but it is not perfect — don't submit content you'd regret us retaining.
1. What We Collect
We collect data about every API request. The categories below describe what is received by our infrastructure. At ingest, request and response content is run through an automated best-effort PII scrubber (Microsoft Presidio + spaCy NER with our custom recognizers) before it is written to the trace store. Best-effort means the scrubber runs on every request but is not guaranteed to catch every piece of identifying information — free-form text written by humans in formats the recognizers miss can slip through. See §3a for detail and the important caveat.
The categories of data we collect are:
- Account data: Your username, hashed password (bcrypt), API key hash (SHA-256), and your API key in plain text. We also store the exact timestamp at which you accepted the Terms of Service and your current evaluation-dataset and model-training preferences.
- Request content (scrubbed at ingest, best-effort): The body of your API request — messages, prompts, system instructions, and parameters — after it has passed through our automated PII scrubber. The raw pre-scrub body is not persisted to the trace store; it exists only transiently in memory while the request is routed to the upstream provider.
- Response content (scrubbed at ingest, best-effort): The response from the LLM — generated text, token usage, and metadata — after it has passed through the same scrubber.
- Network information: Your IP address and X-Forwarded-For headers. Retained for up to 90 days for security and abuse-prevention purposes, then blanked from logs (see §4). Used only for security and operations; never included in any evaluation dataset (see §3a).
- Client information: User-Agent string and HTTP headers (excluding the Authorization header itself). Same 90-day retention and the same exclusion from evaluation datasets as network information.
- Request metadata: Timestamps, latency measurements, the model requested, the provider used, streaming status, and error information.
2. How We Use Your Data
We use collected data for three distinct purposes:
2a. Service operation (always — no opt-out)
- Routing your request to the appropriate LLM provider
- Service monitoring, analytics, and performance optimization
- Security, fraud detection, and abuse mitigation
- Statistical reporting (aggregated, non-identifying)
- Compliance with legal obligations
Our lawful basis for these activities under GDPR is legitimate interest (Article 6(1)(f)). You cannot opt out of service operation while continuing to use the Service — but the data used for these purposes is never sold or licensed.
2b. Internal evaluation datasets (no third-party distribution)
Post-scrub request and response content may be incorporated into private, held-out datasets that we use internally to evaluate, compare, and benchmark the performance of the LLMs available through the Service. These datasets are confidential and internal — they are not sold, licensed, published, or otherwise disclosed to any third party. We may publish aggregate benchmark results (model scores, rankings, methodology) and may release the evaluation harness as open source; neither discloses the underlying user content. Lawful basis (GDPR): legitimate interest (Art 6(1)(f)) — measuring and improving the quality of the inference we provide. You may object at any time (Art 21); see §5. Evaluation datasets are built from post-scrub content only, are never distributed, and are subject to access controls and a defined retention period.
2c. Model training & fine-tuning (opt-in only)
If — and only if — you opt in at /consent, your post-scrub request and response content may also be used to train and fine-tune the models we operate. Opting in unlocks our premium models (typically newer or larger); declining or withdrawing simply means premium models are unavailable to you, with no effect on the standard (free) tier. Lawful basis (GDPR): consent (Art 6(1)(a)). You may withdraw consent at any time (Art 7(3)) via /consent, which stops future training use and revokes premium access. Content is only eligible for training if you had opted in at the time the request was logged and you have not since withdrawn consent: we tag each logged request with your training-consent state at that moment, and our training-set selection additionally re-checks your current preference, so withdrawing consent removes all of your content — past and future — from any future training run. The unconditional field exclusions in §3a (IPs, headers, account identifiers, sub-day timestamps) apply to training sets too.
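The two-part eligibility check described above (the consent tag recorded when the request was logged, plus a re-check of your current preference) can be sketched as follows. This is a minimal illustration only, not our production selection code: the LoggedRequest type, its field names, and the current_consent lookup are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedRequest:
    user_id: str               # hypothetical field name
    content: str               # post-scrub content only
    consent_at_log_time: bool  # tag recorded when the request was logged


def select_for_training(records, current_consent):
    """Keep a record only if both consent checks pass.

    current_consent maps user_id -> the user's present training preference;
    a missing entry is treated as "not opted in".
    """
    return [
        r for r in records
        if r.consent_at_log_time and current_consent.get(r.user_id, False)
    ]


records = [
    LoggedRequest("alice", "prompt A", True),
    LoggedRequest("alice", "prompt B", False),  # logged while not opted in
    LoggedRequest("bob", "prompt C", True),     # bob has since withdrawn
]
current_consent = {"alice": True, "bob": False}
selected = select_for_training(records, current_consent)
```

In this sketch only "prompt A" is selected: "prompt B" fails the log-time check and "prompt C" fails the current-preference check.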
Irreversibility — read this before opting in. Training incorporates information from your content into a model's parameters (weights). Unlike a row in a dataset, this cannot be selectively deleted: withdrawing consent or deleting your account stops all future training use and removes your logs and your content from future training sets, but it cannot remove your content's influence from a model that was already trained, and trained models can in rare cases reproduce fragments of their training data. Models we train on opted-in content are used internally to operate the Service; we still never sell, license, or distribute your data, your content, or those models' training data to any third party. If you are not comfortable with this permanence, do not opt in — the standard tier does not require it.
3a. How We Handle and Retain Logged Content
We apply the following steps to logged content. These commitments are part of this Privacy Policy and create a binding obligation; failure to follow them would be a violation of these terms. Nothing in this section should be read as a warranty that the output is perfectly de-identified — see the caveat at the end of this section.
- Removed entirely from every evaluation dataset, unconditionally: IP addresses, X-Forwarded-For headers, every non-content HTTP header, User-Agent strings, account identifiers (username, API key, API key hash, account-linked request IDs), error messages that may name internal systems, and timestamps below day-level granularity. These fields are excluded from evaluation datasets regardless of any preference; only post-scrub content is ever eligible for inclusion.
- Best-effort PII removal from request and response content: We run an automated pipeline based on Microsoft Presidio and spaCy en_core_web_lg NER, with custom recognizers for email addresses, phone numbers, postal addresses, government-issued identifiers (SSN, AU TFN, AU Medicare, AU ABN), payment card numbers, API keys and tokens, and URL-embedded credentials. Detected PII is replaced with numbered placeholders (e.g., <PERSON_1>) that preserve conversational coherence. The original values are not retained in any mapping table outside the single request's processing session, so placeholders are not reversible by recipients. This pipeline is best-effort only: NER models miss names, emails, and phone numbers in unusual formats or languages, and any free-form text written by a human can contain identifying content (writing style, context, references to specific people or events) that no automated system will catch.
- Aggregation: Where the dataset use case allows, individual prompts are aggregated to further reduce the risk of re-identification.
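To illustrate the numbered-placeholder step, here is a simplified sketch that uses only the Python standard library. It is not the Presidio and spaCy pipeline itself: a single, deliberately narrow email regex stands in for the real recognizers, and the <EMAIL_n> placeholder label and the scrub_emails function name are assumptions made for the example.

```python
import re

# Deliberately simple stand-in pattern; real recognizers cover far more cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_emails(text):
    """Replace each distinct email address with a numbered placeholder.

    The same address always maps to the same placeholder within one call,
    which keeps the conversation readable. The mapping is a local variable,
    so it is discarded when the call returns.
    """
    mapping = {}

    def replace(match):
        value = match.group(0)
        if value not in mapping:
            mapping[value] = f"<EMAIL_{len(mapping) + 1}>"
        return mapping[value]

    return EMAIL_RE.sub(replace, text)


scrubbed = scrub_emails(
    "Mail a@example.com, cc b@example.org, then a@example.com again."
)
missed = scrub_emails("reach me at a at example dot com")
```

Here scrubbed reads "Mail <EMAIL_1>, cc <EMAIL_2>, then <EMAIL_1> again.", while the obfuscated address in missed passes through unchanged, which is the kind of gap the best-effort wording refers to.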
Important caveat (read this): We use commercially reasonable efforts and standard industry tooling (Presidio + spaCy NER + custom recognizers, score threshold 0.7) to minimize residual identifiability, but we do not represent, warrant, or guarantee that the output is anonymous, de-identified, or impossible to re-associate with you. We use the phrase "best-effort anonymized" everywhere in this Policy specifically to avoid that overclaim. If you submit content you would not want retained internally even after this pipeline runs — for example, anything you would be uncomfortable seeing surfaced in an internal benchmark — opt out of evaluation-dataset use at /consent and your content will be excluded from evaluation datasets (and do not opt in to model training, which is off by default).
Pre-v2 data: Content collected before May 2, 2026 was collected under our prior Terms of Service. This historical data is retained for internal service operation and quality improvement only. It is internal-only, is never sold, licensed, or distributed to any third party, and is excluded from any evaluation dataset that informs published benchmark results.
3b. Data Sharing
We share data with the following categories of third parties:
- LLM providers: Your raw, pre-scrub request content is sent to the upstream LLM provider routing your request (e.g., OpenRouter or Together), which is necessary to generate a response. This is the only category of recipient that ever receives un-scrubbed content, and only for the duration of routing the request. Each provider has its own privacy policy that may apply to the data it receives.
- Law enforcement: If required by applicable law or valid legal process. We will challenge overbroad requests where lawful and notify affected users where legally permissible.
We may publish aggregate benchmark results and may open-source the benchmark harness/code. We never share, sell, or publish the underlying dataset of your content.
4. Data Retention
- Account data (username, password hash, API key hash, evaluation-dataset and model-training preferences) is retained while your account is active.
- Identifying logs (IP addresses, X-Forwarded-For headers, User-Agent strings, account-linked metadata) are retained for at most 90 days for security and operational purposes. After 90 days, an automated job blanks the source_ip and user_agent columns on the corresponding request_logs rows and re-runs the PII scrubber (best-effort) over the associated trace file lines to remove residual identifiers.
- Scrubbed request/response content may be retained for service operation and internal evaluation as described in §2 and §3a. "Scrubbed" means it has passed through our best-effort PII pipeline; it does not mean perfectly anonymized.
- On account deletion: Your account data and identifying logs are deleted within 30 days, and your content is removed from internal evaluation datasets and from any future training sets. Because we do not distribute data to third parties, there is no external copy outside our control. One exception applies if you opted in to training: content already incorporated into a model's weights cannot be removed from that already-trained model (see §2c). Deletion removes your data everywhere it can be removed; it cannot un-train a model.
- DSAR audit log (records of access, deletion, and opt-in/opt-out requests) is retained indefinitely without user content, for compliance audit purposes.
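As an illustration of the 90-day blanking of identifying log columns described in this section, the sketch below runs against an in-memory SQLite table. The request_logs table and the source_ip and user_agent columns are named in this Policy; the created_at column, the use of NULL as the blanked value, the ISO-8601 UTC timestamp format, and the function name are assumptions made for the example. It does not cover re-running the scrubber over trace files.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90


def blank_expired_identifiers(conn, now):
    """Blank source_ip and user_agent on rows older than the retention window.

    Returns the number of rows changed. Timestamps are assumed to be stored
    as ISO-8601 UTC strings in one format, so string comparison orders them
    correctly.
    """
    cutoff = (now - timedelta(days=RETENTION_DAYS)).isoformat()
    cur = conn.execute(
        "UPDATE request_logs SET source_ip = NULL, user_agent = NULL "
        "WHERE created_at < ? "
        "AND (source_ip IS NOT NULL OR user_agent IS NOT NULL)",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_logs (id INTEGER PRIMARY KEY, created_at TEXT, "
    "source_ip TEXT, user_agent TEXT)"
)
now = datetime(2026, 6, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO request_logs VALUES (?, ?, ?, ?)",
    [
        (1, (now - timedelta(days=120)).isoformat(), "203.0.113.7", "curl/8.0"),
        (2, (now - timedelta(days=10)).isoformat(), "198.51.100.4", "python-httpx"),
    ],
)
blanked = blank_expired_identifiers(conn, now)
```

Only the 120-day-old row is blanked; the 10-day-old row keeps its values until it ages past the window, and running the job again changes nothing.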
5. Your Rights
Regardless of jurisdiction, you may at any time:
- Access: Request a copy of personal data we hold about you. Self-service via dashboard → "Download My Data", or email support@logfare.ai.
- Deletion: Request deletion of your account and associated data. Self-service via dashboard → "Delete My Account", or email us.
- Opt out of evaluation-dataset use. Self-service via /consent. Your access is unaffected.
- Opt in or out of model-training use. Self-service via /consent. Opting in unlocks premium models; opting out (or never opting in) leaves the standard tier fully available. Subject to the training irreversibility caveat in §2c.
- Rectification: Request correction of inaccurate account data. Email support@logfare.ai.
We respond to verifiable DSAR requests within 30 days (45 days under CCPA). We may require verification (e.g., signing a request with your active API key or otherwise confirming control of the account) before fulfilling requests.
5a. Additional Rights for EU/EEA/UK Residents (GDPR)
If you are in the European Economic Area or the United Kingdom, you also have the right to:
- Restriction of processing (Art. 18)
- Data portability — receive your data in a structured, machine-readable format (Art. 20)
- Object to processing based on legitimate interest (Art. 21)
- Withdraw consent for processing based on consent, at any time (Art. 7(3))
- Lodge a complaint with your data protection supervisory authority
Lawful bases: We rely on legitimate interest (Art. 6(1)(f)) for service operation, security, and internal evaluation-dataset creation, and on consent (Art. 6(1)(a)) for the optional use of your content to train and fine-tune models (the premium tier). You may withdraw that consent at any time (Art. 7(3)) without affecting the lawfulness of processing carried out before withdrawal.
International transfers: Logfare's infrastructure is hosted outside the EEA. Where we transfer EEA personal data internationally, we rely on Standard Contractual Clauses (SCCs) or other transfer mechanisms recognized by the European Commission.
5b. Additional Rights for California Residents (CCPA / CPRA)
If you are a California resident, you have the right to:
- Right to know: What categories of personal information we have collected, used, sold, or shared in the last 12 months.
- Right to delete: Subject to specific exceptions enumerated in §1798.105(d).
- Right to correct inaccurate personal information.
- We do not sell or share your personal information as those terms are defined under CCPA §1798.140. The /do-not-sell page reflects this.
- Notice of financial incentive: our optional premium tier is a financial incentive under CCPA §1798.125(b) — we offer a different level of service (access to premium models) in exchange for your opt-in to use your post-scrub content for model training. Full terms in §5b-1 below.
- Right to limit use of sensitive personal information.
- Right to non-discrimination for exercising any of these rights. Offering the premium tier as a financial incentive is permitted under §1798.125(b) and is not discriminatory; participation is voluntary and the standard tier is unaffected if you decline.
5b-1. Notice of Financial Incentive (CCPA §1798.125(b))
The program. Opting in to model-training use of your post-scrub content unlocks access to our premium models. How to opt in: toggle it on at /consent. How to withdraw: toggle it off at the same place at any time — premium access ends and no further training use occurs. Participation is entirely voluntary, and the standard (free) tier is fully available whether or not you participate.
Material terms & good-faith value estimate. There is no price difference between the tiers — both are free; the only difference is which models you may call. We are neither paying nor charging you. We estimate the value of the data made available to us by an opted-in user to be reasonably related to, and not to exceed, the incremental cost of providing premium-model inference to that user, calculated by reference to our marginal inference cost for those models. We do not assign a per-record monetary value to your content and we do not sell it; its only value to us is as internal training signal for the models we operate.
5c. Additional Rights for Australian Residents (Privacy Act 1988)
If you are an Australian resident, you have the right to:
- Access personal information we hold about you (APP 12)
- Correct inaccurate or incomplete personal information (APP 13)
- Lodge a complaint with the Office of the Australian Information Commissioner (OAIC)
We process some categories of regulated personal information (including any TFN, ABN, or Medicare numbers that may appear in prompts despite our prohibition on submitting such data — see ToS §7). Our PII pipeline attempts to detect these with custom recognizers on a best-effort basis; detected matches are removed at ingest. We do not warrant that every regulated identifier is detected, and you should not submit such data in the first place.
6. Security
We implement reasonable technical and organizational measures to secure stored data, including bcrypt password hashing, SHA-256 API key hashing, TLS in transit, and access controls on the trace store. However, no system is perfectly secure. We make no guarantees about the security or integrity of collected data; use the Service at your own risk.
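For readers who want to see what SHA-256 API key hashing looks like in practice, here is a minimal standard-library sketch. It is illustrative only and is not our production code: the function names and the key format are hypothetical. An unsalted fast hash is generally considered acceptable for long, randomly generated keys, whereas passwords call for a slow, salted scheme such as bcrypt.

```python
import hashlib
import hmac
import secrets


def hash_api_key(api_key):
    """Return the hex SHA-256 digest of an API key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key, stored_hash):
    """Hash the presented key and compare digests in constant time."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)


api_key = secrets.token_urlsafe(32)  # hypothetical key format
stored_hash = hash_api_key(api_key)
is_valid = verify_api_key(api_key, stored_hash)
```

The constant-time comparison avoids leaking, through response timing, how many leading characters of a guessed digest were correct.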
7. Children
The Service is not directed to children under 18 (or under 16 in the EEA). We do not knowingly collect data from children under these ages. If you are a parent and believe your child has created an account, contact support@logfare.ai for immediate deletion.
8. International Users
Data collected through the Service may be stored and processed in any country where we or our service providers operate. By using the Service, you consent to the transfer of your data to jurisdictions that may not provide the same level of data protection as your home jurisdiction, subject to the safeguards described in §5a.
9. Changes to This Policy
We may update this Privacy Policy from time to time. For material changes that affect your rights — for example, expanding the categories of data collected, adding new categories of data recipients, or changing how we use logged content — we will provide at least 30 days' advance notice via email or a prominent notice on the site, and we will not retroactively apply the new terms to data collected before the change.
10. Contact
For privacy-related questions, DSAR requests, or any other data-related matters, please contact support@logfare.ai or reach out via the Logorhythms Discord server.