For researchers and industry

Rare disease patients generate detailed observations about their condition every day. Research has had almost no access to that data.

Wimly gives research teams and trial sponsors access to a continuously growing, structured, consent-governed dataset of patient-generated observations — the data from the many months between specialist appointments that clinical research has never been able to capture.

1. The problem

Rare disease research operates on incomplete data at every stage of a trial

Clinical registries collect structured data at scheduled timepoints, typically at clinic visits twice a year for most rare disease patients. Between those visits, patients continue to experience their disease. Symptoms shift, functional capacity changes, adaptations are tried. None of that reaches the research record.

The result is that rare disease research relies on periodic snapshots of a continuous process. Trials are designed, endpoints are chosen, and recruitment criteria are written around data that represents a fraction of what patients actually experience.

Before a trial starts: study designs built on limited baseline data

When a rare disease trial is being designed, the questions that matter most are hard to answer. What does functional decline actually look like in this patient population, week by week, rather than at six-month clinical assessments? Which patient-reported symptoms precede the endpoints that clinical scales measure? How do real patients experience the inclusion and exclusion criteria being considered?

Registries provide structured baseline data, but only at the points when patients are in clinic. The continuous disease experience, the variation that happens between assessments and the patterns that only emerge over months of daily observation, is absent. Study designs built without this context routinely miss what matters most to the patients they are meant to serve.

During a trial: 80% of rare disease trials are delayed by recruitment

Rare disease populations are small and geographically dispersed. Most potential participants are not known to any registry until after diagnosis, and many carry their disease for years before receiving a confirmed diagnosis. Finding them, confirming eligibility, and converting them to enrolled participants is the primary reason most rare disease trials run late.

Screen failure rates compound this problem. When inclusion and exclusion criteria are developed without access to real community data, the patients who arrive at screening frequently fail to meet them in ways that could have been anticipated. Testing study design against an engaged patient community before a trial opens changes that starting position.

After approval: demonstrating real-world effectiveness requires data that does not currently exist at scale

Payers and regulators increasingly require continuous, real-world outcome data to support reimbursement decisions and post-approval label expansion. Twice-yearly clinical assessments do not produce it. Value-based pricing contracts, where payment is linked to demonstrated real-world functional outcomes, require exactly the kind of continuous patient-reported observation record that no rare disease platform currently provides at scale.

For the first disease-modifying therapies now reaching rare disease markets, this gap is not theoretical. The evidence infrastructure required to support reimbursement decisions is being built from nothing, in real time.

2. The solution

A continuously compounding longitudinal dataset, and the pre-engaged community that builds it

Wimly is not a registry, a survey tool, or a symptom tracker. It is a structured longitudinal infrastructure that captures patient observations continuously, across the many months between clinical appointments, and organises them into research-ready data.

The architectural distinction that gives the dataset its scientific value is the same one that makes patients contribute consistently over years: every interaction returns something of value to the patient in the same session. A consultation summary ready for the next appointment. A result from the community knowledge library. A structured record of what they noticed. The platform earns its engagement. That sustained engagement is what produces longitudinal data that registries and survey instruments cannot replicate.

Six structured data streams from daily life, not clinic visits

Every observation entered into Wimly is domain-tagged, temporally anchored, and consent-stratified at the moment of entry. The dataset is built from six distinct streams, listed in full after the Paired Patient-Clinician Dataset section below.

A pre-engaged patient community for trial readiness and recruitment

Members of the Wimly community who have indicated willingness to participate in research receive information about relevant trials and can confirm interest through a structured consent pathway.

The practical consequence for trial design begins before a trial opens. A research team finalising eligibility criteria for a progressive rare disease study can run a feasibility query against the Wimly community before the protocol is locked. If the query shows that a proposed biomarker threshold would exclude the majority of patients at the disease stage being targeted, the protocol can be adjusted before a single patient is screened. That kind of feedback, available during study design rather than after enrolment begins, changes the economics and timelines of rare disease trials.
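As an illustration only, a feasibility query of this kind could be sketched as follows. The `Member` fields, the `feasibility` function, and the biomarker values are hypothetical and are not Wimly's research API; the sketch simply shows a query that returns aggregate counts for a proposed threshold without exposing any individual record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    disease_stage: str      # hypothetical stage label
    biomarker_level: float  # hypothetical biomarker value
    research_consent: bool  # member has indicated willingness to take part in research


def feasibility(members, stage, max_biomarker):
    """Return aggregate counts only: how many consenting members at the
    target stage would pass a proposed biomarker ceiling."""
    at_stage = [m for m in members if m.research_consent and m.disease_stage == stage]
    total = len(at_stage)
    eligible = sum(1 for m in at_stage if m.biomarker_level <= max_biomarker)
    excluded_share = (total - eligible) / total if total else 0.0
    return {"at_stage": total, "eligible": eligible, "excluded_share": excluded_share}


# Invented example community; values are for illustration, not real data.
community = [
    Member("early", 12.0, True),
    Member("early", 18.5, True),
    Member("early", 21.0, True),
    Member("early", 25.0, True),
    Member("late", 30.0, True),
    Member("early", 9.0, False),  # no research consent, so never counted
]

result = feasibility(community, stage="early", max_biomarker=15.0)
print(result)
```

In this invented example the proposed ceiling would exclude three of the four consenting early-stage members, which is the kind of signal that would prompt a protocol adjustment before screening begins.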

When a trial does open, the recruitment population is already engaged, already characterised by continuous observation data, and already assessed against criteria that were tested against real community experience. In comparable rare disease contexts, pre-engaged patient communities with trial-consent infrastructure in place have supported substantially shorter recruitment timelines and lower screen failure rates than registry-based approaches. These are projections based on industry benchmarks rather than claims established from Wimly's own trial experience.

The Paired Patient-Clinician Dataset

Wimly's most scientifically distinctive data product is the only one of its kind: a continuously growing dataset of structured paired observations from patients and the clinicians who care for them, generated from real clinical care rather than trial settings.

The paired structure captures four kinds of pair simultaneously: domain observation pairs, where patient-reported and clinician-noted data exist for the same functional domain and time period; signal-response pairs, showing what the patient flagged and how the clinical team responded; check-in response pairs; and consultation context pairs.

What makes this scientifically valuable is the discrepancy dimension. Suppose patients in a given disease community consistently rate their fatigue as more or less severe than their neurologists rate its functional impact, for the same individual in the same period. That directional gap is not noise. Across hundreds of paired patient-clinician observations over months or years, it becomes a structured finding about how self-reported and clinician-assessed experience relate in that disease. It has direct implications for how patient-reported outcome endpoints are designed in clinical trials. No existing data source captures this structure.
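A minimal sketch of how such a directional gap might be summarised is shown below. It assumes, purely for illustration, that patient and clinician ratings for the same domain and period have been mapped onto a shared 0 to 4 ordinal scale; the tuple layout, the scale, and the `directional_gap` function are hypothetical.

```python
from statistics import mean

# (patient_id, period, patient_rating, clinician_rating)
# Invented values on an assumed shared 0-4 ordinal scale.
pairs = [
    ("p1", "2024-01", 3, 2),
    ("p1", "2024-02", 4, 2),
    ("p2", "2024-01", 2, 2),
    ("p2", "2024-02", 3, 1),
]


def directional_gap(pairs):
    """Summarise the signed patient-minus-clinician difference across pairs."""
    diffs = [patient - clinician for _, _, patient, clinician in pairs]
    patient_higher = sum(1 for d in diffs if d > 0)
    return {
        "mean_gap": mean(diffs),
        "share_patient_higher": patient_higher / len(diffs),
    }


summary = directional_gap(pairs)
print(summary)
```

A mean gap that stays on the same side of zero across many pairs is the directional pattern described above; a real analysis would also need to account for repeated measures from the same individual.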

The six data streams in detail:

  1. Free-text narrative

    Conversationally elicited via AI-assisted intake and tagged to functional domains at the moment of entry. A patient noting that the stairs felt harder than usual produces a balance-domain entry, timestamped and linked to previous entries in that domain over the past 90 days.

  2. Shorthand qualifier responses

    Structured follow-up prompts producing ordinally encoded data. After the free-text entry, a question asks how much the difficulty affected the day. The response is encoded, timestamped, and designed for quantitative trend analysis across thousands of users contributing similar observations.

  3. Scheduled functional check-ins

    Patient-completed self-assessments at defined intervals, producing time-series data across functional domains. A fortnightly check-in covering fatigue, balance, and speech stacks into a continuous observation record visible to the patient's clinical team before each appointment.

  4. Life event records

    Named events in causal framing that anchor surrounding observation data for before-and-after analysis. A patient recording that they started a new treatment provides a temporal marker that allows the preceding and following observation record to be analysed as distinct periods.

  5. Positive capacity observations

    What patients document as working — adaptations, strategies, functional achievements. A patient noting they walked to the shops unassisted for the first time in six weeks is captured with the same structural rigour as a decline observation. This data type is almost entirely absent from existing rare disease datasets.

  6. Paired Patient-Clinician observations

    Structured paired data from patients and their clinical teams, generated from real clinical care. Described in detail in the sub-section above.
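To make the structure concrete, the sketch below shows one way a domain-tagged, timestamped, ordinally encoded observation could be represented, and how a life event record could split the surrounding observations into before and after periods. The `Observation` fields, the 0 to 4 scale, and the `before_after` helper are assumptions for illustration, not Wimly's schema.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class Observation:
    day: date     # temporal anchor
    domain: str   # functional domain tag, e.g. "balance"
    score: int    # ordinal qualifier, assumed 0 (none) to 4 (severe)


def before_after(observations, domain, event_day):
    """Mean score in one domain before and from a life event onward."""
    before = [o.score for o in observations if o.domain == domain and o.day < event_day]
    after = [o.score for o in observations if o.domain == domain and o.day >= event_day]
    return (mean(before) if before else None, mean(after) if after else None)


# Invented records for a single patient.
records = [
    Observation(date(2024, 3, 1), "balance", 3),
    Observation(date(2024, 3, 15), "balance", 3),
    Observation(date(2024, 3, 20), "fatigue", 4),
    Observation(date(2024, 4, 5), "balance", 2),
    Observation(date(2024, 4, 19), "balance", 1),
]

treatment_started = date(2024, 4, 1)  # hypothetical life event
mean_before, mean_after = before_after(records, "balance", treatment_started)
print(mean_before, mean_after)
```

A difference between the two periods is descriptive only; it does not by itself show that the recorded event caused the change.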

What Wimly sells, and what it does not

Wimly does not sell data. It sells structured, consent-managed research access. Every patient consents individually, per study, to what is shared and for what purpose. The research dataset, the clinical dataset, the community dataset, and the commercial dataset are architecturally separated, not as a policy commitment but as a structural constraint built into the data infrastructure from the outset.

Research partners access aggregate statistics and approved de-identified data exports through a research API with consent-stratified filtering. Feasibility queries return cohort sizes and data availability without exposing any individual data. Formal data sharing agreements govern all approved exports, with full audit logging of every access event.
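The access pattern described above could look something like the following sketch, in which a query is filtered by per-study consent, returns only a cohort size, and writes an audit entry. The record layout, the `cohort_size` function, and the study and requester identifiers are all hypothetical.

```python
from datetime import datetime, timezone

# Invented records: each carries the set of studies the member has consented to.
records = [
    {"member": "m1", "domain": "fatigue", "consented_studies": {"study-A"}},
    {"member": "m1", "domain": "balance", "consented_studies": {"study-A"}},
    {"member": "m2", "domain": "fatigue", "consented_studies": {"study-A", "study-B"}},
    {"member": "m3", "domain": "fatigue", "consented_studies": set()},
]

audit_log = []


def cohort_size(records, study_id, domain, requester):
    """Count distinct consenting members with data in a domain.
    Only the count leaves this function; every call is logged."""
    matching = {
        r["member"]
        for r in records
        if study_id in r["consented_studies"] and r["domain"] == domain
    }
    audit_log.append({
        "requester": requester,
        "study": study_id,
        "domain": domain,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return len(matching)


size = cohort_size(records, "study-A", "fatigue", requester="research-team-1")
print(size, len(audit_log))
```

Here the member with no consent on file is never counted, and the caller receives a number rather than any record.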

3. The result

Research questions that could not be answered before, because the data did not exist

The dataset makes tractable questions that rare disease research has been unable to approach. Three examples illustrate the kind of analysis the infrastructure is designed to support.

Does patient-reported fatigue precede clinician-detected functional decline, and if so, by how many days? With daily observation records from hundreds of patients over two or more years, temporal lag analysis between patient-flagged fatigue entries and contemporaneous clinical notes becomes possible for the first time. A research team studying this relationship would have a structured, timestamped dataset designed specifically for that analysis, rather than the retrospective reconstruction from clinical notes that such questions currently require.
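As a simplified sketch of the lag calculation, the example below takes the date of each patient's first fatigue flag and the date of the first clinician note of functional decline, and computes the lag in days. The patient identifiers, dates, and the `lags_in_days` helper are invented for illustration; a real analysis would work from full time series rather than first events.

```python
from datetime import date
from statistics import median

# Invented dates: first patient-flagged fatigue entry per patient.
first_patient_flag = {
    "p1": date(2024, 1, 10),
    "p2": date(2024, 2, 1),
    "p3": date(2024, 3, 5),
}

# Invented dates: first clinician note of functional decline per patient.
first_clinician_note = {
    "p1": date(2024, 2, 9),
    "p2": date(2024, 2, 15),
    "p3": date(2024, 4, 14),
}


def lags_in_days(flags, notes):
    """Days from patient flag to clinician note, for patients with both."""
    return {pid: (notes[pid] - flags[pid]).days for pid in flags if pid in notes}


lags = lags_in_days(first_patient_flag, first_clinician_note)
median_lag = median(lags.values())
print(lags, median_lag)
```

A positive lag means the patient's own record flagged fatigue before the clinical note recorded decline.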

Are there functional trajectory subtypes within a rare disease population that differ by reported intervention patterns? Trajectory clustering across longitudinal Wimly records, by disease stage, reported adaptation, and functional domain, opens questions that annual questionnaire data cannot power.

Do systematic differences exist between what patients report and what their clinicians observe, across disease stages and functional domains? The Paired Patient-Clinician Dataset makes this a structured research question rather than an inference from two separately administered instruments.

The regulatory landscape increasingly accommodates this kind of evidence. The FDA's Real-World Evidence Framework sets out how the agency evaluates real-world evidence in support of regulatory decisions, including new indications for approved drugs. The European Medicines Agency's DARWIN EU initiative creates a federated structure for cross-border health data use. Wimly's consent architecture and data governance are designed for compatibility with these frameworks from the outset.

A note on clinical validity: demonstrating that continuous patient-generated observations correlate with established, validated clinical assessment scales is a Year 1 research priority for Wimly. The initial validation study will be anchored in the SCA community, where validated clinical outcome scales are well-established and widely used, and will set the methodological framework for correlation analysis across other disease communities as the platform expands. Research partnerships established in this phase have the most influence over how that validation work is designed, and the most time to benefit as the longitudinal record grows.

Start a research partnership conversation

Wimly is here.

Write to us directly. We will respond within two working days.

Find out what research access would look like for your work

A short self-assessment that helps identify which aspects of the Wimly dataset and research partnership model are most relevant to your work. Takes about two minutes.

Begin self-assessment