We collect data from trustworthy public and proprietary sources, clean and validate it using both modern AI-assisted tools and manual checks, and produce clear, reproducible estimates and insights tailored for business leaders.
Our guiding principles
- Accuracy first. Numbers and forecasts must be defensible — not guesses dressed up as facts.
- Transparent methods. We explain sources, assumptions and models so clients can verify results.
- Human-in-the-loop. AI speeds work; experts ensure the outputs make sense.
- Ethical sourcing. We follow legal rules and industry best practice for data collection and use.
- Reproducible work. Key steps and code (where applicable) are archived so results can be audited.
The process (overview)
- Scoping & hypothesis. Define the research question, geographies, product/service boundaries, and key metrics (e.g., revenue, units, penetration).
- Data collection. Combine primary research (interviews, surveys) with secondary research (public filings, databases, news, academic papers, government data, paid datasets) and validated web data.
- Data processing & enrichment. Clean, deduplicate, standardise and enrich records. Use AI tools for entity extraction and document parsing — with manual verification.
- Modeling & analysis. Apply the appropriate model (bottom-up, top-down, hybrid), test scenarios, and run sensitivity checks.
- Validation & QA. Triangulate across independent sources, check internal consistency, and run statistical validation.
- Delivery & transparency. Deliver findings with clear assumptions, data sources, and methodology notes so clients can verify or reproduce our work. (A sketch of how these stages fit together follows this list.)
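To make the workflow concrete, here is a minimal sketch of how the stages connect. The function names, signatures and stub bodies are ours for illustration only; the real pipeline is study-specific.

```python
# A minimal sketch of the workflow above; illustrative, not production code.

def scope(question: str) -> dict:
    """Define the research question, geographies, boundaries and metrics."""
    return {"question": question, "geographies": [], "metrics": []}

def collect(study: dict) -> list[dict]:
    """Gather primary and secondary records for the scoped question."""
    return []  # interviews, filings, databases, licensed and web data

def process(records: list[dict]) -> list[dict]:
    """Clean, deduplicate, standardise and enrich; AI-assisted, human-checked."""
    return records

def model(records: list[dict]) -> dict:
    """Apply a bottom-up, top-down or hybrid model with scenario runs."""
    return {"base": None, "conservative": None, "upside": None}

def validate(estimates: dict, records: list[dict]) -> dict:
    """Triangulate across sources and run consistency and statistical checks."""
    return estimates

def deliver(estimates: dict) -> None:
    """Package findings with assumptions, sources and methodology notes."""
    print(estimates)

study = scope("hypothetical market, hypothetical region, 2025")
records = process(collect(study))
deliver(validate(model(records), records))
```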
What sources we use
- Primary research: Structured interviews with industry leaders, surveys of end users, on-the-ground checks. These are used to validate assumptions and fill gaps.
- Authoritative secondary sources: Company annual reports, regulatory filings, investor decks, government statistics (e.g., customs, trade, health agencies), and reputable databases (see the complete list of our sources).
- Proprietary & licensed datasets: Where needed we license market and transaction data from trusted vendors.
- Public web data & news: Scraped or API-collected data only where permitted — all scraping respects terms of service and copyright.
- AI-indexed knowledge: We use AI to surface and summarise long or noisy documents, but never rely on an LLM as the sole source of truth.
How we use AI — practical, safe, audited
AI accelerates parsing, classification and hypothesis generation. Typical uses:
- Automatic extraction of figures from PDFs and presentations.
- Entity matching (company names, SKUs, jurisdictions).
- Topic clustering and summarisation of large text corpora.
All AI outputs go through human review. We log AI prompts, model versions, and reviewer notes so every automated step is auditable.
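As an illustration of what that log contains, a minimal sketch of one audit record might look like the following. The schema and field names are invented for this example, not a fixed production format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AIStepRecord:
    """One auditable automated step: what was asked, of which model,
    and who signed off. Field names are illustrative."""
    task: str                # e.g. "extract revenue figures from a filing"
    model_name: str          # which model ran the step
    model_version: str       # pinned version, so the step is reproducible
    prompt: str              # the exact prompt sent to the model
    output: str              # raw model output, before human correction
    reviewer: str            # analyst who checked the output
    reviewer_notes: str      # what was accepted, corrected or rejected
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

record = AIStepRecord(
    task="extract FY2023 revenue from an annual report",
    model_name="example-llm",
    model_version="2024-06-01",
    prompt="List every revenue figure with its fiscal year and currency.",
    output="FY2023 revenue: EUR 412m",
    reviewer="analyst_a",
    reviewer_notes="Cross-checked against the filing; accepted.",
)
```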
Modeling & forecasting approach
We pick models to match the problem and the available data:
- Bottom-up (usage / penetration): Build market size from unit counts, installed base, or addressable users.
- Top-down (share & revenue): Start from industry totals (e.g., national spend) and allocate by segment.
- Hybrid: Combine both when neither alone is sufficient.
We use standard statistical tools (time-series, regression, diffusion/S-curve for adoption) and run scenario analysis (base, conservative, upside) to show sensitivity to key assumptions.
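As a worked example of the bottom-up calculation, combined with scenario ranges and an S-curve adoption assumption, consider the sketch below. Every figure in it is invented for illustration; real studies derive these inputs from sourced data.

```python
# Bottom-up sizing with scenario analysis and a logistic (S-curve)
# adoption path. All numbers are invented for illustration.
import math

def logistic_adoption(year: float, saturation: float, midpoint: float,
                      rate: float) -> float:
    """Penetration reached by `year` under an S-curve adoption path."""
    return saturation / (1.0 + math.exp(-rate * (year - midpoint)))

def bottom_up_size(addressable_users: float, penetration: float,
                   annual_spend_per_user: float) -> float:
    """Market size = addressable users x penetration x annual spend per user."""
    return addressable_users * penetration * annual_spend_per_user

ADDRESSABLE_USERS = 4_000_000   # hypothetical addressable base
SCENARIOS = {                   # (saturation, midpoint year, adoption rate, spend)
    "conservative": (0.25, 2029, 0.5, 95.0),
    "base":         (0.35, 2028, 0.6, 110.0),
    "upside":       (0.45, 2027, 0.8, 125.0),
}

for name, (sat, mid, rate, spend) in SCENARIOS.items():
    pen = logistic_adoption(2026, sat, mid, rate)
    size = bottom_up_size(ADDRESSABLE_USERS, pen, spend)
    print(f"{name:>12}: penetration {pen:.1%}, market ~${size/1e6:,.0f}M/yr")
```

Running all three scenarios side by side makes the sensitivity to the adoption and spend assumptions visible at a glance, which is exactly what the scenario analysis is for.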
Data cleaning, validation & provenance
- Standardisation: Normalise dates, currencies, units and taxonomy (a sketch of these checks follows this list).
- De-duplication & entity resolution: Merge records that refer to the same company or product.
- Triangulation: We require at least two independent, credible sources for key numbers wherever possible.
- Outlier handling: Flag and investigate outliers rather than automatically removing them.
- Provenance: Every published table or chart links to the underlying data source(s) and records when it was collected.
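To illustrate, here is a minimal sketch of the standardisation, triangulation and outlier checks. The exchange rates, tolerances and helper names are assumptions made for this example, not our production pipeline.

```python
from statistics import median

FX_TO_USD = {"USD": 1.0, "EUR": 1.09, "GBP": 1.27}  # illustrative rates; dated in practice

def normalise_amount(value: float, currency: str) -> float:
    """Standardise monetary figures to a single reporting currency."""
    return value * FX_TO_USD[currency]

def triangulate(estimates: list[float], tolerance: float = 0.15) -> bool:
    """Pass only if at least two figures sit within `tolerance` of their median."""
    if len(estimates) < 2:
        return False
    mid = median(estimates)
    close = [e for e in estimates if abs(e - mid) <= tolerance * mid]
    return len(close) >= 2

def flag_outliers(values: list[float], k: float = 3.0) -> list[float]:
    """Flag (never silently drop) points far from the median, using the
    median absolute deviation, for manual investigation."""
    mid = median(values)
    mad = median(abs(v - mid) for v in values) or 1e-9
    return [v for v in values if abs(v - mid) / mad > k]

# Three sourced revenue figures for the same company, in local currencies.
figures = [normalise_amount(v, c) for v, c in [(400, "EUR"), (520, "USD"), (410, "GBP")]]
print("triangulated:", triangulate(figures))      # True: two figures agree closely
print("to investigate:", flag_outliers(figures))  # the EUR-based figure stands out
```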
Quality assurance & expert review
- All deliverables pass multi-step QA: data engineer checks, analyst review, and subject-matter expert (SME) validation.
- For specialised topics we consult independent experts and include their perspectives where they materially affect estimates.
- We keep an audit trail of decisions so clients can see why a number changed between versions (a sketch of such a log entry follows).
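A minimal sketch of one such decision-log entry; the schema and field names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class RevisionNote:
    """Why a published number changed between report versions."""
    metric: str          # e.g. "hypothetical segment revenue, FY2024"
    old_value: str
    new_value: str
    reason: str          # what new evidence or correction drove the change
    sources: list[str]   # citations supporting the new value
    approved_by: str     # reviewer who signed off on the revision

note = RevisionNote(
    metric="hypothetical segment revenue, FY2024",
    old_value="USD 410M",
    new_value="USD 436M",
    reason="Replaced a vendor estimate with an audited annual-report figure.",
    sources=["Company X annual report 2024"],
    approved_by="sme_reviewer",
)
```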
Deliverables you receive
- Executive summary with key takeaways.
- Dataset appendix that lists sources and collection dates.
- Methodology appendix with model details, assumptions and scenario settings.
- Raw data table (when licensing allows) or a clear description of why data cannot be shared.
- Short technical note describing AI tools used and the human review process.
Limitations & ethics
- Visibility is limited in some areas, such as private-company revenues and illicit markets. We state where uncertainty is higher and quantify it with ranges.
- We never claim impossible precision. When data are sparse, we present conservative, transparent ranges and the assumptions behind them.
- We comply with privacy law, copyright, and contractual restrictions on licensed data.
Why this matters to you
You get numbers that are actionable and defensible — useful for investment decisions, product strategy, go-to-market planning, and board discussions. Every claim is backed by sources and a clear explanation of how the number was derived.
Want the technical appendix?
On request, we produce a short technical appendix for each study that lists:
- Source list (public and licensed), collection dates and sample sizes;
- Models and formulae used;
- AI models and versions used for document processing;
- Audit trail of key decisions.