Representative Use Cases

Three domain-specific scenarios demonstrate DocMaster's semantic filtering pipeline and structure-aware RAG over large document collections.

Domain: Legal
Corpus: Employment agreements, NDAs, vendor contracts, SLAs
N: 1,200
Expected output: ~85 docs
Semantic Filter modes: Tree, Hyperedge, Combined (each reports a match count out of 1,200)
Sample documents (each tagged per mode: Tree, Hyper, Comb):
  • vendor_agreement_acme.pdf
  • employment_nda_ny.pdf
  • sla_contract_texas.pdf
  • consulting_agreement_fl.pdf
  • employment_ca_standard.pdf
+ 1,195 more documents
Observation: Hyperedge captures consulting_agreement_fl.pdf, which Tree traversal misses because the non-compete duration sits in an appendix rather than under a top-level clause heading.
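The difference between the two modes can be sketched on a toy document. This is an illustration under stated assumptions, not DocMaster's implementation: the `Section` class, `tree_match`, `build_hyperedges`, and `hyperedge_match` are hypothetical names, the matching is plain substring search, and the sample contract text is invented.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Section:
    heading: str
    text: str
    children: list[Section] = field(default_factory=list)


def subtree_text(section: Section) -> str:
    """Concatenate the text of a section and all of its descendants."""
    return " ".join([section.text] + [subtree_text(c) for c in section.children])


def tree_match(doc: list[Section], heading_kw: str, body_kw: str) -> bool:
    """Assumed Tree mode: only descend into top-level sections whose heading
    mentions heading_kw, then look for body_kw inside that subtree."""
    return any(
        heading_kw in s.heading.lower() and body_kw in subtree_text(s).lower()
        for s in doc
    )


def build_hyperedges(doc: list[Section], terms: list[str]) -> dict[str, set[str]]:
    """Assumed Hyperedge mode: one hyperedge per term, linking every section
    (at any depth) whose heading or text mentions that term."""
    edges: dict[str, set[str]] = {t: set() for t in terms}
    stack = list(doc)
    while stack:
        s = stack.pop()
        for t in terms:
            if t in s.heading.lower() or t in s.text.lower():
                edges[t].add(s.heading)
        stack.extend(s.children)
    return edges


def hyperedge_match(doc: list[Section], terms: list[str]) -> bool:
    """A document matches if every term has a non-empty hyperedge."""
    return all(build_hyperedges(doc, terms).values())


# Invented example: the duration lives in an appendix, not under a clause heading.
doc = [
    Section("Services", "Consultant will provide advisory services."),
    Section("Fees", "Fees are payable monthly in arrears."),
    Section("Appendix A", "The non-compete period is 24 months."),
]

tree_result = tree_match(doc, "non-compete", "months")
hyper_result = hyperedge_match(doc, ["non-compete", "months"])
print(tree_result, hyper_result)
```

In this sketch the heading-driven traversal prunes "Appendix A" because its heading never mentions the non-compete, while the hyperedge index still links the appendix to both terms.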
Follow-up Q&A over 85 filtered documents
Q: Summarize key risk patterns across these contracts.
A: Most contracts combine long non-compete durations (18–24 months) with employer-favorable jurisdictions (NY, TX). Key patterns:
  • Broad scope — “industry-wide” restrictions rather than role-specific, increasing enforceability risk
  • Extended duration — 60% specify 18+ month non-compete periods
  • Jurisdiction stacking — multiple non-CA jurisdictions referenced to maximize enforceability
Sources: vendor_agreement_acme.pdf §8.2; employment_nda_ny.pdf Cl.12; sla_contract_texas.pdf App.B
Domain: Finance
Corpus: Earnings call transcripts, 10-Ks, analyst reports
N: 500
Expected output: ~40 docs
Semantic Filter modes: Tree, Hyperedge, Combined (each reports a match count out of 500)
Sample documents (each tagged per mode: Tree, Hyper, Comb):
  • acme_corp_10K_2024.pdf
  • globaltech_earnings_q3.pdf
  • midwest_mfg_analyst.pdf
+ 497 more documents
Observation: Tree misses midwest_mfg_analyst.pdf, where margin decline appears in the financial summary and supply chain risks appear in a separate section. Hyperedge links the evidence across sections.
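The cross-section case can be sketched as a conjunction whose conditions are satisfied in different sections. This is a minimal illustration, assuming substring matching and a flat heading-to-text mapping; `single_section_match` and `cross_section_evidence` are hypothetical helpers, and the section text is invented rather than taken from the report.

```python
from typing import Optional


def single_section_match(sections: dict[str, str], conditions: list[str]) -> bool:
    """True only if some single section satisfies every condition by itself."""
    return any(
        all(c in text.lower() for c in conditions) for text in sections.values()
    )


def cross_section_evidence(
    sections: dict[str, str], conditions: list[str]
) -> Optional[dict[str, list[str]]]:
    """Return {condition: [headings that satisfy it]} when every condition is
    met somewhere in the document, otherwise None."""
    evidence = {
        c: [h for h, text in sections.items() if c in text.lower()]
        for c in conditions
    }
    return evidence if all(evidence.values()) else None


# Invented example: the two conditions are met in different sections.
report = {
    "Financial Summary": "Gross margin declined year over year.",
    "Supply Chain": "Supply chain disruptions raised freight costs.",
}
conditions = ["margin declined", "supply chain"]

per_section = single_section_match(report, conditions)
evidence = cross_section_evidence(report, conditions)
print(per_section, evidence)
```

Returning the headings per condition, rather than a bare boolean, keeps the evidence available for citation in a follow-up answer.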
Follow-up Q&A over 40 filtered documents
Q: What are the main drivers behind margin decline?
A: Dominant drivers across the matched filings:
  • Logistics cost increases — freight and shipping up 15–30% YoY
  • Supplier concentration — single-source reliance causing price volatility
  • Inventory delays — excess stock from demand misforecasting
Several firms report inability to pass costs to consumers due to weak demand.
Sources: acme_corp_10K_2024.pdf, Risk Factors; globaltech_earnings_q3.pdf, MD&A
Domain: Academic
Corpus: ML/NLP papers (ACL, NeurIPS, ICML)
N: 10,000
Expected output: ~120 papers
Semantic Filter modes: Tree, Hyperedge, Combined (each reports a match count out of 10,000)
Sample documents (each tagged per mode: Tree, Hyper, Comb):
  • efficient_rag_acl2024.pdf
  • sparse_dense_hybrid_neurips.pdf
  • adaptive_retrieval_icml.pdf
+ 9,997 more documents
Observation: Tree misses adaptive_retrieval_icml.pdf, where the RAG method is described in the approach section and the efficiency comparison against dense retrieval appears in the experiments section. Hyperedge links the evidence across sections.
Follow-up Q&A over 120 filtered papers
Q: What techniques are commonly used to improve efficiency?
A: Common techniques across the matched papers:
  • Hierarchical indexing — multi-level structures to reduce search space
  • Sparse-dense hybrid — BM25 + dense vectors for better speed-accuracy tradeoff
  • Query routing — directing queries to specialized sub-indices
  • Adaptive triggering — reducing retrieval frequency; compressing retrieved contexts
Sources: efficient_rag_acl2024.pdf §3; sparse_dense_hybrid_neurips.pdf §4.2
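The sparse-dense hybrid idea from the answer above can be illustrated with reciprocal rank fusion (RRF), one common way to merge a BM25 ranking with a dense-vector ranking. This is a sketch only: the matched papers may fuse scores differently, the function name is hypothetical, and the two rankings are invented stand-ins for real retriever output.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id for determinism.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Invented rankings standing in for a sparse (BM25) and a dense retriever.
bm25_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales; k = 60 is a conventional default rather than a tuned value.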

Try DocMaster

Upload your own documents and run semantic filters with structure-aware RAG.
