Durable knowledge that compounds
Every agent built on Forge contributes back to the ground truth. The substrate gets richer with use, not noisier. Six months in, you don't have a dataset — you have an institutional reasoning layer.
ASTRA Forge turns enterprise knowledge — messy, distributed, unstructured — into AI-ready ground truth that agents can actually reason over. Not retrieved passages; reasoning paths.
Most production AI doesn't fail at the model. It fails at the ground truth feeding it. Knowledge lives in SharePoint, Confluence, PDFs, email threads, ticket histories, and in people's heads. When those people leave or change roles, the institutional reasoning leaves with them.
Critical reasoning lives in the team's heads, not in retrievable systems. The day a senior person leaves, your AI loses access to the most valuable training context you had.
Every AI initiative rebuilds the same data plumbing from scratch. Six months later you have three pilots, three pipelines, and no unified ground truth.
Vector search finds passages that look relevant. For a chatbot answering “what's our refund policy”, that's enough. For an agent investigating a multi-step regulatory question, it's not.
Retrieval-Augmented Generation is the right baseline for most question-answering. It's not the right foundation for agentic reasoning in high-stakes domains.
Documents get chunked for embedding. The relationships between facts in the same document — or across documents — are lost the moment you slice them up for vector retrieval.
Two passages can be semantically similar and logically unrelated. An agent that retrieves “supplier X is in region Y” and “region Y has sanctions risk” cannot conclude “supplier X is at sanctions risk” — RAG returns chunks, not inferences.
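The sanctions example above can be sketched as one composition rule over typed triples. This is a minimal illustration under stated assumptions, not Forge's reasoning engine: the relation names `located_in`, `has_risk`, and `exposed_to`, and the helper `infer_exposure`, are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def infer_exposure(facts: set[Triple]) -> set[Triple]:
    """Apply one rule: located_in(s, r) and has_risk(r, k) => exposed_to(s, k)."""
    located = [t for t in facts if t.relation == "located_in"]
    risks = [t for t in facts if t.relation == "has_risk"]
    return {
        Triple(loc.subject, "exposed_to", risk.obj)
        for loc in located
        for risk in risks
        if loc.obj == risk.subject  # join on the shared region
    }


# Illustrative facts; "supplier Z" has no risk attached to its region.
facts = {
    Triple("supplier X", "located_in", "region Y"),
    Triple("region Y", "has_risk", "sanctions"),
    Triple("supplier Z", "located_in", "region W"),
}
derived = infer_exposure(facts)
print(derived)
```

The conclusion comes from joining two typed facts on a shared entity. Similarity scoring over text chunks has no such join.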
Each hop in an agentic reasoning loop is another roll of the dice. By the third retrieval, the agent's working context is fragmented. Hallucinations compound. Confidence in any one answer drops.
RAG doesn't know that “drug X” and “contraindication Y” are typed entities with a “hasContraindication” relation. It sees them as nearby text. For clinical, financial, or regulatory reasoning, that distinction is the difference between a working agent and a liability.
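To make "typed entities with a typed relation" concrete, here is a small sketch of a schema check. It assumes a hypothetical schema table and `check_edge` helper; only the `hasContraindication` relation name comes from the text above.

```python
from dataclasses import dataclass

# Hypothetical schema: relation name -> (required subject type, required object type).
SCHEMA = {"hasContraindication": ("Drug", "Contraindication")}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str


def check_edge(subject: Entity, relation: str, obj: Entity) -> bool:
    """Return True only if the relation is declared and both endpoint types match it."""
    expected = SCHEMA.get(relation)
    return expected is not None and (subject.type, obj.type) == expected


drug = Entity("drug X", "Drug")
contra = Entity("contraindication Y", "Contraindication")

valid = check_edge(drug, "hasContraindication", contra)
reversed_edge = check_edge(contra, "hasContraindication", drug)
unknown = check_edge(drug, "mentionedNear", contra)  # undeclared relation
print(valid, reversed_edge, unknown)
```

A schema like this rejects a reversed or undeclared edge before it reaches the graph. Text that merely sits nearby carries no such guarantee.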
None of this is RAG's fault. RAG is the right foundation for the right job. It's not the right foundation for the agentic, multi-hop, schema-aware reasoning enterprises now need from production AI.
Knowledge-Augmented Generation integrates a structured knowledge graph with a logical reasoning engine alongside the LLM. Retrieval returns reasoning paths anchored in a typed schema — not isolated text chunks.
KAG was introduced by Ant Group's Knowledge Graph Team and Zhejiang University (peer-reviewed, arXiv:2409.13731), with the open-source OpenSPG implementation. ASTRA Forge is a KAG-class platform built on the same grounded-generation research lineage.
RAG: Vector similarity over chunked text. No schema, no constraints. Multi-hop reasoning fragmented across turns; relationships between facts are not modelled.
KAG: Logical-form-guided traversal over a typed knowledge graph. Schema-aware, constraint-respecting. Persistent reasoning substrate across agent turns.
Peer-reviewed result
KAG outperforms RAG on multi-hop and professional-domain QA, with reported F1 improvements of 19.6–33.4%.
Ant Group / OpenSPG, peer-reviewed 2024
Microsoft's GraphRAG is the closest adjacent work — graph-community-aware retrieval. KAG goes further: a dedicated reasoning engine that performs inference before passing results to the LLM.
For deep analysis and investigation — financial counterparty risk, clinical decision support, supply-chain disruption, regulatory reasoning — RAG alone is not enough. KAG matters when agents must reason across typed relationships and domain constraints.
ASTRA Forge takes enterprise knowledge from raw to AI-ready in five governed stages. Each stage is observable, audited, and built to enterprise scale. Security and compliance are designed in — not bolted on.
Connects to SharePoint, Confluence, network shares, ticketing, CRMs, knowledge bases. Handles PDFs, Word, slides, structured data, audio transcripts.
Connector list illustrative; specifics confirmed per engagement. Identity, access control, and audit are present from the ingestion edge — not bolted on at the end of the pipeline.
Deduplicates, version-resolves, removes stale content. Classifies sensitive data. Routes confidential material through approved-access paths only.
PII detection, sensitivity labels, and routing rules are configured per the enterprise's security policy — not generic defaults.
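The curation steps described above (version resolution, deduplication, sensitivity routing) might look roughly like the sketch below. Everything here is an assumption for illustration: the `Doc` type, the `curate` helper, the "open"/"restricted" labels, and the crude email regex standing in for a real PII detector.

```python
import hashlib
import re
from dataclasses import dataclass

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # crude stand-in for a PII detector


@dataclass(frozen=True)
class Doc:
    doc_id: str
    version: int
    text: str


def curate(docs: list[Doc]) -> dict[str, list[Doc]]:
    """Keep the latest version per id, drop content duplicates, route by sensitivity."""
    latest: dict[str, Doc] = {}
    for doc in docs:
        if doc.doc_id not in latest or doc.version > latest[doc.doc_id].version:
            latest[doc.doc_id] = doc

    seen: set[str] = set()
    routed: dict[str, list[Doc]] = {"open": [], "restricted": []}
    for doc in sorted(latest.values(), key=lambda d: d.doc_id):
        digest = hashlib.sha256(doc.text.strip().lower().encode()).hexdigest()
        if digest in seen:
            continue  # same normalised content already kept under another id
        seen.add(digest)
        label = "restricted" if EMAIL.search(doc.text) else "open"
        routed[label].append(doc)
    return routed


docs = [
    Doc("policy", 1, "Refunds within 14 days."),
    Doc("policy", 2, "Refunds within 30 days."),
    Doc("policy-copy", 1, "refunds within 30 days. "),
    Doc("contact", 1, "Escalate to jane.doe@example.com"),
]
routed = curate(docs)
print({label: [d.doc_id for d in kept] for label, kept in routed.items()})
```

In practice the detection rules and routing labels would follow the enterprise's own security policy, as the text above states. The regex here only marks where that policy would plug in.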
Extracts entities, relationships, ontologies. Builds knowledge graphs where domain shape matters. Decomposes long documents into retrievable units with preserved context.
This is the stage where ASTRA Forge becomes KAG-class. Structured entity extraction, typed relations, and schema design — not just embedding into a vector store. The knowledge graph is the substrate for the reasoning paths the agent will later traverse.
Produces high-quality, retrieval-ready knowledge — vector indexes, knowledge graphs, decision rule sets, structured SOPs. Every artefact traces back to its source.
Hybrid retrieval is the state of the art: vector indexes for semantic match, knowledge graph for relationship traversal, structured rule sets for constraint-respecting inference. Forge produces all three from the same governed pipeline.
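A toy sketch of the three-part hybrid described above: a vector match picks the entry point, typed graph edges are traversed from it, and a rule set constrains which edges may be followed. The embeddings, graph, allowed-relation rule, and `hybrid_retrieve` function are all illustrative assumptions. A real deployment would use an embedding model and a graph store.

```python
import math

# Toy data; every name and number here is illustrative.
EMBEDDINGS = {
    "supplier X": [0.9, 0.1],
    "region Y": [0.2, 0.8],
    "refund policy": [0.1, 0.1],
}
GRAPH = {
    "supplier X": [("located_in", "region Y"), ("mentioned_with", "refund policy")],
    "region Y": [("has_risk", "sanctions")],
}
ALLOWED_RELATIONS = {"located_in", "has_risk"}  # hypothetical rule set


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def hybrid_retrieve(query_vec, hops=2):
    # 1. Vector index: choose the most similar entity as the entry point.
    seed = max(EMBEDDINGS, key=lambda name: cosine(query_vec, EMBEDDINGS[name]))
    # 2. Knowledge graph: follow typed edges outward from the seed.
    path, frontier = [], [seed]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for relation, target in GRAPH.get(node, []):
                # 3. Rule set: skip edges the rules do not permit.
                if relation not in ALLOWED_RELATIONS:
                    continue
                path.append((node, relation, target))
                next_frontier.append(target)
        frontier = next_frontier
    return seed, path


seed, path = hybrid_retrieve([1.0, 0.0])
print(seed, path)
```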
Enforces access policy, retention, residency. Every retrieval is logged. Every artefact carries its provenance. Compliance-grade audit trails by default.
Governance is not a final stage — it is enforced at every prior stage and made queryable here. Audit logs are immutable and human-readable. Residency rules respect jurisdiction; access controls respect identity.
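One common way to make a retrieval log tamper-evident is to hash-chain its entries. The sketch below assumes that technique purely for illustration. The source does not say how Forge implements immutability, and the `POLICY` table, `AuditLog` class, `retrieve` helper, and artefact id `sop-17` are hypothetical.

```python
import hashlib
import json

# Hypothetical policy: which roles may read which sensitivity label.
POLICY = {"analyst": {"open"}, "compliance": {"open", "restricted"}}


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks it."""
        prev = "genesis"
        for entry in self.entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def retrieve(log: AuditLog, role: str, artefact: str, label: str) -> bool:
    """Check access by role, and log the attempt whether or not it is allowed."""
    allowed = label in POLICY.get(role, set())
    log.append({"role": role, "artefact": artefact, "allowed": allowed})
    return allowed


log = AuditLog()
first = retrieve(log, "analyst", "sop-17", "restricted")
second = retrieve(log, "compliance", "sop-17", "restricted")
intact = log.verify()
log.entries[0]["record"]["allowed"] = True  # simulate tampering with history
tampered = log.verify()
print(first, second, intact, tampered)
```

Denied requests are logged as well as granted ones, so the trail records who tried to read what, not only who succeeded.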
Four capabilities, all enterprise-scale, all in production from day one.
Built to process knowledge at organisation scale — not a desktop RAG tool.
Identity, access control, audit, residency, encryption — not bolted on.
Beyond vector retrieval — entities, relationships, and decision rules. This is the KAG-class capability surfaced explicitly.
One investment in AI-readiness; many agents and applications consume it.
When the knowledge layer is durable, the AI layer compounds. Every new agent, every new investigation, every new application reuses the same governed ground truth — not a per-project rebuild. The institution gets smarter; the agents get better; the dependency on any one person's retention gets weaker.
Multi-hop reasoning path
Agentic thinking-trace · KAG substrate
Each step is a typed relation. The agent doesn't guess the connection — the graph provides it. That's the difference between an answer the agent retrieved and an answer the agent can defend.
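A defensible reasoning path can be pictured as a list of typed edges, each carrying its source. The sketch below uses a plain breadth-first search over a handful of illustrative edges. The entities, relation names, source file names, and `find_path` helper are all hypothetical.

```python
from collections import deque

# Illustrative edges: (subject, relation, object, source document).
EDGES = [
    ("supplier X", "located_in", "region Y", "contract-2023.pdf"),
    ("region Y", "has_risk", "sanctions", "risk-register.xlsx"),
    ("supplier X", "supplies", "part 7", "erp-export.csv"),
]


def find_path(start, goal):
    """Breadth-first search returning the typed edges that link start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for edge in EDGES:
            if edge[0] == node and edge[2] not in visited:
                visited.add(edge[2])
                queue.append((edge[2], path + [edge]))
    return None  # no typed path exists, so there is nothing to assert


trace = find_path("supplier X", "sanctions")
for subject, relation, obj, source in trace:
    print(f"{subject} -[{relation}]-> {obj}  (source: {source})")
```

When no path exists the function returns `None` rather than a guess, which is the behaviour the paragraph above describes.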
Multi-hop investigations. Constraint-respecting decisions. Schema-aware reasoning paths. The kind of work agentic AI is supposed to do — supported by the substrate it's supposed to do it on.
When senior team members move on, the institutional reasoning stays. Their tacit knowledge becomes documented relationships in the graph. Onboarding shortens. Departure no longer drops the organisation's IQ.
Stop building per-project data pipelines. Build one ground truth that every agent uses.
ASTRA Forge is built on grounded-generation research from AIFT (the Laboratory for AI-Powered Financial Technologies), ASTRA's parent R&D lab. AIFT is the only FinTech research laboratory recognised by InnoHK, the Hong Kong SAR Government's flagship innovation initiative. It was co-founded by CityU, Columbia, and Tsinghua, and has an R&D bench of 60+ engineers. It shares the research lineage that KAG-class systems were born from.
KAG isn't a framework we adopted from a paper. It is the same grounded-generation research direction AIFT has been working in for years. The deep-dive you've just read is engineering practice with research behind it — not vendor marketing.
When the implementation needs research-grade muscle, we have it. Architecture decisions in your Forge engagement are made by senior engineers with direct access to the AIFT bench — not associates working from a playbook.
AIFT is the only FinTech research laboratory recognised by InnoHK — Hong Kong SAR Government's flagship innovation programme. Co-founded by City University of Hong Kong, Columbia University, and Tsinghua University. Operates from Hong Kong Science Park.
Bring the ground truth in
Tell us what your agents need to reason over, where the knowledge lives today, and what the production target is. We'll sketch the Forge engagement shape and the architecture for your domain.
Forge team