The Problem: Information Overload in Investment Research
As a quantamental investor managing a multi-asset portfolio, I constantly synthesize information from disparate sources: proprietary research notes, SEC filings, earnings transcripts, real-time market news, and sector analysis. The challenge isn't finding information—it's finding the right information at the right time and reasoning about it coherently.
Traditional search fails here. Keyword matching misses semantic intent. Manual research doesn't scale. I needed a system that could:
- Store and retrieve my proprietary research semantically
- Augment with real-time web intelligence
- Reason across both sources to generate actionable insights
The Solution: A Quantamental RAG Agent
I built a retrieval-augmented generation (RAG) agent that combines three best-in-class components:
| Component | Technology | Purpose |
|---|---|---|
| Embeddings | Voyage AI voyage-3-large | Semantic understanding of financial text |
| Vector Store | Qdrant | Fast retrieval of internal documents |
| Web Search | Exa | Real-time market intelligence |
| LLM | DeepSeek v3.2 | Contextual reasoning and synthesis |
Architecture
```
                    Quantamental Agent
────────────────────────────────────────────────────────────
Question: "What are the risks in AI chip stocks?"
    │
    ├──► Voyage AI ──► Qdrant (Internal Docs)
    │                        │
    │                        ▼
    │          "Our Q3 analysis noted supply
    │           chain constraints in TSMC..."
    │                        │
    └──► Exa Search (Web)    │
              │              │
              ▼              │
    "Reuters: NVIDIA faces   │
     export restrictions..." │
              │              │
              └──────┬───────┘
                     ▼
              Combined Context
                     │
                     ▼
               DeepSeek v3.2
                     │
                     ▼
       Synthesized Answer + Sources
```
Why These Components?
Voyage AI voyage-3-large
Voyage's embedding model excels at financial and technical text. With 1024 dimensions and
specialized training, it captures nuanced semantic relationships that general-purpose embeddings
miss. Critically, it supports separate document and query input types
for asymmetric retrieval—essential when your queries are questions but your corpus is research reports.
```python
from services.embedding import EmbeddingService

embedding_service = EmbeddingService(settings)

# Embed documents for storage
doc_vectors = embedding_service.embed_documents([
    "TechCorp Q3: Revenue $15.2B, cloud grew 45%...",
    "Semiconductor sector faces inventory buildup...",
])

# Embed query for search (different optimization)
query_vector = embedding_service.embed_query(
    "What are the cloud growth trends?"
)
```
Qdrant for Internal Knowledge
Qdrant provides sub-millisecond vector search with rich filtering. I store all proprietary research—earnings analyses, sector notes, investment theses—with metadata for filtering by ticker, date, or document type.
```python
from services.vector_store import VectorStoreService

vector_store = VectorStoreService(settings, embedding_service)

# Index internal research
vector_store.add_document(
    content="NVDA positioned to capture 80% of AI accelerator market...",
    metadata={
        "title": "NVIDIA Investment Thesis",
        "ticker": "NVDA",
        "date": "2024-10-15",
        "type": "investment_thesis",
    },
)

# Semantic search
docs = vector_store.search(
    query="AI chip market share leaders",
    top_k=5,
)
```
Exa for Real-Time Web Intelligence
Exa's neural search understands intent, not just keywords. When I ask about "Fed rate decision implications," it finds analysis pieces, not just articles mentioning those words. The financial domain filtering ensures I get Reuters, Bloomberg, and SEC filings—not Reddit speculation.
```python
from services.exa_search import ExaSearchService

exa = ExaSearchService(settings)

# Financial-focused search
results = exa.search_financial(
    query="semiconductor inventory cycle 2024",
    num_results=5,
)

# Research paper search
papers = exa.search_research(
    query="transformer models in quantitative finance"
)
```
DeepSeek v3.2 for Reasoning
DeepSeek offers GPT-4 class reasoning at a fraction of the cost. For a system making dozens of queries daily, this matters. The model excels at synthesizing multiple sources and maintaining analytical rigor—exactly what quantamental research demands.
```python
from services.llm import LLMService

llm = LLMService(settings)

response = llm.answer_with_context(
    question="How do current chip valuations compare to historical cycles?",
    context=combined_context,  # Internal + web sources
)
```
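Before the LLM call, the internal and web snippets have to be merged into a single labeled context string so the model can attribute claims to a source. A minimal sketch of what such a helper might look like (the function name and record shapes are assumptions, not the project's actual API):

```python
def build_context(internal: list[dict], web: list[dict]) -> str:
    """Merge internal research and web results into one labeled context block."""
    lines = ["## Internal Research"]
    for d in internal:
        lines.append(f"- [{d['title']}] {d['content']}")
    lines.append("## Web Intelligence")
    for r in web:
        lines.append(f"- [{r['source']}] {r['content']}")
    return "\n".join(lines)

combined_context = build_context(
    internal=[{"title": "Q3 Analysis",
               "content": "Supply chain constraints at TSMC..."}],
    web=[{"source": "Reuters",
          "content": "NVIDIA faces export restrictions..."}],
)
print(combined_context)
```

Labeling each snippet with its origin is what lets the final answer cite sources, and it gives the model a cue to weigh proprietary analysis against public reporting.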
Three Search Modes
The agent supports three retrieval strategies:
| Mode | Sources | Use Case |
|---|---|---|
| `INTERNAL_ONLY` | Qdrant | "What did our Q3 thesis say about AAPL?" |
| `WEB_ONLY` | Exa | "What's the latest Fed decision?" |
| `HYBRID` | Both | "How does our view compare to market consensus?" |
```python
from agent import QuantamentalAgent, SearchMode

agent = QuantamentalAgent()

# Hybrid search: internal + web
response = agent.ask(
    question="What are the risks in AI chip stocks given current valuations?",
    mode=SearchMode.HYBRID,
    top_k_internal=5,
    top_k_web=5,
    financial_focus=True,
)

print(response.answer)
print(response.get_sources_summary())
```
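The mode switch itself is a small dispatch step: each mode decides which retrievers run before the context is assembled. A sketch under the assumption that the agent routes roughly like this (stub lambdas stand in for the Qdrant and Exa calls):

```python
from enum import Enum, auto

class SearchMode(Enum):
    INTERNAL_ONLY = auto()
    WEB_ONLY = auto()
    HYBRID = auto()

def gather_sources(mode: SearchMode, search_internal, search_web):
    """Dispatch retrieval calls according to the selected mode."""
    internal = (search_internal()
                if mode in (SearchMode.INTERNAL_ONLY, SearchMode.HYBRID) else [])
    web = (search_web()
           if mode in (SearchMode.WEB_ONLY, SearchMode.HYBRID) else [])
    return internal, web

# Stub retrievers stand in for real Qdrant and Exa queries
internal, web = gather_sources(
    SearchMode.HYBRID,
    search_internal=lambda: ["Q3 thesis on AAPL"],
    search_web=lambda: ["Latest Fed decision coverage"],
)
print(len(internal), len(web))  # → 1 1
```

Keeping the retrievers as callables means adding a fourth source later (say, prediction-market feeds) is one more branch, not a rewrite.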
Implementation Details
Project Structure
```
semantic-search/
├── agent.py             # Main orchestrator
├── cli.py               # Interactive CLI
├── config.py            # Environment configuration
├── services/
│   ├── embedding.py     # Voyage AI integration
│   ├── exa_search.py    # Exa web search
│   ├── vector_store.py  # Qdrant operations
│   └── llm.py           # DeepSeek interface
└── examples/
    ├── basic_usage.py
    └── advanced_usage.py
```
Configuration
```bash
# .env
VOYAGE_API_KEY=your_key
EXA_API_KEY=your_key
DEEPSEEK_API_KEY=your_key
QDRANT_URL=http://localhost:6333
EMBEDDING_MODEL=voyage-3-large
LLM_MODEL=deepseek-chat
```
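`config.py` presumably reads these variables into a settings object that the services share. A minimal sketch of what that might look like; the attribute names and defaults here are assumptions for illustration:

```python
import os

# Illustrative defaults mirroring the .env above; real keys come from the environment
os.environ.setdefault("QDRANT_URL", "http://localhost:6333")
os.environ.setdefault("EMBEDDING_MODEL", "voyage-3-large")
os.environ.setdefault("LLM_MODEL", "deepseek-chat")

class Settings:
    """Minimal settings object populated from environment variables."""
    def __init__(self):
        self.voyage_api_key = os.environ.get("VOYAGE_API_KEY", "")
        self.exa_api_key = os.environ.get("EXA_API_KEY", "")
        self.deepseek_api_key = os.environ.get("DEEPSEEK_API_KEY", "")
        self.qdrant_url = os.environ["QDRANT_URL"]
        self.embedding_model = os.environ["EMBEDDING_MODEL"]
        self.llm_model = os.environ["LLM_MODEL"]

settings = Settings()
print(settings.embedding_model)
```

Centralizing configuration this way keeps API keys out of the codebase and lets each service constructor take the same `settings` object, as the earlier snippets do.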
CLI Usage
```bash
# Interactive chat with mode switching
python cli.py chat

# Single question
python cli.py ask "Semiconductor outlook for 2025" --mode hybrid

# Index your research folder
python cli.py index ./research --recursive

# Search without generating an answer
python cli.py search "NVIDIA earnings" --source both
```
Results
Since deploying this system:
- Research time reduced by ~60%: The agent surfaces relevant internal notes and current market context in seconds
- Consistency improved: Every analysis considers both proprietary research and real-time data
- Audit trail created: Every answer includes source attribution for compliance and review
- Scalable synthesis: Can process queries that would take hours of manual research
What's Next
This quantamental agent is one component of a larger agentic rebalancing system. Next steps include:
- Alpaca MCP Integration: Connect to live portfolio positions for context-aware analysis
- Prediction Market Signals: Integrate Kalshi/Polymarket implied probabilities
- Multi-LLM Ensemble: Route queries to specialized models (DeepSeek for reasoning, Claude for analysis, GPT for structured output)
- Options Strategy Module: Extend the agent to reason about derivative positions and hedging
Try It Yourself
The complete implementation is available at: github.com/yanpan/semantic-search
```bash
git clone https://github.com/yanpan/semantic-search
cd semantic-search
pip install -r requirements.txt

# Add your API keys to .env
python cli.py chat
```