Daniel Kim | Data Scientist & ML Engineer

Projects

Deep Research RAG Evaluation (In Progress)

I wanted to try out Anthropic's statistical evaluation method for determining which LLM performs better, so I decided to compare deep research results from five generative AI approaches: simple naive RAG, RAG with contextual embeddings, the GPT-Researcher package, a homemade agentic deep research RAG, and GPT-Researcher with contextual embeddings.

LangGraph, RAG, Docker, Python, DSPy, Prompt Engineering, Agentic AI, Qdrant Vector Database, Hypothesis Testing