Organize Dev/Test Scripts & Docs

by Admin

Hey everyone! 👋 This is a plan to organize and document the development and test scripts that belong under backend/. Think of it as spring cleaning for our code: it makes these handy tools easier to find and use, for us and for any new developers. Let's dive in! 🚀

📋 Overview: The Game Plan

Goal: Streamline our development test scripts and create detailed documentation for the backend/dev_tests/ directory. This means moving the scripts to a dedicated spot, documenting them, and making them easily discoverable. Think of it as creating a well-organized toolbox instead of a messy pile of tools. 🛠️

Context: Right now, we've got a bunch of test scripts in the root of our project, plus one in backend/. They're super useful for manual testing and debugging, but they're not in the best place. We'll be:

  1. Moving them to a new backend/dev_tests/ directory. 📂
  2. Documenting them in the docs/ folder. 📝
  3. Making them easy to find for everyone. 👀

This is a P3 (Low Priority) task, but it's good housekeeping and will make our development process smoother. 💪

🎯 Objectives: What We're Going to Do

1. Organize Those Test Scripts

  • [ ] Create the backend/dev_tests/ directory structure. 📁
  • [ ] Move all six development test scripts (five from the root directory, one from backend/) into backend/dev_tests/. 🚚
  • [ ] Add a README.md file in backend/dev_tests/ to explain what the directory is for. 📖
  • [ ] Add an entry in .gitignore to ignore the output files from backend/dev_tests/. 🚫
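
The .gitignore step above could be automated along these lines. This is only a sketch: the plan does not say what the output files are called, so the two patterns below are hypothetical placeholders, and `add_ignore_patterns` is an illustrative helper name rather than anything in the repo.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical patterns: the plan does not name the output files,
# so replace these with whatever the scripts actually write.
DEV_TEST_IGNORE_PATTERNS = [
    "backend/dev_tests/*.log",
    "backend/dev_tests/output/",
]


def add_ignore_patterns(gitignore: Path, patterns: list[str]) -> list[str]:
    """Append patterns that are missing from .gitignore; return the ones added."""
    text = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    present = {line.strip() for line in text.splitlines()}
    missing = [p for p in patterns if p not in present]
    if missing:
        # Start on a fresh line if the file does not already end with one.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        addition = "".join(f"{p}\n" for p in missing)
        gitignore.write_text(text + prefix + addition, encoding="utf-8")
    return missing
```

Calling `add_ignore_patterns(Path(".gitignore"), DEV_TEST_IGNORE_PATTERNS)` from the repo root is safe to repeat, since patterns that are already present are skipped.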

2. Craft Comprehensive Documentation

  • [ ] Create docs/development/dev-test-scripts.md to document all the scripts. ✍️
  • [ ] Include usage examples, any prerequisites, and the expected output. 💡
  • [ ] Link this documentation from our main development documentation. 🔗
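
To keep the documentation honest over time, a small check could flag scripts that the doc file never mentions. The sketch below assumes that "documented" simply means the file name appears somewhere in docs/development/dev-test-scripts.md; `undocumented_scripts` is a hypothetical helper, not an existing tool in the project.

```python
from __future__ import annotations

from pathlib import Path


def undocumented_scripts(scripts_dir: Path, doc_file: Path) -> list[str]:
    """Return names of .py files in scripts_dir that doc_file never mentions."""
    doc_text = doc_file.read_text(encoding="utf-8")
    return sorted(
        path.name
        for path in scripts_dir.glob("*.py")
        if path.name not in doc_text
    )
```

Run from the repo root, `undocumented_scripts(Path("backend/dev_tests"), Path("docs/development/dev-test-scripts.md"))` would return an empty list once every script has an entry.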

3. Archive the Project Roadmap

  • [ ] Move MASTER_ISSUES_ROADMAP.md → docs/planning/master-roadmap.md. 🗺️
  • [ ] Keep a historical record of our performance improvements. 📈
  • [ ] Update any references in our code to the new location. 🔄
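
Before moving the roadmap, it helps to know where the old file name is referenced. The following is a rough sketch: the list of file suffixes to scan is an assumption, and `find_references` is a hypothetical helper name.

```python
from __future__ import annotations

from pathlib import Path

OLD_NAME = "MASTER_ISSUES_ROADMAP.md"
# Assumption: these are the file types worth scanning; extend as needed
# (extensionless files such as a Makefile are not covered).
TEXT_SUFFIXES = {".md", ".py", ".toml", ".yml", ".yaml", ".txt"}


def find_references(root: Path, needle: str = OLD_NAME) -> list[tuple[Path, int]]:
    """Return (relative path, line number) pairs for lines that mention needle."""
    hits: list[tuple[Path, int]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip Git internals, non-text files, and the roadmap file itself.
        if ".git" in rel.parts or path.suffix not in TEXT_SUFFIXES:
            continue
        if path.name == needle:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((rel, lineno))
    return hits
```

Each hit is a line that should be updated to point at docs/planning/master-roadmap.md after the move.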

📦 Test Scripts to Organize: The List

Here are the scripts we'll be wrangling:

Scripts Currently Hanging Out in the Root

test_docling_config.py           # Tests Docling configuration
test_embedding_direct.py         # Tests direct embedding API
test_embedding_retrieval.py      # Validates embedding retrieval
test_query_enhancement_demo.py   # Shows off query enhancement
test_search_no_cot.py            # Tests search without Chain of Thought

Scripts Currently in backend/

backend/debug_rag_failure.py    # Debugs RAG pipeline failures

Proposed Structure: Where Everything Will Live

backend/dev_tests/
├── README.md                    # Overview of dev test scripts
├── test_docling_config.py
├── test_embedding_direct.py
├── test_embedding_retrieval.py
├── test_query_enhancement_demo.py
├── test_search_no_cot.py
└── debug_rag_failure.py
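
One possible way to get from the current layout to the proposed one is a small one-off script. This sketch uses plain filesystem moves; in a Git checkout, `git mv` would do the move and stage it in one step. `move_dev_tests` is a hypothetical helper, and it does not create the README.md.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Script names taken from the two lists above.
ROOT_SCRIPTS = [
    "test_docling_config.py",
    "test_embedding_direct.py",
    "test_embedding_retrieval.py",
    "test_query_enhancement_demo.py",
    "test_search_no_cot.py",
]
BACKEND_SCRIPTS = ["debug_rag_failure.py"]


def move_dev_tests(repo_root: Path) -> list[Path]:
    """Move the dev test scripts into backend/dev_tests/; return the new paths."""
    target = repo_root / "backend" / "dev_tests"
    target.mkdir(parents=True, exist_ok=True)
    sources = [repo_root / name for name in ROOT_SCRIPTS]
    sources += [repo_root / "backend" / name for name in BACKEND_SCRIPTS]
    moved: list[Path] = []
    for src in sources:
        if src.is_file():  # skip scripts that were already moved or are absent
            dest = target / src.name
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved
```

Because missing sources are skipped, running it a second time is a no-op.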

📝 Documentation to Create: The Guide

We'll create a new file, docs/development/dev-test-scripts.md, to document all of our test scripts. This file will be the go-to resource for anyone looking to use these scripts.

File: docs/development/dev-test-scripts.md: The Blueprint

Here's what the structure of this file will look like:

````markdown
# Development Test Scripts

## Overview
Development test scripts for manual testing, debugging, and validation.

## Location
`backend/dev_tests/`

## Scripts

### test_docling_config.py
**Purpose**: Test Docling document processing configuration

**Usage**:
```bash
cd backend
poetry run python dev_tests/test_docling_config.py
```

**Prerequisites**: Docling dependencies installed

**Output**: Configuration validation results

### test_embedding_direct.py
**Purpose**: Test embedding service directly without full pipeline

**Usage**:
```bash
cd backend
poetry run python dev_tests/test_embedding_direct.py
```

**Prerequisites**: Embedding service running

**Output**: Embedding vectors and similarity scores

### test_embedding_retrieval.py
**Purpose**: Test embedding retrieval from vector store

**Usage**:
```bash
cd backend
poetry run python dev_tests/test_embedding_retrieval.py
```

**Prerequisites**:
- Vector store (Milvus) running
- Collection with embeddings

**Output**: Retrieved documents with scores

### test_query_enhancement_demo.py
**Purpose**: Demonstrate query enhancement features

**Usage**:
```bash
cd backend
poetry run python dev_tests/test_query_enhancement_demo.py
```

**Prerequisites**: None (self-contained demo)

**Output**: Enhanced queries with entity extraction

### test_search_no_cot.py
**Purpose**: Test search without Chain of Thought reasoning

**Usage**:
```bash
cd backend
poetry run python dev_tests/test_search_no_cot.py
```

**Prerequisites**:
- All services running (`make local-dev-infra`)
- Collection with documents

**Output**: Search results and timing

### debug_rag_failure.py
**Purpose**: Debug RAG pipeline failures with detailed logging

**Usage**:
```bash
cd backend
poetry run python dev_tests/debug_rag_failure.py <collection_id> <query>
```

**Prerequisites**: RAG services running

**Output**: Detailed pipeline execution trace

## Best Practices

1. **Run from backend/ directory**: All scripts assume `backend/` as the working directory.
2. **Check prerequisites**: Make sure all the necessary services are up and running.
3. **Use for debugging**: These scripts are primarily for development and debugging, not for production.
4. **Add new scripts**: Follow the naming convention `test_<component>_<purpose>.py` when adding new scripts.
5. **Document changes**: Always update this file when you add or modify any scripts.

## Related Documentation

*   [Development Workflow](../development/workflow.md)
*   [Testing Guide](../testing/index.md)
*   [Local Development](../../CLAUDE.md#local-development)
````

## Best Practices: Keep These in Mind

*   **Run from backend/ directory**: Always run the scripts from the `backend/` directory; they assume it is the working directory.
*   **Check prerequisites**: Before running a script, make sure the services and dependencies it requires are up and running.
*   **Use for debugging**: These scripts are mainly for development and debugging purposes. They're not designed for production environments.
*   **Add new scripts**: If you create a new test script, follow the naming convention `test_<component>_<purpose>.py`. This helps keep things organized.
*   **Document changes**: Whenever you add or modify a script, make sure to update the documentation file to reflect the changes.
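
The naming convention in the list above is easy to check mechanically. A minimal sketch follows, assuming lowercase snake_case names with at least a component part and a purpose part; `follows_convention` is a hypothetical helper.

```python
import re

# test_<component>_<purpose>.py, where <purpose> may itself contain underscores.
NAME_PATTERN = re.compile(r"^test_[a-z0-9]+_[a-z0-9_]+\.py$")


def follows_convention(filename: str) -> bool:
    """Return True if filename matches test_<component>_<purpose>.py."""
    return NAME_PATTERN.fullmatch(filename) is not None
```

Note that `debug_rag_failure.py` from the script list does not match this pattern, so a real check would need an explicit exception for it.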

## Related Documentation: Helpful Links

*   [Development Workflow](../development/workflow.md)
*   [Testing Guide](../testing/index.md)
*   [Local Development](../../CLAUDE.md#local-development)

## Benefits: Why This Matters

*   **Cleaner Root Directory**: Get rid of the clutter in the root directory. 🧹
*   **Better Discoverability**: Make it easier for new developers (and ourselves!) to find test utilities. 🔎
*   **Documentation**: Provide clear usage examples and prerequisites. 📚
*   **Preserved History**: Safely archive `MASTER_ISSUES_ROADMAP.md`. 📜
*   **Maintainability**: Centralize the location for all our development tools. 🛠️

## 🤔 Questions / Decisions: Let's Discuss

1.  **Should `dev_tests` be tracked in Git?**
    *   **Recommendation:** Yes, we should track the scripts, but we'll ignore the output files.
2.  **Should we create more dev test scripts?**
    *   **Recommendation:** Absolutely! Create more scripts as needed for specific debugging tasks.
3.  **Should `MASTER_ISSUES_ROADMAP.md` stay in the root?**
    *   **Recommendation:** No, let's move it to `docs/planning/` for better organization.

## 👥 Labels: Tags for Reference

`documentation` `cleanup` `good-first-issue` `dev-experience`