Skip to content

P fix list_directory_from_gh in case no directory is returned from API#76

Open
p- wants to merge 11 commits intomainfrom
p--fix-gh-list-dir
Open

P fix list_directory_from_gh in case no directory is returned from API#76
p- wants to merge 11 commits intomainfrom
p--fix-gh-list-dir

Conversation

@p-
Copy link
Copy Markdown
Contributor

@p- p- commented Apr 10, 2026

No description provided.

@p- p- requested a review from m-y-mo as a code owner April 10, 2026 07:39
Copilot AI review requested due to automatic review settings April 10, 2026 07:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the gh_file_viewer MCP server’s directory-listing tool to handle GitHub Contents API responses where the requested path is not a directory, and adds a comprehensive unit-test suite for the module.

Changes:

  • Update list_directory_from_gh to handle empty results and non-list JSON payloads (e.g., file metadata) without crashing.
  • Add new tests/test_gh_file_viewer.py covering file fetch, line slicing, search (file/repo), directory listing, and helper utilities.
Show a summary per file
File Description
src/seclab_taskflows/mcp_servers/gh_file_viewer.py Prevents failures when the Contents API returns a non-directory payload by validating the decoded JSON type before iterating.
tests/test_gh_file_viewer.py Introduces new unit tests for the gh_file_viewer tools and helper functions, including the newly handled “path is a file” case.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 2

Comment on lines +60 to +66
class TestFetchFileFromGh:
@pytest.mark.asyncio
async def test_fetch_file_success(self):
resp = _make_response(text=SAMPLE_FILE_CONTENT)
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock, return_value=resp):
result = await gfv_mod.fetch_file_from_gh.fn(owner="Owner", repo="Repo", path="src/main.py")
assert "1: import os" in result
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test module introduces many async def tests marked with @pytest.mark.asyncio. Unless the test environment installs and configures pytest-asyncio, pytest will error when collecting/running coroutine tests (especially in CI via hatch test). Please ensure pytest-asyncio is included in the test dependencies (e.g., Hatch test env deps / optional test extra) so these tests run on all CI platforms.

Copilot uses AI. Check for mistakes.
Comment on lines +179 to +185
resp = _make_response(text="")
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock, return_value=resp):
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
# empty string split yields [""], which hits the API for an empty path
assert "No matches found" in result
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_search_files_no_paths currently asserts the behavior where paths="" still triggers an API call and returns "No matches found". In search_files_from_gh, the guard if not paths_list: is effectively unreachable because "".split(",") yields [""]. Consider filtering out empty/whitespace-only paths and returning the intended "No paths provided for search." message, then update this test to match the corrected behavior.

Suggested change
resp = _make_response(text="")
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock, return_value=resp):
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
# empty string split yields [""], which hits the API for an empty path
assert "No matches found" in result
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock) as mock_call_api:
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
assert result == "No paths provided for search."
mock_call_api.assert_not_awaited()

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 10, 2026 07:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the gh_file_viewer MCP server’s GitHub “contents” directory listing behavior when the GitHub API returns an empty payload or a non-directory payload (e.g., a single file object), and adds a dedicated test suite to validate the module’s tools end-to-end via mocked API calls.

Changes:

  • Fix list_directory_from_gh to safely handle empty API JSON responses and “path is a file” responses (dict vs list).
  • Add a comprehensive new tests/test_gh_file_viewer.py covering file fetch, line slicing, searching, directory listing, repo zip searching, and helper utilities.
  • Make table creation explicitly checkfirst=True for the SearchResults table initialization.
Show a summary per file
File Description
src/seclab_taskflows/mcp_servers/gh_file_viewer.py Handles empty/non-list JSON from the GitHub contents API to avoid runtime errors when listing directories.
tests/test_gh_file_viewer.py Adds coverage for gh_file_viewer tool functions, including the newly fixed directory listing edge cases.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 2/2 changed files
  • Comments generated: 0 new

Copilot AI review requested due to automatic review settings April 10, 2026 08:43
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the GitHub file viewer MCP server’s robustness when listing repository “directories” via the GitHub Contents API, and adds a dedicated test suite for the module.

Changes:

  • Update list_directory_from_gh to handle empty API responses and non-directory JSON payloads.
  • Add comprehensive unit tests for gh_file_viewer tool functions (file fetch, line ranges, search, directory listing, repo zip search).
  • Register the xdist_group pytest marker in pyproject.toml to support grouping tests onto one xdist worker.
Show a summary per file
File Description
tests/test_gh_file_viewer.py Adds a full unit test module for gh_file_viewer, including directory-list edge cases.
src/seclab_taskflows/mcp_servers/gh_file_viewer.py Hardens directory listing behavior and adds import-time DB init error handling.
pyproject.toml Adds pytest marker configuration for xdist_group.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment on lines +57 to +60
try:
Base.metadata.create_all(engine, tables=[SearchResults.__table__])
except Exception as e:
logging.exception(f"Database already exists") # only log here, as this error likely only happens in test
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module-level try/except around Base.metadata.create_all(...) catches all exceptions and then continues, which can hide real database setup failures (e.g., missing/invalid DB path, permission issues) and lead to later runtime errors when writing/querying. If this is meant to tolerate a known concurrent-create race, catch the specific SQLAlchemy exception(s) and only ignore the specific "already exists" case; otherwise re-raise after logging. Also, the log message "Database already exists" is misleading for most exceptions coming from create_all.

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +190
resp = _make_response(text="")
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock, return_value=resp):
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
# empty string split yields [""], which hits the API for an empty path
assert "No matches found" in result
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case documents that paths="" ends up calling the GitHub API with an empty path (because "".split(",") produces [""]). That behavior is almost certainly unintended (the implementation even has a "No paths provided" branch that never triggers for empty input), and encoding it in a test will make it harder to fix later. Consider filtering out empty/whitespace-only paths (and asserting the intended "No paths provided"/similar message here) instead of validating the current accidental behavior.

Suggested change
resp = _make_response(text="")
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock, return_value=resp):
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
# empty string split yields [""], which hits the API for an empty path
assert "No matches found" in result
with patch.object(gfv_mod, "call_api", new_callable=AsyncMock) as mock_call_api:
result = await gfv_mod.search_files_from_gh.fn(
owner="owner", repo="repo", paths="", search_term="import", save_to_db=False,
)
assert "No paths provided" in result
mock_call_api.assert_not_awaited()

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 10, 2026 08:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves robustness of the gh_file_viewer MCP server by handling GitHub “contents” API responses that are not directory listings, and adds a comprehensive test suite for the module.

Changes:

  • Add list_directory_from_gh handling for empty/non-list API JSON payloads (e.g., when the path is a file).
  • Add new tests/test_gh_file_viewer.py covering GitHub file viewer behaviors (fetch/search/list/search repo).
  • Update test environment config in pyproject.toml (pytest marker registration + extra test dependency).
Show a summary per file
File Description
src/seclab_taskflows/mcp_servers/gh_file_viewer.py Adds safer handling for directory listing API payloads and wraps DB table creation in an exception handler.
tests/test_gh_file_viewer.py Introduces async and sync tests for the GitHub file viewer MCP tools and helpers.
pyproject.toml Adds hatch test env extra dependency and registers custom pytest marker used by xdist grouping.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 1

Comment on lines +58 to +61
try:
Base.metadata.create_all(engine, tables=[SearchResults.__table__])
except sqlalchemy.exc.OperationalError as e:
logging.exception(f"Database/Tables already exist(s)") # only log here, as this error likely only happens in test
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching all sqlalchemy.exc.OperationalError from Base.metadata.create_all(...) and then continuing can mask real DB init failures (e.g., permission errors, missing/readonly directory, DB locked/corrupt). Consider only suppressing the specific “table already exists” race case (inspect the exception message/orig), and re-raise otherwise (and/or ensure SEARCH_RESULT_DIR exists before creating the engine). Also logging.exception here is very noisy for an expected condition; prefer a warning/info log that includes the actual exception details.

See below for a potential fix:

os.makedirs(SEARCH_RESULT_DIR, exist_ok=True)


def _is_table_already_exists_error(error: sqlalchemy.exc.OperationalError) -> bool:
    error_messages = [str(error).lower()]
    if getattr(error, "orig", None) is not None:
        error_messages.append(str(error.orig).lower())

    return any(
        "already exists" in message and "table" in message
        for message in error_messages
    )


engine = create_engine(f"sqlite:///{os.path.abspath(SEARCH_RESULT_DIR)}/search_result.db", echo=False)

try:
    Base.metadata.create_all(engine, tables=[SearchResults.__table__])
except sqlalchemy.exc.OperationalError as e:
    if _is_table_already_exists_error(e):
        logging.warning("Search results table already exists during initialization: %s", e)
    else:
        raise

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants