Diffstat (limited to '.github')
| -rw-r--r-- | .github/copilot-instructions.md | 82 |
| -rw-r--r-- | .github/workflows/cron.yml | 21 |
| -rw-r--r-- | .github/workflows/docs.yml | 18 |
| -rw-r--r-- | .github/workflows/release.yml | 21 |
| -rw-r--r-- | .github/workflows/tests.yml | 17 |
5 files changed, 62 insertions, 97 deletions
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 06ea30dd..6039850f 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -7,7 +7,7 @@ ### Tech Stack - **Language**: Python 3.10+ (tested on 3.10, 3.11, 3.12) - **Size**: ~75 Python source files (~13.5k LOC), ~36 test files -- **Package Management**: pip-tools with pinned requirements in `requirements/*.txt` +- **Package Management**: uv with pinned requirements in `uv.lock` - **Key Dependencies**: BeautifulSoup4, pandas, spacy, pdftotext (requires Poppler), pikepdf, pytesseract, scikit-learn, matplotlib, networkx, pydantic - **Build System**: setuptools with setuptools-scm for versioning - **Testing**: pytest with custom markers (`slow`, `remote`) @@ -49,27 +49,21 @@ sudo apt-get install -y \ **The version file `src/sec_certs/_version.py` is auto-generated by setuptools-scm and must NOT be committed.** If missing during development, create a temporary version: `echo '__version__ = "dev"' > src/sec_certs/_version.py` -**Standard install (for testing and development):** +**Development install (for testing and development):** ```bash -# Install test dependencies (includes pytest, coverage, etc.) -pip install -r requirements/test_requirements.txt +# Create a virtual environment +uv venv -# Install sec-certs in editable mode -pip install -e . +# Install all dependencies (including dev ones) and the project in editable mode +uv sync --dev # ALWAYS download the spacy language model after install -python -m spacy download en_core_web_sm -``` +uv run spacy download en_core_web_sm -**For full development (linting, docs):** -```bash -pip install -r requirements/dev_requirements.txt -pip install -e . 
-python -m spacy download en_core_web_sm +# Optionally, you can activate the virtual environment and avoid all the "uv run" prefixes +source .venv/bin/activate ``` -**Note on pip-sync**: Do NOT use `pip-sync requirements/all_requirements.txt` in environments with system packages (like GitHub Actions runners). It tries to uninstall system packages and will fail. Use `pip install -r` instead. - Verify the installation (sec-certs and spacy language model) by importing the package: ```python import sec_certs._version @@ -85,12 +79,12 @@ print(spacy.load("en_core_web_sm")) **Basic test run (excludes remote/flaky tests):** ```bash -PYTHONPATH=src:$PYTHONPATH pytest tests -m "not remote" -v +uv run pytest tests -m "not remote" -v ``` **Test with coverage (as in CI):** ```bash -pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests +uv run pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests ``` **Test markers:** @@ -106,27 +100,26 @@ pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests **Using pre-commit (recommended):** ```bash -pip install -r requirements/dev_requirements.txt -pre-commit install -pre-commit run --all-files +uv run pre-commit install +uv run pre-commit run --all-files ``` **Manual linting:** ```bash # Ruff linting (checks code style, imports, complexity) -ruff check . +uv run ruff check . # Ruff with auto-fix -ruff check . --fix +uv run ruff check . --fix # Ruff formatting check -ruff format --check . +uv run ruff format --check . # Ruff auto-format -ruff format . +uv run ruff format . # MyPy type checking -mypy . +uv run mypy . ``` **Linting configuration**: See `pyproject.toml` for Ruff and MyPy settings. Target Python 3.10. Line length: 120. Notebooks (*.ipynb) are excluded from linting. @@ -135,7 +128,7 @@ mypy . ```bash cd docs -make html +uv run make html ``` Output goes to `docs/_build/html/`. Documentation uses Sphinx with myst-nb for Markdown and Jupyter notebooks. 
@@ -143,8 +136,7 @@ Output goes to `docs/_build/html/`. Documentation uses Sphinx with myst-nb for M ### Building for Distribution ```bash -python -m pip install build -python -m build +uv build ``` This creates source and wheel distributions in `dist/`. @@ -174,22 +166,15 @@ sec-certs/ │ └── conftest.py # Pytest configuration and fixtures ├── docs/ # Sphinx documentation source ├── notebooks/ # Jupyter notebooks (examples, analysis) -├── requirements/ # Pinned requirements files -│ ├── requirements.txt # Core dependencies -│ ├── dev_requirements.txt # Dev tools (ruff, mypy, sphinx) -│ ├── test_requirements.txt # Test dependencies -│ ├── nlp_requirements.txt # Optional NLP dependencies -│ ├── all_requirements.txt # All of the above combined -│ └── compile.sh # Script to regenerate requirements ├── pyproject.toml # Package metadata, build config, tool settings ├── .pre-commit-config.yaml # Pre-commit hooks configuration -└── Dockerfile # Docker image for reproducible environment +├── Dockerfile # Docker image for reproducible environment +└── uv.lock # uv lockfile with pinned dependendices. ``` ### Key Files and Configurations - **pyproject.toml**: Package definition, dependencies, Ruff/MyPy/pytest config. Single source of truth for dependencies (unpinned). -- **requirements/*.txt**: Pinned versions generated by `compile.sh`. CI uses these for reproducible builds. - **src/sec_certs/rules.yaml**: Regular expressions for extracting data from certificates. Add patterns here. - **src/sec_certs/configuration.py**: Runtime configuration using pydantic-settings. Reads from env vars with `SECCERTS_` prefix. - **.pre-commit-config.yaml**: Defines pre-commit hooks (ruff, mypy). Versions should match pyproject.toml. @@ -247,8 +232,8 @@ sec-certs/ 1. Create branch from `main` (only stable branch for PRs) 2. Make minimal code changes 3. Add tests in appropriate `tests/` subdirectory -4. Run linters: `pre-commit run --all-files` or `ruff check . && mypy .` -5. 
Run tests: `pytest tests -m "not remote" -v` +4. Run linters: `uv run pre-commit run --all-files` or `uv run ruff check . && uv run mypy .` +5. Run tests: `uv run pytest tests -m "not remote" -v` 6. Update docs if public API changed 7. Commit and push (CI will validate) @@ -257,8 +242,7 @@ sec-certs/ ```bash # Edit pyproject.toml to add/update dependency # Regenerate pinned requirements -cd requirements -./compile.sh +uv lock # Commit both pyproject.toml and requirements/*.txt changes ``` @@ -272,7 +256,7 @@ dset = CCDataset.from_web() # Downloads from sec-certs.org **Processing from scratch (requires full setup, takes hours, DO NOT DO THIS):** ```bash -sec-certs cc all -o ./dataset +uv run sec-certs cc all -o ./dataset ``` ## Common Pitfalls and Gotchas @@ -285,17 +269,13 @@ sec-certs cc all -o ./dataset 4. **Java in PATH**: Required for FIPS table parsing. Verify with `java -version`. -5. **pip-sync on GitHub Actions**: Don't use it with system packages. Use `pip install -r requirements/*.txt` instead. - -6. **Test markers**: Exclude flaky remote tests with `-m "not remote"` for stable local testing. - -7. **Import from src**: When running without install, set `PYTHONPATH=src:$PYTHONPATH` to import sec_certs modules. +5. **Test markers**: Exclude flaky remote tests with `-m "not remote"` for stable local testing. -8. **Default dataset location**: CLI creates `./dataset` by default. Add to .gitignore if working locally. +6. **Default dataset location**: CLI creates `./dataset` by default. Add to .gitignore if working locally. -9. **Pre-commit hook behavior**: Pre-commit hooks warn about issues but don't auto-fix. Run `ruff check . --fix` to apply fixes. +7. **Pre-commit hook behavior**: Pre-commit hooks warn about issues but don't auto-fix. Run `ruff check . --fix` to apply fixes. -10. **Long-running commands**: Full dataset processing (`sec-certs cc all`) takes hours. Use pre-processed datasets from web for analysis. +8. 
**Long-running commands**: Full dataset processing (`sec-certs cc all`) takes hours. Use pre-processed datasets from web for analysis. ## Additional Resources @@ -313,7 +293,7 @@ sec-certs cc all -o ./dataset These instructions have been validated by examining repository structure, workflows, documentation, and testing commands. When working on this repository: 1. **Trust these build/test commands** - they are verified to work -2. **Follow the setup order** (system deps → pip deps → spacy model → install) +2. **Follow the setup order** (system deps → python deps and install (uv sync) → spacy model) 3. **Only search/explore if** these instructions are incomplete or incorrect 4. **Refer to these instructions first** before trying alternative approaches diff --git a/.github/workflows/cron.yml b/.github/workflows/cron.yml index 3ca22f49..0ef4a551 100644 --- a/.github/workflows/cron.yml +++ b/.github/workflows/cron.yml @@ -17,22 +17,17 @@ jobs: - name: Install Poppler run: sudo apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev - uses: actions/checkout@v4 - - name: Setup python - uses: actions/setup-python@v5 + - name: Install uv and Python + uses: astral-sh/setup-uv@v7 with: - python-version: "3.10" - cache: "pip" - cache-dependency-path: | - requirements/test_requirements.txt - - name: Install python dependencies - run: | - pip install -r requirements/test_requirements.txt + python-version: ${{ matrix.python-version }} + enable-cache: true - name: Install sec-certs run: | - pip install -e . 
- python -m spacy download en_core_web_sm + uv sync --locked --dev + uv run spacy download en_core_web_sm - name: Run tests - run: pytest --cov=sec_certs -m "remote" --junitxml=junit.xml -o junit_family=legacy tests + run: uv run pytest --cov=sec_certs -m "remote" --cov-report=xml --cov-report=term --junitxml=junit.xml -o junit_family=legacy tests continue-on-error: true - name: Test summary if: always() @@ -48,4 +43,4 @@ jobs: if: ${{ !cancelled() }} uses: codecov/test-results-action@v1 with: - token: ${{ secrets.CODECOV_TOKEN }}
\ No newline at end of file + token: ${{ secrets.CODECOV_TOKEN }} diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 6deb8984..eb432751 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -11,24 +11,22 @@ jobs: - uses: actions/checkout@v4 with: fetch-depth: 0 - - uses: actions/setup-python@v5 - with: - python-version: "3.10" - cache: "pip" - cache-dependency-path: | - requirements/dev_requirements.txt - name: apt-get update run: sudo apt-get update - name: Install external dependencies run: sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python3-dev -y - - name: Install sec-certs and deps + - name: Install uv and Python + uses: astral-sh/setup-uv@v6 + with: + python-version: ${{ matrix.python-version }} + enable-cache: true + - name: Install sec-certs run: | - pip install -r requirements/dev_requirements.txt - pip install -e . + uv sync --locked --dev - name: Build docs run: | cd docs - make html + uv run make html - name: Save docs artifact uses: actions/upload-artifact@v4.4.0 with: diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 55d096f1..3ad5a317 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -7,7 +7,7 @@ jobs: pypi_release: name: Release on PyPi runs-on: ubuntu-22.04 - if: github.repository == 'crocs-muni/sec-certs' + #if: github.repository == 'crocs-muni/sec-certs' environment: name: pypi url: https://pypi.org/project/sec-certs/ @@ -17,19 +17,16 @@ jobs: - uses: actions/checkout@v4 with: fetch-depth: 0 - - name: Set up Python - uses: actions/setup-python@v5 + - name: Install uv and Python + uses: astral-sh/setup-uv@v7 with: - python-version: "3.10" - - name: apt-get update - run: sudo apt-get update - - name: Install build dependencies - run: python -m pip install build - - name: Build distributions - shell: bash -l {0} - run: python -m build + python-version: 3.10 + enable-cache: true + - name: Build + run: uv build - name: 
Publish package to PyPI - uses: pypa/gh-action-pypi-publish@release/v1 + if: false + run: uv publish docker_release: name: Release on DockerHub environment: diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index a7ee60a2..e39ce058 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -21,22 +21,17 @@ jobs: - name: Install Poppler run: sudo apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev - uses: actions/checkout@v4 - - name: Setup python ${{ matrix.python-version }} - uses: actions/setup-python@v5 + - name: Install uv and Python + uses: astral-sh/setup-uv@v7 with: python-version: ${{ matrix.python-version }} - cache: "pip" - cache-dependency-path: | - requirements/test_requirements.txt - - name: Install python dependencies - run: | - pip install -r requirements/test_requirements.txt + enable-cache: true - name: Install sec-certs run: | - pip install -e . - python -m spacy download en_core_web_sm + uv sync --locked --dev + uv run spacy download en_core_web_sm - name: Run tests - run: pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml -o junit_family=legacy tests + run: uv run pytest --cov=sec_certs -m "not remote" --cov-report=xml --cov-report=term --junitxml=junit.xml -o junit_family=legacy tests - name: Test summary if: always() uses: test-summary/action@v2 |
