Diffstat (limited to '.github')
-rw-r--r-- .github/copilot-instructions.md 82
-rw-r--r-- .github/workflows/cron.yml 21
-rw-r--r-- .github/workflows/docs.yml 18
-rw-r--r-- .github/workflows/release.yml 21
-rw-r--r-- .github/workflows/tests.yml 17
5 files changed, 62 insertions, 97 deletions
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 06ea30dd..6039850f 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -7,7 +7,7 @@
### Tech Stack
- **Language**: Python 3.10+ (tested on 3.10, 3.11, 3.12)
- **Size**: ~75 Python source files (~13.5k LOC), ~36 test files
-- **Package Management**: pip-tools with pinned requirements in `requirements/*.txt`
+- **Package Management**: uv with pinned requirements in `uv.lock`
- **Key Dependencies**: BeautifulSoup4, pandas, spacy, pdftotext (requires Poppler), pikepdf, pytesseract, scikit-learn, matplotlib, networkx, pydantic
- **Build System**: setuptools with setuptools-scm for versioning
- **Testing**: pytest with custom markers (`slow`, `remote`)
@@ -49,27 +49,21 @@ sudo apt-get install -y \
**The version file `src/sec_certs/_version.py` is auto-generated by setuptools-scm and must NOT be committed.**
If missing during development, create a temporary version: `echo '__version__ = "dev"' > src/sec_certs/_version.py`
-**Standard install (for testing and development):**
+**Development install (for testing and development):**
```bash
-# Install test dependencies (includes pytest, coverage, etc.)
-pip install -r requirements/test_requirements.txt
+# Create a virtual environment
+uv venv
-# Install sec-certs in editable mode
-pip install -e .
+# Install all dependencies (including dev ones) and the project in editable mode
+uv sync --dev
# ALWAYS download the spacy language model after install
-python -m spacy download en_core_web_sm
-```
+uv run spacy download en_core_web_sm
-**For full development (linting, docs):**
-```bash
-pip install -r requirements/dev_requirements.txt
-pip install -e .
-python -m spacy download en_core_web_sm
+# Optionally, you can activate the virtual environment and avoid all the "uv run" prefixes
+source .venv/bin/activate
```
-**Note on pip-sync**: Do NOT use `pip-sync requirements/all_requirements.txt` in environments with system packages (like GitHub Actions runners). It tries to uninstall system packages and will fail. Use `pip install -r` instead.
-
Verify the installation (sec-certs and spacy language model) by importing the package:
```python
import sec_certs._version
@@ -85,12 +79,12 @@ print(spacy.load("en_core_web_sm"))
**Basic test run (excludes remote/flaky tests):**
```bash
-PYTHONPATH=src:$PYTHONPATH pytest tests -m "not remote" -v
+uv run pytest tests -m "not remote" -v
```
**Test with coverage (as in CI):**
```bash
-pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests
+uv run pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests
```
**Test markers:**
@@ -106,27 +100,26 @@ pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml tests
**Using pre-commit (recommended):**
```bash
-pip install -r requirements/dev_requirements.txt
-pre-commit install
-pre-commit run --all-files
+uv run pre-commit install
+uv run pre-commit run --all-files
```
**Manual linting:**
```bash
# Ruff linting (checks code style, imports, complexity)
-ruff check .
+uv run ruff check .
# Ruff with auto-fix
-ruff check . --fix
+uv run ruff check . --fix
# Ruff formatting check
-ruff format --check .
+uv run ruff format --check .
# Ruff auto-format
-ruff format .
+uv run ruff format .
# MyPy type checking
-mypy .
+uv run mypy .
```
**Linting configuration**: See `pyproject.toml` for Ruff and MyPy settings. Target Python 3.10. Line length: 120. Notebooks (*.ipynb) are excluded from linting.
@@ -135,7 +128,7 @@ mypy .
```bash
cd docs
-make html
+uv run make html
```
Output goes to `docs/_build/html/`. Documentation uses Sphinx with myst-nb for Markdown and Jupyter notebooks.
@@ -143,8 +136,7 @@ Output goes to `docs/_build/html/`. Documentation uses Sphinx with myst-nb for M
### Building for Distribution
```bash
-python -m pip install build
-python -m build
+uv build
```
This creates source and wheel distributions in `dist/`.
@@ -174,22 +166,15 @@ sec-certs/
│ └── conftest.py # Pytest configuration and fixtures
├── docs/ # Sphinx documentation source
├── notebooks/ # Jupyter notebooks (examples, analysis)
-├── requirements/ # Pinned requirements files
-│ ├── requirements.txt # Core dependencies
-│ ├── dev_requirements.txt # Dev tools (ruff, mypy, sphinx)
-│ ├── test_requirements.txt # Test dependencies
-│ ├── nlp_requirements.txt # Optional NLP dependencies
-│ ├── all_requirements.txt # All of the above combined
-│ └── compile.sh # Script to regenerate requirements
├── pyproject.toml # Package metadata, build config, tool settings
├── .pre-commit-config.yaml # Pre-commit hooks configuration
-└── Dockerfile # Docker image for reproducible environment
+├── Dockerfile # Docker image for reproducible environment
+└── uv.lock # uv lockfile with pinned dependencies
```
### Key Files and Configurations
- **pyproject.toml**: Package definition, dependencies, Ruff/MyPy/pytest config. Single source of truth for dependencies (unpinned).
-- **requirements/*.txt**: Pinned versions generated by `compile.sh`. CI uses these for reproducible builds.
- **src/sec_certs/rules.yaml**: Regular expressions for extracting data from certificates. Add patterns here.
- **src/sec_certs/configuration.py**: Runtime configuration using pydantic-settings. Reads from env vars with `SECCERTS_` prefix.
- **.pre-commit-config.yaml**: Defines pre-commit hooks (ruff, mypy). Versions should match pyproject.toml.
@@ -247,8 +232,8 @@ sec-certs/
1. Create branch from `main` (only stable branch for PRs)
2. Make minimal code changes
3. Add tests in appropriate `tests/` subdirectory
-4. Run linters: `pre-commit run --all-files` or `ruff check . && mypy .`
-5. Run tests: `pytest tests -m "not remote" -v`
+4. Run linters: `uv run pre-commit run --all-files` or `uv run ruff check . && uv run mypy .`
+5. Run tests: `uv run pytest tests -m "not remote" -v`
6. Update docs if public API changed
7. Commit and push (CI will validate)
@@ -257,8 +242,7 @@ sec-certs/
```bash
# Edit pyproject.toml to add/update dependency
# Regenerate pinned requirements
-cd requirements
-./compile.sh
+uv lock
# Commit both pyproject.toml and requirements/*.txt changes
```
@@ -272,7 +256,7 @@ dset = CCDataset.from_web() # Downloads from sec-certs.org
**Processing from scratch (requires full setup, takes hours, DO NOT DO THIS):**
```bash
-sec-certs cc all -o ./dataset
+uv run sec-certs cc all -o ./dataset
```
## Common Pitfalls and Gotchas
@@ -285,17 +269,13 @@ sec-certs cc all -o ./dataset
4. **Java in PATH**: Required for FIPS table parsing. Verify with `java -version`.
-5. **pip-sync on GitHub Actions**: Don't use it with system packages. Use `pip install -r requirements/*.txt` instead.
-
-6. **Test markers**: Exclude flaky remote tests with `-m "not remote"` for stable local testing.
-
-7. **Import from src**: When running without install, set `PYTHONPATH=src:$PYTHONPATH` to import sec_certs modules.
+5. **Test markers**: Exclude flaky remote tests with `-m "not remote"` for stable local testing.
-8. **Default dataset location**: CLI creates `./dataset` by default. Add to .gitignore if working locally.
+6. **Default dataset location**: CLI creates `./dataset` by default. Add to .gitignore if working locally.
-9. **Pre-commit hook behavior**: Pre-commit hooks warn about issues but don't auto-fix. Run `ruff check . --fix` to apply fixes.
+7. **Pre-commit hook behavior**: Pre-commit hooks warn about issues but don't auto-fix. Run `ruff check . --fix` to apply fixes.
-10. **Long-running commands**: Full dataset processing (`sec-certs cc all`) takes hours. Use pre-processed datasets from web for analysis.
+8. **Long-running commands**: Full dataset processing (`sec-certs cc all`) takes hours. Use pre-processed datasets from web for analysis.
## Additional Resources
@@ -313,7 +293,7 @@ sec-certs cc all -o ./dataset
These instructions have been validated by examining repository structure, workflows, documentation, and testing commands. When working on this repository:
1. **Trust these build/test commands** - they are verified to work
-2. **Follow the setup order** (system deps → pip deps → spacy model → install)
+2. **Follow the setup order** (system deps → python deps and install (uv sync) → spacy model)
3. **Only search/explore if** these instructions are incomplete or incorrect
4. **Refer to these instructions first** before trying alternative approaches
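The setup order described in the instructions above (system dependencies, then `uv sync`, then the spaCy model) can be checked with a short script. The following is a minimal sketch and not part of this commit; the helper name `check_setup` and the choice of checks are illustrative assumptions. It only reports whether each executable or module can be located, not whether it works.

```python
import importlib.util
import shutil


def check_setup() -> dict[str, bool]:
    """Report which pieces of the development setup can be located.

    Hypothetical helper: the names checked mirror the setup steps in
    copilot-instructions.md (uv, Java, the package, the spaCy model).
    """
    return {
        # Executables expected on PATH
        "uv": shutil.which("uv") is not None,
        "java": shutil.which("java") is not None,
        # Importable top-level modules; find_spec returns None when one is missing
        "sec_certs": importlib.util.find_spec("sec_certs") is not None,
        "en_core_web_sm": importlib.util.find_spec("en_core_web_sm") is not None,
    }


if __name__ == "__main__":
    for name, found in check_setup().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Saved under a name of your choosing, it would be run inside the project environment with `uv run python <script>`, so that the modules installed by `uv sync` are visible.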
diff --git a/.github/workflows/cron.yml b/.github/workflows/cron.yml
index 3ca22f49..0ef4a551 100644
--- a/.github/workflows/cron.yml
+++ b/.github/workflows/cron.yml
@@ -17,22 +17,17 @@ jobs:
- name: Install Poppler
run: sudo apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev
- uses: actions/checkout@v4
- - name: Setup python
- uses: actions/setup-python@v5
+ - name: Install uv and Python
+ uses: astral-sh/setup-uv@v7
with:
- python-version: "3.10"
- cache: "pip"
- cache-dependency-path: |
- requirements/test_requirements.txt
- - name: Install python dependencies
- run: |
- pip install -r requirements/test_requirements.txt
+ python-version: ${{ matrix.python-version }}
+ enable-cache: true
- name: Install sec-certs
run: |
- pip install -e .
- python -m spacy download en_core_web_sm
+ uv sync --locked --dev
+ uv run spacy download en_core_web_sm
- name: Run tests
- run: pytest --cov=sec_certs -m "remote" --junitxml=junit.xml -o junit_family=legacy tests
+ run: uv run pytest --cov=sec_certs -m "remote" --cov-report=xml --cov-report=term --junitxml=junit.xml -o junit_family=legacy tests
continue-on-error: true
- name: Test summary
if: always()
@@ -48,4 +43,4 @@ jobs:
if: ${{ !cancelled() }}
uses: codecov/test-results-action@v1
with:
- token: ${{ secrets.CODECOV_TOKEN }}
\ No newline at end of file
+ token: ${{ secrets.CODECOV_TOKEN }}
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 6deb8984..eb432751 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -11,24 +11,22 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- - uses: actions/setup-python@v5
- with:
- python-version: "3.10"
- cache: "pip"
- cache-dependency-path: |
- requirements/dev_requirements.txt
- name: apt-get update
run: sudo apt-get update
- name: Install external dependencies
run: sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python3-dev -y
- - name: Install sec-certs and deps
+ - name: Install uv and Python
+ uses: astral-sh/setup-uv@v6
+ with:
+ python-version: ${{ matrix.python-version }}
+ enable-cache: true
+ - name: Install sec-certs
run: |
- pip install -r requirements/dev_requirements.txt
- pip install -e .
+ uv sync --locked --dev
- name: Build docs
run: |
cd docs
- make html
+ uv run make html
- name: Save docs artifact
uses: actions/upload-artifact@v4.4.0
with:
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 55d096f1..3ad5a317 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -7,7 +7,7 @@ jobs:
pypi_release:
name: Release on PyPi
runs-on: ubuntu-22.04
- if: github.repository == 'crocs-muni/sec-certs'
+ #if: github.repository == 'crocs-muni/sec-certs'
environment:
name: pypi
url: https://pypi.org/project/sec-certs/
@@ -17,19 +17,16 @@ jobs:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- - name: Set up Python
- uses: actions/setup-python@v5
+ - name: Install uv and Python
+ uses: astral-sh/setup-uv@v7
with:
- python-version: "3.10"
- - name: apt-get update
- run: sudo apt-get update
- - name: Install build dependencies
- run: python -m pip install build
- - name: Build distributions
- shell: bash -l {0}
- run: python -m build
+ python-version: "3.10"
+ enable-cache: true
+ - name: Build
+ run: uv build
- name: Publish package to PyPI
- uses: pypa/gh-action-pypi-publish@release/v1
+ if: false
+ run: uv publish
docker_release:
name: Release on DockerHub
environment:
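One detail worth keeping in mind for the `python-version` inputs in these workflows: YAML resolves an unquoted `3.10` as the number 3.1, so the value has to be quoted to survive as the string "3.10". The sketch below is not part of this commit. It uses the standard-library `json` module as a stand-in for a YAML parser (an assumption made to avoid a third-party dependency), since both resolve the bare token to a float.

```python
import json

# Stand-in for a YAML parser: like YAML, JSON resolves a bare 3.10 as a number.
unquoted = json.loads("3.10")
quoted = json.loads('"3.10"')

print(repr(unquoted))  # 3.1   (the trailing zero is lost)
print(repr(quoted))    # '3.10'
```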
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
index a7ee60a2..e39ce058 100644
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -21,22 +21,17 @@ jobs:
- name: Install Poppler
run: sudo apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev
- uses: actions/checkout@v4
- - name: Setup python ${{ matrix.python-version }}
- uses: actions/setup-python@v5
+ - name: Install uv and Python
+ uses: astral-sh/setup-uv@v7
with:
python-version: ${{ matrix.python-version }}
- cache: "pip"
- cache-dependency-path: |
- requirements/test_requirements.txt
- - name: Install python dependencies
- run: |
- pip install -r requirements/test_requirements.txt
+ enable-cache: true
- name: Install sec-certs
run: |
- pip install -e .
- python -m spacy download en_core_web_sm
+ uv sync --locked --dev
+ uv run spacy download en_core_web_sm
- name: Run tests
- run: pytest --cov=sec_certs -m "not remote" --junitxml=junit.xml -o junit_family=legacy tests
+ run: uv run pytest --cov=sec_certs -m "not remote" --cov-report=xml --cov-report=term --junitxml=junit.xml -o junit_family=legacy tests
- name: Test summary
if: always()
uses: test-summary/action@v2