path: root/src/sec_certs/dataset/cc.py
Commit message  Author  Age  Files  Lines
* Fix aux dset dir warning.  J08nY  2025-02-28  1  -6/+10
* Get rid of DUMMY_NONEXISTING_PATH.  J08nY  2025-02-15  1  -3/+23
* Actually use Dataset.__init__.  J08nY  2025-02-14  1  -14/+3
* Fix mutable default state.  J08nY  2025-02-14  1  -19/+19
* PPs: change dset filename pp.json -> dataset.json  Adam Janovsky  2025-02-13  1  -1/+1
* Fix PP and CC MU dataset path.  J08nY  2025-02-05  1  -2/+2
* Add docs about dataset layout.  J08nY  2025-02-02  1  -1/+39
* Remove processed_pp_dataset_root_dir, let PP dataset handler figure out the d...  J08nY  2025-02-02  1  -13/+0
* replace from_web_latest() with from_web()  Adam Janovsky  2025-02-01  1  -55/+8
* get rid of duplicate CC URL constant  Adam Janovsky  2025-02-01  1  -14/+12
* don't delete CCSchemeDatasetHandler when skipping schemes processing  Adam Janovsky  2025-02-01  1  -1/+1
* implement PP processing  Adam Janovsky  2025-02-01  1  -81/+46
* fix notebooks  Adam Janovsky  2025-02-01  1  -1/+1
* refactor auxiliary dataset handling, heuristics computation  Adam Janovsky  2025-02-01  1  -247/+67
* Add computation of previous and next certificate versions based on ID.  J08nY  2024-11-20  1  -0/+12
* Improve CC scheme extraction and matching.  J08nY  2024-11-08  1  -2/+2
* Cleanup of documentation.  J08nY  2024-10-19  1  -3/+2
* Merge pull request #446 from crocs-muni/feat/full-dset-archive-download  Ján Jančár  2024-10-18  1  -22/+125
|\
| * Unify and add options to from_web_latest.  J08nY  2024-10-18  1  -48/+120
| * Add a way to download full dataset archive (including PDFs) from the web.  J08nY  2024-10-17  1  -1/+32
* | Fix mapping of certs to PPs broken due to CC URL change.  J08nY  2024-10-16  1  -2/+7
|/
* Move to new dgst algorithm for CC.  J08nY  2024-07-19  1  -6/+17
* replace seccerts -> sec-certs  Adam Janovsky  2024-04-25  1  -1/+1
* Add tests for cert data extraction.  J08nY  2024-02-14  1  -1/+2
* Refactor document state in CC.  J08nY  2024-02-13  1  -14/+14
* Add extraction of certificate data.  J08nY  2024-02-08  1  -14/+90
* Fix pandas deprecation in to_datetime.  J08nY  2024-02-01  1  -3/+3
* Fix CC CSV and HTML parsing.  J08nY  2024-01-02  1  -2/+5
* Merge branch 'bump-req-python-to-3-10' into reference-analysis  Adam Janovsky  2023-11-14  1  -1/+2
|\
| * bump required python to 3.8  Adam Janovsky  2023-11-14  1  -1/+2
* | merge fresh main  Adam Janovsky  2023-11-14  1  -34/+32
|\|
| * Add rudimentary profiling.  J08nY  2023-08-24  1  -34/+32
* | bump references  adamjanovsky  2023-11-14  1  -68/+1
* | adjust ReferenceSegmentExtractor to work with OCR-segmented jsons  Adam Janovsky  2023-07-19  1  -1/+1
* | merge main  Adam Janovsky  2023-06-07  1  -779/+62
|\|
| * coerce problematic datetime values in cert csvs/dfs/htmls  Adam Janovsky  2023-05-18  1  -4/+4
| * Move scheme data matching to heuristics.  J08nY  2023-04-27  1  -12/+16
| * Compute scheme data matches in heuristics, after cert_ids are computed.  J08nY  2023-04-27  1  -12/+14
| * Merge pull request #328 from crocs-muni/issue/324-Switch-from-NVD-data-feeds-...  Ján Jančár  2023-04-24  1  -10/+7
| |\
| | * Merge branch 'fix/dup-dedup' into issue/324-Switch-from-NVD-data-feeds-to-API  J08nY  2023-04-21  1  -767/+42
| | |\
| | * | defer few imports  Adam Janovsky  2023-04-16  1  -1/+2
| | * | fix pre-commit problems outside of tests  Adam Janovsky  2023-04-13  1  -8/+5
| * | | Assign scheme data to archived certs as well.  J08nY  2023-04-21  1  -4/+8
| | |/
| |/|
| * | Move CC scheme classes to sample file.  J08nY  2023-04-20  1  -1/+2
| * | Introduce CC Scheme sample class.  J08nY  2023-04-20  1  -3/+2
| * | Add match filtering based on validation date.  J08nY  2023-04-18  1  -0/+1
| * | Make CCSchemeDataset an actual dataset.  J08nY  2023-04-18  1  -3/+38
| * | Move CC Scheme dataset to separate file.  J08nY  2023-04-11  1  -761/+1
| |/
| * switch to pydantic in settings management  Adam Janovsky  2023-03-29  1  -1/+1
* | Reference annotations: finalize prediction pipeline  adamjanovsky  2023-03-17  1  -8/+30