codefortulsa/oscn

OSCN utilities

A python library for scraping case information from the Oklahoma State Courts Network.

Update:

  • 04/2025: Completed migration to Selectolax. BeautifulSoup has been removed as a dependency. All parsers now use Selectolax.
  • 12/16/2024: Updated to use Selectolax instead of BeautifulSoup. Most parsing functions now use Selectolax which is faster and more memory efficient.

Contents

oscn > Python package source to provide an api for retrieving and parsing case records.

scripts > Python scripts showing use of the oscn package

  • example.py: demonstrates use of oscn.request Case and CaseList
  • retrieve-counts.py: saves a list of all counts for a list of counties and years
  • find-counts.py: saves a list of counts passing a test for a list of counties and years
  • soup_test.py: a stub for testing parsing attempts using BeautifulSoup
  • parse_test.py: a stub for developing using saved examples

OSCN package

oscn

  • counties: Returns a list of counties.
  • courts: Same as counties, but a more accurate description.
  • judges: Returns a list of objects formatted as {'name': 'Bond, James', 'number': '007'}
  • types: Returns a dict of case type codes and descriptions
  • type: function to return case type description. Usage:
    >>> oscn.type("AO")
    'CIVIL ADMINISTRATIVE'
    

oscn.request

  • Case: Returns a single case. Case can be saved as files using Case.save() and retrieved using Case.open().

  • CaseList: Returns an iterator for retrieving cases for a county and year. CaseLists can be filtered using .find(). See scripts/example.py for details

  • Party: Returns information on parties available on OSCN.

  • Docket: Returns the docket of cases for a specific judge and date

oscn.parse

Parsers accept the html of an OSCN page and return python objects.

Case Page Parsers

  • filed: returns a string of the filing date (e.g. 12/25/2017)
  • closed: returns a string of the date the case was closed. Returns None if the case is not closed.
  • counts: returns a list of count dicts found in a case. Keys include 'description' of the count. If available, 'violation' and 'disposed' are added.
  • judge: returns a string of the judge's name
  • parties: returns a list of dicts with these keys: id, name, type
  • docket: returns a list of rows in a docket
  • events: returns a list of dicts with these keys: event, party, docket, reporter, date, description. The keys date and description are cleaner versions of the event text. The event key will be deprecated some day so use date and description if you are starting a project.
  • attorneys: returns a list of dicts with these keys: name, address, and representing
  • issues: returns a list of dicts with issue information. Each issue includes a list of dicts for each party
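
As an illustration of working with the parser output listed above, the sketch below converts a filed string to a date and filters a list of count dicts by a keyword in 'description'. The helper names (parse_filed, counts_matching) and the sample data are hypothetical. Only the 'description' key and the 12/25/2017 date example come from this README.

```python
from datetime import date, datetime


def parse_filed(filed):
    """Convert a filing date string such as '12/25/2017' to a datetime.date."""
    return datetime.strptime(filed, "%m/%d/%Y").date()


def counts_matching(counts, keyword):
    """Return the count dicts whose 'description' contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [c for c in counts if keyword in c.get("description", "").lower()]


# Hypothetical sample shaped like the output of the counts parser.
sample_counts = [
    {"description": "LARCENY OF MERCHANDISE"},
    {"description": "OBSTRUCTING AN OFFICER"},
]

larceny_counts = counts_matching(sample_counts, "larceny")
filed_on = parse_filed("12/25/2017")
```

Using .get() for 'description' keeps the filter from raising if a count dict is missing that key.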

Party Page Parsers

  • name: returns 'Requested Party'
  • alias: returns 'Alias or Alternate Names'
  • profile: returns dict of values in 'Personal Profile'
  • birth_month: returns string of 'Birth Month and Year'
  • addresses: returns a list of dicts for each address

Docket Page Parsers

  • cases: returns a list of case indexes
  • tables: returns the html table for each case in the docket

oscn.find

  • CaseIndexes: returns an iterator of case indexes (e.g. tulsa-CF-2019-12).
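
A case index such as tulsa-CF-2019-12 appears to combine county, case type, year, and number. The sketch below splits one into its parts. It is a minimal illustration under that assumption, not part of the oscn package, and the names CaseIndexParts and split_index are hypothetical.

```python
from typing import NamedTuple


class CaseIndexParts(NamedTuple):
    county: str
    case_type: str
    year: int
    number: int


def split_index(index):
    """Split an index like 'tulsa-CF-2019-12' into its four parts.

    Splitting from the right keeps any hyphen inside the county
    portion intact (an assumption about how such names would appear).
    """
    county, case_type, year, number = index.rsplit("-", 3)
    return CaseIndexParts(county, case_type, int(year), int(number))


parts = split_index("tulsa-CF-2019-12")
```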

Usage

Create a CaseIndexes list using these keyword arguments:

  • county: defaults to all
  • last_name: use this for company or organization names
  • first_name: optional
  • middle_name: optional
  • filed_after: More readable than FiledDateL
  • filed_before: More readable than FiledDateH
  • closed_after: More readable than ClosedDateL
  • closed_before: More readable than ClosedDateH

Notes

  • The % wildcard is added to every word in the last, first, and middle names
  • Date arguments use MM/DD/YYYY strings.

OSCN search parameters

If you are familiar with the OSCN search parameters, you can initialize CaseIndexes using them as keyword arguments: db, number, lname, fname, mname, DoBMin, DoBMax, partytype, apct, dcct, FiledDateL, FiledDateH, ClosedDateL, ClosedDateH, iLC, iLCType, iYear, iNumber, and citation

Using these will override the readable keyword values such as first_name or filed_after.
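
To make the relationship between the readable keywords and the OSCN parameters concrete, here is a rough sketch of how such a translation could work. It is not the package's implementation. The function build_params, the name mapping (last_name to lname, and so on), and the choice to append % to each word are assumptions for illustration. The county keyword is left out because its OSCN equivalent is not stated here.

```python
from datetime import datetime

# Assumed mapping from readable keywords to OSCN search parameters.
NAME_KEYS = {"last_name": "lname", "first_name": "fname", "middle_name": "mname"}
DATE_KEYS = {
    "filed_after": "FiledDateL",
    "filed_before": "FiledDateH",
    "closed_after": "ClosedDateL",
    "closed_before": "ClosedDateH",
}


def add_wildcards(text):
    """Append the % wildcard to every word (assumed placement)."""
    return " ".join(word + "%" for word in text.split())


def build_params(**kwargs):
    """Translate readable keywords; raw OSCN parameters override them."""
    params, raw = {}, {}
    for key, value in kwargs.items():
        if key in NAME_KEYS:
            params[NAME_KEYS[key]] = add_wildcards(value)
        elif key in DATE_KEYS:
            # Raises ValueError unless the string is MM/DD/YYYY.
            datetime.strptime(value, "%m/%d/%Y")
            params[DATE_KEYS[key]] = value
        else:
            raw[key] = value
    params.update(raw)  # raw parameters win over readable ones
    return params


example = build_params(last_name="Acme Corp", filed_after="01/01/2019")
```

Applying the raw parameters last mirrors the override behavior described above.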

Installation

To install the latest version of this package, run:

    pip install oscn

For development, clone the repository and install the dependencies using poetry:

    git clone git@github.com:codefortulsa/oscn.git
    cd oscn
    poetry install

Usage

Script example:

import oscn

Request a single case:

oscn.request.Case(county='tulsa', year='2018', number=84)

or use case index notation:

oscn.request.Case('love-CF-2019-25')

To request a list of cases to iterate:

oscn.request.CaseList(county='adair', year='2016')

Run test scripts

  • poetry run pytest tests/

or with ipdb:

  • poetry run pytest -s tests/

specify a test:

  • poetry run pytest -s tests/test_parse.py -k 'test_events'

parallel testing:

  • poetry run pytest -n auto

Deployment steps

To deploy a new version of the package, follow these steps:

  1. poetry version patch
  2. poetry build
  3. poetry publish

User Agent

In some cases a custom user agent is required in the header of requests. Setting an environment variable called OSCN_USER_AGENT will override the default.
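
The override pattern described here can be sketched as follows. This is only an illustration of reading OSCN_USER_AGENT with a fallback. The function request_headers and the DEFAULT_USER_AGENT value are hypothetical placeholders, not the package's actual default.

```python
import os

# Hypothetical placeholder; the package's real default is not shown here.
DEFAULT_USER_AGENT = "example-agent/0.0"


def request_headers(environ=os.environ):
    """Build request headers, preferring OSCN_USER_AGENT when it is set."""
    user_agent = environ.get("OSCN_USER_AGENT", DEFAULT_USER_AGENT)
    return {"User-Agent": user_agent}


# Passing a plain dict makes the behavior easy to check without
# changing the real environment.
custom = request_headers({"OSCN_USER_AGENT": "my-research-bot/1.0"})
fallback = request_headers({})
```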

This project is open-source and contributions are welcome.


Release Notes

Version 0.0.98

  • Completed full migration from BeautifulSoup to Selectolax. All parsers (parties, attorneys, counts, issues, events, docket, cmids, party_addresses, party_profile, party_properties, docket_report, find/parse) now use Selectolax exclusively.
  • Removed beautifulsoup4 as a dependency.
  • Removed global SSL warning suppression that was masking urllib3 warnings.

Version 0.0.92

  • Added the .documents property to the Case object, which parses the case docket and returns a list of all linked documents. Each document is a dictionary with the following structure:

    {
      "id": 1234567,
      "title": "MOTION TO COMPEL",
      "url": "https://www.oscn.net/dockets/GetDocument.aspx?...",
      "date": "01-01-2024",
      "code": "MO",
      "party": "PLAINTIFF"
    }
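
Because each document is a plain dictionary, the list can be filtered with ordinary Python. The sketch below selects documents by their "code" value. The helper documents_with_code is hypothetical, the first sample entry copies the structure shown above, and the second entry is invented for illustration.

```python
def documents_with_code(documents, code):
    """Return the document dicts whose 'code' equals the given code."""
    return [doc for doc in documents if doc.get("code") == code]


# Sample data shaped like the list returned by the .documents property.
sample_documents = [
    {
        "id": 1234567,
        "title": "MOTION TO COMPEL",
        "url": "https://www.oscn.net/dockets/GetDocument.aspx?...",
        "date": "01-01-2024",
        "code": "MO",
        "party": "PLAINTIFF",
    },
    # Invented entry, for illustration only.
    {"id": 7654321, "title": "ORDER", "code": "O", "party": None},
]

motions = documents_with_code(sample_documents, "MO")
motion_titles = [doc["title"] for doc in motions]
```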
