Validating Quality in Large-Scale Digitization

Co-occurrence of Error: First Production Run Results

The present page-level error analysis assumes independence among the 11 error types. It is likely, however, that some detected errors co-occur, whether in absolute frequency (regardless of severity), in raw severity level (1-5), or in some combination of severity levels across the error set. Additionally, co-occurrence may be most meaningful across categories of error and not simply between pairs of errors. Three page-level analyses were conducted on the data from the first production run: (1) the extent of co-occurrence of any pair of error types, regardless of severity level > 1 (bronze standard); (2) the extent of co-occurrence of either text cluster or illustration errors with page cluster errors; and (3) the extent of co-occurrence of pairs of errors when severity is accounted for. Results
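As an illustration of the pairwise analysis described above, the following is a minimal sketch of how page-level co-occurrence counts might be tallied. It is not the procedure used in the study. The function name `pairwise_cooccurrence`, the error-type names, the sample severities, and the reading of "severity level > 1" as a presence threshold are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def pairwise_cooccurrence(pages, threshold=1):
    """Count pages on which each pair of error types is present together.

    pages: iterable of dicts mapping an error-type name to its severity.
    An error counts as present when its severity exceeds `threshold`
    (assumed here to correspond to "severity level > 1").
    """
    counts = Counter()
    for severities in pages:
        # Sort names so each pair has one canonical ordering.
        present = sorted(
            name for name, sev in severities.items() if sev > threshold
        )
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical pages: error names and severities are invented for illustration.
pages = [
    {"blur": 3, "skew": 2, "crop": 0},
    {"blur": 2, "skew": 0, "crop": 4},
    {"blur": 1, "skew": 5, "crop": 2},
    {"blur": 4, "skew": 3, "crop": 0},
]

counts = pairwise_cooccurrence(pages)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Raising `threshold` restricts the tally to more severe errors, which is one simple way to account for severity. Comparing each observed count with the count expected under independence would be a natural next step, but it is not shown here.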