The importance of the Common Data Format

GRC and SecOps tools must communicate through APIs, which means they use some form of JSON data structures.

It is impossible to create a unified data structure between them because each organization is loosely bound to the others through a federation approach (you do your thing, I’ll do mine, we’ll meet somewhere in the middle for some things).

Therefore, to share machine-readable content, a Common Data Format has to be established and maintained. It must allow the elements of compliance data that should be shared to be shared, without constraining any content provider or developer who wishes to extend the data model for their own purposes.
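
To make the idea concrete, here is a minimal sketch of a shared core with provider-specific extensions. Everything in it is an assumption made for illustration: the field names in COMMON_FIELDS, the "x_" extension prefix, and both sample records are hypothetical and do not come from any of the schemas listed below.

```python
from typing import Any

# Hypothetical set of fields every participant agrees to share.
COMMON_FIELDS = {"id", "title", "description"}


def to_common(record: dict[str, Any]) -> dict[str, Any]:
    """Keep only the agreed common fields; fail if a required one is missing."""
    missing = COMMON_FIELDS - set(record)
    if missing:
        raise ValueError(f"record lacks common fields: {sorted(missing)}")
    return {key: record[key] for key in sorted(COMMON_FIELDS)}


# Two hypothetical providers extend the shared core in their own ways.
provider_a = {
    "id": "ctl-1",
    "title": "Access control",
    "description": "Limit access.",
    "x_a_risk_score": 7,
}
provider_b = {
    "id": "ctl-1",
    "title": "Access control",
    "description": "Limit access.",
    "x_b_owner": "security-team",
}

# Both reduce to the same shareable record despite different extensions.
assert to_common(provider_a) == to_common(provider_b)
```

The design choice worth noting is that the common format only constrains what must be present; it says nothing about what else a provider may add.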

In any burgeoning field, there will be almost as many interpretations of Compliance as Code as there are practitioners. Here are the main players within the Compliance as Code universe as of this writing. We’ve listed who they are, their significance, and where you can find their schema.

Akoma Ntoso

Akoma Ntoso (“linked hearts” in the Akan language of West Africa), an initiative of the “Africa i-Parliament Action Plan”[1] and a program of UN/DESA[2], defines a set of electronic representations in XML format of parliamentary, legislative, and judiciary documents[3]. In 2018 it became an OASIS standard[4], and has spun off OASIS’ LegalDocML (see reference under LegalXML).

https://theucf.info/schemas-AkomaNtoso

BibTeX

BibTeX is a reference management tool widely used with LaTeX for formatting bibliographies and citations. The BibTeX format is structured into three main parts: the entry type (e.g., book, article), the citekey (a unique identifier), and a list of key-value pairs containing bibliographic information (e.g., title, author, year). The format is flexible, allowing easy management of references in large documents, such as theses or research papers.
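
The three-part structure described above can be sketched with a small parser. This is a simplified illustration, not a full BibTeX implementation: it assumes brace-delimited values with no nested braces, and it ignores quoted values, @string macros, and comments. The sample entry and the function name parse_entry are hypothetical.

```python
import re

# Hypothetical entry used only for illustration.
SAMPLE = """@article{doe2020example,
  title = {An Example Title},
  author = {Doe, Jane},
  year = {2020}
}"""

# Part 1 and 2: "@type{citekey,"
HEADER = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# Part 3: "key = {value}" pairs (no nested braces in this sketch).
FIELD = re.compile(r"(\w+)\s*=\s*\{([^{}]*)\}")


def parse_entry(text: str) -> dict:
    """Split one BibTeX entry into entry type, citekey, and fields."""
    text = text.strip()
    header = HEADER.match(text)
    if header is None:
        raise ValueError("not a BibTeX entry")
    entry_type, citekey = header.groups()
    body = text[header.end():]
    fields = {name.lower(): value.strip() for name, value in FIELD.findall(body)}
    return {"type": entry_type.lower(), "citekey": citekey, "fields": fields}


entry = parse_entry(SAMPLE)
print(entry["type"], entry["citekey"], entry["fields"]["year"])
```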

https://theucf.info/BibTex

Common Platform Enumeration (CPE)

CPE is a structured naming scheme for information technology systems, software, and packages. Based upon the generic syntax for Uniform Resource Identifiers (URI), CPE includes a formal name format, a method for checking names against a system, and a description format for binding text and tests to a name[5].
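
As a rough illustration of the name format, the sketch below splits a name written in the CPE 2.3 "formatted string" binding into its eleven attributes. The attribute order follows our understanding of the CPE 2.3 naming specification; the function name parse_cpe23 is hypothetical, and the splitting rule is a simplification that treats a backslash-escaped colon as part of the value without handling other escape sequences.

```python
import re

# Attribute order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE23_ATTRIBUTES = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(name: str) -> dict[str, str]:
    """Split a CPE 2.3 formatted string into named attributes."""
    # Split on colons that are not escaped with a backslash.
    parts = re.split(r"(?<!\\):", name)
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {name!r}")
    return dict(zip(CPE23_ATTRIBUTES, parts[2:]))


parsed = parse_cpe23("cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:*")
print(parsed["vendor"], parsed["product"], parsed["version"])
```

In this binding, "*" stands for "any value", which is why most trailing attributes in the sample name are wildcards.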

https://theucf.info/schemas-CPE

Common Vulnerabilities and Exposures (CVE)

CVE® is a list of records containing an identification number, a description, and at least one public reference for publicly known cybersecurity vulnerabilities. CVE Records are used in numerous cybersecurity products and services from around the world, including the U.S. National Vulnerability Database (NVD)[6].
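
The three required parts of a record (identifier, description, at least one public reference) can be sketched as a small validated data structure. The class and field names here are our own and are not the official CVE record schema; the sample identifier and URL are placeholders. The identifier pattern assumes the CVE-YYYY-NNNN form with four or more digits in the sequence number.

```python
import re
from dataclasses import dataclass

# CVE-<4-digit year>-<sequence of 4 or more digits>
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


@dataclass(frozen=True)
class CveRecord:
    """Illustrative record: an ID, a description, and public references."""

    cve_id: str
    description: str
    references: tuple[str, ...]

    def __post_init__(self) -> None:
        if not CVE_ID.fullmatch(self.cve_id):
            raise ValueError(f"malformed CVE ID: {self.cve_id!r}")
        if not self.description.strip():
            raise ValueError("a description is required")
        if not self.references:
            raise ValueError("at least one public reference is required")


# Placeholder values, not a real vulnerability.
record = CveRecord(
    cve_id="CVE-2099-0001",
    description="Placeholder description of a hypothetical flaw.",
    references=("https://example.com/advisory/1",),
)
print(record.cve_id)
```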

https://theucf.info/schemas-CVE

Citation Style Language (CSL)

Citation Style Language's goal is to facilitate scholarly publishing by automating the formatting of citations and bibliographies[7].

https://theucf.info/schemas-CSL

Control Correlation Identifier (CCI)

The US Department of Defense Control Correlation Identifier (CCI) provides a standard identifier and description for each of the singular, actionable statements that comprise an IA control or IA best practice[8]. CCI bridges the gap between high-level policy expressions and low-level technical implementations. CCI allows a security requirement that is expressed in a high-level policy framework to be decomposed and explicitly associated with the low-level security setting(s) that must be assessed to determine compliance with the objectives of that specific security control. This ability to trace security requirements from their origin (e.g., regulations, IA frameworks) to their low-level implementation allows organizations to demonstrate compliance to multiple IA compliance frameworks readily. CCI also provides a means to objectively roll up and compare related compliance assessment results across disparate technologies.

https://theucf.info/schemas-CCI

Credential Transparency Description Language (CTDL)

The CTDL family of specifications is intended to describe "things" such as a Credential, Organization, Assessment, Learning Opportunity, Competency, and so on. The CTDL is designed to enable:

  1. Simple descriptions that can serve as a basis for website markup; and
  2. Rich descriptions that support fairly refined comparisons among credentials.

https://theucf.info/schemas-CDTL

Derived Relationship Mapping (DRM)

The NIST Derived Relationship Mapping (DRM) combines a Software as a Service analysis tool, a JSON structure, and a methodology for mapping various Authority Documents (NIST calls them Reference Documents) to NIST’s reference framework (NIST calls it the Focal Document). The Analysis Tool allows users to generate DRMs for Reference Documents with the Cybersecurity Framework as the Focal Document. The DRMs are non-authoritative and represent a starting point when attempting to compare Reference Documents[9]. See Sections 3.3 – 3.5 of NIST Interagency Report (IR) 8278, National Cybersecurity OLIR Program: Guidelines for OLIR Users and Developers, for additional guidance on understanding and using Derived Relationship Maps[10].

https://theucf.info/schemas-DRM

Dictionary Society joint JSON structure

In late 2020, GRCSchema proposed a joint JSON structure for all online dictionaries worldwide, starting with dictionaries within the Dictionary Society of North America (DSNA). As of this writing, compliancedictionary.com, Merriam-Webster, Oxford English Dictionary, and Wordnik have all agreed to participate in a joint JSON structure.

This is forthcoming and will be hosted at GRCSchema.org.

Functional Requirements for Bibliographic Records

The Functional Requirements for Bibliographic Records (FRBR) is a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) that relates user tasks of retrieval and access in online library catalogs and bibliographic databases[11].

https://theucf.info/schemas-FRBR

GRCSchema

GRCschema.org is a collaborative community activity with a mission to create, maintain, and promote schemas for structured data within the Governance, Risk, and Compliance universe. Its vocabulary can be used with the JSON-LD encoding and covers entities that converge the schemas designed for NIST’s Informative Reference Catalog, NIST’s Open Security Controls Assessment Language (OSCAL), TagVault.org’s Software Identification Tags (SWID Tags), the Unified Compliance Framework, and SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics[12].

https://theucf.info/schemas-GRC

LegalXML

LegalXML[13], managed by OASIS Open[14], is split into two working groups: the LegalDocumentML (LegalDocML) TC[15] and the LegalRuleML TC[16]. The OASIS LegalDocML TC works to advance worldwide best practices for using XML within a parliament’s, assembly’s, or congress’s document management processes, within courts’ and tribunals’ judgment management systems, and generally in legal documents, including contracts. The work is based on the Akoma Ntoso-UN project. The OASIS LegalRuleML TC defines a rule interchange language for the legal domain. The work enables modeling and reasoning, allowing implementers to structure, evaluate, and compare legal arguments constructed using the provided rule representation tools.

https://theucf.info/schemas-LegalXML

Merriam-Webster

The Merriam-Webster Dictionary API provides a structured JSON format to access a wide range of dictionary data, including headword metadata, definitions, pronunciation, etymology, and cross-references. The documentation explains the JSON schema for various dictionaries, such as the Collegiate, Medical, and Spanish-English dictionaries, and offers examples and XML equivalents for easy integration. This API is useful for developers needing comprehensive dictionary data for applications.

https://theucf.info/merriam-api

MITRE ATT&CK

MITRE ATT&CK is a knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community[17].

https://theucf.info/schemas-MitreAtt&ck

O*Net

The O*NET Program is the United States' primary source of occupational information. Central to the project is the O*NET database, containing hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy.

https://theucf.info/schemas-ONET

Open Policy Agent (OPA)

Open Policy Agent (OPA) is a project started in 2016 that aims to unify policy enforcement across different technologies and systems[18]. It is an open-source, general-purpose policy engine. OPA provides a high-level declarative language called Rego that lets you specify policy as code, along with simple APIs to offload policy decision-making from your software[19]. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.
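
To show what offloading a policy decision can look like from the calling side, here is a minimal sketch that builds a query for OPA's REST Data API, which (as we understand it) accepts a POST to /v1/data/<policy path> with the input document under an "input" key and answers with a "result" key. The server address, the policy path httpapi/authz/allow, and the input document are assumptions for illustration; the sketch builds the request but only sends it if you call query_opa against a running server.

```python
import json
import urllib.request
from typing import Any


def build_opa_query(base_url: str, policy_path: str, input_doc: dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a Data API query for one policy decision."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_opa(request: urllib.request.Request) -> Any:
    """Send the query; returns None when the decision is undefined."""
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("result")


# Hypothetical server address, policy path, and input document.
request = build_opa_query(
    "http://localhost:8181",
    "httpapi/authz/allow",
    {"user": "alice", "method": "GET", "path": ["reports"]},
)
print(request.get_method(), request.full_url)
```

The calling software never evaluates the policy itself; it only supplies the input and acts on the returned decision.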

https://theucf.info/schemas-OPA

Open Security Controls Assessment Language (OSCAL)

NIST, in collaboration with industry, is developing the Open Security Controls Assessment Language (OSCAL). OSCAL is a set of formats expressed in XML, JSON, and YAML. These formats provide machine-readable representations of control catalogs, control baselines, system security plans, and assessment plans and results. The Federal Risk and Authorization Management Program (FedRAMP) office within the General Services Administration began partnering with NIST on OSCAL in 2019[20] to develop machine-readable System Security Plans (SSPs)[21] so that machine readability can be applied to the publication, implementation, and assessment of security controls.
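
As an illustration of what a machine-readable control catalog makes possible, the sketch below walks a heavily trimmed, catalog-shaped JSON document and lists every control identifier. The key names (catalog, metadata, groups, controls) follow our understanding of the OSCAL catalog model; the UUID, titles, and version strings are placeholders, and the document has not been validated against the NIST schemas.

```python
from typing import Any, Iterator

# Trimmed, hypothetical catalog; real OSCAL catalogs carry far more detail.
catalog_doc = {
    "catalog": {
        "uuid": "00000000-0000-4000-8000-000000000000",
        "metadata": {
            "title": "Example Catalog",
            "last-modified": "2021-01-01T00:00:00Z",
            "version": "0.1",
            "oscal-version": "1.0.0",
        },
        "groups": [
            {
                "id": "ac",
                "title": "Access Control",
                "controls": [
                    {
                        "id": "ac-1",
                        "title": "Example policy control",
                        "controls": [{"id": "ac-1.1", "title": "Example enhancement"}],
                    },
                    {"id": "ac-2", "title": "Example account control"},
                ],
            }
        ],
    }
}


def iter_controls(container: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield controls depth-first, descending into nested controls and groups."""
    for control in container.get("controls", []):
        yield control
        yield from iter_controls(control)
    for group in container.get("groups", []):
        yield from iter_controls(group)


control_ids = [control["id"] for control in iter_controls(catalog_doc["catalog"])]
print(control_ids)
```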

https://theucf.info/schemas-OSCAL

Oxford Dictionaries

The Oxford Dictionaries API provides access to comprehensive dictionary data, allowing developers to integrate definitions, translations, synonyms, and more into their applications. It supports multiple languages and provides clear documentation, including examples and guidance on making API requests, even though an overarching JSON structure isn’t present. The API is designed for use in a wide range of apps and services that require linguistic data.

https://theucf.info/oxford-api

Reference Format for NIST Publications

The NIST Research Library provides a detailed guide on the proper format for referencing NIST publications, including technical series, books, conference papers, and more, using BibTeX or RIS formats. The guide emphasizes including specific elements such as author, year, title, and report numbers, tailored to the cited publication type. The guidelines ensure consistent citation practices, supporting the accuracy and credibility of NIST-related references across various documents.

https://theucf.info/NISTRefs

ReSpec

ReSpec makes it easier to write technical documents. It was originally designed for writing W3C specifications but now supports many output formats. A ReSpec document can be stored as JSON or rendered as an HTML document that brings in the ReSpec script, defines a few configuration variables, and follows a few conventions.

https://theucf.info/schemas-RecSpec

Rich Skill Descriptors

Implementers need to use skills in common ways to create an ecosystem of recognition around skills, where Achievements, Pathways, and Learner Records make machine-readable references to skills and allow systems to take action based on the skills learners hold. Rich Skill Descriptors (RSDs) build on CTDL-ASN to enable skill authors to publish definitions that can be referenced from digital credentials (including those that appear in learner records), pathways, and job profiles.

https://theucf.info/schemas-RSD

Simple Knowledge Organization System (SKOS)

The Simple Knowledge Organization System (SKOS) is a framework developed by the World Wide Web Consortium (W3C) for representing and sharing knowledge organization systems via the web. SKOS provides a standardized way to define concepts, labels, relationships, and hierarchies, making it easier to publish and link knowledge organization systems such as thesauri, classification schemes, and subject heading systems. By utilizing SKOS, organizations can ensure interoperability and enhance the discoverability of their structured data across different systems and platforms.
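
The concepts, labels, and hierarchical relationships described above can be sketched with plain dictionaries. The property names (skos:prefLabel, skos:altLabel, skos:broader) are real SKOS terms, but the "ex:" concepts and the dictionary layout are hypothetical; a production system would more likely store these as RDF.

```python
# Hypothetical concept scheme keyed by concept identifier.
scheme = {
    "ex:grc": {"skos:prefLabel": "Governance, Risk, and Compliance"},
    "ex:compliance": {
        "skos:prefLabel": "Compliance",
        "skos:altLabel": ["Regulatory compliance"],
        "skos:broader": ["ex:grc"],
    },
    "ex:risk": {"skos:prefLabel": "Risk", "skos:broader": ["ex:grc"]},
    "ex:audit": {"skos:prefLabel": "Audit", "skos:broader": ["ex:compliance"]},
}


def narrower_of(concept_id: str, concepts: dict) -> list[str]:
    """Derive narrower concepts as the inverse of skos:broader."""
    return sorted(
        cid for cid, concept in concepts.items()
        if concept_id in concept.get("skos:broader", [])
    )


def ancestors(concept_id: str, concepts: dict) -> list[str]:
    """Follow skos:broader links upward.

    skos:broader itself is not transitive; this closure corresponds to
    what SKOS calls skos:broaderTransitive.
    """
    seen: list[str] = []
    stack = list(concepts[concept_id].get("skos:broader", []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(concepts[current].get("skos:broader", []))
    return seen


print(narrower_of("ex:grc", scheme))
print(ancestors("ex:audit", scheme))
```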

https://theucf.info/schemas-skosref

Software Identification (SWID) Tags

NIST has developed a SWID Tag validation methodology and schema that can be used to verify that a produced SWID has properly implemented the requirements defined in NISTIR 8060[22]. TagVault has turned it into a queryable API and reference framework[23].

https://theucf.info/schemas-SWID

Standard Generalized Markup Language

The Standard Generalized Markup Language (SGML; ISO 8879:1986) is a standard for defining generalized markup languages for documents. ISO 8879 Annex A.1 states that generalized markup is “based on two postulates”:

  1. Declarative: Markup should describe a document’s structure and other attributes rather than specify the processing to be performed, because it is less likely to conflict with future developments.
  2. Rigorous: Markup should define objects rigorously, as programs and databases do, so that it can take advantage of the techniques available for processing them.

DocBook SGML and LinuxDoc are examples of SGML applications.

https://theucf.info/schemas-SGML

Structured Threat Information Expression (STIX™)

Structured Threat Information Expression (STIX™) is a language and serialization format used to exchange cyber threat intelligence (CTI)[24].
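
As a small illustration of the serialization format, the sketch below builds one indicator object as JSON. The property names follow our understanding of the STIX 2.1 required properties for an indicator; the function name make_indicator, the indicator name, and the pattern (which uses a documentation-range IP address) are hypothetical, and the output is not validated against the specification.

```python
import json
import uuid
from datetime import datetime, timezone


def make_indicator(pattern: str, name: str) -> dict:
    """Build a minimal indicator object with a fresh identifier."""
    # Millisecond-precision UTC timestamp ending in "Z".
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "type": "indicator",
        "spec_version": "2.1",
        # Identifiers take the form "<object type>--<UUID>".
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": now,
    }


indicator = make_indicator("[ipv4-addr:value = '198.51.100.1']", "Example indicator")
print(json.dumps(indicator, indent=2))
```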

https://theucf.info/schemas-STIX

Strategy Markup Language (StratML)

The Strategy Markup Language originated as an answer to the US E-Gov Act, which requires federal agencies to publish their strategic and performance plans and reports in searchable, machine-readable format[25]. This was followed by a series of Open Government Executive Orders, policies, and directives[26]. The goal of StratML is to facilitate the sharing, referencing, indexing, discovery, linking, reuse, and analyses of the elements of strategic plans, including goal and objective statements as well as the names and descriptions of stakeholder groups and any other content commonly included in strategic plans[27].

https://theucf.info/schemas-StratML

Trusted Automated eXchange of Intelligence Information (TAXII™)

Trusted Automated eXchange of Intelligence Information (TAXII™) is an application layer protocol for the communication of cyber threat information in a simple and scalable manner.
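
Because TAXII runs over HTTPS, a client's first step can be sketched as an ordinary HTTP request. The sketch below builds a discovery request, assuming TAXII 2.1 conventions as we understand them: the /taxii2/ discovery path and the application/taxii+json;version=2.1 media type. The server name is a placeholder and the function name build_discovery_request is hypothetical; the request is built but not sent.

```python
import urllib.request

# Media type a TAXII 2.1 client is expected to request.
TAXII_21_MEDIA_TYPE = "application/taxii+json;version=2.1"


def build_discovery_request(server_root: str) -> urllib.request.Request:
    """Build (but do not send) a request for a server's discovery document."""
    url = f"{server_root.rstrip('/')}/taxii2/"
    return urllib.request.Request(
        url,
        headers={"Accept": TAXII_21_MEDIA_TYPE},
        method="GET",
    )


# Placeholder host; a real client would also supply credentials.
request = build_discovery_request("https://taxii.example.com")
print(request.get_method(), request.full_url)
```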

https://theucf.info/schemas-TAXII

Unified Compliance Framework

The UCF has been at the forefront of compliance frameworks since before Michael Rasmussen coined the term GRC. The Unified Compliance team holds multiple patents covering compliance frameworks, dictionary structures, and more. Their structure and framework standard will be presented throughout[28].

See GRCSchema

Vocabulary for Event Recording and Incident Sharing (VERIS)

VERIS is a set of metrics designed to provide a common language for describing security incidents in a structured and repeatable manner[29].

https://theucf.info/schemas-VERIS

Wordnik

The Wordnik API offers many language resources, including definitions from multiple dictionaries, synonyms, antonyms, example sentences, and pronunciations. It supports more than 800,000 words and features word-of-the-day and random-word APIs. With data from the American Heritage Dictionary, WordNet, and others, Wordnik provides developers with a robust tool for linguistic applications. Signing up for an API key supports Wordnik’s non-profit mission to document every word in English.

https://theucf.info/wordnik-dev

Zotero

Zotero is a free, easy-to-use tool to help you collect, organize, cite, and share research. It also publishes a JSON schema for organizing citation and bibliographic data, much like Citation Style Language[30].

https://theucf.info/schemas-Zotero

Bibliography

“Africa I-Parliaments > Home.” n.d. Accessed December 28, 2020. https://publicadministration.un.org/parliaments/#.X-oTQWSQH6V.

Ahmed, Mohamed. n.d. “Introducing Policy As Code: The Open Policy Agent (OPA).” Accessed December 28, 2020. https://www.magalix.com/blog/introducing-policy-as-code-the-open-policy-agent-opa.

“Akoma Ntoso | Akoma Ntoso Site.” n.d. Accessed December 28, 2020. http://www.akomantoso.org/.

“Akoma Ntoso Version 1.0. Part 1: XML Vocabulary.” n.d. Accessed December 28, 2020. http://docs.oasis-open.org/legaldocml/akn-core/v1.0/akn-core-v1.0-part1-vocabulary.html.

Blake E. Strom et al. n.d. “MITRE ATT&CK®: Design and Philosophy.” Accessed January 1, 2021. https://attack.mitre.org/docs/ATTACK_Design_and_Philosophy_March_2020.pdf.

“Checklist of Requirements for Federal Websites and Digital Services.” 2014. Digital.Gov. January 9, 2014. /resources/checklist-of-requirements-for-federal-digital-services/.

“Citation Style Language - Citation Style Language.” n.d. Accessed January 1, 2021. https://citationstyles.org/.

Computer Security Division, Information Technology Laboratory. 2016. “Derived Relationship Mapping - Cybersecurity Framework | CSRC.” CSRC | NIST. May 24, 2016. https://csrc.nist.gov/Projects/Cybersecurity-Framework/derived-relationship-mapping.

“Control Correlation Identifier (CCI) – DoD Cyber Exchange.” n.d. Accessed April 17, 2020. https://public.cyber.mil/stigs/cci/.

“CVE - Common Vulnerabilities and Exposures (CVE).” n.d. Accessed January 8, 2021. https://cve.mitre.org/.

“E-Gov Act of 2002.” 2015. Digital.Gov. September 29, 2015. /resources/e-gov-act-of-2002/.

“FedRAMP Moves to Automate the Authorization Process | FedRAMP.Gov.” n.d. Accessed December 28, 2020. https://fedramp.gov/FedRAMP-moves-to-automate-the-authorization-process/.

“GRCschema.Org.” n.d. Accessed September 23, 2020. https://grcschema.org/PersonName.

IFLA Study Group on the Functional Requirements for Bibliographic Records. n.d. Functional Requirements for Bibliographic Records. https://www.ifla.org/files/assets/cataloguing/frbr/frbr_2008.pdf.

“Introduction to Open Policy Agent.” n.d. Open Policy Agent. Accessed December 28, 2020. https://openpolicyagent.org/docs/latest/.

“Introduction to STIX.” n.d. Accessed January 1, 2021. https://oasis-open.github.io/cti-documentation/stix/intro.html.

Keller, Nicole, Stephen Quinn, Karen Scarfone, Matthew Smith, and Vincent Johnson. 2020. “National Online Informative References (OLIR) Program: Program Overview and OLIR Uses.” NIST Internal or Interagency Report (NISTIR) 8278. National Institute of Standards and Technology. https://doi.org/10.6028/NIST.IR.8278.

“Legal XML.” n.d. Accessed December 28, 2020. http://www.legalxml.org/.

Nations, United. n.d. “UN Department of Economic and Social Affairs.” United Nations. United Nations. Accessed December 28, 2020. https://www.un.org/en/desa.

“NVD - CPE.” n.d. Accessed September 27, 2020. https://nvd.nist.gov/products/cpe.

“OASIS LegalDocumentML (LegalDocML) TC | OASIS.” n.d. Accessed December 28, 2020. https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=legaldocml.

“OASIS LegalRuleML TC | OASIS.” n.d. Accessed December 28, 2020. https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=legalruleml.

“OASIS Open.” n.d. OASIS Open. Accessed December 28, 2020. https://www.oasis-open.org/.

“OPA Policy Language.” n.d. Open Policy Agent. Accessed December 28, 2020. https://openpolicyagent.org/docs/latest/policy-language/.

“Open Government Initiative.” n.d. The White House. Accessed December 28, 2020. https://obamawhitehouse.archives.gov/node/860.

“Software Publishers | TagVault.Org.” n.d. Accessed December 28, 2020. https://tagvault.org/software-publishers/.

“Strategy Markup Language (StratML).” n.d. Accessed December 28, 2020. https://stratml.us/.

“The VERIS Framework.” n.d. Accessed January 1, 2021. http://veriscommunity.net/.

“Unified Compliance Framework Unties Overlapping Compliance Standards.” n.d. Accessed September 15, 2020. https://searchcompliance.techtarget.com/tip/Unified-Compliance-Framework-unties-overlapping-compliance-standards.

Waltermire, David, Brant Cheikes, Larry Feldman, and Gregory Witte. 2016. “Guidelines for the Creation of Interoperable Software Identification (SWID) Tags.” NIST Internal or Interagency Report (NISTIR) 8060. National Institute of Standards and Technology. https://doi.org/10.6028/NIST.IR.8060.

“Zotero Schema.” n.d. Accessed January 1, 2021. https://api.zotero.org/schema.

Endnotes

  1. (“Africa I-Parliaments > Home” n.d.)
  2. (Nations n.d.)
  3. (“Akoma Ntoso | Akoma Ntoso Site” n.d.)
  4. (“Akoma Ntoso Version 1.0. Part 1: XML Vocabulary” n.d.)
  5. (“NVD - CPE” n.d.)
  6. (“CVE - Common Vulnerabilities and Exposures (CVE)” n.d.)
  7. (“Citation Style Language - Citation Style Language” n.d.)
  8. (“Control Correlation Identifier (CCI) – DoD Cyber Exchange” n.d.)
  9. (Computer Security Division 2016)
  10. (Keller et al. 2020)
  11. (IFLA Study Group on the Functional Requirements for Bibliographic Records, n.d.)
  12. (“GRCschema.Org” n.d.)
  13. (“Legal XML” n.d.)
  14. (“OASIS Open” n.d.)
  15. (“OASIS LegalDocumentML (LegalDocML) TC | OASIS” n.d.)
  16. (“OASIS LegalRuleML TC | OASIS” n.d.)
  17. (Blake E. Strom et al. n.d.)
  18. (Ahmed n.d.)
  19. (“Introduction to Open Policy Agent” n.d.), (“OPA Policy Language” n.d.)
  20. (“FedRAMP Moves to Automate the Authorization Process | FedRAMP.Gov” n.d.)
  21. (Waltermire et al. 2016)
  22. (“Software Publishers | TagVault.Org” n.d.)
  23. (“Introduction to STIX” n.d.)
  24. (“E-Gov Act of 2002” 2015), US Code sections amended: 44 U.S.C. ch. 1 § 101; 44 U.S.C. ch. 35, subch. I § 3501 et seq; US Code sections created: 44 U.S.C. ch. 36 § 3601 et seq. 44 U.S.C. ch. 35, subch. III § 3541 et seq.
  25. (“Open Government Initiative” n.d.), (“Checklist of Requirements for Federal Websites and Digital Services” 2014)
  26. (“Strategy Markup Language (StratML)” n.d.)
  27. (“Unified Compliance Framework Unties Overlapping Compliance Standards” n.d.)
  28. (“The VERIS Framework” n.d.)
  29. (“Zotero Schema” n.d.)