Data Lake Project Upgrades Security, Access
OIT is in the early stages of building the Security Data Lake (SDL), “a centralized repository designed to store, process, maintain, secure and govern all CMS security data.” This innovation makes cybersecurity reporting transparent and accessible and can enable new types of analytics, thereby increasing value for CMS and its stakeholders.
Teresa Proctor, the Director for the Division of Implementation & Reporting, and Amine Raounak, Mission Enablement Lead, explain what the SDL is and the value it will bring to CMS.
What is the SDL?
Raounak: The SDL is an architecture, a mindset, and a collaborative effort. It brings security-minded people together through data sharing. It is not limited to the tools that support it. It is more about the governance model, which allows teams to easily access security data and so ensures prompt remediation of security issues. The offering supports dashboarding and data warehouse integration so that each team can analyze the data sets according to its own needs.
What is the current situation that led to the need for the SDL?
Proctor: Many OIT programs and services, as well as CMS Centers & Offices, require specific security data around system authorizations, Plans of Action and Milestones (POA&Ms), vulnerabilities, assets, incident response, Software Bills of Materials (SBOMs), and the list goes on. Currently, this data is siloed in many areas, so people need multiple integrations to get the data they need.
What value does the SDL add?
Raounak: The current state costs CMS in terms of data storage and hours spent on Extract, Transform, Load (ETL) engineering. With the Security Data Lake, all security stakeholders will get access to one source of truth, reducing toil and tool fatigue. Our partners told us these concerns were among their top priorities, so we made it a priority to reduce friction between the Security Group and the partners it serves.
Proctor: Our stakeholders and partners will be able to integrate with the SDL to obtain the data when they need it, as they need it, rather than having to search for it, recreate that data on their side, or build complex integrations. This also opens more development opportunities for our partners. We must know so much about our systems and data, such as who's accessing them and what we are connected to. The data lake will allow us to be proactive rather than reactive. We will be able to understand holistically what's going on with our systems and what the risk is across the entire cyber ecosystem, rather than having to go to separate places and try to piece a picture together.