Wednesday, February 25, 2015
Although the private sector led the charge in adopting big data strategies, many areas in the public sector have exhibited success as well.

This blog entry provides examples of how federal agencies and other levels of government are developing and applying big data strategies in the areas of fraud detection, financial market analysis, health related research, government oversight, education, criminology, environmental protection, and energy exploration.


This entry is the fourth in a series of blog posts discussing issues that the government faces with implementing big data strategies. The first blog post in this series described why data should be defined as "big" based on complexity of the data, not volume alone. The second blog post and third blog post explained four challenges that a big data strategy presents for public sector organizations.

Following are examples of program areas in federal agencies and non-federal organizations that are actively developing and applying big data strategies. Hopefully, these examples will inspire you to explore the art of the possible with the new generation of analytic tools in your agency and program.

Example 1: How Big Data Technologies Assist in Fraud Detection and Financial Market Analysis

  • The Social Security Administration (SSA) is using a big data strategy to analyze massive amounts of unstructured data in the form of disability claims. The SSA can now process medical taxonomies and expected diagnoses more rapidly and efficiently to recreate the decision-making process and better identify suspected fraudulent claims.
  • The Federal Housing Administration (FHA) has over 23 years of experience in leveraging analytics to manage a positive cash-flow fund. The FHA is the only sub-prime mortgage insurance fund that did not need a bailout when the housing bubble burst. It applies big data analytics to help forecast default rates, repayment rates, and claim rates. It also uses big data technologies to build cash flow models for likely scenarios, which determine what premiums would need to be in order to maintain positive cash flow.
  • The Securities and Exchange Commission (SEC) is applying big data strategies to monitor financial market activity. It is using natural language processing and network analytics to help identify nefarious trading activity.
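
To make the FHA bullet's idea of scenario-based cash flow modeling concrete, here is a minimal Python sketch. It is not the FHA's actual model: the `Scenario` class, the function names, and all of the rates are hypothetical values chosen for illustration, and the model assumes a single period in which premium income must cover expected claim losses.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One hypothetical economic scenario for an insured loan portfolio."""
    name: str
    claim_rate: float     # fraction of the insured balance that results in a claim
    loss_severity: float  # fraction of a claimed balance that is actually lost


def net_cash_flow(premium_rate: float, insured_balance: float, s: Scenario) -> float:
    """Premium income minus expected claim losses for one period."""
    income = premium_rate * insured_balance
    losses = s.claim_rate * s.loss_severity * insured_balance
    return income - losses


def break_even_premium(scenarios: list[Scenario]) -> float:
    """Smallest premium rate that keeps cash flow non-negative in every scenario."""
    return max(s.claim_rate * s.loss_severity for s in scenarios)


# Illustrative numbers only; they are not drawn from any real portfolio.
scenarios = [
    Scenario("baseline", claim_rate=0.02, loss_severity=0.40),
    Scenario("downturn", claim_rate=0.06, loss_severity=0.50),
]
rate = break_even_premium(scenarios)
print(f"break-even premium rate: {rate:.3%}")
```

Pricing to the worst scenario is only one possible design choice. A real model would weigh scenarios by likelihood and project cash flows over many years.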

Example 2: How Big Data Technologies Assist in Health-Related Research

  • The Food and Drug Administration (FDA) is deploying big data technologies across many labs involved in testing to study patterns of foodborne illness. The database, part of the agency's Technology Transfer program, allows the FDA to respond more quickly to contaminated products that enter the food supply and contribute to the 325,000 hospitalizations and 3,000 deaths related to foodborne illness every year.
  • The National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative in 2012. BD2K is a trans-NIH initiative established to enable biomedical research as a digital research enterprise, to facilitate discovery and support new knowledge, and to maximize community engagement. The ability to harvest the wealth of information contained in biomedical big data will advance our understanding of human health and disease; however, lack of appropriate tools, poor data accessibility, and insufficient training hinder efficiently integrated research efforts. BD2K will help NIH meet this challenge.
  • The Institute of Medicine (IOM) and the Department of Health and Human Services (HHS) hosted a small gathering of leaders from the White House, federal agencies, academia, social sectors, public health communities, information technology firms, major businesses, and health care delivery systems in March 2010 to catalyze the formation of a new Community Health Data Initiative. In June 2010, the IOM and HHS held The Community Health Data Forum: Harnessing the Power of Information to Improve Health. The purpose of this public forum was to further ongoing efforts of innovators using community-level health data, which would allow individuals and communities to make informed choices about their health. These initial meetings have grown into what is now the Health Datapalooza, a national conference focused on liberating health data, and bringing together the companies, startups, academics, government agencies, and individuals with the newest and most innovative and effective uses of health data to improve patient outcomes.

Example 3: How Big Data Technologies Assist in Government Oversight and Education

  • The Notice and Comment project provides instant access to over 4 million government documents, from federal regulations published by the Federal Register to local public notices. The project is using advanced analytics and natural language processing to ingest government documents and track changes in policies, laws or regulations. Users can comment or vote on pending federal regulations or local public notices with ease. The site's data updates daily for real-time monitoring of proposed actions and emerging trends. Users can gain support for their views using integrated social media and the web's best writing tips to effectively advocate before proposals become law.
  • The U.S. Department of Education is using data mining and learning analytics in a big data strategy to enhance teaching and learning. According to the U.S. Department of Education, Office of Educational Technology, "analytics can detect when a student in an online course is going astray and nudge him or her on to a course correction. These advanced analytics also hold promise of detecting boredom from patterns of key clicks and redirecting the student's attention. Because these data are gathered in real time, there is a real possibility of continuous improvement via multiple feedback loops that operate at different time scales-immediate to the student for the next problem, daily to the teacher for the next day's teaching, monthly to the principal for judging progress, and annually to the district and state administrators for overall school improvement."
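
The change tracking described in the Notice and Comment bullet can be illustrated with a small Python sketch. This is not the project's actual pipeline, which is not described here. It simply uses the standard library's `difflib` to compare two versions of a document, and the `changed_lines` function and the sample notice text are hypothetical.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines removed ("- ") or added ("+ ") between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical before/after text of a public notice.
previous = "Comments are due in 30 days.\nThe filing fee is $10."
current = "Comments are due in 60 days.\nThe filing fee is $10."

for line in changed_lines(previous, current):
    print(line)
```

A daily job could run a comparison like this against each newly ingested version and surface only the changed lines to readers who follow that document.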

Example 4: How Big Data Technologies Assist in Fighting Crime

  • The U.S. Department of Homeland Security (DHS) is a great example of the need for big data strategies in the public sector. The organization of this agency highlights the need for interoperability and integration of data across numerous government agencies. DHS has many examples of how to integrate well, yet provides many lessons learned for how interoperability could have been developed more successfully.
  • State and local law enforcement across the country have had shining moments in deploying big data strategies. A great example of this work is the response to the Boston Marathon bombing. Big data technologies enabled the rapid analysis of more than 480,000 images. These images represent unstructured data. Good descriptions of the suspects allowed analysts to write code and algorithms to quickly analyze the images, looking for anomalies and certain patterns. Automation of the screening of sensor information for criminal behavior enables real-time analysis, reduces the time to decision, and reduces the number of individuals or systems that touch the data.

Example 5: How Big Data Technologies Assist in Environmental Protection and Energy Exploration

  • The National Aeronautics and Space Administration (NASA) and the U.S. Forest Service have been working on a big data strategy for several years to improve interoperability and integrated research efforts, which enable them to better predict weather, ground conditions, and forest fire risks. This effort required extensive advance coordination of data requirements and data governance. In addition to the advanced technology, the challenge required many humans in the loop to better understand the problem and coordinate the use of available data to develop an integrated strategy.
  • The response to the Deepwater Horizon tragedy drew upon big data interoperability efforts to stop the oil spill, contain the contaminants, and respond to the damage. The private sector worked with the public sector to integrate data within two weeks, which enabled teams to analyze weather, oceanography, and plant data. Big data technologies were used to predict what areas would be affected and help determine where to send clean-up crews.
  • The National Center for Atmospheric Research has developed a big data strategy to integrate research and data from utility companies, universities, and industry to create better weather forecasts and predict supply and demand for power. By studying weather and atmospheric patterns, analysts can enable more reliable and efficient production and use of renewable energy. They are incorporating the data into models created for real-time analysis of weather to inform energy production and energy needs.

The next and final post in this Big Data Blog Series will address practices for changing an organization's culture to embrace big data.

***The ideas and opinions presented in this paper are those of the author and do not represent an official statement by IBM, the U.S. Department of Defense, U.S. Army, or other government entity.***